Hive fails opening split for zst compressed files #173

Open
benrifkind opened this issue Jul 27, 2023 · 0 comments

This issue is copied from trinodb/trino#17792 since I believe this repo is where zstd de/compression is handled.

I have a Hive table built on top of zstd-compressed (.zst) data. Reading this table on a Trino 419 cluster, which uses aircompressor 0.23, fails with the following error:

Query 20230607_172621_00003_5hpz7 failed: Error opening Hive split s3://path/to/file.csv.access.log.zst (offset=0, length=1544108): Window size too large (not yet supported): offset=3084

We are currently running Trino 405, which uses aircompressor 0.21, and this query executes there without an issue. Earlier versions of Trino/Presto also ran it without problems.

Did something change between aircompressor 0.21 and 0.24 that might have caused this? And is there anything I can do to get past this error? Thanks in advance for your help!

Full stack trace

io.trino.spi.TrinoException: Error opening Hive split s3://path/to/file.csv.access.log.zst (offset=0, length=1544108): Window size too large (not yet supported): offset=3084
at io.trino.plugin.hive.line.LinePageSourceFactory.createPageSource(LinePageSourceFactory.java:179)
at io.trino.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:218)
at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:156)
at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:48)
at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:61)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:298)
at io.trino.operator.Driver.processInternal(Driver.java:402)
at io.trino.operator.Driver.lambda$process$8(Driver.java:305)
at io.trino.operator.Driver.tryWithLock(Driver.java:701)
at io.trino.operator.Driver.process(Driver.java:297)
at io.trino.operator.Driver.processForDuration(Driver.java:268)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:888)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:556)
at io.trino.$gen.Trino_18f7842____20230607_162211_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.airlift.compress.MalformedInputException: Window size too large (not yet supported): offset=3084
at io.airlift.compress.zstd.Util.verify(Util.java:45)
at io.airlift.compress.zstd.ZstdFrameDecompressor.decodeCompressedBlock(ZstdFrameDecompressor.java:303)
at io.airlift.compress.zstd.ZstdIncrementalFrameDecompressor.partialDecompress(ZstdIncrementalFrameDecompressor.java:236)
at io.airlift.compress.zstd.ZstdInputStream.read(ZstdInputStream.java:89)
at io.airlift.compress.zstd.ZstdHadoopInputStream.read(ZstdHadoopInputStream.java:53)
at com.google.common.io.CountingInputStream.read(CountingInputStream.java:64)
at java.base/java.io.InputStream.readNBytes(InputStream.java:506)
at io.trino.hive.formats.line.text.TextLineReader.fillBuffer(TextLineReader.java:248)
at io.trino.hive.formats.line.text.TextLineReader.<init>(TextLineReader.java:67)
at io.trino.hive.formats.line.text.TextLineReaderFactory.createLineReader(TextLineReaderFactory.java:77)
at io.trino.plugin.hive.line.LinePageSourceFactory.createPageSource(LinePageSourceFactory.java:171)
... 17 more
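
To help narrow this down, here is a minimal sketch that reads the window size declared in the first zstd frame header of a file, following the Zstandard frame format (RFC 8878). The class name `ZstdWindowProbe` and the `ASSUMED_LIMIT` constant are hypothetical and not part of aircompressor or Trino. The 8 MiB limit in particular is an assumption about the decoder and should be checked against the aircompressor source. The sketch inspects only the first frame and does not handle skippable frames.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of aircompressor or Trino.
final class ZstdWindowProbe {
    // Zstandard frame magic number, stored little-endian on disk (28 B5 2F FD).
    private static final int ZSTD_MAGIC = 0xFD2FB528;
    // ASSUMPTION: a decoder limit of 8 MiB; verify against the aircompressor source.
    private static final long ASSUMED_LIMIT = 1L << 23;

    // Returns the window size in bytes declared by the first frame header.
    static long windowSize(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if ((int) readLittleEndian(data, 4) != ZSTD_MAGIC) {
            throw new IOException("No zstd frame magic at start of input");
        }
        int descriptor = data.readUnsignedByte();
        int contentSizeFlag = descriptor >>> 6;            // bits 7-6
        boolean singleSegment = (descriptor & 0x20) != 0;  // bit 5
        int dictionaryIdFlag = descriptor & 0x03;          // bits 1-0

        if (!singleSegment) {
            // Window_Descriptor: 5-bit exponent, 3-bit mantissa.
            int windowDescriptor = data.readUnsignedByte();
            int exponent = windowDescriptor >>> 3;
            int mantissa = windowDescriptor & 0x07;
            long base = 1L << (10 + exponent);
            return base + (base / 8) * mantissa;
        }

        // Single-segment frame: no Window_Descriptor, so the window size
        // equals the frame content size. Skip the optional dictionary ID first.
        readLittleEndian(data, dictionaryIdFlag == 3 ? 4 : dictionaryIdFlag);
        int contentSizeBytes = 1 << contentSizeFlag;       // 1, 2, 4 or 8 bytes
        long contentSize = readLittleEndian(data, contentSizeBytes);
        // The 2-byte encoding stores the size minus 256.
        return contentSizeFlag == 1 ? contentSize + 256 : contentSize;
    }

    // Reads `count` bytes as a little-endian unsigned value; throws EOFException on short input.
    private static long readLittleEndian(DataInputStream data, int count) throws IOException {
        long value = 0;
        for (int i = 0; i < count; i++) {
            value |= (long) data.readUnsignedByte() << (8 * i);
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ZstdWindowProbe <file.zst>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            long size = windowSize(in);
            System.out.println("declared window size: " + size + " bytes");
            System.out.println("exceeds assumed limit: " + (size > ASSUMED_LIMIT));
        }
    }
}
```

Running this against a local copy of one of the failing files would show whether the data was written with a larger window than the decoder accepts. If so, recompressing with a smaller window may be a workaround.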
