On Trino 371, the Hive table can be queried successfully, but querying the Iceberg table with select * from iceberg.my_db.my_tbl_ice fails with the following error:
io.trino.spi.TrinoException: Error opening Iceberg split s3n://mask-out-parquet-file-location (offset=0, length=740): Metadata is missing for column: [configs, key_value, key] required binary key (STRING) = 2
at io.trino.plugin.iceberg.IcebergPageSourceProvider.createParquetPageSource(IcebergPageSourceProvider.java:713)
at io.trino.plugin.iceberg.IcebergPageSourceProvider.createDataPageSource(IcebergPageSourceProvider.java:276)
at io.trino.plugin.iceberg.IcebergPageSourceProvider.createPageSource(IcebergPageSourceProvider.java:207)
at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:49)
at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:68)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:308)
at io.trino.operator.Driver.processInternal(Driver.java:388)
at io.trino.operator.Driver.lambda$processFor$9(Driver.java:292)
at io.trino.operator.Driver.tryWithLock(Driver.java:693)
at io.trino.operator.Driver.processFor(Driver.java:285)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1092)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:488)
at io.trino.$gen.Trino_0a00079____20220306_050111_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.trino.parquet.ParquetCorruptionException: Metadata is missing for column: [configs, key_value, key] required binary key (STRING) = 2
at io.trino.parquet.reader.ParquetReader.getColumnChunkMetaData(ParquetReader.java:413)
at io.trino.parquet.reader.ParquetReader.<init>(ParquetReader.java:183)
at io.trino.parquet.reader.ParquetReader.<init>(ParquetReader.java:134)
at io.trino.plugin.iceberg.IcebergPageSourceProvider.createParquetPageSource(IcebergPageSourceProvider.java:671)
... 16 more
puchengy changed the title from "[Iceberg] Trino 371 not able to read nested map type written by parquet-mr 1.10.1" to "[Iceberg Parquet] Trino 371 not able to read nested map type written by parquet-mr 1.10.1" on Mar 6, 2022
findepi changed the title from "[Iceberg Parquet] Trino 371 not able to read nested map type written by parquet-mr 1.10.1" to "Trino not able to read nested map type written as Parquet Hive table by parquet-mr 1.10.1 and converted to Iceberg table" on Mar 7, 2022
I found that Trino 371 is not able to read a nested map type written by parquet-mr 1.10.1.
To reproduce: in spark-sql 2.4.4 (which uses Parquet 1.10.1), create a Hive table with a nested map type column.
Then use Spark 3.2 to create an Iceberg table on top of the Hive table with the snapshot procedure.
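The two steps above might look roughly like the sketch below. The exact DDL from the original report is not available here, so the Hive table name my_db.my_tbl_hive, the id column, the nested type of configs, and the Spark catalog name iceberg_catalog are all assumptions for illustration. Only the configs column name (with a string key) and the my_db.my_tbl_ice table name come from the error message and query in this issue.

```sql
-- Step 1: run in spark-sql 2.4.4 (parquet-mr 1.10.1).
-- Table name, id column, and the exact nesting of configs are assumed.
CREATE TABLE my_db.my_tbl_hive (
  id INT,
  configs MAP<STRING, MAP<STRING, STRING>>
) STORED AS PARQUET;

INSERT INTO my_db.my_tbl_hive
SELECT 1, map('outer_key', map('inner_key', 'value'));

-- Step 2: run in Spark 3.2 with the Iceberg runtime and SQL extensions enabled.
-- iceberg_catalog is a placeholder for whatever Iceberg catalog is configured.
CALL iceberg_catalog.system.snapshot(
  source_table => 'my_db.my_tbl_hive',
  table => 'my_db.my_tbl_ice'
);
```

The snapshot procedure leaves the source Hive table untouched and creates an Iceberg table that references the existing data files, so the Iceberg table reads the same Parquet files that Spark 2.4.4 wrote.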
On Trino 371, the Hive table can be queried successfully, but querying the Iceberg table with select * from iceberg.my_db.my_tbl_ice fails with the ParquetCorruptionException shown in the stack trace above.
Below is the Parquet file information:
If the Hive table is created with Spark 3.2 using Parquet 1.12.2, the problem does not occur. The Parquet metadata for that file is as follows: