
[Spark] Support List and Map columns in Uniform #2459

Closed

Conversation

LukasRupprecht (Collaborator)

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR adds support for List and Map columns in Uniform. To support these types, Delta column mapping needs to write additional field IDs to the Parquet schema: List columns require one additional field ID for the 'element' subfield, and Map columns require two additional field IDs for the 'key' and 'value' subfields inside the Parquet file. These nested field IDs are added to the table schema when column mapping generates the IDs and physical names. They are then written into the Parquet schema through a new class, DeltaParquetWriteSupport, which hooks into Spark's Parquet write path and rewrites the Parquet schema based on the additional field IDs.
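As a rough illustration of the ID scheme described above, here is a minimal Python sketch (hypothetical names and a simplified schema representation, not Delta's actual DeltaParquetWriteSupport code): one extra field ID is allocated for a list's 'element' subfield and two extra IDs for a map's 'key' and 'value' subfields.

```python
# Hypothetical sketch of nested field-ID assignment for column mapping.
# Schemas are modeled as plain dicts; real Delta uses Spark StructType metadata.

def assign_nested_field_ids(schema, next_id):
    """Attach field IDs to list 'element' and map 'key'/'value' subfields.

    Returns the next unused ID so callers can continue numbering.
    """
    for field in schema["fields"]:
        dtype = field["type"]
        if isinstance(dtype, dict):
            if dtype["name"] == "list":
                dtype["element_id"] = next_id      # one extra ID for 'element'
                next_id += 1
            elif dtype["name"] == "map":
                dtype["key_id"] = next_id          # two extra IDs for
                dtype["value_id"] = next_id + 1    # 'key' and 'value'
                next_id += 2
    return next_id

schema = {"fields": [
    {"name": "tags", "type": {"name": "list", "element": "string"}},
    {"name": "attrs", "type": {"name": "map", "key": "string", "value": "int"}},
]}
next_id = assign_nested_field_ids(schema, next_id=10)
print(schema["fields"][0]["type"]["element_id"])   # 10
print(schema["fields"][1]["type"]["key_id"])       # 11
print(schema["fields"][1]["type"]["value_id"])     # 12
```

In the actual write path, these IDs end up as `field_id` annotations on the corresponding groups in the Parquet file schema, which is what allows Iceberg readers to resolve nested columns by ID rather than by name.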

This PR is part of #2297.

How was this patch tested?

Unit tests will be added soon in a separate PR.

Does this PR introduce any user-facing changes?

No

@lzlfred lzlfred (Contributor) left a comment


Very excited to see Map and List support for Uniform Iceberg!

@LukasRupprecht LukasRupprecht deleted the list-map-support branch April 3, 2024 00:18
Labels
None yet
Projects
None yet

2 participants