Selective array and map column reader #10448

Closed

HuamengJiang
Contributor

Summary:
Implement the selective array and map column readers. Arrays and maps are another type of top-level column without independent null streams, so they require some new functionality for loading nullable encodings.

There is another nuance in this diff: the selective reader currently always loads the nulls first and then the values, and it passes the combined nulls into the readLengths methods rather than just the top-level incoming nulls for scattering. There are three more ideal options:

  1. a materializeNonNull API for encodings
  2. a materializeNullable API for encodings, for combined nulls
  3. a way for the selective reader to avoid materializing combined nulls without compromising efficiency.
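
To make the distinction between incoming nulls and combined nulls concrete, here is a minimal sketch. It assumes that "combined nulls" means the parent's nulls merged with the column's own nulls, and that lengths are stored densely for non-null rows only. `combineNulls` and `scatterLengths` are hypothetical helpers for illustration, not functions from Velox or Nimble.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: merges the parent's (incoming) nulls with this
// column's own nulls. Assumption: ownNulls holds one entry per row whose
// parent is non-null, in row order.
std::vector<bool> combineNulls(
    const std::vector<bool>& incomingNulls,
    const std::vector<bool>& ownNulls) {
  std::vector<bool> combined(incomingNulls.size());
  std::size_t own = 0;
  for (std::size_t row = 0; row < incomingNulls.size(); ++row) {
    if (incomingNulls[row]) {
      combined[row] = true; // Null at the parent level; no own entry exists.
    } else {
      combined[row] = ownNulls[own++];
    }
  }
  return combined;
}

// Hypothetical helper: scatters densely stored lengths to their row
// positions. Null rows get length 0.
std::vector<int32_t> scatterLengths(
    const std::vector<int32_t>& denseLengths,
    const std::vector<bool>& combinedNulls) {
  std::vector<int32_t> lengths(combinedNulls.size(), 0);
  std::size_t next = 0;
  for (std::size_t row = 0; row < combinedNulls.size(); ++row) {
    if (!combinedNulls[row]) {
      lengths[row] = denseLengths[next++];
    }
  }
  return lengths;
}
```

Under these assumptions, scattering with only the incoming nulls would place lengths on rows that are null at the column's own level, which is why the combined bitmap is the one passed along.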

For now, we have added a hack in NimbleData that loads the values along with the nulls for nullable encodings and returns the cached values when readLengths is called later. To fit this access pattern, we also override the skip methods.
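
The access pattern described above can be sketched roughly as follows. This is an illustration only: `NullableEncoding` and `CachedLengthReader` are hypothetical stand-ins, not the actual NimbleData class or any Nimble encoding API. It assumes a nullable encoding stores one null flag per row plus densely packed values for the non-null rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a nullable encoding: a null flag per row plus
// densely packed values for the non-null rows only.
struct NullableEncoding {
  std::vector<bool> isNull;
  std::vector<int32_t> nonNullValues;
};

// Hypothetical reader-side wrapper (not the real NimbleData).
class CachedLengthReader {
 public:
  explicit CachedLengthReader(const NullableEncoding& enc) : enc_(enc) {}

  // Reads the nulls for the next numRows rows and, in the same pass, caches
  // the non-null values so a later readLengths() needs no second decode.
  std::vector<bool> readNulls(std::size_t numRows) {
    std::vector<bool> nulls(
        enc_.isNull.begin() + row_, enc_.isNull.begin() + row_ + numRows);
    cached_.clear();
    for (bool isNull : nulls) {
      if (!isNull) {
        cached_.push_back(enc_.nonNullValues[valueIdx_++]);
      }
    }
    row_ += numRows;
    return nulls;
  }

  // Returns the values cached by the preceding readNulls() call.
  const std::vector<int32_t>& readLengths() const {
    return cached_;
  }

  // Skipping has to advance both the row cursor and the dense value cursor,
  // which is why a default skip would not fit this access pattern.
  void skip(std::size_t numRows) {
    for (std::size_t i = 0; i < numRows; ++i) {
      if (!enc_.isNull[row_ + i]) {
        ++valueIdx_;
      }
    }
    row_ += numRows;
    cached_.clear();
  }

 private:
  const NullableEncoding& enc_;
  std::size_t row_ = 0;
  std::size_t valueIdx_ = 0;
  std::vector<int32_t> cached_;
};
```

In this sketch the caller calls `readNulls` first and `readLengths` second, mirroring the nulls-then-values order of the selective reader; the encoding object must outlive the reader because it is held by reference.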

Differential Revision: D58937281

@facebook-github-bot added the CLA Signed label Jul 11, 2024

netlify bot commented Jul 11, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit f64b60a
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/669f832c77ac0c00083d7195

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D58937281

HuamengJiang pushed a commit to HuamengJiang/velox-1 that referenced this pull request Jul 14, 2024

HuamengJiang pushed a commit to HuamengJiang/velox-1 that referenced this pull request Jul 14, 2024

HuamengJiang pushed a commit to HuamengJiang/velox-1 that referenced this pull request Jul 22, 2024

HuamengJiang pushed a commit to HuamengJiang/velox-1 that referenced this pull request Jul 22, 2024

HuamengJiang pushed a commit to HuamengJiang/velox-1 that referenced this pull request Jul 22, 2024

HuamengJiang pushed a commit to HuamengJiang/velox-1 that referenced this pull request Jul 23, 2024
Reviewed By: Yuhta

HuamengJiang pushed a commit to HuamengJiang/velox-1 that referenced this pull request Jul 23, 2024

@facebook-github-bot
Contributor

This pull request has been merged in afd6753.

Conbench analyzed the 1 benchmark run on commit afd6753a.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Labels
CLA Signed, fb-exported, Merged

3 participants