Add support for using Delta Lake vacuum() and Delta Lake optimize() through Arrow Flight #239

Open
9 tasks
CGodiksen opened this issue Oct 7, 2024 · 0 comments
Assignees
Labels
enhancement New feature or request

Comments

@CGodiksen
Collaborator

Delta Lake has functionality for removing stale files and for optimizing files. However, since we provide a layer over Delta Lake, this functionality is not available to users of ModelarDB. Actions should be added to Apache Arrow Flight that make it possible to use this functionality.

When a table is vacuumed or optimized, the metadata Delta Lake should also be vacuumed or optimized. This is cheap, and we do not want the user to have to vacuum or optimize the metadata Delta Lake manually.

  • Add a new endpoint to the server's Apache Arrow Flight server to vacuum a table.
  • Add a new endpoint to the server's Apache Arrow Flight server to optimize a table.
  • Consider how to handle passing extra arguments, such as the retention period, to the endpoint.
  • Add a new endpoint to the manager's Apache Arrow Flight server to vacuum a table in the entire cluster.
  • Add a new endpoint to the manager's Apache Arrow Flight server to optimize a table in the entire cluster.
  • Consider how to handle passing extra arguments to the optimize endpoint.
  • Consider whether we should have endpoints to vacuum or optimize all tables.
  • Add integration tests.
  • Add unit tests.
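
One possible way to handle the extra arguments mentioned in the tasks above is to encode them in the body of the Arrow Flight action. The following is a minimal sketch, not a decided design: the function names `encode_vacuum_body` and `decode_vacuum_body` and the byte layout (a two-byte length prefix, the UTF-8 table name, and an optional eight-byte retention period in seconds) are assumptions made for illustration only and are not part of ModelarDB's current API.

```python
import struct
from typing import Optional, Tuple


def encode_vacuum_body(
    table_name: str, retention_period_in_seconds: Optional[int] = None
) -> bytes:
    """Encode a table name and an optional retention period as an action body.

    Hypothetical layout: two-byte big-endian name length, the UTF-8 name, and
    optionally an eight-byte big-endian retention period in seconds.
    """
    name = table_name.encode("utf-8")
    body = struct.pack(">H", len(name)) + name

    if retention_period_in_seconds is not None:
        if retention_period_in_seconds < 0:
            raise ValueError("The retention period must be non-negative.")
        body += struct.pack(">Q", retention_period_in_seconds)

    return body


def decode_vacuum_body(body: bytes) -> Tuple[str, Optional[int]]:
    """Decode a body produced by encode_vacuum_body().

    Returns the table name and the retention period, or None for the retention
    period if it was omitted so the receiver can fall back to its default.
    """
    if len(body) < 2:
        raise ValueError("The body is too short to contain a table name length.")

    (name_length,) = struct.unpack_from(">H", body, 0)
    name_end = 2 + name_length
    if len(body) < name_end:
        raise ValueError("The body is too short to contain the table name.")

    table_name = body[2:name_end].decode("utf-8")
    rest = body[name_end:]

    if not rest:
        return table_name, None
    if len(rest) != 8:
        raise ValueError("The retention period must be exactly eight bytes.")

    (retention_period_in_seconds,) = struct.unpack(">Q", rest)
    return table_name, retention_period_in_seconds
```

With a layout like this, a client that omits the retention period sends only the table name, while a client that wants to keep seven days of stale files would call `encode_vacuum_body("wind_turbine", 604800)`. An optimize action could use the same pattern with its own optional trailing arguments.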
CGodiksen added the enhancement (New feature or request) label on Oct 7, 2024
CGodiksen self-assigned this on Oct 7, 2024