This repository has been archived by the owner on Jan 11, 2021. It is now read-only.

Planning milestone 1 #40

Closed
sadikovi opened this issue Feb 6, 2018 · 9 comments

@sadikovi
Collaborator

sadikovi commented Feb 6, 2018

I compiled a list of items that could be fixed in the near future, though it is not complete:

  • Adding read support: a high-level API for reading data, plus tooling similar to parquet-mr/parquet-tools/cli.
  • Adding write support: a high-level API for writing data.
  • Improving dictionary encoding/decoding (I am not sure about the effort here, but it might require a few changes).
  • Improving delta binary packed encoding/decoding (I made some mistakes earlier, so this might require quite a few changes).
  • Adding benchmarks for delta byte array and delta length byte array encoding/decoding.
  • Fixing TODO items.

Questions I have:

  • Do we need to add more encoders/decoders?
  • Are there any other items that I missed that should be listed?
  • What is the priority of tasks? What should be done first?
  • Are there any ideas/suggestions for tracking such progress?
@sadikovi
Collaborator Author

sadikovi commented Feb 6, 2018

@sunchao I hope I am not being annoying:) Would you mind having a look at the list of items and letting me know what you think? Or come up with an even better list! Thanks!

@sunchao
Owner

sunchao commented Feb 7, 2018

> I hope I am not being annoying:)

@sadikovi Absolutely not. Thanks for the contribution!

Overall I think the list looks good. In my opinion, to reach milestone 1 (i.e., the 0.1.0 release), we'll need to get the following done (in priority order):

  1. read support, as you suggested in bullet 1.
  2. hardening the existing encoding/decoding features (bullets 3, 4, and 5 above) - we'll need to do some benchmarking against parquet-cpp and parquet-mr on this.
  3. better memory management - currently there's barely any memory management. I have a branch for this, but it needs polishing.
  4. expose a set of cleanly defined APIs.
  5. publish to crates.io - ideally we don't want to include the generated thrift file in the repo, but we need to find a way to do that.

Write support is another big chunk of work, and IMHO we should postpone it to future releases. We can track the progress in the 'Projects' tab. What do you think?

@sadikovi
Collaborator Author

sadikovi commented Feb 7, 2018

No worries, my pleasure!

Looks great! You are right, let's track the progress on the "Projects" tab. It would be helpful to split the issues into small chunks, so they are not overwhelming:). Could you create all the necessary stories in Projects when you have time? We could then close this issue. Thanks!

Would be amazing to complete the first milestone!

P.S. I tried to work on read support, but it went nowhere, and I abandoned it. I tried doing it manually, similar to parquet-mr, but just could not figure out how it should work:). I might try again later.

@sunchao
Owner

sunchao commented Feb 7, 2018

Yup. I'll create sub-tasks and attach them to the Projects tab. Also feel free to create tasks yourself. :)

> P.S. I tried to work on read support, but it went nowhere, and I abandoned it. I tried doing it manually, similar to parquet-mr, but just could not figure out how it should work:). I might try again later.

Feel free to share some details in the relevant tasks.

@sadikovi
Collaborator Author

sadikovi commented Feb 7, 2018

Okay, thanks!

sadikovi changed the title from "Planning" to "Planning milestone 1" on Feb 7, 2018
@sunchao
Owner

sunchao commented Feb 7, 2018

@sadikovi Could you add a task for adding read support and the CLI tools? You can then link it to the Projects.

@sadikovi
Collaborator Author

sadikovi commented Feb 8, 2018

@sunchao I created #41 and #42 for read support and the CLI. I do not have any option to link them to the Projects board, assign a milestone, or add tags (looks like a permissions issue). I also do not have the "+ Add cards" button. Could you link them? Thanks.

@sunchao
Owner

sunchao commented Feb 8, 2018

Done. Thanks!

@sadikovi
Collaborator Author

sadikovi commented Feb 8, 2018

Thanks! I am going to close this planning issue, since we have planned the future work!

sadikovi closed this as completed on Feb 8, 2018