How to deal with latency with large number of columns? #80
Comments
Hi @Moelf, this is a known issue with TypedTables. TypedTables excels with relatively few columns (fewer than 20-30) but becomes a burden on the compiler for wider tables, since the compiler needs to generate specialized code for each column. For tables with 1000+ columns we need a more "dynamic" representation of the data. I had some work in progress on this at #66, but it is not complete. As Dictionaries.jl is maturing, it should be possible to push forward with this work (and replace In the meantime I think your best bets are to use DataFrames.jl or else keep your data as a
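To make the "dynamic" representation mentioned above concrete, here is a minimal sketch using only Base Julia. It is not the design from #66 and not part of TypedTables; `DynamicTable`, `columns`, and `row` are hypothetical names chosen for illustration. The idea is that column types are hidden behind `AbstractVector`, so the compiler does not specialize on the full set of columns.

```julia
# Hypothetical sketch, NOT part of TypedTables: columns live in a Dict, so the
# table's type does not encode the names or element types of its columns.
struct DynamicTable
    columns::Dict{Symbol,AbstractVector}
end

# Build from keyword arguments, e.g. DynamicTable(a = [1, 2], b = ["x", "y"]).
function DynamicTable(; kwargs...)
    cols = Dict{Symbol,AbstractVector}(k => v for (k, v) in kwargs)
    lens = Set(length(v) for v in values(cols))
    length(lens) <= 1 || throw(DimensionMismatch("columns differ in length"))
    return DynamicTable(cols)
end

# getproperty is overridden below, so the field is read through getfield.
columns(t::DynamicTable) = getfield(t, :columns)

# `t.name` returns the column, mimicking the property access of a typed table.
Base.getproperty(t::DynamicTable, name::Symbol) = columns(t)[name]
Base.propertynames(t::DynamicTable) = collect(keys(columns(t)))
Base.length(t::DynamicTable) = isempty(columns(t)) ? 0 : length(first(values(columns(t))))

# A row is materialized as a Dict rather than a NamedTuple, so wide rows do
# not create a new type per column layout.
row(t::DynamicTable, i::Integer) = Dict(name => col[i] for (name, col) in columns(t))

t = DynamicTable(a = [1, 2, 3], b = ["x", "y", "z"])
t.a[2]
row(t, 2)
```

The trade-off is that `t.a` is no longer type-stable at the call site, so hot loops would need to pass the extracted column into a separate function to regain specialization.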
For now TypedTables is fairly good: it takes my lazy column without complaint, and "real work" should only require fewer than 50 columns. Looping over a typed table is blazing fast. I really appreciate the work done here.
compiler got much faster in 1.8, I don't think this is a real concern for much |
We've also started to use TypedTables on HDF5-on-disk-columns (wrapped as arrays), seems to work well so far. |
Currently I have a column type that is lazy. It represents ~GB of data that needs to be read and decompressed on the fly and cached (by chunk). It turns out I can construct a `Table` nicely and the laziness works.
However, sometimes we have 1000+ columns, and in this case the compiler struggles a lot.
Is it possible to have a less-typed `Table` with the same interface?
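The issue does not show the actual lazy column type, so the following is only a sketch of what such a column might look like, using Base Julia alone. `LazyColumn`, `load_chunk`, and `fake_load` are hypothetical names, and the loader here just fabricates numbers in place of real reading and decompression.

```julia
# Hypothetical sketch of a lazy, chunk-cached column (all names illustrative).
struct LazyColumn{T,F} <: AbstractVector{T}
    load_chunk::F               # load_chunk(k) returns the k-th chunk as a Vector{T}
    len::Int                    # total number of elements
    chunk_size::Int             # elements per chunk (the last chunk may be shorter)
    cache::Dict{Int,Vector{T}}  # chunks decoded so far
end

LazyColumn{T}(load_chunk, len, chunk_size) where {T} =
    LazyColumn{T,typeof(load_chunk)}(load_chunk, len, chunk_size, Dict{Int,Vector{T}}())

Base.size(c::LazyColumn) = (c.len,)
Base.IndexStyle(::Type{<:LazyColumn}) = IndexLinear()

function Base.getindex(c::LazyColumn, i::Int)
    @boundscheck checkbounds(c, i)
    k, offset = divrem(i - 1, c.chunk_size)
    # Decode the chunk only on first access; later reads hit the cache.
    chunk = get!(() -> c.load_chunk(k + 1), c.cache, k + 1)
    return chunk[offset + 1]
end

# Stand-in loader: counts how often it runs and returns the values 1.0..25.0
# split into chunks of 10.
loads = Ref(0)
fake_load(k) = (loads[] += 1; collect(Float64, (10 * (k - 1) + 1):min(10 * k, 25)))

col = LazyColumn{Float64}(fake_load, 25, 10)
col[12]   # decodes chunk 2 only
col[13]   # served from the cache
```

Because the type implements the `AbstractVector` interface, it can be used wherever an ordinary vector column is expected. The cache in this sketch is unbounded and not thread-safe; a real implementation would presumably evict old chunks.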