Generic access to `invrefpool` is needed in JuliaData/DataFrames.jl#2612. Consider a short table and a long table joined on some column. To make the join fast, we need to map the key values of the short table to the ref values of the long table's key column. For `innerjoin` this allows two things:

1. We can immediately drop values from the short table that are not present in the long table.
2. Later we can do the join on integer columns, which is much faster than joining on, for example, a string column.

Since only the short table is mapped, this operation should itself be fast. It is particularly fast if the short table defines `refarray`, because then we only need to map the reference values.

For CategoricalArrays.jl and PooledArrays.jl, `invrefpool` is simply `get` on the inverted pool `Dict` with `nothing` as a sentinel. I am not sure what would have to be defined in Arrow.jl.
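The mapping step described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual DataAPI.jl or DataFrames.jl code: `ToyPooled`, `lookup_ref`, and `map_short_keys` are hypothetical names, and a plain `Dict` stands in for the inverted pool that `invrefpool` would give access to.

```julia
# Hypothetical stand-in for a pooled column: `refs` holds the integer codes,
# `invpool` maps each distinct value to its code.
struct ToyPooled
    refs::Vector{Int}
    invpool::Dict{String,Int}
end

function ToyPooled(values::Vector{String})
    invpool = Dict{String,Int}()
    refs = Int[]
    for v in values
        # Assign the next free code the first time a value is seen.
        push!(refs, get!(invpool, v, length(invpool) + 1))
    end
    return ToyPooled(refs, invpool)
end

# Assumed lookup: value -> ref code in the long column, or `nothing` if absent.
lookup_ref(col::ToyPooled, value) = get(col.invpool, value, nothing)

# Map short-table keys to long-table ref codes, keeping only rows that match.
function map_short_keys(short_keys::Vector{String}, long_col::ToyPooled)
    rows = Int[]   # indices of short-table rows that have a match
    codes = Int[]  # corresponding integer codes in the long column
    for (i, k) in enumerate(short_keys)
        code = lookup_ref(long_col, k)
        code === nothing && continue  # key not in long table: drop the row
        push!(rows, i)
        push!(codes, code)
    end
    return rows, codes
end

long_col = ToyPooled(["b", "a", "b", "c", "a"])
rows, codes = map_short_keys(["a", "x", "c"], long_col)
```

Here the short-table key `"x"` is dropped right away, and the remaining keys are represented by integer codes that can be compared directly with `long_col.refs` in the join itself.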