Add CDFQuantile type #416
Labels: type: enhancement (Improvement of existing feature or code)
Jiaweihu08 pushed a commit that referenced this issue, Oct 1, 2024
JosepSampe added a commit that referenced this issue, Oct 24, 2024

* Issue #424: Add sampling fraction option for optimization (#426)
* Add sampling fraction option for optimization and remove analyze from QbeastTable
* Issue #430: Simplify denormalized blocks creation (#431)
* Simplify Denormalized Blocks
* Issue #416: Add CDFQuantile Transformers and Transformations (#413)
* Issue 264: Update qviz for multiblock files (#437)
* Update Qbeast Visualiser (qviz) with multiblock files

---------

Co-authored-by: Jorge Marín <jorge.marin.rodenas@estudiantat.upc.edu>
Co-authored-by: Jorge Marín <100561030+jorgeMarin1@users.noreply.github.com>

* Issue #441: Fix dataChange flag in optimize (#444)
* Merge from main branch

---------

Co-authored-by: jiawei <47899566+Jiaweihu08@users.noreply.github.com>
Co-authored-by: Paola Pardo <paolapardoat@gmail.com>
Co-authored-by: Jorge Marín <jorge.marin.rodenas@estudiantat.upc.edu>
Co-authored-by: Jorge Marín <100561030+jorgeMarin1@users.noreply.github.com>
JosepSampe pushed a commit to JosepSampe/qbeast-spark that referenced this issue, Oct 24, 2024
Right now, we split the `Transformations` (and `Transformers`) into:

- `LinearTransformation`
- `HashTransformation`
- `StringHistogramTransformation`
- `NullToZeroTransformation`
- `IdentityToZeroTransformation`

We wanted to implement a `QuantileTransformation` (see closed issue #338), which would make the indexing more flexible by calling a streaming algorithm to update and provide the `Rank` of a specific point while writing new data. But while trying to implement it, we noticed a few things:

- The `HistogramTransformation` was mapping the elements as if they were `Quantiles`.
- For the `histogram`, we required an external method to be called before indexing.
- For the `quantiles` in PR #413 (Issue #416: Add CDFQuantile Transformers and Transformations), we were also implementing the same methodology.

This issue is to reorganize the Transformers and Transformations to use the following nomenclature:

- `CDFQuantilesTransformation`
- `CDF<implementation>Transformation`

Under this scheme, we would only have an implementation for `QuantilesTransformation`, in both the String and Numeric cases, with a different initialization of the bins for each case.

With an API such as:
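
What follows is only an illustrative sketch of the idea, not the qbeast-spark API, and it is written in Python rather than the project's Scala. It assumes a CDF quantiles transformation can be modelled as a sorted list of quantile boundaries plus a lookup that maps a value to an approximate rank in [0, 1]. The names `CDFQuantilesSketch`, `quantiles_from_sample` and `transform` are hypothetical, and the same bin initialization is used for numbers and strings here as a simplification, whereas the issue expects a different initialization for each case.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Iterable, List, Sequence


def quantiles_from_sample(sample: Iterable, num_bins: int) -> List:
    """Pick num_bins + 1 evenly spaced boundaries from a sorted sample.

    Hypothetical helper: stands in for the external step that computes
    the quantiles before indexing.
    """
    ordered = sorted(sample)
    if not ordered or num_bins < 1:
        raise ValueError("need a non-empty sample and at least one bin")
    last = len(ordered) - 1
    # Integer arithmetic keeps the chosen indices deterministic.
    return [ordered[(i * last) // num_bins] for i in range(num_bins + 1)]


@dataclass(frozen=True)
class CDFQuantilesSketch:
    """Hypothetical stand-in for a CDF quantiles transformation."""

    quantiles: Sequence  # sorted boundaries, numeric or string

    def transform(self, value) -> float:
        """Map a value to an approximate rank in [0, 1] (step function)."""
        n = len(self.quantiles)
        if value is None or n < 2:
            return 0.0
        # Number of boundaries that are <= value.
        pos = bisect_right(self.quantiles, value)
        if pos == 0:
            return 0.0  # below the first boundary
        if pos >= n:
            return 1.0  # at or above the last boundary
        return (pos - 1) / (n - 1)


# Numeric case: boundaries taken from a sample of 0..100.
numeric = CDFQuantilesSketch(quantiles_from_sample(range(101), 4))

# String case: the same lookup works because strings are ordered too.
strings = CDFQuantilesSketch(quantiles_from_sample(["e", "a", "c", "b", "d"], 4))
```

In this sketch both cases share the same `transform` logic, and only the way the boundaries are produced would need to differ between String and Numeric columns.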