Release/500 release candidate #13873
Merged
Member
maziyarpanahi
commented
Jul 1, 2023
- SPARKNLP 836 - Introducing "Instructor Embeddings" for sentence embeddings like Instructor-XL model #13849
- Integrating ONNX runtime (ORT) in Spark NLP 5.0.0 🎉 #13857
- Feature/doc similarity ranker #13858
- SPARKNLP 852 - Introducing "E5 Embeddings" for sentence embeddings like e5-large-v2 model #13859
- SPARKNLP-846: BART: Added maxInputLength. #13863
- Added Sentence Embeddings Notebooks. #13874
- Draft: Chore: conda recipe update #13764
…dings like Instructor-XL model (#13849)
* Added Instructor Embeddings
* Added Instructor Embeddings python code
* fixed broadcast bug
* fixed broadcast bug
* Changed test type to slow
maziyarpanahi added the enhancement, documentation, bug-fix, new-feature (Introducing a new feature), new model, and DON'T MERGE (Do not merge this PR) labels on Jul 1, 2023
* Add ONNX Runtime to the dependencies
* Add both CPU and GPU coordinates for onnxruntime
* Implement OnnxSerializeModel
* Implement OnnxWrapper
* Update error message for loading external models
* Add support for ONNX to BertEmbeddings annotator
* Add support for ONNX to BERT backend
* Add support for ONNX to DeBERTa
* Implement ONNX in DeBERTa backend
* Adapt Bert For sentence embeddings with the new backend
* Update unit test for BERT (temp)
* Update unit test for DeBERTa (temp)
* Update onnxruntime and google cloud dependencies
* Seems Apple Silicon and Aarch64 are supported in onnxruntime
* Cleaning up
* Remove bad merge
* Update BERT unit test
* Add fix me to the try
* Making withSafeOnnxModelLoader thread safe
* update onnxruntime
* Revert back to normal unit tests for now [skip test]
* Added ADT for ModelEngine (#13862) Co-authored-by: Stefano Lori <s.lori@izicap.com>
* Optimize ONNX on CPU
* refactor
* Add ONNX support to DistilBERT
* Add support for ONNX in RoBERTa
* Fix the bad serialization on write
* Fix using the wrong object
---------
Co-authored-by: Stefano Lori <wolliq@users.noreply.github.com>
Co-authored-by: Stefano Lori <s.lori@izicap.com>
…ke e5-large-v2 model (#13859)
* Added E5 model
* changed test type
---------
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
* Added maxInputLength. fixes #13829
* changed test types
* Added doc similarity ranker annotator template
* Created ranker model
* gitignore modified
* Added params to LSH models
* Added BRP LSH as annotator engine
* Added replace features col with embeddings
* Added LSH logic on vector cast
* Added skeleton for lsh doc sim ranker - WIP
* Fixed mh3 hash calculation
* Fixed dataset assertions id vs neighbours
* Converting neighbours result string to map
* Added finisher to extract lsh id and neighbors
* Labels refactoring
* Added distance param to show in rankings
* Added logic to select nearest neighbor
* Added identity ranking for debugging
* Adding Python interface to doc sim ranker approach and model
* WIP - Python interface
* WIP - fixed unbalanced embeddings Py test
* Added MinHash engine to doc sim ranker
* Fixed serde for ranker map params
* Clean up pytests
* Added doc sim ranker finisher Python interface
* stabilized tests for doc sim ranker
* Moved and enriched test for doc sim ranker
* Bumped version 5.0.0 in doc sim ranker test
---------
Co-authored-by: Stefano Lori <s.lori@izicap.com>
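The doc similarity ranker commits above mention a BRP (bucketed random projection) LSH engine applied to document embeddings, plus logic to select the nearest neighbor and report its distance. As a rough illustration of that general idea only, here is a minimal pure-Python sketch. It is not the Spark NLP implementation and does not use its API; every function name and parameter below (`make_projections`, `bucket_ids`, `nearest_neighbors`, `bucket_length`, `num_tables`) is hypothetical and chosen for this example.

```python
import math
import random
from collections import defaultdict


def make_projections(dim, num_tables, seed=42):
    """Draw one random unit vector per hash table (hypothetical helper)."""
    rng = random.Random(seed)
    projections = []
    for _ in range(num_tables):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        projections.append([x / norm for x in v])
    return projections


def bucket_ids(vector, projections, bucket_length):
    """Hash a vector: project onto each random direction, then bucket the result."""
    return tuple(
        math.floor(sum(a * b for a, b in zip(vector, p)) / bucket_length)
        for p in projections
    )


def nearest_neighbors(embeddings, bucket_length=2.0, num_tables=4, seed=42):
    """Map each doc id to (neighbor id, Euclidean distance), or None.

    Only documents that share a bucket in at least one hash table are
    compared, so the result is approximate: a true nearest neighbor that
    never collides with the query is missed.
    """
    dim = len(next(iter(embeddings.values())))
    projections = make_projections(dim, num_tables, seed)
    tables = [defaultdict(list) for _ in projections]
    hashes = {}
    for doc_id, vec in embeddings.items():
        hashes[doc_id] = bucket_ids(vec, projections, bucket_length)
        for table, h in zip(tables, hashes[doc_id]):
            table[h].append(doc_id)

    result = {}
    for doc_id, vec in embeddings.items():
        candidates = set()
        for table, h in zip(tables, hashes[doc_id]):
            candidates.update(table[h])
        candidates.discard(doc_id)
        ranked = sorted((math.dist(vec, embeddings[c]), c) for c in candidates)
        result[doc_id] = (ranked[0][1], ranked[0][0]) if ranked else None
    return result


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real sentence embeddings have far more dimensions.
    docs = {"a": [1.0, 2.0], "b": [1.0, 2.0], "c": [50.0, -30.0]}
    print(nearest_neighbors(docs))
```

Identical vectors always land in the same buckets, so `a` and `b` are guaranteed to find each other; whether `c` gets a neighbor depends on the random projections and on `bucket_length`.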
* update conda recipe
* rm python build configs
* update conda build instructions
* update python version reqs
* update recipe import test
---------
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>