Release/500 release candidate #13873

Merged 13 commits on Jul 3, 2023

maziyarpanahi and others added 3 commits June 19, 2023 08:50
…dings like Instructor-XL model (#13849)

* Added Instructor Embeddings

* Added Instructor Embeddings python code

* fixed broadcast bug

* fixed broadcast bug

* Changed test type to slow
maziyarpanahi and others added 3 commits July 1, 2023 15:09
* Add ONNX Runtime to the dependencies

* Add both CPU and GPU coordinates for onnxruntime

* Implement OnnxSerializeModel

* Implement OnnxWrapper

* Update error message for loading external models

* Add support for ONNX to BertEmbeddings annotator

* Add support for ONNX to BERT backend

* Add support for ONNX to DeBERTa

* Implement ONNX in DeBERTa backend

* Adapt Bert For sentence embeddings with the new backend

* Update unit test for BERT (temp)

* Update unit test for DeBERTa (temp)

* Update onnxruntime and google cloud dependencies

* Seems Apple Silicon and Aarch64 are supported in onnxruntime

* Cleaning up

* Remove bad merge

* Update BERT unit test

* Add fix me to the try

* Making withSafeOnnxModelLoader thread safe

* update onnxruntime

* Revert back to normal unit tests for now [skip test]

* Added ADT for ModelEngine (#13862)

Co-authored-by: Stefano Lori <s.lori@izicap.com>

* Optimize ONNX on CPU

* refactor

* Add ONNX support to DistilBERT

* Add support for ONNX in RoBERTa

* Fix the bad serialization on write

* Fix using the wrong object

---------

Co-authored-by: Stefano Lori <wolliq@users.noreply.github.com>
Co-authored-by: Stefano Lori <s.lori@izicap.com>
…ke e5-large-v2 model (#13859)

* Added E5 model

* changed test type

---------

Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
* Added maxInputLength. fixes #13829

* changed test types
@maziyarpanahi maziyarpanahi linked an issue Jul 1, 2023 that may be closed by this pull request
maziyarpanahi and others added 7 commits July 1, 2023 17:24
* Added doc similarity ranker annotator template

* Created ranker model

* gitignore modified

* Added params to LSH models

* Added BRP LSH as annotator engine

* Added replace features col with embeddings

* Added LSH logic on vector cast

* Added skeleton for lsh doc sim ranker - WIP

* Fixed mh3 hash calculation

* Fixed dataset assertions id vs neighbours

* Converting neighbours result string to map

* Added finisher to extract lsh id and neighbors

* Labels refactoring

* Added distance param to show in rankings

* Added logic to select nearest neighbor

* Added identity ranking for debugging

* Adding Python interface to doc sim ranker approach and model

* WIP - Python interface

* WIP - fixed unbalanced embeddings Py test

* Added MinHash engine to doc sim ranker

* Fixed serde for ranker map params

* Clean up pytests

* Added doc sim ranker finisher Python interface

* stabilized tests for doc sim ranker

* Moved and enriched test for doc sim ranker

* Bumped version 5.0.0 in doc sim ranker test

---------

Co-authored-by: Stefano Lori <s.lori@izicap.com>
* update conda recipe

* rm python build configs

* update conda build instructions

* update python version reqs

* update recipe import test

---------

Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
@maziyarpanahi maziyarpanahi merged commit cf9b75e into master Jul 3, 2023
Successfully merging this pull request may close these issues.

BART Summarization max tokens?
5 participants