release/530-release-candidate #14164
Merged
maziyarpanahi (Member) commented on Feb 6, 2024 (edited)
- [Issue#14129] Fix for spark.jsl.settings.storage.cluster_tmp_dir configuration #14132
- SPARKNLP-942: MPNet Classifiers #14147
- Sparknlp 876: Introducing LLAMA2 #14148
- Doc sim rank as retriever #14149
- adding import notebook + changing default model + adding onnx support #14158
- 812 implement de berta for zero shot classification annotator #14151
- SPARKNLP-886: Add Fine tuned sentence bert notebook #14152
- [SPARKNLP-986] Fixing optional input col validations #14153
- [SPARKNLP-984] Fixing Deberta notebooks URIs #14154
- SparkNLP 933: Introducing M2M100 : multilingual translation model #14155
- SPARKNLP-985: Make Whisper compatible with onnx_data files #14165
- Fixed a bug where models that have an 'onnx_data' file did not work in dbfs/hdfs #14169 (Whisper Large model)
- [SPARKNLP-940] Adding changes to correctly copy cluster index storage… #14167 (server-less like Glue)
- [SPARKNLP-988] Updating EntityRuler documentation #14168
- SPARKNLP-1000: Fix No Operation named [init_all_tables] for GPT2 #14177
- fixes python documentation #14172
- fixed all sbt warnings #14156
- onnxruntime 1.17.0
- new DBr TLS
- new EMR (7.0)
* SPARKNLP-942: MPNetForSequenceClassification
* SPARKNLP-942: MPNetForQuestionAnswering
* SPARKNLP-942: MPNet Classifiers Documentation
* Restore RobertaforQA bugfix
* introducing LLAMA2
* Added option to read model from model path to onnx wrapper
* Added option to read model from model path to onnx wrapper
* updated text description
* LLAMA2 python API
* added method to save onnx_data
* added position ids
* - updated Generate.scala to accept onnx tensors
  - added beam search support for LLAMA2
* updated max input length
* updated python default params changed test to slow test
* fixed serialization bug
* Added retrieval interface to the doc sim rank approach
* Added Python interface as retriever in doc sim ranker
---------
Co-authored-by: Stefano Lori <s.lori@izicap.com>
* adding code
* adding notebook for import
---------
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
…4155)
* introducing LLAMA2
* Added option to read model from model path to onnx wrapper
* Added option to read model from model path to onnx wrapper
* updated text description
* LLAMA2 python API
* added method to save onnx_data
* added position ids
* - updated Generate.scala to accept onnx tensors
  - added beam search support for LLAMA2
* updated max input length
* updated python default params changed test to slow test
* fixed serialization bug
* Added Scala code for M2M100
* Documentation for scala code
* Python API for M2M100
* added more tests for scala
* added tests for python
* added pretrained
* rewording
* fixed serialization bug
* fixed serialization bug
---------
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
maziyarpanahi added the enhancement, documentation, bug-fix, new-feature (Introducing a new feature), new model, and DON'T MERGE (Do not merge this PR) labels on Feb 6, 2024
Some annotators might have different naming schemes for their files. Added a parameter to control this.
maziyarpanahi changed the title from "remove file system url prefix (#14132)" to "release/530-release-candidate" on Feb 8, 2024
…bs/spark-nlp into release/530-release-candidate
#14167)
* [SPARKNLP-940] Adding changes to correctly copy cluster index storage when defined
* [SPARKNLP-940] Moving local mode control to its right place
* [SPARKNLP-940] Refactoring sentToCLuster method
Fixes `java.lang.IllegalArgumentException: No Operation named [init_all_tables] in the Graph` when the model needs to be deserialized. Deserialization is skipped when the model is already loaded, so the error appears only on the worker nodes and not on the driver. GPT2 does not contain tables, so it does not require this command.
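The idea behind this fix can be illustrated with a small guard: run the table-initialization operation only when the graph actually defines it. The sketch below is plain Python and is not the Spark NLP implementation; `maybe_init_tables`, `run_op`, and the example operation lists are hypothetical names used only for illustration. Only the operation name `init_all_tables` comes from the error message above.

```python
from typing import Callable, Iterable

# Operation name taken from the error message; everything else is hypothetical.
INIT_TABLES_OP = "init_all_tables"


def maybe_init_tables(
    operation_names: Iterable[str],
    run_op: Callable[[str], None],
    op_name: str = INIT_TABLES_OP,
) -> bool:
    """Run the table-init operation only if the graph defines it.

    Returns True when the operation was run, False when it was skipped.
    """
    if op_name in set(operation_names):
        run_op(op_name)
        return True
    return False


# Record which operations were "run" instead of calling a real session.
executed = []

# A GPT2-like graph without lookup tables (assumed operation names).
gpt2_like_ops = ["input_ids", "logits"]
# A graph that does define the table-init operation (assumed operation names).
with_tables_ops = ["input_ids", "init_all_tables", "logits"]

ran_gpt2 = maybe_init_tables(gpt2_like_ops, executed.append)
ran_tables = maybe_init_tables(with_tables_ops, executed.append)
```

With a guard like this, a model without tables skips the initialization step on every node, while a model that defines the operation still runs it.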
…warnings-in-SBT-build fixed all sbt warnings
This reverts commit eb91fde.
…bs/spark-nlp into release/530-release-candidate