SPARKNLP-785 Fix WordEmbeddingsModel Bug with LightPipeline #13715
Merged: maziyarpanahi merged 1 commit into release/440-release-candidate from bug/SPARKNLP-785-setEnableInMemoryStorage-param-is-not-compatible-with-LightPipeline-due-to-None.get-error on Apr 6, 2023
maziyarpanahi changed the base branch from master to release/440-release-candidate on April 6, 2023, 18:03
maziyarpanahi approved these changes on Apr 6, 2023
maziyarpanahi added a commit that referenced this pull request on Apr 10, 2023:
* SPARKNLP-782 Removes deprecated parameter enablePatternRegex (#13664) * SPARKNLP-748: Custom Entity Name for Date2Chunk (#13680) - added parameter "entityName" to change metadata name * SPARKNLP-784 Fix loading WordEmbeddingsModel bug when cache_folder is from S3 (#13707) * SPARKNLP-605: ConvNextForImageClassification (#13713) * SPARKNLP-605: ConvNextForImageClassification - Added ConvNextForImageClassification with new tests - Refactored image Preprocessor and added new config - Implemented filters with resample property for ImageResizeUtils.resizeBufferedImage (with minor performance gain) - Minor improvements for ViT and Swin * SPARKNLP-605: Docs * SPARKNLP-605: Lazy values for test * SPARKNLP-785 Fix WordEmbeddingsModel bug whit LightPipeline (#13715) * [skip test] SPARKNLP-783: Python 3.6 deprecated in Spark 3.2 (#13724) * SPARKNLP-763 Implementing ZeroShot Text Classification for BERT and DistilBERT based on NLI (#13727) * SPARKNLP-763 Fix a typo Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 add unfinished traits Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Create a new BertForZeroShotClassification annotator Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Create a new HasCandidateLabelsProperties Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Implement predict sequence with NLI, new tokenize from strings, and new tag ZeroShot Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Clean up the code Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Add BertForZeroShotClassification to annotator [skip test] Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Add BertForZeroShotClassification to ResourceDownloader [skip test] Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Implement BertForZeroShotClassification in Python [skip test] Signed-off-by: Maziyar Panahi 
<maziyar.panahi@iscpif.fr> * SPARKNLP-763 Add unit tests for BertForZeroShotClassification Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * change default model to bert_base_cased_zero_shot_classifier_xnli * SPARKNLP-763 Fix Scaladoc and Pydoc Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-763 Fix Update unit test in Scala Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> --------- Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * Sparknlp 534 Introducing BART Transformer for text-to-text generation tasks like translation and summarization (#13731) * WIP: Added Bart transformer scala files * WIP: Added BART tokenizer test and BART is locally working * WIP: Added BART tokenizer test and BART is locally working * WIP: Added Beam Hypothesis and Beam Scorer implementations * WIP: Added Logit Processors * WIP: Added Beam Search implementation * WIP: Completed Beam Search implementation WIP: Added Generate method for text generation * WIP: fixed a bug in Beam search algorithm WIP: Generate method for text generation * WIP: changed BartTransformer methods to include beam size and added description * WIP: changed BartTransformer test methods * WIP: fixed errors in BeamSearch * WIP: Updated to use separate encoder decoder model * WIP: Changed model to handle the int64 version of the model weights * WIP: Added python API implementation * Pass session and encoder state as a parameter Clean up unnecessary code * Update TopK Logit Warper Logic * Code clean up * Update Tests * Update documentation * Update documentation and python tests * Update python tests * SPARKNLP-534 move BartTokenizer to the Bart backend Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-534 Fix the copyright year Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-534 Add BartTransformer to annotator and ResourceDownloader Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-534 Fix BartTransformer in annotator 
Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> --------- Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * Bump version to 4.4.0 * Update doc style and fix unit test [skip test] Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-605: Fix parameter eval for vit tests * Update default model name (#13744) * SPARKNLP-796 Creating a new `nerHasNoSchema` param (#13745) * Adding missing CPUvsGPUbenchmark page * SPARKNLP-796 Creating a new `nerHasNoSchema` param Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> --------- Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * Change default model for BART to distilbart-xsum-12-6 Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * Change default model for BART to distilbart_xsum_12_6 Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * Replace nlp with sparknlp.org website * Update INT64 to INT32 (#13748) * Fix the wrong column in unit test [skip test] Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * SPARKNLP-805: Documentation for release/440 (#13743) * Fixed memory leak * Added Bart Notebook * Add new features and update docs[run doc] * Update install.md * Update CHANGELOG [run doc] * Update Scala and Python APIs * release spark-nlp 4.4.0 on Conda [skip test] --------- Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> Co-authored-by: Danilo Burbano <37355249+danilojsl@users.noreply.github.com> Co-authored-by: Devin Ha <33089471+DevinTDHa@users.noreply.github.com> Co-authored-by: Prabod Rathnayaka <prabod@rathnayaka.me> Co-authored-by: Devin Ha <t.ha@tu-berlin.de> Co-authored-by: github-actions <action@github.com>
jsl-builder pushed a commit that referenced this pull request on Apr 12, 2023
Description
When using `WordEmbeddingsModel` with `setEnableInMemoryStorage` set to true in a `LightPipeline`, it raised a `None.get` error.
Motivation and Context
Make `WordEmbeddingsModel` available in `LightPipeline`.
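For readers unfamiliar with the error message: in Scala, `None.get` is what `java.util.NoSuchElementException` reports when `.get` is called on an empty `Option`. The sketch below is only an illustration of that failure mode and of one common way to guard against it. It is not the change made in this PR, and `EmbeddingsLookup`, `init`, `unsafeLookup` and `safeLookup` are hypothetical names that do not exist in Spark NLP. It assumes, purely for illustration, that some state is prepared on one code path and skipped on another.

```scala
import scala.util.Try

// Hypothetical stand-in for a component whose backing store is set up lazily.
// None of these names come from Spark NLP.
final class EmbeddingsLookup {
  // Stays None until init() runs; models state that one code path
  // prepares and another (lighter) code path might skip.
  private var reader: Option[Map[String, Array[Float]]] = None

  def init(vectors: Map[String, Array[Float]]): Unit =
    reader = Some(vectors)

  // Unsafe: calling .get on an empty Option throws
  // java.util.NoSuchElementException with the message "None.get".
  def unsafeLookup(token: String): Option[Array[Float]] =
    reader.get.get(token)

  // Safer: initialize on first use instead of assuming init() already ran.
  def safeLookup(
      token: String,
      load: () => Map[String, Array[Float]]): Option[Array[Float]] = {
    if (reader.isEmpty) init(load())
    reader.flatMap(_.get(token))
  }
}

object NoneGetDemo {
  def main(args: Array[String]): Unit = {
    val lookup = new EmbeddingsLookup

    // The unguarded access fails because init() was never called.
    val failed = Try(lookup.unsafeLookup("hello"))
    println(failed.failed.map(_.getMessage).getOrElse("no error"))

    // The guarded access loads the data on demand and succeeds.
    val vector = lookup.safeLookup("hello", () => Map("hello" -> Array(0.1f, 0.2f)))
    println(vector.map(_.length))
  }
}
```

Whether the actual fix initializes storage on demand or takes another route is not visible from this page; the sketch only shows why an unguarded `Option` access surfaces as `None.get`.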