Release/440 release candidate #13742

Merged
maziyarpanahi merged 27 commits into master on Apr 10, 2023

Conversation

maziyarpanahi
Member

@maziyarpanahi maziyarpanahi commented Apr 6, 2023

Description

Explanation of Changes

SPARKNLP-782 Removes deprecated parameter enablePatternRegex (https://github.com/JohnSnowLabs/spark-nlp/pull/13664)

5259a4d
@DevinTDHa
SPARKNLP-748: Custom Entity Name for Date2Chunk (https://github.com/JohnSnowLabs/spark-nlp/pull/13680)

87127c6
@danilojsl
SPARKNLP-784 Fix loading WordEmbeddingsModel bug when cache_folder is…

dbad9f2
@DevinTDHa
SPARKNLP-605: ConvNextForImageClassification (https://github.com/JohnSnowLabs/spark-nlp/pull/13713)

bf4428d
@danilojsl
SPARKNLP-785 Fix WordEmbeddingsModel bug with LightPipeline (https://github.com/JohnSnowLabs/spark-nlp/pull/13715)

b6b8cc6
@DevinTDHa
[skip test] SPARKNLP-783: Python 3.6 deprecated in Spark 3.2 (https://github.com/JohnSnowLabs/spark-nlp/pull/13724)

5109859
@maziyarpanahi
SPARKNLP-763 Implementing ZeroShot Text Classification for BERT and D…

79d5976
@prabod
@maziyarpanahi
Sparknlp 534 Introducing BART Transformer for text-to-text generation…

danilojsl and others added 8 commits April 6, 2023 19:56
- added parameter "entityName" to change metadata name
* SPARKNLP-605: ConvNextForImageClassification

- Added ConvNextForImageClassification with new tests
- Refactored image Preprocessor and added new config
- Implemented filters with resample property for
  ImageResizeUtils.resizeBufferedImage (with minor
  performance gain)
- Minor improvements for ViT and Swin

* SPARKNLP-605: Docs

* SPARKNLP-605: Lazy values for test
…istilBERT based on NLI (#13727)

* SPARKNLP-763 Fix a typo

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 add unfinished traits

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Create a new BertForZeroShotClassification annotator

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Create a new HasCandidateLabelsProperties

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Implement predict sequence with NLI, new tokenize from strings, and new tag ZeroShot

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Clean up the code

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Add BertForZeroShotClassification to annotator [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Add BertForZeroShotClassification to ResourceDownloader [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Implement BertForZeroShotClassification in Python [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Add unit tests for BertForZeroShotClassification

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* change default model to bert_base_cased_zero_shot_classifier_xnli

* SPARKNLP-763 Fix Scaladoc and Pydoc

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Fix Update unit test in Scala

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

---------

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
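
Note on the SPARKNLP-763 commits above: they mention predicting a sequence "with NLI" from a set of candidate labels. A minimal sketch of that general idea follows, assuming the common approach where each candidate label is turned into a hypothesis sentence and the entailment logits are normalized with a softmax. The names `zero_shot_classify`, `toy_nli_scorer`, and the hypothesis template are hypothetical and are not taken from Spark NLP; the toy scorer stands in for a real NLI model.

```python
import math
from typing import Callable, Dict, List, Tuple

# A scorer returns (contradiction, neutral, entailment) logits for a
# premise/hypothesis pair. Hypothetical interface, not the Spark NLP API.
NliScorer = Callable[[str, str], Tuple[float, float, float]]


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(
    text: str,
    candidate_labels: List[str],
    nli_scorer: NliScorer,
    template: str = "This example is {}.",  # assumed template
) -> Dict[str, float]:
    """Score each label by how strongly the text entails its hypothesis."""
    entailment_logits = []
    for label in candidate_labels:
        hypothesis = template.format(label)
        _, _, entailment = nli_scorer(text, hypothesis)
        entailment_logits.append(entailment)
    return dict(zip(candidate_labels, softmax(entailment_logits)))


def toy_nli_scorer(premise: str, hypothesis: str) -> Tuple[float, float, float]:
    """Stand-in for a real NLI model: entailment if the label word occurs."""
    label_word = hypothesis.lower().rstrip(".").split()[-1]
    hit = 1.0 if label_word in premise.lower().split() else 0.0
    return (1.0 - hit, 0.0, 3.0 * hit)


scores = zero_shot_classify(
    "The team won the football match",
    ["football", "politics", "weather"],
    toy_nli_scorer,
)
best_label = max(scores, key=scores.get)
print(best_label, scores)
```

With a real NLI model in place of the toy scorer, the same loop would let the label set change at inference time without retraining, which is the point of the zero-shot setup.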
… tasks like translation and summarization (#13731)

* WIP: Added Bart transformer scala files

* WIP: Added BART tokenizer test and BART is locally working

* WIP: Added BART tokenizer test and BART is locally working

* WIP: Added Beam Hypothesis and Beam Scorer implementations

* WIP: Added Logit Processors

* WIP: Added Beam Search implementation

* WIP: Completed Beam Search implementation
WIP: Added Generate method for text generation

* WIP: fixed a bug in Beam search algorithm
WIP: Generate method for text generation

* WIP: changed BartTransformer methods to include beam size and added description

* WIP: changed BartTransformer test methods

* WIP: fixed errors in BeamSearch

* WIP: Updated to use separate encoder decoder model

* WIP: Changed model to handle the int64 version of the model weights

* WIP: Added python API implementation

* Pass session and encoder state as a parameter
Clean up unnecessary code

* Update TopK Logit Warper Logic

* Code clean up

* Update Tests

* Update documentation

* Update documentation and python tests

* Update python tests

* SPARKNLP-534 move BartTokenizer to the Bart backend

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-534 Fix the copyright year

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-534 Add BartTransformer to annotator and ResourceDownloader

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-534 Fix BartTransformer in annotator

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

---------

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
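
Note on the BART commits above: several of them add a beam search for text generation. As a rough illustration of the technique only, here is a minimal sketch assuming a step function that returns log-probabilities for the next token. It omits the length penalty, logit processors, and batching that a production implementation would need. The names `beam_search` and `toy_step` and the toy probability table are hypothetical and do not come from the Spark NLP code.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens so far, summed log-probability)


def beam_search(
    step_fn: Callable[[List[str]], Dict[str, float]],
    bos: str,
    eos: str,
    beam_size: int,
    max_len: int,
) -> Hypothesis:
    """Keep the `beam_size` best partial sequences at every step."""
    beams: List[Hypothesis] = [([bos], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live beam by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for token, log_prob in step_fn(tokens).items():
                candidates.append((tokens + [token], score + log_prob))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # Keep the best candidates; move completed ones out of the beam.
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished beams if max_len was hit
    return max(finished, key=lambda c: c[1])


def toy_step(tokens: List[str]) -> Dict[str, float]:
    """Toy next-token model where the greedy first choice is not the best."""
    table = {
        "<s>": {"a": 0.6, "b": 0.4},
        "a": {"x": 0.5, "y": 0.5},
        "b": {"z": 0.9, "w": 0.1},
    }
    probs = table.get(tokens[-1], {"</s>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}


greedy_tokens, greedy_score = beam_search(toy_step, "<s>", "</s>", beam_size=1, max_len=5)
beam_tokens, beam_score = beam_search(toy_step, "<s>", "</s>", beam_size=2, max_len=5)
print(greedy_tokens, math.exp(greedy_score))
print(beam_tokens, math.exp(beam_score))
```

In this toy table a beam of size 1 commits to "a" and ends with probability 0.30, while a beam of size 2 keeps "b" alive and finds the 0.36 sequence.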
@maziyarpanahi maziyarpanahi self-assigned this Apr 6, 2023
@maziyarpanahi maziyarpanahi added the enhancement, documentation, bug-fix, new-feature, models_hub, new model, and DON'T MERGE labels Apr 6, 2023
@maziyarpanahi
Member Author

@DevinTDHa maybe we should ignore/SlowTest ConvNextForImageClassificationTestSpec? https://github.com/JohnSnowLabs/spark-nlp/actions/runs/4631721818/jobs/8194935232#step:7:3458

@DevinTDHa
Member

DevinTDHa commented Apr 7, 2023

> @DevinTDHa maybe we should ignore/SlowTest ConvNextForImageClassificationTestSpec? https://github.com/JohnSnowLabs/spark-nlp/actions/runs/4631721818/jobs/8194935232#step:7:3458

@maziyarpanahi I pushed a commit (47d8e51) with a fix!

prabod and others added 9 commits April 7, 2023 13:01
* Adding missing CPUvsGPUbenchmark page

* SPARKNLP-796 Creating a new `nerHasNoSchema` param

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

---------

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
@maziyarpanahi maziyarpanahi merged commit 684f8c9 into master Apr 10, 2023
jsl-builder pushed a commit that referenced this pull request Apr 12, 2023
* SPARKNLP-782 Removes deprecated parameter enablePatternRegex (#13664)

* SPARKNLP-748: Custom Entity Name for Date2Chunk (#13680)

- added parameter "entityName" to change metadata name

* SPARKNLP-784 Fix loading WordEmbeddingsModel bug when cache_folder is from S3 (#13707)

* SPARKNLP-605: ConvNextForImageClassification (#13713)

* SPARKNLP-605: ConvNextForImageClassification

- Added ConvNextForImageClassification with new tests
- Refactored image Preprocessor and added new config
- Implemented filters with resample property for
  ImageResizeUtils.resizeBufferedImage (with minor
  performance gain)
- Minor improvements for ViT and Swin

* SPARKNLP-605: Docs

* SPARKNLP-605: Lazy values for test

* SPARKNLP-785 Fix WordEmbeddingsModel bug with LightPipeline (#13715)

* [skip test] SPARKNLP-783: Python 3.6 deprecated in Spark 3.2 (#13724)

* SPARKNLP-763 Implementing ZeroShot Text Classification for BERT and DistilBERT based on NLI (#13727)

* SPARKNLP-763 Fix a typo

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 add unfinished traits

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Create a new BertForZeroShotClassification annotator

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Create a new HasCandidateLabelsProperties

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Implement predict sequence with NLI, new tokenize from strings, and new tag ZeroShot

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Clean up the code

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Add BertForZeroShotClassification to annotator [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Add BertForZeroShotClassification to ResourceDownloader [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Implement BertForZeroShotClassification in Python [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Add unit tests for BertForZeroShotClassification

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* change default model to bert_base_cased_zero_shot_classifier_xnli

* SPARKNLP-763 Fix Scaladoc and Pydoc

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-763 Fix Update unit test in Scala

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

---------

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* Sparknlp 534 Introducing BART Transformer for text-to-text generation tasks like translation and summarization (#13731)

* WIP: Added Bart transformer scala files

* WIP: Added BART tokenizer test and BART is locally working

* WIP: Added BART tokenizer test and BART is locally working

* WIP: Added Beam Hypothesis and Beam Scorer implementations

* WIP: Added Logit Processors

* WIP: Added Beam Search implementation

* WIP: Completed Beam Search implementation
WIP: Added Generate method for text generation

* WIP: fixed a bug in Beam search algorithm
WIP: Generate method for text generation

* WIP: changed BartTransformer methods to include beam size and added description

* WIP: changed BartTransformer test methods

* WIP: fixed errors in BeamSearch

* WIP: Updated to use separate encoder decoder model

* WIP: Changed model to handle the int64 version of the model weights

* WIP: Added python API implementation

* Pass session and encoder state as a parameter
Clean up unnecessary code

* Update TopK Logit Warper Logic

* Code clean up

* Update Tests

* Update documentation

* Update documentation and python tests

* Update python tests

* SPARKNLP-534 move BartTokenizer to the Bart backend

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-534 Fix the copyright year

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-534 Add BartTransformer to annotator and ResourceDownloader

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-534 Fix BartTransformer in annotator

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

---------

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* Bump version to 4.4.0

* Update doc style and fix unit test [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-605: Fix parameter eval for vit tests

* Update default model name (#13744)

* SPARKNLP-796 Creating a new `nerHasNoSchema` param (#13745)

* Adding missing CPUvsGPUbenchmark page

* SPARKNLP-796 Creating a new `nerHasNoSchema` param

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

---------

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* Change default model for BART to distilbart-xsum-12-6

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* Change default model for BART to distilbart_xsum_12_6

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* Replace nlp with sparknlp.org website

* Update INT64 to INT32 (#13748)

* Fix the wrong column in unit test [skip test]

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>

* SPARKNLP-805: Documentation for release/440 (#13743)

* Fixed memory leak

* Added Bart Notebook

* Add new features and update docs[run doc]

* Update install.md

* Update CHANGELOG [run doc]

* Update Scala and Python APIs

* release spark-nlp 4.4.0 on Conda [skip test]

---------

Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
Co-authored-by: Danilo Burbano <37355249+danilojsl@users.noreply.github.com>
Co-authored-by: Devin Ha <33089471+DevinTDHa@users.noreply.github.com>
Co-authored-by: Prabod Rathnayaka <prabod@rathnayaka.me>
Co-authored-by: Devin Ha <t.ha@tu-berlin.de>
Co-authored-by: github-actions <action@github.com>
Labels
bug-fix, documentation, DON'T MERGE, enhancement, models_hub, new model, new-feature
5 participants