
Spark NLP 4.4.3: Patch release

@maziyarpanahi released this 26 May 12:13
d5abbc0

📢 Overview

Spark NLP 4.4.3 🚀 comes with a new parameter to switch from multi-class to multi-label in all of our classifiers (including ZeroShot), extended support for downloading models directly from an S3 path in ResourceDownloader, bug fixes, and improvements!

We want to thank our community for their valuable feedback, feature requests, and contributions. Our Models Hub now contains over 18,000 free and truly open-source models & pipelines. 🎉

Spark NLP has a new home! https://sparknlp.org is where you can find all the documentation, models, and demos for Spark NLP. It aims to provide valuable resources to anyone interested in 100% open-source NLP solutions by using Spark NLP 🚀


⭐ New Features & Enhancements

  • New multilabel parameter to switch from multi-class to multi-label on all Classifiers in Spark NLP: AlbertForSequenceClassification, BertForSequenceClassification, DeBertaForSequenceClassification, DistilBertForSequenceClassification, LongformerForSequenceClassification, RoBertaForSequenceClassification, XlmRoBertaForSequenceClassification, XlnetForSequenceClassification, BertForZeroShotClassification, DistilBertForZeroShotClassification, and RobertaForZeroShotClassification
  • Refactor protected Params and Features to avoid unwanted exceptions during runtime #13797
  • Add proper documentation and instructions for ZeroShot classifiers: BertForZeroShotClassification, DistilBertForZeroShotClassification, and RobertaForZeroShotClassification #13798
  • Extend support for downloading models/pipelines directly by given name or S3 path in ResourceDownloader #13796
from sparknlp.pretrained import ResourceDownloader

# partial S3 path
ResourceDownloader.downloadModelDirectly("public/models/albert_base_sequence_classifier_ag_news_en_3.4.0_3.0_1639648298937.zip", remote_loc = "public/models")

# full S3 path
ResourceDownloader.downloadModelDirectly("s3://auxdata.johnsnowlabs.com/public/models/albert_base_sequence_classifier_ag_news_en_3.4.0_3.0_1639648298937.zip", remote_loc = "public/models", unzip = False)
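To illustrate what switching from multi-class to multi-label means for the classifiers listed above, here is a minimal plain-Python sketch. It is not Spark NLP's implementation: it assumes the conventional approach, where multi-class picks the single best label after a softmax and multi-label applies an independent sigmoid to each label and keeps every label above a threshold. The function names, the `threshold` default, and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Normalize logits into probabilities that sum to 1 (labels compete)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Squash one logit into (0, 1), independently of the other labels."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(logits, labels, multilabel=False, threshold=0.5):
    """Hypothetical helper, not a Spark NLP API.

    multilabel=False: return the single highest-scoring label (multi-class).
    multilabel=True:  return every label whose sigmoid score reaches the
                      threshold (assumed behaviour for multi-label).
    """
    if multilabel:
        scores = [sigmoid(x) for x in logits]
        return [lab for lab, s in zip(labels, scores) if s >= threshold]
    scores = softmax(logits)
    best = max(range(len(labels)), key=lambda i: scores[i])
    return [labels[best]]


labels = ["sports", "politics", "technology"]
logits = [2.0, 1.5, -3.0]  # made-up scores for one document

print(predict(logits, labels))                   # multi-class
print(predict(logits, labels, multilabel=True))  # multi-label
```

With these made-up logits, the multi-class call returns only the top label, while the multi-label call returns both labels whose individual scores clear the threshold.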

🐛 Bug Fixes

  • Fix pretrained pipelines that had stopped working on PySpark 3.0 and 3.1 since the 4.4.2 release (123 new pipelines were added) #13805
  • Fix pretrained pipelines that had stopped working on PySpark 3.4 since the 4.4.2 release (120 new pipelines were added) #13828
  • Fix Java compatibility issue caused by SystemUtils dependency #13806

Known issue:
Current pre-trained pipelines don't work on PySpark 3.2 and 3.3. They will all be fixed in the next few days.
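
Until that fix lands, a script that loads pretrained pipelines could guard against the affected versions. The helper below is a hypothetical sketch, not part of Spark NLP: it only encodes the version list from the known issue above (PySpark 3.2 and 3.3) and assumes the version string starts with `major.minor`.

```python
def pretrained_pipelines_affected(pyspark_version: str) -> bool:
    """Return True if this PySpark version is listed in the 4.4.3 known issue.

    Hypothetical helper: the affected set (3.2 and 3.3) is taken from the
    release note; anything else is assumed to be unaffected.
    """
    major, minor = (int(part) for part in pyspark_version.split(".")[:2])
    return (major, minor) in {(3, 2), (3, 3)}


for version in ["3.1.2", "3.2.3", "3.3.1", "3.4.0"]:
    status = "affected" if pretrained_pipelines_affected(version) else "ok"
    print(version, status)
```

In a real session the argument would typically be `pyspark.__version__`.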


📖 Documentation


❀️ Community support

  • Slack: for live discussion with the Spark NLP community and the team
  • GitHub: bug reports, feature requests, and contributions
  • Discussions: engage with other community members, share ideas, and show off how you use Spark NLP!
  • Medium: Spark NLP articles
  • YouTube: Spark NLP video tutorials

Installation

Python

#PyPI

pip install spark-nlp==4.4.3

Spark Packages

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x (Scala 2.12):

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.4.3

pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.4.3

GPU

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.4.3

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.4.3

Apple Silicon (M1 & M2)

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:4.4.3

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:4.4.3

AArch64

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.4.3

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.4.3

Maven

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, and 3.3.x:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp_2.12</artifactId>
    <version>4.4.3</version>
</dependency>

spark-nlp-gpu:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-gpu_2.12</artifactId>
    <version>4.4.3</version>
</dependency>

spark-nlp-silicon:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-silicon_2.12</artifactId>
    <version>4.4.3</version>
</dependency>

spark-nlp-aarch64:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-aarch64_2.12</artifactId>
    <version>4.4.3</version>
</dependency>

FAT JARs

What's Changed

New Contributors

Full Changelog: 4.4.2...4.4.3