Spark NLP 4.4.4: Patch release
Overview
Spark NLP 4.4.4 is a patch release with bug fixes and other improvements. We want to thank our community for their valuable feedback, feature requests, and contributions. Our Models Hub now contains over 17,000 free and truly open-source models & pipelines.
Spark NLP has a new home! https://sparknlp.org is where you can find all the documentation, models, and demos for Spark NLP. It aims to provide valuable resources to anyone interested in 100% open-source NLP solutions by using Spark NLP.
New Features & Enhancements
- Add a Warmup stage when loading all Transformers for word embeddings: ALBERT, BERT, CamemBERT, DistilBERT, RoBERTa, XLM-RoBERTa, and XLNet. This helps reduce the first inference time and also validates importing external models from HuggingFace #13851
- Add new notebooks to import ZeroShot Classifiers for BERT, DistilBERT, and RoBERTa fine-tuned on NLI datasets #13845
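Once a zero-shot classifier is available, using it in a pipeline might look like the sketch below. It assumes pyspark and spark-nlp are installed and that the annotator's default pretrained model can be downloaded. The helper names, the candidate labels, and the column names ("text", "document", "token", "class") are illustrative choices, not part of this release.

```python
# Illustrative labels only; replace them with your own categories.
CANDIDATE_LABELS = ("urgent", "mobile", "travel", "movie", "music", "sport")


def build_zero_shot_pipeline(candidate_labels=CANDIDATE_LABELS):
    """Assemble a pipeline around BertForZeroShotClassification (hypothetical helper)."""
    # Imported inside the function so this module still loads on machines
    # where pyspark and spark-nlp are not installed.
    from pyspark.ml import Pipeline
    from sparknlp.annotator import BertForZeroShotClassification, Tokenizer
    from sparknlp.base import DocumentAssembler

    document_assembler = (
        DocumentAssembler().setInputCol("text").setOutputCol("document")
    )
    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
    # pretrained() with no arguments fetches the annotator's default model.
    classifier = (
        BertForZeroShotClassification.pretrained()
        .setInputCols(["document", "token"])
        .setOutputCol("class")
        .setCandidateLabels(list(candidate_labels))
    )
    return Pipeline(stages=[document_assembler, tokenizer, classifier])


def classify(texts, candidate_labels=CANDIDATE_LABELS):
    """Run the pipeline on a list of strings and return text plus predicted label."""
    import sparknlp

    spark = sparknlp.start()
    data = spark.createDataFrame([[t] for t in texts]).toDF("text")
    model = build_zero_shot_pipeline(candidate_labels).fit(data)
    return model.transform(data).select("text", "class.result")
```

Calling `classify(["I have a problem with my phone that needs to be resolved asap!"]).show(truncate=False)` would start a Spark session and print one predicted label per input row.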
Bug Fixes
- Fix not being able to save models from XXXForSequenceClassification and XXXForZeroShotClassification annotators #13842
- Fix pretrained pipelines that stopped working since the 4.4.2 release on PySpark 3.2 and 3.3 versions (121 new pipelines were added) #13836
New Notebooks
Annotator | Notebook |
---|---|
BertForZeroShotClassification | HuggingFace in Spark NLP - BertForZeroShotClassification |
DistilBertForZeroShotClassification | HuggingFace in Spark NLP - DistilBertForZeroShotClassification |
RoBertaForZeroShotClassification | HuggingFace in Spark NLP - RoBertaForZeroShotClassification |
- You can visit Import Transformers in Spark NLP
- You can visit Spark NLP Examples for 100+ examples
Documentation
- Import models from TF Hub & HuggingFace
- Spark NLP Notebooks
- Models Hub with new models
- Spark NLP Articles
- Spark NLP in Action
- Spark NLP Documentation
- Spark NLP Scala APIs
- Spark NLP Python APIs
Community support
- Slack For live discussion with the Spark NLP community and the team
- GitHub Bug reports, feature requests, and contributions
- Discussions Engage with other community members, share ideas, and show off how you use Spark NLP!
- Medium Spark NLP articles
- JohnSnowLabs official Medium
- YouTube Spark NLP video tutorials
Installation
Python
#PyPI
pip install spark-nlp==4.4.4
Spark Packages
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x (Scala 2.12):
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.4.4
pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.4.4
GPU
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.4.4
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:4.4.4
Apple Silicon (M1 & M2)
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:4.4.4
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:4.4.4
AArch64
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.4.4
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:4.4.4
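The coordinates above all follow one pattern: group ID, artifact name with the Scala suffix, then the version. As a minimal sketch, a script that launches Spark on different hardware could build them as below. The function names and the variant keys ("cpu", "gpu", "silicon", "aarch64") are hypothetical and only mirror the headings in this section.

```python
GROUP_ID = "com.johnsnowlabs.nlp"
SCALA_VERSION = "2.12"

# Hypothetical variant keys mapped to the artifact names listed above.
ARTIFACTS = {
    "cpu": "spark-nlp",
    "gpu": "spark-nlp-gpu",
    "silicon": "spark-nlp-silicon",
    "aarch64": "spark-nlp-aarch64",
}


def maven_coordinate(variant="cpu", version="4.4.4"):
    """Return the group:artifact:version string for a Spark NLP build."""
    if variant not in ARTIFACTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(ARTIFACTS)}")
    return f"{GROUP_ID}:{ARTIFACTS[variant]}_{SCALA_VERSION}:{version}"


def packages_command(launcher="pyspark", variant="cpu", version="4.4.4"):
    """Return the launcher command line with the matching --packages argument."""
    return f"{launcher} --packages {maven_coordinate(variant, version)}"


if __name__ == "__main__":
    for name in ARTIFACTS:
        print(packages_command("spark-shell", name))
```

Raising `ValueError` on an unknown variant keeps a typo from silently producing a coordinate that does not exist.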
Maven
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp_2.12</artifactId>
<version>4.4.4</version>
</dependency>
spark-nlp-gpu:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu_2.12</artifactId>
<version>4.4.4</version>
</dependency>
spark-nlp-silicon:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-silicon_2.12</artifactId>
<version>4.4.4</version>
</dependency>
spark-nlp-aarch64:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-aarch64_2.12</artifactId>
<version>4.4.4</version>
</dependency>
FAT JARs
- CPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-4.4.4.jar
- GPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-4.4.4.jar
- M1 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-silicon-assembly-4.4.4.jar
- AArch64 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-aarch64-assembly-4.4.4.jar
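For offline or air-gapped setups, a provisioning script may need to pick the right FAT JAR by hardware target. The sketch below derives the four URLs listed above from one template. The function name and variant keys are hypothetical, and only the 4.4.4 URLs are confirmed by this release; reusing the template for other versions is an assumption.

```python
BASE_URL = "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars"

# Hypothetical variant keys mapped to the assembly file prefixes listed above.
ASSEMBLIES = {
    "cpu": "spark-nlp-assembly",
    "gpu": "spark-nlp-gpu-assembly",
    "silicon": "spark-nlp-silicon-assembly",
    "aarch64": "spark-nlp-aarch64-assembly",
}


def fat_jar_url(variant="cpu", version="4.4.4"):
    """Return the FAT JAR download URL for the given build variant."""
    if variant not in ASSEMBLIES:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(ASSEMBLIES)}")
    return f"{BASE_URL}/{ASSEMBLIES[variant]}-{version}.jar"


if __name__ == "__main__":
    for name in ASSEMBLIES:
        print(name, fat_jar_url(name))
```

The downloaded file can then be passed to `spark-shell` or `pyspark` through Spark's `--jars` option instead of `--packages`.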
What's Changed
- Models hub by @maziyarpanahi in #13837
- FEATURE NMH-175: Add Copy to s3 on open source models [skip-test] by @KshitizGIT in #13844
- FEATURE NMH-175: Remove models with missing s3 [skip-test] by @KshitizGIT in #13847
- Resolve saving bug with multilabel parameter by @DevinTDHa in #13842
- SPARKNLP-815: Add examples for ZeroShotClassifiers by @DevinTDHa in #13845
- SPARKNLP 801 set up warmup for all embeddings by @maziyarpanahi in #13851
- Sparknlp 801 set up warmup for all embeddings classifiers by @maziyarpanahi in #13852
Full Changelog: 4.4.3...4.4.4