Spark NLP 5.0.1: Patch release
Overview
Spark NLP 5.0.1 is a patch release with bug fixes and other improvements. We want to thank our community for their valuable feedback, feature requests, and contributions. Our Models Hub now contains over 18,000 free and truly open-source models & pipelines.
Bug Fixes & Enhancements
- Fix `multiLabel` param issue in `XXXForSequenceClassification` and `XXXForZeroShotClassification` annotators
- Add the missing `threshold` param to all `XXXForSequenceClassification` annotators in Python
- Fix issue with passing the `spark.driver.cores` config as a param into the `start()` function in Python and Scala
- Fix 600+ model cards on Models Hub with duplicated code snippets
- Add new notebooks to export BERT, DistilBERT, RoBERTa, and DeBERTa models to ONNX format
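
The `spark.driver.cores` fix concerns how Spark configuration is handed to `start()`. As a minimal sketch, assuming the Python `sparknlp.start()` accepts a `params` dictionary of Spark configs, the setting could be passed as shown here. The helper names `driver_core_params` and `start_with_driver_cores` are hypothetical and exist only for illustration.

```python
def driver_core_params(cores, extra=None):
    """Return a Spark config dict that sets spark.driver.cores (hypothetical helper)."""
    cores = int(cores)
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    params = dict(extra or {})  # copy so the caller's dict is not mutated
    params["spark.driver.cores"] = str(cores)
    return params


def start_with_driver_cores(cores, extra=None):
    """Start a Spark NLP session with the requested number of driver cores.

    Assumption: sparknlp.start() takes a `params` dict of Spark configs.
    The import is lazy so driver_core_params() works without Spark NLP installed.
    """
    import sparknlp

    return sparknlp.start(params=driver_core_params(cores, extra))
```

Calling `start_with_driver_cores(4)` would then request four driver cores, provided Spark NLP 5.0.1 and PySpark are installed.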
New Notebooks
Spark NLP | Notebooks | Colab |
---|---|---|
BertEmbeddings | HuggingFace in Spark NLP - BERT | BERT |
DistilBertEmbeddings | HuggingFace in Spark NLP - DistilBERT | DistilBERT |
RoBertaEmbeddings | HuggingFace in Spark NLP - RoBERTa | RoBERTa |
DeBertaEmbeddings | HuggingFace in Spark NLP - DeBERTa | DeBERTa |
- You can visit Import Transformers in Spark NLP
- You can visit Spark NLP Examples for 100+ examples
Documentation
- Import models from TF Hub & HuggingFace
- Spark NLP Notebooks
- Models Hub with new models
- Spark NLP Articles
- Spark NLP in Action
- Spark NLP Documentation
- Spark NLP Scala APIs
- Spark NLP Python APIs
Community support
- Slack For live discussion with the Spark NLP community and the team
- GitHub Bug reports, feature requests, and contributions
- Discussions Engage with other community members, share ideas, and show off how you use Spark NLP!
- Medium Spark NLP articles
- JohnSnowLabs official Medium
- YouTube Spark NLP video tutorials
Installation
Python
#PyPI
pip install spark-nlp==5.0.1
Spark Packages
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x (Scala 2.12):
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.0.1
pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.0.1
GPU
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.0.1
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.0.1
Apple Silicon (M1 & M2)
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.0.1
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.0.1
AArch64
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.0.1
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.0.1
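
The launch commands above differ only in the artifact name. The following sketch captures that pattern for scripting. The helper names `maven_coordinate` and `launch_command` and the variant keys are hypothetical, not part of Spark NLP.

```python
# Artifact names taken from the commands listed in this release.
ARTIFACTS = {
    "cpu": "spark-nlp",
    "gpu": "spark-nlp-gpu",
    "silicon": "spark-nlp-silicon",
    "aarch64": "spark-nlp-aarch64",
}


def maven_coordinate(variant="cpu", version="5.0.1", scala="2.12"):
    """Build the group:artifact_scala:version coordinate for a build variant."""
    if variant not in ARTIFACTS:
        raise ValueError(f"unknown variant: {variant!r}")
    return f"com.johnsnowlabs.nlp:{ARTIFACTS[variant]}_{scala}:{version}"


def launch_command(shell="pyspark", variant="cpu"):
    """Return the argument list for spark-shell or pyspark with --packages."""
    if shell not in ("pyspark", "spark-shell"):
        raise ValueError(f"unknown shell: {shell!r}")
    return [shell, "--packages", maven_coordinate(variant)]


if __name__ == "__main__":
    # Print the GPU launch line for pyspark.
    print(" ".join(launch_command("pyspark", "gpu")))
```

Returning an argument list rather than a single string keeps the result usable with `subprocess.run` without shell quoting.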
Maven
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp_2.12</artifactId>
<version>5.0.1</version>
</dependency>
spark-nlp-gpu:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu_2.12</artifactId>
<version>5.0.1</version>
</dependency>
spark-nlp-silicon:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-silicon_2.12</artifactId>
<version>5.0.1</version>
</dependency>
spark-nlp-aarch64:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-aarch64_2.12</artifactId>
<version>5.0.1</version>
</dependency>
FAT JARs
- CPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-5.0.1.jar
- GPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-5.0.1.jar
- M1 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-silicon-assembly-5.0.1.jar
- AArch64 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-aarch64-assembly-5.0.1.jar
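
The FAT JAR links follow one naming scheme, so a download script can derive them instead of hard-coding each one. This is a sketch under that assumption. `fat_jar_url` and the variant keys are hypothetical names, and only the four URLs listed above are known to exist.

```python
BASE_URL = "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars"

# File name prefixes as they appear in the links listed in this release.
JAR_PREFIXES = {
    "cpu": "spark-nlp",
    "gpu": "spark-nlp-gpu",
    "silicon": "spark-nlp-silicon",
    "aarch64": "spark-nlp-aarch64",
}


def fat_jar_url(variant="cpu", version="5.0.1"):
    """Return the assembly JAR URL for a build variant and release version."""
    if variant not in JAR_PREFIXES:
        raise ValueError(f"unknown variant: {variant!r}")
    return f"{BASE_URL}/{JAR_PREFIXES[variant]}-assembly-{version}.jar"


if __name__ == "__main__":
    for name in JAR_PREFIXES:
        print(name, fat_jar_url(name))
```

A downloaded JAR can then be supplied to Spark through the standard `--jars` option of `spark-shell` or `pyspark`.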
What's Changed
- Edited notebook for doc sim ranker with E5 by @wolliq in #13878
- update SEO titles by @agsfer in #13887
- SPARKNLP-867 Solves multiLabel param issue in ZeroShot annotators by @danilojsl in #13888
- Sparknlp 868 make spark driver cores override local in start functions by @maziyarpanahi in #13894
- [SPARKNLP-863 SPARKNLP-864 SPARKNLP-865 SPARKNLP-866] ONNX Export Notebooks by @DevinTDHa in #13889
- SPARKNLP-869 Adding threshold to properties for python module by @danilojsl in #13890
- Models hub by @maziyarpanahi in #13896
- Release/501 release candidate by @maziyarpanahi in #13895
Full Changelog: 5.0.0...5.0.1