Spark NLP 5.1.2: Unveiling the First Image-to-Text VisionEncoderDecoder, Over 3,000 State-of-the-Art Transformer Models in ONNX, a Documentation Overhaul, and Bug Fixes!
Overview
For the first time, Spark NLP 5.1.2 proudly presents a new image-to-text annotator designed for captioning images. Additionally, we've added over 3,000 state-of-the-art transformer models in ONNX format to ensure rapid inference in your RAG pipelines when you are using LLMs.
We're pleased to announce that our Models Hub now boasts 21,000+ free and truly open-source models & pipelines. Our deepest gratitude goes out to our community for their invaluable feedback, feature suggestions, and contributions.
New Features & Enhancements
- NEW: We're excited to introduce the VisionEncoderDecoderForImageCaptioning annotator, designed specifically for image-to-text captioning. We used VisionEncoderDecoderModel to import models fine-tuned for automatic image captioning.
The VisionEncoderDecoder can be used to set up an image-to-text model. The encoder can be any pretrained Transformer-based vision model, such as ViT, BEiT, DeiT, or Swin. The decoder can be any pretrained language model, such as RoBERTa, GPT2, BERT, or DistilBERT.
The efficacy of using pretrained checkpoints to initialize image-to-text-sequence models is evident in the study titled TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, and Furu Wei.
Image Captioning Using Hugging Face Vision Encoder Decoder - Step2Step Guide (Part 2)
- NEW: We've added cutting-edge transformer models in ONNX format for seamless integration. Our annotators will automatically recognize and utilize these models, streamlining your LLM pipelines without any additional setup.
- We have documented all the features that were previously missing from our documentation and added examples to the Python and Scala APIs for:
- E5Embeddings
- InstructorEmbeddings
- MPNetEmbeddings
- OpenAICompletion
- VisionEncoderDecoderForImageCaptioning
- DocumentSimilarityRanker
- BartForZeroShotClassification
- XlmRoBertaForZeroShotClassification
- CamemBertForQuestionAnswering
- DeBertaForSequenceClassification
- DeBertaForTokenClassification
- Date2Chunk
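As a rough illustration of how the new VisionEncoderDecoderForImageCaptioning annotator might be wired into a pipeline, here is a minimal Python sketch. It assumes `spark-nlp` 5.1.2 and `pyspark` are installed and that the default pretrained model can be downloaded. The helper names `build_captioning_pipeline` and `caption_images`, the column names, and the beam size of 2 are illustrative choices, not part of the release.

```python
def build_captioning_pipeline(beam_size=2):
    """Assemble an image-captioning pipeline (hypothetical helper)."""
    # Imported lazily so this function can be defined without Spark installed.
    from pyspark.ml import Pipeline
    from sparknlp.base import ImageAssembler
    from sparknlp.annotator import VisionEncoderDecoderForImageCaptioning

    # Converts Spark's image struct into Spark NLP image annotations.
    image_assembler = (
        ImageAssembler()
        .setInputCol("image")
        .setOutputCol("image_assembler")
    )

    # Loads the default pretrained captioning model (assumed to be available).
    captioner = (
        VisionEncoderDecoderForImageCaptioning.pretrained()
        .setBeamSize(beam_size)
        .setDoSample(False)
        .setInputCols(["image_assembler"])
        .setOutputCol("caption")
    )
    return Pipeline(stages=[image_assembler, captioner])


def caption_images(spark, image_dir):
    """Read images from image_dir and return their paths with generated captions."""
    images = (
        spark.read.format("image")
        .option("dropInvalid", True)  # skip files that cannot be decoded
        .load(image_dir)
    )
    model = build_captioning_pipeline().fit(images)
    return model.transform(images).selectExpr(
        "image.origin AS image_path", "caption.result AS captions"
    )
```

With a session created by `sparknlp.start()`, calling `caption_images(spark, "path/to/images").show(truncate=False)` should list each image path next to its caption; the directory path here is a placeholder.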
Bug Fixes
- We've made a minor adjustment to the beam search algorithm, enhancing the quality of the BART Transformer results.
New Notebooks
Notebooks | Colab |
---|---|
Vision Encoder Decoder: Image Captioning at Scale in Spark NLP | |
Import Whisper models (ONNX) | |
Documentation
- Import models from TF Hub & HuggingFace
- Spark NLP Notebooks
- Models Hub with new models
- Spark NLP Articles
- Spark NLP in Action
- Spark NLP Documentation
- Spark NLP Scala APIs
- Spark NLP Python APIs
Community support
- Slack: For live discussion with the Spark NLP community and the team
- GitHub: Bug reports, feature requests, and contributions
- Discussions: Engage with other community members, share ideas, and show off how you use Spark NLP!
- Medium: Spark NLP articles
- YouTube: Spark NLP video tutorials
Installation
Python
# PyPI
pip install spark-nlp==5.1.2
Spark Packages
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x (Scala 2.12):
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.1.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.1.2
GPU
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.1.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.1.2
Apple Silicon (M1 & M2)
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.1.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.1.2
AArch64
spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.1.2
pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.1.2
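The same coordinates can also be supplied from Python through Spark's `spark.jars.packages` setting instead of the `--packages` flag. The sketch below builds the coordinate for each variant listed above and passes it to a `SparkSession`. The helper names `maven_coordinate` and `start_session`, the variant keys, and the `local[*]` master are assumptions made for illustration.

```python
SPARK_NLP_VERSION = "5.1.2"
SCALA_VERSION = "2.12"

# Artifact names as listed in these release notes; the dictionary keys are illustrative.
ARTIFACTS = {
    "cpu": "spark-nlp",
    "gpu": "spark-nlp-gpu",
    "silicon": "spark-nlp-silicon",
    "aarch64": "spark-nlp-aarch64",
}


def maven_coordinate(variant="cpu", version=SPARK_NLP_VERSION):
    """Return the group:artifact:version coordinate for a build variant."""
    if variant not in ARTIFACTS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(ARTIFACTS)}"
        )
    return f"com.johnsnowlabs.nlp:{ARTIFACTS[variant]}_{SCALA_VERSION}:{version}"


def start_session(variant="cpu"):
    """Create a local SparkSession that resolves Spark NLP from Maven (needs pyspark)."""
    # Imported lazily so the coordinate helper works without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("spark-nlp-example")
        .master("local[*]")
        .config("spark.jars.packages", maven_coordinate(variant))
        .getOrCreate()
    )
```

For example, `maven_coordinate("gpu")` yields `com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.1.2`, matching the GPU command above.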
Maven
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp_2.12</artifactId>
<version>5.1.2</version>
</dependency>
spark-nlp-gpu:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu_2.12</artifactId>
<version>5.1.2</version>
</dependency>
spark-nlp-silicon:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-silicon_2.12</artifactId>
<version>5.1.2</version>
</dependency>
spark-nlp-aarch64:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-aarch64_2.12</artifactId>
<version>5.1.2</version>
</dependency>
FAT JARs
- CPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-5.1.2.jar
- GPU on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-5.1.2.jar
- M1 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-silicon-assembly-5.1.2.jar
- AArch64 on Apache Spark 3.0.x/3.1.x/3.2.x/3.3.x/3.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-aarch64-assembly-5.1.2.jar
What's Changed
- FAQ fix by @agsfer in #13985
- faq fix by @agsfer in #13986
- Models hub by @maziyarpanahi in #14006 @ahmedlone127
- Release/512 release candidate by @maziyarpanahi in #14007 @DevinTDHa
Full Changelog: 5.1.1...5.1.2