release/550-release-candidate #14389

Merged: 27 commits, Sep 25, 2024

Commits (27)
625f933  SparkNLP 997 Introducing QWEN2Transformer (#14188) (prabod, Sep 1, 2024)
2102e2d  SparkNLP 1004 - Introducing MiniCPM (#14205) (prabod, Sep 1, 2024)
c68be6a  SparkNLP 1018 - Introducing NLLB (#14209) (prabod, Sep 1, 2024)
803edf6  SparkNLP 1005 implement nomic embeddings (#14217) (prabod, Sep 1, 2024)
50a6966  implementing SnowFlake (#14353) (ahmedlone127, Sep 1, 2024)
707bb16  adding Mxbai (#14355) (ahmedlone127, Sep 1, 2024)
276a84b  Introducing onnx support to vision annotators (#14356) (ahmedlone127, Sep 1, 2024)
f47ee50  Introducing onnx and OpenVino support to Missing Annotators (#14359) (ahmedlone127, Sep 1, 2024)
9d94b9a  [SPARKNLP-855] Introducing AlbertForZeroShotClassification (#14361) (danilojsl, Sep 1, 2024)
1caf296  SparkNLP introducing Phi-3 (#14373) (prabod, Sep 1, 2024)
82d37a6  OpenVINO install instructions (#14382) (DevinTDHa, Sep 1, 2024)
66d94a4  SPARKNLP 1034 implement starcoder2 for causal lm (#14358) (prabod, Sep 2, 2024)
9285df8  fix missing Optional input in the signature (maziyarpanahi, Sep 2, 2024)
f4fd4e7  SPARKNLP Introducing LLAMA 3 (#14379) (prabod, Sep 3, 2024)
8d4dc21  550 rc export notebooks (#14393) (prabod, Sep 5, 2024)
c2c0e48  [SPARKNLP-1027] llama.cpp integration (#14364) (DevinTDHa, Sep 5, 2024)
9e4b1ad  Merge branch 'master' into release/550-release-candidate (maziyarpanahi, Sep 6, 2024)
ba67de3  bump version (maziyarpanahi, Sep 6, 2024)
73fe300  Merge branch 'release/550-release-candidate' of https://github.com/Jo… (maziyarpanahi, Sep 6, 2024)
22e4e78  upgrade onnxruntime to 1.19.2 (maziyarpanahi, Sep 6, 2024)
141d38c  tested + updated ipynb notebooks (ahmedlone127, Sep 9, 2024)
7588484  Update install.md (maziyarpanahi, Sep 10, 2024)
83a6e7e  Adding openvino support to missing annotators (#14390) (ahmedlone127, Sep 22, 2024)
ba18698  [SPARKNLP-1027] Change Default AutoGGUF pretrained model (#14411) (DevinTDHa, Sep 24, 2024)
7f05669  Bump version to 5.5.0 [run doc] (maziyarpanahi, Sep 24, 2024)
1a112cd  Update Scala and Python APIs (actions-user, Sep 24, 2024)
cc38757  [SPARKNLP-1027] Fix issue with pretrained model (#14413) (DevinTDHa, Sep 24, 2024)
Files changed (showing changes from 20 of 27 commits)
README.md (8 additions, 8 deletions)

@@ -51,7 +51,7 @@ $ java -version
$ conda create -n sparknlp python=3.7 -y
$ conda activate sparknlp
# spark-nlp by default is based on pyspark 3.x
-$ pip install spark-nlp==5.4.0 pyspark==3.3.1
+$ pip install spark-nlp==5.5.0-rc1 pyspark==3.3.1
```

In Python console or Jupyter `Python3` kernel:
@@ -116,7 +116,7 @@ For a quick example of using pipelines and models take a look at our official [d

### Apache Spark Support

-Spark NLP *5.4.0* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
+Spark NLP *5.5.0-rc1* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x

| Spark NLP | Apache Spark 3.5.x | Apache Spark 3.4.x | Apache Spark 3.3.x | Apache Spark 3.2.x | Apache Spark 3.1.x | Apache Spark 3.0.x | Apache Spark 2.4.x | Apache Spark 2.3.x |
|-----------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|
@@ -141,7 +141,7 @@ Find out more about 4.x `SparkNLP` versions in our official [documentation](http

### Databricks Support

-Spark NLP 5.4.0 has been tested and is compatible with the following runtimes:
+Spark NLP 5.5.0-rc1 has been tested and is compatible with the following runtimes:

| **CPU** | **GPU** |
|--------------------|--------------------|
@@ -154,7 +154,7 @@ We are compatible with older runtimes. For a full list check databricks support

### EMR Support

-Spark NLP 5.4.0 has been tested and is compatible with the following EMR releases:
+Spark NLP 5.5.0-rc1 has been tested and is compatible with the following EMR releases:

| **EMR Release** |
|--------------------|
@@ -166,7 +166,7 @@ Spark NLP 5.4.0 has been tested and is compatible with the following EMR release
We are compatible with older EMR releases. For a full list check EMR support in our official [documentation](https://sparknlp.org/docs/en/install#emr-support)

Full list of [Amazon EMR 6.x releases](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-6x.html)
-Full list 5.4.2mazon EMR 7.x releases](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-7x.html)
+Full list 5.5.0-rc1mazon EMR 7.x releases](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-7x.html)

NOTE: The EMR 6.1.0 and 6.1.1 are not supported.

@@ -182,7 +182,7 @@ deployed to Maven central. To add any of our packages as a dependency in your ap
from our official documentation.

If you are interested, there is a simple SBT project for Spark NLP to guide you on how to use it in your
-projects [Spark NLP SBT S5.4.2r](https://github.com/maziyarpanahi/spark-nlp-starter)
+projects [Spark NLP SBT S5.5.0-rc1r](https://github.com/maziyarpanahi/spark-nlp-starter)

### Python

@@ -227,7 +227,7 @@ In Spark NLP we can define S3 locations to:

Please check [these instructions](https://sparknlp.org/docs/en/install#s3-integration) from our official documentation.

-## Document5.4.2
+## Document5.5.0-rc1

### Examples

@@ -260,7 +260,7 @@ the Spark NLP library:
keywords = {Spark, Natural language processing, Deep learning, Tensorflow, Cluster},
abstract = {Spark NLP is a Natural Language Processing (NLP) library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning pipelines that can scale easily in a distributed environment. Spark NLP comes with 1100+ pretrained pipelines and models in more than 192+ languages. It supports nearly all the NLP tasks and modules that can be used seamlessly in a cluster. Downloaded more than 2.7 million times and experiencing 9x growth since January 2020, Spark NLP is used by 54% of healthcare organizations as the world’s most widely used NLP library in the enterprise.}
}
-}5.4.2
+}5.5.0-rc1
```

## Community support
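A quick way to sanity-check the pinned RC from the README instructions above (a minimal sketch; assumes `pyspark` is installed alongside the `spark-nlp==5.5.0-rc1` pin, and uses the standard `sparknlp.start()` entry point):

```python
# Sketch: confirm the pinned RC resolves and starts a Spark session.
import sparknlp

spark = sparknlp.start()
print(sparknlp.version())  # should report the pinned 5.5.0-rc1 release
```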
build.sbt (12 additions, 1 deletion)

@@ -6,7 +6,7 @@ name := getPackageName(is_silicon, is_gpu, is_aarch64)

organization := "com.johnsnowlabs.nlp"

version := "5.4.2"
version := "5.5.0-rc1"

(ThisBuild / scalaVersion) := scalaVer

@@ -180,6 +180,16 @@ val onnxDependencies: Seq[sbt.ModuleID] =
else
Seq(onnxCPU)

+val llamaCppDependencies =
+  if (is_gpu.equals("true"))
+    Seq(llamaCppGPU)
+  else if (is_silicon.equals("true"))
+    Seq(llamaCppSilicon)
+//  else if (is_aarch64.equals("true"))
+//    Seq(openVinoCPU)
+  else
+    Seq(llamaCppCPU)

val openVinoDependencies: Seq[sbt.ModuleID] =
if (is_gpu.equals("true"))
Seq(openVinoGPU)
@@ -202,6 +212,7 @@ lazy val root = (project in file("."))
utilDependencies ++
tensorflowDependencies ++
onnxDependencies ++
+      llamaCppDependencies ++
openVinoDependencies ++
typedDependencyParserDependencies,
// TODO potentially improve this?
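The new `llamaCppDependencies` block follows the same flag convention as the existing ONNX and OpenVINO blocks, so the llama.cpp artifact variant is picked at build time. Assuming `is_gpu` and `is_silicon` are the usual JVM system properties read at the top of `build.sbt`, selecting a variant would look something like:

```bash
# Sketch: build the GPU assembly so Seq(llamaCppGPU) is selected.
# Assumes is_gpu/is_silicon are -D system properties, as used elsewhere in build.sbt.
sbt -Dis_gpu=true assembly

# Apple Silicon variant, selecting Seq(llamaCppSilicon) instead:
sbt -Dis_silicon=true assembly
```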
2 changes: 1 addition & 1 deletion docs/_layouts/landing.html
Original file line number Diff line number Diff line change
Expand Up @@ -201,7 +201,7 @@ <h3 class="grey h3_title">{{ _section.title }}</h3>
<div class="highlight-box">
{% highlight bash %}
# Using PyPI
-$ pip install spark-nlp==5.4.2
+$ pip install spark-nlp==5.5.0-rc1

# Using Anaconda/Conda
$ conda install -c johnsnowlabs spark-nlp
docs/en/advanced_settings.md (3 additions, 3 deletions)

@@ -52,7 +52,7 @@ spark = SparkSession.builder
.config("spark.kryoserializer.buffer.max", "2000m")
.config("spark.jsl.settings.pretrained.cache_folder", "sample_data/pretrained")
.config("spark.jsl.settings.storage.cluster_tmp_dir", "sample_data/storage")
.config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0")
.config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.5.0-rc1")
.getOrCreate()
```

@@ -66,7 +66,7 @@ spark-shell \
--conf spark.kryoserializer.buffer.max=2000M \
--conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
--conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
---packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+--packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.5.0-rc1
```

**pyspark:**
@@ -79,7 +79,7 @@ pyspark \
--conf spark.kryoserializer.buffer.max=2000M \
--conf spark.jsl.settings.pretrained.cache_folder="sample_data/pretrained" \
--conf spark.jsl.settings.storage.cluster_tmp_dir="sample_data/storage" \
---packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0
+--packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.5.0-rc1
```

**Databricks:**
docs/en/annotator_entries/AutoGGUF.md (new file, 135 additions)

@@ -0,0 +1,135 @@
{%- capture title -%}
AutoGGUFModel
{%- endcapture -%}

{%- capture description -%}
Annotator that uses the llama.cpp library to generate text completions with large language
models.

For settable parameters, and their explanations, see [HasLlamaCppProperties](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/HasLlamaCppProperties.scala) and refer to
the llama.cpp documentation of
[server.cpp](https://github.com/ggerganov/llama.cpp/tree/7d5e8777ae1d21af99d4f95be10db4870720da91/examples/server)
for more information.

If the parameters are not set, the annotator will default to the parameters provided by the
model.

Pretrained models can be loaded with `pretrained` of the companion object:

```scala
val autoGGUFModel = AutoGGUFModel.pretrained()
.setInputCols("document")
.setOutputCol("completions")
```

The default model is `"gguf-phi3-mini-4k-instruct-q4"`, if no name is provided.

For available pretrained models please see the [Models Hub](https://sparknlp.org/models).

For extended examples of usage, see the
[AutoGGUFModelTest](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/test/scala/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFModelTest.scala)
and the
[example notebook](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/llama.cpp/llama.cpp_in_Spark_NLP_AutoGGUFModel.ipynb).

**Note**: To use GPU inference with this annotator, make sure to use the Spark NLP GPU package and set
the number of GPU layers with the `setNGpuLayers` method.

When using larger models, we recommend adjusting GPU usage with `setNCtx` and `setNGpuLayers`
according to your hardware to avoid out-of-memory errors.
{%- endcapture -%}

{%- capture input_anno -%}
DOCUMENT
{%- endcapture -%}

{%- capture output_anno -%}
DOCUMENT
{%- endcapture -%}

{%- capture python_example -%}
>>> import sparknlp
>>> from sparknlp.base import *
>>> from sparknlp.annotator import *
>>> from pyspark.ml import Pipeline
>>> document = DocumentAssembler() \
... .setInputCol("text") \
... .setOutputCol("document")
>>> autoGGUFModel = AutoGGUFModel.pretrained() \
... .setInputCols(["document"]) \
... .setOutputCol("completions") \
... .setBatchSize(4) \
... .setNPredict(20) \
... .setNGpuLayers(99) \
... .setTemperature(0.4) \
... .setTopK(40) \
... .setTopP(0.9) \
... .setPenalizeNl(True)
>>> pipeline = Pipeline().setStages([document, autoGGUFModel])
>>> data = spark.createDataFrame([["Hello, I am a"]]).toDF("text")
>>> result = pipeline.fit(data).transform(data)
>>> result.select("completions").show(truncate = False)
+-----------------------------------------------------------------------------------------------------------------------------------+
|completions |
+-----------------------------------------------------------------------------------------------------------------------------------+
|[{document, 0, 78, new user. I am currently working on a project and I need to create a list of , {prompt -> Hello, I am a}, []}]|
+-----------------------------------------------------------------------------------------------------------------------------------+
{%- endcapture -%}

{%- capture scala_example -%}
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val document = new DocumentAssembler()
.setInputCol("text")
.setOutputCol("document")

val autoGGUFModel = AutoGGUFModel
.pretrained()
.setInputCols("document")
.setOutputCol("completions")
.setBatchSize(4)
.setNPredict(20)
.setNGpuLayers(99)
.setTemperature(0.4f)
.setTopK(40)
.setTopP(0.9f)
.setPenalizeNl(true)

val pipeline = new Pipeline().setStages(Array(document, autoGGUFModel))

val data = Seq("Hello, I am a").toDF("text")
val result = pipeline.fit(data).transform(data)
result.select("completions").show(truncate = false)
+-----------------------------------------------------------------------------------------------------------------------------------+
|completions |
+-----------------------------------------------------------------------------------------------------------------------------------+
|[{document, 0, 78, new user. I am currently working on a project and I need to create a list of , {prompt -> Hello, I am a}, []}]|
+-----------------------------------------------------------------------------------------------------------------------------------+

{%- endcapture -%}

{%- capture api_link -%}
[AutoGGUFModel](/api/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFModel)
{%- endcapture -%}

{%- capture python_api_link -%}
[AutoGGUFModel](/api/python/reference/autosummary/sparknlp/annotator/seq2seq/auto_gguf_model/index.html)
{%- endcapture -%}

{%- capture source_link -%}
[AutoGGUFModel](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFModel.scala)
{%- endcapture -%}

{% include templates/anno_template.md
title=title
description=description
input_anno=input_anno
output_anno=output_anno
python_example=python_example
scala_example=scala_example
api_link=api_link
python_api_link=python_api_link
source_link=source_link
%}
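The out-of-memory note in the new AutoGGUF.md above is the part most likely to need tuning in practice. A minimal sketch, assuming the Spark NLP GPU package and the `setNCtx`/`setNGpuLayers` properties the description references (the concrete values are illustrative, not recommendations):

```python
# Sketch: reduce GPU memory pressure for a larger GGUF model.
# Assumes the Spark NLP GPU package; values below are illustrative only.
from sparknlp.annotator import AutoGGUFModel

autoGGUF = (
    AutoGGUFModel.pretrained()
    .setInputCols(["document"])
    .setOutputCol("completions")
    # A smaller context window lowers VRAM use.
    .setNCtx(2048)
    # Offload only part of the model when VRAM is tight.
    .setNGpuLayers(50)
)
```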
docs/en/annotators.md (1 addition)

@@ -45,6 +45,7 @@ There are two types of Annotators:
{:.table-model-big}
|Annotator|Description|Version |
|---|---|---|
{% include templates/anno_table_entry.md path="" name="AutoGGUFModel" summary="Annotator that uses the llama.cpp library to generate text completions with large language models."%}
{% include templates/anno_table_entry.md path="" name="BGEEmbeddings" summary="Sentence embeddings using BGE."%}
{% include templates/anno_table_entry.md path="" name="BigTextMatcher" summary="Annotator to match exact phrases (by token) provided in a file against a Document."%}
{% include templates/anno_table_entry.md path="" name="Chunk2Doc" summary="Converts a `CHUNK` type column back into `DOCUMENT`. Useful when trying to re-tokenize or do further analysis on a `CHUNK` result."%}
2 changes: 1 addition & 1 deletion docs/en/concepts.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ $ java -version
$ conda create -n sparknlp python=3.7 -y
$ conda activate sparknlp
# spark-nlp by default is based on pyspark 3.x
-$ pip install spark-nlp==5.4.2 pyspark==3.3.1 jupyter
+$ pip install spark-nlp==5.5.0-rc1 pyspark==3.3.1 jupyter
$ jupyter notebook
```

docs/en/examples.md (2 additions, 2 deletions)

@@ -18,7 +18,7 @@ $ java -version
# should be Java 8 (Oracle or OpenJDK)
$ conda create -n sparknlp python=3.7 -y
$ conda activate sparknlp
-$ pip install spark-nlp==5.4.2 pyspark==3.3.1
+$ pip install spark-nlp==5.5.0-rc1 pyspark==3.3.1
```

</div><div class="h3-box" markdown="1">
@@ -40,7 +40,7 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi
# -p is for pyspark
# -s is for spark-nlp
# by default they are set to the latest
-!bash colab.sh -p 3.2.3 -s 5.4.2
+!bash colab.sh -p 3.2.3 -s 5.5.0-rc1
```

[Spark NLP quick start on Google Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/quick_start_google_colab.ipynb) is a live demo on Google Colab that performs named entity recognitions and sentiment analysis by using Spark NLP pretrained pipelines.
2 changes: 1 addition & 1 deletion docs/en/hardware_acceleration.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ Since the new Transformer models such as BERT for Word and Sentence embeddings a
| DeBERTa Large | +477%(5.8x) |
| Longformer Base | +52%(1.5x) |

-Spark NLP 5.4.2 is built with TensorFlow 2.7.1 and the following NVIDIA® software are only required for GPU support:
+Spark NLP 5.5.0-rc1 is built with TensorFlow 2.7.1 and the following NVIDIA® software are only required for GPU support:

- NVIDIA® GPU drivers version 450.80.02 or higher
- CUDA® Toolkit 11.2