[SPARKNLP-1027] llama.cpp integration (#14364)
* [SPARKNLP-1027] Initial Tests passing

* [SPARKNLP-1027] Implement Parameters

Add metadata to AutoGGUFModel

* [SPARKNLP-1027] Add metadata to AutoGGUFModel

* [SPARKNLP-1027] Scala Side

* [SPARKNLP-1027] Initial Python Tests running and parameters fixed

* [SPARKNLP-1027] AutoGGUFModel can auto-detect GPU

* [SPARKNLP-1027] Complete Documentation

* [SPARKNLP-1027] Add missing parameters

* [SPARKNLP-1027] Add Support for StructFeature setters on python side

* [SPARKNLP-1027] Add llama.cpp dependencies

* [SPARKNLP-1027] getMetadata for Python side

* Bump jsl-llamacpp to 0.1.0-rc3

* [SPARKNLP-1027] Exception Handling and Finalize tests

* [SPARKNLP-1027] Update jsl-llamacpp version

* [SPARKNLP-1027] Update Documentation

* [SPARKNLP-1027] Remove old Parameters

---------

Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
DevinTDHa and maziyarpanahi authored Sep 5, 2024
1 parent 8d4dc21 commit c2c0e48
Showing 20 changed files with 3,640 additions and 18 deletions.
13 changes: 12 additions & 1 deletion build.sbt
@@ -6,7 +6,7 @@ name := getPackageName(is_silicon, is_gpu, is_aarch64)

organization := "com.johnsnowlabs.nlp"

version := "5.4.2"
version := "5.5.0"

(ThisBuild / scalaVersion) := scalaVer

@@ -180,6 +180,16 @@ val onnxDependencies: Seq[sbt.ModuleID] =
  else
    Seq(onnxCPU)

val llamaCppDependencies =
  if (is_gpu.equals("true"))
    Seq(llamaCppGPU)
  else if (is_silicon.equals("true"))
    Seq(llamaCppSilicon)
  // else if (is_aarch64.equals("true"))
  // Seq(openVinoCPU)
  else
    Seq(llamaCppCPU)

val openVinoDependencies: Seq[sbt.ModuleID] =
  if (is_gpu.equals("true"))
    Seq(openVinoGPU)
@@ -202,6 +212,7 @@ lazy val root = (project in file("."))
utilDependencies ++
tensorflowDependencies ++
onnxDependencies ++
llamaCppDependencies ++
openVinoDependencies ++
typedDependencyParserDependencies,
// TODO potentially improve this?
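
The branch order of `llamaCppDependencies` in the hunk above can be sketched outside sbt as follows. This is only an illustration of the selection logic: the function name is hypothetical, and the returned strings are simply the names of the sbt values from the diff, not Maven coordinates.

```python
def select_llama_cpp_dependency(is_gpu: str, is_silicon: str) -> str:
    """Mirror the if/else chain of llamaCppDependencies (hypothetical helper)."""
    # The build flags are strings, so compare against "true" as the sbt code does.
    if is_gpu == "true":
        return "llamaCppGPU"
    if is_silicon == "true":
        return "llamaCppSilicon"
    # The aarch64 branch is commented out in the diff, so everything else
    # falls through to the CPU dependency.
    return "llamaCppCPU"


if __name__ == "__main__":
    print(select_llama_cpp_dependency("false", "true"))
```

Because the GPU check comes first, a build with both flags set to `"true"` would pick the GPU dependency.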
135 changes: 135 additions & 0 deletions docs/en/annotator_entries/AutoGGUF.md
@@ -0,0 +1,135 @@
{%- capture title -%}
AutoGGUFModel
{%- endcapture -%}

{%- capture description -%}
Annotator that uses the llama.cpp library to generate text completions with large language
models.

For settable parameters, and their explanations, see [HasLlamaCppProperties](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/HasLlamaCppProperties.scala) and refer to
the llama.cpp documentation of
[server.cpp](https://github.com/ggerganov/llama.cpp/tree/7d5e8777ae1d21af99d4f95be10db4870720da91/examples/server)
for more information.

If a parameter is not set, the annotator falls back to the value provided by
the model.
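
As a conceptual illustration only (this is not Spark NLP's implementation, and the dictionary keys and default values below are hypothetical), the fallback can be pictured as model-provided defaults overridden by whatever was set explicitly:

```python
def effective_parameters(model_defaults: dict, user_params: dict) -> dict:
    """Return the model defaults, overridden by explicitly set user values.

    A value of None in user_params is treated as "not set".
    """
    explicit = {key: value for key, value in user_params.items() if value is not None}
    return {**model_defaults, **explicit}


if __name__ == "__main__":
    # Hypothetical defaults that a model file might carry.
    model_defaults = {"temperature": 0.8, "top_k": 40, "n_predict": -1}
    # Only temperature and n_predict are set by the user in this example.
    user_params = {"temperature": 0.4, "top_k": None, "n_predict": 20}
    print(effective_parameters(model_defaults, user_params))
```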

Pretrained models can be loaded with `pretrained` of the companion object:

```scala
val autoGGUFModel = AutoGGUFModel.pretrained()
  .setInputCols("document")
  .setOutputCol("completions")
```

If no name is provided, the default model is `"gguf-phi3-mini-4k-instruct-q4"`.

For available pretrained models please see the [Models Hub](https://sparknlp.org/models).

For extended examples of usage, see the
[AutoGGUFModelTest](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/test/scala/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFModelTest.scala)
and the
[example notebook](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/llama.cpp/llama.cpp_in_Spark_NLP_AutoGGUFModel.ipynb).

**Note**: To use GPU inference with this annotator, make sure to use the Spark NLP GPU package and set
the number of GPU layers with the `setNGpuLayers` method.

When using larger models, we recommend adjusting GPU usage with `setNCtx` and `setNGpuLayers`
according to your hardware to avoid out-of-memory errors.
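
As a rough starting point for `setNGpuLayers`, the sketch below estimates how many layers fit in a GPU memory budget. It assumes that all layers are the same size and that a fixed reserve covers the context cache and scratch buffers. Both are simplifications, the function name and the numbers are hypothetical, and actual memory use also depends on the value passed to `setNCtx`.

```python
def layers_that_fit(model_bytes: int, n_layers: int, vram_bytes: int, reserve_bytes: int) -> int:
    """Estimate how many equally sized layers fit into the usable GPU memory."""
    if model_bytes <= 0 or n_layers <= 0:
        raise ValueError("model_bytes and n_layers must be positive")
    # Memory left after setting aside the reserve (never negative).
    usable = max(0, vram_bytes - reserve_bytes)
    # Integer form of floor(usable / (model_bytes / n_layers)).
    fitting = usable * n_layers // model_bytes
    return min(n_layers, fitting)


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical 4 GiB model with 32 layers, 3 GiB of GPU memory, 1 GiB reserved.
    print(layers_that_fit(4 * GIB, 32, 3 * GIB, 1 * GIB))
```

Treat the result as an initial guess and lower it if out-of-memory errors still occur.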
{%- endcapture -%}

{%- capture input_anno -%}
DOCUMENT
{%- endcapture -%}

{%- capture output_anno -%}
DOCUMENT
{%- endcapture -%}

{%- capture python_example -%}
>>> import sparknlp
>>> from sparknlp.base import *
>>> from sparknlp.annotator import *
>>> from pyspark.ml import Pipeline
>>> document = DocumentAssembler() \
...     .setInputCol("text") \
...     .setOutputCol("document")
>>> autoGGUFModel = AutoGGUFModel.pretrained() \
...     .setInputCols(["document"]) \
...     .setOutputCol("completions") \
...     .setBatchSize(4) \
...     .setNPredict(20) \
...     .setNGpuLayers(99) \
...     .setTemperature(0.4) \
...     .setTopK(40) \
...     .setTopP(0.9) \
...     .setPenalizeNl(True)
>>> pipeline = Pipeline().setStages([document, autoGGUFModel])
>>> data = spark.createDataFrame([["Hello, I am a"]]).toDF("text")
>>> result = pipeline.fit(data).transform(data)
>>> result.select("completions").show(truncate = False)
+-----------------------------------------------------------------------------------------------------------------------------------+
|completions |
+-----------------------------------------------------------------------------------------------------------------------------------+
|[{document, 0, 78, new user. I am currently working on a project and I need to create a list of , {prompt -> Hello, I am a}, []}]|
+-----------------------------------------------------------------------------------------------------------------------------------+
{%- endcapture -%}

{%- capture scala_example -%}
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val document = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val autoGGUFModel = AutoGGUFModel
  .pretrained()
  .setInputCols("document")
  .setOutputCol("completions")
  .setBatchSize(4)
  .setNPredict(20)
  .setNGpuLayers(99)
  .setTemperature(0.4f)
  .setTopK(40)
  .setTopP(0.9f)
  .setPenalizeNl(true)

val pipeline = new Pipeline().setStages(Array(document, autoGGUFModel))

val data = Seq("Hello, I am a").toDF("text")
val result = pipeline.fit(data).transform(data)
result.select("completions").show(truncate = false)
+-----------------------------------------------------------------------------------------------------------------------------------+
|completions |
+-----------------------------------------------------------------------------------------------------------------------------------+
|[{document, 0, 78, new user. I am currently working on a project and I need to create a list of , {prompt -> Hello, I am a}, []}]|
+-----------------------------------------------------------------------------------------------------------------------------------+

{%- endcapture -%}

{%- capture api_link -%}
[AutoGGUFModel](/api/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFModel)
{%- endcapture -%}

{%- capture python_api_link -%}
[AutoGGUFModel](/api/python/reference/autosummary/sparknlp/annotator/seq2seq/auto_gguf_model/index.html)
{%- endcapture -%}

{%- capture source_link -%}
[AutoGGUFModel](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFModel.scala)
{%- endcapture -%}

{% include templates/anno_template.md
title=title
description=description
input_anno=input_anno
output_anno=output_anno
python_example=python_example
scala_example=scala_example
api_link=api_link
python_api_link=python_api_link
source_link=source_link
%}
1 change: 1 addition & 0 deletions docs/en/annotators.md
@@ -45,6 +45,7 @@ There are two types of Annotators:
{:.table-model-big}
|Annotator|Description|Version |
|---|---|---|
{% include templates/anno_table_entry.md path="" name="AutoGGUFModel" summary="Annotator that uses the llama.cpp library to generate text completions with large language models."%}
{% include templates/anno_table_entry.md path="" name="BGEEmbeddings" summary="Sentence embeddings using BGE."%}
{% include templates/anno_table_entry.md path="" name="BigTextMatcher" summary="Annotator to match exact phrases (by token) provided in a file against a Document."%}
{% include templates/anno_table_entry.md path="" name="Chunk2Doc" summary="Converts a `CHUNK` type column back into `DOCUMENT`. Useful when trying to re-tokenize or do further analysis on a `CHUNK` result."%}