
[SPARKNLP-1027] llama.cpp integration #14364

Conversation

DevinTDHa (Member)

Description

This PR implements support for llama.cpp in Spark NLP.

llama.cpp is a high-performance C/C++ library designed for running Meta's LLaMA models and other large language models (LLMs) on a variety of hardware platforms.

This will enable users to run LLM inference with a variety of optimizations:

  • Hardware Optimization: Supports Apple silicon (via Metal), x86 CPUs (via AVX), and NVIDIA GPUs (via CUDA).
  • Quantization: Model quantization (1.5-bit to 8-bit) to improve inference speed and reduce memory usage.
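To give a feel for why quantization matters on memory-constrained nodes, here is a minimal back-of-the-envelope sketch (not from this PR) of the weight-storage footprint of a hypothetical 7B-parameter model at different bit-widths. It ignores runtime overhead such as the KV cache and quantization block metadata, so real GGUF files will differ somewhat:

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given bit-width.

    Weights only; ignores KV cache and per-block quantization metadata.
    """
    return n_params * bits_per_weight / 8 / (1024 ** 3)

# Illustrative 7B-parameter model at common bit-widths
n_params = 7e9
for bits in (16, 8, 4, 1.5):
    print(f"{bits:>4}-bit: {weight_memory_gib(n_params, bits):.1f} GiB")
```

For example, dropping from 16-bit to 4-bit weights cuts the weight footprint by roughly 4x, which is what lets smaller cluster nodes hold a model that would not otherwise fit in memory.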

Motivation and Context

Many users run clusters composed of many smaller nodes. This integration enables LLM inference even on such memory-constrained nodes.

How Has This Been Tested?

Local tests, Google Colab, Databricks

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Code improvements with no or little impact
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING page.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@DevinTDHa DevinTDHa added new-feature Introducing a new feature dependencies Pull requests that update a dependency file labels Aug 8, 2024
@DevinTDHa DevinTDHa requested a review from maziyarpanahi August 8, 2024 14:57
@DevinTDHa DevinTDHa self-assigned this Aug 8, 2024
@DevinTDHa DevinTDHa added the DON'T MERGE Do not merge this PR label Aug 8, 2024
@DevinTDHa DevinTDHa changed the title Feature/sparknlp 1027 llama cpp integration [SPARKNLP-1027] llama.cpp integration Aug 15, 2024
@maziyarpanahi maziyarpanahi changed the base branch from master to release/550-release-candidate September 2, 2024 17:59
@coveralls commented Sep 2, 2024
@maziyarpanahi maziyarpanahi marked this pull request as ready for review September 3, 2024 10:39
Seq(llamaCppGPU)
else if (is_silicon.equals("true"))
Seq(llamaCppSilicon)
// else if (is_aarch64.equals("true"))
@DevinTDHa We don't need a special build for aarch64 or it's not supported?

@maziyarpanahi maziyarpanahi merged commit c2c0e48 into JohnSnowLabs:release/550-release-candidate Sep 5, 2024
6 checks passed