python bindings? #82
Python Bindings for llama.cpp: https://pypi.org/project/llamacpp/0.1.3/ (not mine, just found them) |
As a temporary work-around before an "official" binding is available, I've written a quick script to call the llama.cpp executable that supports streaming and interactive mode: https://github.com/shaunabanana/llama.py |
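For readers who want a feel for that approach, here is a minimal sketch of wrapping the llama.cpp executable with the Python standard library; the binary name, model path, and flags are assumptions for illustration, not code taken from llama.py:

```python
import subprocess

def stream_llama(prompt, model="models/7B/ggml-model-q4_0.bin"):
    """Run the llama.cpp `main` executable and yield its output as it arrives."""
    proc = subprocess.Popen(
        ["./main", "-m", model, "-p", prompt],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        bufsize=1,  # line-buffered so output streams instead of arriving at exit
    )
    for line in proc.stdout:
        yield line
    proc.wait()

if __name__ == "__main__":
    for chunk in stream_llama("Building a website can be done in 10 simple steps:"):
        print(chunk, end="", flush=True)
```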
Looks promising from the description, will try it out and give feedback. |
Sweet, will give it a shot. |
I hacked something together tonight on this. It's Python/C++ bindings for the model directly (allowing you to call the model from Python). Merging it would, however, require splitting some parts of the code out of main.cpp, which @ggerganov has argued against IIRC. |
@seemanne you did exactly what I wanted, dude. This makes it easier to expose or integrate with a web or chat frontend. Much appreciated, I will give it a try. |
OK, I updated this and put it into a proper fork. You can now pass parameters in Python. I will need to do some refactoring to pull in upstream changes each time, but it should work, and I tested it on Linux and Mac. |
I wrote my own ctypes bindings and wrapped them in a KoboldAI-compatible REST API. |
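As an illustration of that shape of wrapper (not the author's actual code), here is a minimal sketch of a KoboldAI-style REST endpoint in Flask; the /api/v1/generate route, the response shape, and the generate() helper are assumptions, with the real ctypes call into libllama left as a placeholder:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate(prompt, max_length=80):
    # Placeholder: a real wrapper would call into libllama via ctypes here
    # and return the generated continuation.
    return prompt + " ..."

@app.route("/api/v1/generate", methods=["POST"])
def generate_endpoint():
    payload = request.get_json(force=True)
    text = generate(payload["prompt"], payload.get("max_length", 80))
    # Assumed KoboldAI-style response shape: a list of result objects.
    return jsonify({"results": [{"text": text}]})

if __name__ == "__main__":
    app.run(port=5001)
```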
EDIT: I've adapted the single-file bindings into a pip-installable package called llama-cpp-python (it builds llama.cpp on install).
If anyone's just looking for Python bindings, I put together some single-file bindings. To use them you first have to build llama.cpp as a shared library. On Linux, for example, add a libllama.so target to the Makefile:

libllama.so: llama.o ggml.o
	$(CXX) $(CXXFLAGS) -shared -fPIC -o libllama.so llama.o ggml.o $(LDFLAGS)

Then run make libllama.so. |
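If those single-file bindings are ctypes-based (as the shared-library build suggests), the loading step looks roughly like the sketch below; the library path is an assumption, and the actual llama.cpp C symbols would still need argtypes/restype declarations before use:

```python
import ctypes
import pathlib

# Assumed location of the shared library produced by `make libllama.so`.
LIB_PATH = pathlib.Path(__file__).parent / "libllama.so"

# Load the library. Real bindings would next declare argtypes/restype for the
# llama.cpp C functions (context creation, tokenization, eval, sampling)
# before calling them from Python.
llama = ctypes.CDLL(str(LIB_PATH))
print("loaded shared library:", llama)
```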
We are putting together a Hugging Face-like library with a Python interface that auto-downloads pre-compressed models: https://github.com/NolanoOrg/cformers/#usage |
I also found these bindings: https://github.com/PotatoSpudowski/fastLLaMa. Some feature suggestions, mostly about low-level capabilities:
|
Having issues with both variants on an M1 Mac: one crashes with zsh: illegal hardware instruction, and the Python bindings approach (after building the shared library) also produces an error. |
We have added most of these suggestions in the latest fastLLaMa update 👀 |
As #1156 is closed as a duplicate of this issue, I am bringing the discussion here about the creation of an official Python binding in the llama.cpp repository (which I now assume is the objective of this issue). The current external Python bindings seem to be:
But none really stands out as a candidate to be merged into llama.cpp. My proposal is to model the llama.cpp bindings after rwkv.cpp by @saharNooby (bert.cpp also follows a similar path).
We could keep the following in mind for the basic binding:
Any suggestions on which of the current external Python bindings could be considered a good starting point for eventual merge into llama.cpp? |
Hey @dmahurin, w.r.t. your proposal I should point out that what you describe is the current state of llama-cpp-python.
That being said, I don't have anything against moving these bindings to llama.cpp if that's something the maintainers think is worthwhile / the right approach. I would also be happy to transfer over the PyPI package as long as we don't break downstream users (text-generation-webui, langchain, babyagi, etc.). |
@dmahurin I don't see why merging Python bindings into this repo is needed when solutions like @abetlen's repo already exist. |
Hi @seemanne, the purpose is not to replace bash. The purpose is to widen the development community. Like it or not, Python is a very common language in AI development. I do not think having supported Python code would put any burden on C++ developers. Again, see rwkv.cpp and bert.cpp; the Python support in rwkv.cpp, for example, comes in the form of two Python files. As mentioned, there are five independent Python bindings for llama.cpp. Unifying at least the base Python binding would help focus related Python llama.cpp development. |
@abetlen, perhaps you saw that I created pull request #1660 to add low-level Python bindings from llama-cpp-python. The PR puts llama_cpp.py and low-level examples in the examples/ folder. There was a bit of filtering and some squashing to get a clean history for the low-level commits. For now I excluded the multi-char change, mainly because it created a dependency on another file, util.py (and the change looks more complex than I would expect). Any comments on the approach of the PR? |