Multimodal Support (Llava 1.5) #821
Conversation
llava-demo-sml.mov (demo video)
@damian0815 this is insane, thank you! I'll keep this open as it currently points to your fork, but if there's a way we can get llama.cpp to expose the example API and bind to it, I'll happily merge it in.
@abetlen 🙇 i'm waiting on a promised review of my pull request against llama.cpp; the API will likely need to change upstream, so yeah, no point merging this until then.
This is great! However, it seems to have slightly slower inference than the pure C++ code. Does it offload layers to the GPU?
if you're referring to the speed in the video - the demo is running off a laptop, which is almost certainly battery- and/or thermally throttled. it is running on the GPU, but the video isn't intended to illustrate performance :)
When will this be resolved? I need this feature soon - many models are now multimodal.
Also curious what needs to be done for this to be merged - anything I can do to help?
@Josh-XT @zpzheng one thing you could do is leave a comment on my pull request against llama.cpp (ggerganov/llama.cpp#3613).
Why not take …
ok so it seems llama.cpp are just ignoring my work. yay. open source communication FTW. @abetlen do you already have other channels of communication open with the llama.cpp repo? i don't want to re-implement/refactor C++ code for it to be ignored/rejected any more times than is strictly necessary
@damian0815 I'll try to open an issue there as well to get things moving. There are a few projects in the examples folder that I'd love to include in the API here (finetuning, etc.), but I understand that it also makes the API surface larger. I'll see what can be done.
In llama.cpp issue #3798, Llava 1.5 runs on the server and can be used from the browser. I also got Llava 1.5 running on my computer by following the commands in that issue.
yahoo
🔥
Is it possible to use multimodality without the server / OpenAI wrapper?
@remixer-dec yes - I should add this to the docs, but if you check out llama_cpp/server/app.py you'll see that it's done by passing a LLaVA-specific chat_handler to the Llama class.
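For reference, a minimal sketch of that chat_handler approach, without the server. This assumes the Llava15ChatHandler name from llama_cpp.llama_chat_format and uses hypothetical model/mmproj filenames; details may differ across versions:

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The CLIP/mmproj projector weights are loaded by the chat handler,
# separately from the language model weights. Filenames are placeholders.
chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")

llm = Llama(
    model_path="llava-v1.5-7b.Q4_K.gguf",  # hypothetical filename
    chat_handler=chat_handler,
    n_ctx=2048,       # larger context to leave room for the image embedding
    logits_all=True,  # needed by the llava chat handler
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant who describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
                {"type": "text", "text": "What is in this image?"},
            ],
        },
    ]
)
print(response["choices"][0]["message"]["content"])
```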
Do you think Fuyu-8B would work with this?
Works with ggerganov/llama.cpp#3613, i.e. the llava_servable branch on my fork of llama.cpp: https://github.com/damian0815/llama.cpp/tree/llava_servable. The llava C method bindings have just been added to the bottom of llama_cpp.py - lmk if there's somewhere better they should go.

To run the example:
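The run instructions are cut off here in the archived thread. As a rough sketch: in released versions of llama-cpp-python this functionality is exposed through the OpenAI-compatible server via the --clip_model_path and --chat_format flags (these flag names and the model filenames below are assumptions and may not match this PR's branch):

```bash
# Hypothetical filenames; download a LLaVA 1.5 GGUF model and its
# matching mmproj (CLIP projector) weights first.
python -m llama_cpp.server \
  --model llava-v1.5-7b.Q4_K.gguf \
  --clip_model_path mmproj-model-f16.gguf \
  --chat_format llava-1-5
```

The server then accepts OpenAI-style chat completion requests whose user messages include image_url content parts, as in the Python example earlier in the thread.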