Add GPT4All local provider #209
Thanks for submitting your first pull request! You are awesome! 🤗 |
@krassowski This seems like a reasonable option, but I believe we will need some UX changes (messaging, confirmation before downloading, a progress bar, etc.) to provide this model option. In some cases, users might already have this model installed, so we will need to handle that. To start, I think we can go with the alternative option of letting users download it to a specific location and configure it in the UI. I want to try this out locally. In case I download the model, does this code require it to be located at |
Currently yes, but we could change it by providing the model path. I could add a field the same way there is a field for the number of threads; does that sound good? For reference, gpt4all documents the default model path here and defines it here, while LangChain mentions it here. |
What do you think about disabling auto-download and just displaying an error if the model is not available with instructions for download? |
Yes, that sounds good. Thanks for looking into this. |
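The error-instead-of-download behaviour agreed on above could look roughly like the sketch below. It is only an illustration, not the code in this PR: `resolve_model_path` and the wording of the message are hypothetical, and the only detail taken from this thread is the default cache directory `~/.cache/gpt4all/`.

```python
from pathlib import Path
from typing import Optional

# Default cache directory mentioned in this PR; everything else is illustrative.
DEFAULT_MODEL_DIR = Path.home() / ".cache" / "gpt4all"


def resolve_model_path(file_name: str, model_dir: Optional[Path] = None) -> Path:
    """Return the path to a local model file, or raise with download instructions.

    Hypothetical helper: it never downloads anything itself.
    """
    directory = Path(model_dir) if model_dir is not None else DEFAULT_MODEL_DIR
    candidate = directory / file_name
    if candidate.is_file():
        return candidate
    raise FileNotFoundError(
        f"Model file '{file_name}' was not found in '{directory}'. "
        "Download the model weights manually, place the file in that "
        "directory, and try again."
    )
```

Accepting an optional `model_dir` also covers the case raised earlier in the thread, where a user already has the weights stored somewhere other than the default location.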
@krassowski thanks for working on this, I think supporting local models is really important! |
@krassowski Thanks for working on this. I was able to download and connect with the ggml-gpt4all-j-v1.3-groovy model using these changes, and it worked. There were some issues with the LangChain version and with other models in the list. I also rebased from main and can submit the fixes; would you be able to give me permissions to your fork to merge those?
@3coins just checking if you wanted to push to my branch, or should I start working on addressing the review suggestions? |
@krassowski I also ran into a prompt size issue after just 2 consecutive prompts. Is a latency of 5+ minutes expected for these models? I am running these on a Mac M1 Pro (16 GB). |
@krassowski There is also some encouraging progress on LLM compression, which should lead to better local models in the future that behave more like those from external providers. |
Thank you! On the performance side, when a model generates only a few tokens per second, streaming the response token by token gives a much better UX: responses taking minutes is much less of a problem when tokens reach the user as they are generated. I think this was not discussed previously, so I opened #228 to track it (for your consideration). |
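To illustrate the streaming idea, here is a minimal, self-contained sketch. It does not use gpt4all or LangChain: `fake_slow_model` is a made-up stand-in for a slow local model, and `stream_response` is a hypothetical helper that hands each token to a callback as soon as it is produced instead of waiting for the full response.

```python
import time
from typing import Callable, Iterable, Iterator, List


def fake_slow_model(tokens: Iterable[str], delay: float = 0.0) -> Iterator[str]:
    """Stand-in for a slow local model: yields one token at a time."""
    for token in tokens:
        time.sleep(delay)  # simulated per-token generation time
        yield token


def stream_response(token_source: Iterator[str], on_token: Callable[[str], None]) -> str:
    """Forward each token to `on_token` as soon as it arrives; return the full text."""
    parts: List[str] = []
    for token in token_source:
        on_token(token)  # the UI can render this token immediately
        parts.append(token)
    return "".join(parts)


if __name__ == "__main__":
    source = fake_slow_model(["Local ", "models ", "can ", "stream."], delay=0.2)
    stream_response(source, lambda t: print(t, end="", flush=True))
    print()
```

The total generation time is unchanged, but the user sees the first token after one delay rather than after all of them, which is the UX gain described above.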
I resolved conflicts to push it along. What are the next steps here? Would you like to revisit the model choice, resolve any of the issues referenced? I can help, just not sure what is blocking here. |
One approach to model selection might be to track some of the models supported by GPT4All, which provides a desktop app for playing with local chat models. The GPT4All repo often picks up issues requesting new and popular models, and the ones they support may be indicative of some sort of local LLM user community. Their model list is here. |
@krassowski Hey Michal! I'm very sorry that your PR was left in the queue for so long; this PR was submitted when I was out on vacation, and I had missed it in the recent weeks. I submitted a PR to your branch to fix a few bugs I had encountered and add some documentation for prospective GPT4All users. Please review and merge this when you have time: krassowski#1 After that, the next step is to rebase this branch onto the latest commit on We really appreciate your effort and patience on this PR! We are aiming to include your PR in the next release of Jupyter AI v1 and v2. |
@krassowski This PR looks great! The only remaining task is to rebase this branch onto the latest commit of main to make sure CI passes. I would also remove the merge commits to preserve a linear history for this branch, as we will backport this PR to 1.x. 👍
df53b37 to 06ced6d
@krassowski Awesome work! 🎉
Hi Team, I downloaded the models into the cache folder as suggested. Please suggest the steps and working versions needed to run a local model through the chat interface or in a Jupyter cell via magic commands. Please find attached screenshots of the errors. I am still getting the same error ("wasn't able to index that path") in the chat interface (#348), and when using the ai_magic command, the response is the following - |
* Add GPT4All
* Allow to tune number of threads
* Disable auto-download
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* fix build
* bump langchain to v0.0.223 (see: langchain-ai/langchain@265c285)
* implement async for GPT4All
* update user docs with GPT4All installation instructions

Co-authored-by: 3coins <pyjain@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: David L. Qiu <david@qiu.dev>
This is a proof of concept for #190. I tried a number of models, and GPT4All appears to be the most straightforward to install.
It would be good if the language of the document that the selection comes from were included in the prompt (here the model assumed it was C# for some reason). Results, as seen above, are not great for coding tasks, but supposedly these models are good at reasoning and conversations.
Langchain updates
A newer langchain version is required because:
* >= 0.0.174: the bindings switched from pygpt4all to a new official gpt4all package (Update GPT4ALL integration langchain-ai/langchain#4567)
* >= 0.0.188: to support the allow_download attribute (add allow_download as class attribute for GPT4All langchain-ai/langchain#5512)
Model download
GPT4All bindings have native support for downloading model weights (disabled by default in langchain). If we decide to toggle it on by default, the user would not have to do anything and the model would just work. The experience will depend on network speed, as downloading the model can take from minutes to hours, but then it is cached in ~/.cache/gpt4all/. The progress bar displays only in the terminal, but download failures show up in the UI as exception tracebacks. Ideally we would have a way to show in the UI that a download is in progress.
Alternatively, users can download the model directly, e.g.
The download sizes are:
Performance
GPT4All runs on CPU (there is also a GPU version, GPT4AllGPU, but there are no bindings for it in langchain, although we could contribute them). The performance of the CPU version depends somewhat on the number of threads (though using too many threads can slow it down). This PR makes the number of threads user-configurable.
Additionally, a number of fields could be added to enhance user configurability, e.g. temp, n_predict (max output tokens), etc.
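As a rough sketch of how such user-configurable fields might be grouped and validated: the class name `LocalModelSettings`, the default values, and the "half the CPU count" heuristic below are all assumptions made for illustration; only the field names n_threads, temp and n_predict come from the description above.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class LocalModelSettings:
    """Hypothetical container for user-tunable generation settings."""

    n_threads: Optional[int] = None  # None means: derive a default from the CPU count
    temp: float = 0.7                # sampling temperature (illustrative default)
    n_predict: int = 256             # max output tokens (illustrative default)

    def __post_init__(self) -> None:
        if self.n_threads is None:
            # Assumed heuristic: use half the reported CPUs, since too many
            # threads can slow generation down.
            self.n_threads = max(1, (os.cpu_count() or 1) // 2)
        if self.n_threads < 1:
            raise ValueError("n_threads must be at least 1")
        if self.temp < 0:
            raise ValueError("temp must be non-negative")
        if self.n_predict < 1:
            raise ValueError("n_predict must be at least 1")

    def as_kwargs(self) -> Dict[str, Any]:
        """Return the settings as keyword arguments for a model constructor."""
        return {
            "n_threads": self.n_threads,
            "temp": self.temp,
            "n_predict": self.n_predict,
        }
```

Validating the values up front means a bad setting surfaces as a clear error in the configuration UI rather than as a traceback from the model bindings.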