Installation

This workshop uses multiple tools that can have conflicting requirements. We recommend using pyenv to manage separate Python versions, then virtualenv to create a separate environment for each component (e.g. data_prep, rag).
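As a rough illustration of the one-environment-per-component idea, the sketch below creates an isolated environment for each component. It uses Python's built-in `venv` module as a stand-in for virtualenv, and the `.venv-<name>` directory naming and the `create_envs` helper are assumptions for this example, not part of the workshop repo.

```python
import venv
from pathlib import Path

# Component names taken from the examples in the text above.
COMPONENTS = ["data_prep", "rag"]


def create_envs(root, components=COMPONENTS):
    """Create one isolated environment per component under `root`."""
    created = []
    for name in components:
        # Hypothetical naming scheme: .venv-data_prep, .venv-rag, ...
        env_dir = Path(root) / f".venv-{name}"
        # with_pip=False keeps creation fast; pass True to bootstrap pip.
        venv.create(env_dir, with_pip=False)
        created.append(env_dir)
    return created


if __name__ == "__main__":
    for env in create_envs("."):
        print(f"created {env}")
```

Each environment can then be activated separately before installing that component's requirements, so conflicting dependencies never share an interpreter.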

Introduction

In this workshop, we show the two main workflows when working with LLMs: RAG (retrieval-augmented generation) and fine-tuning. We show how to take a PDF and generate a dataset for fine-tuning and evaluation.

Data Preparation

We use marker to convert PDFs into markdown, which can then be easily parsed into a format suitable for LLM consumption. Please refer to the marker docs for installation and setup. Once it is installed, convert the example PDF into markdown using marker.
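To show what "parsed into a format suitable for LLM consumption" might look like, here is a minimal sketch that splits a markdown document into one chunk per heading. The `split_by_heading` helper and the chunk layout are assumptions for illustration; the workshop's own parsing may differ, and this version does not special-case `#` lines inside fenced code blocks.

```python
import re

# Matches ATX-style markdown headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown_text):
    """Split markdown into {"heading", "text"} chunks, one per heading."""
    chunks = []
    heading, lines = "", []

    def flush():
        # Skip the empty chunk that exists before the first heading.
        if heading or any(line.strip() for line in lines):
            chunks.append({"heading": heading, "text": "\n".join(lines).strip()})

    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    sample = "# Title\nintro\n\n## Section\nbody text\n"
    for chunk in split_by_heading(sample):
        print(chunk)
```

Heading-sized chunks tend to be a convenient unit for prompting, since each one carries its own title as context.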

Now that you have a suitable document, we'll turn it into a Q&A dataset using a local LLM. First, install Ollama.
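Once Ollama is installed and running, generating Q&A pairs from a chunk of text could look roughly like the sketch below. It posts to Ollama's local `/api/generate` endpoint on the default port. The model name `llama3`, the prompt wording, the `Q:`/`A:` output convention, and the helper names are all assumptions for illustration, not the workshop's actual script.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(chunk_text, model="llama3"):
    """Build the JSON payload for a single non-streaming generation."""
    prompt = (
        "Write question/answer pairs about the text below.\n"
        "Use one 'Q: ...' line followed by one 'A: ...' line per pair.\n\n"
        + chunk_text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def parse_qa(raw):
    """Extract Q/A pairs from model output, ignoring any other lines."""
    pairs, question = [], None
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question:
            pairs.append({"question": question, "answer": line[2:].strip()})
            question = None
    return pairs


def generate_qa(chunk_text, model="llama3"):
    """Send one chunk to the local Ollama server and return parsed pairs."""
    body = json.dumps(build_request(chunk_text, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read())
    # With "stream": False the generated text is in the "response" field.
    return parse_qa(reply["response"])
```

Calling `generate_qa(chunk["text"])` for each markdown chunk and writing the results out as JSON lines would give a dataset usable for both fine-tuning and evaluation. Model output is not guaranteed to follow the requested format, which is why `parse_qa` skips lines it does not recognize.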
