Merge pull request #63 from confident-ai/feature/add-llamaindex-integration

Add LlamaIndex
jwongster2 authored Aug 28, 2023
2 parents 7b50720 + 1ba1e34 commit 3494b7c
Showing 2 changed files with 117 additions and 1 deletion.
113 changes: 113 additions & 0 deletions docs/docs/tutorials/evaluating-llamaindex.md
@@ -0,0 +1,113 @@
# Evaluating LlamaIndex

LlamaIndex is an opinionated framework for Retrieval-Augmented Generation (RAG): it connects your data sources to queries and responses so that answers are grounded in retrieved context.

## Installation and Setup

```sh
pip install -q -q llama-index
pip install -U deepeval
```

Once installed, you can set up DeepEval and start writing tests.

```sh
# Optional step: log in to get a nice dashboard for your tests later!
# During this step, make sure to save your project as llama
deepeval login
```

## Use With Your LlamaIndex Application

DeepEval integrates nicely with LlamaIndex's `ResponseEvaluator` class. Below is an example of setting up a factual consistency evaluation over a LlamaIndex query engine.

```python
from typing import List

from deepeval.metrics.factual_consistency import FactualConsistencyMetric

from llama_index import (
    TreeIndex,
    VectorStoreIndex,
    SimpleDirectoryReader,
    LLMPredictor,
    ServiceContext,
)
from llama_index.llms import OpenAI
from llama_index.evaluation import ResponseEvaluator
from llama_index.response.schema import Response
from llama_index.schema import Document

import openai

# Replace with your actual OpenAI API key
api_key = "sk-XXX"
openai.api_key = api_key

# Build a GPT-4-backed service context and LlamaIndex's stock response evaluator
gpt4 = OpenAI(temperature=0, model="gpt-4", api_key=api_key)
service_context_gpt4 = ServiceContext.from_defaults(llm=gpt4)
evaluator_gpt4 = ResponseEvaluator(service_context=service_context_gpt4)
```

### Getting a LlamaHub Loader

```python
from llama_index import download_loader

WikipediaReader = download_loader("WikipediaReader")

loader = WikipediaReader()
documents = loader.load_data(pages=['Tokyo'])
tree_index = TreeIndex.from_documents(documents=documents)
vector_index = VectorStoreIndex.from_documents(
    documents, service_context=service_context_gpt4
)
```
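
Both a tree index and a vector index are built over the same Wikipedia pages. The example below queries the tree index; a short sketch at the end of this guide shows the vector index and the stock `ResponseEvaluator` being used for comparison.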

We then build a custom evaluator modeled on LlamaIndex's `BaseEvaluator` class, which requires an `evaluate` method.

In this example, we show how to write a factual consistency check using DeepEval's `FactualConsistencyMetric`.

```python
class FactualConsistencyResponseEvaluator:
    def get_context(self, response: Response) -> List[Document]:
        """Get context information from the given Response object using its source nodes.

        Args:
            response (Response): Response object from an index based on the query.

        Returns:
            List of Documents holding each source node's content as context information.
        """
        context = []
        for context_info in response.source_nodes:
            context.append(Document(text=context_info.node.get_content()))
        return context

    def evaluate(self, response: Response) -> str:
        """Evaluate the factual consistency of a response against its retrieved context."""
        answer = str(response)
        context = self.get_context(response)
        context = " ".join([d.text for d in context])

        # measure() scores the answer against the context; is_successful() reflects that result
        metric = FactualConsistencyMetric()
        metric.measure(output=answer, context=context)
        if metric.is_successful():
            return "YES"
        return "NO"


evaluator = FactualConsistencyResponseEvaluator()
```
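
Returning `"YES"` or `"NO"` mirrors the convention used by LlamaIndex's built-in `ResponseEvaluator`, so this DeepEval-backed evaluator can be swapped in wherever a plain response evaluator is expected.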

You can then evaluate a query response as follows:

```python
query_engine = tree_index.as_query_engine()
response = query_engine.query("How did Tokyo get its name?")
eval_result = evaluator.evaluate(response)
```
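
For comparison, below is a minimal sketch (not part of the original integration, with illustrative variable names such as `vector_query_engine`) that runs the same query against the vector index and evaluates the response with both the custom DeepEval-backed evaluator and the `evaluator_gpt4` instance created earlier. It assumes the objects defined above are in scope and that this version of `ResponseEvaluator.evaluate` accepts a `Response` object and returns `"YES"`/`"NO"`.

```python
# A minimal sketch, reusing vector_index, evaluator, and evaluator_gpt4 defined above.
vector_query_engine = vector_index.as_query_engine()
vector_response = vector_query_engine.query("How did Tokyo get its name?")

# DeepEval-backed factual consistency check ("YES"/"NO")
deepeval_result = evaluator.evaluate(vector_response)

# LlamaIndex's built-in GPT-4 response evaluator, for comparison ("YES"/"NO")
llamaindex_result = evaluator_gpt4.evaluate(vector_response)

print(f"DeepEval: {deepeval_result}, LlamaIndex: {llamaindex_result}")
```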
5 changes: 4 additions & 1 deletion docs/sidebars.js
@@ -61,7 +61,10 @@ const sidebars = {
     {
       type: "category",
       label: "Tutorials",
-      items: ["tutorials/evaluating-langchain"]
+      items: [
+        "tutorials/evaluating-llamaindex",
+        // "tutorials/evaluating-langchain",
+      ]
     },
     {
       type: "category",
