-
Notifications
You must be signed in to change notification settings - Fork 453
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #63 from confident-ai/feature/add-llamaindex-integ…
…ration Add llamaidnex
- Loading branch information
Showing
2 changed files
with
117 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,113 @@ | ||
# Evaluating LlamaIndex | ||
|
||
LlamaIndex connects data sources with queries and responses. It provides an opinionated framework for Retrieval-Augmented Generation. | ||
|
||
## Installation and Setup | ||
|
||
```sh | ||
pip install -q -q llama-index | ||
pip install -U deepeval | ||
``` | ||
|
||
Once installed , you can get set up and start writing tests. | ||
|
||
```sh | ||
# Optional step: Login to get a nice dashboard for your tests later! | ||
# During this step - make sure to save your project as llama | ||
deepeval login | ||
``` | ||
|
||
## Use With Your LlamaIndex | ||
|
||
DeepEval integrates nicely with LlamaIndex's `ResponseEvaluator` class. Below is an example of the factual consistency documentation. | ||
|
||
```python | ||
|
||
from llama_index.response.schema import Response | ||
from typing import List | ||
from llama_index.schema import Document | ||
from deepeval.metrics.factual_consistency import FactualConsistencyMetric | ||
|
||
from llama_index import ( | ||
TreeIndex, | ||
VectorStoreIndex, | ||
SimpleDirectoryReader, | ||
LLMPredictor, | ||
ServiceContext, | ||
Response, | ||
) | ||
from llama_index.llms import OpenAI | ||
from llama_index.evaluation import ResponseEvaluator | ||
|
||
import os | ||
import openai | ||
|
||
api_key = "sk-XXX" | ||
openai.api_key = api_key | ||
|
||
gpt4 = OpenAI(temperature=0, model="gpt-4", api_key=api_key) | ||
service_context_gpt4 = ServiceContext.from_defaults(llm=gpt4) | ||
evaluator_gpt4 = ResponseEvaluator(service_context=service_context_gpt4) | ||
|
||
``` | ||
|
||
#### Getting a lLamaHub Loader | ||
|
||
```python | ||
from llama_index import download_loader | ||
|
||
WikipediaReader = download_loader("WikipediaReader") | ||
|
||
loader = WikipediaReader() | ||
documents = loader.load_data(pages=['Tokyo']) | ||
tree_index = TreeIndex.from_documents(documents=documents) | ||
vector_index = VectorStoreIndex.from_documents( | ||
documents, service_context=service_context_gpt4 | ||
) | ||
``` | ||
|
||
We then build an evaluator based on the `BaseEvaluator` class that requires an `evaluate` method. | ||
|
||
In this example, we show you how to write a factual consistency check. | ||
|
||
```python | ||
class FactualConsistencyResponseEvaluator: | ||
def get_context(self, response: Response) -> List[Document]: | ||
"""Get context information from given Response object using source nodes. | ||
Args: | ||
response (Response): Response object from an index based on the query. | ||
Returns: | ||
List of Documents of source nodes information as context information. | ||
""" | ||
context = [] | ||
|
||
for context_info in response.source_nodes: | ||
context.append(Document(text=context_info.node.get_content())) | ||
|
||
return context | ||
|
||
def evaluate(self, response: Response) -> str: | ||
"""Evaluate factual consistency metrics | ||
""" | ||
answer = str(response) | ||
context = self.get_context(response) | ||
metric = FactualConsistencyMetric() | ||
context = " ".join([d.text for d in context]) | ||
score = metric.measure(output=answer, context=context) | ||
if metric.is_successful(): | ||
return "YES" | ||
else: | ||
return "NO" | ||
|
||
evaluator = FactualConsistencyResponseEvaluator() | ||
``` | ||
|
||
You can then evaluate as such: | ||
|
||
```python | ||
query_engine = tree_index.as_query_engine() | ||
response = query_engine.query("How did Tokyo get its name?") | ||
eval_result = evaluator.evaluate(response) | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters