[FEATURE][DocBot] Generate responses with our RAG pipeline #83

Open
dtaivpp opened this issue Nov 1, 2023 · 2 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed OSCI

Comments

@dtaivpp
Collaborator

dtaivpp commented Nov 1, 2023

Is your feature request related to a problem?

At the moment we have several pieces of the RAG pipeline built, but now we need to pull it all together.

What solution would you like?

Our DocBot class will call docbot.language_model:generate_response (see #80). generate_response will need to query our cohere-index and collect the response to return to the user.

https://opensearch.org/docs/latest/ml-commons-plugin/conversational-search/#using-the-pipeline

Note: interaction_size is the number of previous chats to send as context, and context_size is the number of results from our search that we will send through.

The most challenging part of this is that we will need to use a neural search in our query section in order to find the most relevant documents: https://opensearch.org/docs/latest/search-plugins/neural-text-search/#step-4-search-the-index-using-neural-search

The model_id that we will need to reference here is the MODEL_ID that is being used by our ingestion pipeline.
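As a rough illustration of the neural query clause described above, here is a minimal Python sketch. The function name build_neural_query, the default field name content_embedding, and the default k of 5 are assumptions for illustration only; how MODEL_ID is actually stored or looked up in this project is not specified here, so it is simply passed in as an argument.

```python
def build_neural_query(question: str, model_id: str, k: int = 5,
                       field: str = "content_embedding") -> dict:
    """Build a neural search clause for an OpenSearch query body.

    Hypothetical helper: `field` is assumed to be the vector field written
    by the ingestion pipeline, and `model_id` must be the same model that
    pipeline used to create the embeddings.
    """
    return {
        "neural": {
            field: {
                "query_text": question,  # text embedded at query time
                "model_id": model_id,    # must match the ingestion model
                "k": k,                  # number of nearest neighbours to return
            }
        }
    }


if __name__ == "__main__":
    # "example-model-id" is a placeholder, not a real model ID.
    print(build_neural_query("How do I enable segment replication",
                             "example-model-id"))
```

The clause can then be placed under "query" in the search body, on its own or combined with other queries.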

@dtaivpp dtaivpp added enhancement New feature or request untriaged Issues not seen by a maintainer yet. labels Nov 1, 2023
@LucasWang750

I would like to take this issue

@dtaivpp dtaivpp removed the untriaged Issues not seen by a maintainer yet. label Nov 1, 2023
@dtaivpp dtaivpp added good first issue Good for newcomers help wanted Extra attention is needed OSCI labels Nov 1, 2023
@dtaivpp
Collaborator Author

dtaivpp commented Dec 15, 2023

Here is an example of what a search request that uses the RAG pipeline looks like:

GET /docbot/_search
{
  "_source": {
    "exclude": [
      "content_embedding"
    ]
  },
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "content": {
              "query": "How do I enable segment replication"
            }
          }
        },
        {
          "neural": {
            "content_embedding": {
              "query_text": "How do I enable segment replication",
              "model_id": "Z8VpCYwBKF5Jo_eo10QE",
              "k": 5
            }
          }
        }
      ]
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-3.5-turbo",
      "llm_question": "How do I enable segment replication",
      "conversation_id": "JcVbCYwBKF5Jo_eoe0TD",
      "context_size": 3,
      "interaction_size": 3,
      "timeout": 45
    }
  }
}

We will need to pass in the model ID, the conversation ID, and the question. Then, when we are processing the response, the generated answer is found at response["ext"]["retrieval_augmented_generation"]["answer"].
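A minimal Python sketch of how generate_response could assemble this request and read the answer back. The helper names build_rag_search_body and extract_answer are hypothetical, the defaults simply mirror the example request, and the sketch assumes the index already has a search pipeline with the RAG processor attached. Sending the request (for instance with opensearch-py's client.search(index="docbot", body=body)) is left out so the sketch runs without a cluster.

```python
def build_rag_search_body(question: str, model_id: str, conversation_id: str,
                          context_size: int = 3, interaction_size: int = 3,
                          timeout: int = 45,
                          llm_model: str = "gpt-3.5-turbo") -> dict:
    """Assemble a hybrid (match + neural) search body with RAG parameters.

    Hypothetical helper; field names follow the example request.
    """
    return {
        # Keep the large embedding vector out of the returned documents.
        "_source": {"excludes": ["content_embedding"]},
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"content": {"query": question}}},
                    {"neural": {"content_embedding": {
                        "query_text": question,
                        "model_id": model_id,
                        "k": 5,
                    }}},
                ]
            }
        },
        "ext": {
            "generative_qa_parameters": {
                "llm_model": llm_model,
                "llm_question": question,
                "conversation_id": conversation_id,
                "context_size": context_size,          # search results sent to the LLM
                "interaction_size": interaction_size,  # previous chats sent as context
                "timeout": timeout,
            }
        },
    }


def extract_answer(response: dict) -> str:
    """Pull the generated answer out of a search response."""
    return response["ext"]["retrieval_augmented_generation"]["answer"]
```

Keeping the body builder and the answer extraction as separate pure functions makes both easy to unit test with plain dictionaries before wiring them to a live cluster.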
