UnprocessableEntityError when embedding using openai and a proxy #464

Open
MoezGholami opened this issue Nov 12, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@MoezGholami

Description

Original issue on OpenAI community forum: https://community.openai.com/t/attributeerror-module-openai-has-no-attribute-error/486676

The new version of the openai package breaks the Jupyter AI notebook. Any command sent to the OpenAI models raises the following error:

AttributeError: module 'openai' has no attribute 'error'

The Jupyter notebook works with openai version 0.28 but fails with openai 1.2.3.
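
For context, openai 0.28 exposed its exception classes under `openai.error` (for example `openai.error.RateLimitError`), while openai 1.x exposes them at the top level (`openai.RateLimitError`), which is why code written for the old layout raises the `AttributeError` above. The following is a minimal sketch of a lookup that tolerates both layouts. The helper name `resolve_openai_exception` and the stand-in modules are hypothetical and are not part of Jupyter AI or LangChain.

```python
from types import SimpleNamespace


def resolve_openai_exception(module, name):
    """Return the exception class `name` from either package layout.

    Hypothetical helper: looks in `module.error` first (pre-1.0 layout),
    then falls back to the top level of `module` (1.x layout).
    """
    legacy_namespace = getattr(module, "error", None)  # only present before 1.0
    if legacy_namespace is not None and hasattr(legacy_namespace, name):
        return getattr(legacy_namespace, name)
    return getattr(module, name)  # raises AttributeError if missing in both


# Stand-ins for the two layouts, so the sketch runs without openai installed.
class FakeRateLimitError(Exception):
    pass


legacy_layout = SimpleNamespace(error=SimpleNamespace(RateLimitError=FakeRateLimitError))
modern_layout = SimpleNamespace(RateLimitError=FakeRateLimitError)

for layout in (legacy_layout, modern_layout):
    assert resolve_openai_exception(layout, "RateLimitError") is FakeRateLimitError
```

With the real package, the call would be `resolve_openai_exception(openai, "RateLimitError")` after `import openai`.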

Reproduce

  1. pip install --upgrade openai
  2. Send any command to OpenAI's API in a Jupyter notebook.

Expected behavior

Jupyter AI should work with the new openai python package.

Context

Versions:

  1. jupyter_ai: 2.5.0
  2. openai: 1.2.3
  3. python: 3.11.6
  4. pip: 23.3.1
@MoezGholami MoezGholami added the bug (Something isn't working) label Nov 12, 2023

welcome bot commented Nov 12, 2023

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@dlqqq
Member

dlqqq commented Nov 21, 2023

Quick dig through the LangChain PRs indicates that this was fixed last week, first released at v0.0.336. Should be fixable by bumping our LangChain version.

Edit: forgot to include link. langchain-ai/langchain#13262

@sqlreport

sqlreport commented Jan 11, 2024

@dlqqq I have:
python=3.10
jupyter_ai==1.9.0
langchain==0.0.350
openai==1.7.1

When I run the /learn command in the chat UI, I get an error:
IndexError: list index out of range

@sqlreport

sqlreport commented Jan 11, 2024

When I change my documents to .txt files, I get the following error (it looks like the endpoint expects string input instead):

Exception: UnprocessableEntityError("Error code: 422 - {'detail': [{'type': 'string_type', 'loc': ['body', 'input', 'str'], 'msg': 'Input should be a valid string', 'input': [[2465, 836, 374, 350, 6043, 12336]], 'url': 'https://errors.pydantic.dev/2.4/v/string_type'}, {'type': 'string_type', 'loc': ['body', 'input', 'list[str]', 0], 'msg': 'Input should be a valid string', 'input': [2465, 836, 374, 350, 6043, 12336], 'url': 'https://errors.pydantic.dev/2.4/v/string_type'}]}")
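
The `loc` entries in this error (`['body', 'input', 'str']` and `['body', 'input', 'list[str]', 0]`) suggest that the endpoint validated `input` only as a string or a list of strings, while the request carried a list of integer lists, which look like token IDs. The sketch below classifies an `input` payload by shape to make that mismatch explicit. The helper names are hypothetical, and the "string-only" rule is an assumption inferred from the error text, not taken from the proxy's code.

```python
def describe_embedding_input(value):
    """Classify an embeddings `input` payload by its shape."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, list) and value:
        if all(isinstance(item, str) for item in value):
            return "list of strings"
        if all(isinstance(item, int) for item in value):
            return "token array"
        if all(
            isinstance(item, list) and item and all(isinstance(tok, int) for tok in item)
            for item in value
        ):
            return "list of token arrays"
    return "unsupported"


def accepted_by_string_only_validator(value):
    """Assumed rule: the endpoint accepts only `str` or `list[str]`."""
    return describe_embedding_input(value) in {"string", "list of strings"}


# The `input` value copied from the 422 error above.
rejected_input = [[2465, 836, 374, 350, 6043, 12336]]
print(describe_embedding_input(rejected_input))
print(accepted_by_string_only_validator(rejected_input))
```

If the proxy accepts only the first two shapes, any client that sends pre-tokenized input will be rejected in this way even though the same text sent as a string works.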

@dlqqq dlqqq changed the title from "Breaking changes after openai 1.2.3 (module 'openai' has no attribute 'error')" to "Support openai==1.x (module 'openai' has no attribute 'error')" Jan 11, 2024

@JasonWeill
Collaborator

@sqlreport This is likely a regression after PR #551 was merged and released in versions 1.9.0 and 2.9.0. I can look at it.

@JasonWeill
Collaborator

With tip-of-main in Jupyter AI and openai 1.6.1, I don't see this error with an OpenAI embedding model; /learn on a directory including text files works without errors.

@sqlreport

@JasonWeill Can you provide unit test code that mimics the openai call so I can troubleshoot my openai proxy? I have run the openai example code with a string and it works without issue. It must be the way the chat UI generates the openai call, where it passes non-string data.
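
One way to exercise a proxy directly, independent of Jupyter AI, is to send the same embeddings request with differently shaped `input` values and compare the status codes. The following is a rough sketch using only the standard library. The base URL, API key, and model name are placeholders, and `build_embedding_request` and `probe` are hypothetical helper names. The token array is the one from the 422 error in this thread.

```python
import json
import urllib.error
import urllib.request

PROXY_BASE_URL = "http://localhost:8000/v1"  # placeholder; replace with your proxy
MODEL = "text-embedding-ada-002"  # assumption; use the model your proxy serves


def build_embedding_request(base_url, api_key, payload_input, model=MODEL):
    """Build a POST request for the embeddings endpoint without sending it."""
    body = json.dumps({"model": model, "input": payload_input}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def probe(base_url, api_key):
    """Send one request per input shape and return the HTTP status of each."""
    cases = {
        "string": "hello world",
        "list of strings": ["hello world"],
        "list of token arrays": [[2465, 836, 374, 350, 6043, 12336]],
    }
    results = {}
    for label, payload_input in cases.items():
        request = build_embedding_request(base_url, api_key, payload_input)
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                results[label] = response.status
        except urllib.error.HTTPError as err:
            results[label] = err.code  # e.g. 422 if the shape is rejected
    return results
```

Calling `probe(PROXY_BASE_URL, api_key="...")` against a running proxy returns a mapping from input shape to status code. If only the token-array case fails, the proxy is rejecting pre-tokenized input rather than the text itself.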

@JasonWeill
Collaborator

@sqlreport I looked at our existing codebase and I don't see unit tests for the /learn handler, sorry. There are two possible places where unexpected non-string data might come in:

  1. The split method in packages/jupyter-ai/jupyter_ai/document_loaders/directory.py:

        def split(path, all_files: bool, splitter):
            chunks = []

            for dir, subdirs, filenames in os.walk(path):
                # Filter out hidden filenames, hidden directories, and excluded directories,
                # unless "all files" are requested
                if not all_files:
                    subdirs[:] = [d for d in subdirs if not (d[0] == "." or d in EXCLUDE_DIRS)]
                    filenames = [f for f in filenames if not f[0] == "."]

                for filename in filenames:
                    filepath = Path(os.path.join(dir, filename))
                    if filepath.suffix not in SUPPORTED_EXTS:
                        continue

                    document = dask.delayed(path_to_doc)(filepath)
                    chunk = dask.delayed(split_document)(document, splitter)
                    chunks.append(chunk)

            flattened_chunks = dask.delayed(flatten)(*chunks)
            return flattened_chunks

    This takes a file and converts it to chunks, according to one of several splitter classes.
  2. The get_embeddings method in the same file:

        def get_embeddings(chunks, em_provider_cls, em_provider_args):
            # split documents in parallel w.r.t. each file
            embeddings = []

            # compute embeddings in parallel
            for chunk in chunks:
                embedding = dask.delayed(embed_chunk)(chunk, em_provider_cls, em_provider_args)
                embeddings.append(embedding)

            return dask.delayed(join)(embeddings)

    This creates a list of Dask delayed tasks to send one chunk at a time to the selected embedding model class (em_provider_cls). Dask executes these in learn.py, the /learn handler:

        embedding_records = await dask_client.compute(delayed)
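
To check whether non-string data enters at either of these two places, one option is to inspect the chunks just before they are embedded. The sketch below is illustrative only. `Chunk` is a hypothetical stand-in for a split document, on the assumption that real chunks carry their text in a `page_content` attribute as LangChain documents do, and `find_non_string_chunks` is not part of Jupyter AI.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a split document chunk."""

    page_content: object  # expected to be a str
    source: str


def find_non_string_chunks(chunks):
    """Return (index, source, type name) for each chunk whose content is not a str."""
    return [
        (index, chunk.source, type(chunk.page_content).__name__)
        for index, chunk in enumerate(chunks)
        if not isinstance(chunk.page_content, str)
    ]


chunks = [
    Chunk(page_content="plain text", source="notes.txt"),
    Chunk(page_content=[2465, 836, 374], source="tokens.txt"),
]
print(find_non_string_chunks(chunks))
```

An empty result at this stage would point away from the splitter and toward the embedding client or the proxy as the place where strings become token arrays.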

@JasonWeill JasonWeill changed the title from "Support openai==1.x (module 'openai' has no attribute 'error')" to "UnprocessableEntityError when embedding using openai and a proxy" Jan 12, 2024

@andrewbovey

@sqlreport Did you find a solution to this exception?

Exception: UnprocessableEntityError("Error code: 422 - {'detail': [{'type': 'string_type', 'loc': ['body', 'input', 'str'], 'msg': 'Input should be a valid string', 'input': [[2465, 836, 374, 350, 6043, 12336]], 'url': 'https://errors.pydantic.dev/2.4/v/string_type'}, {'type': 'string_type', 'loc': ['body', 'input', 'list[str]', 0], 'msg': 'Input should be a valid string', 'input': [2465, 836, 374, 350, 6043, 12336], 'url': 'https://errors.pydantic.dev/2.4/v/string_type'}]}")

Mine is UnprocessableEntityError: Error code: 422 - {'detail': [{'type': 'string_type', 'loc': ['body', 'input', 'str'], 'msg': 'Input should be a valid string', 'input': [[3923, 14071, 3956, 527, 1070, 30]]}, {'type': 'string_type', 'loc': ['body', 'input', 'list[str]', 0], 'msg': 'Input should be a valid string', 'input': [3923, 14071, 3956, 527, 1070, 30]}]}

langchain==0.2.1
openai==1.31.0
