feat: Support chunking strategy in file_search tool in openai_dart #496

davidmigloz · 2024-07-20T08:57:59Z

By default, max_chunk_size_tokens is set to 800 and chunk_overlap_tokens is set to 400, meaning every file is indexed by being split up into 800-token chunks, with 400-token overlap between consecutive chunks.
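
To make the defaults concrete, here is a minimal sketch of how fixed-size chunks with overlap tile a file. It assumes a plain sliding window whose stride is the chunk size minus the overlap. The service's actual splitter is not described in this PR, and `chunk_spans` is a hypothetical helper, not part of openai_dart or the OpenAI API.

```python
def chunk_spans(total_tokens, max_chunk_size_tokens=800, chunk_overlap_tokens=400):
    """Return (start, end) token offsets for fixed-size overlapping chunks.

    Illustrative only: assumes a simple sliding window, which may differ
    from how the vector store actually splits files.
    """
    stride = max_chunk_size_tokens - chunk_overlap_tokens
    if stride <= 0:
        raise ValueError("overlap must be smaller than the chunk size")
    if total_tokens <= 0:
        return []

    spans = []
    start = 0
    while True:
        end = min(start + max_chunk_size_tokens, total_tokens)
        spans.append((start, end))
        if end >= total_tokens:
            break
        start += stride  # consecutive chunks share `chunk_overlap_tokens` tokens
    return spans


# A 2000-token file with the default settings:
print(chunk_spans(2000))
# → [(0, 800), (400, 1200), (800, 1600), (1200, 2000)]
```

With the defaults, each chunk shares its last 400 tokens with the start of the next one, so most tokens in the file end up in two chunks.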

You can adjust this by setting chunking_strategy when adding files to the vector store. There are certain limitations to chunking_strategy:

- max_chunk_size_tokens must be between 100 and 4096 inclusive.
- chunk_overlap_tokens must be non-negative and should not exceed max_chunk_size_tokens / 2.
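
The two limits can be checked on the client before a request is sent. The sketch below is language-agnostic and does not use the openai_dart API. `static_chunking_strategy` is a hypothetical helper, and the returned dictionary follows the static chunking strategy shape as I understand it from the referenced documentation, so verify it against the Ref link.

```python
import json


def static_chunking_strategy(max_chunk_size_tokens=800, chunk_overlap_tokens=400):
    """Validate the documented limits and build a chunking_strategy payload.

    Hypothetical helper for illustration; the payload shape is an assumption
    based on the referenced file-search documentation.
    """
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096 inclusive")
    # Compare against 2 * overlap to avoid rounding issues with odd chunk sizes.
    if chunk_overlap_tokens < 0 or chunk_overlap_tokens * 2 > max_chunk_size_tokens:
        raise ValueError(
            "chunk_overlap_tokens must be non-negative and at most max_chunk_size_tokens / 2"
        )
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_chunk_size_tokens,
            "chunk_overlap_tokens": chunk_overlap_tokens,
        },
    }


# Smaller chunks with a quarter-size overlap:
print(json.dumps(static_chunking_strategy(1024, 256), indent=2))
```

Checking locally gives a clear error message up front instead of a rejected API call.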

Ref: https://platform.openai.com/docs/assistants/tools/file-search

feat: Support chunking strategy in file_search tool in openai_dart

d7f4d4c

davidmigloz self-assigned this Jul 20, 2024

davidmigloz added the t:enhancement (New feature or request) and p:openai_dart (openai_dart package) labels Jul 20, 2024

davidmigloz added this to the v0.8.0 milestone Jul 20, 2024

davidmigloz merged commit cfa974a into main Jul 20, 2024
1 check passed

davidmigloz deleted the chunking-strategy branch July 20, 2024 09:00

KennethKnudsen97 pushed a commit to KennethKnudsen97/langchain_dart that referenced this pull request Oct 1, 2024

feat: Support chunking strategy in file_search tool in openai_dart (d…

28f6a5c

…avidmigloz#496)

KennethKnudsen97 pushed a commit to KennethKnudsen97/langchain_dart that referenced this pull request Oct 1, 2024

feat: Support chunking strategy in file_search tool in openai_dart (d…

31106fe

…avidmigloz#496)
