When a PDF document with the following structure is read by Azure Document Intelligence, files for Paragraph 1 and Paragraph 2 are created in the GraphRAG input folder, but no file is created for the Table/Image(description).
Paragraph 1
Table
Paragraph 2
Image
...
Reproduction steps
1. In Retrieval settings > GraphRAG Collection > File loader, select `Azure AI Document Intelligence (figure+table extraction)`
2. Upload a PDF file containing a table in GraphRAG
3. Execute a query related to the table
Screenshots
No response
Logs
No response
Browsers
No response
OS
No response
Additional information
`AzureAIDocumentIntelligenceLoader` stores Text/Table/Image separately in the Document without duplication, while `GraphRAGIndexingPipeline` outputs only Text.
I think it would be more appropriate to have a format like `ktem_app_data/markdown_cache_dir`, where tables and other elements are expanded inline, as the text to be indexed.
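To illustrate the suggestion, here is a minimal sketch of expanding tables and image descriptions inline before writing the GraphRAG input text. It is not the actual kotaemon code: the `Doc` class, the `merge_inline` helper, and the metadata keys `file_name`, `type`, and `order` are all hypothetical names chosen for this example, and it assumes the loader can provide each element's position in the original PDF.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for one loader output item (not kotaemon's real class)."""
    text: str
    metadata: dict = field(default_factory=dict)


def merge_inline(docs: list[Doc]) -> dict[str, str]:
    """Group items by source file and join them in reading order."""
    by_file: dict[str, list[Doc]] = {}
    for doc in docs:
        by_file.setdefault(doc.metadata.get("file_name", "unknown"), []).append(doc)

    merged: dict[str, str] = {}
    for file_name, items in by_file.items():
        # Assumed key: "order" is each element's position in the original PDF.
        items.sort(key=lambda d: d.metadata.get("order", 0))
        # Keep every non-empty element, whatever its type, so tables and
        # image descriptions land between the paragraphs they belong to.
        merged[file_name] = "\n\n".join(d.text for d in items if d.text.strip())
    return merged


docs = [
    Doc("Paragraph 1", {"file_name": "a.pdf", "type": "text", "order": 0}),
    Doc("| col |\n| --- |\n| 1 |", {"file_name": "a.pdf", "type": "table", "order": 1}),
    Doc("Paragraph 2", {"file_name": "a.pdf", "type": "text", "order": 2}),
    Doc("Figure: description of the image", {"file_name": "a.pdf", "type": "image", "order": 3}),
]

# Behaviour described in this issue: only the text elements reach the index.
text_only = [d.text for d in docs if d.metadata["type"] == "text"]

# Proposed behaviour: one inline text per file, tables and images included.
inline = merge_inline(docs)["a.pdf"]
```

With this kind of merge, a query about the table would have the table's Markdown available in the indexed text, instead of only Paragraph 1 and Paragraph 2.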