
Parsing a table from a PDF file into markdown format no longer shows table column header boundaries #629

Open
mehtdip opened this issue Feb 24, 2025 · 0 comments
Labels
bug Something isn't working

Comments


mehtdip commented Feb 24, 2025

Describe the bug
Parsing a table from a PDF file into markdown with the get_json_result API no longer produces the horizontal boundary that separates the column header row from the data rows.
Files
Attached file: Untitled10.pdf

Job ID
c0c99601-ef27-45b0-8804-42b048d7eabd

Client
Python Library

Untitled10.pdf

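from llama_parse import LlamaParse
reader = LlamaParse(api_key="")
pdf = "Untitled10.pdf"  # assumed: local path to the attached file
json_objs = reader.get_json_result(pdf)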

Additional context
Until last week, the markdown table included the header boundary, as in the following sample generated using the LlamaParse API:

|(in millions, except percentages and per share amounts)|2024|2023|change|
|---|---|---|---|
|revenue|$ 245,122|$ 211,915|16%|
|gross margin|$ 171,008|$ 146,052|17%|

Since last weekend, however, the table is generated without the boundary, as below:
|(in millions, except percentages and per share amounts)|2024|2023|percentage change|
|revenue|$245,122|$211,915|16%|
|gross margin|$171,008|$146,052|17%|
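
As a stopgap on the client side, the missing boundary can be patched back into the `md` string. The following is only a sketch: `ensure_header_delimiter` is a hypothetical helper (not part of llama_parse), it assumes the first line of the table is the header, and it does not handle escaped `|` characters inside cells.

```python
import re

# Matches a markdown table delimiter row such as |---|:---:|---|
_DELIM_ROW = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def ensure_header_delimiter(md_table: str) -> str:
    """Insert a |---|...| row after the first line if it is missing.

    Hypothetical helper: assumes the first line is the header row.
    """
    lines = md_table.split("\n")
    # Leave the table alone if a delimiter row is already present.
    if len(lines) >= 2 and _DELIM_ROW.match(lines[1]):
        return md_table
    # Leave non-table text alone.
    if not lines[0].strip().startswith("|"):
        return md_table
    # Count columns from the header line (escaped pipes are not handled).
    n_cols = lines[0].strip().strip("|").count("|") + 1
    delimiter = "|" + "---|" * n_cols
    return "\n".join([lines[0], delimiter, *lines[1:]])
```

Calling the helper on a string that already contains a delimiter row returns it unchanged, so it should be safe to keep in place if the previous output format is restored.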

We are using the "items" array and the "md" element of each item from the JSON output for each page:
'items': [
{'type': 'heading', 'lvl': 1, 'value': 'SUMMARY RESULTS OF OPERATIONS', 'md': '# SUMMARY RESULTS OF OPERATIONS', 'bBox': {'x': 0, 'y': 0, 'w': 594.96, 'h': 841.92}},
{'type': 'table', 'rows': [['(In millions, except percentages and per share amounts)', '2024', '2023', 'Percentage Change'], ['Revenue', '$245,122', '$211,915', '16%'], ['Gross margin', '$171,008', '$146,052', '17%'], ['Operating income', '$109,433', '$88,523', '24%'], ['Net income', '$88,136', '$72,361', '22%'], ['Diluted earnings per share', '$11.80', '$9.68', '22%'], ['Adjusted gross margin (non-GAAP)', '$171,008', '$146,204', '17%'], ['Adjusted operating income (non-GAAP)', '$109,433', '$89,694', '22%'], ['Adjusted net income (non-GAAP)', '$88,136', '$73,307', '20%'], ['Adjusted diluted earnings per share (non-GAAP)', '$11.80', '$9.81', '20%']],
'md': '|(In millions, except percentages and per share amounts)|2024|2023|Percentage Change|\n|Revenue|$245,122|$211,915|16%|\n|Gross margin|$171,008|$146,052|17%|\n|Operating income|$109,433|$88,523|24%|\n|Net income|$88,136|$72,361|22%|\n|Diluted earnings per share|$11.80|$9.68|22%|\n|Adjusted gross margin (non-GAAP)|$171,008|$146,204|17%|\n|Adjusted operating income (non-GAAP)|$109,433|$89,694|22%|\n|Adjusted net income (non-GAAP)|$88,136|$73,307|20%|\n|Adjusted diluted earnings per share (non-GAAP)|$11.80|$9.81|20%|', 'isPerfectTable': True, 'csv': '"(In millions, except percentages and per share amounts)","2024","2023","Percentage Change"\n"Revenue","$245,122","$211,915","16%"\n"Gross margin","$171,008","$146,052","17%"\n"Operating income","$109,433","$88,523","24%"\n"Net income","$88,136","$72,361","22%"\n"Diluted earnings per share","$11.80","$9.68","22%"\n"Adjusted gross margin (non-GAAP)","$171,008","$146,204","17%"\n"Adjusted operating income (non-GAAP)","$109,433","$89,694","22%"\n"Adjusted net income (non-GAAP)","$88,136","$73,307","20%"\n"Adjusted diluted earnings per share (non-GAAP)","$11.80","$9.81","20%"', 'bBox': {'x': 0, 'y': 0, 'w': 594.96, 'h': 841.92}},
{'type': 'text', 'value': 'Adjusted gross margin, operating income, net income, and diluted earnings per share (“EPS”) are non-GAAP financial measures. Prior year non-GAAP financial measures exclude the impact of a $1.2 billion charge in the second quarter of fiscal year 2023 (“Q2 charge”), which included employee severance expenses, impairment charges resulting from changes to our hardware portfolio, and costs related to lease consolidation activities. Refer to the Non-GAAP Financial Measures section below for a reconciliation of our financial results reported in accordance with GAAP to non-GAAP financial results.', 'md': 'Adjusted gross margin, operating income, net income, and diluted earnings per share (“EPS”) are non-GAAP financial measures. Prior year non-GAAP financial measures exclude the impact of a $1.2 billion charge in the second quarter of fiscal year 2023 (“Q2 charge”), which included employee severance expenses, impairment charges resulting from changes to our hardware portfolio, and costs related to lease consolidation activities. Refer to the Non-GAAP Financial Measures section below for a reconciliation of our financial results reported in accordance with GAAP to non-GAAP financial results.', 'bBox': {'x': 0, 'y': 0, 'w': 594.96, 'h': 841.92}},
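
Since the `rows` field of the table item still looks complete in the output above, another possible workaround is to rebuild the markdown from `rows` instead of relying on `md`. This is a minimal sketch under assumptions: `rows_to_markdown` and `page_markdown` are hypothetical helper names, the first entry of `rows` is treated as the header, and `items` is the per-page list shown above.

```python
def rows_to_markdown(rows):
    """Build a markdown table with a header delimiter row from a list of rows.

    Assumption: rows[0] is the header row.
    """
    if not rows:
        return ""

    def fmt(cells):
        # Escape literal pipes so they do not split a cell.
        return "|" + "|".join(str(c).replace("|", "\\|") for c in cells) + "|"

    header, *body = rows
    delimiter = "|" + "---|" * len(header)
    return "\n".join([fmt(header), delimiter, *(fmt(r) for r in body)])


def page_markdown(items):
    """Join the markdown of all items on a page, regenerating tables from 'rows'."""
    parts = []
    for item in items:
        if item.get("type") == "table" and item.get("rows"):
            parts.append(rows_to_markdown(item["rows"]))
        else:
            parts.append(item.get("md", ""))
    return "\n\n".join(parts)
```

For example, `page_markdown(items)` on the list above would emit the heading, then the table with a `|---|---|---|---|` row after the header line, then the text paragraph.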

@mehtdip mehtdip added the bug Something isn't working label Feb 24, 2025