Add PPTX chart support #33

Merged
merged 3 commits into microsoft:main on Dec 16, 2024

nyosegawa
Contributor

This pull request adds support for converting charts in PPTX files to Markdown tables.

Files Changed:

  • src/markitdown/_markitdown.py: Added chart detection and a _convert_chart_to_markdown method to handle PPTX charts.
  • tests/test_files/test.pptx: Inserted a chart slide into the existing PPTX file for testing purposes.
  • tests/test_markitdown.py: Added assertions to verify chart titles and values in the generated Markdown output.

Please review and let me know if any adjustments or additional tests are needed.

Sample

pptx:

[image: pptx]

output:

<!-- Slide number: 4 -->
# A chart to test parsing:

### Chart: a3f6004b-6f4f-4ea8-bee3-3741f4dc385f

| Category | Series 1 |
|---|---|
| 2000 | 2000.0 |
| 2001 | 2001.0 |
| 2002 | 2002.0 |
| 2003 | 2003.0 |
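
The kind of check described for tests/test_markitdown.py can be sketched as a substring test against the sample output above. This is an illustration only: the helper missing_strings and the EXPECTED_STRINGS list are hypothetical, and the real test may assert on different strings.

```python
# Sample output copied from this PR description.
SAMPLE_OUTPUT = """<!-- Slide number: 4 -->
# A chart to test parsing:

### Chart: a3f6004b-6f4f-4ea8-bee3-3741f4dc385f

| Category | Series 1 |
|---|---|
| 2000 | 2000.0 |
| 2001 | 2001.0 |
| 2002 | 2002.0 |
| 2003 | 2003.0 |
"""

# Hypothetical expectations: the chart title, the header row, and one value row.
EXPECTED_STRINGS = [
    "### Chart: a3f6004b-6f4f-4ea8-bee3-3741f4dc385f",
    "| Category | Series 1 |",
    "| 2003 | 2003.0 |",
]


def missing_strings(markdown, expected):
    """Return the expected substrings that do not occur in the Markdown text."""
    return [s for s in expected if s not in markdown]


# An empty list means every expected string was found.
assert missing_strings(SAMPLE_OUTPUT, EXPECTED_STRINGS) == []
```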

@nyosegawa
Contributor Author

@microsoft-github-policy-service agree

@9to5crypto

,

@afourney
Member

This is pretty nice. Does it support most/all types of charts or just a subset? In any case, the bar-chart results look good.

@afourney afourney merged commit da779dd into microsoft:main Dec 16, 2024
3 checks passed
3 participants