Use parquet to speed up dataset building. #290

Merged
merged 66 commits into from
Dec 3, 2024


cjohns-scottlogic
Contributor

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

Use parquet and duckdb to speed up processing.

Related Tickets & Documents

  • Ticket Link
  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

Please replace this line with instructions on how to test your changes, a note
on the devices and browsers this has been tested on, as well as any relevant
images for UI changes.

Added/updated tests?

We encourage you to keep the code coverage percentage at 80% and above.

  • Yes
  • No, and this is why: please replace this line with details on why tests
    have not been included
  • I need help with writing tests

[optional] Are there any post deployment tasks we need to perform?

[optional] Are there any dependencies on other PRs or Work?

cjohns-scottlogic and others added 30 commits November 25, 2024 14:04
…and/digital-land-python into fix/pq-dataset-typology
…and/digital-land-python into fix/pq-dataset-typology
@codecov-commenter

Codecov Report

Attention: Patch coverage is 96.46018% with 4 lines in your changes missing coverage. Please review.

Project coverage is 82.88%. Comparing base (7cc9562) to head (95f3f4f).
Report is 120 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| digital_land/commands.py | 82.35% | 3 Missing ⚠️ |
| digital_land/package/datasetparquet.py | 98.92% | 1 Missing ⚠️ |
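
The patch figures above are consistent with a plain covered-over-total ratio. The sketch below is illustrative only: it assumes each percentage is `(total - missing) / total`, and the total line counts (17, 93 and 113) are inferred as the values that reproduce the reported percentages. They are not stated anywhere in the report.

```python
def coverage_pct(total_lines: int, missing_lines: int) -> float:
    """Percentage of lines covered, given total and uncovered line counts."""
    return 100.0 * (total_lines - missing_lines) / total_lines


# Totals are ASSUMPTIONS, chosen because they reproduce the reported figures.
commands_pct = coverage_pct(total_lines=17, missing_lines=3)
datasetparquet_pct = coverage_pct(total_lines=93, missing_lines=1)
patch_pct = coverage_pct(total_lines=113, missing_lines=4)

print(f"digital_land/commands.py: {commands_pct:.2f}%")
print(f"digital_land/package/datasetparquet.py: {datasetparquet_pct:.2f}%")
print(f"patch: {patch_pct:.5f}%")
```

Under these assumed totals, the two listed files account for all 4 missing lines, so any other files in the patch would be fully covered.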

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #290      +/-   ##
==========================================
+ Coverage   78.02%   82.88%   +4.85%     
==========================================
  Files          76       84       +8     
  Lines        4077     4732     +655     
==========================================
+ Hits         3181     3922     +741     
+ Misses        896      810      -86     
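
The project-level numbers in the diff above can be cross-checked from the hit and line counts. This is a small sketch that assumes coverage is simply `hits / lines`; the variable and function names are illustrative and not part of Codecov.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Percentage of tracked lines that were executed by the tests."""
    return 100.0 * hits / lines


# Counts copied from the coverage diff: base is main, head is #290.
base = {"lines": 4077, "hits": 3181}
head = {"lines": 4732, "hits": 3922}

base_pct = coverage_pct(base["hits"], base["lines"])
head_pct = coverage_pct(head["hits"], head["lines"])
delta = head_pct - base_pct

# Misses are the lines that were not hit.
base_misses = base["lines"] - base["hits"]
head_misses = head["lines"] - head["hits"]

print(f"base {base_pct:.2f}%  head {head_pct:.2f}%  delta {delta:+.3f}")
print(f"misses: {base_misses} -> {head_misses}")
```

The unrounded delta falls between 4.85 and 4.86 percentage points, so the `+4.85%` in the report looks like a truncated rather than rounded value; that reading is an inference from the numbers, not something the report states.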


@cjohns-scottlogic cjohns-scottlogic merged commit 93a97bd into main Dec 3, 2024
3 checks passed
@cjohns-scottlogic cjohns-scottlogic deleted the feat/parquet branch December 3, 2024 09:42
cjohns-scottlogic added a commit that referenced this pull request Dec 4, 2024
alexglasertpx pushed a commit that referenced this pull request Dec 4, 2024

4 participants