Update direct csv fast load to handle empty ending lines on csv and other edge conditions #206
another blocking fast load:
duttonw
changed the title
Update direct csv load to handle empty ending lines on csv
Update direct csv fast load to handle empty ending lines on csv and other edge conditions
Jan 31, 2024
ThrawnCA
added a commit
that referenced
this issue
Feb 9, 2024
Add a unit test sample that has an extra empty line. This ideally should be handled gracefully (ie ignore the extra line)
ThrawnCA
added a commit
that referenced
this issue
Feb 9, 2024
Extra empty lines at the end of a file should be ignored.
ThrawnCA
added a commit
that referenced
this issue
Feb 9, 2024
Skip rows that are completely blank instead of erroring out
@duttonw I'm not sure how feasible it is to handle columns with the wrong number of commas, but completely blank rows are simple enough. Tabulator has built-in functionality to let us skip them.
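As a rough illustration of what "skip rows that are completely blank" means in practice, here is a minimal sketch using only Python's standard `csv` module. It is not the xloader code and does not use Tabulator's API; the helper name `iter_non_blank_rows` and the sample data are hypothetical, made up for this example.

```python
import csv
import io


def iter_non_blank_rows(lines):
    """Yield parsed CSV rows, skipping rows in which every cell is empty.

    Hypothetical helper for illustration only; not part of ckanext-xloader.
    """
    for row in csv.reader(lines):
        # csv.reader yields [] for an empty line, and a list of empty
        # strings for a line that contains only delimiters (e.g. ",,").
        if not any(cell.strip() for cell in row):
            continue
        yield row


# Sample data (made up): a blank line mid-file, a delimiter-only line,
# and an extra empty line at the end of the file.
sample = "id,name\n1,alpha\n\n2,beta\n,,\n\n"
rows = list(iter_non_blank_rows(io.StringIO(sample)))
```

This only covers fully blank rows. Rows with the wrong number of delimiters, mentioned above as the harder case, pass through unchanged.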
ThrawnCA
added a commit
to qld-gov-au/ckanext-xloader
that referenced
this issue
Feb 28, 2024
ThrawnCA
added a commit
that referenced
this issue
Mar 15, 2024
ThrawnCA
added a commit
to qld-gov-au/ckanext-xloader
that referenced
this issue
Mar 15, 2024
… out - Copied from upstream GitHub issue ckan#206
Once qld-gov-au#90 reaches ckan/ckanext-xloader, this can be closed.
Resolved in qld-gov-au#90; it will reach the ckan org version in due time.
peterVorman
added a commit
to OpenGov-OpenData/ckanext-xloader
that referenced
this issue
Aug 5, 2024
* commit 'a96ce28c589dfe6b1b850d8eeeb14f1e1dfe9759': (80 commits) Add note about 2.11 support Update images and actions, test 2.11 feat(tests): added tyoe guess on mixed integers; add more ignorable blank lines to test sample, ckan#206 add more options for maintainers to expedite XLoader runs, GitHub ckan#202 strip extra space for column name In plugin.py, there is an fix of resource format key error fix list syntax for combining range and dict skip blank rows in source files, ckan#206 add unit test for handling empty lines, ckan#206 add sample file with extra blank line at end, ckan#206 fix(tests): less complicated; further cleanup fix(tests): finalized test method; fix(tests): subrequest params; fix(tests): module path; feat(tests): added new test; fix(syntax): flake8; fix(helpers): comments and better syntax; fix(templates): set in block; ... # Resolved conflicts: # .github/workflows/test.yml # ckanext/xloader/controllers.py # ckanext/xloader/plugin.py # ckanext/xloader/templates-bs2/package/resource_edit_base.html # ckanext/xloader/templates/package/resource_edit_base.html # ckanext/xloader/utils.py # ckanext/xloader/views.py
Example logs from importing 40mb csv file with 400,000+ rows.
https://www.data.qld.gov.au/dataset/5efaa096-4480-4540-88be-a10ababd9f49/resource/a14317b7-2fca-41b7-8294-9a1f7a085b0f