[FDS-1725] Missing entityId handling testing #1496
🔥
tests/data/mock_manifests/InvalidFilenameAndEntityIDManifest.csv
For the new cases where we're checking if entityIds don't exist at all or are missing from the manifest, we should also add validation error messages to
Thanks for the comments @GiaJordan! I am going to revert this back to being a draft for now and work on those updates.
This is looking very good! I left a few comments. But I have no problems having it pre-approved
@BWMac sorry, one more comment. Since there's no test related to the CLI, I could test it manually. Give me some time :) I will report back shortly.
@BWMac thanks for your patience. I tested the validation CLI with my own test manifest and it works. I also realized that we don't have a "dataset scope" parameter for manifest submission, so we can't use the file name validation during submission. But this issue is definitely outside of your PR! Thanks for your hard work. After the tests are all passing, I think you could merge 👍🏼
Description:
This PR follows up on #1456, where the `filenameExists` validation rule was implemented. This rule handles cases where a file path present in a manifest does not exist in a dataset and cases where the entityId provided in a manifest row does not match its corresponding file path.
This PR adds additional error handling for situations including:
Previously, an outer join was used in the validation rule, followed by a line that handled cases where there were more rows in the dataset than in the manifest. The outer join implementation did not always preserve the order of manifest rows, resulting in incorrect errors on manifest rows. I changed this outer join to a left join on the manifest to handle both issues.
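The difference between the two joins can be illustrated with a small pandas sketch. This is not the code from this PR: the column names (`Filename`, `entityId`), the sample data, and the `classify` helper are hypothetical, and the order in which the checks run is an assumption. It only shows why a left join on the manifest keeps one result row per manifest row, in manifest order, and drops dataset-only rows without extra handling.

```python
import pandas as pd

# Hypothetical manifest and dataset file listing; real column names may differ.
manifest = pd.DataFrame({
    "Filename": ["data/c.txt", "data/missing.txt", "data/a.txt", "data/b.txt"],
    "entityId": ["syn3", "syn9", "syn2", None],
})
dataset = pd.DataFrame({
    "Filename": ["data/a.txt", "data/b.txt", "data/c.txt", "data/extra.txt"],
    "entityId": ["syn1", "syn2", "syn3", "syn4"],
})

# An outer join keeps dataset-only rows, so the result can have more rows
# than the manifest and its row order is not tied to the manifest.
outer = manifest.merge(dataset, how="outer", on="Filename",
                       suffixes=("_manifest", "_dataset"))

# A left join on the manifest keeps exactly the manifest rows, in order.
# validate="many_to_one" raises if the dataset lists a file path twice.
merged = manifest.merge(dataset, how="left", on="Filename",
                        suffixes=("_manifest", "_dataset"),
                        indicator=True, validate="many_to_one")


def classify(row):
    """Label one joined row (illustrative helper, not part of the PR)."""
    if row["_merge"] == "left_only":
        return "path not in dataset"
    if pd.isna(row["entityId_manifest"]):
        return "missing entityId"
    if row["entityId_manifest"] != row["entityId_dataset"]:
        return "entityId mismatch"
    return "ok"


merged["status"] = merged.apply(classify, axis=1)
print(merged[["Filename", "entityId_manifest", "entityId_dataset", "status"]])
```

Because the joined frame lines up row for row with the manifest, its positional index can be used directly when reporting which manifest row an error belongs to.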
The row number indexing for `filename_validation` is also updated to match the other validation rules for consistency.
I have also updated `generate_filename_error` to handle the new error cases.
Testing:
Tests for `filename_validation` have been added as needed.
Tests for `filename_validation` and `generate_filename_error` have been added.