-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added unit tests for validate attribute class #1486
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I built the docker image using this branch and ran schematic profiler - success!
INFO: [2024-08-30 12:12:53] manifest-validate - Monitoring manifest validation
INFO: [2024-08-30 12:12:53] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient&restrict_rules=True "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:12:54] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:13:46] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&data_type=Biospecimen "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:13:46] manifest-submit - Monitoring manifest submission
INFO: [2024-08-30 12:14:31] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=table_and_file "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:14:43] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:14:55] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fdata_flow_config%2Fmain%2FHTAN%2Fdataflow_component.csv&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:14:57] manifest-storage - Monitoring storage endpoints
INFO: [2024-08-30 12:15:03] httpx - HTTP Request: GET http://localhost:3001/v1/storage/assets/tables?asset_view=syn23643253&return_type=json "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:09] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn23643253&project_id=syn26251192 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:34] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn20446927&project_id=syn32596076 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:34] manifest-generator - Monitoring manifest generation
INFO: [2024-08-30 12:15:45] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:58] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:16:21] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel&dataset_id=syn51078367&asset_view=syn23643253 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:05] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:05] manifest-storage - Monitoring storage endpoints
INFO: [2024-08-30 12:17:11] httpx - HTTP Request: GET http://localhost:3001/v1/storage/assets/tables?asset_view=syn23643253&return_type=json "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:16] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn23643253&project_id=syn26251192 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:41] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn20446927&project_id=syn32596076 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:41] manifest-generator - Monitoring manifest generation
INFO: [2024-08-30 12:17:52] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:18:03] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:18:26] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel&dataset_id=syn51078367&asset_view=syn23643253 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:19:12] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:19:12] manifest-submit - Monitoring manifest submission
INFO: [2024-08-30 12:19:56] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=table_and_file "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:08] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:21] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fdata_flow_config%2Fmain%2FHTAN%2Fdataflow_component.csv&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:23] manifest-validate - Monitoring manifest validation
INFO: [2024-08-30 12:20:23] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient&restrict_rules=True "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:24] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:21:06] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&data_type=Biospecimen "HTTP/1.1 200 OK"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I don't personally see any issues with the logic, but I haven't personally tested the cross file manifest functionality.
I'm going to pre-approve because the logic is sound and the tests look great with some minor comments. Great work pushing this through!
I'll leave final review to someone on FAIR
@BryanFauble For this I moved the new tests to /tests/unit/test_validate_attribute.py, but they seem to be cuasing an error there. |
@BryanFauble Ok I got this error locally:
|
Since he's ooo, I'll respond. Does it run locally when running that same command as set in GitHub Ci/CD? |
Ok. Look at the logs and diagnosis. From a glance at the logs test file names need to be unique. So we just need to keep that in mind that we can't name them the same across the 2 directories. |
I've added init files to both |
TBH we decided that atleast for now, the code is doing what we want. I'm going to start a more comprehensive look into it, because I'm skeptical based on the documentation. So for now the tests should just ensure the current behavior, and we can change them if needed in the future. |
Thanks Andrew for all the work here - we would definitely appreciate another set of eyes to look at what schematic is doing every step of the way. I just wanted to let you know that it's my opinion that we should focus less on "having to deliver" or "finishing everything you committed to", and instead focussing more on doing it right the first time. It's also my opinion that the time to code review, validate, learn, educate your peers, write quality and robust code with close to little nitpick comments, should be accounted for during sprint planning. So if that means committing to less, then so be it. |
@thomasyu888 I went ahead and paramaterized these tests |
02f0937
to
9531cce
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left some comments for a very small change. Overall it looks good. Thanks for the hard work.
@andrewelamb also the last bullet point in the PR description can be removed? |
Done! |
ValidateAttribute
classValidateAttribute
classValidateAttribute._get_target_manifest_dataframes
ValidateAttribute._get_target_manifest_dataframes
is a new method that gets the target manufests for doign cross manifest validation, and returns them as pandas.DataFrames. The purpose of this is to abstract out the synapse functionality into one method that can be mocked for unit testing.ValidateAttribute._run_validation_across_target_manifests
now uses this.NOTE: It was decided that the current functionality of cross manifest validation is correct, see here. The unti tests added reflect that.