Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added unit tests for validate attribute class #1486

Merged
merged 7 commits into from
Sep 4, 2024

Conversation

andrewelamb
Copy link
Contributor

@andrewelamb andrewelamb commented Aug 30, 2024

  • Fixes FDS-2323
  • Creates a new unit test file for the ValidateAttribute class
  • Adds the unit testting done in fix how cross manifest validation set rule works #1484 (Cross Manifest Validation was deemed to be working)
  • Fixes some documentation and typing in the ValidateAttribute class
  • Adds new method ValidateAttribute._get_target_manifest_dataframes

ValidateAttribute._get_target_manifest_dataframes is a new method that gets the target manufests for doign cross manifest validation, and returns them as pandas.DataFrames. The purpose of this is to abstract out the synapse functionality into one method that can be mocked for unit testing.

ValidateAttribute._run_validation_across_target_manifests now uses this.

NOTE: It was decided that the current functionality of cross manifest validation is correct, see here. The unti tests added reflect that.

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I built the docker image using this branch and ran schematic profiler - success!

INFO: [2024-08-30 12:12:53] manifest-validate - Monitoring manifest validation
INFO: [2024-08-30 12:12:53] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient&restrict_rules=True "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:12:54] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:13:46] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&data_type=Biospecimen "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:13:46] manifest-submit - Monitoring manifest submission
INFO: [2024-08-30 12:14:31] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=table_and_file "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:14:43] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:14:55] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fdata_flow_config%2Fmain%2FHTAN%2Fdataflow_component.csv&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:14:57] manifest-storage - Monitoring storage endpoints
INFO: [2024-08-30 12:15:03] httpx - HTTP Request: GET http://localhost:3001/v1/storage/assets/tables?asset_view=syn23643253&return_type=json "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:09] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn23643253&project_id=syn26251192 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:34] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn20446927&project_id=syn32596076 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:34] manifest-generator - Monitoring manifest generation
INFO: [2024-08-30 12:15:45] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:15:58] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:16:21] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel&dataset_id=syn51078367&asset_view=syn23643253 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:05] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:05] manifest-storage - Monitoring storage endpoints
INFO: [2024-08-30 12:17:11] httpx - HTTP Request: GET http://localhost:3001/v1/storage/assets/tables?asset_view=syn23643253&return_type=json "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:16] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn23643253&project_id=syn26251192 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:41] httpx - HTTP Request: GET http://localhost:3001/v1/storage/project/datasets?asset_view=syn20446927&project_id=syn32596076 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:17:41] manifest-generator - Monitoring manifest generation
INFO: [2024-08-30 12:17:52] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:18:03] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:18:26] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&title=example&data_type=Patient&output=excel&dataset_id=syn51078367&asset_view=syn23643253 "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:19:12] httpx - HTTP Request: GET http://localhost:3001/v1/manifest/generate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&title=example&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:19:12] manifest-submit - Monitoring manifest submission
INFO: [2024-08-30 12:19:56] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=table_and_file "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:08] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:21] httpx - HTTP Request: POST http://localhost:3001/v1/model/submit?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fdata_flow_config%2Fmain%2FHTAN%2Fdataflow_component.csv&dataset_id=syn51376664&asset_view=syn51376649&restrict_rules=True&use_schema_label=True&data_model_labels=class_label&table_manipulation=replace&manifest_record_type=file_only "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:23] manifest-validate - Monitoring manifest validation
INFO: [2024-08-30 12:20:23] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient&restrict_rules=True "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:20:24] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2FSage-Bionetworks%2Fschematic%2Fdevelop%2Ftests%2Fdata%2Fexample.model.jsonld&data_type=Patient "HTTP/1.1 200 OK"
INFO: [2024-08-30 12:21:06] httpx - HTTP Request: POST http://localhost:3001/v1/model/validate?schema_url=https%3A%2F%2Fraw.githubusercontent.com%2Fncihtan%2Fdata-models%2Fmain%2FHTAN.model.jsonld&data_type=Biospecimen "HTTP/1.1 200 OK"

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I don't personally see any issues with the logic, but I haven't personally tested the cross file manifest functionality.

I'm going to pre-approve because the logic is sound and the tests look great with some minor comments. Great work pushing this through!

I'll leave final review to someone on FAIR

@andrewelamb
Copy link
Contributor Author

@BryanFauble For this I moved the new tests to /tests/unit/test_validate_attribute.py, but they seem to be cuasing an error there.

@andrewelamb
Copy link
Contributor Author

@BryanFauble Ok I got this error locally:

import file mismatch:
imported module 'test_validate_attribute' has this __file__ attribute:
  /home/alamb/repos/schematic/tests/integration/test_validate_attribute.py
which is not the same as the test file we want to collect:
  /home/alamb/repos/schematic/tests/unit/test_validate_attribute.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules

@thomasyu888
Copy link
Member

@BryanFauble For this I moved the new tests to /tests/unit/test_validate_attribute.py, but they seem to be cuasing an error there.

Since he's ooo, I'll respond. Does it run locally when running that same command as set in GitHub Ci/CD?

@BryanFauble
Copy link
Collaborator

@BryanFauble For this I moved the new tests to /tests/unit/test_validate_attribute.py, but they seem to be cuasing an error there.

@andrewelamb

Ok. Look at the logs and diagnosis.

From a glance at the logs test file names need to be unique. So we just need to keep that in mind that we can't name them the same across the 2 directories.

@andrewelamb
Copy link
Contributor Author

I've added init files to both /test/unit and test/integration/, that seems to fix things locally

@andrewelamb
Copy link
Contributor Author

LGTM, I don't personally see any issues with the logic, but I haven't personally tested the cross file manifest functionality.

I'm going to pre-approve because the logic is sound and the tests look great with some minor comments. Great work pushing this through!

I'll leave final review to someone on FAIR

TBH we decided that atleast for now, the code is doing what we want. I'm going to start a more comprehensive look into it, because I'm skeptical based on the documentation. So for now the tests should just ensure the current behavior, and we can change them if needed in the future.

@thomasyu888
Copy link
Member

thomasyu888 commented Aug 30, 2024

LGTM, I don't personally see any issues with the logic, but I haven't personally tested the cross file manifest functionality.

I'm going to pre-approve because the logic is sound and the tests look great with some minor comments. Great work pushing this through!

I'll leave final review to someone on FAIR

TBH we decided that atleast for now, the code is doing what we want. I'm going to start a more comprehensive look into it, because I'm skeptical based on the documentation. So for now the tests should just ensure the current behavior, and we can change them if needed in the future.

Thanks Andrew for all the work here - we would definitely appreciate another set of eyes to look at what schematic is doing every step of the way.

I just wanted to let you know that it's my opinion that we should focus less on "having to deliver" or "finishing everything you committed to", and instead focussing more on doing it right the first time.

It's also my opinion that the time to code review, validate, learn, educate your peers, write quality and robust code with close to little nitpick comments, should be accounted for during sprint planning. So if that means committing to less, then so be it.

@andrewelamb
Copy link
Contributor Author

@thomasyu888 I went ahead and paramaterized these tests

@BryanFauble BryanFauble force-pushed the FDS-2323-test-coverage branch from 02f0937 to 9531cce Compare September 3, 2024 19:16
Copy link

sonarqubecloud bot commented Sep 3, 2024

Copy link
Contributor

@linglp linglp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments for a very small change. Overall it looks good. Thanks for the hard work.

@linglp
Copy link
Contributor

linglp commented Sep 4, 2024

@andrewelamb also the last bullet point in the PR description can be removed?

@andrewelamb
Copy link
Contributor Author

@andrewelamb also the last bullet point in the PR description can be removed?

Done!

@andrewelamb andrewelamb merged commit d5b2ed3 into develop Sep 4, 2024
7 checks passed
@andrewelamb andrewelamb deleted the FDS-2323-test-coverage branch September 4, 2024 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants