Pypiper should support force_overwrite when reporting results? #201

Closed
donaldcampbelljr opened this issue Nov 15, 2023 · 1 comment

@donaldcampbelljr
Member

I noticed that, when managing pipelines via pypiper, if the results already exist, the pipeline will fail. This is because pypiper uses pipestat to report and does not use pipestat.report's force_overwrite parameter:

pypiper/pypiper/manager.py

Lines 1686 to 1692 in 8aaede5

reported_result = self.pipestat.report(
    values=val, sample_name=self.pipestat_sample_name, result_formatter=rf
)
if not nolog:
    for r in reported_result:
        self.info(r)
return reported_result

Example of error (while using PEPATAC):

These results exist for 'DEFAULT_SAMPLE_NAME': Read_type
Traceback (most recent call last):
  File "/home/drc/pepatac_tutorial//tools/pepatac/pipelines/pepatac.py", line 2779, in <module>
    sys.exit(main())
  File "/home/drc/pepatac_tutorial//tools/pepatac/pipelines/pepatac.py", line 731, in main
    pm.report_result("Read_type", args.single_or_paired)
  File "/home/drc/anaconda3/envs/pepatac38/lib/python3.8/site-packages/pypiper/manager.py", line 1615, in report_result
    for r in reported_result:
TypeError: 'bool' object is not iterable

pipestat.report returns False if it cannot report the result. The loop over reported_result (around line 1691) then tries to iterate over that bool, which raises the TypeError shown above.

Solution:
- Create a new parameter in pypiper that lets the user toggle force_overwrite, defaulting to False.
- Handle a False return value from pipestat.report instead of crashing pypiper.
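
A minimal sketch of the two points above, not the actual pypiper patch. `StubReporter` is a hypothetical stand-in that only mimics the behavior described in this issue (return False when a result already exists unless force_overwrite is set); it is not the real pipestat API. The standalone `report_result` function and its `log` argument are likewise assumptions made for illustration.

```python
class StubReporter:
    """Hypothetical stand-in for a pipestat-like reporter (not the real API)."""

    def __init__(self):
        self._results = {}

    def report(self, values, sample_name, force_overwrite=False):
        existing = self._results.setdefault(sample_name, {})
        clashes = [key for key in values if key in existing]
        if clashes and not force_overwrite:
            # Mirror the behavior described in the issue: refuse and return False.
            print(f"These results exist for '{sample_name}': {', '.join(clashes)}")
            return False
        existing.update(values)
        return [f"{sample_name}: {key}={val}" for key, val in values.items()]


def report_result(reporter, key, value, sample_name="DEFAULT_SAMPLE_NAME",
                  force_overwrite=False, log=print):
    """Report one result, passing force_overwrite through to the reporter."""
    reported = reporter.report(
        values={key: value},
        sample_name=sample_name,
        force_overwrite=force_overwrite,
    )
    if not reported:
        # Reporting was refused: log the outcome instead of iterating over a bool.
        log(f"Result successfully reported? {reported}")
        return reported
    for line in reported:
        log(line)
    return reported


if __name__ == "__main__":
    reporter = StubReporter()
    report_result(reporter, "Read_type", "paired")   # first report succeeds
    report_result(reporter, "Read_type", "single")   # refused, no crash
    report_result(reporter, "Read_type", "single", force_overwrite=True)
```

Defaulting force_overwrite to False keeps the existing behavior of never silently replacing results, while the falsy-return check keeps the pipeline running when a result is already present.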

@donaldcampbelljr
Member Author

Example of output with suggested solution:

These results exist for 'DEFAULT_SAMPLE_NAME': Fragment distribution
Result successfully reported? False

donaldcampbelljr added a commit that referenced this issue Dec 22, 2023
* try getting just stage name, but fall back to str representation of stage; close #197

* version 0.13.3a1 for pipestat 0.6.0a1

* updated to pipestat 0.6.0

* updated requirements

* testing, drop python 3.7

* fix f-string quote issue for python 3.10

* minor refactor to use pipestat properties instead of cfg dict

* update changelog and version number

* update v0.13.3 and changelog

* fix _refresh_stats bug and change version to 0.14.0

* potential fix for #201

* changelog

* v0.14.0a1 prerelease

* report_object -> change message_raw to be a values dict to conform with pipestat output schemas

* self.pipestat_results_file should take priority over self.pipeline_stats_file related to databio/pepatac#257

* make pipestat_results_file = pipeline_stats_file if it is not provided

* set pipeline_stats_file if pipestat_results_file IS provided, remove checking for the first record_identifier during get_stat

* add pipestat_pipeline_type, defaulting to sample

* pipestat req version bump, v0.14.0a2 bump for pre-release

* v0.14.0 release prep

---------

Co-authored-by: Vince Reuter <vince.reuter@gmail.com>
Co-authored-by: Khoroshevskyi <sasha99250@gmail.com>