FIX: Resolving/rebasing paths from/to results files #2971
Codecov Report
@@ Coverage Diff @@
## master #2971 +/- ##
==========================================
- Coverage 67.59% 64.27% -3.33%
==========================================
Files 344 342 -2
Lines 43796 43780 -16
Branches 5476 5482 +6
==========================================
- Hits 29606 28140 -1466
- Misses 13473 14551 +1078
- Partials 717 1089 +372
It seems that nipy#2944 has uncovered a rat's nest hidden in the engine. In resolving that issue, I found that a great deal of boilerplate had been put in place when loading/saving results to deal with ``OutputMulti{Object,Path}`` traits, because these traits flatten single-element-list values. This PR fixes the pickling behavior of traited specs containing these types of traits. It also avoids the ``modify_paths`` function that originally caused the problems in nipy#2944. As a consequence, this PR effectively makes results files static: caching will no longer work if the ``base_dir`` of the workflow is changed. I plan to re-insert this feature (results file mobility) with nipy#2971. This PR just splits that one into more digestible bits. All the boilerplate mentioned above has been cleaned up.
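To illustrate why flattening single-element lists complicates saving and loading, here is a minimal sketch in plain Python. The helpers `collapse` and `ensure_list` are hypothetical stand-ins written for this example; they are not nipype functions and do not reproduce the actual trait implementation.

```python
import pickle


def collapse(value):
    """Hypothetical stand-in for the flattening behavior: [x] becomes x."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


def ensure_list(value):
    """Hypothetical inverse that the loading side would need to apply."""
    return value if isinstance(value, list) else [value]


# An output spec holding a single-element list of files (made-up path).
outputs = {"out_files": ["/work/node/a.nii.gz"]}

# What ends up stored if the value is flattened before pickling.
stored = {name: collapse(value) for name, value in outputs.items()}
restored = pickle.loads(pickle.dumps(stored))

# The round trip returns a bare string, not the original list, so the
# reader has to re-wrap it to recover the original shape.
assert restored["out_files"] == "/work/node/a.nii.gz"
assert ensure_list(restored["out_files"]) == outputs["out_files"]
```

The point of the sketch is only that the stored value no longer has the shape of the original, which is the kind of mismatch that boilerplate on the load/save side would have to compensate for.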
This contains the test from #2949 and closes it.
This feels ready to merge. The diff will be easier after #2985 is merged. Hopefully, this ends my burst of refactors. Thanks for your patience!
Modified ``test_outputmultipath_collapse`` due to a derivation of nipy#2968.
…eeded_outputs=true
Once we figure out the problem with ``OutputMultiObject``, we could go ahead and set: fix nipy#2944, fix nipreps/fmriprep#1674, close nipy#2945.
@@ -1260,7 +1260,7 @@ def _run_interface(self, execute=True, updatehash=False):
                 stop_first=str2bool(
                     self.config['execution']['stop_on_first_crash'])))
         # And store results
-        _save_resultfile(result, cwd, self.name)
+        _save_resultfile(result, cwd, self.name, rebase=False)
Should this be self.config.getboolean('execution', 'use_relative_paths')?
- _save_resultfile(result, cwd, self.name, rebase=False)
+ _save_resultfile(result, cwd, self.name,
+                  rebase=self.config.getboolean('execution', 'use_relative_paths'))
This is discussion point 2 that I wanted to bring up at the top. On the one hand, should we expose the inner workings of the engine? After all, adding that here does not change anything for the user (just that they won't be able to move the work directory anymore).
On the other hand, for this to be completely consistent across the node implementation, we also need to rebase/resolve the inputs pickle (actually, I'll open an issue because this should be addressed).
I'm under the impression that use_relative_paths could be useful for the user at the interface level, forcing interfaces to return relative paths if desired.
My overall perspective is "Let's not change the API unless we absolutely have to in order to fix the bug." That includes config file options.
I may not be understanding your position though, so we might be talking past each other.
Okay, from that standpoint:
use_relative_paths
Should the paths stored in results (and used to look for inputs) be relative or absolute. Relative paths allow moving the whole working directory around but may cause problems with symlinks. (possible values: true and false; default value: false)
The option is clearly defined specifically for the results file. From that perspective, yes, any rebasing should be done only if use_relative_paths is true.
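As a rough illustration of gating the rebase on that option, the sketch below reads `use_relative_paths` from an `execution` section with the standard library's `configparser`. The helper `should_rebase` is hypothetical, and the stdlib parser only stands in for the engine's configuration object, whose exact interface is not shown in this thread.

```python
from configparser import ConfigParser


def should_rebase(cfg):
    """Return True only when [execution] use_relative_paths is enabled.

    Hypothetical helper: falls back to False (the documented default)
    when the section or the option is missing.
    """
    return cfg.getboolean("execution", "use_relative_paths", fallback=False)


# Option explicitly enabled.
enabled = ConfigParser()
enabled.read_string("[execution]\nuse_relative_paths = true\n")

# Option absent: the fallback applies.
default = ConfigParser()

assert should_rebase(enabled) is True
assert should_rebase(default) is False
```

With a helper like this, the saving code could pass its result as the `rebase` argument instead of a hard-coded value, assuming the configuration is available at that point.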
But actually not at the point you suggested, because that one happens at MapNode aggregation.
Oh, okay. Sorry, my attention is split a lot of ways right now.
Done. However, we may want to add parameterized tests with use_relative_paths on and off.
Based on functionalities added in #2970.
Summary
Searches for paths in outputs of the result object to resolve/rebase them when loading/saving the pickle.
Includes a minor variation of the test @effigies proposed.
Fixes #2944 .
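The resolve/rebase idea from the summary can be sketched in plain Python as follows. The functions `rebase_outputs` and `resolve_outputs` are hypothetical names for this example, not the PR's implementation, and the sketch assumes that every string found in the outputs is a path; a real implementation would presumably rely on trait information instead.

```python
from pathlib import PurePosixPath  # POSIX semantics keep the example deterministic


def _walk(value, fn):
    """Apply fn to every string found in nested dicts, lists and tuples."""
    if isinstance(value, dict):
        return {key: _walk(item, fn) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(_walk(item, fn) for item in value)
    if isinstance(value, str):
        return fn(value)
    return value


def rebase_outputs(outputs, base_dir):
    """Before saving: make absolute paths under base_dir relative to it."""
    base = PurePosixPath(base_dir)

    def _rebase(text):
        path = PurePosixPath(text)
        if not path.is_absolute():
            return text
        try:
            return str(path.relative_to(base))
        except ValueError:  # path lives outside base_dir: leave it untouched
            return text

    return _walk(outputs, _rebase)


def resolve_outputs(outputs, base_dir):
    """After loading: turn relative paths back into absolute ones."""
    base = PurePosixPath(base_dir)

    def _resolve(text):
        path = PurePosixPath(text)
        return text if path.is_absolute() else str(base / path)

    return _walk(outputs, _resolve)


# Made-up outputs for illustration.
outputs = {
    "out_file": "/work/wf/node/out.nii",
    "files": ["/work/wf/node/a.txt", "/elsewhere/b.txt"],
    "count": 3,
}
rebased = rebase_outputs(outputs, "/work/wf/node")
moved = resolve_outputs(rebased, "/moved/node")
```

Rebasing at save time and resolving against the current directory at load time is what would let a work directory be moved: here `moved["out_file"]` points under `/moved/node`, while the path outside the original base directory is left unchanged.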
Side effects
#2944 and subsequent efforts to address it uncovered a couple of problems we may want to address at some point (from #2970 (comment)):
hash_files (ENH: Add resolve/rebase BasePath traits methods & tests #2970 (comment)): this PR is a drop-in replacement of traits that should allow FIX: Resolving/rebasing paths from/to results files #2971 to remain strictly scoped into reading/writing of results files (i.e., no need to rework hash_files).
use_relative_paths, which I think fully pertains to the Interface level (i.e., it would be independent of the internal format of results files).
Acknowledgment