Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipeline log file could collide with filepath given to logmuse #212

Open
vreuter opened this issue Feb 17, 2024 · 2 comments · Fixed by #215
Open

Pipeline log file could collide with filepath given to logmuse #212

vreuter opened this issue Feb 17, 2024 · 2 comments · Fixed by #215
Assignees

Comments

@vreuter
Copy link
Member

vreuter commented Feb 17, 2024

From consideration of #210 , it becomes apparent that a user could specify a path to a logfile for a pipeline manager's logger to write to that would collide with the path the manager will generate for its own log file. This would almost surely be inadvertent and would lead to unexpected behavior like log file overwrites, duplicate messages, and possibly a race condition on file writes.

To protect against this, I think that in any pipeline manager constructor, we should...

  1. check that there's not this collision
  2. if there is a collision, it should be an exception

It should be an exception rather than a warning because...
a) It's truly exceptional
b) It's almost surely unintended
c) It would lead to undesirable behavior
d) it would be at the start of almost any pipeline, so raising an exception wouldn't risk wasting a lot of compute progress
e) it's unrecoverable -- there's not really a suitable automatic adjustment to make; the hypothetical filepaths in question (logmuse logfile and pypiper log file) may be expected to be present with particular values, e.g. for consumption by other tools, and so doing something like injecting some distinguishing identifier into the filename suffix doesn't seem suitable

@nsheff would you accept (in principle, code review TBD ;) ) a pull request to this effect?

@vreuter
Copy link
Member Author

vreuter commented Feb 17, 2024

Where logmuse opens the file in (over)write mode: https://github.com/databio/logmuse/blob/00f8e9f5fd769f7b10fc51dd0ee514f851707eb5/logmuse/est.py#L260-L261

Where a pipeline manager would pass a logfile to logmuse: https://github.com/databio/pypiper/blob/master/pypiper/manager.py#L213-L232 (currently only in the if not args branch of code, but should be in both -- see #211 )

Where a manager sets its own log file path: https://github.com/databio/pypiper/blob/master/pypiper/manager.py#L281

Where a manager opens its own log file path for appending:

pypiper/pypiper/manager.py

Lines 527 to 531 in 3a8465a

tee = subprocess.Popen(
["tee", "-a", self.pipeline_log_file],
stdin=subprocess.PIPE,
preexec_fn=self._ignore_interrupts,
)

@nsheff
Copy link
Member

nsheff commented Feb 17, 2024

yes, I agree with everything you state. I think basically nobody uses the logmuse to set the logfile path, so it hasn't been a problem. But you're right it could be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants