-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Connect training progress to csv #185
Conversation
After I made this PR, I had another thought that maybe we could implement this as a pub/sub pattern and remove the need for the awkward progress thread with the sleep function. The idea is that watchdog thread would have a callback to progress tracker, which would publish an event; then view would subscribe to that event and change progress bar when the event happens. Update: ended up running into the same problem where dispatching an event that ends up changing the UI from a non-UI thread causes an exception. This makes sense after looking into our implementation of publisher. |
A couple comments from me, but awesome work on your first |
src/allencell_ml_segmenter/training/metrics_csv_progress_tracker.py
Outdated
Show resolved
Hide resolved
Sounds like something to bring up with @benjijamorris . Maybe there is another file which gets updated by the post-final epoch processes (testing), and the watch could be moved to that ? |
Yeah, our homegrown pub/sub frame work is really just for decoupling our UI from business logic, making testing easier. |
The thread diagram is great - can add it to a new doc under /docs? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great - nice work.
Im interested to go over a few things, mainly for my own understanding.
Also, a few nits and small requests, and want to make sure that an import you commented out is intentional (i left a comment).
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, the responses to every comment and for including links to commits. All very helpful, as a reviewer.
Awesome work on this pr. Good balance of quality code, while not going overboard with design, and getting it done quickly.
Ship it!
Context
When running training, our progress bar currently just revolves infinitely, which is not useful for gauging how long the process will continue. This PR connects the progress bar to the CSV metrics file, which is created and updated by cyto-dl during training. When the metrics file is modified, we inspect the CSV file for the latest epoch and update the progress bar with
latest_epoch_in_csv / total_number_of_epochs
as the percent complete.Changes
QThread
that emits signals containing task progress (UI thread will pick these up and make UI changes)watchdog
file event handler specific to metrics CSV filesThread Diagram
This change relies on a mixture of python
Thread
s and PyQtQThread
s. The visualization here makes it a bit easier to understand what's going on. Effectively, theMetricsCSVProgressTracker
serves as a shared resource between the threads, and updates to the UI are triggered by updates to the tracker (which are in turn triggered by updates to the CSV file). Note that originally, I had my watchdog thread try to call a function to update the UI directly, but this strategy failed since the UI thread will not accept any UI-changing calls from non-QThread
s.Testing
Added unit tests and supporting files in
_tests
. Manually tested that the progress bar updated with 2, 3, and 4 epochs of training.Further Issues
startLongTaskWithProgressBar
function incore/view.py
even though it has important thread cleanup logic.