
Check gpu task #361

Merged
merged 7 commits into master from check-gpu-task on Mar 21, 2024
Conversation

annehaley
Collaborator

This PR is intended to demonstrate that the mock deepssm task has access to a GPU when run from the "gpu" queue. These changes can be reverted when the real deepssm task is implemented, but for now this version can be used to test GPU availability.

In order to demonstrate a difference between the two queues, the "mock-deepssm" endpoint will spawn a task on each queue. To persist the feedback from each task, two TaskProgress objects are used. These TaskProgress objects are not associated with any project, so one migration needs to be applied to allow project=null on TaskProgress objects. Since we need a migration anyway, I added a field "message" to the model (similar to the "error" field but without the connotation of failure). The mock deepssm task will save a string to this field which describes the availability of a GPU device.
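As a rough sketch, the model change might look like the following (assuming Django; field names other than "project", "error", and the new "message" are assumptions for illustration, not this PR's code):

# Hypothetical sketch of TaskProgress after this PR's migration.
from django.db import models

class TaskProgress(models.Model):
    # project is now nullable so progress can be tracked for tasks
    # that are not associated with any project
    project = models.ForeignKey(
        "Project", on_delete=models.CASCADE, null=True, blank=True
    )
    # existing field used to report failures
    error = models.TextField(blank=True)
    # new field: informational text without the connotation of failure
    message = models.TextField(blank=True)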

Once merged, the expected behavior is as follows:

  • User submits a POST request to api/mock-deepssm
  • Two tasks are spawned: one deepssm task sent to the "gpu" queue and one sent to the default queue ("celery")
  • User receives a response similar to the following, containing two TaskProgress ids:
{
  "success": true,
  "progress_ids": {
    "gpu": 5,
    "default": 6
  }
}
  • The task sent to the default queue will run and save the following message to its TaskProgress object: "DeepSSM task not implemented; testing GPU availability. GPU available = False."
  • The task sent to the "gpu" queue will stay in the queue until the "manage_workers" task is spawned by Celery beat, whereupon the GPU worker on AWS will be started. The GPU worker will pick up the waiting task and save a success message to its TaskProgress object, similar to the following: "DeepSSM task not implemented; testing GPU availability. GPU available = True. Found device [device_name]." (A sketch of this GPU check appears after this list.)
  • The user can compare these results by making GET requests to api/v1/task-progress/5 and api/v1/task-progress/6.
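Below is a minimal sketch of how the endpoint and mock task could fit together, assuming Celery for task dispatch and PyTorch for the GPU check. The names mock_deepssm_task and mock_deepssm_view, and the import path for TaskProgress, are assumptions for illustration rather than the code in this PR.

# Hypothetical sketch; names and structure are assumptions, not this PR's code.
import torch
from celery import shared_task
from django.http import JsonResponse

from .models import TaskProgress  # assumed import path


@shared_task
def mock_deepssm_task(progress_id):
    """Record whether a GPU is visible to the worker that ran this task."""
    progress = TaskProgress.objects.get(id=progress_id)
    gpu_available = torch.cuda.is_available()
    message = (
        "DeepSSM task not implemented; testing GPU availability. "
        f"GPU available = {gpu_available}."
    )
    if gpu_available:
        message += f" Found device {torch.cuda.get_device_name(0)}."
    progress.message = message
    progress.save()


def mock_deepssm_view(request):
    """Spawn the mock task once on each queue and return both progress ids."""
    gpu_progress = TaskProgress.objects.create()
    default_progress = TaskProgress.objects.create()
    mock_deepssm_task.apply_async(args=[gpu_progress.id], queue="gpu")
    mock_deepssm_task.apply_async(args=[default_progress.id], queue="celery")
    return JsonResponse({
        "success": True,
        "progress_ids": {"gpu": gpu_progress.id, "default": default_progress.id},
    })

Once both tasks finish, GET requests to api/v1/task-progress/<id> for each returned id should show the contrasting messages described above.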

@annehaley annehaley marked this pull request as ready for review March 21, 2024 18:23
@annehaley annehaley merged commit a47f575 into master Mar 21, 2024
4 checks passed
@annehaley annehaley deleted the check-gpu-task branch March 21, 2024 19:04