Alternative for managing task array status in Google Batch #5723

Draft
wants to merge 2 commits into
base: 5422-not_found-error-on-google-batch

Conversation

jorgee
Contributor

@jorgee jorgee commented Jan 29, 2025

This PR is an alternative way to process task status in task arrays.
Currently, every time `getTaskState` is called, we make two calls to the Google Batch client: one to retrieve the list of tasks (`listTasks`) and another to get the task state (`getTaskStatus`). The second call is problematic because it can throw a `NotFoundException`, and it is also redundant because the first call already returns the task descriptions, including their states.

The `BatchClient` now keeps a `HashMap` as a cache of array task statuses. A new method retrieves the status of a task that belongs to a task array: instead of calling the Google Batch API, it checks the status in the cache. When the status is not in the cache or is outdated, `listTasks` is called to update the statuses of all tasks in the array, so the remaining array tasks do not need to query the Google Batch API again.
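
To illustrate the caching idea described above, here is a minimal Java sketch. It is not the code in this PR: the class name `ArrayTaskStatusCache`, the method `getTaskInArrayStatus`, the `listCalls` counter, and the use of plain strings for states are all hypothetical, and the `listTasks` function stands in for the real Google Batch client call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch (not the PR's implementation): a cache of array task states
// that is refreshed for the whole job with a single listing call.
class ArrayTaskStatusCache {
    // A cached state plus the time (in ms) at which it was fetched.
    private static final class Entry {
        final String state;
        final long fetchedAt;
        Entry(String state, long fetchedAt) { this.state = state; this.fetchedAt = fetchedAt; }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    // Stand-in for the remote listing call: jobId -> (taskId -> state).
    private final Function<String, Map<String, String>> listTasks;
    private final LongSupplier clock;
    private final long ttlMillis;
    int listCalls = 0; // counts remote listings, only for illustration

    ArrayTaskStatusCache(Function<String, Map<String, String>> listTasks,
                         LongSupplier clock, long ttlMillis) {
        this.listTasks = listTasks;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached state, refreshing all tasks of the job when the entry
    // is missing or older than the invalidation time. Returns null if unknown.
    synchronized String getTaskInArrayStatus(String jobId, String taskId) {
        String key = jobId + "/" + taskId;
        long now = clock.getAsLong();
        Entry entry = cache.get(key);
        if (entry == null || now - entry.fetchedAt > ttlMillis) {
            cache.remove(key);          // drop the stale value before refreshing
            refresh(jobId, now);
            entry = cache.get(key);
        }
        return entry == null ? null : entry.state;
    }

    // One listing call updates every task of the job in the cache.
    private void refresh(String jobId, long now) {
        listCalls++;
        for (Map.Entry<String, String> task : listTasks.apply(jobId).entrySet()) {
            cache.put(jobId + "/" + task.getKey(), new Entry(task.getValue(), now));
        }
    }
}
```

In this sketch, querying a second task of the same job within the invalidation window reuses the entries written by the first listing, which is the saving the description refers to.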

When there is no task status, it returns null and falls back to `getJobStatus`. This is the same behavior as before when no tasks were retrieved from the job or a `NotFoundException` was thrown.
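
The fallback described above could look roughly like the following. This is a sketch under assumptions: `TaskStateResolver` and `resolveState` are hypothetical names, states are plain strings, and the supplier stands in for the real `getJobStatus` call.

```java
import java.util.function.Supplier;

// Hypothetical sketch: prefer the task-level state, otherwise use the job-level state.
class TaskStateResolver {
    // cachedTaskState is null when the task state is unknown;
    // jobStatus is only invoked in that case.
    static String resolveState(String cachedTaskState, Supplier<String> jobStatus) {
        return cachedTaskState != null ? cachedTaskState : jobStatus.get();
    }
}
```

Passing the job status as a supplier means the job-level call is only made when the task-level state is missing.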

The invalidation time is 1 second, the same as the one in `GoogleBatchTaskHandler`. An alternative would be to set it to the same value as the polling interval.

@jorgee jorgee changed the title Include remove of task in hasmap when completed Alternative fot Managing task array status in Google Batch Jan 29, 2025
@jorgee jorgee changed the title Alternative fot Managing task array status in Google Batch Alternative for managing task array status in Google Batch Jan 29, 2025
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
@jorgee jorgee force-pushed the 5422-alternavite-task-arrays branch from 34ced37 to 5f398ea Compare January 30, 2025 10:24
Signed-off-by: jorgee <jorge.ejarque@seqera.io>