The details of the UPG_DOWNLOADING state should include a retryable_error field containing the most recent retryable error that was encountered. This way users can see the error without waiting for the full download timeout, which defaults to 2 hours.
We should additionally include a retry_until field containing the deadline in UTC for the upgrade to complete. The elastic-agent status command should calculate the time remaining until the deadline so that it is obvious how much longer the agent will spend retrying the download. Fleet should be updated to do the same thing, but this will be a separate issue.
To test this behavior I set the agent to use a source URI that does not exist:
I then observed that the agent reported itself in the upgrade downloading state with the percent complete stuck at 0% until the eventual transition to the upgrade failed state. The logs did contain the actual error, but looking at the details alone does not tell you the download is failing.
{"log.level":"warn","@timestamp":"2023-11-24T19:30:58.344Z","log.origin":{"file.name":"upgrade/step_download.go","file.line":260},"message":"unable to download package: 3 errors occurred:\n\t* package '/Library/Elastic/Agent/data/elastic-agent-97e821/downloads/elastic-agent-8.11.1-darwin-aarch64.tar.gz' not found: open /Library/Elastic/Agent/data/elastic-agent-97e821/downloads/elastic-agent-8.11.1-darwin-aarch64.tar.gz: no such file or directory\n\t* call to 'https://artifacts.elastic.co/broken/beats/elastic-agent/elastic-agent-8.11.1-darwin-aarch64.tar.gz' returned unsuccessful status code: 404\n\t* call to 'https://artifacts.elastic.co/broken/beats/elastic-agent/elastic-agent-8.11.1-darwin-aarch64.tar.gz' returned unsuccessful status code: 404\n\n; retrying (will be retry 2) in 30.028119235s.","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
cmacknz changed the title from "The UPG_DOWNLOADING state should include a retryable error and the time the agent will retry" to "The UPG_DOWNLOADING state should include a retryable error and the time the agent should spend retrying" on Nov 24, 2023.
This is a follow up from a conversation in #3760