-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove resource group asynchronously and do not wait for completion #1146
Conversation
The following pipelines have been queued for testing: |
This will be a huge improvement. Some pipelines like |
@@ -138,8 +138,8 @@ if (![string]::IsNullOrWhiteSpace($ServiceDirectory)) { | |||
} | |||
|
|||
Log "Deleting resource group '$ResourceGroupName'" | |||
if (Retry { Remove-AzResourceGroup -Name "$ResourceGroupName" -Force:$Force }) { | |||
Write-Verbose "Successfully deleted resource group '$ResourceGroupName'" | |||
if (Remove-AzResourceGroup -Name "$ResourceGroupName" -Force:$Force -AsJob) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the correct option? Wouldn't this fail most of the time as soon as the pwsh process is closed? Or would it block the PS process from shutting down?
I was hoping there was some async call we could make on the azure side to start this process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You and me both @weshaggard! I was surprised, because the python azure cli has --no-wait
, but the powershell cmdlets only have AsJob
, so I was frustrated by the inconsistency.
That said, I was worried about this same thing. In my testing I found that the below code was sufficient enough to trigger a resource group deletion. I'll do a little more investigation though, as I'd like to know whether we're just getting lucky with timing, or if it only delegates the async polling to the background job.
Remove-AzResourceGroup -Name <rg> -Force -AsJob ; Exit
What I'm removing is a retry on some 4xx/5xx failures, but I figured it would be sufficient to have the background poller remove those. That said, if we see some persistent ARM issues, it could cause a lot of resources to pile up, though I would argue that's a better case than blocking the pipeline timing anyway (and as @mikeharder suggested somewhere we may want to bump the deletion poller frequency anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice it might not matter, but to be safe I think we should block until Get-AzResourceGroup
returns ProvisioningState: Deleting
. This is very fast (usually less than one second) and confirms the service has received the request. Here's a timeline:
> New-AzResourceGroup -name foo -location westus2
> Get-AzResourceGroup -name foo
ProvisioningState : Succeeded
> Remove-AzResourceGroup -Name foo -Force -AsJob
> Get-AzResourceGroup -name foo
ProvisioningState : Deleting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add this. There are a couple scenarios to handle re: making sure we don't block on waiting for "Deleting" when it will never reach that state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Please check out the latest changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should block until Get-AzResourceGroup
returns ProvisioningState: Deleting
.
The following pipelines have been queued for testing: |
The following pipelines have been queued for testing: |
/check-enforcer override |
@benbp for future attempts you should be able to merge the miss-behaving sync PRs nd then re-run the step in sync pipeline and it should turn green. We should try to do that instead of overriding check-enforcer if possible. |
@weshaggard thanks, I will do that next time! |
…1146) * Remove resource group asynchronously and do not wait for completion * Verify resource group has reached Deleting state before exiting script * Use $_ instead of $Error[0] for resource group removal handling
Since we have background jobs that poll for zombie resource groups already, we don't need to block on resource group deletion. This should save us a couple minutes in our live test pipelines.
Example improvement on a test pipeline (the step takes 4 seconds now):
https://dev.azure.com/azure-sdk/internal/_build/results?buildId=593112&view=logs&j=011e1ec8-6569-5e69-4f06-baf193d1351e&t=5431112d-2b61-5a5f-7042-ef698f761043
I also verified in the activity log that we had a successful deletion event.
Resolves #754