Error in orchestrator: Netherite backend failed to start: Operations per second is over the account limit. #396

ericleigh007 · 2024-05-08T11:25:02Z

      "Microsoft.Azure.DurableTask.Core": "2.15.1",
      "Microsoft.Azure.DurableTask.Netherite": "1.4.2",
      "Microsoft.Azure.WebJobs.Extensions.DurableTask": "2.12.0"

We have started to get this error reported in our production deployment and it is worrying because it results in data loss.

As is probably obvious from the stack trace below, our Service Bus trigger starts [or in this case attempts to start] an orchestrator. What is "interesting" is that we are not under any unusual load as far as I can tell, and we ran a full-scale test in our scaled test environment (equal to this environment in scale) yesterday with no such problem.

Microsoft.Azure.WebJobs.Host.FunctionInvocationException:
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+<ExecuteWithLoggingAsync>d__26.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:352)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+<TryExecuteAsync>d__18.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:108)
Inner exception System.InvalidOperationException handled at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw:
   at DurableTask.Netherite.NetheriteOrchestrationService+<GetClientAsync>d__21.MoveNext (DurableTask.Netherite, Version=1.0.0.0, Culture=neutral, PublicKeyToken=ef8c4135b1b4225a: /_/src/DurableTask.Netherite/OrchestrationService/NetheriteOrchestrationService.cs:75)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.Netherite.NetheriteOrchestrationService+<DurableTask-Core-IOrchestrationServiceClient-CreateTaskOrchestrationAsync>d__100.MoveNext (DurableTask.Netherite, Version=1.0.0.0, Culture=neutral, PublicKeyToken=ef8c4135b1b4225a: /_/src/DurableTask.Netherite/OrchestrationService/NetheriteOrchestrationService.cs:594)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.Core.TaskHubClient+<InternalCreateOrchestrationInstanceWithRaisedEventAsync>d__32.MoveNext (DurableTask.Core, Version=2.16.0.0, Culture=neutral, PublicKeyToken=d53979610a6e89dd: /_/src/DurableTask.Core/TaskHubClient.cs:645)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableClient+<Microsoft-Azure-WebJobs-Extensions-DurableTask-IDurableOrchestrationClient-StartNewAsync>d__36`1.MoveNext (Microsoft.Azure.WebJobs.Extensions.DurableTask, Version=2.0.0.0, Culture=neutral, PublicKeyToken=014045d636e89289: D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\ContextImplementations\DurableClient.cs:215)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at More.Functions.Input.Dew.DewServiceBusTrigger+<StartOrchestrator>d__9.MoveNext (More.Functions, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: D:\a\1\s\More.Functions\Input\Dew\DewServiceBusTrigger.cs:111)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at More.Functions.Input.Dew.DewServiceBusTrigger+<ProcessMessage>d__8.MoveNext (More.Functions, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: D:\a\1\s\More.Functions\Input\Dew\DewServiceBusTrigger.cs:93)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at More.Functions.Input.Dew.DewServiceBusTrigger+<Run>d__4.MoveNext (More.Functions, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: D:\a\1\s\More.Functions\Input\Dew\DewServiceBusTrigger.cs:53)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at More.Functions.Input.Dew.DewServiceBusTrigger+<Run>d__4.MoveNext (More.Functions, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: D:\a\1\s\More.Functions\Input\Dew\DewServiceBusTrigger.cs:58)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Executors.VoidTaskMethodInvoker`2+<InvokeAsync>d__2.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\VoidTaskMethodInvoker.cs:20)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker`2+<InvokeAsync>d__10.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionInvoker.cs:52)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+<InvokeWithTimeoutAsync>d__33.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:581)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+<ExecuteWithWatchersAsync>d__32.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:527)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+<ExecuteWithLoggingAsync>d__26.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:306)
Inner exception Microsoft.Azure.Storage.StorageException handled at DurableTask.Netherite.NetheriteOrchestrationService+<GetClientAsync>d__21.MoveNext:
   at Microsoft.Azure.Storage.Core.Executor.Executor+<ExecuteAsync>d__1`1.MoveNext (Microsoft.Azure.Storage.Common, Version=11.2.3.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.EventHubs.Processor.PartitionManager+<InitializeStoresAsync>d__10.MoveNext (Microsoft.Azure.EventHubs.Processor, Version=4.3.2.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.EventHubs.Processor.PartitionManager+<StartAsync>d__7.MoveNext (Microsoft.Azure.EventHubs.Processor, Version=4.3.2.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.EventHubs.Processor.EventProcessorHost+<RegisterEventProcessorFactoryAsync>d__56.MoveNext (Microsoft.Azure.EventHubs.Processor, Version=4.3.2.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.Netherite.EventHubsTransport.EventHubsTransport+<<DurableTask-Netherite-ITransportLayer-StartWorkersAsync>g__StartPartitionHost|41_0>d.MoveNext (DurableTask.Netherite, Version=1.0.0.0, Culture=neutral, PublicKeyToken=ef8c4135b1b4225a: /_/src/DurableTask.Netherite/TransportLayer/EventHubs/EventHubsTransport.cs:188)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.Netherite.EventHubsTransport.EventHubsTransport+<DurableTask-Netherite-ITransportLayer-StartWorkersAsync>d__41.MoveNext (DurableTask.Netherite, Version=1.0.0.0, Culture=neutral, PublicKeyToken=ef8c4135b1b4225a: /_/src/DurableTask.Netherite/TransportLayer/EventHubs/EventHubsTransport.cs:163)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.Netherite.NetheriteOrchestrationService+<StartWorkersAsync>d__87.MoveNext (DurableTask.Netherite, Version=1.0.0.0, Culture=neutral, PublicKeyToken=ef8c4135b1b4225a: /_/src/DurableTask.Netherite/OrchestrationService/NetheriteOrchestrationService.cs:428)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.Netherite.NetheriteOrchestrationService+<TryStartAsync>d__85.MoveNext (DurableTask.Netherite, Version=1.0.0.0, Culture=neutral, PublicKeyToken=ef8c4135b1b4225a: /_/src/DurableTask.Netherite/OrchestrationService/NetheriteOrchestrationService.cs:319)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at DurableTask.Core.TaskHubWorker+<StartAsync>d__32.MoveNext (DurableTask.Core, Version=2.16.0.0, Culture=neutral, PublicKeyToken=d53979610a6e89dd: /_/src/DurableTask.Core/TaskHubWorker.cs:246)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskExtension+<StartTaskHubWorkerIfNotStartedAsync>d__102.MoveNext (Microsoft.Azure.WebJobs.Extensions.DurableTask, Version=2.0.0.0, Culture=neutral, PublicKeyToken=014045d636e89289: D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\DurableTaskExtension.cs:1414)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.Azure.WebJobs.Host.Listeners.FunctionListener+<StartAsync>d__13.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.39.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Listeners\FunctionListener.cs:68)

ericleigh007 · 2024-05-08T11:50:06Z

We have at least 149 of these over the last 4 hours, but our exception handling may also be hiding some of them.
It looks as if operations per second on the backend storage account are just over the 20k/sec limit, so I am assuming that is the cause.

I also think my other issue about FASTER (#383) is related: perhaps the retries on those errors are what push us over?

ericleigh007 · 2024-05-08T17:50:59Z

@davidmrdavid glad the job is the job while I guess @sebastianburckhardt is out readying for BUILD, maybe?

At any rate....

Another update: We're trying to get our storage account capacity upgraded on a few accounts to get over the hump and into live, whilst I'm hoping that Netherite troops will find some of the problem that could lead to needing more than expected capacity.
Some ideas:

- The FASTER.KV problem may be causing a lot of retries to the blob?
- Either way, do we have control over any retry logic used by Netherite that we could tweak? I obviously hope that we can just get more capacity, or regain capacity some other way, but adjusting the retry policy might provide some immediate relief.
- Could we provide a DIFFERENT storage account for the "larger payload" blobs, and thereby cut that part of the load?
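
On the retry question, one piece that is under the caller's control is how the trigger reacts when starting an orchestration fails. What follows is a minimal Python sketch of jittered exponential backoff, not Netherite's own retry policy. The names `start_with_backoff`, `backoff_delays`, and `start_orchestration`, and all delay values, are hypothetical and chosen only for illustration. The idea is that a throttled storage account is not hit again immediately by every failed caller.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(base: float, factor: float, cap: float, attempts: int) -> list[float]:
    """Return the un-jittered delay before each retry (attempts - 1 entries)."""
    return [min(cap, base * factor ** i) for i in range(attempts - 1)]


def start_with_backoff(
    start_orchestration: Callable[[], T],  # hypothetical: wraps the real client call
    attempts: int = 5,
    base: float = 0.5,
    factor: float = 2.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call start_orchestration, retrying on failure with jittered backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(base, factor, cap, attempts)
    for attempt in range(attempts):
        try:
            return start_orchestration()
        except Exception:
            # Assumption: real code would catch only the throttling error.
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Full jitter: sleep a random amount up to the computed delay.
            sleep(random.uniform(0, delays[attempt]))
    raise AssertionError("unreachable")
```

With the defaults, the un-jittered delays are 0.5, 1, 2, and 4 seconds. Re-raising after the last attempt lets the failure reach the trigger's own error handling rather than being swallowed.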

I'm off trying to justify Azure increasing my capacity, but happy to hear anything of any help.

davidmrdavid · 2024-05-08T22:46:01Z

> The FASTER.KV problem may be causing a lot of retries to the blob?

Interesting theory, maybe; I need to see if this triggers an explosive retry chain. I'll see if I can provide you with a private package based off this (#397) in case you're willing to give it a go and report back.

> We have started to get this error reported in our production deployment and it is worrying because it results in data loss.

This is new info to me - what do you mean by this resulting in data loss? Can you please clarify? Is it data loss because you're getting throttled so you need to choose a different storage account (therefore leaving the old data behind)?

ericleigh007 · 2024-05-09T09:19:32Z

Thanks @davidmrdavid
It isn't "strictly" loss, as the input is still in the input file, but it means we have to do things like manually retrigger our Cosmos change feed, because the orchestrator that was supposed to start isn't started. Then downstream systems don't get their updates, and then the domino effect.

davidmrdavid · 2024-05-09T21:11:11Z

Noted. For now, let's try the package here (#383 (comment)) to see if there's indeed a correlation between those FASTER warnings and this throttling; that should help make the next steps clearer.

ericleigh007 · 2024-05-10T20:00:36Z

Azure support upped our maximum transactions / second to 40,000 for now. We have not seen this one again since the first time. We have a decent A/B situation set up now because only ONE of our high-scale accounts has the uplift. So we'll see if the lower-limit accounts still report this occasionally.

davidmrdavid · 2024-05-10T22:26:52Z

@ericleigh007: so is the plan to have the private package in both environments so we can measure the effect of the change in that A/B set up? Just confirming, that would be great

ericleigh007 · 2024-05-11T22:21:23Z

Unfortunately [really] no. We have no way to put the private package in the production environment, so we can't do that. It was difficult enough to get the private package into dev, where we tested.

Now back to the subject, will definitely keep you guys informed as we run on the 40k environment next week. So far, I've seen the environment hit 1.65M transactions / minute, so with the capacity uplift, we're definitely starting to increase above the standard load.
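
To put the 1.65M transactions / minute figure next to the per-second limits mentioned in this thread (20k before the uplift, 40k after), here is a small worked conversion. It is only a sketch: the helper names are made up for illustration, and a per-minute total is an average, so the true per-second peak within that minute may be higher.

```python
def per_second(transactions_per_minute: float) -> float:
    """Average transactions per second over a one-minute window."""
    return transactions_per_minute / 60.0


def utilization(tps: float, limit_tps: float) -> float:
    """Fraction of the per-second limit in use (1.0 means at the limit)."""
    return tps / limit_tps


observed = per_second(1_650_000)          # 27500.0 per second on average
vs_old = utilization(observed, 20_000)    # 1.375  -> 37.5% over the 20k limit
vs_new = utilization(observed, 40_000)    # 0.6875 -> about 69% of the 40k limit

print(f"{observed:.0f}/s, {vs_old:.1%} of 20k, {vs_new:.1%} of 40k")
```

So the observed load would have exceeded the original limit even as a one-minute average, while leaving headroom under the uplifted one.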

In the next week or so, I believe we will get the newest released package and dependencies into some environments and within a couple of weeks, if no problems develop, into production.

ericleigh007 · 2024-05-18T02:46:38Z

I sent you some PMs showing the comparison between our uplifted storage account's usage and the non-uplifted one's, for about the same workload, give or take.
The function app under the uplifted storage account ran great, while the non-uplifted one had myriad warnings, IndexOutOfRange, Disposed, and Client timeouts starting orchestrators.
This may be correlation without causation, but wanted to mention it here.

microsoft-github-policy-service bot added the Needs: Triage 🔍 label May 8, 2024

davidmrdavid mentioned this issue May 10, 2024

Large amounts of warnings from FASTER.KV #383

Closed

bachuv added P2 Priority 2 and removed Needs: Triage 🔍 labels May 22, 2024
