Clarity around event hub partition config #15

Closed
olitomlinson opened this issue Dec 21, 2020 · 4 comments

@olitomlinson

The docs state you need

  • An event hub called partitions with 1-32 partitions. We recommend 12 as a default.
  • Four event hubs called clients0, clients1, clients2 and clients3 with 32 partitions each.

Should this be

  • Four event hubs called clients0, clients1, clients2 and clients3, always with 32 partitions each.

or

  • Four event hubs called clients0, clients1, clients2 and clients3 with 32 partitions each, or whatever you set for the partitions event hub, i.e. 12 as per the recommendation

Also, should these partition counts be linked in any way to the partitionCount configuration in host.json, or is that completely independent of this event hub configuration?

@sebastianburckhardt

Back from holidays today, so here we go.

  • The client event hubs should always have 32 partitions each. The number of these client queue partitions is independent of the number of Netherite partitions.
    Perhaps I should explain the reason, to make this less mysterious. These client queues are used to send messages to the clients. Logically, one would want one partition per client; but clients can come and go quickly and dynamically, and their precise number is unknown, so it is not really a good idea to use one reserved EventHubs partition per client. Instead, Netherite uses a hashing scheme where each client gets hashed to one of the 128 (= 32+32+32+32) client queue partitions, based on its (random) client id. Conflicts are mostly harmless: if two clients happen to be hashed to the same queue, they may receive some messages for which they are not the target (and simply throw them away). So it wastes a bit of network bandwidth, but such conflicts are expected to be rare. (A sketch of the idea follows this list.)

  • The partitionCount in host.json does not apply. It is not actually part of the NetheriteOrchestrationServiceSettings, which control the Netherite backend, but only of the AzureStorageOptions, which control the Azure Storage backend. Those two classes have a different set of options.
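
To make the hashing scheme concrete, here is a minimal illustrative sketch (not the actual Netherite source); the class and member names are made up, and only the 4 x 32 = 128 bucket arithmetic reflects the description above:

```csharp
using System;

// Illustrative only -- not the actual Netherite implementation.
// Maps a (random) client id onto one of the 4 x 32 = 128 client queue partitions.
static class ClientQueueRouting
{
    const int ClientEventHubs = 4;    // clients0 .. clients3
    const int PartitionsPerHub = 32;  // always 32, independent of the 'partitions' hub

    public static (string EventHub, int Partition) Route(Guid clientId)
    {
        // Hash the client id into one of the 128 buckets; a collision just means
        // two clients share a queue and ignore each other's messages.
        int bucket = (int)((uint)clientId.GetHashCode() % (ClientEventHubs * PartitionsPerHub));
        return ($"clients{bucket / PartitionsPerHub}", bucket % PartitionsPerHub);
    }
}
```

Because the routing depends only on the client id, any number of clients can share the fixed 128 partitions.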

I will modify the docs to make this clearer.

@olitomlinson

olitomlinson commented Jan 5, 2021

Thanks @sebastianburckhardt

The Azure Storage TaskHub characteristics have become very much ingrained for me by now, so I need to reset that for Netherite.

Another question:

With the current Azure Storage impl., it’s very easy to deduce the amount of concurrency in Entities (dictated by the partition count) and Activities (whatever the host scale-out limit is set to).

What is the equivalent in Netherite? Is it simply the total number of partitions in the partitions event hub, or is it the sum total of partitions in the clientsX event hubs?

With regard to Activities running in Netherite, I understand that you don’t get the same infinite scale-out as you might with the Consumption Plan + Azure Storage impl, as concurrency is controlled by the EH partitions instead of the competing-consumer pattern. Previous guidance on Activities was “do whatever long-running/complex stuff in an Activity”. Does this guidance change somewhat in Netherite, since long-running Activities will reduce the overall throughput of the entire TaskHub, specifically the processing of Orchestrations and Entities?

@sebastianburckhardt

There are two types of limits:

  • node scale-out limit: In the current Netherite implementation, the maximum number of nodes that can perform activities and orchestration steps is limited by the number of partitions in the partitions EventHub.
  • per-node limit: Within each node, the MaxConcurrentActivityFunctions and MaxConcurrentOrchestratorFunctions settings work the same as before; MaxConcurrentOrchestratorFunctions applies to both entity steps and orchestration steps (a sample host.json fragment follows this list).
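
For reference, a minimal host.json fragment showing where those per-node settings live in a Durable Functions app (the values are illustrative placeholders, not recommendations):

```json
{
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 10,
      "maxConcurrentOrchestratorFunctions": 10
    }
  }
}
```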

The guidance on how to decompose orchestrations and activities does not change; the same rules apply. We are trying to keep the programming model mostly the same. In particular, we are likely to implement support for infinite scale on activities at some point.

That said, if you need more scale-out for some activity function (such as a CPU-heavy computation), the best current solution is to convert the function into an HTTP trigger, and then call that HTTP trigger using context.CallHttpAsync. This allows that function to scale out without a limit on nodes.
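
For illustration, a rough orchestrator sketch of that pattern, assuming a hypothetical HTTP-triggered function hosted at /api/CpuHeavyWork (the function names, URL, and payload are placeholders):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class CpuHeavyOrchestration
{
    [FunctionName("CpuHeavyOrchestration")]
    public static async Task<string> RunOrchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Instead of CallActivityAsync (which is bound by the partition-count
        // node limit), call an HTTP-triggered function so it can scale out freely.
        DurableHttpResponse response = await context.CallHttpAsync(
            HttpMethod.Post,
            new Uri("https://myapp.azurewebsites.net/api/CpuHeavyWork"), // placeholder URL
            content: "{ \"input\": 42 }");

        return response.Content;
    }
}
```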

@sebastianburckhardt added this to the Public Preview milestone on Jan 25, 2021
@sebastianburckhardt

This is now much simpler: the event hubs are now created automatically; you only need to create the Event Hubs namespace.
I have also updated the documentation.

I am separately tracking the concern about activity scaling in #18.
