This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
Reduce allocations when async method yields
The first time a Task-based method yields, today there are four allocations:
- The Task returned from the method
- The state machine object boxed to the heap
- An Action delegate that'll be passed to awaiters
- A MoveNextRunner that stores the state machine and the ExecutionContext, and has the method that the Action actually references

For a simple async method, e.g.
```C#
static async Task DoWorkAsync()
{
    await Task.Yield();
}
```
when it yields the first time, we allocate four objects totaling 232 bytes (64-bit).

This PR changes the scheme to use fewer allocations and less memory. With the new version, there are only two allocations:
- A type derived from Task
- An Action delegate that'll be passed to awaiters

This doesn't obviate the need for the state machine, but rather than boxing the object normally, we simply store the state machine onto the Task-derived type, which itself implements IAsyncStateMachine. Further, the captured ExecutionContext is stored onto that same object, rather than requiring a separate MoveNextRunner to be allocated, and the delegate can point to that Task-derived type. With this new scheme and that same example from earlier, rather than costing 4 allocations and 232 bytes, it costs 2 allocations and 176 bytes.

It also helps further in another common case. Previously the Task and state machine object would only be allocated once, but the Action and MoveNextRunner would be allocated and then could only be reused for subsequent awaits if the current ExecutionContext was the default. If, however, the current ExecutionContext was not the default, every await would end up allocating another Action and MoveNextRunner, for 2 allocations and 56 bytes on each await. With the new design, those are eliminated, such that even if a non-default ExecutionContext is in play, and even if it changes between awaits, the original allocations are still used.
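As a rough illustration of the idea described above, here is a minimal sketch of a single heap object that holds the state machine, the captured ExecutionContext, and one cached Action. The names `StateMachineBox`, `CountingStateMachine`, and `BoxDemo` are hypothetical and are not the types used by this change. To stay self-contained, the sketch also does not derive from Task, which the real type does.

```C#
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical illustration only; not the runtime's actual type.
// The real design derives this object from Task; that part is omitted here.
public sealed class StateMachineBox<TStateMachine> : IAsyncStateMachine
    where TStateMachine : IAsyncStateMachine
{
    public TStateMachine StateMachine;  // stored as a field, so no separate boxing
    public ExecutionContext Context;    // overwritten at each suspension point
    private Action _moveNextAction;     // created once, reused for every await

    public Action MoveNextAction
    {
        get
        {
            if (_moveNextAction == null)
            {
                _moveNextAction = MoveNext;
            }
            return _moveNextAction;
        }
    }

    public void MoveNext()
    {
        ExecutionContext context = Context;
        if (context == null)
        {
            // No context was captured (e.g. flow suppressed): run directly.
            StateMachine.MoveNext();
        }
        else
        {
            // Run the state machine under whichever context was stored last.
            ExecutionContext.Run(
                context,
                s => ((StateMachineBox<TStateMachine>)s).StateMachine.MoveNext(),
                this);
        }
    }

    public void SetStateMachine(IAsyncStateMachine stateMachine) { }
}

// Hypothetical stand-in for a compiler-generated state machine struct.
public struct CountingStateMachine : IAsyncStateMachine
{
    public int Steps;
    public void MoveNext() { Steps++; }
    public void SetStateMachine(IAsyncStateMachine stateMachine) { }
}

public static class BoxDemo
{
    // Returns true if the same delegate served two "awaits" under different contexts.
    public static bool ReusesDelegate()
    {
        var box = new StateMachineBox<CountingStateMachine>();

        box.Context = ExecutionContext.Capture();
        Action first = box.MoveNextAction;
        first();

        // Setting an AsyncLocal value changes the current ExecutionContext.
        var local = new AsyncLocal<int>();
        local.Value = 1;

        box.Context = ExecutionContext.Capture();
        Action second = box.MoveNextAction;
        second();

        return ReferenceEquals(first, second) && box.StateMachine.Steps == 2;
    }
}
```

Because the delegate targets the box rather than a per-context runner, swapping the stored context between awaits needs no new Action in this sketch. That mirrors the reuse described above, though the actual implementation differs in its details.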
There's also a small debugging benefit to this change: the resulting Task object now also contains the state machine data, which means that if you have a reference to the Task, you can easily see the state associated with the async method in the debugger. Previously you would need to use a tool like sos to find the async state machine object that referenced the relevant task.

One hopefully minor downside to the change is that the Task object returned from an async method is now larger than it used to be, with all of the state machine's state on it. Generally this won't matter, as you await a Task and then drop it, so the extra memory pressure doesn't exist for longer than it used to. However, if you happen to hold on to that task for a prolonged period of time, you'll now be keeping alive a larger object than you previously were.

There is also a corner-case change in behavior, which shouldn't break any real code but does break one corefx test: there's an AsyncValueTaskMethodBuilder test I wrote, as part of trying to get to 100% code coverage, that explicitly passes the wrong state machine object to the builder's SetStateMachine method, and this change causes one of its asserts to fail (in an expected manner).