-
Notifications
You must be signed in to change notification settings - Fork 8.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce Transient Allocations during Bulk Text Output #8617
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…ace that it is a counted string. (Could also just make sure both strings fit within small string optimization so no allocs... but this is better.)
…n there are no links found. Update caller to handle appropriately.
… on every single message.
DHowett
reviewed
Dec 18, 2020
DHowett
approved these changes
Dec 18, 2020
6 tasks
carlos-zamora
approved these changes
Dec 24, 2020
Co-authored-by: Carlos Zamora <carlos.zamora@microsoft.com>
Hello @miniksa! Because this pull request has the p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (
|
🎉 Handy links: |
mpela81
pushed a commit
to mpela81/terminal
that referenced
this pull request
Jan 28, 2021
Make a few changes to memory usage throughout the application to reduce transient allocations from the `big.txt` test from ~213,000 to ~53,000. ## PR Checklist * [x] Supports microsoft#3075 * [x] I work here. * [x] Tested manually and WPR'd. Test suite should still pass. * [x] Am core contributor ## Detailed Description of the Pull Request / Additional comments Transient allocations are those that are new'd, used, then delete'd. Going back and forth to the system allocator for things we're just going to throw away or use rapidly again is a performance detriment. Not only is it a bunch of time to go ask the system with a syscall, it also hits a whole bunch of locks on the allocators. This PR identifies a few places where we were accidentally allocating and didn't mean to or were allocating and freeing just to turn around and allocate again. I chose other strategies to avoid this. ## Validation Steps Performed - Ran `big.txt` sample (~6MB file) before and after. Observed heap allocations with WPR.
ghost
pushed a commit
that referenced
this pull request
Feb 16, 2021
…nd bitmap runs and utf8 conversions (#8621) ## References * See also #8617 ## PR Checklist * [x] Supports #3075 * [x] I work here. * [x] Manual test. ## Detailed Description of the Pull Request / Additional comments ### Window Title Generation Every time the renderer checks the title, it's doing two bad things that I've fixed: 1. It's assembling the prefix to the full title doing a concatenation. No one ever gets just the prefix ever after it is set besides the concat. So instead of storing prefix and the title, I store the assembled prefix + title and the bare title. 2. A copy must be made because it was returning `std::wstring` instead of `std::wstring&`. Now it returns the ref. ### Dirty Area Return Every time the renderer checks the dirty area, which is sometimes multiple times per pass (regular text printing, again for selection, etc.), a vector is created off the heap to return the rectangles. The consumers only ever iterate this data. Now we return a span over a rectangle or rectangles that the engine must store itself. 1. For some renderers, it's always a constant 1 element. They update that 1 element when dirty is queried and return it in the span with a span size of 1. 2. For other renderers with more complex behavior, they're already holding a cached vector of rectangles. Now it's effectively giving out the ref to those in the span for iteration. ### Bitmap Runs The `til::bitmap` used a `std::optional<std::vector<til::rectangle>>` inside itself to cache its runs and would clear the optional when the runs became invalidated. Unfortunately doing `.reset()` to clear the optional will destroy the underlying vector and have it release its memory. We know it's about to get reallocated again, so we're just going to make it a `std::pmr::vector` and give it a memory pool. The alternative solution here was to use a `bool` and `std::vector<til::rectangle>` and just flag when the vector was invalid, but that was honestly more code changes and I love excuses to try out PMR now. Also, instead of returning the ref to the vector... I'm just returning a span now. Everyone just iterates it anyway, may as well not share the implementation detail. ### UTF-8 conversions When testing with Terminal and looking at the `conhost.exe`'s PTY renderer, it spends a TON of allocation time on converting all the UTF-16 stuff inside to UTF-8 before it sends it out the PTY. This was because `ConvertToA` was allocating a string inside itself and returning it just to have it freed after printing and looping back around again... as a PTY does. The change here is to use `til::u16u8` that accepts a buffer out parameter so the caller can just hold onto it. ## Validation Steps Performed - [x] `big.txt` in conhost.exe (GDI renderer) - [x] `big.txt` in Terminal (DX, PTY renderer) - [x] Ensure WDDM and BGFX build under Razzle with this change.
This pull request was closed.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Area-Output
Related to output processing (inserting text into buffer, retrieving buffer text, etc.)
Area-Performance
Performance-related issue
Area-Server
Down in the muck of API call servicing, interprocess communication, eventing, etc.
AutoMerge
Marked for automatic merge by the bot when requirements are met
Issue-Bug
It either shouldn't be doing this or needs an investigation.
Product-Conhost
For issues in the Console codebase
Product-Conpty
For console issues specifically related to conpty
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Make a few changes to memory usage throughout the application to reduce transient allocations from the
big.txt
test from ~213,000 to ~53,000.PR Checklist
Detailed Description of the Pull Request / Additional comments
Transient allocations are those that are new'd, used, then delete'd. Going back and forth to the system allocator for things we're just going to throw away or use rapidly again is a performance detriment. Not only is it a bunch of time to go ask the system with a syscall, it also hits a whole bunch of locks on the allocators. This PR identifies a few places where we were accidentally allocating and didn't mean to or were allocating and freeing just to turn around and allocate again. I chose other strategies to avoid this.
Validation Steps Performed
big.txt
sample (~6MB file) before and after. Observed heap allocations with WPR.