Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace dependency on boost with a custom small vector #13716

Merged
14 commits merged into from
Sep 14, 2022
Merged

Conversation

lhecker
Copy link
Member

@lhecker lhecker commented Aug 10, 2022

This replaces ~70k LOC (parts of boost, 1/4th of the code in this project)
with ~700 LOC (small_vector.h). By replacing boost, we simplify future
maintenance and improve compile times.

Validation Steps Performed

  • New and existing unit tests are ok ✅
  • Various common VT applications run fine in debug mode OpenConsole ✅

@lhecker lhecker marked this pull request as draft August 10, 2022 13:39
@DHowett
Copy link
Member

DHowett commented Aug 10, 2022

Hell yeah.

Why is this difficult to review?

@lhecker
Copy link
Member Author

lhecker commented Aug 10, 2022

Why is this difficult to review?

The code isn't that trivial tbh. I honestly thought it was pretty solid when I opened the PR, but as you can see, the tests disagree and this code remains buggy.

@zadjii-msft
Copy link
Member

Yea I mean, small_vector.h is the hard part of this review. There's like what, 11 other relevant changes, and then 450 files of "delete boost" which I'm just gonna skip over. If you get the tests fixed, then hell yea let's ship this. Lets us close out the internal bugs we have about "switch boost to vcpkg" (or whatever they are) too.

@zadjii-msft zadjii-msft added the Area-CodeHealth Issues related to code cleanliness, linting, rules, warnings, errors, static analysis, etc. label Aug 10, 2022
Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some tests, answer the Qs, and we'll be golden!

@lhecker lhecker force-pushed the dev/lhecker/small-vector branch from 83a3b2c to 9f9f704 Compare August 12, 2022 17:35
@lhecker lhecker requested a review from DHowett August 12, 2022 17:36
@lhecker lhecker force-pushed the dev/lhecker/small-vector branch from 9f9f704 to efa468b Compare August 12, 2022 17:41
@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@lhecker lhecker marked this pull request as ready for review August 13, 2022 18:17
@@ -57,6 +57,7 @@ locl
lorem
Lorigin
maxed
minimalistic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the word is just "minmal" :D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this.jpg. Minimalistic sounds like something I'd say.

@zadjii-msft
Copy link
Member

zadjii-msft commented Aug 24, 2022

Do we think I'm actually qualified to review the implementation of a basic collection like that in c++? Or should I lean on the fact that the tests passed and nothing seems to explode? Otherwise @miniksa is usually a bit more fluent in the idomatic way of implementing something like this, and might be a better reviewer for https://github.com/microsoft/terminal/pull/13716/files#diff-396eb9779e7a38e7354acd9c04395d1fb502e991b843ead351c1312156b14a5a

@DHowett
Copy link
Member

DHowett commented Aug 24, 2022

This is in today's bug bash build as well!

@miniksa
Copy link
Member

miniksa commented Aug 25, 2022

Do we think I'm actually qualified to review the implementation of a basic collection like that in c++? Or should I lean on the fact that the tests passed and nothing seems to explode? Otherwise @miniksa is usually a bit more fluent in the idomatic way of implementing something like this, and might be a better reviewer for https://github.com/microsoft/terminal/pull/13716/files#diff-396eb9779e7a38e7354acd9c04395d1fb502e991b843ead351c1312156b14a5a

I can put it on my list for tomorrow.

Copy link
Member

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very exciting to be losing boost. Looks pretty good. I didn't have any blocking comments.

@@ -57,6 +57,7 @@ locl
lorem
Lorigin
maxed
minimalistic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this.jpg. Minimalistic sounds like something I'd say.

Comment on lines +7 to +18
// small_vector::_data can reference both non-owned (&_buffer[0]) and owned (new[]) data.
#pragma warning(disable : 26402) // Return a scoped object instead of a heap-allocated if it has a move constructor (r.3).
// small_vector manages a _capacity of potentially uninitialized data. We can't use regular new/delete.
#pragma warning(disable : 26409) // Avoid calling new and delete explicitly, use std::make_unique<T> instead (r.11).
// That's how the STL implemented their std::vector<>::iterator. I simply copied the concept.
#pragma warning(disable : 26434) // Function '...' hides a non-virtual function '...'.
// Functions like front()/back()/operator[]() are explicitly unchecked, just like the std::vector equivalents.
#pragma warning(disable : 26446) // Prefer to use gsl::at() instead of unchecked subscript operator (bounds.4).
// small_vector::_data references potentially uninitialized data and so we can't pass it regular iterators which reference initialized data.
#pragma warning(disable : 26459) // You called an STL function '...' with a raw pointer parameter at position '...' that may be unsafe ... (stl.1).
// small_vector::_data references potentially uninitialized data and so we can't pass it regular iterators which reference initialized data.
#pragma warning(disable : 26481) // Don't use pointer arithmetic. Use span instead (bounds.1).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be the most comprehensive explanation of warning suppressions I've ever seen. Fantastic!


namespace til
{
// This class was adopted from std::span<>::iterator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have written "stolen shamelessly" instead of "adopted". But I guess your wording is technically a better idea.

reserve(other._size);

std::uninitialized_copy(other.begin(), other.end(), _uninitialized_begin());
_size = other._size;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good call with reporting empty on failure.

// NOTE: If an exception is thrown while copying, the vector is left empty.
small_vector& operator=(const small_vector& other)
{
clear();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to ask why you needed std::destroy for the move constructor but not for here, but I suppose it's because the destructors will be naturally called as you're clear()ing. Also I bet I could answer this myself if I scrolled further down.

Copy link
Member Author

@lhecker lhecker Aug 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clear() only destroys the data, but doesn't free the allocated memory.
The copy constructor wants exactly that: It wants to reuse the existing memory. But the move constructor needs to free its own memory, since it'll take ownership of the memory of the other vector.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was thinking about calling the destructors of the objects inside when I wrote this comment not necessarily about using the memory space.

Comment on lines +382 to +397
if (_capacity != N)
{
_deallocate(_data);
}

if (other._capacity == N)
{
_data = &_buffer[0];
_capacity = N;
_size = other._size;
// The earlier static_assert(std::is_nothrow_move_constructible_v<T>)
// ensures that we don't exit in a weird state with invalid `_size`.
#pragma warning(suppress : 26447) // The function is declared 'noexcept' but calls function '...' which may throw exceptions (f.6).
std::uninitialized_move(other.begin(), other.end(), _uninitialized_begin());
std::destroy(other.begin(), other.end());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're going to want to explain this. Why is it better to do it this way over just having the pointers always be moved into this one and calling it a day. I presume it's an optimization of some sort, but I don't know what it is. Comments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean why I check if (other._capacity == N) here and do this extra branch?
If the other vector is "small" (= data is in the _buffer array) its _capacity will equal N. But unlike a heap allocated pointer we can't just "move" a stack allocated array - we can only copy it. And that's why use std::uninitialized_move here to create a copy of the other vector's _buffer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes yes. It's literally the entire optimization. Brain fart. Maybe just leave a small comment on the branch to help remind goofballs like me.

@zadjii-msft zadjii-msft added this to the Terminal v1.17 milestone Aug 25, 2022
@DHowett
Copy link
Member

DHowett commented Sep 2, 2022

@lhecker are the TODOs to-done?

@lhecker
Copy link
Member Author

lhecker commented Sep 2, 2022

@DHowett No I haven't worked on this yet. I was planning to continue with this PR once we forked off 1.16.

@lhecker
Copy link
Member Author

lhecker commented Sep 13, 2022

@DHowett I've just pushed some final fixes and improvements and I think this PR is now ready to be merged. If possible I'd love if you could give the last commit another review.

// This line is noexcept:
// It can't throw because of the earlier static_assert(std::is_nothrow_move_assignable_v<T>).
const auto displacement = std::min(count, moveable);
std::uninitialized_move(end() - displacement, end(), _uninitialized_begin() + (new_size - displacement));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have tests exercising this code? validating it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, parts of the Basic tests cover this.

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think i understand!

@DHowett DHowett added the AutoMerge Marked for automatic merge by the bot when requirements are met label Sep 13, 2022
@ghost
Copy link

ghost commented Sep 13, 2022

Hello @DHowett!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit 71fc4b1 into main Sep 14, 2022
@ghost ghost deleted the dev/lhecker/small-vector branch September 14, 2022 13:57
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-CodeHealth Issues related to code cleanliness, linting, rules, warnings, errors, static analysis, etc. AutoMerge Marked for automatic merge by the bot when requirements are met
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants