-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
src: use find instead of char-by-char in FromFilePath() #50288
Merged
Merged
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
3f4deaa
src: use find instead of char-by-char in FromFilePath()
lemire 95e9f4c
fix typo
lemire a61db5d
adding space
lemire 8c6a185
Update src/node_url.cc
lemire 11d038f
Update src/node_url.cc
lemire ba7f31d
simplifying by not setting pos to zero.
lemire File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential poor performance due to reallocating here. Imagine a pathological input like
"%".repeat(1e6)
.Better to count the number of
%
characters, then preallocate a buffer of sizefile_path.size() + 2 * count
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no reallocation due to the string_view.
Clang compiles the second inner loop to the following:
There are two calls to string append, as you'd expect. Otherwise, there is no allocation. E.g., for example, there are calls to a
std::string_view
constructor because it gets optimized away. Remember thatstd::string_view
instances are non-allocating. That's why, for example, we pass them by value (not by reference), typically.The adversarial scenario is one where the entire string is made of '%'. You can benchmark this case using my code, while passing
--adversarial
to the benchmark program (benchmark --adversarial
). In that scenario, the repeated calls tofind
are not a positive. All tested functions are slow in this adversarial case, but the new code is about 50% slower. That's to be expected. If you do expect strings to contain a large fraction of '%' characters, then the PR is not beneficial. But the point of the PR is that on realistic inputs, the PR can multiply the performance by 10x.Results on my macBook (LLVM14, you can run your own benchmarks):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empirically, doing a count + reserve does not improve the performance in the adversarial case where every character is a '%'. It makes the code slightly more complicated, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but escaped_file_path is a string, not a string_view, and that's what being appended to.
Honestly, I'm not worried about maximum performance here. I'd rather have predictable worst-case performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bnoordhuis I'm going to merge this in a couple of hours since this is not a review that blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appending character-by-character to an
std::string
in C++ provides predictable performance: https://lemire.me/blog/2023/10/23/appending-to-an-stdstring-character-by-character-how-does-the-capacity-grow/It is a common usage pattern in C++ (with
std::vector
,std::string
, etc.). Complexity-wise, it is linear time with or withour reserve. A reserve may improve the performance, but it does not change the complexity of the algorithm. In my mind, you'd only want to do a reserve if it improves the real-world performance. It is effectively a performance/efficiency optimization.Bugs are always possible, but in my mind, neither this PR nor the previous code (written by @anonrig I think) could degenerate performance-wise. They use linear time, safe algorithms.
I should stress that this code is not nearly as good as it could be, but further gains require changing ada.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're making an observation based on a sample size of one. I suspect you're also working on the assumption that (re)allocation is a O(1) operation.
Now, the change in this PR doesn't make the code materially worse (only longer and more complex) so I'm not going to block it but it also doesn't make it materially better, it's just rummaging in the margin.
Count + alloc on the other hand is a material improvement because it changes the asymptotic runtime behavior.
(I'm not interested in discussing this further because time is finite and if I haven't gotten my point across by now, it's never going to happen.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There must be a misunderstanding. I suspect I explained myself poorly. I can see no reason why we would even disagree. Let me try to clarify.
I am saying that if you benchmark the following function...
... the time per entry (time/volume) is effectively constant with existing C++ runtime libraries (glibc++, libc++, Visual Studio). My blog post has an analysis, and even a benchmark (complete with C++ code).
Another way to put it is that the running time is proportional to volume. And yes, I mean "the measured, wall-clock time".
They make it so because it is common usage. E.g., it would be considered a bug if it were quadratic time. We would know. The dynamic arrays in Swift, early on, could be tricked into doing one allocation per append. They quickly updated it.
I am assuming that allocating (or copying) N bytes takes O(N) time. With this model, insertion in a dynamic array with capacity that is expanded by a constant factor (which is how C++ runtime libraries do it) ensures that inserting an element is constant time (amortized).
Let us consider a toy example where you construct a 1024-char string, doubling each time the capacity, and starting with a capacity of 1 byte... (that's not realistic but it is simple)
If you sum it up you get 1 + 2 + 4 + ... 512 + 1024 = 2047. And the result is general. To construct a string of N bytes, you need ~2N bytes being allocated. The key part is that it has amortized linear complexity. Both glibc++ and libc++ use a factor of 2 whereas Microsoft and Facebook prefer a smaller factor, but the analysis is the same.
Feel free to reach out to me at daniel@lemire.me if I did not clarify that point. We can set up a videoconference if needed.