Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: use find instead of char-by-char in FromFilePath() #50288

Merged
merged 6 commits into from
Oct 24, 2023
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 14 additions & 5 deletions src/node_url.cc
Original file line number Diff line number Diff line change
Expand Up @@ -417,12 +417,21 @@ void BindingData::RegisterExternalReferences(
}
}

std::string FromFilePath(const std::string_view file_path) {
std::string escaped_file_path;
for (size_t i = 0; i < file_path.length(); ++i) {
escaped_file_path += file_path[i];
if (file_path[i] == '%') escaped_file_path += "25";
std::string FromFilePath(std::string_view file_path) {
// Avoid unnecessary allocations.
size_t pos = file_path.empty() ? std::string_view::npos : file_path.find('%');
if (pos == std::string_view::npos) {
return ada::href_from_file(file_path);
}
// Escape '%' characters to a temporary string.
std::string escaped_file_path;
do {
escaped_file_path += file_path.substr(0, pos + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential poor performance due to reallocating here. Imagine a pathological input like "%".repeat(1e6).

Better to count the number of % characters, then preallocate a buffer of size file_path.size() + 2 * count.

Copy link
Member Author

@lemire lemire Oct 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reallocation due to the string_view.

Clang compiles the second inner loop to the following:

.LBB0_12: # =>This Inner Loop Header: Depth=1
  lea r14, [r13 + 1]
  cmp r15, r14
  mov rdx, r14
  cmovb rdx, r15
  mov rax, rbp
  sub rax, qword ptr [rbx + 8]
  cmp rax, rdx
  jb .LBB0_13
  mov rdi, rbx
  mov rsi, r12
  call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long)@PLT
  mov rax, qword ptr [rbx + 8]
  and rax, -2
  cmp rax, qword ptr [rsp + 16] # 8-byte Folded Reload
  je .LBB0_17
  mov edx, 2
  mov rdi, rbx
  lea rsi, [rip + .L.str]
  call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long)@PLT
  cmp r15, r13
  jbe .LBB0_21
  add r12, r14
  sub r15, r14
  je .LBB0_24
  mov rdi, r12
  mov esi, 37
  mov rdx, r15
  call memchr@PLT
  test rax, rax
  je .LBB0_27
  mov r13, rax
  sub r13, r12
  cmp r13, -1
  jne .LBB0_12

There are two calls to string append, as you'd expect. Otherwise, there is no allocation. E.g., for example, there are calls to a std::string_view constructor because it gets optimized away. Remember that std::string_view instances are non-allocating. That's why, for example, we pass them by value (not by reference), typically.

The adversarial scenario is one where the entire string is made of '%'. You can benchmark this case using my code, while passing --adversarial to the benchmark program (benchmark --adversarial). In that scenario, the repeated calls to find are not a positive. All tested functions are slow in this adversarial case, but the new code is about 50% slower. That's to be expected. If you do expect strings to contain a large fraction of '%' characters, then the PR is not beneficial. But the point of the PR is that on realistic inputs, the PR can multiply the performance by 10x.

Results on my macBook (LLVM14, you can run your own benchmarks):
Capture d’écran, le 2023-10-22 à 12 20 27

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empirically, doing a count + reserve does not improve the performance in the adversarial case where every character is a '%'. It makes the code slightly more complicated, however.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that std::string_view instances are non-allocating

Yes, but escaped_file_path is a string, not a string_view, and that's what being appended to.

Empirically, doing a count + reserve does not improve the performance in the adversarial case

Honestly, I'm not worried about maximum performance here. I'd rather have predictable worst-case performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnoordhuis I'm going to merge this in a couple of hours since this is not a review that blocks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have predictable worst-case performance.

Appending character-by-character to an std::string in C++ provides predictable performance: https://lemire.me/blog/2023/10/23/appending-to-an-stdstring-character-by-character-how-does-the-capacity-grow/

It is a common usage pattern in C++ (with std::vector, std::string, etc.). Complexity-wise, it is linear time with or withour reserve. A reserve may improve the performance, but it does not change the complexity of the algorithm. In my mind, you'd only want to do a reserve if it improves the real-world performance. It is effectively a performance/efficiency optimization.

Bugs are always possible, but in my mind, neither this PR nor the previous code (written by @anonrig I think) could degenerate performance-wise. They use linear time, safe algorithms.

I should stress that this code is not nearly as good as it could be, but further gains require changing ada.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appending character-by-character to an std::string in C++ provides predictable performance

You're making an observation based on a sample size of one. I suspect you're also working on the assumption that (re)allocation is a O(1) operation.

Now, the change in this PR doesn't make the code materially worse (only longer and more complex) so I'm not going to block it but it also doesn't make it materially better, it's just rummaging in the margin.

Count + alloc on the other hand is a material improvement because it changes the asymptotic runtime behavior.

(I'm not interested in discussing this further because time is finite and if I haven't gotten my point across by now, it's never going to happen.)

Copy link
Member Author

@lemire lemire Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not interested in discussing this further because time is finite and if I haven't gotten my point across by now, it's never going to happen.

There must be a misunderstanding. I suspect I explained myself poorly. I can see no reason why we would even disagree. Let me try to clarify.

I am saying that if you benchmark the following function...

std::string my_string; 
while (my_string.size() <= volume) { my_string += "a"; }

... the time per entry (time/volume) is effectively constant with existing C++ runtime libraries (glibc++, libc++, Visual Studio). My blog post has an analysis, and even a benchmark (complete with C++ code).

Another way to put it is that the running time is proportional to volume. And yes, I mean "the measured, wall-clock time".

They make it so because it is common usage. E.g., it would be considered a bug if it were quadratic time. We would know. The dynamic arrays in Swift, early on, could be tricked into doing one allocation per append. They quickly updated it.

I suspect you're also working on the assumption that (re)allocation is a O(1) operation.

I am assuming that allocating (or copying) N bytes takes O(N) time. With this model, insertion in a dynamic array with capacity that is expanded by a constant factor (which is how C++ runtime libraries do it) ensures that inserting an element is constant time (amortized).

Let us consider a toy example where you construct a 1024-char string, doubling each time the capacity, and starting with a capacity of 1 byte... (that's not realistic but it is simple)

  • first character: allocate 1 byte.
  • second character: double the capacity to 2 bytes.
  • 4th character: double the capacity to 4 bytes.
  • 8th character: double the capacity to 8 bytes.
  • 16th character: double the capacity to 16 bytes.
  • 32nd character: double the capacity to 32 bytes.
  • 64th character: double the capacity to 64 bytes.
  • 128th character: double the capacity to 128 bytes.
  • 256th character: double the capacity to 256 bytes.
  • 512th character: double the capacity to 512 bytes.
  • 1024th character: double the capacity to 1024 bytes.

If you sum it up you get 1 + 2 + 4 + ... 512 + 1024 = 2047. And the result is general. To construct a string of N bytes, you need ~2N bytes being allocated. The key part is that it has amortized linear complexity. Both glibc++ and libc++ use a factor of 2 whereas Microsoft and Facebook prefer a smaller factor, but the analysis is the same.

Feel free to reach out to me at daniel@lemire.me if I did not clarify that point. We can set up a videoconference if needed.

escaped_file_path += "25";
file_path = file_path.substr(pos + 1);
pos = 0;
tniessen marked this conversation as resolved.
Show resolved Hide resolved
lemire marked this conversation as resolved.
Show resolved Hide resolved
} while ((pos = file_path.find('%', pos)) != std::string_view::npos);
escaped_file_path += file_path;
return ada::href_from_file(escaped_file_path);
}

Expand Down
2 changes: 1 addition & 1 deletion src/node_url.h
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,7 @@ class BindingData : public SnapshotableObject {
std::optional<std::string> base);
};

std::string FromFilePath(const std::string_view file_path);
std::string FromFilePath(std::string_view file_path);

} // namespace url

Expand Down