Optimize Indexable#join when all elements are strings #6635
src/indexable.cr
Outdated
if all?(&.is_a?(String))
  _join_strings(separator)
else
  super(separator)
I wanted to use `super` here but #6636
src/indexable.cr
Outdated
@@ -281,6 +281,66 @@ module Indexable(T)
    self
  end

  # Optimized version of `Enumerable#join` that performs better when
  # all of the elements in this indexable are strings: the total string
  # btyesize to return can be computed before creating the final string,
typo: btyesize
Thanks, fixed!
b71fb1d to ab90288
# Check whether we'll know the final UTF-8 size
if elem.size_known?
  size += elem.size
This is wrong because the size of the separator isn't added. The specs pass though; is there any way to test this in the specs?
Oh, good catch! Yes, I'll check it now.
Fixed!
@@ -4168,7 +4168,8 @@ class String
    return 4
  end

  protected def size_known?
  # :nodoc:
  def size_known?
I need this in `Indexable#join` so I made it `:nodoc:`. It might be confusing to expose it as a public method because calling `size` always gets you the correct size. We can discuss exposing this in a separate issue/PR.
ab90288 to 6b022fd
OK, this PR got a tad bigger now. @RX14 found that I had a bug when computing a string's size. So:
This actually uncovered a bug in |
src/indexable.cr
Outdated
    {% end %}
  end

  private def _join_strings(separator)
I don't like the convention of using `_name` for private names when Crystal already has private methods; just `private def join_strings` is fine.
The problem with private methods is that they can still be accessed by subclasses or including types. But yeah, I'll change it.
Fixes #6634

This PR overrides `Indexable(T)#join` and if:

- `T` is a `String` or
- `T` is a union that contains `String`, and all elements are strings

then an optimized version of `join` is executed. In this case we can compute the total bytesize of the returned string: `separator.bytesize * (array.size - 1) + array.sum(&.bytesize)`. This prevents potential string reallocations that happen in the default algorithm when `String.build` is used.

I used a slightly modified version of the code in #6634 to benchmark this:
Before:
After:
(note the allocated memory per op too)
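For readers who want to see the bytesize formula from the description in action, here is a minimal sketch of a presized join. It is not the code from this PR: `join_strings_presized` is a hypothetical standalone helper, and it only assumes the standard `String.new(capacity, &)` constructor, `String#to_unsafe`, and `Pointer#copy_to`.

```crystal
# Hypothetical helper (not the PR's implementation): joins an array of
# strings into a buffer whose bytesize is computed up front, so the
# buffer never has to grow while the result is being written.
def join_strings_presized(strings : Array(String), separator : String) : String
  return "" if strings.empty?

  # Formula from the PR description:
  # separator.bytesize * (array.size - 1) + array.sum(&.bytesize)
  total = separator.bytesize * (strings.size - 1) + strings.sum(&.bytesize)

  String.new(total) do |buffer|
    offset = 0
    strings.each_with_index do |str, i|
      if i > 0
        # Copy the separator bytes before every element except the first.
        separator.to_unsafe.copy_to(buffer + offset, separator.bytesize)
        offset += separator.bytesize
      end
      # Copy the element's bytes.
      str.to_unsafe.copy_to(buffer + offset, str.bytesize)
      offset += str.bytesize
    end
    # Return {bytesize, size}; a size of 0 tells String to compute the
    # character count lazily.
    {total, 0}
  end
end

puts join_strings_presized(["a", "bc", "d"], ", ")
```

The sketch skips the `size_known?` bookkeeping discussed in the review and always lets the character count be computed lazily, which keeps the example short at the cost of a later scan when `size` is called.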
In Ruby (change `require "benchmark"` to `require "benchmark/ips"` using this gem):

So now it's faster than Ruby. Just note that if you run the original code from #6634 Crystal seems to be just a little slower than Ruby depending on the run, but a proper benchmark (`Benchmark.ips`) shows the real stuff.

Further optimization
There are two cases when this optimization is applied:

- `T` is a `String` or
- `T` is a union that contains `String`, and all elements are strings

The first one will be trivially faster than the previous one.

The second one involves an `all?(&.is_a?(String))` check. I still think doing this check is faster than reallocating memory, so that's probably good. Ruby always does this check (it has no compile-time information) but while doing it it also computes the expected total bytesize, and switches to the default/slow implementation if an element is not a string. We could do the same, computing the length while checking `.is_a?(String)`, but I think that will make the code a lot harder to read and understand, so maybe it's not worth it. But we could eventually do it if we find it's a significant performance difference (I doubt it).

Also, when there's no `String` in the array, the implementation is the default (slow) one.
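The single-pass alternative mentioned above (checking `is_a?(String)` and accumulating the bytesize in the same loop) could look roughly like the sketch below. This only illustrates the idea; it is not code from this PR, and `joined_bytesize_if_all_strings` is a hypothetical name.

```crystal
# Hypothetical helper: walks the collection once. Returns the bytesize
# the joined string would have if every element is a String, or nil as
# soon as a non-String element is found (the caller would then fall
# back to the default join).
def joined_bytesize_if_all_strings(elements : Indexable, separator : String) : Int32?
  total = 0
  elements.each_with_index do |elem, i|
    if elem.is_a?(String)
      # Count one separator before every element except the first.
      total += separator.bytesize if i > 0
      total += elem.bytesize
    else
      return nil
    end
  end
  total
end

mixed = ["a", 1, "bc"] # Array(String | Int32)
p joined_bytesize_if_all_strings(mixed, ", ")
p joined_bytesize_if_all_strings(["a", "bc"], ", ")
```

Compared with a separate `all?(&.is_a?(String))` pass followed by a bytesize sum, this trades one traversal for a slightly more tangled loop, which is the readability cost the description refers to.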