Performance regression from v1 to v2 #182

bep · 2019-06-03T06:19:46Z

The relevant Hugo benchmarks comparing v1 with v2:

benchmark                                                    old ns/op     new ns/op     delta
BenchmarkI18nTranslate/all-present-4                         775           796           +2.71%
BenchmarkI18nTranslate/present-in-default-4                  1372          49923         +3538.70%
BenchmarkI18nTranslate/present-in-current-4                  777           797           +2.57%
BenchmarkI18nTranslate/missing-4                             1106          47469         +4191.95%
BenchmarkI18nTranslate/file-missing-4                        3729          2109          -43.44%
BenchmarkI18nTranslate/context-provided-4                    2087          1258          -39.72%
BenchmarkI18nTranslate/same-id-and-translation-4             806           792           -1.74%
BenchmarkI18nTranslate/same-id-and-translation-default-4     1413          49749         +3420.81%
BenchmarkI18nTranslate/unknown-language-code-4               1821          2073          +13.84%

benchmark                                                    old allocs     new allocs     delta
BenchmarkI18nTranslate/all-present-4                         6              4              -33.33%
BenchmarkI18nTranslate/present-in-default-4                  10             201            +1910.00%
BenchmarkI18nTranslate/present-in-current-4                  6              4              -33.33%
BenchmarkI18nTranslate/missing-4                             8              192            +2300.00%
BenchmarkI18nTranslate/file-missing-4                        21             11             -47.62%
BenchmarkI18nTranslate/context-provided-4                    15             5              -66.67%
BenchmarkI18nTranslate/same-id-and-translation-4             6              4              -33.33%
BenchmarkI18nTranslate/same-id-and-translation-default-4     10             201            +1910.00%
BenchmarkI18nTranslate/unknown-language-code-4               13             12             -7.69%

benchmark                                                    old bytes     new bytes     delta
BenchmarkI18nTranslate/all-present-4                         152           176           +15.79%
BenchmarkI18nTranslate/present-in-default-4                  216           9315          +4212.50%
BenchmarkI18nTranslate/present-in-current-4                  152           176           +15.79%
BenchmarkI18nTranslate/missing-4                             152           8924          +5771.05%
BenchmarkI18nTranslate/file-missing-4                        600           256           -57.33%
BenchmarkI18nTranslate/context-provided-4                    704           200           -71.59%
BenchmarkI18nTranslate/same-id-and-translation-4             152           165           +8.55%
BenchmarkI18nTranslate/same-id-and-translation-default-4     216           9304          +4207.41%
BenchmarkI18nTranslate/unknown-language-code-4               696           640           -8.05%

One pattern stands out in the numbers above: lookups hit a slow path whenever the string isn't found in the given language. Note that these tests use very small bundles with about 2 languages.

nicksnyder · 2019-06-05T20:29:50Z

v2.0.1 includes a fix for a fast path that wasn't getting taken when it should have. Can you please re-run your benchmarks with v2.0.1? I suspect some cases involving fallback will get slower due to the more complicated fallback logic, but I can investigate to be sure.

bep · 2019-06-06T18:16:36Z

So, v2.0.0 vs v2.0.1:

benchmark                                                    old ns/op     new ns/op     delta
BenchmarkI18nTranslate/all-present-4                         788           424           -46.19%
BenchmarkI18nTranslate/present-in-default-4                  49753         49105         -1.30%
BenchmarkI18nTranslate/present-in-current-4                  793           427           -46.15%
BenchmarkI18nTranslate/missing-4                             47109         46857         -0.53%
BenchmarkI18nTranslate/file-missing-4                        2108          2024          -3.98%
BenchmarkI18nTranslate/context-provided-4                    1237          1211          -2.10%
BenchmarkI18nTranslate/same-id-and-translation-4             784           418           -46.68%
BenchmarkI18nTranslate/same-id-and-translation-default-4     49628         49124         -1.02%
BenchmarkI18nTranslate/unknown-language-code-4               2048          2050          +0.10%

benchmark                                                    old allocs     new allocs     delta
BenchmarkI18nTranslate/all-present-4                         4              0              -100.00%
BenchmarkI18nTranslate/present-in-default-4                  201            197            -1.99%
BenchmarkI18nTranslate/present-in-current-4                  4              0              -100.00%
BenchmarkI18nTranslate/missing-4                             192            192            +0.00%
BenchmarkI18nTranslate/file-missing-4                        11             11             +0.00%
BenchmarkI18nTranslate/context-provided-4                    5              5              +0.00%
BenchmarkI18nTranslate/same-id-and-translation-4             4              0              -100.00%
BenchmarkI18nTranslate/same-id-and-translation-default-4     201            197            -1.99%
BenchmarkI18nTranslate/unknown-language-code-4               12             12             +0.00%

benchmark                                                    old bytes     new bytes     delta
BenchmarkI18nTranslate/all-present-4                         176           0             -100.00%
BenchmarkI18nTranslate/present-in-default-4                  9315          9138          -1.90%
BenchmarkI18nTranslate/present-in-current-4                  176           0             -100.00%
BenchmarkI18nTranslate/missing-4                             8925          8925          +0.00%
BenchmarkI18nTranslate/file-missing-4                        256           256           +0.00%
BenchmarkI18nTranslate/context-provided-4                    200           200           +0.00%
BenchmarkI18nTranslate/same-id-and-translation-4             165           0             -100.00%
BenchmarkI18nTranslate/same-id-and-translation-default-4     9306          9139          -1.79%
BenchmarkI18nTranslate/unknown-language-code-4               640           640           +0.00%

Compared to v1:

benchmark                                                    old ns/op     new ns/op     delta
BenchmarkI18nTranslate/all-present-4                         768           425           -44.66%
BenchmarkI18nTranslate/present-in-default-4                  1360          49039         +3505.81%
BenchmarkI18nTranslate/present-in-current-4                  769           423           -44.99%
BenchmarkI18nTranslate/missing-4                             1091          46897         +4198.53%
BenchmarkI18nTranslate/file-missing-4                        3653          2029          -44.46%
BenchmarkI18nTranslate/context-provided-4                    2040          1202          -41.08%
BenchmarkI18nTranslate/same-id-and-translation-4             793           416           -47.54%
BenchmarkI18nTranslate/same-id-and-translation-default-4     1401          49009         +3398.14%
BenchmarkI18nTranslate/unknown-language-code-4               1795          2006          +11.75%

benchmark                                                    old allocs     new allocs     delta
BenchmarkI18nTranslate/all-present-4                         6              0              -100.00%
BenchmarkI18nTranslate/present-in-default-4                  10             197            +1870.00%
BenchmarkI18nTranslate/present-in-current-4                  6              0              -100.00%
BenchmarkI18nTranslate/missing-4                             8              192            +2300.00%
BenchmarkI18nTranslate/file-missing-4                        21             11             -47.62%
BenchmarkI18nTranslate/context-provided-4                    15             5              -66.67%
BenchmarkI18nTranslate/same-id-and-translation-4             6              0              -100.00%
BenchmarkI18nTranslate/same-id-and-translation-default-4     10             197            +1870.00%
BenchmarkI18nTranslate/unknown-language-code-4               13             12             -7.69%

benchmark                                                    old bytes     new bytes     delta
BenchmarkI18nTranslate/all-present-4                         152           0             -100.00%
BenchmarkI18nTranslate/present-in-default-4                  216           9138          +4130.56%
BenchmarkI18nTranslate/present-in-current-4                  152           0             -100.00%
BenchmarkI18nTranslate/missing-4                             152           8926          +5772.37%
BenchmarkI18nTranslate/file-missing-4                        600           256           -57.33%
BenchmarkI18nTranslate/context-provided-4                    704           200           -71.59%
BenchmarkI18nTranslate/same-id-and-translation-4             152           0             -100.00%
BenchmarkI18nTranslate/same-id-and-translation-default-4     216           9140          +4131.48%
BenchmarkI18nTranslate/unknown-language-code-4               696           640           -8.05%

The main path is now about 2x faster than v1, which is great. But falling back to the other language (the tests above have at most 2 languages in a bundle) is, in my view, common enough, and 0.05 ms per lookup is on the long side for my use. I may look into the API to see if there is a way to do this manually (if not found, do a lookup for en).
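A minimal sketch of that manual fallback, using only the standard library. The `lookupFunc` type and the `mapLookup` helper are hypothetical stand-ins for a per-language lookup, not part of the go-i18n API; the point is only that a miss in the current language goes straight to one known default language, with no language matching involved.

```go
package main

// lookupFunc is a hypothetical stand-in for a per-language translation
// lookup. It reports whether the message ID exists in that language.
type lookupFunc func(id string) (string, bool)

// translateWithFallback tries the current language first and, on a miss,
// goes directly to the default language. No matcher is built on the miss path.
func translateWithFallback(current, def lookupFunc, id string) (string, bool) {
	if s, ok := current(id); ok {
		return s, true
	}
	return def(id)
}

// mapLookup adapts a plain map to lookupFunc for demonstration purposes.
func mapLookup(m map[string]string) lookupFunc {
	return func(id string) (string, bool) {
		s, ok := m[id]
		return s, ok
	}
}
```

With a Norwegian map containing only "hello" and an English map containing both "hello" and "bye", a lookup of "bye" would return the English text, and an ID missing from both would report not found.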

nicksnyder · 2019-06-13T22:16:18Z

The remaining performance regression is caused by instantiating a new matcher via NewMatcher in the slow path: https://github.com/nicksnyder/go-i18n/blob/v1/v2/i18n/localizer.go#L185

In the particular case that you are benchmarking, a new matcher is overkill because there is only one language to fall back to (I could optimize this), but in the general case you would still see the cost shown here.

If there are >2 languages and the translation is not in the preferred language, what should happen? What currently happens is that a new matcher is instantiated to return the “best” one of the remaining languages. Alternatively, we could (1) always fall back to the default message if available, (2) always return an error, or (3) return both the default message and an error so callers can choose the desired behavior. What do you think?
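To make option (3) concrete, here is a rough sketch under stated assumptions. The `localize` function, the nested-map message store, and the `notInLanguageError` type are all hypothetical and exist only for illustration; this is not the library's implementation or API.

```go
package main

import "fmt"

// notInLanguageError is a hypothetical error type signalling that the
// message was not available in the requested language.
type notInLanguageError struct{ lang, id string }

func (e *notInLanguageError) Error() string {
	return fmt.Sprintf("message %q not found in language %q", e.id, e.lang)
}

// localize returns the message in lang when present. Otherwise it returns
// the default-language message (possibly empty) together with a non-nil
// error, so the caller decides whether the fallback text is acceptable.
func localize(msgs map[string]map[string]string, lang, defaultLang, id string) (string, error) {
	if s, ok := msgs[lang][id]; ok {
		return s, nil
	}
	err := &notInLanguageError{lang: lang, id: id}
	return msgs[defaultLang][id], err
}
```

A caller that is happy with the default text can ignore the error, while a stricter caller can treat any non-nil error as a missing translation.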

bep · 2019-06-13T22:29:37Z

I'm not sure what you regard as the "best" alternative when a translation is not found. I assume that means I get the Swedish or Danish version if no Norwegian version is found.

If the above ruleset is solid and works for all languages, it could be worth it. But then you should consider adding a simple cache. The problem with the above benchmark is mainly that if you call translate("this-key-does-not-exist-in-this-language") a million times with the same key (a common use case), it would take roughly 50 seconds (about 49 µs per lookup times 1,000,000).
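As a rough illustration of what such a cache could look like, here is a standard-library-only sketch. The `slowLookup` type stands in for the expensive fallback path, and `fallbackCache` is a hypothetical wrapper, not something go-i18n provides. It memoizes misses as well as hits, so repeated lookups of the same missing key pay the slow path only once. It is unbounded and assumes the bundle does not change after the cache is created.

```go
package main

import "sync"

// slowLookup is a hypothetical stand-in for the expensive fallback lookup.
type slowLookup func(lang, id string) (string, bool)

type cacheKey struct{ lang, id string }

type cacheEntry struct {
	text  string
	found bool
}

// fallbackCache memoizes lookup results, including misses.
type fallbackCache struct {
	mu      sync.RWMutex
	entries map[cacheKey]cacheEntry
	lookup  slowLookup
}

func newFallbackCache(lookup slowLookup) *fallbackCache {
	return &fallbackCache{entries: make(map[cacheKey]cacheEntry), lookup: lookup}
}

// Translate returns the cached result when available and otherwise calls
// the slow lookup once and stores whatever it returned.
func (c *fallbackCache) Translate(lang, id string) (string, bool) {
	k := cacheKey{lang: lang, id: id}

	c.mu.RLock()
	e, ok := c.entries[k]
	c.mu.RUnlock()
	if ok {
		return e.text, e.found
	}

	// Two goroutines may both miss and both call lookup; the results are
	// identical, so the duplicate work is harmless in this sketch.
	text, found := c.lookup(lang, id)

	c.mu.Lock()
	c.entries[k] = cacheEntry{text: text, found: found}
	c.mu.Unlock()
	return text, found
}
```

The read lock keeps the common cached path cheap under concurrent use; a real implementation would also need an invalidation story if messages can be added at runtime.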

nicksnyder · 2019-06-17T04:06:51Z

I have a draft PR that simplifies the fallback logic in what I think is an appropriate way and resolves the performance regression. There are some more tests that I want to add before merging.

nicksnyder · 2020-09-29T03:24:13Z

This significant perf improvement is released in v2.0.4.

bep · 2020-09-29T07:07:10Z

Thanks, much appreciated.

bep mentioned this issue Jun 3, 2019

Data race in Template.Parse #181

Closed

nicksnyder mentioned this issue Jun 5, 2019

Fix data race and optimize performance #183

Merged

nicksnyder mentioned this issue Jun 16, 2019

simpler fallback behavior #189

Merged

QuLogic mentioned this issue Sep 18, 2019

Update to go-i18n v2 gohugoio/hugo#5242

Closed

nicksnyder closed this as completed in #189 Sep 28, 2020

nicksnyder mentioned this issue Nov 28, 2024

error on fallback to non-default less specific language #349

Closed
