Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memrecycle Performance Regression 1.14.3 #5371

Closed
ColeMiller1 opened this issue Apr 16, 2022 · 4 comments · Fixed by #5463
Closed

Memrecycle Performance Regression 1.14.3 #5371

ColeMiller1 opened this issue Apr 16, 2022 · 4 comments · Fixed by #5463

Comments

@ColeMiller1
Copy link
Contributor

ColeMiller1 commented Apr 16, 2022

Identified by @tlapak in #5366, there is a performance regression in 14.3 introduced by #4491 @jangorecki . I'm reporting here so we can separate the Pandas rolling time performance comparison from the regression issue.

The regression was specifically introduced by this commit: e793f53 . Basically, snprintf() performs poorly in spite of the better looking code. See also https://stackoverflow.com/questions/64600389/why-is-calling-snprintf-so-slow.

For performance, Stackoverflow seems to suggest strcpy(). I propose either rolling back the changes or using a helper function during the calls so that the string is only made when needed. Without any changes, users would see performance issues which would be most noticeable when there are many groups in non-Gforce j calls.

set.seed(123L)
n = 1e4L
dt = data.table(id = seq_len(n),
                val = rnorm(n))

### 1.14.2
system.time(dt[, .(vs = (sum(val))), by = .(id)])

##    user  system elapsed 
##    0.04    0.00    0.05

### after commit e793f53
system.time(dt[, .(vs = (sum(val))), by = .(id)])

##    user  system elapsed 
##    1.22    0.03    1.23 

Edit: The main difference is that every call gets this assigned targetDesc. Previously, the character would only be made during warnings, errors, or higher levels of verbosity with types being mismatched. That is, the character would largely not be made. If the second line with snprintf() is commented out, the code runs fast.

  static char targetDesc[501];  // from 1.14.1 coerceAs reuses memrecycle for a target vector, PR#4491
  snprintf(targetDesc, 500, colnum==0 ? _("target vector") : _("column %d named '%s'"), colnum, colname);
@MichaelChirico
Copy link
Member

Thanks for investigating!

using a helper function during the calls so that the string is only made when needed.

This sounds like the first-order fix to me.

We can also do strcopy() -- might be worth benchmarking (1) skip calling sprintf() and (2) (1) plus use strcopy()

@jangorecki jangorecki added this to the 1.14.3 milestone Apr 16, 2022
@jangorecki jangorecki added the dev label Apr 17, 2022
@jangorecki jangorecki changed the title Memrecycle Performance Regression Memrecycle Performance Regression 1.14.3 Jul 19, 2022
@jangorecki

This comment was marked as outdated.

@jangorecki
Copy link
Member

@ColeMiller1 commit you mentioned e793f53 seems to be from 1.14.1, maybe you pasted wrong hash. Anyway could you look if linked PR fixes the problem for you?

@ColeMiller1
Copy link
Contributor Author

ColeMiller1 commented Sep 15, 2022

@jangorecki Thanks!

Yes, the branch resolves the issue.

As for version, wasn't a hotfix applied for 14.2 so many 14.1 DEV was postponed and is included in 14.3? I think it's just been a while since we've had a full release.

@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants