groupingsets() adding an extra column with named vectors? #5206

sindribaldur · 2021-10-11T16:46:32Z

groupingsets(
  data.table(iris),
  j = mean(Sepal.Length),
  by = c('Sp' = 'Species'),
  sets = list('Species')
)
     Sp    V1    Species
1: <NA> 5.006     setosa
2: <NA> 5.936 versicolor
3: <NA> 6.588  virginica

I'm using the latest development version of data.table and R 4.1.0. This does not seem to have been the behaviour about three months ago, when I last ran the code that has now broken.


jangorecki · 2021-10-11T20:28:52Z

Could you check with verbose=TRUE (if that argument exists)? I think it may not be related to groupingsets itself, since it calls ordinary data.table code under the hood.

sindribaldur · 2021-10-11T21:20:55Z

Setting options(datatable.verbose = TRUE), I get:

Argument 'by' after substitute: by
i clause present and columns used in by detected, only these subset: [Species]
Detected that j uses these columns: [Sepal.Length]
lapply optimization is on, j unchanged as 'mean(Sepal.Length)'
Old mean optimization changed j from 'mean(Sepal.Length)' to '.External(Cfastmean, Sepal.Length, FALSE)'
Making each group and running j (GForce FALSE) ... 
  memcpy contiguous groups took 0.000s for 1 groups
  eval(j) took 0.000s for 1 calls
0.010s elapsed (0.020s cpu) 
Argument 'by' after substitute: by.set
Detected that j uses these columns: [Sepal.Length]
Finding groups using forderv ... forder.c received 150 rows and 1 columns
0.000s elapsed (0.000s cpu) 
Finding group sizes from the positions (can be avoided to save RAM) ... 0.000s elapsed (0.000s cpu) 
lapply optimization is on, j unchanged as 'mean(Sepal.Length)'
GForce optimized j to 'gmean(Sepal.Length)'
Making each group and running j (GForce TRUE) ... gforce initial population of grp took 0.002
gforce assign high and low took 0.000
This gmean took (narm=FALSE) ... gather took ... 0.000s
0.000s
gforce eval took 0.001
0.020s elapsed (0.000s cpu)

jangorecki · 2021-10-12T16:43:49Z

Hm, it was the split function that had a verbose argument. groupingsets doesn't provide this level of debug output.

sindribaldur · 2021-10-18T13:32:56Z

I removed the development version and reinstalled from CRAN, and the issue is gone.

ben-schwen · 2021-10-18T15:13:22Z

For reproducibility: I can confirm that there is an issue, i.e. the behavior changed between versions.

# 1.14.2
groupingsets(
  data.table(iris),
  j = mean(Sepal.Length),
  by = c('Sp' = 'Species'),
  sets = list('Species')
)
#       Species    V1
#        <fctr> <num>
# 1:     setosa 5.006
# 2: versicolor 5.936
# 3:  virginica 6.588

# current dev 1.14.3
groupingsets(
  data.table(iris),
  j = mean(Sepal.Length),
  by = c('Sp' = 'Species'),
  sets = list('Species')
)
#        Sp    V1    Species
#    <fctr> <num>     <fctr>
# 1:   <NA> 5.006     setosa
# 2:   <NA> 5.936 versicolor
# 3:   <NA> 6.588  virginica

After some digging, I found that the current behavior was introduced by #4713.
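
A possible workaround on the affected dev version, sketched here under assumptions and not something proposed in this thread: pass an unnamed by and rename the column afterwards with setnames(). This assumes the intent of c('Sp' = 'Species') was to get a grouping column called Sp; note that 1.14.2 ignored the name and returned Species.

```r
library(data.table)

# Use an unnamed `by`, so the named-vector handling this issue is about
# is not involved at all.
res <- groupingsets(
  data.table(iris),
  j = mean(Sepal.Length),
  by = 'Species',
  sets = list('Species')
)

# Rename by reference afterwards. Assumption: the desired column name is Sp.
setnames(res, 'Species', 'Sp')
print(res)
```

Renaming after the fact keeps the call independent of how a given version treats names on by, at the cost of one extra line.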

ben-schwen mentioned this issue Oct 18, 2021

groupingsets: new metaprogramming together with named by #5227

Merged

jangorecki added the regression label Oct 19, 2021

jangorecki added this to the 1.14.3 milestone Oct 19, 2021

mattdowle added the dev label Oct 20, 2021

mattdowle closed this as completed in #5227 Oct 20, 2021

jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
