`position = "jitter"` and `position = position_jitter()` have different behavior #2507

clauswilke · 2018-04-01T05:18:19Z

I noticed today that position = "jitter" is not the same as position = position_jitter(), and the behavior described in the documentation for position = position_jitter() cannot be obtained for position = "jitter". The underlying issue is a ggproto behavior that has caught me off guard many times (see below).

First the reprex. We start with a simple plot with jitter:

library(ggplot2)

df <- data.frame(x = c(1, 2, 1, 2),
                 y = c(1, 1, 2, 2))

set.seed(1234)
ggplot(df, aes(x, y)) + geom_point(position = "jitter")

The following code produces different jitter:

set.seed(1234)
ggplot(df, aes(x, y)) + geom_point(position = position_jitter())

However, this reproduces the jitter from the first example:

set.seed(1234)
ggplot(df, aes(x, y)) + geom_point(position = position_jitter(seed = NULL))

The problem is that position = "jitter" does not use the default arguments defined in position_jitter(), and therefore seed is set to NULL, not to NA, when we use position = "jitter". This same problem happens generally with ggproto, e.g. when we call a geom with geom_*() vs. geom = "*", and it has bitten me many times in different scenarios.
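One way to check this numerically is to compare the jittered coordinates rather than the rendered plots. This is a minimal sketch, not part of the original report: `jittered()` is a hypothetical helper, and it relies on `layer_data()` to pull the computed positions out of a built plot. The position is passed as a function so that any seed drawn inside the constructor is drawn after `set.seed()`, as in the reprex. Which comparison comes out `TRUE` depends on the installed ggplot2 version, so no expected output is shown.

```r
library(ggplot2)

df <- data.frame(x = c(1, 2, 1, 2),
                 y = c(1, 1, 2, 2))

# Hypothetical helper: build the plot under a fixed global seed and
# return the jittered x/y coordinates of the point layer.
jittered <- function(make_position) {
  set.seed(1234)
  p <- ggplot(df, aes(x, y)) + geom_point(position = make_position())
  layer_data(p)[, c("x", "y")]
}

by_name <- jittered(function() "jitter")
by_ctor <- jittered(function() position_jitter())
by_null <- jittered(function() position_jitter(seed = NULL))

# Does the string form match the constructor defaults, or seed = NULL?
print(isTRUE(all.equal(by_name, by_ctor)))
print(isTRUE(all.equal(by_name, by_null)))
```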

There are two ways to fix this particular problem:

The simple way: Don't set seed = NA as default in position_jitter().
The complicated way: Fix the ggproto behavior so the default arguments apply when objects are specified by their name.


hadley · 2018-05-01T20:20:13Z

I think maybe this is only a problem with position objects not geoms and stats, because geoms and stats have their state stored elsewhere. I think if you supply position = "jitter" we should call position_jitter() rather than ggproto(NULL, PositionJitter). Does that make sense?
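The lookup suggested here could be sketched roughly as follows. `position_from_name()` is a hypothetical helper written for illustration, not ggplot2's internal name resolution. It assumes the naming convention that a position called "jitter" has a constructor called `position_jitter()` in the ggplot2 namespace.

```r
library(ggplot2)

# Hypothetical helper: resolve a position name to its constructor and call it,
# so that the constructor's default arguments are applied.
position_from_name <- function(name) {
  constructor <- get(paste0("position_", name),
                     envir = asNamespace("ggplot2"),
                     mode = "function")
  constructor()
}

pos <- position_from_name("jitter")
print(class(pos))
```

Because the constructor runs, whatever defaults it defines (such as `seed = NA`) end up on the resulting object, which is the point of the suggestion.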

clauswilke · 2018-05-01T20:25:28Z

I agree that position = "jitter" should call position_jitter().

In any case, I'm pretty certain the same problem arises with geoms and stats, because I've run into it repeatedly. For example, all the if (is.null(...)) lines here in ggridges are needed to make stat_density_ridges() have reasonable defaults when it is called from geom_density_ridges(). I'll see if I can find a similar case in the current ggplot2 code.

clauswilke · 2018-05-01T20:45:56Z

Never mind, things seem to be working fine for geoms and stats. Not sure why it doesn't (or didn't) work with my code, but I can't reproduce the problem with StatDensity, for example. It always behaves correctly as far as I can tell.

hadley · 2018-05-01T23:56:05Z

I think the difference is that the params for stats and geoms are stored in the layer; the params for the position adjustment are stored in the position object. It's a weird design.
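The split can be inspected on a layer object. This is only a sketch: it assumes the layer exposes its stat parameters as `stat_params` and its position object as `position`. These are internal fields rather than a documented interface, so they may differ between ggplot2 versions.

```r
library(ggplot2)

# Stat parameters travel with the layer ...
dens <- geom_density(bw = 0.5)
print(dens$stat_params$bw)

# ... while position parameters live on the position object itself.
pts <- geom_point(position = position_jitter(width = 0.2, seed = 1))
print(pts$position$width)
print(pts$position$seed)
```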

I'm pretty sure this is not a regression, so I'm going to remove it from the 2.3.0 milestone.

paleolimbot · 2019-07-07T15:29:16Z

If this is still worth fixing, it could be done by moving the seed-sanitizing lines:

ggplot2/R/position-jitter.r

Lines 48 to 51 in 7f317d4

    position_jitter <- function(width = NULL, height = NULL, seed = NA) {
      if (!is.null(seed) && is.na(seed)) {
        seed <- sample.int(.Machine$integer.max, 1L)
      }

to PositionJitter$setup_params()

ggplot2/R/position-jitter.r

Lines 67 to 73 in 7f317d4

    setup_params = function(self, data) {
      list(
        width = self$width %||% (resolution(data$x, zero = FALSE) * 0.4),
        height = self$height %||% (resolution(data$y, zero = FALSE) * 0.4),
        seed = self$seed
      )
    },

and adding seed = NA to the ggproto() class definition:

ggplot2/R/position-jitter.r

Lines 64 to 68 in 7f317d4

    PositionJitter <- ggproto("PositionJitter", Position,
      required_aes = c("x", "y"),
      setup_params = function(self, data) {
        list(

Many ggproto() class definitions include "field" definitions (usually NULL) that are added in the constructor, so I don't think it's too far off. The fact that the constructors of ggproto() objects do not reside within the object but are external methods causes a lot of copy-and-pasting of code when objects are subclassed (unless you subclass the instance, which gets used a lot within ggplot2 but is not really defined behaviour).
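The proposal could be sketched as a subclass rather than as a change to ggplot2 itself. `PositionJitterSketch` is a hypothetical name. The `seed = NA` field and the seed handling in `setup_params()` follow the lines quoted above, with `%||%` written out as explicit `is.null()` checks so the sketch does not depend on that operator being available.

```r
library(ggplot2)

PositionJitterSketch <- ggproto("PositionJitterSketch", PositionJitter,
  # The default lives on the class, so it applies however the object is created.
  seed = NA,

  setup_params = function(self, data) {
    seed <- self$seed
    # NA means "draw a fresh seed"; NULL means "use the global RNG as-is".
    if (!is.null(seed) && is.na(seed)) {
      seed <- sample.int(.Machine$integer.max, 1L)
    }
    width <- self$width
    if (is.null(width)) width <- resolution(data$x, zero = FALSE) * 0.4
    height <- self$height
    if (is.null(height)) height <- resolution(data$y, zero = FALSE) * 0.4
    list(width = width, height = height, seed = seed)
  }
)

df <- data.frame(x = c(1, 2, 1, 2), y = c(1, 1, 2, 2))

# Created without a constructor, the NA default is still resolved to a seed.
params <- ggproto(NULL, PositionJitterSketch)$setup_params(df)
str(params)
```

With the seed resolved in `setup_params()`, an object created directly from the class and one created through a constructor would go through the same code path.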

alanmejiamaza · 2020-06-16T12:53:21Z

Hi,

Is it possible to plot the dots organised using geom_jitter?

hadley added the bug (an unexpected problem or unintended behavior) and layers 📈 labels Apr 27, 2018

hadley added this to the v2.3.0 milestone Apr 27, 2018

hadley removed this from the v2.3.0 milestone May 1, 2018

paleolimbot added positions 🥇 and removed layers 📈 labels Jul 7, 2019

thomasp85 added this to the ggplot2 3.3.4 milestone Mar 25, 2021

thomasp85 mentioned this issue Apr 12, 2021

Move seed setup in position_jitter() to setup_params() from the constructor #4413

Merged

thomasp85 closed this as completed in #4413 Apr 13, 2021

yutannihilation added a commit to yutannihilation/gghighlight that referenced this issue May 20, 2021

Skip test about a workaround for tidyverse/ggplot2#2507 (#166)

27b0901

yutannihilation mentioned this issue May 20, 2021

Issues with next ggplot2 release yutannihilation/gghighlight#165

Closed

yutannihilation mentioned this issue Sep 5, 2024

Remove a workaround for jitter in ggplot2 <3.3.4 yutannihilation/gghighlight#207

Merged
