Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation request: how to compare environments? #127

Closed
MichaelChirico opened this issue Mar 9, 2022 · 10 comments · Fixed by #133
Closed

Documentation request: how to compare environments? #127

MichaelChirico opened this issue Mar 9, 2022 · 10 comments · Fixed by #133

Comments

@MichaelChirico
Copy link
Contributor

As part of testthat 3e migration for lintr (r-lib/lintr#910), I'm trying to replace this expect_equal() test comparing two environments whose content is identical:

https://github.com/r-lib/lintr/blob/aafba478622c6f43c3fb329c69945c36a4d94b68/tests/testthat/test-cache.R#L164

With <3e, it dispatches to all.equal() and voila.

With the stricter tests in waldo::compare(), however, it's not so simple.

I was unable in a brief search to determine what one should do to compare environments themselves:

  • there's no compare_proxy.environment
  • ?waldo::compare only mentions environments as regards ignore_formula_env and ignore_function_env (not really relevant here)
  • expect_equal(e1, e2, ignore_attr=TRUE) doesn't work, which seems to partially contradict ?waldo::compare ("For backward compatibility with all.equal(), you can also use TRUE..." -- which suggests we will get equivalent behavior to the old expect_equal() by setting ignore_attr=TRUE)
  • no mention of environments on the landing page: https://waldo.r-lib.org/
  • no vignettes

So I'm left wondering what's the recommended approach here.

The simplest fix is expect_identical(as.list(e1), as.list(e2)), but it does feel a bit hacky.

@hadley
Copy link
Member

hadley commented Mar 9, 2022

What properties of the environment are causing the problem? i.e. what do you want to ignore?

In general, all.equal() isn't a great reference because it does so much behind the scenes. In this case, IIRC it ignores the parent environment, which is actually a really important part of the environment.

@hadley
Copy link
Member

hadley commented Mar 9, 2022

Ooops I forgot how waldo works, and now I get what you mean:

e1 <- new.env(parent = emptyenv())
e2 <- new.env(parent = emptyenv())

e1$x <- 1
e2$x <- 1

waldo::compare(e1, e2)
#> `old` is <env:0x7fbe6f31de20>
#> `new` is <env:0x7fbe6f374670>

Created on 2022-03-09 by the reprex package (v2.0.1)

It's comparing the environments by reference, where you want to compare them by value.

@hadley
Copy link
Member

hadley commented Mar 9, 2022

Maybe we need environment_as_map = TRUE ? Or environment_by_value = TRUE?

@MichaelChirico
Copy link
Contributor Author

right.

similar but related issue would be improving the output there... not very actionable to see the hashes of the environments are different (esp. if I don't recognize that's what it's telling me)

@hadley
Copy link
Member

hadley commented Mar 9, 2022

That's not a hash, but a memory address, so it's just telling you that they're not the same environment, which is not very useful. Maybe waldo should just compare unnamed environments (i.e. environments that aren't empty/global/base/package) by value by default?

I think it is ok to use identical() to compare named environments because I think this is useful feedback, and there's no need to recurse into the specific values:

waldo::compare(emptyenv(), baseenv())
#> `old` is <env:empty>
#> `new` is <env:package:base>

@hadley
Copy link
Member

hadley commented Mar 10, 2022

Probably to resolve #117 to fix this to avoid infloops:

x <- new.env()
x$x <- x

@AshesITR
Copy link

AshesITR commented Mar 10, 2022

I actually appreciate the default behaviour for unnamed envs. It helped me discover some problems with complex data structures involving environments in the past.

Default FALSE arguments for env_as_map and / or ignore_parent_env might be useful?

The former would compare elements recursively by value, the latter would allow different parent envs which is only useful with env_as_map = TRUE but I'd still want to know by default if my two compare-by-value envs have different parents.

@MichaelChirico
Copy link
Contributor Author

I think it is ok to use identical() to compare named environments because I think this is useful feedback, and there's no need to recurse into the specific values:

Agree with this behavior -- we can tell at a glance that they're different environments and get something actionable to go on. Whereas (as I think we agree) for the unnamed env case it's "well this jumble of letters is not the same of that one... ok"

The rudimentary version I wrote in r-lib/lintr#910 partially avoids the self-reference problem by using expect_identical() for the objects in ls(), but a more general solution would handle recursive unnamed environments more deftly.

@MichaelChirico
Copy link
Contributor Author

In general, all.equal() isn't a great reference because it does so much behind the scenes. In this case, IIRC it ignores the parent environment, which is actually a really important part of the environment.

to clarify, only mentioned all.equal since that's what 2nd edition expect_equal() ultimately does and this arose during a 2nd->3rd edition refactor. I agree the stronger test in 3e is generally better, just needs a better UX to diagnose failures (somewhere 3e usually excels)

@hadley
Copy link
Member

hadley commented Mar 10, 2022

Take a look at #133; I'm not 100% certain I'm handling the self-reference case 100% correctly, but I think it's pretty close and it shouldn't affect lintr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants