Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New linter for assignments within conditional expressions #1777

Closed
IndrajeetPatil opened this issue Nov 25, 2022 · 16 comments · Fixed by #1814
Closed

New linter for assignments within conditional expressions #1777

IndrajeetPatil opened this issue Nov 25, 2022 · 16 comments · Fixed by #1814

Comments

@IndrajeetPatil
Copy link
Collaborator

In the wild, I have come across cases where variables are assigned within conditional expressions and their value is then immediately used as a condition and/or inside their scope. I think this makes the code difficult to understand, and we should lint it.

WDYT?

Here is a reprex with a simple assignment, but you can imagine far more complex assignments and the difficulty to understand the code increases with the complexity of the assignment.

library(lintr)

if (x <- 1L) print(x)
#> [1] 1

# currently, this doesn't generate any lints
lint(
  text = "if (x <- 1L) print(x)",
  linters = linters_with_tags(tags = NULL)
)

Created on 2022-11-25 with reprex v2.0.2

The suggested alternative here in the lint message would be the following (i.e. do variable assignment outside the expression and then use the variable instead):

x <- 1L
if (x) print(x)
@MichaelChirico
Copy link
Collaborator

Personally I use these a lot, and am a big fan, but agree their usage should be pretty limited. There are two cases that come to mind when they are usually a good idea:

  1. Scoped assignments, e.g. if (any(idx <- is_bad(x))) stop("Bad entries: ", toString(which(idx))). idx is only needed within the error region, assigning outside the error region is IMO more confusing since in the typical code flow it needn't be considered
  2. Lazy evaluation, e.g. if (x && (idx <- foo(y))) bar(idx). Moving the assignment outside the if() can be inefficient if foo() is slow and x is usually FALSE.

Unfortunately, my own opinion doesn't mean much here :)

Because this is explicitly called out as bad usage in the style guide, I believe we should offer a linter and make it a default:

https://style.tidyverse.org/syntax.html#assignment

@IndrajeetPatil
Copy link
Collaborator Author

Thanks for sharing your use cases! Very helpful to clarify my thinking about this issue.

For the second use case, wouldn't it be better to do something like if (x) bar(foo(y)) instead?
But maybe the actual use case is much more complicated than this and can't be reduced to the alternative pattern I am thinking.

As for the first case, I totally see the utility of doing this!
In fact, what made me think of this issue wasn't the style guide, but the idea discussed in The Art of Readable Code book: make your variable visible by as few lines of code as possible.

Screenshot 2022-11-27 at 09 47 36

And, of course, unlike C++ and Java, R doesn't have block scope, and so I thought of this issue.
But your first use case shows that this idea can still be used to improve readability of R code by restricting the assignments of variables needed only in error conditions to the conditional expressions.

But, as you noted, these can be nolint-ed, and the default should be what the style guide is recommending.

@MichaelChirico
Copy link
Collaborator

Great citation, I agree with it completely.

@IndrajeetPatil IndrajeetPatil changed the title Should assignments within conditional expressions be linted? New linter for assignments within conditional expressions Dec 8, 2022
IndrajeetPatil added a commit that referenced this issue Dec 8, 2022
@IndrajeetPatil
Copy link
Collaborator Author

In addition to capture.output(), are there any other functions that capture side effects and should be included as exceptions?

Screenshot 2022-12-08 at 19 32 17 1

@Bisaloo
Copy link
Contributor

Bisaloo commented Dec 8, 2022

In addition to capture.output(), are there any other functions that capture side effects and should be included as exceptions?

Some testthat functions maybe:

https://github.com/r-lib/testthat/blob/b6eb3b1c67d0d27eb97d83c02df953c230dac876/R/expect-condition.R#L92-L94

# To test message and output, store results to a variable
expect_warning(out <- f(-1), "already negative")
expect_equal(out, -1)

@IndrajeetPatil
Copy link
Collaborator Author

Great suggestions! Thanks.

I guess we should include most of the functions included here.

@AshesITR
Copy link
Collaborator

AshesITR commented Dec 8, 2022

NB the preferred way of using these expectations is to reuse the return value, i.e. nest / pipe expectations, so I would suggest to not suppress these lints by default.

expect_warning(expect_equal(log(-1), NaN))

expect_equal(log(-1), NaN) |>
  expect_warning()

However, the following usages come to my mind:

quote(a <- 1)
bquote(a <- 1)
expression(a <- 1)
test_that("my test", {
  a <- 1
  expect_equal(a, 1)
})
x <- local({
  a <- 1
  2 * a
})

@IndrajeetPatil
Copy link
Collaborator Author

Nice, thanks!

These are a lot of exceptions to handle. Here is what I was thinking:

We can add a parameter to this linter called exceptions.

We can also create a new function called default_implicit_assignment_exceptions(), export it, and have it as a default argument for exceptions parameter. This will make it fully customizable for users to see which exceptions are included and modify the vector further. WDYT?

# default
implicit_assignment_linter(exceptions = default_implicit_assignment_exceptions())

# custom
implicit_assignment_linter(exceptions = c(default_implicit_assignment_exceptions(), "f", "g"))
implicit_assignment_linter(exceptions = c("foo1", "foo2"))
...

@AshesITR
Copy link
Collaborator

AshesITR commented Dec 8, 2022

The parameter should be called except to be consistent with backports_linter(), duplicate_argument_linter() and missing_argument_linter():

#' @param except Character vector of functions to be excluded from linting.

Not sure about default_..._exceptions as a constant.
Let's see how many we really need. If the XPath doesn't capture assignments within braces, the number of default exceptions should be manageable as a default argument value, making the defaults more apparent in the docs.

@IndrajeetPatil
Copy link
Collaborator Author

This is what it looks like right now. And we haven't even included anything from {rlang} yet:

except = c(
  "bquote", "capture.output", "expression",
  "expect_error", "expect_warning", "expect_message", "expect_condition",
  "expect_no_error", "expect_no_warning", "expect_no_message", "expect_no_condition",
  "expect_invisible", "expect_visible", "expect_output", "expect_silent",
  "local", "quote", "test_that"
)

@AshesITR
Copy link
Collaborator

AshesITR commented Dec 8, 2022

I'm arguing for removing expect_* from that list.
Also, we should test whether test_that() and local() is even necessary, since they have an OP-LEFT-BRACE node between the assignment and the SYMBOL_FUNCTION_CALL.

@MichaelChirico
Copy link
Collaborator

IIRC expect_warning() now returns the warning object instead of the expression output, so expect_warning would need such an exception; ditto expect_silent()

@AshesITR
Copy link
Collaborator

AshesITR commented Dec 8, 2022

Still not 100% sure this necessitates excepting expect_warning()

e.g. even in case the value needs to be re-used in a subsequent test, you can do

# bad
expect_warning(prepared_thing <- my_setup_with_warning(), "setup warning")
expect_equal(funny_function(prepared_thing), funny_value) |>
  expect_warning("funny warning")

# good
my_setup_with_warning() |>
  funny_function() |>
  expect_equal(funny_value) |>
  expect_warning("setup warning") |>
  expect_warning("funny warning")

This is also how we structure our tests that expect warnings.

@IndrajeetPatil
Copy link
Collaborator Author

But even {testthat} docs showcase this as a good way of using these functions:

#' # To test message and output, store results to a variable
#' expect_warning(out <- f(-1), "already negative")

And, even in our test suite, we do the same:

expect_warning(l_invalid_loc1 <- xml_nodes_to_lints(
xml = node,
source_expression = expr,
lint_message = "lint_msg",
column_number_xpath = xp_column_number,
range_start_xpath = xp_invalid,
range_end_xpath = xp_range_end
), rex::rex("Could not find range start for lint. Defaulting to start of line."))

@AshesITR
Copy link
Collaborator

AshesITR commented Dec 8, 2022

Okay, I'm convinced 😇

@IndrajeetPatil
Copy link
Collaborator Author

TBH, I agree with what you consider to be good style to write these tests.

But, as this is a default linter and the "bad" style in question seems to be acceptable by style guide and widely used, we better support it; otherwise, users might be surprised by the discrepancy between lintr and the style guide.

IndrajeetPatil added a commit that referenced this issue Dec 8, 2022
* Implement `implicit_assignment_linter()`

Closes #1777

* remove new lints

* exception for functions that capture side-effect

* link to style guide

* description

* get rid of false positives for control flows

* add failing test

* Update XPath

* expand on the exceptions (to be revised)

* more tests

* order by package

* fix test

* more rlang

* tests for lambdas

* reformat vector

* also cover `for` loop

* remove lint
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants