feat: Catch panic to generate report and reraise #7341

Merged: 11 commits merged into master from mg/feat/panic-reporter on Oct 1, 2021

Contributor

@placer14 placer14 commented Sep 17, 2021

an attempt at #7315

Room for improvement?

  • expose journal lookback param as envvar (done)
  • report rotation param and exposing as envvar?
  • dump more things?
  • maybe don't catch such a broad surface and find more narrow scopes to recover from?
  • make each report optional (similar to GOLOG_LOG_LEVEL_NAMED)?

build/panic_reporter.go (outdated, resolved)
build/panic_reporter.go (outdated, resolved)
cli/helper.go Outdated
defer func() {
	if r := recover(); r != nil {
		// Generate report in LOTUS_PATH and re-raise panic
		build.GeneratePanicReport(os.Getenv("LOTUS_PATH"), "app_panic")
Contributor

@travisperson travisperson Sep 17, 2021

This isn't guaranteed to be set, and each binary may have its own location, e.g. $LOTUS_MINER_PATH, $LOTUS_WORKER_PATH, etc.

Have you tried to see if you can run this recovery in the After method of urfave cli? This will give you access to the urfave cli context which will allow you to use the cli flags and avoid using os.Getenv.

Being able to customize the location where the reports are generated would also be amazing. Something like LOTUS_PANIC_REPORTS or similar that is accepted on every binary would be great. It could default into the LOTUS_*_PATH. You can use app.Name to do a lookup for the correct cli flag to read from to get the correct repo path for defaults + overrides using env vars.

The second argument could then also be the application name as well. This in conjunction with a LOTUS_PANIC_REPORTS would mean someone could have all their panics sent to a single location and then identified by the label.

Contributor Author

I used the local context instead of a lookup based on app.Name. LMK if you think this is sufficient. (Was trying to avoid brittle lookup regressions if names change.) I think I got all of your suggestions in here now.

@placer14 placer14 marked this pull request as ready for review September 17, 2021 22:15
@placer14 placer14 requested a review from a team as a code owner September 17, 2021 22:15
build/panic_reporter.go (outdated, resolved)
build/panic_reporter.go (outdated, resolved)
build/panic_reporter.go (resolved)
@placer14 placer14 force-pushed the mg/feat/panic-reporter branch from 9c83b40 to 78cd690 Compare September 20, 2021 15:48
@placer14 placer14 requested a review from magik6k September 20, 2021 15:48
@placer14 placer14 force-pushed the mg/feat/panic-reporter branch from 78cd690 to 926858a Compare September 20, 2021 19:41
@codecov

codecov bot commented Sep 20, 2021

Codecov Report

Merging #7341 (2f8a2fc) into master (4fc78bf) will decrease coverage by 0.00%.
The diff coverage is 1.51%.

@@            Coverage Diff             @@
##           master    #7341      +/-   ##
==========================================
- Coverage   39.09%   39.08%   -0.01%     
==========================================
  Files         614      615       +1     
  Lines       64997    65125     +128     
==========================================
+ Hits        25408    25452      +44     
- Misses      35172    35263      +91     
+ Partials     4417     4410       -7     
Impacted Files                      Coverage Δ
build/panic_reporter.go             0.00% <0.00%> (ø)
cmd/lotus-miner/main.go             4.27% <0.00%> (-0.45%) ⬇️
cmd/lotus-seal-worker/main.go       0.00% <0.00%> (ø)
cmd/lotus/main.go                   0.00% <0.00%> (ø)
build/version.go                    25.00% <100.00%> (ø)
markets/loggers/loggers.go          89.28% <0.00%> (-10.72%) ⬇️
node/hello/hello.go                 63.63% <0.00%> (-3.41%) ⬇️
chain/messagepool/selection.go      80.12% <0.00%> (-0.41%) ⬇️
chain/store/store.go                63.31% <0.00%> (-0.36%) ⬇️
chain/messagepool/messagepool.go    56.93% <0.00%> (-0.25%) ⬇️
... and 15 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4fc78bf...2f8a2fc.

Comment on lines 82 to 84
ignoreCommitBefore := os.Getenv("LOTUS_VERSION_IGNORE_COMMIT")
os.Setenv("LOTUS_VERSION_IGNORE_COMMIT", "") //nolint:errcheck
defer os.Setenv("LOTUS_VERSION_IGNORE_COMMIT", ignoreCommitBefore) //nolint:errcheck
Contributor

It would be much cleaner to just do build.BuildVersion + build.BuildType() + build.CurrentCommit (Just make buildType public)

Contributor Author

I opted to keep version func calls private until we know we want it to live external to build package. Didn't want to expose more if we don't really need to.

Contributor

There was no good reason for it to be private other than nothing needed it to be public before

Contributor Author

@placer14 placer14 Sep 22, 2021

Okay, sounds like you really want it. 574b5c03d

@placer14 placer14 force-pushed the mg/feat/panic-reporter branch from 7525e74 to 574b5c0 Compare September 22, 2021 13:58
@jennijuju jennijuju added this to the v1.13.0 milestone Sep 26, 2021
}
}

syscall.Umask(0)
Contributor

Any specific reason to do this?

Contributor

(If yes, would add a comment explaining why)

Contributor Author

I was debugging something and accidentally committed it. Removed.

build/panic_reporter.go (outdated, resolved)
build/panic_reporter.go (outdated, resolved)
@placer14 placer14 requested a review from magik6k September 28, 2021 15:32
Contributor

@magik6k magik6k left a comment

panic_reporter.go looks good, just some nits/questions on the report paths in miner/worker

cmd/lotus-seal-worker/main.go (outdated, resolved)
cmd/lotus-miner/main.go (outdated, resolved)
cmd/lotus-miner/main.go (outdated, resolved)
Co-authored-by: Łukasz Magiera <magik6k@users.noreply.github.com>
Contributor

@magik6k magik6k left a comment

Would be great to have a test for this someday, but it looks like it should work

@magik6k magik6k merged commit 95e8b59 into master Oct 1, 2021
@magik6k magik6k deleted the mg/feat/panic-reporter branch October 1, 2021 09:50
4 participants