Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

catch potential write errors on fatal flush sync #740

Merged
merged 5 commits into from
Nov 18, 2019

Conversation

allevo
Copy link
Contributor

@allevo allevo commented Nov 11, 2019

fix #739

This PR allows the developer to choose if flush sync on fatal.
The default is false

Bench:

this branch

% npm run bench-basic

> pino@5.13.6 bench-basic /Users/tommasoallevi/Downloads/repo/pino
> node benchmarks/utils/runbench basic

Running BASIC benchmark

benchBunyan*10000: 482.314ms
benchWinston*10000: 364.687ms
benchBole*10000: 348.318ms
benchDebug*10000: 225.755ms
benchLogLevel*10000: 216.953ms
benchPino*10000: 460.753ms
benchPinoExtreme*10000: 185.467ms
benchPinoNodeStream*10000: 175.491ms
benchBunyan*10000: 408.892ms
benchWinston*10000: 283.859ms
benchBole*10000: 181.260ms
benchDebug*10000: 225.092ms
benchLogLevel*10000: 203.503ms
benchPino*10000: 395.198ms
benchPinoExtreme*10000: 178.076ms
benchPinoNodeStream*10000: 148.326ms

==========
BASIC benchmark averages
Bunyan average: 445.603ms
Winston average: 324.273ms
Bole average: 264.789ms
Debug average: 225.423ms
LogLevel average: 210.228ms
Pino average: 427.976ms
PinoExtreme average: 181.772ms
PinoNodeStream average: 161.909ms
==========
System: Darwin/darwin x64 19.0.0 ~ Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz (cores/threads: 8)

master

% npm run bench-basic

> pino@5.13.6 bench-basic /Users/tommasoallevi/Downloads/repo/pino
> node benchmarks/utils/runbench basic

Running BASIC benchmark

benchBunyan*10000: 465.974ms
benchWinston*10000: 371.409ms
benchBole*10000: 363.848ms
benchDebug*10000: 235.101ms
benchLogLevel*10000: 216.776ms
benchPino*10000: 389.143ms
benchPinoExtreme*10000: 184.884ms
benchPinoNodeStream*10000: 165.704ms
benchBunyan*10000: 392.238ms
benchWinston*10000: 286.630ms
benchBole*10000: 172.467ms
benchDebug*10000: 223.721ms
benchLogLevel*10000: 204.255ms
benchPino*10000: 389.809ms
benchPinoExtreme*10000: 190.509ms
benchPinoNodeStream*10000: 147.684ms

==========
BASIC benchmark averages
Bunyan average: 429.106ms
Winston average: 329.019ms
Bole average: 268.158ms
Debug average: 229.411ms
LogLevel average: 210.516ms
Pino average: 389.476ms
PinoExtreme average: 187.696ms
PinoNodeStream average: 156.694ms
==========
System: Darwin/darwin x64 19.0.0 ~ Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz (cores/threads: 8)

@davidmarkclements
Copy link
Member

Instead of the flag can we use a try/catch of a sync write and then fallback to the non-sync write in the catch

@allevo
Copy link
Contributor Author

allevo commented Nov 11, 2019

Thanks for the fast feedback.
Does try/catch drop the performance down?

@davidmarkclements
Copy link
Member

davidmarkclements commented Nov 11, 2019 via email

@davidmarkclements
Copy link
Member

I’m with @mcollina right now, we’re discussing it. If the sync write fails the async write will fail too (but silently). So if we want silent failure of the write it should be wrapped in the try catch but with an empty catch. Or rethrow with a more descriptive error

@allevo
Copy link
Contributor Author

allevo commented Nov 11, 2019

If the sync write fails the async write will fail too are you sure? Is this still valid for a network socket?

@mcollina
Copy link
Member

What's your use case of flushing data on fatal? If it's something like the following (which is the correct meaning of a fatal log):

logger.fatal('something went really wrong, crashing')
process.exit(1)

In this case, if we remove the flushSync(), your fatal log message might not be logged if another write is in progress.
Digging deep specifically in #739, the socket has its buffer full (that's the meaning of EAGAIN) and as a result your fatal log message will be lost when you do process.exit(1) anyway.

Is a bad crash better or worse than a swallowed log message?

@allevo
Copy link
Contributor Author

allevo commented Nov 13, 2019

It's clear the reason for flushSync. I prefer to lose a message instead of crash the app.

NB:

logger.fatal('something went really wrong, crashing')
await closeAllFD()
process.exit(1)

this lines work differently allowing the fatal log to crash the app without a fd cleaning up.

I've implemented the try catch without the retry.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mcollina
Copy link
Member

Can you update the title of this PR?

@allevo allevo changed the title Add forceFlushOnFatal flag Catch error of flushSync Nov 14, 2019
@allevo
Copy link
Contributor Author

allevo commented Nov 14, 2019

Done

@davidmarkclements davidmarkclements changed the title Catch error of flushSync catch potential write errors on fatal flush sync Nov 14, 2019
@davidmarkclements
Copy link
Member

I'm not entirely sure we should swallow the error though, wouldn't we want to let the user know that their final log message wasn't written? Instead shouldn't we rethrow an error explaining that?

lib/levels.js Outdated
try {
stream.flushSync()
} catch (e) {
// do nothing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think we should do something. @mcollina @jsumners what's your thoughts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can't write a log message that the user will maybe see, what do you intend to do to alert them to that message not being written? From the limited understanding I have of the issue, at this point everything is fucked at this point so 🤷‍♂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right but it's worth knowing about? this just swallows it

if we throw an error there will be an understanding that something is wrong

or we can even just console.warn if throwing seems too drastic

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the try/catch and it will throw. Wasn't the point of this to remove the throw?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. but the message thrown has clearly caused confusion, we could throw an error with more of an explanation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♂ @mcollina

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think swallowing the error is ok here. Essentially we can not write a log line that says we are dying by a fatal error. Notifying a developer is usually done via log files.. which we cannot use in this stance. I think this is correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compromise: update the useless “do nothing” comment with a link to the discussion.

Copy link
Member

@davidmarkclements davidmarkclements left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't necessarily a request changes, this is a request for comment from @mcollina and @jsumners

lib/levels.js Outdated Show resolved Hide resolved
Copy link
Member

@davidmarkclements davidmarkclements left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your patience @allevo - if you're happy to accept the suggestion we should be good to merge

Co-Authored-By: David Mark Clements <david.mark.clements@gmail.com>
test/levels.test.js Outdated Show resolved Hide resolved
@allevo
Copy link
Contributor Author

allevo commented Nov 15, 2019

You are welcome. Opinion sharing is needed to build better software 😄

@mcollina
Copy link
Member

mcollina commented Nov 15, 2019 via email

@mcollina mcollina merged commit 919f5ad into pinojs:master Nov 18, 2019
@allevo allevo deleted the fix/739 branch November 18, 2019 09:01
@github-actions
Copy link

github-actions bot commented Feb 3, 2022

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"EAGAIN: resource temporarily unavailable, write" on fatal
4 participants