Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-105623 Fix performance degradation in logging RotatingFileHandler #105887

Merged
merged 7 commits into from
Jun 27, 2024

Conversation

zhatt
Copy link
Contributor

@zhatt zhatt commented Jun 17, 2023

The check for whether the log file is a real file is expensive on NFS filesystems. This commit reorders the rollover condition checking to not do the file type check if the expected file size is less than the rotation threshold.

…ndler

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.
@cpython-cla-bot
Copy link

cpython-cla-bot bot commented Jun 17, 2023

All commit authors signed the Contributor License Agreement.
CLA signed

@arhadthedev arhadthedev added the performance Performance or resource usage label Jun 17, 2023
…G06od.rst

Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
@vsajip vsajip added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 19, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @vsajip for commit 84ad7d8 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 19, 2023
@arhadthedev
Copy link
Member

From buildbots:

Objects/object.c:2197: _Py_ForgetReference: Assertion failed: invalid object chain
Enable tracemalloc to get the memory block allocation traceback
object address  : 0x7ff344737770
object refcount : 0
object type     : 0x560543494660
object type name: tuple
object repr     : <refcnt 0 at 0x7ff344737770>

This PR is not the cause (Python-only change cannot break reference counting). However, we need to fix the crash before merging the PR.

Comment on lines -190 to -198
# See bpo-45401: Never rollover anything other than regular files
if os.path.exists(self.baseFilename) and not os.path.isfile(self.baseFilename):
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with a local file-system this is a substantive chunk of the run time

Screenshot 2023-07-12 at 6 22 51 PM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bdraco I'm not familiar with the report generated by py-spy. Are you seeing that the os.path.exists() and os.path.isfile() are expensive or the self.format()? The self.format() is probably a different bug that needs fixed since rollover is formatting the message once to get its length and then emit later formats it again.

Copy link
Contributor

@bdraco bdraco Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect the format and emit to be expensive here (relatively ... the logger is quite busy). What is unexpected is the cost of os.path.exists and os.path.isfile which is solved by this PR.

Sorry for the lack of clarity. This looks like a good fix and I wanted to point out the problem is not limited to NFS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under normal usage, self.baseFilename doesn't change. Perhaps we just cache the result of the existence/is-file test when we open the file? Wouldn't that be a better approach?

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. In the current paradigm it is the right solution.

I wonder whether it would be better to perform this check in doRollover(), before renaming files. But this is a different and not so easy issue.

@fantabolous
Copy link

fantabolous commented Jun 27, 2024

Thanks, in my rather logging-heavy Windows NTFS project, shouldRollover went from 25-35% of cpu to <2% after monkey patching this in. (According to yappi.)

@serhiy-storchaka serhiy-storchaka enabled auto-merge (squash) June 27, 2024 16:36
@serhiy-storchaka serhiy-storchaka merged commit e9b4ec6 into python:main Jun 27, 2024
33 checks passed
@serhiy-storchaka serhiy-storchaka added needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Jun 28, 2024
@miss-islington-app
Copy link

Thanks @zhatt for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington-app
Copy link

Thanks @zhatt for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 28, 2024
…ndler (pythonGH-105887)

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.

(cherry picked from commit e9b4ec6)

Co-authored-by: Craig Robson <craig@zhatt.com>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 28, 2024
…ndler (pythonGH-105887)

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.

(cherry picked from commit e9b4ec6)

Co-authored-by: Craig Robson <craig@zhatt.com>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
@bedevere-app
Copy link

bedevere-app bot commented Jun 28, 2024

GH-121116 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Jun 28, 2024
@bedevere-app
Copy link

bedevere-app bot commented Jun 28, 2024

GH-121117 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Jun 28, 2024
@serhiy-storchaka
Copy link
Member

Thank you for your contribution @zhatt, and thank you for reminder @fantabolous. I forgot about this PR.

serhiy-storchaka pushed a commit that referenced this pull request Jun 28, 2024
…andler (GH-105887) (GH-121116)

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.

(cherry picked from commit e9b4ec6)

Co-authored-by: Craig Robson <craig@zhatt.com>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
serhiy-storchaka pushed a commit that referenced this pull request Jun 28, 2024
…andler (GH-105887) (GH-121117)

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.

(cherry picked from commit e9b4ec6)

Co-authored-by: Craig Robson <craig@zhatt.com>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
mrahtz pushed a commit to mrahtz/cpython that referenced this pull request Jun 30, 2024
…ndler (pythonGH-105887)

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.

Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
noahbkim pushed a commit to hudson-trading/cpython that referenced this pull request Jul 11, 2024
…ndler (pythonGH-105887)

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.

Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024
…ndler (pythonGH-105887)

The check for whether the log file is a real file is expensive on NFS
filesystems.  This commit reorders the rollover condition checking to
not do the file type check if the expected file size is less than the
rotation threshold.

Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants