
Move write aggregation memory copy out of vq_lock #8890

Merged: 1 commit merged into openzfs:master from amotin:vq_lock on Jun 13, 2019

Conversation

@amotin (Member) commented Jun 12, 2019

Motivation and Context

A memory copy is too heavy an operation to perform under a congested lock.
Moving it out of the lock reduces contention dramatically, to the point of
being almost invisible. Since the original zio has already been removed from
the queue and the child zio has not been executed yet, I don't see why the
copy would need protection. My guess is that it simply remained this way from
the time when the lock was not yet dropped here; the drop was added later to
fix a lock ordering issue.
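As a rough illustration of the lock-scope change being proposed, here is a minimal userspace C sketch. The names (queue_t, io_t, aggregate_*) are made up for the example and are not the actual vdev_queue.c code; only the locking pattern matters.

/*
 * Sketch only: dequeue under the mutex, then perform the expensive
 * copy after dropping it.  Not the real ZFS implementation.
 */
#include <pthread.h>
#include <string.h>

typedef struct io {
    struct io *next;
    char *data;
    size_t size;
} io_t;

typedef struct queue {
    pthread_mutex_t lock;   /* stands in for vq->vq_lock */
    io_t *head;
} queue_t;

/* Before: the copy is done while holding the (congested) lock. */
static io_t *
aggregate_copy_locked(queue_t *q, char *agg_buf)
{
    pthread_mutex_lock(&q->lock);
    io_t *zio = q->head;
    if (zio != NULL) {
        q->head = zio->next;                    /* queue removal */
        memcpy(agg_buf, zio->data, zio->size);  /* heavy copy under lock */
    }
    pthread_mutex_unlock(&q->lock);
    return (zio);
}

/* After: only the queue manipulation stays under the lock. */
static io_t *
aggregate_copy_unlocked(queue_t *q, char *agg_buf)
{
    pthread_mutex_lock(&q->lock);
    io_t *zio = q->head;
    if (zio != NULL)
        q->head = zio->next;                    /* still protected */
    pthread_mutex_unlock(&q->lock);

    /*
     * The zio is no longer reachable from the queue and the child
     * (aggregate) I/O has not been issued yet, so no one else touches
     * its buffer: the copy itself needs no lock.
     */
    if (zio != NULL)
        memcpy(agg_buf, zio->data, zio->size);
    return (zio);
}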

Description

How Has This Been Tested?

I've tested this patch on FreeBSD head on a dual-socket Xeon 4110.
Sequential writes to 12 ZVOLs on both HDD and SSD striped pools,
with ZVOL block sizes of 4KB, 16KB, 64KB, and 128KB, all show a major
reduction in lock contention, saving 15% to 35% of CPU time
and increasing throughput by 10% to 40%.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
@amotin (Member, Author) commented Jun 12, 2019

@ahrens @behlendorf You've touched this code before; what do you think? Any objections?

@ryao (Contributor) left a comment


LGTM

@behlendorf (Contributor) left a comment


Great timing; I happened to be looking at contention on this same lock. I agree: as long as the vdev_queue_io_remove operation is kept under the lock, it's safe to move the copy itself outside.

Related to this, I suggest also taking a look at the vdev_queue_io_{add,remove} and vdev_queue_pending_{add,remove} functions. They're called under the same queue lock but may end up blocked on a per-pool kstat lock. This doesn't currently appear to be a major bottleneck, but it clearly isn't a good thing.
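To make the concern concrete, here is a hedged userspace sketch of that nesting; pool_stats_t, kstat_lock, and queue_io_add are illustrative names, not the ZFS code. While the per-vdev queue lock is held, the thread can also block on a statistics lock shared by the whole pool, so unrelated queues end up serializing on it.

#include <pthread.h>
#include <stdint.h>

typedef struct pool_stats {
    pthread_mutex_t kstat_lock;     /* shared by all vdev queues */
    uint64_t ops;
} pool_stats_t;

typedef struct vdev_queue {
    pthread_mutex_t vq_lock;        /* per-vdev queue lock */
    pool_stats_t *stats;
    int depth;
} vdev_queue_t;

static void
queue_io_add(vdev_queue_t *vq)
{
    pthread_mutex_lock(&vq->vq_lock);
    vq->depth++;

    /*
     * Nested acquisition: while vq_lock is held we may block on the
     * pool-wide kstat lock, extending vq_lock hold times whenever any
     * other queue is updating the same statistics.
     */
    pthread_mutex_lock(&vq->stats->kstat_lock);
    vq->stats->ops++;
    pthread_mutex_unlock(&vq->stats->kstat_lock);

    pthread_mutex_unlock(&vq->vq_lock);
}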

@behlendorf behlendorf added the Status: Code Review Needed label Jun 12, 2019
@behlendorf behlendorf requested a review from ahrens June 12, 2019 19:27
@ahrens ahrens added the Type: Performance label Jun 12, 2019
@amotin (Member, Author) commented Jun 12, 2019

> Related to this I suggest also taking a look at the vdev_queue_io_{add,remove} and vdev_queue_pending_{add,remove} functions. They're called under the same queue lock but may end up blocked on a per-pool kstat lock. This doesn't currently appear to be a major bottleneck, but clearly isn't a good thing.

In FreeBSD's native ZFS those parts are commented out as illumos-isms; that is why I don't see them in the profiler. I can imagine that a pool-wide lock in those places must be terrible under high IOPS.

@richardelling (Contributor) commented

Per-pool and per-objset kstats can be aggsummed, eliminating their locks altogether.
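A hedged sketch of that idea in generic C11 follows; split_counter_t and its functions are illustrative stand-ins, not the actual ZFS aggsum API. Additions go to independent buckets with relaxed atomics, and the shared total is only computed when the statistic is read.

#include <stdatomic.h>
#include <stdint.h>

#define STAT_BUCKETS 16

typedef struct split_counter {
    _Atomic int64_t bucket[STAT_BUCKETS];
} split_counter_t;

/* Hot path: pick a bucket (e.g. by CPU or thread id) and add locklessly. */
static inline void
split_counter_add(split_counter_t *sc, unsigned hint, int64_t delta)
{
    atomic_fetch_add_explicit(&sc->bucket[hint % STAT_BUCKETS], delta,
        memory_order_relaxed);
}

/* Cold path: a read sums the buckets; slightly racy, which is fine for stats. */
static inline int64_t
split_counter_value(split_counter_t *sc)
{
    int64_t sum = 0;
    for (unsigned i = 0; i < STAT_BUCKETS; i++)
        sum += atomic_load_explicit(&sc->bucket[i], memory_order_relaxed);
    return (sum);
}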

@codecov codecov bot commented Jun 13, 2019

Codecov Report

Merging #8890 into master will increase coverage by 0.06%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #8890      +/-   ##
==========================================
+ Coverage   78.64%    78.7%   +0.06%     
==========================================
  Files         382      382              
  Lines      117808   117808              
==========================================
+ Hits        92645    92719      +74     
+ Misses      25163    25089      -74
Flag Coverage Δ
#kernel 79.32% <100%> (-0.06%) ⬇️
#user 67.25% <100%> (+0.29%) ⬆️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5662fd5...d00441c.

@behlendorf behlendorf added the Status: Accepted label and removed the Status: Code Review Needed label Jun 13, 2019
@behlendorf behlendorf merged commit ae5c78e into openzfs:master Jun 13, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 13, 2019
Memory copy is too heavy operation to do under the congested lock.
Moving it out reduces congestion by many times to almost invisible.
Since the original zio removed from the queue, and the child zio is
not executed yet, I don't see why would the copy need protection.
My guess it just remained like this from the time when lock was not
dropped here, which was added later to fix lock ordering issue.

Multi-threaded sequential write tests with both HDD and SSD pools
with ZVOL block sizes of 4KB, 16KB, 64KB and 128KB all show major
reduction of lock congestion, saving from 15% to 35% of CPU time
and increasing throughput from 10% to 40%.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes openzfs#8890
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 22, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 23, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 17, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 18, 2019
tonyhutter pushed a commit that referenced this pull request Sep 26, 2019
@amotin amotin deleted the vq_lock branch August 24, 2021 20:17
Labels
Status: Accepted, Type: Performance

5 participants