-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix mmap / libaio deadlock #7339
Conversation
f05b70c
to
16df384
Compare
Codecov Report
@@ Coverage Diff @@
## master #7339 +/- ##
==========================================
+ Coverage 76.33% 76.36% +0.02%
==========================================
Files 329 329
Lines 104191 104191
==========================================
+ Hits 79537 79564 +27
+ Misses 24654 24627 -27
Continue to review full report at Codecov.
|
How do we guarantee the page does not become evicted after the pre-fault and hence end up in the same page lock condition? |
Just to add, this fix works fine, but I'm concerned about the unlikely event we get page-eviction post the pre-fault and hence end up in the same page lock condition. |
@ColinIanKing good question, technically this is possible but we're a bit more protected than it initially appears. In this case, where we're faulting in a shared page, the newly faulted page will immediately be dirtied in
Which means that for this page to get evicted at a minimum We could probably close this race entirely by registering our own custom |
module/zcommon/zfs_uio.c
Outdated
cnt -= incr; | ||
|
||
if (rw == UIO_READ) | ||
error = -fault_in_pages_writeable(p, cnt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the provided reproducer, the destination page is the same as the source page. Wouldn't doing this modify the result?
Also, fault_in_pages_writeable seems to fault in at most 2 pages.
@behlendorf |
@tuxoko the contents of the page are supposed to be updated under the Probably the right way to handle this long term is to register our own
How do you mean? This does only cover one specific case and I agree it would be nice to update the test case to cover a few more variants of potential mmap / libaio races.
Are you sure? It looks to me like it'll fault in the entire range passed. |
Yes, but here we are reading the content of the page, not filling it or modifying it. There's also an
What I meant is the destination and source of uiomove is the same page, so if you prefault by writing zero to destination, wouldn't it modify the source as well?
It seems it depends on which version. https://elixir.bootlin.com/linux/v4.0.9/source/include/linux/pagemap.h#L542 |
Yes good point, and additionally we do hold the inode's range lock when when accessing the page via
Ahh, I see what you're getting at. I've dropped the prefault code as cleanup since it was was non-functional anyway and the updated version wasn't going to be 100% reliable.
That's unfortunate, but it appears we can do without. |
@behlendorf |
@behlendorf
|
By non-functional I meant that it wasn't actually needed, although clearly that isn't the case given that stack trace. I've updated commit message, and made the minimal change to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty straightforward since we don't actually need to lock the page for uiomove
.
It might be worth creating a separate issue for the (remote) possibility of a page being evicted after prefault, which would exhibit the same symptoms as removing the prefault code (for a partial write?).
@behlendorf diff --git a/module/zfs/zfs_vnops.c b/module/zfs/zfs_vnops.c
index 72224a7..14caa80 100644
--- a/module/zfs/zfs_vnops.c
+++ b/module/zfs/zfs_vnops.c
@@ -397,8 +397,11 @@ mappedread(struct inode *ip, int nbytes, uio_t *uio)
for (start &= PAGE_MASK; len > 0; start += PAGE_SIZE) {
bytes = MIN(PAGE_SIZE - off, len);
- pp = find_get_page(mp, start >> PAGE_SHIFT);
+ pp = find_lock_page(mp, start >> PAGE_SHIFT);
if (pp) {
+ ASSERT(PageUptodate(pp));
+ unlock_page(pp);
+
pb = kmap(pp);
error = uiomove(pb + off, bytes, UIO_READ, uio);
kunmap(pp); |
Refreshed. |
Calling uiomove() in mappedread() can result in deadlock if the user space page needs to be faulted in. Resolve the issue by only taking a reference on the page when copying it and not the page lock. The inode range lock protects against concurrent updates via zfs_read() and zfs_write(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#7335
Calling uiomove() in mappedread() under the page lock can result in a deadlock if the user space page needs to be faulted in. Resolve the issue by dropping the page lock before the uiomove(). The inode range lock protects against concurrent updates via zfs_read() and zfs_write(). Reviewed-by: Albert Lee <trisk@forkgnu.org> Reviewed-by: Chunwei Chen <david.chen@nutanix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#7335 Closes openzfs#7339
Calling uiomove() in mappedread() under the page lock can result in a deadlock if the user space page needs to be faulted in. Resolve the issue by dropping the page lock before the uiomove(). The inode range lock protects against concurrent updates via zfs_read() and zfs_write(). Reviewed-by: Albert Lee <trisk@forkgnu.org> Reviewed-by: Chunwei Chen <david.chen@nutanix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#7335 Closes openzfs#7339
Calling uiomove() in mappedread() under the page lock can result in a deadlock if the user space page needs to be faulted in. Resolve the issue by dropping the page lock before the uiomove(). The inode range lock protects against concurrent updates via zfs_read() and zfs_write(). Reviewed-by: Albert Lee <trisk@forkgnu.org> Reviewed-by: Chunwei Chen <david.chen@nutanix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #7335 Closes #7339
Description
Calling
uiomove()
inmappedread()
can result in deadlock if theuser space page needs to be faulted in.
This issue is that
uiomove()
must be called with the page lock heldin order to safely populate the page date. If the page needs to be
faulted in by
filemap_page_mkwrite()
then it will also take the pagelock resulting in a double-lock.
Normally this isn't an issue since the pages are very likely to be
already faulted in. This patch makes sure that is always the case
by prefaulting in the user space pages.
Motivation and Context
Issue #7335
How Has This Been Tested?
Added test case provided by @ColinIanKing. Without the patch it deadlocks immediately as described in #7335, with the patch applied it passes 100 iterations of the test.
The test case has been added to the ZTS to prevent any future regression.
Types of changes
Checklist:
Signed-off-by
.