bpo-38319: Fix shutil._fastcopy_sendfile(): set sendfile() max block size #16491

Merged 6 commits on Oct 1, 2019.
Lib/shutil.py (1 addition, 0 deletions):

@@ -136,6 +136,7 @@ def _fastcopy_sendfile(fsrc, fdst):
         # changes while being copied.
         try:
             blocksize = max(os.fstat(infd).st_size, 2 ** 23)  # min 8MB
+            blocksize = min(blocksize, 2 ** 31 - 2)  # max 2GB
Member:
Why not 2 ** 31 - 1?

Would not be better to use a power of two, 2**30?

Contributor (author):

Agreed, used 2**30.
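The cap being discussed can be sketched as a standalone snippet (illustrative values, not the exact CPython code):

```python
# Pick at least 8 MiB per sendfile() call, but never more than
# 1 GiB (2**30, a power of two, as agreed above).
file_size = 5 * 2 ** 30  # e.g. a 5 GiB source file

blocksize = max(file_size, 2 ** 23)  # min 8 MiB
blocksize = min(blocksize, 2 ** 30)  # max 1 GiB

print(blocksize)  # 1073741824
```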

Member:

That sounds inefficient on 64-bit systems: files larger than 2 GB will need multiple sendfile() syscalls, whereas currently a single one is enough, no? Maybe use sys.maxsize instead?

Contributor (author):

On my 64-bit Ubuntu 18.04 box sendfile() always copies up to 2GB per call, never more than that.

Member:

Well ok, but the kernel can be updated to support larger copies. Why should Python put an arbitrary limit on a syscall? Well, you're the expert, it's up to you. It was just a comment ;-)

Contributor (author):

OK, I think it makes sense. I added:

+        if sys.maxsize < 2 ** 32:  # 32-bit architecture
+            blocksize = min(blocksize, 2 ** 30)  # max 1GiB

Contributor (author):

...I did that for shutil only. In socket.py I decided to always limit to max 1 GiB: since we're dealing with a socket, I want to exercise the loop more often (and hit the selector, which implements the timeout logic).
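The loop shape described here, where capping the per-call block size forces regular trips through the selector's timeout check, can be sketched as follows (a hypothetical standalone function, not the exact socket.py implementation):

```python
import os
import selectors
import socket

def _sendfile_with_timeout(sock, file, blocksize, timeout):
    # Hypothetical sketch: each os.sendfile() call moves at most
    # `blocksize` bytes, so the selector (which enforces the timeout)
    # is consulted between chunks rather than once for the whole file.
    offset = 0
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_WRITE)
    try:
        while True:
            if not sel.select(timeout):
                raise socket.timeout('timed out')
            try:
                sent = os.sendfile(sock.fileno(), file.fileno(),
                                   offset, blocksize)
            except BlockingIOError:
                continue  # socket not ready yet; wait again
            if sent == 0:
                return offset  # reached EOF
            offset += sent
    finally:
        sel.unregister(sock)
```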

except Exception:
Member:
What exceptions besides OSError can be raised? I do not feel comfortable with silencing a wide range of exceptions.

Contributor (author):
Agreed.

blocksize = 2 ** 27 # 128MB

Lib/socket.py (1 addition, 0 deletions):

@@ -357,6 +357,7 @@ def _sendfile_use_sendfile(self, file, offset=0, count=None):
         if not fsize:
             return 0  # empty file
         blocksize = fsize if not count else count
Member:
Suggested change:
-        blocksize = fsize if not count else count
+        blocksize = count or fsize
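The suggested simplification behaves identically for the values this code can see (count is either None, 0, or a positive byte count), which is easy to check:

```python
def original(fsize, count):
    return fsize if not count else count

def suggested(fsize, count):
    return count or fsize

# Both forms fall back to fsize when count is falsy,
# and return count otherwise.
for count in (None, 0, 1, 4096):
    assert original(1000, count) == suggested(1000, count)
print('equivalent')  # prints "equivalent"
```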

blocksize = min(blocksize, 2 ** 31 - 2) # max 2GB

timeout = self.gettimeout()
if timeout == 0: