Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrupted file read/write #50717

Open
ronag opened this issue Nov 13, 2023 · 23 comments
Open

Corrupted file read/write #50717

ronag opened this issue Nov 13, 2023 · 23 comments
Labels
fs Issues and PRs related to the fs subsystem / file system.

Comments

@ronag
Copy link
Member

ronag commented Nov 13, 2023

We noticed some corrupted files on our system and basically it seems that file read/write does not seem to increment the file handle offset. Hence writes will always write to offset 0 and reads will always read from offset 0.

From preliminary testing the issue seems to have been introduced in v20.3. I suspect #48078 and libuv/libuv#3952.

Host kernel: 5.15.104-0-lts
Docker Daemon: 20.10.9
Docker Image: node:21.1.0-bullseye and 20.9.0-bullseye

The issue only seems to occur when writing/reading to a specific SMB mount.

volumes:
  backup:
    driver: local
    driver_opts:
      type: cifs
      o: 'username=secret,password=secret'
      device: '//somednsname/nxt_backup'
import fs from 'fs'
import assert from 'assert'
const srcPath = '/backup/backup/0/0/0/0006d3befc1c41.mp4/0-TPtirLqtxVa6LP'
const dstPath = srcPath + '.tmp'
let position = 0
const src = await fs.promises.open(srcPath, 'r')
const dst = await fs.promises.open(dstPath, 'wx+')
try {
  while (true) {
    const { bytesRead, buffer } = await src.read() // { position })
    console.log('chunk', bytesRead, position)
    if (bytesRead === 0) {
      break
    }
    const chunk = buffer.subarray(0, bytesRead)
    await dst.write(chunk) // { position })
    position += bytesRead
  }
  await dst.sync()
} finally {
  await dst.close()
  await src.close()
}
const stat = fs.statSync(dstPath)
console.log('copied', dstPath, stat.size, position)
assert(position === stat.size, `${dstPath}: ${position} !== ${stat.size}`)

If we instead pass an explicit position to write/read then everything works as expected.

This also affects, fs.createWriteStream and fs.createReadStream.

@ronag ronag added the fs Issues and PRs related to the fs subsystem / file system. label Nov 13, 2023
@ronag
Copy link
Member Author

ronag commented Nov 13, 2023

@deadbeef84

@ronag
Copy link
Member Author

ronag commented Nov 13, 2023

I'm hopeful https://github.com/libuv/libuv/releases/tag/v1.47.0 might resolve it. But until then this is quite a serious issue.

@ronag
Copy link
Member Author

ronag commented Nov 13, 2023

@nodejs/fs

@ronag
Copy link
Member Author

ronag commented Nov 13, 2023

@bnoordhuis

@ronag
Copy link
Member Author

ronag commented Nov 14, 2023

UV_USE_IO_URING=0 resolves the issue. We have applied this to all our services in production as a safety measure.

@ronag
Copy link
Member Author

ronag commented Nov 14, 2023

If necessary I can demo the issue on the production system where it occurs over a Zoom call.

@santigimeno
Copy link
Member

Does calling await dst.write(chunk) help somehow?

@ronag
Copy link
Member Author

ronag commented Nov 14, 2023

Sorry that was a typo

@santigimeno
Copy link
Member

One more thing. Are you checking for partial writes?

@ronag
Copy link
Member Author

ronag commented Nov 15, 2023

One more thing. Are you checking for partial writes?

Partial writes should not be possible here. But even if they were, not sure it matters? The file size is exactly the highWaterMark and stays that size regardless how much many times write is called.

Also the read never completes and reads infinite amount of data.

@santigimeno
Copy link
Member

santigimeno commented Nov 15, 2023

fwiw, I've setup a shared samba folder in my computer which I'm accessing via the cifs driver and couldn't reproduce the issue.
These are the specs:
Host Kernel: Ubuntu 5.15.0-88.98-generic 5.15.126
Docker version: Docker version 24.0.7
Docker Image: node:21.1.0-bullseye

@ronag
Copy link
Member Author

ronag commented Nov 15, 2023

Hm. Must be some combination of kernel ver + docker ver + smb + ioring.

@ronag
Copy link
Member Author

ronag commented Nov 15, 2023

We are working on upgrading kernel + docker version. However, it will take a few weeks before I can confirm whether or not that fixes the issue.

@santigimeno
Copy link
Member

In the meantime, can you try running your script for a bit with perf like this and post the results somewhere? There might be some useful info there:

$ perf record -e "io_uring:*" -e "cifs:*" -- node test.mjs
# store the data into a file
$ perf script -i perf.data > output.log

@ronag
Copy link
Member Author

ronag commented Nov 16, 2023

Unfortunately that doesn't seem to work so well:

Inside node bullseye container:

root@a4bae0cec5c2:/usr/src/app# apt-get update
root@a4bae0cec5c2:/usr/src/app# apt-get install linux-perf 
root@a4bae0cec5c2:/usr/src/app# perf
/usr/bin/perf: line 13: exec: perf_5.15: not found

@santigimeno
Copy link
Member

What's happenning is that the perf version installed in the container doesn't match the host kernel version. You should manually build and install the correct perf version in the container (here's a useful link with instructions you can adapt to your use case). Also, remember running the container with the --privileged flag so perf can run.

@deadbeef84
Copy link
Contributor

Here's the result from perf running in the node:20.3.0-bullseye-slim docker image.

output.log

@santigimeno
Copy link
Member

santigimeno commented Nov 26, 2023

@deadbeef84 Thanks for the sending the results.

From what I understand, it's the write operation the issue here as it's always reading from offset 0. The read operation seems to be working just fine. See from this portion of the logs in bold how the offset in the reads are changing correctly while not in the writes:

io_uring:io_uring_file_get: ring 0xffff97357a326000, fd 26
io_uring:io_uring_submit_sqe: ring 0xffff97357a326000, req 0xffff9738c0571900, op 1, data 0x66b37c8, flags 16777216, non block 1, sq_thread 1current_mid=7515 credits=8189 credit_change=63 in_flight=1
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7515 credits=8126 credit_change=-64 in_flight=1
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7515 credits=8189 credit_change=63 in_flight=1
cifs:smb3_read_enter: xid=0 sid=0x6f63d1 tid=0xf97b1104 fid=0xecd155ed offset=0x0 len=0x4000
cifs:smb3_cmd_enter: sid=0x6f63d1 tid=0xf97b1104 cmd=8 mid=7515
io_uring:io_uring_complete: ring 0xffff97357a326000, user_data 0x66b37c8, result 16384, cflags 0
io_uring:io_uring_file_get: ring 0xffff97357a326000, fd 27
io_uring:io_uring_submit_sqe: ring 0xffff97357a326000, req 0xffff9738c0571900, op 2, data 0x66b1898, flags 16777216, non block 1, sq_thread 1
io_uring:io_uring_queue_async_work: ring 0xffff97357a326000, request 0xffff9738c0571900, flags 17039360, hashed queue, work 0xffff9738c05719b8
cifs:smb3_enter: cifs_write_from_iter: xid=19212585
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7516 credits=8126 credit_change=-64 in_flight=1
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7516 credits=8189 credit_change=63 in_flight=1
cifs:smb3_write_enter: xid=0 sid=0x6f63d1 tid=0xf97b1104 fid=0x535c0559 offset=0x0 len=0x4000
cifs:smb3_cmd_enter: sid=0x6f63d1 tid=0xf97b1104 cmd=9 mid=7516
cifs:smb3_exit_done: cifs_write_from_iter: xid=19212585
io_uring:io_uring_complete: ring 0xffff97357a326000, user_data 0x66b1898, result 16384, cflags 0
io_uring:io_uring_file_get: ring 0xffff97357a326000, fd 26
io_uring:io_uring_submit_sqe: ring 0xffff97357a326000, req 0xffff9738c0571900, op 1, data 0x66b37c8, flags 16777216, non block 1, sq_thread 1
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7517 credits=8126 credit_change=-64 in_flight=1
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7517 credits=8189 credit_change=63 in_flight=1
cifs:smb3_read_enter: xid=0 sid=0x6f63d1 tid=0xf97b1104 fid=0xecd155ed offset=0x4000 len=0x4000
cifs:smb3_cmd_enter: sid=0x6f63d1 tid=0xf97b1104 cmd=8 mid=7517
io_uring:io_uring_complete: ring 0xffff97357a326000, user_data 0x66b37c8, result 16384, cflags 0
io_uring:io_uring_file_get: ring 0xffff97357a326000, fd 27
io_uring:io_uring_submit_sqe: ring 0xffff97357a326000, req 0xffff9738c0571900, op 2, data 0x66b1898, flags 16777216, non block 1, sq_thread 1
io_uring:io_uring_queue_async_work: ring 0xffff97357a326000, request 0xffff9738c0571900, flags 17039360, hashed queue, work 0xffff9738c05719b8
cifs:smb3_enter: cifs_write_from_iter: xid=19212586
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7518 credits=8126 credit_change=-64 in_flight=1
cifs:smb3_add_credits: conn_id=0x36 server=10.23.71.42 current_mid=7518 credits=8189 credit_change=63 in_flight=1
cifs:smb3_write_enter: xid=0 sid=0x6f63d1 tid=0xf97b1104 fid=0x535c0559 offset=0x0 len=0x4000
cifs:smb3_cmd_enter: sid=0x6f63d1 tid=0xf97b1104 cmd=9 mid=7518
cifs:smb3_exit_done: cifs_write_from_iter: xid=19212586
io_uring:io_uring_complete: ring 0xffff97357a326000, user_data 0x66b1898, result 16384, cflags 0

I'm not sure that upgrading kernel versions will fix this as I was briefly looking at the commits in io_uring and cifs between those versions and not sure there's anything related. Worth giving it a try though.
In the meantime, I saw you're using a much larger file than the one I used in my testing. Is this happening with files of any size?

@richardlau
Copy link
Member

Are you able to test with #50650 to see if the further kernel restrictions in libuv around when io_uring is enabled helps?

@Rush
Copy link

Rush commented Dec 29, 2023

I believe we encountered a potentially related issue but with WebSockets that intermittently disconnect in a production environment with more users using our platform. Doesn't happen in testing with one/two connections. We're using cWS https://github.com/encharm/cWS

The issue started happening after we updated to Ubuntu 22.04 LTS from 20.04 LTS. The update triggered the io uring code path which wasn't available in an older kernel.

I confirmed UV_USE_IO_URING=0 to fix the issue.

Will libuv 1.47.0 be backported into Node 20 LTS?

@deadbeef84
Copy link
Contributor

From what I understand, it's the write operation the issue here as it's always reading from offset 0. The read operation seems to be working just fine. See from this portion of the logs in bold how the offset in the reads are changing correctly while not in the writes:

Sorry, I think I ran a modified version with explicit { position } specified in the read call - otherwise the code would get stuck in an infinite loop - so this does indeed effect both read/write.

@santigimeno
Copy link
Member

@deadbeef84 thanks for the response. I would like to keep investigating this but I would need to be able to reproduce your environment. How difficult would it be?

@santigimeno
Copy link
Member

I believe we encountered a potentially related issue but with WebSockets that intermittently disconnect in a production environment with more users using our platform. Doesn't happen in testing with one/two connections. We're using cWS https://github.com/encharm/cWS

@Rush, that seems an unrelated issue (though also related to io_uring). Could you open a different issue describing the issue and, if possible, a reproducer? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system.
Projects
None yet
Development

No branches or pull requests

5 participants