
The RetryHandler receives a duplicate body when the server does not support Range requests. #3356

Closed
zbinlin opened this issue Jun 22, 2024 · 11 comments · Fixed by #3389
Labels
bug Something isn't working

Comments

@zbinlin
Contributor

zbinlin commented Jun 22, 2024

import { createServer } from "node:http";
import { once } from "node:events";
import { RetryAgent, Agent, fetch } from "undici";
import asserts from "node:assert";

let retry = true;
const server = createServer((req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain" });
  if (retry) {
    retry = false;
    res.flushHeaders();
    res.write("H");
    setTimeout(() => {
      res.end("ello World");
    }, 1000);
  } else {
    res.end("Hello World");
  }
});

server.listen();
await once(server, "listening");

const agent = new RetryAgent(new Agent({ bodyTimeout: 500 }), {
  errorCodes: ["UND_ERR_BODY_TIMEOUT"],
});
const { port } = server.address();
const res = await fetch(`http://localhost:${port}`, {
  dispatcher: agent,
});
asserts.equal(await res.text(), "Hello World");

server.close();

output:

AssertionError [ERR_ASSERTION]: 'HHello World' == 'Hello World'
@zbinlin zbinlin added the bug Something isn't working label Jun 22, 2024
@metcoder95
Member

Hmm, yeah, this is indeed a bug. It happens because we buffer the response in order to issue a Range request on the next retry.

It's not an easy fix, as the data gets automatically forwarded to the main handler, so the fetch handler is the one that buffers the body. We can either add a flag that disables the Range request, or change the way we buffer the body (maybe keep it within the RetryHandler) so we can tell whether a request can be safely resumed from the buffered body or must be discarded entirely.

Servers are not required to reply in a specific way; if they do not support Range, they can legitimately reply with a plain HTTP 200.
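For context, here is a minimal sketch of the resume idea being described. The names (`buildResumeHeaders`, `bytesReceived`) are illustrative, not undici internals:

```javascript
// Hypothetical sketch of resuming a partial body via a Range request.
// `bytesReceived` is how much of the body the first attempt delivered;
// `etag` is the validator from the first response, if any.
function buildResumeHeaders (bytesReceived, etag) {
  const headers = {}
  if (bytesReceived > 0) {
    // Ask only for the bytes we have not seen yet.
    headers.range = `bytes=${bytesReceived}-`
    if (etag) {
      // If-Range makes the server fall back to the full body (200)
      // if the resource changed between attempts.
      headers['if-range'] = etag
    }
  }
  return headers
}

console.log(buildResumeHeaders(1, '"abc"'))
// → { range: 'bytes=1-', 'if-range': '"abc"' }
```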

@ronag
Member

ronag commented Jun 24, 2024

@metcoder95 I'm not sure I understand what the problem is. Can you try to explain? Or huddle me on Slack.

@ronag
Member

ronag commented Jun 24, 2024

Ah, I see now. You need to fail the retry if position > 0 and response is not a range request. See how https://github.com/nxtedition/nxt-undici/blob/main/lib/interceptor/response-retry.js does it. In particular look at https://github.com/nxtedition/nxt-undici/blob/main/lib/interceptor/response-retry.js#L116.
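The check being described could be sketched roughly like this (names are illustrative, not the actual handler API): fail the retry whenever bytes were already forwarded downstream but the retried response is not a 206 Partial Content, i.e. the server ignored the Range header.

```javascript
// If the first attempt already forwarded part of the body and the retry
// comes back as anything other than 206, replaying it would duplicate
// the prefix the consumer has already seen.
function shouldFailRetry (bytesAlreadyForwarded, retryStatusCode) {
  return bytesAlreadyForwarded > 0 && retryStatusCode !== 206
}

console.log(shouldFailRetry(1, 200)) // → true: would duplicate the prefix
console.log(shouldFailRetry(1, 206)) // → false: safe to resume
console.log(shouldFailRetry(0, 200)) // → false: nothing was forwarded yet
```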

@climba03003
Contributor

climba03003 commented Jun 24, 2024

This is more a limitation of the combination of fetch and RetryAgent.
fetch uses a Readable for the response, and once a data chunk has been received you have no way to pause the buffering.

Note that even changing the internal state of the Readable does not work, since the Readable is immediately consumed by the Reader, which owns its own internal state too.

We could buffer the response inside RetryAgent and only call onData once it completes, but I doubt you want that behavior.

@metcoder95
Member

Yeah, it's basically what @climba03003 suggested; as long as we have started buffering, we cannot revert what we've already buffered.

I'm not really a fan of buffering ourselves here either; I'd rather fail in that case, but I wanted to put all the options on the table. I believe failing might be the best choice, since it avoids double-buffering.

@climba03003
Contributor

So, here I'm also not really a fan of buffering ourselves but rather better fail on it, but wanted to put all options over the table. But believe that failing might be the best, so we avoid double-buffering.

If it throws in this case, what is the point of having the retry handler?

The issue demonstrates that a slow network or a sudden load spike on the server will break the retry mechanism.
Throwing an error in such a common case means the retry handler provides only limited value.

Another solution is to remember the received byte count and skip all bytes that have already been handled.
But I can see more issues arising with that method. For example:

  1. A body checksum (e.g. ETag) must exist, otherwise the payload cannot be safely concatenated.
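That validator check might look like the following sketch (illustrative only; `canConcatenate` is not a real undici function): before appending a retried 206 body to previously buffered bytes, verify that the ETag still matches, otherwise the concatenated payload could mix two different representations of the resource.

```javascript
// Guard before stitching a retried partial body onto buffered bytes:
// require a 206 response and a matching validator from both attempts.
function canConcatenate (firstEtag, retryEtag, retryStatusCode) {
  return retryStatusCode === 206 &&
    firstEtag !== undefined &&
    firstEtag === retryEtag
}

console.log(canConcatenate('"a"', '"a"', 206)) // → true
console.log(canConcatenate('"a"', '"b"', 206)) // → false: resource changed
console.log(canConcatenate(undefined, undefined, 206)) // → false: no validator
```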

@metcoder95
Member

The issue is demonstrating a slow network or sudden peak of the server will break the retry mechanism.
Throwing an error in such common case means the retry handler only provide limited usage.

I'd say the biggest problem is that servers may or may not support Range requests. If a network issue occurs or the server just breaks, the retry handler can identify that and continue the next retry from the last byte read.

I believe the retry logic also needs to be smart enough to recognize the situations it cannot recover from. Avoiding duplication of the body based on HTTP semantics would be better for users' sanity, so they can understand under which circumstances they can or cannot set a retry handler for a given origin.

I can also imagine users wanting to disable that, but that's a topic for another issue.

@climba03003
Contributor

climba03003 commented Jun 26, 2024

I'd say that the biggest problem comes with the fact that the servers might or might not support the range request.

I believe we should plan for the worst case: most servers do not support the Range header.
I don't think an API server will ever want to support the Range header, but content servers do.

If a network issue occurs or the server just breaks, the retry can identify and continue the next retry based on the last byte read.

Yes, but the current implementation relies heavily on the presence of the Range header.
When the Range header is absent, the data easily breaks.

The current use case is limited to either:

  1. No data has been consumed yet, or
  2. The server supporting Range when data has been consumed.

We have no way of knowing whether the downstream ever consumed the data, so providing a fix that blocks the retry seems unrealistic.
Alternatively we could block only when no Range header is provided, but that further restricts the use case.

I believe that the retry needs to also be smart enough to understand the situations where it cannot recover, avoiding the duplication of the body based on HTTP semantics might be better for user sanity, so they can understand as well under which circumstances it might/might-not set a retry handler for that given origin.

The retry handler should support any method that is safe (or even idempotent).
It should not rely on the existence of headers; rather, its behavior should be improved based on them (like the current Range and ETag handling).
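A hypothetical predicate for that point (names are mine, not undici's), using the safe and idempotent method sets from HTTP semantics:

```javascript
// Safe methods per HTTP semantics (RFC 9110); idempotent adds PUT/DELETE.
const SAFE = new Set(['GET', 'HEAD', 'OPTIONS', 'TRACE'])
const IDEMPOTENT = new Set([...SAFE, 'PUT', 'DELETE'])

// Decide whether a method is eligible for automatic retry at all;
// headers like Range/ETag then only refine *how* the retry resumes.
function isRetryableMethod (method, allowIdempotent = false) {
  return (allowIdempotent ? IDEMPOTENT : SAFE).has(method.toUpperCase())
}

console.log(isRetryableMethod('GET')) // → true
console.log(isRetryableMethod('POST')) // → false
console.log(isRetryableMethod('put', true)) // → true
```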

@mcollina
Member

If it throw in this case, what is the meaning of retry handler exist?

The retry handler is useful for retrying when there is a bad response code or a network error. I use it with APIs all the time.

The case of a truncated response is tricky.
In theory, a server that responds with a partial response should return 206. So, if we have already started flowing the response and the connection is interrupted, and the server does not respond with a 206, we could just error with "server does not support range requests".
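To make the two server behaviors concrete, here is a sketch assuming a fixed body (names are illustrative): a Range-aware server answers `bytes=N-` with a 206 and only the tail, while a server that ignores the header replies 200 with the whole body, which is the duplicating case in this issue.

```javascript
const body = 'Hello World'

// Parse a simple `bytes=N-` Range header. A Range-aware server answers
// 206 with the remaining bytes; anything else gets the full 200 body.
function handleRange (rangeHeader) {
  const m = /^bytes=(\d+)-$/.exec(rangeHeader ?? '')
  if (!m) return { status: 200, payload: body }
  const start = Number(m[1])
  return {
    status: 206,
    contentRange: `bytes ${start}-${body.length - 1}/${body.length}`,
    payload: body.slice(start)
  }
}

console.log(handleRange('bytes=1-')) // → 206, payload 'ello World'
console.log(handleRange(undefined)) // → 200, payload 'Hello World'
```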

@climba03003
Contributor

climba03003 commented Jun 26, 2024

The retry handler is useful to retry in case there is a bad response code or an error in a network condition.

That is only true for bad response codes, but not really for network conditions; I mean a network interruption in the middle of the response.

So, if we have already starting to flow the response, and the connection is interrupted, if the server does not respond with a 206 we could just error with "server does not support range requests".

The problem is that onData is always called and the data passed on.
You have no way to determine whether it was meant to be consumed by the user.
So the solution turns out to be: always throw, no matter whether the response was consumed or not.

'use strict'

const { tspl } = require('@matteo.collina/tspl')
const { test, after } = require('node:test')
const { createServer } = require('node:http')
const { once } = require('node:events')

const { fetch, Agent, RetryAgent } = require('..')

test('https://github.com/nodejs/undici/issues/3356', async (t) => {
  t = tspl(t, { plan: 2 })

  let shouldRetry = true
  const server = createServer()
  server.on('request', (req, res) => {
    res.writeHead(200, { 'content-type': 'text/plain' })
    if (shouldRetry) {
      shouldRetry = false

      res.flushHeaders()
      res.write('h')
      setTimeout(() => { res.end('ello world!') }, 1000)
    } else {
      res.end('hello world!')
    }
  })

  server.listen(0)

  await once(server, 'listening')

  after(async () => {
    server.close()

    await once(server, 'close')
  })

  const agent = new RetryAgent(new Agent({ bodyTimeout: 500 }), {
    errorCodes: ['UND_ERR_BODY_TIMEOUT']
  })

  const response = await fetch(`http://localhost:${server.address().port}`, {
    dispatcher: agent
  })
  setTimeout(async () => {
    console.log('start consume')
    t.equal(response.status, 200)
    t.equal(await response.text(), 'hello world!') // even with delayed consumption, the result is the same
  }, 2000)

  await t.completed
})

@metcoder95
Member

So, the solution turns out to be, always throw no matter the response is consumed or not.

Yes, aborting the request might be the best option for now, but only for responses from servers that do not support Range requests.

5 participants