Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

websocket (since 1.32.4 after switching to fastwebsockets) server stops receiving messages on client tcp session hang #18808

Closed
github-enetrino-com opened this issue Apr 23, 2023 · 3 comments · Fixed by #18916
Assignees

Comments

@github-enetrino-com
Copy link

github-enetrino-com commented Apr 23, 2023

I'm developing a service that depends on websockets, and found that server side of websocket in deno 1.32.4 and 1.32.5 hangs if websocket client disappears (terminates, reboots) without correctly closing tcp session, until the tcp session resets.

This behavior leads to unpredictable stalls due to hanging/disappearing websocket client connections (which itself is a normal for network clients).

I've written a deno test to reproduce which sets up a websocket server and launches worker that sets up websocket client connection and sends messages every few seconds, then test suite terminates the worker (with worker.terminate()) to make sure tcp connection is abruptly hung. This leads to both websocket server stopping receiving messages until the causing tcp session resets, and something happens to worker.terminate too as until the tcp session is reset the client working async function doesn't return.

Here's the deno test files to reproduce (works fine with deno < 1.32.4, doesn't work with 1.32.4, 1.32.5, canary as of 2023-04-23):


websocket_test.js

import { assertEquals } from "https://deno.land/std@0.184.0/testing/asserts.ts";
import { serve } from "https://deno.land/std@0.184.0/http/mod.ts";
import { delay } from "https://deno.land/std@0.184.0/async/delay.ts";

Deno.test("test websocket server 1.32.[1-5] hang for some time", async (t) => {
  const testContentServer = '{"command":"test"}';
  const testContentClient = '{"processDetails":[{"pid":"123","state":"Ss","user":"root","start":"9Apr23","args":"/usr/sbin/syslogd"},{"pid":"16833","state":"S+","user":"dennis","start":"Fri02AM","args":"deno run ./test_service_1.ts"}],"allServices":[{"service":"test_service_1","up":true},{"service":"syslogd","up":true}],"allUp":true}';
  const abortController = new AbortController();
  const port = 8002;
  let testDataServer = undefined;
  let testDataClient = undefined;

  console.log(Deno.version);

  const testServer = async () => {
    const signal = abortController.signal;

    serve((req) => {
      // console.log('new request');
      if (req.headers.get("upgrade") != "websocket") {
        return new Response(null, { status: 501 });
      }
      const { socket: ws, response } = Deno.upgradeWebSocket(req);
      // ws.onopen = () => console.log('conn open');
      const handleMessage = async (m) => {
        testDataServer = m.data;
        await delay(100);
        ws.send(testContentClient);
      };

      ws.onmessage = async (m) => await handleMessage(m);
      return response;
    },
    {
      port,
      signal,
      onError: (error) => {
        console.log(`onError ${error}`);
        return new Response(null, { status: 501 });
      },
      onListen: () => { }
    });

    try {
     await delay(60000, { signal });
    } catch(e) {
      console.log(e);
    }

    // assertEquals(1,1);
  }

  const testClient = async () => {
    // console.log(port);
    const workerUrl = new URL("./websocket_test_client_worker.js", import.meta.url).href;
    let worker = new Worker(workerUrl, {
      type: "module",
    });

    worker.onmessage = (m) => testDataClient = m.data;

    console.log('t 62');
    await delay(3000);
    console.log('t 64');
    worker.terminate();
    console.log('t 66');
    worker = undefined;

    assertEquals(testDataServer, testContentServer);
    console.log('t 70');

    assertEquals(testDataClient, testContentClient);
    console.log('t 73');

    testDataServer = undefined;
    testDataClient = undefined;
    console.log('t 77');

    // }
  }

  testServer();
  await delay(1000);

  for(let step = 1; step <= 10; step++) {
    console.log('t 86');
    await t.step(`step ${step}`, async () => await testClient());
    console.log('t 88');
  }

  abortController.abort();
});

websocket_test_client_worker.js

import { delay } from "https://deno.land/std@0.184.0/async/delay.ts";
// await delay(50000);
console.log('3');
const port = 8002;

const testContentServer = '{"command":"test"}';
console.log('7');

const testClient = async () => {
  console.log('10');

  // console.log(port);
  const ws = new WebSocket(`ws://localhost:${port}`);
  console.log('14');

  ws.onmessage = (m) => self.postMessage(m.data);
  console.log('17');

  // ws.onopen = async () => { 
  await delay(1000);
  console.log('21');

  ws.send(testContentServer);
  console.log('24');

  await delay(1000);
  console.log('27');

  ws.send(testContentServer); 
  console.log('30');

  await delay(1000);
  console.log('33');

  ws.send(testContentServer); 
  console.log('36');

  await delay(1000);
  console.log('39');

  ws.send(testContentServer);
  console.log('42');

  await delay(1000);
  console.log('45');

  ws.close();
  console.log('48');

}
console.log('51');

testClient();

running tests with

deno test --allow-net --allow-read --trace-ops
@github-enetrino-com
Copy link
Author

github-enetrino-com commented Apr 23, 2023

after many test runs, tests always pass on deno 1.31.3, sometimes fail on 1.32.0-1.32.3 due to lack of try/catch on ws.send which happen after stalling, and never pass on 1.32.4+

failing with these messages:

error: Leaking async ops:
  - 1 async operation to sleep for a duration was started in this test, but never completed. This is often caused by not cancelling a `setTimeout` or `setInterval` call. The operation was started here:
    at handleOpCallTracing (ext:core/01_core.js:228:42)
    at Object.opAsync2 (ext:core/01_core.js:205:15)
    at runAfterTimeout (ext:deno_web/02_timers.js:218:29)
    at initializeTimer (ext:deno_web/02_timers.js:184:3)
    at setTimeout (ext:deno_web/02_timers.js:317:10)
    at WebSocket.[[[serverHandleIdleTimeout]]] (ext:deno_websocket/01_websocket.js:491:35)
    at WebSocket.[[[eventLoop]]] (ext:deno_websocket/01_websocket.js:405:41)

@bartlomieju
Copy link
Member

@littledivy please take a look

@github-enetrino-com
Copy link
Author

github-enetrino-com commented Apr 29, 2023

described problem with hanging websocket server when tcp peer stops responding is still reproducible on v1.33.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants