
unable to end node process #331

Closed
nebgor opened this issue Sep 20, 2017 · 15 comments

@nebgor

nebgor commented Sep 20, 2017

In a simple hello-world script, after a call like
appInsights.setup().setInternalLogging(isDev, true).start()

the Node process does not end.

@OsvaldoRosado
Member

This is due to setInterval loops used for automatic collection of performance counters and for buffering of telemetry before sending.
See: #54 #310

Using defaultClient.flush() should allow your script to end.
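
For example, a rough sketch of a short-lived script (using the options-object form of flush with a callback that appears later in this thread; whether this alone lets the process exit depends on the SDK version):

const appInsights = require("applicationinsights");

appInsights.setup(process.env.APPINSIGHTS_INSTRUMENTATIONKEY).start();

// ... the script's own work goes here ...

// Flush any buffered telemetry; the callback fires once sending is done,
// so this is the point at which the script should be free to exit.
appInsights.defaultClient.flush({
    callback: () => console.log("telemetry flushed")
});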

@nebgor
Author

nebgor commented Sep 20, 2017

Doesn't work (applicationinsights@0.22.0, node v6.11.0).

test code:

const appInsights = require("applicationinsights");

console.log(new Date())
appInsights.setup(process.env.APPINSIGHTS_INSTRUMENTATIONKEY)
.setInternalLogging(true, true)
.setAutoCollectConsole(true)
.start()

console.log("test")

appInsights.defaultClient.flush()
//appInsights.dispose()
console.log(new Date())

@OsvaldoRosado
Member

OsvaldoRosado commented Sep 20, 2017

It looks like the call to dispose() you have commented out here will resolve this issue. Flush not working as well is a strange regression, however.

What you're seeing is a side effect of the autocollection features of this SDK assuming a long-running process (typically a server; collecting requests/dependencies/performance counters doesn't usually make sense outside of a server context).

You can avoid this problem by not using autocollection at all: simply remove the call to .start() so that no autocollection is started.

As a bugfix, we should consider calling timeout.unref() on our autocollection loops to fix the root cause and allow apps to close when their own logic is complete, regardless of our internal state.
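
For reference, this is the plain Node.js mechanism being proposed here (not SDK code): a timer that has been unref()'d no longer keeps the event loop, and therefore the process, alive on its own.

// Plain Node.js illustration of timer.unref().
const timer = setInterval(() => {
    // pretend this is an autocollection loop
    console.log("collecting...");
}, 1000);

// After unref(), this interval no longer prevents the process from exiting
// once the rest of the app's work is finished.
timer.unref();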

@nebgor
Author

nebgor commented Sep 21, 2017

The dispose() is what I had tried earlier, and it failed. As said, flush() didn't work either. Not using .start() does allow the process to end immediately.
Adding a manual trackEvent call (on both 0.22 and 0.21) adds about 15s of execution time for me before the node process exits. That time gets expensive at scale.

Without .start(), without trackEvent:
$ time node testAI.js
2017-09-21T02:19:31.926Z
test
2017-09-21T02:19:31.937Z
(11)

real 0m3.059s
user 0m0.046s
sys 0m0.061s

Without .start(), with 1 to 3 calls to trackEvent:
$ time node testAI.js
2017-09-21T02:19:08.566Z
test
2017-09-21T02:19:08.582Z
(17)

real 0m18.394s
user 0m0.062s
sys 0m0.046s

@OsvaldoRosado
Member

OsvaldoRosado commented Sep 21, 2017

The dispose() is what i had tried earlier and it failed

Interesting. When I tried your code sample with dispose it closed for me. Sounds like there's some kind of race condition involved.

Adding a manual trackEvent call (on both 0.22 and 0.21) adds about 15s execution time for me before node the process exits

This is by design. The primary intention for this SDK is long-running server apps, so we do not send telemetry immediately but buffer it into batches for efficiency. You can change this behavior by adjusting the batch interval. See the "Advanced configuration options" section of the readme for more details on the various batching options; the one you probably want to change is client.config.maxBatchIntervalMs. The default is to wait 15 seconds for a batch to fill up.
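
For example, a minimal sketch (assuming appInsights has already been set up so that defaultClient exists; 100 ms is an arbitrary value chosen for illustration):

const appInsights = require("applicationinsights");

appInsights.setup(process.env.APPINSIGHTS_INSTRUMENTATIONKEY).start();

// Send batches after at most 100 ms instead of waiting up to 15 seconds
// for a batch to fill up. Useful for short-lived scripts.
appInsights.defaultClient.config.maxBatchIntervalMs = 100;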

@nebgor
Author

nebgor commented Sep 22, 2017

Cool, thanks. I missed that config.

Regarding dispose: could the fact that I'm testing on Windows have anything to do with it? I will try this on other environments too.

My use case is monitoring backend services/functions/lambdas, basically deployed serverless. The runtimes can vary a lot.

OsvaldoRosado added a commit that referenced this issue Sep 22, 2017
OsvaldoRosado added this to the 1.0 milestone Sep 22, 2017
@OsvaldoRosado
Member

The PR linked here (#334) addresses a few of these issues. Autocollection will no longer keep the app running longer than it otherwise would, and flushing can be used at the end of your app to allow it to close quickly. This is generally cleaner than setting the batch interval to a very short time.

OsvaldoRosado added a commit that referenced this issue Sep 25, 2017
* Fix issue #331

* Fix #330

* Fix #329

* Fix #321

* Fix #333

* Add missing changes

* Typo in travis.yml

* Give extra time to test runner

* Work around some travis issues

* Fix typing specificity

* Extra time for slow travis runners

* Disable app termination tests in node 0.10
@OsvaldoRosado
Member

OsvaldoRosado commented Sep 25, 2017

Closing, as the PR mentioned above has been merged. Feel free to re-open if you're still experiencing trouble with app termination after flush in the next release.

@cole-brown

cole-brown commented Sep 9, 2021

This is still a problem.

The following script takes several minutes to exit after it is 'done'.

It's cut down from my actual script, so there may be minor bugs, but it runs and demonstrates the problem:

#!/usr/bin/env node

const appInsights = require('applicationinsights');
const argv = require('yargs')
      .string('conn')
      .nargs('conn', 1)
      .describe('conn', 'App Insights: Connection String')
      .argv;
const moment = require('moment');

const metricName = 'test metric';
let apiKey = null;

function log(...args) {
    console.log('[' + moment().toISOString() + ']:', ...args);
}

function setUp(connStr) {
    apiKey = connStr;
    log(`setUp: Connection String: ${(!!apiKey ? '******' : 'MISSING!!!')}`);

    // Disable everything?!
    appInsights.setup(apiKey)
        .setDistributedTracingMode(appInsights.DistributedTracingModes.AI)
        .setAutoCollectConsole(false, false)
        .setAutoCollectExceptions(false)
        .setAutoCollectPerformance(false, false)
        .setAutoCollectPreAggregatedMetrics(false)
        .setAutoCollectHeartbeat(false)
        .setAutoCollectRequests(false)
        .setAutoCollectDependencies(false)
        .setAutoDependencyCorrelation(false, false)
        .setUseDiskRetryCaching(false)
        .setInternalLogging(false, false)
        .setSendLiveMetrics(false);
    // This makes no difference.
    // .start();

    // These seem to make no difference?
    const clientConfig = appInsights.defaultClient.config;
    clientConfig.maxBatchIntervalMs = 0;
    clientConfig.maxBatchSize = 0;
}

async function tearDown() {
    // Flush wants a callback, so convert it into a Promise...
    const waitForFlush = new Promise((resolve) => appInsights.defaultClient.flush({
        callback: resolve,
        // // This is /supposed/ to make it flush/quit faster, but... doesn't?
        // isAppCrashing: true,
    }));

    // ...and chain dispose on after the flush has completed.
    return waitForFlush.finally(appInsights.dispose)
        .then(() => log('tearDown: AppInsights flushed and disposed.'));
}

async function doMetric() {
    // Sending a value - but can set up Alert to use just the count if we want.
    log(`Sending '${metricName}' to App Insights...`);
    appInsights.defaultClient.trackMetric({ name: metricName, value: 100 });
    // Explicitly flush to maybe make exiting faster?
    return new Promise((resolve) => appInsights.defaultClient.flush({ callback: resolve }))
        .then(() => log('Sent metric to App Insights.'));
}

async function start(conn) {
    setUp(conn);
    await doMetric();
}

async function end(success) {
    await tearDown();
    log('The End.');

    // We /could/ wait for 2-3 minutes for App Insights to notice it's dead and die so we can exit.
    //   log('...Waiting for AppInsights to allow us to exit...');
    // Or we can give up now:
    //   process.exit(success ? 0 : 1);
}

(async () => {
    await end(await start(argv.conn));
})();

Output from running on Ubuntu 20.04 on WSL2 on Windows 10:

work@compy:/mnt/d/blah$ date --iso-8601=seconds && ./test.js --conn "InstrumentationKey=<...>;IngestionEndpoint=<...>" && date --iso-8601=seconds
2021-09-09T15:22:41-07:00
[2021-09-09T22:22:44.174Z]: setUp: Connection String: ******
[2021-09-09T22:22:44.188Z]: Sending 'test metric' to App Insights...
[2021-09-09T22:22:44.193Z]: Sent metric to App Insights.
[2021-09-09T22:22:44.193Z]: tearDown: AppInsights flushed and disposed.
[2021-09-09T22:22:45.658Z]: The End.
2021-09-09T15:24:58-07:00
work@compy:/mnt/d/blah$

@hectorhdzg
Member

@cole-brown please try the following code and let me know if there is any difference:

const https = require('https');

const clientConfig = appInsights.defaultClient.config;
clientConfig.httpsAgent = new https.Agent();

We noticed that a change we made to use keepAlive in the HTTP agent (#739) is causing issues terminating some processes, mainly in auto-attach scenarios in Azure environments, but it's possible it's related here. We are still investigating the root cause.

@cole-brown

@hectorhdzg Thanks but no change from that setting:

Still a few minutes before test.js exits with the new https.Agent() line added:

work@compy:/mnt/d/etc$ date --iso-8601=seconds && ./test.js --conn "InstrumentationKey=<...>;IngestionEndpoint=<...>" && date --iso-8601=seconds
2021-09-09T15:42:27-07:00
[2021-09-09T22:42:29.961Z]: setUp: Connection String: ******
[2021-09-09T22:42:29.975Z]: Sending 'test metric' to App Insights...
[2021-09-09T22:42:29.981Z]: Sent metric to App Insights.
[2021-09-09T22:42:29.981Z]: tearDown: AppInsights flushed and disposed.
[2021-09-09T22:42:31.489Z]: The End.
2021-09-09T15:44:41-07:00

@PavelBansky

PavelBansky commented May 2, 2022

@hectorhdzg, what is the status of this problem? Can the AppInsights team provide an update?

We are seeing the same issue: Node.js takes minutes (sometimes hours) to finish a process that has already completed.
This is very annoying since we are using it in an Azure DevOps extension that is injected into every build.

@cole-brown

Nothing I tried worked, so I ended up just hard/ungracefully killing it after it was done, instead of waiting for AppInsights to notice:

/**
 * Do whatever clean-up/tear-down/etc is necessary to exit cleanly.
 * @param {boolean} success - success/failure of `start()`; used for the exit code
 */
async function end(success) {
    // # AppInsights is used here to send a heartbeat status to Azure.
    await health.heartbeat();

    // # This is where all the AppInsights clean-up that's SUPPOSED to let it die now happens...
    await health.tearDown();

    // We /could/ wait for 2-3 minutes for App Insights to notice it's dead and die so we can exit.
    //   log.debug('...Waiting for AppInsights to allow us to exit...');
    // Or we can give up now.

    // This is an ungraceful exit:
    process.exit(success ? 0 : 1);
}

// -----------------------------------------------------------------------------
// RUN IT!
// -----------------------------------------------------------------------------

(async () => {
    await end(await start());
})();

But this isn't really an ideal solution...

@antoine-coulon

Is there any update on this? It seems like there is still a problem with the Node.js SDK; I'm not able to shut it down gracefully using everything described here (flush, dispose).

@c3wenjiaowang

@tonescones FYI.
