AppInsights blocks Node.JS process from exiting for hours #958
@PavelBansky I need more information to determine the problem; I hope you can help with that. Can you enable debug and warning logs to see if there are any errors coming from the SDK? Also, flush has a callback parameter that is called when it completes; can you try waiting for that callback? I wonder if this is related to the way Azure DevOps runs the Node.js process.

```js
const telemetryClient = new appInsights.TelemetryClient();
telemetryClient.config.enableInternalDebugLogging = true;
telemetryClient.config.enableInternalWarningLogging = true;
telemetryClient.trackEvent({
    name: "TestEvent"
});
telemetryClient.flush({ callback: () => { console.log("Done flushing") } });
```
This is real. We have a similar experience after upgrading from 2.3.1 to 2.3.2: on GitHub Actions, in e2e tests that spin up an HTTP server (per test or per file) wrapped with Application Insights, only the first of 6 test files completes; all tests in the other 5 fail with a test timeout.
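For context, a minimal sketch of one possible mitigation in a test suite: flush and then shut down the SDK in teardown so the test process can exit. This assumes a Jest-style `afterAll` hook and a hypothetical `server` handle for the HTTP server under test; it is a sketch, not the fix used in this thread.

```ts
import * as appInsights from "applicationinsights";

afterAll(async () => {
  // Wait for the flush callback before tearing anything down.
  await new Promise<void>((resolve) => {
    appInsights.defaultClient.flush({ callback: () => resolve() });
  });
  // dispose() stops the SDK's timers and listeners so they no longer
  // keep the Node.js event loop alive.
  appInsights.dispose();
  // `server` is a hypothetical reference to the HTTP server under test.
  await new Promise<void>((resolve) => server.close(() => resolve()));
});
```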
@hectorhdzg, I have tried both, with a callback and without; the result is always the same.

```ts
function publishTelemetryEvents(isFinalTelemetry: boolean, callback: (v: string) => void, attemptNumber = 1) {
    const maxAttempts = 5;
    if (hasPublishedFinalTelemetry) {
        // do not log telemetry after we have already logged final telemetry
        // this also stops recursive calls from the process.beforeExit handler
        return;
    }
    try {
        addBuildVariableTelemetry();
        const totalTimeTaken = Math.round((Date.now() - startTime) / 1000); // get time in seconds
        addMetric("ExecutionTime", totalTimeTaken);
        const telemetryClient = new appInsights.TelemetryClient(TELEMETRY_KEY);
        telemetryClient.trackEvent({ name: taskName, properties: EventProperties as any, measurements: Metrics as any });
        if (Traces.length > 0) {
            telemetryClient.trackTrace({ message: Traces.join("\n"), severity: appInsights.Contracts.SeverityLevel.Information, properties: EventProperties as any });
        }
        telemetryClient.flush({
            callback: (response) => {
                try {
                    const parsedObject = JSON.parse(response);
                    if (parsedObject.errors.length) {
                        throw parsedObject.errors; // this will reach the outer catch and is handled the same as an unparseable response
                    }
                } catch (parseError) {
                    if (attemptNumber < maxAttempts) {
                        // This is a failed telemetry send. If we encounter an error, we retry unless we have reached max attempts.
                        publishTelemetryEvents(isFinalTelemetry, callback, attemptNumber + 1);
                        return;
                    }
                    console.log("##vso[task.LogIssue type=warning;]Telemetry flush:", parseError);
                }
                hasPublishedFinalTelemetry = hasPublishedFinalTelemetry || isFinalTelemetry;
                tl.debug("Telemetry submitted");
                callback(response);
            }
        });
    }
    catch (error) {
        console.log("##vso[task.LogIssue type=warning;]Telemetry wrapper:", error); // don't let telemetry crash us
    }
}
```

I will try with debugging enabled.
@hectorhdzg: I have enabled logging and see no warnings or errors on the console.

```js
telemetryClient.config.enableInternalDebugLogging = true;
telemetryClient.config.enableInternalWarningLogging = true;
```

At this point we will have to drop AppInsights altogether until this is resolved :(
@PavelBansky I don't have a repro of the issue and there are no logs, so there is not much I can do here. If you can share your Azure DevOps build privately, that would be helpful.
Hey @hectorhdzg, I'm currently having the same problem here on 2.3.3 on macOS with Node.js 16.14.0, with a very minimalist repro:

```ts
import { post } from "superagent";
import * as appInsights from "applicationinsights";

appInsights
  .setup(
    "InstrumentationKey=ENV_KEY;IngestionEndpoint=https://westeurope-5.in.applicationinsights.azure.com"
  )
  .start();

// fakeTelemetryRequestBody is a placeholder payload defined elsewhere
post("https://westeurope-5.in.applicationinsights.azure.com/v2.1/track")
  .send([fakeTelemetryRequestBody])
  .then(appInsights.defaultClient.flush)
  .catch(console.error);
```

When starting the script, the Node.js process hangs for a few minutes before exiting, 7 times out of 10 (from what I could observe). Sometimes the process exits immediately, but it's clearly not consistent. Do you have any information to share about that? Thanks
@antoine-coulon The flush method has a callback you can pass to ensure it is done; can you try that?

```js
(async function main() {
  post("https://www.bing.com/")
    .send([{}])
    .then(async () => {
      await new Promise((resolve) => {
        appInsights.defaultClient.flush({
          callback: (msg) => {
            console.log(msg);
            resolve();
          }
        });
      });
    })
    .catch(console.error);
})();
```
@hectorhdzg I already tried that, as you had suggested it earlier, but it doesn't work either. Here is an example with an even more minimalist reproduction:

```js
appInsights.setup(
  "InstrumentationKey=KEY;IngestionEndpoint=https://westeurope-5.in.applicationinsights.azure.com"
);
appInsights.start();
appInsights.defaultClient.flush();
```

Do you agree that this should allow the Node.js process to exit? Even if
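A hedged sketch of one possible workaround, assuming the SDK's own timers and listeners are what keep the loop alive: wait for the flush callback, then call the SDK's `dispose()` to shut it down. Whether this actually releases the handle involved in this particular bug is an assumption.

```ts
import * as appInsights from "applicationinsights";

appInsights
  .setup("InstrumentationKey=KEY;IngestionEndpoint=https://westeurope-5.in.applicationinsights.azure.com")
  .start();

// Flush, then dispose the SDK so its timers/listeners stop referencing
// the event loop. Whether this avoids the reported hang is untested here.
appInsights.defaultClient.flush({
  callback: () => {
    appInsights.dispose();
  },
});
```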
@antoine-coulon I cannot reproduce this on Windows or Linux, but I'm working on getting a macOS machine to try it out. Is there something else involved in your repro? Any other packages or code that could be affecting the behavior?
@hectorhdzg No, nothing else; I've just tested it in a raw project with only the `applicationinsights` package. As a reminder, here is my setup: Application Insights 2.3.3 on macOS with Node.js 16.14.0.
Is it possible to add timeouts to all HTTP calls? This falls under CWE-1088 (Synchronous Access of Remote Resource without Timeout).
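Purely illustrative, not the SDK's actual transport code: a sketch of what a per-request timeout on a raw Node.js HTTPS request could look like, which is roughly what CWE-1088 asks for. The endpoint and payload below are placeholders.

```ts
import * as https from "https";

// Abort a telemetry POST if the server does not respond in time.
const req = https.request(
  "https://westeurope-5.in.applicationinsights.azure.com/v2.1/track",
  { method: "POST", timeout: 10_000 }, // socket timeout in milliseconds
  (res) => res.resume()
);
req.on("timeout", () => {
  // destroy() closes the socket so it cannot keep the process alive.
  req.destroy(new Error("telemetry request timed out"));
});
req.on("error", (err) => console.warn("telemetry failed:", err.message));
req.end(JSON.stringify([{ name: "TestEvent" }]));
```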
Even very simple telemetry submission code is preventing the Node.js process from exiting, sometimes for minutes, sometimes for hours. The problem is especially noticeable on Mac OS X, where the delays are always at least a minute.
This is being used in an Azure DevOps extension that adds a build task to the pipeline. Adding minutes or hours to the build is a real problem.
Removing AppInsights makes the code run in under a second.
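For readers skimming the thread, a minimal sketch of the kind of telemetry submission described, pieced together from the repros above; the instrumentation key and event name are placeholders.

```ts
import * as appInsights from "applicationinsights";

// Minimal submission: set up, send one event, flush. With the affected
// SDK versions this is reported to keep the process alive after flush.
appInsights.setup("InstrumentationKey=KEY").start();
appInsights.defaultClient.trackEvent({ name: "BuildTaskTelemetry" });
appInsights.defaultClient.flush({
  callback: () => console.log("flush completed"),
});
```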