Update status changes to "In Progress" sometime after "Succeeded". #625
Comments
Is this "InProgress" state persisting continuously for the online devices? What is possibly happening is that if the device remains connected for over an hour, the IoT Hub connection can be refreshed and the agent will receive the ADM configuration again and go into the deployment workflow and report InProgress, but once in the workflow it should see it has the update installed already and report Success again. I would not expect it to stay "InProgress" for an extended period of time. Was the offline agent ever online to receive the deployment? Or has it been offline the entire time? |
The "In Progress" state remains for a longer period. As soon as I restart the agent using "sudo systemctl restart deviceupdate-agent", it will report succeeded again. But after some time, it goes back to in progress state. |
This is the last shell log file from a device that shows "In Progress" a few minutes after "Succeeded".
And here is the content of two log files, from before and after the restart, from a device whose agent I restarted and which now shows "Succeeded".
|
Thanks, can you open a CSS ticket with your account/instance/device information and these logs and we can follow up. This doesn't sound like expected behavior. |
@josephmsft Will the resolution be posted in this thread as well? I am having the same issue as originally described. |
@mahdighorbanpourptw were you able to resolve this issue? |
I have not yet contacted the support team. I will plan it for the next week and inform you about the resolution here. |
@mahdighorbanpourptw Thanks! |
Could the full agent logs be provided? e.g. sudo tar -czvf /tmp/duagentlogs.tgz /var/log/adu/*.log |
After investigating, this seems like an agent issue, we are working on a fix. Providing the logs as @jw-msft mentioned here on the ticket would be helpful. |
@mahdighorbanpourptw Checking if you were able to get past the issue above or if you could provide the logs if support is needed? |
@eshashah-msft I provided the logs; however, there have been no new updates from the Azure update service team. |
I have the same issue. I did some investigation in the source code, but since I don't fully understand the project I cannot really just write a fix for it. At least I can document my findings here. After an update the status of the device is "Succeeded", which is expected. But after about 40 minutes it goes back to "In Progress". The log of the deviceupdate-agent looks like this:
The connection to the IoTHub is broken due to (I guess) an invalid token. The update agent will then reconnect to the IoTHub, and this is when it finds a "new" update. In the log you find this line:
This means we are in the function OrchestratorUpdateCallback at line 446. In line 449 the method The log contains this line:
This JSON-blob contains In the log you can also find this line:
In
So by looking at the log we can determine that the overall state is changed to "InProgress" in |
Just some extra info about the above comment. All the references to line numbers in the comment are based on the code in commit |
There is a pull request (#609) that might solve the issue. Can someone please look at it? |
We are seeing a similar issue. The twin reports:
"lastInstallResult": {
"extendedResultCodes": "00000000",
"resultCode": 1
},
The Azure portal reports "In Progress", but the same twin shows:
"configurations": {
"adu-381b7431-f2d1-45f0-861c-35f060efec48-13176c5bda16dce39043c3c585d14f471ee0d645": {
"status": "Applied"
},
"adu-nodeployment": {
"status": "Applied"
}
},
The device is running the latest version; the OTA has been done. We are running a modified version of the ADU based on release 1.1.0.
"deviceUpdate": {
"__t": "c",
"agent": {
"compatPropertyNames": "manufacturer,model",
"deviceProperties": {
"aduVer": "DU;agent/1.1.0",
"contractModelId": "dtmi:azure:iot:deviceUpdateContractModel;3",
"manufacturer": "Zoetis",
"model": "Hub"
},
"installedUpdateId": "{\"provider\":\"Zoetis\",\"name\":\"Hub\",\"version\":\"2024.5.13\"}",
"lastInstallResult": {
"extendedResultCodes": "00000000",
"resultCode": 1
},
"state": 6,
"workflow": {
"action": 3,
"id": "381b7431-f2d1-45f0-861c-35f060efec48"
}
} |
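The mismatch in the twin above can be checked mechanically. The following is a minimal sketch in Python, not part of the agent. It assumes the reported `agent` section has been loaded as a dict, treats `state` 6 as the value that coincides with the portal showing "In Progress" (inferred from this thread, not from the agent's documentation), and treats a positive `resultCode` as a successful install. The function name `looks_stuck_in_progress` is hypothetical.

```python
import json

# Assumption from this thread: state 6 coincides with the portal showing "In Progress".
ASSUMED_IN_PROGRESS_STATE = 6


def looks_stuck_in_progress(agent: dict, in_progress_state: int = ASSUMED_IN_PROGRESS_STATE) -> bool:
    """Return True when the twin reports a successful install yet an in-progress state."""
    result_code = agent.get("lastInstallResult", {}).get("resultCode", 0)
    installed_raw = agent.get("installedUpdateId")
    # installedUpdateId is a JSON string embedded in the twin, so decode it separately.
    installed = json.loads(installed_raw) if installed_raw else None
    return (
        installed is not None
        and result_code > 0  # assumption: positive result codes mean success
        and agent.get("state") == in_progress_state
    )


# Values copied from the twin excerpt in this comment.
agent_section = {
    "installedUpdateId": "{\"provider\":\"Zoetis\",\"name\":\"Hub\",\"version\":\"2024.5.13\"}",
    "lastInstallResult": {"extendedResultCodes": "00000000", "resultCode": 1},
    "state": 6,
    "workflow": {"action": 3, "id": "381b7431-f2d1-45f0-861c-35f060efec48"},
}

print(looks_stuck_in_progress(agent_section))
```

Running a check like this against each device twin would separate devices that are genuinely mid-deployment from those that already installed the update but still report the in-progress state.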
@josephmsft , @jw-msft , @eshashah-msft |
@mahdighorbanpourptw and @ortogonal, could you test PR #659 to see if it resolves your issue? It is the proper fix for the root issue, which was introduced with the rootkey work: that change sets the state to in-progress before update processing actually starts. #659 sets the in-progress state only once processing of the update metadata begins, so an update that has already been installed does not lead to an in-progress report unless IsInstalled says it is not installed. PR #610 does get the agent to a terminal state by reporting it at every 35-40 minute token expiry, but that fixes a symptom rather than the root issue. It also has the negative side effect of consuming a large amount of message quota to compensate for the in-progress state being set too early (one message every ~35-40 minutes, whenever the token expires). Therefore, I recommend closing PR #610 in favor of PR #659. We are working to address the blocking PR comment and merge it to the develop branch soon, subject to prioritization against other tasks. |
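The ordering difference described in this comment can be sketched as follows. This is an illustrative Python model of the symptom as described in the thread, not the agent's C code: `handle_deployment` and the `report_early` flag are hypothetical names, and `is_installed` stands in for the handler's IsInstalled check.

```python
from typing import Callable, List


def handle_deployment(
    update_id: str,
    is_installed: Callable[[str], bool],
    install: Callable[[str], None],
    report_early: bool,
) -> List[str]:
    """Return the sequence of states reported to the cloud for one deployment."""
    reported: List[str] = []
    if report_early:
        # Behaviour described as the root issue: in-progress is reported before
        # the agent knows whether there is any work to do.
        reported.append("InProgress")
    if is_installed(update_id):
        # Already installed: nothing further is reported in this simplified model.
        return reported
    if not report_early:
        # Behaviour described for the fix: report only once real processing begins.
        reported.append("InProgress")
    install(update_id)
    reported.append("Succeeded")
    return reported


installed = {"2024.5.13"}
early = handle_deployment("2024.5.13", installed.__contains__, installed.add, report_early=True)
late = handle_deployment("2024.5.13", installed.__contains__, installed.add, report_early=False)
print(early)  # ['InProgress']
print(late)   # []
```

In this model, a redelivered deployment for an already installed update leaves a dangling in-progress report only when the state is reported early; with the late ordering nothing is sent, so the previously reported terminal state stays in place.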
@jw-msft I've tested PR #659 but get into trouble. I get this log message
When looking into the PR in commit 50b6ef3 there is a change that checks the return code of rootkeyErc. I guess that is failing for me. Without this patch I can run the update, but still with the original issue. But the PR makes it impossible to update :( |
I have an update script which is a copy of the sample scripts, with some changes to adapt it to our needs. The update works perfectly fine. The state changes from "Not Running" to "In Progress" to "Succeeded". Then, after a while, it changes back to "In Progress". I looked at the update agent logs, and there the condition check works perfectly fine and the update is not triggered twice! Note that when I restart the agent, it immediately shows "Succeeded". Then the same issue occurs: after some time, it goes back to "In Progress".
Having said that, the IoT Hub portal (Updates, update group overview) shows inconsistent data. The first two devices have already been updated successfully, which the upper info box shows correctly. But in the lower part, all of them are shown as "In Progress". The third device in the list is even turned off!
You could also see that, in the twin, it says applied.
Expected behavior
The first two devices should keep showing the state "Succeeded", and the last one should be "Not Running".
Actual behavior
All the devices are shown "In Progress".
Reproduction Steps
Use the example scripts to create an update. Assign it to the devices.
Environment
Ubuntu 22.04
IotEdge 1.4.39
DeviceUpdate-Agent 1.1.0