ES - Invalid Memory Handle When Restarting/Deleting an Application with Tables (GSFC DCR 14483) #121
Comments
Imported from trac issue 90. Created by sstrege on 2015-08-27T13:59:18, last modified: 2019-07-03T12:48:08
Trac comment by glimes on 2016-10-18 15:00:28: redispatching these tickets to clear my name from the "owner" field,
Trac comment by glimes on 2016-11-08 14:17:08: Current crop of cfe-next are all going into CFE 6.6
Trac comment by gdecaruf on 2018-02-08 15:27:04: Ran into the exact same issue. The culprit is that the UsedFlag isn't checked when looping over Handles. This causes the call to CFE_TBL_RemoveAccessLink in CFE_TBL_CleanUpApp (cfe_tbl_internal.c:1450) to fail, because the restarted app was reassigned the AppID previously owned by the other app that has been shut down. So when it loops over the table handles, it incorrectly tries to remove the access link to a table registry entry that has already been cleared. The fix is to add a check for UsedFlag == TRUE {{{ }}}
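The patch body in the comment above did not survive the trac import, so the following is only a minimal sketch of the kind of guard being described, not the actual cFE change. `Handle_t`, `CleanUpApp`, and the `check_used` and `stale_hits` parameters are hypothetical stand-ins for illustration; the real descriptors and cleanup loop in cfe_tbl_internal.c are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a table access descriptor. */
typedef struct {
    bool     UsedFlag; /* true while the handle is allocated */
    uint32_t AppId;    /* owner; left in place after release (the stale data) */
} Handle_t;

/*
 * Release every handle owned by app_id and return how many were released.
 * With check_used set, handles whose UsedFlag is false are skipped.
 * Without it, each stale handle is counted in *stale_hits, standing in
 * for a failed attempt to remove an already-removed access link.
 */
size_t CleanUpApp(Handle_t *handles, size_t count, uint32_t app_id,
                  bool check_used, size_t *stale_hits)
{
    size_t released = 0;

    assert(handles != NULL && stale_hits != NULL);
    *stale_hits = 0;

    for (size_t i = 0; i < count; i++) {
        if (handles[i].AppId != app_id) {
            continue; /* belongs to another application */
        }
        if (!handles[i].UsedFlag) {
            if (!check_used) {
                (*stale_hits)++; /* unguarded path touches a freed handle */
            }
            continue; /* guarded path silently skips it */
        }
        handles[i].UsedFlag = false; /* AppId deliberately not cleared */
        released++;
    }
    return released;
}
```

In this model, running the cleanup twice for the same AppID with the guard disabled reports one stale hit per previously released handle, while the guarded version reports none.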
Trac comment by gdecaruf on 2018-02-08 15:31:59: What are the repercussions in a current build that doesn't have this fix? Is it dangerous? It seems to only fail with the memory access error and keep going.
Trac comment by jhageman on 2019-07-03 12:48:08: Moved unfinished 6.6.1 issues to next minor release
When you delete an application that uses tables (e.g. HK) and then restart another task (e.g. SC) for a second time, the ES task writes to the system log that there are invalid memory handles.
The errors seem to occur when tables get unregistered. The message says it got a bad pointer for this table; it is not clear whether the app is messed up. This problem is not isolated to RestartApp. It occurs in DeleteApp as well. It looks like the linked list is not getting cleaned up properly when an app is deleted or restarted.
Further investigation in the CFS Lab narrows the problem down to the RemoveAccessLink function in cfe_tbl_internal.c. The errors are being generated on table handles from the deleted app. The buffer being returned to the pool is NULL because it has already been put back into the pool. The tables that were "cleaned up" still contain the AppID of the deleted app. When the subsequent app is restarted, it is assigned the deleted app's AppID and inherits the table handles from the previous app.
For example, the HK app has 2 tables and the SC app has 73 tables. When HK is deleted, the 2 tables are removed, but the entries still contain the AppID of HK. When SC is restarted, it takes the AppID that HK had. The errors occur only on the 2nd restart because on the first restart SC still had a unique AppID; on the second restart, it has inherited HK's original AppID. In this case, you will see 2 sets of errors when SC is restarted. The SC application did not show any adverse functionality because of these errors. All that is happening is that the PutPoolBuf function is reporting an error when trying to return a NULL buffer to the pool.
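The sequence described above can be reproduced in a toy model. This is a sketch under stated assumptions, not cFE code: `TblEntry_t`, `PutBuf`, and `CleanUpTables` are hypothetical names, `PutBuf` only imitates the reported behavior of rejecting a NULL buffer, and the AppID values are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: owner AppID plus the buffer taken from the pool. */
typedef struct {
    uint32_t AppId;  /* not cleared when the owning app is deleted */
    void    *BufPtr; /* set to NULL once returned to the pool */
} TblEntry_t;

/* Imitates a pool release that rejects a NULL buffer. */
static int PutBuf(void **buf)
{
    if (*buf == NULL) {
        return -1; /* nothing to return: the "invalid memory handle" case */
    }
    *buf = NULL;
    return 0;
}

/* Release all buffers owned by app_id; return the number of failed releases. */
int CleanUpTables(TblEntry_t *tables, size_t count, uint32_t app_id)
{
    int errors = 0;

    for (size_t i = 0; i < count; i++) {
        if (tables[i].AppId == app_id && PutBuf(&tables[i].BufPtr) != 0) {
            errors++; /* stale entry left behind by a previous owner */
        }
    }
    return errors;
}
```

With two entries owned by AppID 1 (playing HK) and one by AppID 2 (playing SC), cleaning up AppID 1 and then AppID 2 reports no errors. Once the SC entry is re-registered under AppID 1, the next cleanup of AppID 1 reports two errors, one per stale HK entry, while the SC buffer itself is released normally.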