Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[op-creds] Fix RemoveFabric command #13869

Merged
merged 1 commit into from
Jan 26, 2022

Conversation

Damian-Nordic
Copy link
Contributor

Problem

RemoveFabric command called for the fabric that is used to process the message crashes in SessionHolder:

#0  _exit (status=<optimized out>) at /home/damian/src/ncs/zephyr/lib/libc/newlib/libc-hooks.c:281
#1  0x000891f8 in abort ()
#2  0x000713fa in chipAbort () at /home/damian/src/chip/src/lib/support/CodeUtils.h:477
#3  chip::Optional<chip::ReferenceCountedHandle<chip::Transport::Session> >::Value() const & (this=0x20026e00 <chip::DeviceLayer::sChipThreadStack+6816>)
    at /home/damian/src/chip/src/lib/core/Optional.h:170
#4  chip::SessionHolder::Get (this=this@entry=0x20004728 <chip::Server::sServer+8216>) at /home/damian/src/chip/src/transport/SessionHolder.h:54
#5  0x0007146a in chip::Messaging::ExchangeContext::GetSessionHandle (this=0x200046f8 <chip::Server::sServer+8168>) at /home/damian/src/chip/src/messaging/ExchangeContext.h:163
#6  (anonymous namespace)::FabricCleanupExchangeDelegate::OnExchangeClosing (this=<optimized out>, ec=0x200046f8 <chip::Server::sServer+8168>)
    at /home/damian/src/chip/src/app/clusters/operational-credentials-server/operational-credentials-server.cpp:324
#7  0x00080c74 in chip::Messaging::ExchangeContext::DoClose (this=0x200046f8 <chip::Server::sServer+8168>, clearRetransTable=<optimized out>)
    at ../../../../../../src/messaging/ExchangeContext.cpp:189
#8  0x00080ec8 in chip::Messaging::ExchangeContext::OnSessionReleased (this=0x200046f8 <chip::Server::sServer+8168>) at ../../../../../../src/messaging/ExchangeContext.cpp:338
#9  0x0004b3d2 in chip::Transport::Session::NotifySessionReleased (this=0x20002970 <chip::Server::sServer+608>) at ../../../../../../src/lib/support/IntrusiveList.h:292
#10 chip::Transport::SecureSession::~SecureSession (this=0x20002970 <chip::Server::sServer+608>, __in_chrg=<optimized out>) at ../../../../../../src/transport/SecureSession.h:72
#11 0x000822ca in chip::BitMapObjectPool<chip::Transport::SecureSession, 16u>::ReleaseObject (this=0x200028a8 <chip::Server::sServer+408>, element=0x20002970 <chip::Server::sServer+608>)
    at ../../../../../../src/lib/support/Pool.h:229
#12 0x0008233e in chip::Transport::SecureSessionTable<16u>::ReleaseSession (session=<optimized out>, this=<optimized out>) at ../../../../../../src/transport/SecureSessionTable.h:72
#13 chip::SessionManager::<lambda(auto:4)>::operator()<chip::Transport::SecureSession*> (session=<optimized out>, __closure=<optimized out>)
    at ../../../../../../src/transport/SessionManager.cpp:310
#14 chip::internal::LambdaProxy<chip::Transport::SecureSession, chip::SessionManager::ExpireAllPairingsForFabric(chip::FabricIndex)::<lambda(auto:4)> >::Call(void *, void *) (
    context=<optimized out>, target=<optimized out>) at ../../../../../../src/lib/support/Pool.h:126
#15 0x0008096a in chip::internal::StaticAllocatorBitmap::ForEachActiveObjectInner (this=this@entry=0x200028a8 <chip::Server::sServer+408>, 
    context=context@entry=0x20026e48 <chip::DeviceLayer::sChipThreadStack+6888>, 
    lambda=lambda@entry=0x82327 <chip::internal::LambdaProxy<chip::Transport::SecureSession, chip::SessionManager::ExpireAllPairingsForFabric(chip::FabricIndex)::<lambda(auto:4)> >::Call(void *, void *)>) at ../../../../../../src/lib/support/Pool.h:91
#16 0x0004291c in chip::BitMapObjectPool<chip::Transport::SecureSession, 16>::ForEachActiveObject<chip::SessionManager::ExpireAllPairingsForFabric(chip::FabricIndex)::<lambda(auto:4)> > (
    function=..., this=0x200028a8 <chip::Server::sServer+408>) at /opt/gnuarmemb/gcc-arm-none-eabi-9-2019-q4-major/arm-none-eabi/include/c++/9.2.1/bits/move.h:99
#17 chip::Transport::SecureSessionTable<16>::ForEachSession<chip::SessionManager::ExpireAllPairingsForFabric(chip::FabricIndex)::<lambda(auto:4)> > (function=..., 
    this=0x200028a8 <chip::Server::sServer+408>) at ../../../../../../src/transport/SecureSessionTable.h:77
#18 chip::SessionManager::ExpireAllPairingsForFabric (this=0x20002760 <chip::Server::sServer+80>, fabric=<optimized out>, fabric@entry=1 '\001')
    at ../../../../../../src/transport/SessionManager.cpp:307
#19 0x00071484 in (anonymous namespace)::FabricCleanupExchangeDelegate::OnExchangeClosing (this=<optimized out>, ec=0x200046f8 <chip::Server::sServer+8168>)
    at /home/damian/src/chip/src/messaging/ExchangeMgr.h:186
#20 0x00080c74 in chip::Messaging::ExchangeContext::DoClose (this=0x200046f8 <chip::Server::sServer+8168>, clearRetransTable=<optimized out>)
    at ../../../../../../src/messaging/ExchangeContext.cpp:189
#21 0x00080cda in chip::Messaging::ExchangeContext::Close (this=this@entry=0x200046f8 <chip::Server::sServer+8168>) at ../../../../../../src/messaging/ExchangeContext.cpp:225
#22 0x00080cf6 in chip::Messaging::ExchangeContext::MessageHandled (this=this@entry=0x200046f8 <chip::Server::sServer+8168>) at ../../../../../../src/messaging/ExchangeContext.cpp:489
#23 0x00080e7e in chip::Messaging::ExchangeContext::SendMessage (this=0x200046f8 <chip::Server::sServer+8168>, protocolId=..., msgType=msgType@entry=9 '\t', msgBuf=..., sendFlags=...)

The reason is that sessions for the removed fabric are expired when the command's exchange is closed after sending the response and that triggers closing all associated session holders. An exchange's session holder closes the exchange which in this case results in an attempt to close an exchange that is in the middle of closing...

Change overview

  • Don't try to close already-being-closed exchange.
  • Stop using MRP for sending RemoveFabric response since the MRP engine tries to access the exchange's session object even after the exchange is closed.

Testing

Tested manually on nRF Connect examples that commissioning + RemoveFabric commands don't crash the firmware.

RemoveFabric command called for the fabric that is used to
process the message crashes in SessionHolder. The reason is
that sessions for the removed fabric are expired when the
command's exchange is closed after sending the response
and that triggers closing all associated session holders.
An exchange's session holder closes the exchange which
in this case results in an attempt to close an exchange
that is in the middle of closing...
@Damian-Nordic
Copy link
Contributor Author

@bzbarsky-apple Not sure if that can be considered a fix for #9642 or we would actually prefer keeping the MRP for RemoveFabric. I tried to go that route by adding some OnExchangeReleased to ExchangeDelegate, but it seems that we set the null delegate to an exchange in many different places, so I guess currently ExchangeDelegate cannot be used to notify an application that an exchange has been really released. Therefore, disabling the MRP seemed like the easiest fix for the moment.

@github-actions
Copy link

github-actions bot commented Jan 24, 2022

PR #13869: Size comparison from 1b49aee to bc49db9

Increases (26 builds for cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, p6, telink)
platform target config section 1b49aee bc49db9 change % change
cyw30739 light cyw930739m2evb_01 (read/write) 574390 574414 24 0.0
.app_xip_area 479404 479428 24 0.0
efr32 lighting-app BRD4161A (read only) 835916 835932 16 0.0
.text 835908 835924 16 0.0
BRD4161A+rpc (read only) 823336 823352 16 0.0
.text 823328 823344 16 0.0
window-app BRD4161A (read only) 806476 806492 16 0.0
.text 806468 806484 16 0.0
esp32 all-clusters-app c3devkit (read only) 919076 919098 22 0.0
.flash.text 919076 919098 22 0.0
m5stack (read only) 967519 967531 12 0.0
.flash.text 962135 962147 12 0.0
k32w light k32w061+release (read/write) 660588 660604 16 0.0
.text 576312 576328 16 0.0
lock k32w061+release (read/write) 661312 661328 16 0.0
.text 576776 576792 16 0.0
linux chip-tool-ipv6only arm64 (read only) 8536964 8536996 32 0.0
.text 7261028 7261060 32 0.0
thermostat-no-ble arm64 (read only) 2051916 2051964 48 0.0
.text 1706720 1706768 48 0.0
mbed lighting-app CY8CPROTO_062_4343W+release (read/write) 2334408 2334472 64 0.0
.text 1297008 1297072 64 0.0
lock-app CY8CPROTO_062_4343W+release (read/write) 2304760 2304824 64 0.0
.text 1267360 1267424 64 0.0
nrfconnect lighting-app nrf52840dk_nrf52840 (read/write) 942879 942895 16 0.0
text 637312 637328 16 0.0
nrf52840dk_nrf52840+rpc (read/write) 928347 928363 16 0.0
text 632708 632724 16 0.0
nrf52840dongle_nrf52840 (read/write) 993527 993543 16 0.0
text 669512 669528 16 0.0
nrf5340dk_nrf5340_cpuapp (read/write) 852706 852722 16 0.0
text 554232 554248 16 0.0
lock-app nrf52840dk_nrf52840 (read/write) 912223 912239 16 0.0
text 612756 612772 16 0.0
nrf5340dk_nrf5340_cpuapp (read/write) 822294 822310 16 0.0
text 529712 529728 16 0.0
pump-app nrf52840dk_nrf52840 (read/write) 915083 915099 16 0.0
text 615404 615420 16 0.0
pump-controller-app nrf52840dk_nrf52840 (read/write) 910267 910283 16 0.0
text 611428 611444 16 0.0
shell nrf52840dk_nrf52840 text 533928 533932 4 0.0
nrf5340dk_nrf5340_cpuapp text 451608 451612 4 0.0
p6 all-clusters-app default (read/write) 2410664 2410680 16 0.0
.text 1368928 1368944 16 0.0
light-app default (read/write) 2330736 2330768 32 0.0
.text 1289000 1289032 32 0.0
lock-app default (read/write) 2299752 2299768 16 0.0
.text 1258016 1258032 16 0.0
telink lighting-app tlsr9518adk80d (read/write) 841718 841742 24 0.0
text 588346 588366 20 0.0
Full report (33 builds for cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, p6, qpg, telink)
platform target config section 1b49aee bc49db9 change % change
cyw30739 light cyw930739m2evb_01 (read/write) 574390 574414 24 0.0
.app_xip_area 479404 479428 24 0.0
.bss 77724 77724 0 0.0
.data 604 604 0 0.0
.rodata 0 0 0 0.0
.text 0 0 0 0.0
efr32 lighting-app BRD4161A (read only) 835916 835932 16 0.0
(read/write) 127244 127244 0 0.0
.bss 125344 125344 0 0.0
.data 1900 1900 0 0.0
.text 835908 835924 16 0.0
BRD4161A+rpc (read only) 823336 823352 16 0.0
(read/write) 143904 143904 0 0.0
.bss 141904 141904 0 0.0
.data 2000 2000 0 0.0
.text 823328 823344 16 0.0
window-app BRD4161A (read only) 806476 806492 16 0.0
(read/write) 125812 125812 0 0.0
.bss 123960 123960 0 0.0
.data 1852 1852 0 0.0
.text 806468 806484 16 0.0
esp32 all-clusters-app c3devkit (read only) 919076 919098 22 0.0
(read/write) 1383466 1383466 0 0.0
.dram0.bss 70720 70720 0 0.0
.dram0.data 14252 14252 0 0.0
.flash.rodata 179376 179376 0 0.0
.flash.text 919076 919098 22 0.0
.iram0.text 62056 62056 0 0.0
m5stack (read only) 967519 967531 12 0.0
(read/write) 449884 449884 0 0.0
.dram0.bss 75184 75184 0 0.0
.dram0.data 34032 34032 0 0.0
.flash.rodata 208540 208540 0 0.0
.flash.text 962135 962147 12 0.0
.iram0.text 123399 123399 0 0.0
k32w light k32w061+release (read/write) 660588 660604 16 0.0
.bss 76608 76608 0 0.0
.data 1868 1868 0 0.0
.text 576312 576328 16 0.0
lock k32w061+release (read/write) 661312 661328 16 0.0
.bss 76848 76848 0 0.0
.data 1888 1888 0 0.0
.text 576776 576792 16 0.0
linux chip-tool-ipv6only arm64 (read only) 8536964 8536996 32 0.0
(read/write) 391009 391009 0 0.0
.bss 56049 56049 0 0.0
.data 1128 1128 0 0.0
.data.rel.ro 261032 261032 0 0.0
.dynamic 560 560 0 0.0
.got 69056 69056 0 0.0
.init 24 24 0 0.0
.init_array 200 200 0 0.0
.rodata 436908 436908 0 0.0
.text 7261028 7261060 32 0.0
thermostat-no-ble arm64 (read only) 2051916 2051964 48 0.0
(read/write) 145697 145697 0 0.0
.bss 64785 64785 0 0.0
.data 904 904 0 0.0
.data.rel.ro 72984 72984 0 0.0
.dynamic 560 560 0 0.0
.got 4064 4064 0 0.0
.init 24 24 0 0.0
.init_array 320 320 0 0.0
.rodata 130700 130700 0 0.0
.text 1706720 1706768 48 0.0
mbed all-clusters-app CY8CPROTO_062_4343W+release (read only) 6224 6224 0 0.0
(read/write) 2352184 2352184 0 0.0
.bss 189428 189428 0 0.0
.data 5304 5304 0 0.0
.text 1314760 1314760 0 0.0
lighting-app CY8CPROTO_062_4343W+release (read only) 6224 6224 0 0.0
(read/write) 2334408 2334472 64 0.0
.bss 180960 180960 0 0.0
.data 5584 5584 0 0.0
.text 1297008 1297072 64 0.0
lock-app CY8CPROTO_062_4343W+release (read only) 6224 6224 0 0.0
(read/write) 2304760 2304824 64 0.0
.bss 179960 179960 0 0.0
.data 5560 5560 0 0.0
.text 1267360 1267424 64 0.0
pigweed-app CY8CPROTO_062_4343W+release (read only) 6224 6224 0 0.0
(read/write) 1139712 1139712 0 0.0
.bss 11756 11756 0 0.0
.data 4368 4368 0 0.0
.text 103096 103096 0 0.0
shell CY8CPROTO_062_4343W+release (read only) 6224 6224 0 0.0
(read/write) 2054256 2054256 0 0.0
.bss 156876 156876 0 0.0
.data 4864 4864 0 0.0
.text 1016856 1016856 0 0.0
nrfconnect lighting-app nrf52840dk_nrf52840 (read/write) 942879 942895 16 0.0
bss 119044 119044 0 0.0
rodata 108916 108916 0 0.0
text 637312 637328 16 0.0
nrf52840dk_nrf52840+rpc (read/write) 928347 928363 16 0.0
bss 116088 116088 0 0.0
rodata 101368 101368 0 0.0
text 632708 632724 16 0.0
nrf52840dongle_nrf52840 (read/write) 993527 993543 16 0.0
bss 121884 121884 0 0.0
rodata 113672 113672 0 0.0
text 669512 669528 16 0.0
nrf5340dk_nrf5340_cpuapp (read/write) 852706 852722 16 0.0
bss 115828 115828 0 0.0
rodata 102092 102092 0 0.0
text 554232 554248 16 0.0
lock-app nrf52840dk_nrf52840 (read/write) 912223 912239 16 0.0
bss 118196 118196 0 0.0
rodata 103884 103884 0 0.0
text 612756 612772 16 0.0
nrf5340dk_nrf5340_cpuapp (read/write) 822294 822310 16 0.0
bss 115016 115016 0 0.0
rodata 97112 97112 0 0.0
text 529712 529728 16 0.0
pigweed-app nrf52840dk_nrf52840 (read/write) 541835 541835 0 0.0
bss 52588 52588 0 0.0
rodata 50104 50104 0 0.0
text 376940 376940 0 0.0
pump-app nrf52840dk_nrf52840 (read/write) 915083 915099 16 0.0
bss 117960 117960 0 0.0
rodata 104248 104248 0 0.0
text 615404 615420 16 0.0
pump-controller-app nrf52840dk_nrf52840 (read/write) 910267 910283 16 0.0
bss 117984 117984 0 0.0
rodata 103400 103400 0 0.0
text 611428 611444 16 0.0
shell nrf52840dk_nrf52840 (read/write) 798527 798527 0 0.0
bss 109776 109776 0 0.0
rodata 78324 78324 0 0.0
text 533928 533932 4 0.0
nrf5340dk_nrf5340_cpuapp (read/write) 711342 711342 0 0.0
bss 107664 107664 0 0.0
rodata 72624 72624 0 0.0
text 451608 451612 4 0.0
p6 all-clusters-app default (read/write) 2410664 2410680 16 0.0
.bss 117772 117772 0 0.0
.data 2584 2584 0 0.0
.text 1368928 1368944 16 0.0
light-app default (read/write) 2330736 2330768 32 0.0
.bss 105544 105544 0 0.0
.data 2408 2408 0 0.0
.text 1289000 1289032 32 0.0
lock-app default (read/write) 2299752 2299768 16 0.0
.bss 104392 104392 0 0.0
.data 2360 2360 0 0.0
.text 1258016 1258032 16 0.0
qpg lighting-app qpg6105+debug (read only) 567476 567476 0 0.0
(read/write) 146940 146940 0 0.0
.bss 89688 89688 0 0.0
.data 1064 1064 0 0.0
.text 562156 562156 0 0.0
lock-app qpg6105+debug (read only) 515904 515904 0 0.0
(read/write) 146936 146936 0 0.0
.bss 88256 88256 0 0.0
.data 988 988 0 0.0
.text 510584 510584 0 0.0
persistent-storage-app qpg6105+debug (read only) 107140 107140 0 0.0
(read/write) 146940 146940 0 0.0
.bss 38504 38504 0 0.0
.data 288 288 0 0.0
.text 101820 101820 0 0.0
telink lighting-app tlsr9518adk80d (read/write) 841718 841742 24 0.0
bss 87492 87492 0 0.0
noinit 37160 37160 0 0.0
text 588346 588366 20 0.0

@msandstedt
Copy link
Contributor

No response at all after deletion of the fabric carrying the session should be possible because to respond requires the use of keys after they were supposed to have been removed. From the client side, I would expect this to look like timeout.

From the description, it sounds like this PR does result in this behavior.

Copy link
Contributor

@bzbarsky-apple bzbarsky-apple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine, insofar as the whole hack we have is fine....

@Damian-Nordic Damian-Nordic merged commit 16e4077 into project-chip:master Jan 26, 2022
@Damian-Nordic Damian-Nordic deleted the fix-remove-fabric branch January 26, 2022 15:30
selissia pushed a commit to selissia/connectedhomeip that referenced this pull request Jan 28, 2022
RemoveFabric command called for the fabric that is used to
process the message crashes in SessionHolder. The reason is
that sessions for the removed fabric are expired when the
command's exchange is closed after sending the response
and that triggers closing all associated session holders.
An exchange's session holder closes the exchange which
in this case results in an attempt to close an exchange
that is in the middle of closing...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants