You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While debugging issues with running the regression tests in PR #418, we noticed that test_restconf_internal.sh often failed, and always at the same step:
Test 41(41) [11. Get restconf status rpc]
Test 42(42) [send rpc status]
grep: warning: stray \ before -
grep: warning: stray \ before -
Test 43(43) [Kill backend]
Feb 10 12:27:07: Killing old daemon with pid: 60371
Test 44(44) [Restart backend -s none]
Test 45(45) [kill old backend]
Test 46(46) [start backend -s none -f /var/tmp/./test_restconf_internal.sh/conf.xml]
Test 47(47) [wait backend]
Test 48(48) [wait restconf]
speed 38400 baud; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0;
-brkint -imaxbel
Test 49(49) [12. Get restconf (running) after restart]
Test 50(50) [send rpc status]
grep: warning: stray \ before -
grep: warning: stray \ before -
Error in Test50 [send rpc status]:
Expected
^<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="42"><active xmlns="http://clicon.org/lib">true</active><description xmlns="http://clicon.org/lib">Clixon RESTCONF process</description><command xmlns="http://clicon.org/lib">/.*/clixon_restconf -f /var/tmp/./test_restconf_internal.sh/conf.xml -D [0-9] .*</command><status xmlns="http://clicon.org/lib">running</status><starttime xmlns="http://clicon.org/lib">20[0-9][0-9]\-[0-9][0-9]\-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]*Z</starttime><pid xmlns="http://clicon.org/lib">0</pid></rpc-reply>$
Received:
#889
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="42"><active xmlns="http://clicon.org/lib">false</active><description xmlns="http://clicon.org/lib">Clixon RESTCONF process</description><command xmlns="http://clicon.org/lib">/usr/local/sbin/clixon_restconf -f /var/tmp/./test_restconf_internal.sh/conf.xml -D 0 -l s -R <restconf xmlns="http://clicon.org/restconf"><enable>true</enable><auth-type>none</auth-type><debug>0</debug><log-destination>syslog</log-destination><enable-core-dump>false</enable-core-dump><pretty>false</pretty><socket><namespace>default</namespace><address>0.0.0.0</address><port>80</port><ssl>false</ssl></socket></restconf></command><status xmlns="http://clicon.org/lib">stopped</status></rpc-reply>
##
We've reproduced this on three different machines, all running a Ubuntu 18.04 or 22.04 derivative, despite the tests running in an Alpine Linux docker container. (On an old Fedora 33 server, however, all tests pass ... unclear why, but worth mentioning for reference.)
Turns out that at the end of step "11. Get restconf status rpc", when stop_backend -f $cfg is called, the clixon_backend -z -f $cfg does not stop the clixon_restconf child. So in step "12. Get restconf (running) after restart" the rpcstatus call fails.
Adding a killall "fixes" the problem:
@@ -320,6 +320,7 @@ if [ $BE -ne 0 ]; then
fi
# kill backend
stop_backend -f $cfg
+ killall clixon_restconf
fi
# Restconf is enabled and restconf was running but was killed by stop ^.
Obviously this is not the proper fix. Question is why clixon_backend doesn't kill the child it started, or why a new clixon_backend cannot reconnect with an already running clixon_restconf process? I'm not up to speed how that dynamic works, but maybe someone else in the wider Clixon community do.
There were a few other warnings as well, for instance the "stray \ before -" grep warning. This seems to just be a minor regexp mistake (- not needed outside of regexp ranges):
I think that test_restconf_internal.sh suffers from some quality problems, ie the test itself, it was somewhat problematic writing it because of timing and multile processes.
Thank you for your effort.
The "internal" restconf is one of three ways to start the restconf daemon, as described here: https://clixon-docs.readthedocs.io/en/latest/restconf.html#datastore
ie by forking from the backend. it is probably not the most common scenario (such as starting the restconf daemon separately) but there are products using this method.
test_restconf_internal.sh tests this.
But as you point out there are some issues.
If you want you are welcome to debug this, at least the symtom: what more specifically is wrong, and how to repeat it deterministically.
There is some docs here on starting sub-processes.
While debugging issues with running the regression tests in PR #418, we noticed that
test_restconf_internal.sh
often failed, and always at the same step:We've reproduced this on three different machines, all running a Ubuntu 18.04 or 22.04 derivative, despite the tests running in an Alpine Linux docker container. (On an old Fedora 33 server, however, all tests pass ... unclear why, but worth mentioning for reference.)
Turns out that at the end of step
"11. Get restconf status rpc"
, whenstop_backend -f $cfg
is called, theclixon_backend -z -f $cfg
does not stop theclixon_restconf
child. So in step"12. Get restconf (running) after restart"
therpcstatus
call fails.Adding a
killall
"fixes" the problem:Obviously this is not the proper fix. Question is why
clixon_backend
doesn't kill the child it started, or why a newclixon_backend
cannot reconnect with an already runningclixon_restconf
process? I'm not up to speed how that dynamic works, but maybe someone else in the wider Clixon community do.There were a few other warnings as well, for instance the "stray \ before -" grep warning. This seems to just be a minor regexp mistake (- not needed outside of regexp ranges):
Compared to the Fedora machine we also noticed that restoring the tty with stty also gives some noise. Here's a lib.sh patch:
Finally, in lib.sh, there's an additional pkill that seems a bit malplaced:
The text was updated successfully, but these errors were encountered: