Hi Ryan,
We still see rare cases in which the ZigBee Linux server crashes. When that happens, the script enters a loop like this:
[20:08:39.495,055] [Z_STACK/HNDL] ERROR : ERROR: signal 11 was trigerred:
[20:08:39.495,216] [Z_STACK/HNDL] ERROR : Fault address: 0xfffffff8
[20:08:39.495,248] [Z_STACK/HNDL] ERROR : Fault reason: address not mapped to object
[20:08:39.496,883] [Z_STACK/HNDL] ERROR : Stack trace unavailable
[20:08:39.496,979] [Z_STACK/HNDL] ERROR : Executing original handler...
pid 725 is not there
count is 1, not 4
kill -SIGUSR2 681
caught SIGUSR2, a server other than NWKMGR died!
waiting for GATEWAY SERVER to exit
tracker exiting
[20:08:41.496,238] [GATEWAY/LSTN] ERROR : SRSP Cond Wait timed out!
[20:08:41.496,418] [GATEWAY/LSTN] ERROR : apicSendSynchData() failed getting response
[20:08:43.497,055] [GATEWAY/MAIN] ERROR : SRSP Cond Wait timed out!
[20:08:43.497,161] [GATEWAY/MAIN] ERROR : apicSendSynchData() failed getting response
recv: Connection reset by peer
waiting for OTA SERVER to exit
waiting for Zstack linux to exit
waiting for NPI to exit
NETWORK MANAGER exited with code 140 on Sun Mar 7 20:08:48 CET 2021
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
a server besides NWKMGR has exited!
ignoring exit code 140 from netmgr
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
waiting for netmgr to exit ( pid 0 ) on Sun Mar 7 20:08:51 CET 2021
oops! Network manager has already exited (!) on Sun Mar 7 20:08:51 CET 2021
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
a server besides NWKMGR has exited!
ignoring exit code 127 from netmgr
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
waiting for netmgr to exit ( pid 0 ) on Sun Mar 7 20:08:53 CET 2021
oops! Network manager has already exited (!) on Sun Mar 7 20:08:53 CET 2021
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
a server besides NWKMGR has exited!
ignoring exit code 127 from netmgr
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
waiting for netmgr to exit ( pid 0 ) on Sun Mar 7 20:08:56 CET 2021
oops! Network manager has already exited (!) on Sun Mar 7 20:08:56 CET 2021
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
a server besides NWKMGR has exited!
ignoring exit code 127 from netmgr
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
waiting for netmgr to exit ( pid 0 ) on Sun Mar 7 20:08:58 CET 2021
oops! Network manager has already exited (!) on Sun Mar 7 20:08:58 CET 2021
making sure there are no lingering servers...
there are 0 NPI servers
there are 0 ZLS servers
there are 0 GATEWAY servers
there are 0 NWKMGR servers
there are 0 OTA servers
(total 0)
done
Technically, I could kill the script at the point where it prints "oops" and restart the whole process. I am asking first in case you have an "in script" idea for handling this correctly, since it looks like there is already a restart mechanism in place that just fails for some reason.
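To make the workaround I have in mind concrete, here is a rough sketch of an external check, not a change to your script. It only assumes that the "oops! Network manager has already exited" line from the log above keeps repeating while the loop is stuck. The function name detect_restart_loop, the threshold of 3 and the script name in the usage comment are all made up by me for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the gateway script): reads log lines from
# stdin and returns 0 once the "oops" message has been seen max_oops times in
# total, which is taken here as a sign that the restart loop is stuck.
# Returns 1 if stdin ends before the threshold is reached.
detect_restart_loop() {
    max_oops="${1:-3}"   # assumed threshold, tune as needed
    oops_count=0
    while IFS= read -r line; do
        case "$line" in
            *"Network manager has already exited"*)
                oops_count=$((oops_count + 1))
                if [ "$oops_count" -ge "$max_oops" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}

# Possible usage (script name is a placeholder):
#   ./gateway_start_script 2>&1 | detect_restart_loop 3 && echo "stuck, restarting"
```

Once the function returns, the pipe is closed, so the looping script would normally receive SIGPIPE on its next write; an explicit kill by pid would be the more robust option, but I have not tested either against the real script.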
Regards
Peter