Explanation of why killing RTPS offboard agent causes PX4 failure


My colleagues and I are using ROS2 with a Pixhawk 4 and a Raspberry Pi 4 for offboard control of a fixed-wing aircraft. During test flights, we had the aircraft flying around a loiter point in Mission mode and then triggered Offboard. After some unexpected behavior (probably caused by our Offboard code), the safety pilot took control in Stabilized mode. The aircraft was then returned to the loiter point in Mission mode. Our Offboard code has to be restarted before Offboard can be triggered again, so I killed the Offboard node and the RTPS Agent on the Raspberry Pi while the aircraft was loitering. This was not smart, considering we knew that restarting the RTPS Client can cause autopilot issues, but we had restarted the RTPS Agent during HITL testing without any issues. The aircraft immediately fell out of the loiter, the safety pilot reported having no control, and the telemetry connection failed. The aircraft then crashed.

I would like someone to explain why killing the RTPS Agent while operating in a non-Offboard mode would cause the Pixhawk to become totally unresponsive. It would also be very useful if someone could explain why HITL testing did not catch this problem.

Airframe: Standard Plane (2100)
Hardware: PX4_FMU_V5 (V500)
Software Version: Custom based on release 1.12.3
OS Version: NuttX, v8.2.0
Estimator: EKF2

I apologize for not uploading the flight log from the crash, but it contains custom messages and location data that I’m not keen to post online.

Thanks in advance.