"Failsafe enabled: no offboard" after copter has been disarmed

Hi there!

We are using a Pixhawk Cube with PX4 1.8.2.
We also have a companion computer connected to the Pixhawk over UART.
All communication goes through mavros.

Our algorithm for the end of the landing:

  • we send velocity vectors to the copter using /mavros/setpoint_raw/local
  • after the copter reaches the desired point, we send MAV_CMD_COMPONENT_ARM_DISARM
  • we continue to send the velocity setpoints
  • we wait for armed: False on the /mavros/state topic
  • once the copter reports disarmed, we stop sending velocity setpoints
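
For readers following along, the sequence above can be sketched as a small state machine. This is only an illustration of the steps as described, not our actual code: the phase names, the action strings, and the `reached_target` input are hypothetical, and `armed` stands in for the `armed` field read from /mavros/state.

```python
from enum import Enum, auto


class Phase(Enum):
    APPROACH = auto()      # flying to the landing point on velocity setpoints
    DISARM_SENT = auto()   # disarm requested, still streaming setpoints
    DONE = auto()          # disarm seen on the state topic, streaming stopped


def step(phase, reached_target, armed):
    """Return (next_phase, actions) for one control-loop tick.

    `reached_target` is a hypothetical flag from the companion's own
    position check; `armed` mirrors the `armed` field of /mavros/state.
    The action names are placeholders for the real publish/service calls.
    """
    if phase is Phase.APPROACH:
        if reached_target:
            # Keep streaming in the same tick that the disarm is requested.
            return Phase.DISARM_SENT, ["send_setpoint", "send_disarm"]
        return Phase.APPROACH, ["send_setpoint"]
    if phase is Phase.DISARM_SENT:
        if not armed:
            # Streaming stops here, while the mode may still be OFFBOARD.
            return Phase.DONE, []
        return Phase.DISARM_SENT, ["send_setpoint"]
    return Phase.DONE, []
```

Note that the only exit condition from `DISARM_SENT` is the arming flag; the flight mode is never looked at.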

Usually all goes well, but sometimes we get Failsafe enabled: no offboard.

That would be fine, but the failsafe is configured to return to home, so the copter tries to climb to 20 m inside the landing box. This can easily lead to a crash.

Our short investigation showed that PX4 processed the MAV_CMD_COMPONENT_ARM_DISARM command and told mavros that the copter is disarmed. But it looks like there is a race condition, and some parts of PX4 still think the copter is armed.

Here is a log of one of these flights: https://logs.px4.io/plot_app?log=701783b8-c09d-49d3-9f61-c52cf0066cac

We can also provide our own ROS logs if you need them.

So the question is: what are we doing wrong, and how can we avoid this failsafe?

Thanks for the help!

Reading the docs, it sounds like you’ll get this warning if you stop sending setpoints while you’re still in offboard mode. Any reason not to land in LAND mode/are you sure you’re out of offboard mode before you stop sending setpoints?

Thanks for your answer!
Yes, you are right, the copter is still in OFFBOARD mode, but it reports disarmed.
In my opinion, if the copter is reporting a “disarmed” state, it shouldn’t do anything else, like triggering a failsafe, should it?
The reason is that we are doing a very precise landing, so we need to do it with setpoints in offboard mode.

Probably not - but I’m not an expert. But assuming that in the first instance you just want a solution so you can keep on working … I’d try setting the mode to HOLD or similar before stopping offboard mode. An alternative is to try set https://docs.px4.io/en/advanced_config/parameter_reference.html#COM_OBL_RC_ACT to LAND - so if there is a timeout the vehicle won’t failsafe to RTL behaviour.

@JulianOes Can you answer these questions, or suggest someone else?

  • Should failsafes be disabled when landed and disarmed? Specifically I would expect the offboard mode to not trigger an RTL failsafe in this case.
  • Is there a preferred mode to switch offboard mode out of when landed in order to put it into an “idle” state?

An alternative is to try set…

Yes, that is a solution, but it feels like a bit of a dirty workaround. Relying on a failsafe is not the right way to work. =)

So I’ll wait for the answers from @JulianOes. For the moment, I think waiting for the flight mode to change from OFFBOARD before we stop sending setpoints will be a good solution.
Thanks a lot!
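
A minimal sketch of that interim stop condition, assuming the companion reads `armed` and `mode` from /mavros/state and that mavros reports the PX4 offboard mode as the string "OFFBOARD". The function names and action strings are hypothetical placeholders.

```python
def safe_to_stop_streaming(armed, mode, offboard_mode="OFFBOARD"):
    """True only once the vehicle reports disarmed AND has left OFFBOARD."""
    return (not armed) and mode != offboard_mode


def next_action(armed, mode):
    """Pick what the companion does on this tick (hypothetical action names)."""
    if safe_to_stop_streaming(armed, mode):
        return "stop_streaming"
    if not armed:
        # Disarmed but still in OFFBOARD: request another mode and keep
        # streaming so the offboard-loss timeout never fires.
        return "request_mode_change_and_stream"
    return "stream"
```

For example, `next_action(False, "OFFBOARD")` keeps streaming while asking for a mode change, and only `next_action(False, "AUTO.LOITER")` stops the stream. Which mode to request (HOLD was suggested above) is left open here.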

How are you checking that the copter actually disarmed? Are you checking the mavros_msgs::CommandBool::Response &res in:

From my quick look at the MAVROS code it looks like it usually waits for an ack and only then would report it to be a success.

From the log I can see that PX4 never realized that it was landed. Therefore the question is, did it ever actually disarm? From the PX4 logs I would argue no, it never disarmed.

I checked the commander code and could not immediately see a reason why it would not have disarmed, though; it should disarm even if not landed.

(I know it’s scary. It’s something I want to look into this summer!)

I would probably wait until the vehicle reports to be “landed” and only then send the disarm command. Or better yet you could use auto-disarm using https://dev.px4.io/en/advanced/parameter_reference.html#COM_DISARM_LAND.
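
As a rough sketch of the "wait for landed, then disarm" idea: the gate below only allows the disarm command after several consecutive on-ground reports. It assumes the landed state is read from the `landed_state` field of mavros_msgs/ExtendedState; the debounce count is my own addition for illustration, not something PX4 or mavros requires.

```python
# Value of LANDED_STATE_ON_GROUND in mavros_msgs/ExtendedState
# (mirrors MAV_LANDED_STATE_ON_GROUND in MAVLink).
LANDED_STATE_ON_GROUND = 1


class LandedGate:
    """Allow the disarm command only after N consecutive on-ground reports."""

    def __init__(self, required_reports=5):
        # required_reports is an illustrative value, tune it to the topic rate.
        self.required_reports = required_reports
        self._count = 0

    def update(self, landed_state):
        """Feed one landed_state sample; return True when disarm may be sent."""
        if landed_state == LANDED_STATE_ON_GROUND:
            self._count += 1
        else:
            # Any other report resets the streak.
            self._count = 0
        return self._count >= self.required_reports
```

Calling `update()` from the extended-state callback and sending the disarm only once it returns True would avoid disarming before PX4 itself believes it has landed.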

They should be but, as said, according to the log it was not landed or disarmed yet.

Not that I know of, but if it has really landed it should not happen.

Thanks for the answer!

How are you checking that the copter actually disarmed? Are you checking the mavros_msgs::CommandBool::Response &res in:

No, we are waiting for armed: False in /mavros/state topic: http://docs.ros.org/hydro/api/mavros/html/msg/State.html

A quick look shows that mavros updates this flag from two places:

  1. https://github.com/mavlink/mavros/blob/81ca77560d6d5d27dd82dcffcc0f081c251446bd/mavros/src/plugins/sys_status.cpp#L729
  2. https://github.com/mavlink/mavros/blob/81ca77560d6d5d27dd82dcffcc0f081c251446bd/mavros/src/plugins/sys_status.cpp#L677

Looks like it takes info from PX4 directly.

I can also add that one of the last messages in our ROS logs is this one from mavros:
FCU: DISARMED by arm/disarm component command. I’m not sure, but it seems that PX4 processed MAV_CMD_COMPONENT_ARM_DISARM but did not complete the disarm.

At first glance, I have not found any evidence that rules out a race condition. Sorry for my insistence, but could you explain where in the code it is guaranteed that disarming and a failsafe can’t happen simultaneously?

I’ve also found that FCU: DISARMED by arm/disarm component command comes from here: https://github.com/PX4/Firmware/blame/354935459993c207e3cc4a69d3ec9161b7332291/src/modules/commander/Commander.cpp#L817 (sorry for the blame link; we are using version 1.8.2, and in master someone capitalized words in this log message, heh).
Now I’m quite sure that PX4 processes the disarm command.
What do you think about that, @JulianOes?
Thanks!

Thanks for the details, that’s very odd and I don’t understand what is going on yet.

Could you share the code snippet around the disarm logic in your code? Then I can try to reproduce this.

For the record: I’ve shared code snippets in a DM.
Unfortunately, it’s not so easy to share them publicly.

Correct, and it all looks correct from what I can tell.

Therefore, I’ll have to investigate if there is a way that the commander disarms without publishing the result to the rest of the PX4 system.

I’m looking at the commander source of 1.8.2 and I can’t see how a new armed status would be acked but not published immediately to the rest of the system.

  1. The command is handled:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L2475-L2476

  2. ARM_DISARM is handled:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L768-L823

  3. arm_disarm determines the cmd_result:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L805-L811

  4. This cmd_result determines the command ack:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L1082-L1085

  5. handle_command returns with true which means status_changed is now true:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L2476

  6. If status_changed is true the new vehicle_status and actuator_armed is published here:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L2629-L2654

  7. Now the only thing wrong could be that we send an ack but don’t change to disarmed inside arm_disarm which then delegates to arming_state_transition:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/state_machine_helper.cpp#L95-L229

  8. Given that ret is TRANSITION_CHANGED, valid_transition must be true and presumably armed->armed is set to false:
    https://github.com/PX4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/state_machine_helper.cpp#L205-L228

I don’t yet see where it could go wrong :thinking:.

Thanks for this tremendous work, @JulianOes!

Yes, I think you are right, but I want to add something:

  1. Even if the command was acked and the internal state has changed, a failsafe can still be raised in this part:
    https://github.com/px4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L1613-L1643
  2. And be processed here:
    https://github.com/px4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/commander.cpp#L2595-L2607
  3. And here:
    https://github.com/px4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/state_machine_helper.cpp#L693-L695
  4. Here the flag status->failsafe is set to true, even if the copter isn’t armed:
    https://github.com/px4/Firmware/blob/f13bbacd5277123d6af2cb5ed21587c220031353/src/modules/commander/state_machine_helper.cpp#L432-L439

So I haven’t found the place where the status->armed flag could be changed back to true, but it’s really suspicious, isn’t it?

Oh, and I’ve written code that easily reproduces the situation where a failsafe is enabled even though the copter has reported a disarm.
It happens every time if you stop sending setpoints but do not change the flight mode from OFFBOARD.
In this case, even if internally vehicle_status.armed_state != ARMED, the firmware will trigger the failsafe.
If you have code that can fly by setpoints in offboard mode, just do this:

  1. Take off by setpoint
  2. Send ARM_DISARM to disarm the copter
  3. Wait for state.armed = false on the /mavros/state topic.
  4. Stop sending setpoints
    And see the result.
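
To make the claim concrete, here is a toy model of what these steps exercise. It is not PX4 code: the function, its parameters, and the 0.5 s timeout used in the example are all illustrative assumptions. It only encodes the observation from this thread that the "no offboard" check does not look at the arming state, plus a hypothetical guard that would.

```python
def offboard_failsafe_triggers(mode, last_setpoint_age_s, timeout_s, armed,
                               require_armed=False):
    """Toy model: does the 'no offboard' failsafe fire on this tick?

    With require_armed=False the arming state is ignored, which matches
    the behaviour reported in this thread. require_armed=True is a
    hypothetical guard, not something PX4 1.8.2 is known to implement.
    """
    # Offboard is considered lost when the mode is still OFFBOARD but the
    # last setpoint is older than the timeout.
    offboard_lost = mode == "OFFBOARD" and last_setpoint_age_s > timeout_s
    if require_armed:
        return offboard_lost and armed
    return offboard_lost
```

With the observed behaviour, `offboard_failsafe_triggers("OFFBOARD", 1.0, 0.5, armed=False)` is True, which is step 4 above; with the hypothetical guard enabled the same call returns False.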

I think it relates to my previous reply.
Hope it helps, @JulianOes
Thanks!

Thanks for the digging. You’re correct that it can still go into failsafe; that’s probably wrong, or at least not always desired, and we should fix it. However, I still can’t see how it would arm again :thinking: