Huge Control Delay in Offboard Mode with ROS2 RTPS

The official doc gave an example of offboard control using position setpoints sent at 10Hz. If I try to speed up the control loop to ~100Hz, I begin to see large delay between setpoint sent from ROS2 and actual vehicle response (~0.5 - 1 second visually noticeable delay). I have tried a couple of things to alleviate this problem and below is a list of attempts/guesses after reading some of the source code. It would be much appreciated if someone familiar with mc_att_control, mc_pos_control, mc_rate_control, and / or RTPS can explain why this is happening.

  • This comment in the official example suggests that setpoint and control model messages be sent in pairs. I tried decoupling the two messages to two asynchronously. It seems like running the control model publisher at 10Hz and setpoint publisher at much higher frequency helps with (eliminates) the delay. Maybe its worth updating the official example to not send the two messages in a synchronous loop?
  • I am also trying to do attitude / bodyrate control in offboard mode, and I achieved it by adding the RTPS topics in this PR. After reading some source code, if I understand it correctly, mc_pos_control runs at a much lower frequency than mc_att_control and mc_rate_control, so maybe there are different ranges of acceptable frequencies to run the setpoint publisher loop for the 3 different setpoint types. Can this be documented somewhere?
  • Can this delay be caused by RTPS bandwidth limitations? Using the decoupling method, I was able to achieve 200Hz bodyrate control. However, after adding in a motion capture node which sends pose estimates to the vehicle_visual_odometry channel at 50Hz, the large control delay occurred again. I have to reduce the control rate down to 100Hz to remove (at least visually) the control delay. Could it be caused by micrortps_agent reaching some bandwidth limit?
  • Everything I have tried so far is on the Gazebo simulator. Could there be some bandwidth problem in sitl_gazebo such that the delay can go away on an actual hardware drone?

Any help is appreciated!

1 Like

Is this the actual communication delay? or the simply the position controller being sluggish and converging slowly to the setpoints? Do you have a log that shows the problem you are describing?

I think these two should not matter. Have you identified where such a delay is being caused?

From your comments, it does sound like you might have a bandwidth limitation issue. I think it is important to check whether this is the case with maybe comparing the timestamps on the px4 side to see if this is true

I am using body rate setpoints and I am pretty sure this is not the “convergence delay,” since reducing the setpoint publisher frequency made the control more responsive visually, and the delay was large (~0.5 - 1 second).

Here is a flight log with a 330Hz body rate publisher and 50Hz visual odometry publisher. I have not used any log analyzer before. Do you have any pointers on what would I be looking for in the logs?

Another weird observation: the logged vehicle_visual_odometry messages sometimes have timestamp < timestamp_sampled. When I published with ROS2, I made sure timestamp > timestamp_sampled. Is the logged timestamp different from the published timestamp?

1 Like

@alvinsunyixiao What are the received bodyrate setpoints inside the system? Do you get 330Hz body rate messages inside PX4?

Ah no. Here is a screen shot of the converted CSV file for vehicle_rates_setpoint

The leftmost column is the timestamp field, and the dt is ~14-24ms, which is much slower than 330Hz. I tried reducing it to 200Hz and the received frequency is pretty much the same.

If this is a bandwidth issue, is there any workaround for this? I recall in ROS1, MavROS is able to handle at least 200Hz body rate control with 100Hz motion capture feed.

@alvinsunyixiao Great! At least we know now what is causing the issue

One small note is that being able to send 200Hz does not necessarily mean that the vehicle is using it at 200Hz. This would largely depend on what kind of hardware you are using. In the end, if you are using regular 3 phase controlled BLDC motors, the motor speed update frequency is once per revolution - which is normally around 100Hz (for a 8 inch prop quadrotor), faster if you are using a high kv motor. Therefore there is really no direct benefit of stuffing data into the pipeline since your actuator update rate is not that fast.

If this is a bandwidth issue, is there any workaround for this? I recall in ROS1, MavROS is able to handle at least 200Hz body rate control with 100Hz motion capture feed.

This is not really related to the performance of mavros. This depends more on the link you are using and the quality of the link. In principle, RTPS protocol should have much larger throughput compared to mavlink. That is the reason I don’t think it is really the protocol itself but something around the setup and configuration of your system.

Which link are you sending your data over? Could you outline more on what your setup is? (How are you sending your rtps topics?

2 Likes

I have a similar issue:
I try to send attitude commands to my quadcopter in offboard mode through ros2/micrortps.
the system have a big delay and my controller become unstable…

My setup is in SITL ( :sweat_smile:). I have a ros2 node publishing these messages to px4 at 100Hz:

  • offboard_control_mode
  • vehicle_attitude_setpoint
  • vehicle_local_position_setpoint (needed by hover thrust estimator and land detector…)

The system has a variable delay up to 1 second. I can confirm that by adding the vehicle_attitude_setpoint going back from px4 to ros2 through the bridge.
ploting both topic vehicle_attitude_setpoint/in and vehicle_attitude_setpoint/out show that delay.

Reducing the publication frequency of offboard_control_mode and vehicle_attitude_setpoint fix the issue.

I am quite afraid of this issue coming back on my real system even with a limited number of publications.

Are there any known limitations that cause these delays ? or any good practice to follow ?
I am considering adding some timeout on attitude setpoint timestamp in px4 to trigger offboard mode failsafe, would it be a good thing?