ROS2 Performance Degrades w/ Multiple Vehicles

I have multiple dev drones (ModalAI VOXL 2 platform) running the PX4 autopilot that I'm attempting to interface with and control using the ROS 2 Foxy distro. I'm sending mocap data as odometry to the vehicles at 120 Hz, and a node running onboard each vehicle republishes it to the uORB topics exposed by the microdds client.

I'm noticing that everything works fine until I add a third vehicle to the network; then performance degrades significantly, and with a fourth vehicle ROS 2 becomes unusable. There are frequent stalls in the publisher of up to 1-2 s and consistently high latency (>20-30 ms).

Has anyone encountered a similar issue? I read that using a ROS discovery server can cut down on network traffic, but when I connect the vehicles to a discovery server on the ground station, the topics published to the vehicle from the microdds client/agent are no longer exposed, and I'm not sure how to get around that.

You're likely hitting the limits of your WiFi network.

I would put everything on localhost only and add a custom protocol to send the commands to each drone, in order to limit the WiFi overhead.

I don't think it's a bandwidth issue, but I'm trying to look into that as well. The total payload transmission for 4 vehicles only amounts to around 2 Mbps. I'm sending the data via a 2x2 MIMO telemetry radio that's supposed to support 25 Mbps, and I'm only using it to transmit this data. I've also seen the issue persist when I reduce the data rate (dropping from 120 Hz to 60 Hz doesn't help).
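For reference, the ~2 Mbps figure is consistent with a quick back-of-the-envelope calculation (the per-sample size below is my assumption, not a measured number):

```python
# Rough payload estimate for the mocap odometry stream.
SAMPLE_BYTES = 500   # assumed serialized odometry sample size, incl. headers (a guess)
RATE_HZ = 120        # mocap publish rate per vehicle
VEHICLES = 4

bits_per_second = SAMPLE_BYTES * 8 * RATE_HZ * VEHICLES
print(f"total payload ~ {bits_per_second / 1e6:.2f} Mbps")  # ~1.92 Mbps
```

So even with generous per-sample overhead, the raw payload sits well under the radio's nominal 25 Mbps.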

I've been looking at the network traffic in Wireshark, and where I'd expect a mostly one-way flow of data to the vehicles, I'm seeing the network flooded with RTPS HEARTBEAT and ACKNACK messages between the vehicles and the ground station. From what I've read, ACKNACK messages are how data readers request missing frames from data writers. I'm using the sensor-data QoS profile (best effort, volatile, keep last 5), so I would expect missing frames to just get dropped, but I guess that's not how the underlying RTPS protocol works unless I further configure the DDS settings.
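For reference, the sensor-data profile I'm using corresponds roughly to this Fast DDS XML publisher profile (a sketch; the profile name is made up, and it would need to be loaded via `FASTRTPS_DEFAULT_PROFILES_FILE`):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <publisher profile_name="sensor_data_like">
    <qos>
      <reliability><kind>BEST_EFFORT</kind></reliability>
      <durability><kind>VOLATILE</kind></durability>
    </qos>
    <topic>
      <historyQos>
        <kind>KEEP_LAST</kind>
        <depth>5</depth>
      </historyQos>
    </topic>
  </publisher>
</profiles>
```

One thing I've since read: best-effort writers shouldn't generate HEARTBEAT/ACKNACK traffic at all, so as far as I can tell the reliable exchanges I'm seeing likely come from the builtin discovery endpoints (which are always reliable) rather than from the odometry topic itself.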

In my opinion, you've just described a limited-bandwidth cascade failure:

1. Bandwidth is too low.
2. You lose a packet.
3. The retransmission request increases the required bandwidth.
4. You lose more packets.

DDS is a pain on unreliable networks.
I still maintain that you should use a different protocol for the multi-vehicle case.


I think you're right about it being a bandwidth issue, but mostly because of how the default configs are set up for ROS 2's Fast DDS middleware. The UDP send in the RTPS layer is also apparently blocking by default, so that might be contributing as well.
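If anyone wants to try the non-blocking send, my understanding is that Fast DDS exposes it on the UDP transport descriptor. Something like the XML profile below (loaded via the `FASTRTPS_DEFAULT_PROFILES_FILE` environment variable) should do it; treat the exact tags as an assumption to check against your Fast DDS version:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <transport_descriptors>
    <transport_descriptor>
      <transport_id>udp_nonblocking</transport_id>
      <type>UDPv4</type>
      <non_blocking_send>true</non_blocking_send>
    </transport_descriptor>
  </transport_descriptors>
  <participant profile_name="nonblocking_participant" is_default_profile="true">
    <rtps>
      <userTransports>
        <transport_id>udp_nonblocking</transport_id>
      </userTransports>
      <useBuiltinTransports>false</useBuiltinTransports>
    </rtps>
  </participant>
</profiles>
```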

I set up a bandwidth test to emulate the payload of my mocap data, and the issues go away entirely when I run a discovery server on the ground-station computer and have the vehicles connect to it. The only problem is that any nodes I run connected to the discovery server are now isolated from the PX4 topics exposed by the microdds agent running on the vehicles' onboard computers.
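For anyone following along, the discovery server setup I tested looks roughly like this (the ground-station IP is a placeholder for your network):

```shell
# On the ground station (IP is a placeholder):
fastdds discovery -i 0 -l 192.168.8.10 -p 11811

# On each client machine, in every terminal before starting nodes:
export ROS_DISCOVERY_SERVER=192.168.8.10:11811
ros2 daemon stop   # restart the daemon so CLI tools pick up the new setting
```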

So I think the ROS 2 / eProsima Fast DDS implementation is at least capable of meeting the performance requirements for my application. I just need to figure out how to bridge the microdds agent discovery process with the ROS discovery server, or set up the DDS configuration in a way that emulates the behavior of running the discovery server (maybe set up static discovery and hardcode all the endpoints?).
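As a possible stopgap while I figure out the bridging, Fast DDS also supports hardcoding an initial peers list in XML, which limits discovery traffic to known hosts instead of multicasting to everyone. A sketch (addresses and ports are placeholders for my setup):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <participant profile_name="initial_peers_participant" is_default_profile="true">
    <rtps>
      <builtin>
        <initialPeersList>
          <locator>
            <udpv4>
              <address>192.168.8.21</address> <!-- vehicle 1, placeholder -->
              <port>7412</port>
            </udpv4>
          </locator>
          <locator>
            <udpv4>
              <address>192.168.8.22</address> <!-- vehicle 2, placeholder -->
              <port>7412</port>
            </udpv4>
          </locator>
        </initialPeersList>
      </builtin>
    </rtps>
  </participant>
</profiles>
```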
