I have a PX4 simulator running in a k8s cluster with open UDP endpoints and can connect, disconnect, and reconnect with it via QGC. However, if I run the mavsdk_server against the simulator, it will only complete initialization if it is the first connection made to the simulator since it was restarted. So after restarting the simulator, mavsdk_server can connect. If I connect first with QGC, then disconnect QGC, and attempt to connect with mavsdk_server, it fails to finish initialization. If I successfully connect with mavsdk_server, disconnect, and reconnect with the mavsdk_server, then it again fails to finish initialization.
When I say it fails to initialize, I mean that it hangs with the following output:
[11:57:08|Info ] MAVSDK version: v1.4.13 (mavsdk_impl.cpp:20)
Waiting to discover system on udp://<non-zero IP>:18570...
[11:57:08|Debug] Initializing connection to remote system... (mavsdk_impl.cpp:494)
On a successful connection, it says the system was discovered and I can see other messages. While it is hanging on this message, if I restart the simulator then it will connect once the simulator comes back online.
Is there a special way that I need to disconnect? Is this behavior an issue on the PX4 side, or the MAVSDK side?
<non-zero IP>:18570 is the <IP>:<Port> for the PX4 simulator. In this case, the mavsdk_server is acting as a client to the simulator, so I think this is correct. Unless I am misunderstanding your meaning?
Same for me: I run the PX4 sim in Docker and try to connect to udp://172.17.0.2:18570 with mavsdk_server. It connects successfully the first time and fails on every attempt after that, while QGC can connect and disconnect many times, exit, reconnect, etc.
This looks like a mavsdk_server bug or problem. What could we do to solve it? Report an issue on GitHub?
I think it has to do with the port where traffic is returned to. The port in the other direction is chosen by the operating system, at least for MAVSDK.
So that would mean PX4 sends traffic to a random UDP port. If you then disconnect and connect again, PX4 might still send traffic to the previous port.
For QGC, it could be that it selects a specific port and that way can re-connect.
I think it would be worthwhile to look at the port numbers using Wireshark to confirm my theory.
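To illustrate the theory without Wireshark, here is a minimal Python sketch using only the standard `socket` module. It is not MAVSDK or QGC code; the function name `source_port_of_new_client` and the local receiver standing in for the simulator are made up for this example. It shows that a UDP client which never binds gets an OS-chosen source port for each new socket, whereas a client that binds first keeps the same source port across reconnects.

```python
import socket


def source_port_of_new_client(dest, local_port=None):
    """Send one datagram from a fresh UDP socket and return its source port.

    With local_port=None the socket is never bound explicitly, so the OS
    assigns an ephemeral port on the first send. With a number, the socket
    is bound to that port before sending.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if local_port is not None:
            sock.bind(("", local_port))
        sock.sendto(b"ping", dest)
        return sock.getsockname()[1]


if __name__ == "__main__":
    # Hypothetical stand-in for the simulator's UDP endpoint (local only).
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    dest = receiver.getsockname()

    # Each "reconnect" uses a new socket, so the receiver may see a new port.
    for attempt in range(2):
        source_port_of_new_client(dest)
        _, (ip, port) = receiver.recvfrom(1024)
        print(f"attempt {attempt}: receiver saw source port {port}")
    receiver.close()
```

If the receiving side keeps replying to the port it saw on the first attempt, a second client on a different ephemeral port would never get an answer, which would match the hang described above.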
The thing is that the port 18570 is not really a port to rely on. Usually the way it’s described is that the drone broadcasts on 14550 and 14540 but the choice of 18570 is more of an implementation detail.
So what is happening is that PX4 sends traffic to the port it first happened to receive traffic from, but then doesn't reset that remote port after the connection is lost.
I’m not quite sure yet how to change the PX4 side.
The whole logic on the PX4 side is a bit convoluted if you ask me.
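As a toy model of the behaviour described above (this is not PX4's code, and the class names `LatchingLink` and `FollowingLink` are invented for illustration), the difference between latching onto the first sender and following the most recent sender can be sketched like this:

```python
class LatchingLink:
    """Remembers the first remote address it hears from and never resets it."""

    def __init__(self):
        self.remote = None

    def on_receive(self, src):
        # Only the very first sender is recorded; later senders are ignored.
        if self.remote is None:
            self.remote = src

    def send_target(self):
        return self.remote


class FollowingLink:
    """Re-targets to whichever address sent the most recent datagram."""

    def __init__(self):
        self.remote = None

    def on_receive(self, src):
        # Every incoming datagram updates the reply address.
        self.remote = src

    def send_target(self):
        return self.remote


if __name__ == "__main__":
    first_client = ("172.17.0.1", 40001)   # example address, first session
    second_client = ("172.17.0.1", 40002)  # same host, new ephemeral port

    for link in (LatchingLink(), FollowingLink()):
        link.on_receive(first_client)
        link.on_receive(second_client)
        print(type(link).__name__, "replies to", link.send_target())
```

In this model the latching link keeps replying to the first session's port after a reconnect, so the second client never sees traffic, while the following link picks up the new port. Whether following the latest sender is the right policy for PX4 is a separate question.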
Yes, we used it, then switched to pure Ubuntu to reduce the image size by a few GB.
Hmm, very cool image! However, it is not clear how it can unblock us. Stream the simulation to a specific IP and port 14540 specifically for MAVSDK? I'll ask our team whether we can know or specify the MAVSDK IP beforehand for the simulator. Some scripts in the repo have not been updated for years, so I'm not sure how they'll work: our old scripts did not work with the newest PX4 changes, so we were forced to update them as well. Also, current PX4 has problems with HEADLESS mode, etc. We need to try it, however: the latest release was in Jan.
Our scripts also allow us to stream to a specific IP - need to check what port it streams to.
I tried to connect mavsdk_server to 14550 or 14540, but it does not connect at all, even the first time - only 18570 works.
So you think the problem is in PX4? I'm not sure what the correct logic is there: if a client reconnects, should PX4 reroute traffic to the new port/client, or always send only to the first client/port?
BTW, PX4 reports that the connection is regained when mavsdk_server is rerun:
[Wrn] [Event.cc:61] Warning: Deleting a connection right after creation. Make sure to save the ConnectionPtr from a Connect call
INFO [commander] GCS connection regained
INFO [health_and_arming_checks] Preflight Fail: No manual control input
INFO [commander] Connection to ground station lost
INFO [health_and_arming_checks] Preflight Fail: No manual control input
INFO [commander] GCS connection regained
INFO [health_and_arming_checks] Preflight Fail: No manual control input
So you propose to patch PX4, not MAVSDK? We had problems with the latest PX4 versions - we used commit d6b523b574875a9f640620d1e90c8277fa13781c, as you may have seen above, because of the bug in HEADLESS mode. I will try your branch now that I have found a workaround for it - hope it will work.
BTW, could such a situation appear with a real drone? Maybe mavsdk_server needs to be patched as well?!