Iterative learning with PX4


Hi Everyone,

I am using PX4 as part of my flight stack, where I am developing a reinforcement learning algorithm to navigate the drone.
I am also using Gazebo to simulate the drone.

Currently, after each training iteration, I am restarting PX4 using a bash script. However, doing so means that it takes about 5-10 seconds for each learning step, which will drastically slow down learning.

After a particular iteration, I want to reset the robot back to its initial pose in order to run another simulation. I am wondering, is there a way to programmatically “reset” PX4 or to re-calibrate it, so that I don’t need to reboot it each time?

Your help is really appreciated.



@SkiBum326 Running PX4 SITL means you are also simulating the firmware, so you cannot initialize the firmware to an arbitrary state. For reinforcement learning, you need to be able to initialize the firmware in states such as mid-flight (or at least you need this to get better performance).

I am not sure why you need to emulate the firmware for reinforcement learning. In the end, your network will only learn from the states you define, so making PX4 part of your simulation seems unnecessary to me. What you need is closer to a dynamics simulation than a simulation of the whole flight stack. If you want to use the simulation plugins in Gazebo, there are projects such as gazebo-gym and openai_ros that provide this functionality.
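To illustrate what I mean by a dynamics simulation that can start from an arbitrary state, here is a minimal sketch. It is a toy 1-D vertical point-mass model, not anything from PX4, Gazebo, gazebo-gym, or openai_ros; the class name `AltitudeSim`, the mass, and the time step are all made up for the example.

```python
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2


@dataclass
class AltitudeSim:
    """Toy 1-D vertical point-mass model (hypothetical, for illustration only)."""
    mass: float = 1.5  # kg, assumed value
    dt: float = 0.01   # integration step in seconds, assumed value
    z: float = 0.0     # altitude in metres
    vz: float = 0.0    # vertical velocity in m/s

    def reset(self, z=0.0, vz=0.0):
        # Start an episode from any state, e.g. already mid-flight.
        self.z, self.vz = z, vz
        return (self.z, self.vz)

    def step(self, thrust):
        # Semi-implicit Euler: update velocity first, then position.
        az = thrust / self.mass - GRAVITY
        self.vz += az * self.dt
        self.z += self.vz * self.dt
        return (self.z, self.vz)
```

Calling `reset(z=10.0, vz=-1.0)` starts the episode 10 m up and descending, with no reboot involved. A real setup would of course need a much richer model, so treat this only as a picture of the reset interface.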


I don’t think this feature exists in PX4 SITL but you could probably speed up the re-initialization quite a bit by not killing make posix... and re-running it. If you dig around the code and find what is actually being run, you’ll be able to find the command that just runs the PX4 binary that will then connect to gazebo. I imagine starting and stopping that is much faster than also running make and starting and stopping gazebo.

I’m sure it’s quite easy to just move a gazebo model to a given location programmatically. So you would run gazebo, and then just run in a loop (1) move vehicle to start position, (2) run PX4, (3) kill PX4.
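Roughly, the loop above could look like the sketch below. I have not tested this against PX4. Both commands are passed in as arguments because I don't know the exact PX4 binary invocation, and the `time.sleep` call is only a stand-in for whatever your learning step does. The function name `run_episode` is made up for this example.

```python
import subprocess
import time


def run_episode(reset_cmd, px4_cmd, episode_seconds):
    """One iteration: reset the model pose, start PX4, wait, then stop PX4.

    reset_cmd and px4_cmd are argument lists supplied by the caller;
    nothing here is a PX4 or Gazebo API.
    """
    # (1) Move the vehicle back to its start position.
    subprocess.run(reset_cmd, check=True)

    # (2) Start PX4 as a child process.
    px4 = subprocess.Popen(px4_cmd)
    try:
        # Placeholder for the actual training step.
        time.sleep(episode_seconds)
    finally:
        # (3) Stop PX4: ask politely first, force-kill if it hangs.
        px4.terminate()
        try:
            px4.wait(timeout=5)
        except subprocess.TimeoutExpired:
            px4.kill()
            px4.wait()
    return px4.returncode
```

For the reset command, something like `["gz", "model", "-m", "iris", "-x", "0", "-y", "0", "-z", "0.1"]` might work with the Gazebo command-line tool, assuming your model is named `iris`; please check that against your Gazebo version. The PX4 command is whatever you find when you dig out what the make target actually runs.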