EKF filter fault warning and problems during flights

Hi! A coworker and I have been testing two drones with PX4 and have run into problems with EKF warnings, vibrations and height divergence. After testing multiple options, it’s still not clear to us why it’s failing.

Problem description

When the drone is flying and a “big movement” is commanded (for example, when moving from one mission waypoint to another, during RTL mode, or when controlling the drone with the RC), the EKF gives the following warning:

  • [EKF2] primary EKF changed 3 (filter fault) → 2

The EKF starts switching from one instance to another. Sometimes it only switches once, but it can switch multiple times.

On some flights, at the exact moment the EKF started to give warnings, the altitude started to increase or decrease, and in some situations the pilot had to take manual control. During log analysis we noticed that, in some of the logs, the fused altitude estimate started to diverge at the same timestamp.
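
One way to line the switches up with the altitude divergence during log analysis is to extract the timestamps at which the primary instance changes. A minimal sketch, assuming the timestamps and primary-instance indices have already been exported from the log as two plain lists (the function name and the sample data are hypothetical, not taken from our logs):

```python
from typing import List, Tuple


def find_primary_switches(
    timestamps_s: List[float], primary_instance: List[int]
) -> List[Tuple[float, int, int]]:
    """Return (time, old_instance, new_instance) for every primary change."""
    if len(timestamps_s) != len(primary_instance):
        raise ValueError("timestamps and instances must have the same length")
    switches = []
    for i in range(1, len(primary_instance)):
        if primary_instance[i] != primary_instance[i - 1]:
            switches.append(
                (timestamps_s[i], primary_instance[i - 1], primary_instance[i])
            )
    return switches


# Made-up sample: the primary changes 3 -> 2 at t = 2.0 s and 2 -> 3 at t = 4.0 s.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
primary = [3, 3, 2, 2, 3]
for when, old, new in find_primary_switches(t, primary):
    print(f"{when:.1f} s: primary EKF changed {old} -> {new}")
```

The resulting timestamps can then be compared by eye against the point where the fused altitude starts to drift.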

At first we thought the problem might be caused by vibrations, as they’re a bit higher than the recommended values and most of the time the EKF warning seems to happen when the vibrations are higher. After more tests, though, we’re no longer sure about this, and we would appreciate some advice on how to verify it.

Setup:

  • Drone chassis: T-Motor M690A
  • Propellers: Original ones that come with the T-Motor, carbon fiber, low weight.
  • FMU: Pixhawk 4 (Holybro)
  • GPS: Pixhawk 4 GPS Module (Holybro M8N GPS)
  • Firmware version: PX4 v1.12.3
  • Onboard computer: Nvidia Jetson Nano
  • Telemetry: Microhard module
  • Battery: 16000mAh 4S with and without BMS

More setup information

  • The drones’ sensors have been calibrated and the PIDs were adjusted
  • Internal magnetometer disabled (using the GPS one) due to interference and inconsistency between the internal and external mag → EKF running with 2 instances (1 mag and 2 IMUs)
  • FMU mounted to the chassis using 3M double-sided foam
  • Primary height source → Barometer

Both drones have the same setup and have experienced the same problems, so we don’t think it’s an FMU hardware problem.

Flight testing

First test block. Errors start to happen

The errors first happened during simple missions. The internal mag was disabled as stated above by setting CAL_MAG0_PRIO to 0 (Disabled).

Log examples:

In both cases the fused altitude estimate started to change when the error occurred, and the drone experienced an altitude change.

During log analysis we found that the EKF instance numbers didn’t make much sense, as there should only be two instances, and in fact the process_logdata_ekf.py script fails to analyse these logs. This was because setting CAL_MAG0_PRIO to Disabled doesn’t automatically adjust SENS_MAG_MODE and EKF2_MULTI_MAG.
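
As we understand the multi-EKF setup, the number of instances is the product of the IMU and magnetometer counts, which is why 2 IMUs with 1 mag should give 2 instances and the default gives 4. A rough sketch of that sanity check follows. The function name is hypothetical, and the product rule (with 0 treated as 1) is our reading of the PX4 documentation, so treat it as an assumption:

```python
def expected_ekf_instances(multi_imu: int, multi_mag: int) -> int:
    """Expected EKF2 instance count, assuming instances = IMUs x mags.

    A value of 0 is treated as 1 (assumption: 0 means "not multiplied").
    """
    if multi_imu < 0 or multi_mag < 0:
        raise ValueError("counts must be non-negative")
    return max(multi_imu, 1) * max(multi_mag, 1)


# Intended setup after disabling the internal mag: 2 IMUs, 1 mag.
print(expected_ekf_instances(multi_imu=2, multi_mag=1))
# Setup matching the 4 instances we see with default settings: 2 IMUs, 2 mags.
print(expected_ekf_instances(multi_imu=2, multi_mag=2))
```

If the instance indices seen in a log are larger than this number allows, the EKF2_MULTI_* parameters probably still reflect the old sensor setup.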

The vibration plots seem a bit high and the EKF warnings tend to happen when the vibrations are at their peak. We thought that might be causing the EKF problem and decided to change the FMU mounting system.

Second test block. Using an FMU mounting bed

Parameters modified:

The FMU was mounted using a platform similar to this one: FMU mounting platform

Sensors were recalibrated and the PIDs had to be slightly readjusted. A few flights were performed and the error didn’t happen with this setup.

Log examples:

The problem seemed to be fixed: there were no EKF warnings during any of the tests with this setup and the altitude was always stable. However, the online analysis tool indicates that vibrations are similar and still high. Running process_logdata_ekf.py also reports that IMU vibrations are high, and the plots look comparable; sometimes the peak values are even higher with this mounting platform.

Third test block. FMU mounted on gel and on the same pads as in the first test.

As some EKF parameters were changed before the second test block, we decided to test again with the original mounting system (same as in the first test block) and also with a green mounting gel. The PID values were restored, as they looked better during log analysis and the drone flew more stably this way.

Log examples:

The EKF warning started to happen again, with both mounting systems. Vibrations remain similar to those obtained with the mounting platform and are still a bit high. Somehow the platform seems to solve the problem, but the vibration metrics don’t show why this would be.

Fourth test block. Testing internal magnetometer divergence

The internal magnetometer had to be disabled on these drones because of interference: the indicated heading started to diverge (by between 60 and 95 degrees) after the drone was armed and went back to the correct value once it was disarmed.

More tests were performed to check whether this interference problem, or the fact that only one magnetometer was used, had any relation to the EKF problems, for example by affecting other sensors’ readings.

As we suspected that the battery BMS might be causing the noise problem, it was removed. The onboard computer and Microhard telemetry were also disconnected and a basic SiK radio telemetry system was used instead. The internal mag was enabled again, the sensors were recalibrated and the EKF was left at its defaults (4 instances running).

With this basic setup the divergence problem seems to be fixed (only 0-3 degrees of error during arming, and it then stabilizes). The EKF error still happens.
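
When comparing the internal and external magnetometer headings, the difference has to be wrapped so that, for example, 359 and 2 degrees count as 3 degrees apart and not 357. A small sketch of that comparison (the function name and the example values are illustrative only):

```python
def heading_error_deg(heading_a_deg: float, heading_b_deg: float) -> float:
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (heading_a_deg - heading_b_deg + 180.0) % 360.0 - 180.0


# Headings on either side of north are only a few degrees apart.
print(heading_error_deg(2.0, 359.0))
# A large divergence like the one seen after arming.
print(heading_error_deg(95.0, 0.0))
```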

Conclusion

The error seems to be somehow related to the vibrations as the setup with the mounting platform seems to fix the EKF problem, but the log analysis shows that vibrations are still higher than recommended and they seem comparable to the other mounting methods tested, sometimes even higher.

We would appreciate any ideas or information that could help us check what is really causing the EKF to switch instances and what solutions could be tested to reduce the vibrations, as all the mounting systems tested (including the original Pixhawk 4 pads, which were tried in earlier tests) seem to give rather poor results.

Please get in touch if you need more information or if you need me to upload the logs directly.

Thanks.

Hi antonio-sc66,

I experienced the same issue and kuchenesser told me to use only one IMU as a solution.

Do you know more about this topic now?

Best regards

Hi @Mohannad,

Yes, in this case I managed to get rid of the problem but it took me a lot of time and effort to realize what it could be.

As in the link you provided, in my case vibration levels were in the yellow and red zone all the time and I couldn’t get them lower. No cables were touching the controller, different anti-vibration pads were tested and nothing seemed to work.

In my case, I realized that the top of the frame was not completely parallel to the ground, so when I did the horizon level calibration there was a mismatch between the attitude of the drone sitting on the ground and what the drone “thought” was level when flying.

My solution to this problem was to level the drone as if it were flying, that is, with the arms level even though the landing gear (fixed type) was not. Then I recalibrated the horizon level with the drone in that position and after that, vibration levels went straight into the green zone and the error disappeared. I don’t know 100% if that was the real solution, but that was the only thing I changed. The same procedure was repeated on three other drones and the error went away on all of them.

There was not a lot of difference between the old calibrations and the new ones, but it seems to have a big effect as the drone gets confused when flying and seems to be doing micro-corrections all the time.

If this is your case, to do the leveling you might need a stand or maybe put some pads under the landing gear until you manage to level the arms.
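
To check how far off level the frame is sitting before recalibrating, the roll and pitch can be estimated from a static accelerometer reading. A rough sketch, assuming a forward-right-down body frame in which a level, stationary vehicle reads about (0, 0, -9.81) m/s²; the function name and sample values are made up for illustration:

```python
import math
from typing import Tuple


def tilt_from_accel(ax: float, ay: float, az: float) -> Tuple[float, float]:
    """Return (roll_deg, pitch_deg) from a static accelerometer reading.

    Assumes an FRD body frame, so a level vehicle at rest reads (0, 0, -g).
    Only valid while the vehicle is not accelerating.
    """
    roll = math.atan2(-ay, -az)
    pitch = math.atan2(ax, math.hypot(ay, az))
    return math.degrees(roll), math.degrees(pitch)


# Made-up reading from a drone resting slightly nose-up on its landing gear.
roll_deg, pitch_deg = tilt_from_accel(0.17, 0.0, -9.81)
print(f"roll = {roll_deg:.2f} deg, pitch = {pitch_deg:.2f} deg")
```

Shimming the landing gear until both angles read close to zero with the arms level is the position to calibrate in.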

Hope it can help you

Thank you very much for your answer.
Vibration is really a big thing.
In my case, the main cause was that the drone parts were fixed with rivets. However, it is not 100% fixed. Therefore, I tried gluing the parts first and then using the rivets.
The vibration on both drones, “Tarot x4” and “Tarot Ironman 650”, is at the green level, but I am still getting the acceleration clipping problem.

Best,

Yeah you really got to chase those vibrations. They mess up all the sensors. A typical 5" racer is way below the yellow all the time if it’s properly built. ,)