PX4 Sync / Q&A: July 10, 2024
Agenda
Announcements
Release Discussion
Q&A
Announcements
Release Discussion
Just a couple of fixes are left before the release.
PX4:main
← PX4:pr-ekf2_astyle_follow_up
opened 03:09PM - 10 Jul 24 UTC
PX4:main
← PX4:maetugr/fix-subscription-interval-timestamp
opened 12:33PM - 10 Jul 24 UTC
### Solved Problem
When debugging #23378, the only instance where timestamps get negative/wrap the unsigned datatype was the case where `SubscriptionInterval`s are initialized less than the interval time (e.g. 1 second) after boot and hence after the start of the microcontroller timer. These cases started to fail once invalid assumptions were made, without checking even once, that this would never normally happen: https://github.com/PX4/PX4-Autopilot/pull/22881/files#diff-5971d648b2b246dfb01ec5d5006b5258cfdbc41b73f50db1abe7dd5236f855e1R167-R172
The change producing a few timestamps after boot that wrap was introduced here:
https://github.com/PX4/PX4-Autopilot/pull/14181/files#diff-c82d12a48f93eeb820597a14504a6a2d58cb7c09ad86d2e7da6aa6b04e144628L124-R128
and it never resulted in any problem because even across the wrapping the result of the elapsed time calculation was correct again.
Fixes #23378
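To illustrate why the wrapped stamps were harmless for so long, here is a minimal sketch. It assumes the initial stamp is computed as "now minus interval" on a 64-bit unsigned microsecond counter; the names `abstime_us`, `initial_stamp_wrapping`, `elapsed_us` and `stamp_not_in_future` are hypothetical and not taken from the PX4 sources. Unsigned subtraction is modulo 2^64, so the elapsed time stays exact, while any check that assumes the stamp lies in the past rejects it.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a microsecond timestamp since boot (assumed 64-bit unsigned).
using abstime_us = uint64_t;

// Assumption: the initial stamp is "now minus interval", so that the first
// update is due immediately. Wraps to a huge value when now < interval.
constexpr abstime_us initial_stamp_wrapping(abstime_us now, abstime_us interval)
{
    return now - interval;
}

// Plain unsigned difference: arithmetic is modulo 2^64, so the result is
// exact even when `last` is a wrapped value.
constexpr abstime_us elapsed_us(abstime_us now, abstime_us last)
{
    return now - last;
}

// Hypothetical ordering check of the kind that breaks on a wrapped stamp.
constexpr bool stamp_not_in_future(abstime_us now, abstime_us last)
{
    return last <= now;
}

constexpr abstime_us kNow = 200000;       // 0.2 s after boot
constexpr abstime_us kInterval = 1000000; // 1 s interval
constexpr abstime_us kStamp = initial_stamp_wrapping(kNow, kInterval);

static_assert(kStamp == UINT64_MAX - 799999, "stamp wrapped to a huge value");
static_assert(elapsed_us(kNow, kStamp) == kInterval, "difference is still exact");
static_assert(!stamp_not_in_future(kNow, kStamp), "ordering check rejects the stamp");
```

The three `static_assert`s are evaluated at compile time, so the snippet checks itself as soon as it compiles.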
### Solution
The stamp stays 0 if subscriptions are initialized less than the interval time after boot instead of wrapping the unsigned timestamp.
It might not be optimal when the 64-bit time actually wraps, because then `_last_update` is not updated anymore for one interval's time, but I hope that's acceptable given this should only happen after 500+k years of runtime.
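A minimal sketch of the clamping idea, not the actual patch: the helper name `initial_stamp_clamped` and the exact comparison are assumptions. It also works out how long a 64-bit microsecond counter runs before it wraps, which is where the "500+k years" figure comes from.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a microsecond timestamp since boot (assumed 64-bit unsigned).
using abstime_us = uint64_t;

// Hypothetical helper: keep the stamp at 0 while less than one interval has
// passed since boot, instead of letting "now - interval" wrap.
constexpr abstime_us initial_stamp_clamped(abstime_us now, abstime_us interval)
{
    return (now > interval) ? (now - interval) : 0;
}

// Microseconds in a 365-day year.
constexpr uint64_t kUsPerYear = 365ULL * 24ULL * 3600ULL * 1000000ULL;

// Whole years until a 64-bit microsecond counter wraps.
constexpr uint64_t years_until_wrap()
{
    return UINT64_MAX / kUsPerYear;
}

static_assert(initial_stamp_clamped(200000, 1000000) == 0, "stays 0 shortly after boot");
static_assert(initial_stamp_clamped(5000000, 1000000) == 4000000, "normal case unchanged");
static_assert(years_until_wrap() == 584942, "roughly 585k years of runtime");
```

With the stamp held at 0, an ordering check such as "stamp is not later than now" holds from the first call onward.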
### Changelog Entry
```
Bugfix: Timestamp wrapping when initializing SubscriptionInterval less than the interval time after boot
```
### Alternatives
Even with this fix I highly suggest reverting the change that made elapsed time across wrapping calculations impossible: https://github.com/PX4/PX4-Autopilot/pull/23380
### Test coverage
I ran this in SITL printing out all the `_last_update` timestamps of all instances and they do not wrap anymore shortly after boot but rather stay zero until the interval time passed once.
Q&A
@LudovicVanasse There is a question mentioned in reply to this thread regarding a custom task locking in the logger.
@dagar: Checking in detail with perf counters to find out what is going on.
Range finder terrain estimate polling mechanism and how to address it. Two logs below:
1- No Flare: https://review.px4.io/plot_app?log=db265314-1698-41c5-9d3b-cf587943545d
2- Flare: https://review.px4.io/plot_app?log=6463b56d-fa4b-4fe6-97a7-de1e6c045291
@dagar: The vehicle local position and estimator uORB topics need to be checked for validity. The bottom distance has to be valid.
@AlexKlimaj: Suggests adding bottom distance validity to Flight Review.
@Rowan_Dempster: How is the sphere from mag calibration normalized inside the sphere?
@dagar: A model is used to validate and normalize the raw calibration data.
Review needed
@MaEtUgR :
PX4:main
← PX4:maetugr/fix-uorb-timestamps
opened 08:42PM - 09 Jul 24 UTC
### Solved Problem
Fixes #23378
### Solution
Reverting a small part of 4a553938fb6bc7a853b83d175544fff26b404f68
### Changelog Entry
```
Bugfix: allow time differences across timestamp wrap again
```
### Alternatives
We could check why these occur in the first place.
### Test coverage
The SITL test procedure I used to reproduce the original issue passes again.
### Context
https://github.com/PX4/PX4-Autopilot/pull/22881
PX4:main
← PX4:maetugr/fix-subscription-interval-timestamp
opened 12:33PM - 10 Jul 24 UTC
Hi everyone,
I’m encountering a recurring issue with our high altitude platform during flights. We experience a 5-minute log blackout and system lockup during a 1h30 flight. According to the logs, the PrintLoad in the logger module is triggered by the altitude control module, which is our own custom module. This module is relatively simple and based on the template module from the PX4 repository.
Here’s the flight review log for the flight test: Flight Test Log.
I was able to reproduce the bug in less than 45 minutes over the weekend. Once the system locks up, it remains locked and does not recover.
Here’s the flight review log for the weekend test: Weekend Test Log.
Last Monday night, I conducted a 13-hour test in an office setting without the altitude control module, and everything worked fine throughout the test. I am continuing to perform more tests to further isolate the issue.
Here’s the overnight test log: Overnight Test Log.
Has anyone encountered similar issues with modules triggering the PrintLoad in the logger module? Any insights or suggestions on what might be causing this problem would be greatly appreciated.
Thank you!
Best regards,
Ludovic