After going through some hard faults, flashing the bootloader hundreds of times, learning how the NuttX board configurations are organized, and tweaking the vendor / product / board ID for the flight controller, I was still unsure how the bootloader and PX4 actually start up. So I decided to write a little guide from the information I gathered along the way.
Note: Still Work In Progress!
The basics of how an STM32 starts when power is supplied to the MCU are quite well explained here: Bare-Metal STM32: Exploring Memory-Mapped I/O And Linker Scripts | Hackaday
Also, the book on understanding STM32 helps a lot, especially the “STM32 Memory Model and Boot Sequence” section in Chapter 3: https://legacy.cs.indiana.edu/~geobrown/book.pdf
So the important bit to understand is that each board (with its specific processor) has its basic memory layout defined in the linker script “script.ld”. Further down in the same file, the data that goes into each memory region (FLASH, RAM, SRAM, etc.) is defined in detail.
The ‘specific data’ referred to here is grouped into sections with names like “.text” and “.bss”, which are standard conventions for the kinds of data in a program (.bss, for example, holds uninitialized static or global variables).
You can read more about it here: text, data and bss: Code and Data Size Explained | MCU on Eclipse
You can also notice that the start of each section is marked by a symbol named ‘_s****’ (e.g. _sdata); these symbols are referenced in the code I will show below.
In fact, you can check exactly how the sections are arranged in the built binary by examining its .elf file. To do that, simply execute “bloaty build//.elf” after building the target (via “make ”).
That should show something like this:
        FILE SIZE          VM SIZE
     --------------    --------------
      52.9%  14.3Mi     0.0%       0    .debug_info
      12.7%  3.44Mi     0.0%       0    .debug_loc
      11.8%  3.17Mi     0.0%       0    .debug_line
       5.9%  1.60Mi     0.0%       0    .debug_str
       5.4%  1.46Mi    97.1%  1.46Mi    .text
       4.2%  1.12Mi     0.0%       0    .debug_abbrev
       3.0%   830Ki     0.0%       0    .debug_ranges
       1.3%   354Ki     0.0%       0    .symtab
       1.1%   305Ki     0.0%       0    .strtab
       0.9%   261Ki     0.0%       0    .debug_frame
       0.4%   104Ki     0.0%       0    [Unmapped]
       0.3%  89.7Ki     0.0%       0    .debug_aranges
       0.0%       0     2.6%  40.7Ki    .bss
       0.0%  3.39Ki     0.2%  3.39Ki    .data
       0.0%     800     0.0%       0    [ELF Section Headers]
       0.0%     211     0.0%       0    .shstrtab
       0.0%     136     0.0%     136    .init_section
       0.0%     128     0.0%       0    [ELF Program Headers]
       0.0%      76     0.0%       0    .comment
       0.0%      60     0.0%       8    [2 Others]
       0.0%      53     0.0%       0    .ARM.attributes
     100.0%  26.9Mi   100.0%  1.50Mi    TOTAL
As explained in the article, for us the most relevant sections are:
- .bss: where statically allocated variables without an initial value live (read more here: .bss - Wikipedia)
- .data: where the initialization values for statically allocated variables are stored
- .text: what ends up in the FLASH memory (e.g. constants, functions, vector table)
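To make the list above a bit more concrete, here is a minimal C sketch of where a typical GCC/ARM toolchain would usually place a few kinds of objects. All names and values are made up for illustration, and the actual placement depends on the compiler options and the linker script, so treat the comments as the usual convention rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* No initial value: usually .bss (takes no space in the image, zeroed at start-up). */
static uint32_t boot_counter;
static uint8_t  rx_buffer[256];

/* Non-zero initial value: usually .data (value stored in FLASH, copied to RAM at start-up). */
static uint32_t baud_rate = 115200;

/* Constants: usually .rodata, which linker scripts commonly fold into .text (FLASH). */
static const uint32_t id_table[3] = { 1, 2, 3 };  /* hypothetical values */

/* Function code itself: .text (FLASH). */
uint32_t sections_demo(void)
{
    assert(sizeof(rx_buffer) == 256);

    boot_counter++;                       /* modifies a .bss variable in RAM   */
    rx_buffer[0] = (uint8_t)id_table[2];  /* reads a constant, writes to .bss  */

    return baud_rate + boot_counter + rx_buffer[0];
}
```

Building something like this into a target and running bloaty on the resulting .elf should show the three kinds of objects contributing to .bss, .data and .text respectively.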
Apart from that, when I execute the bloaty command above with the -v flag, I get the following 2 extra sections that are included in the VM map (i.e. what actually ends up in the final binary for the target, read more here: https://github.com/google/bloaty/blob/main/doc/using.md#running-bloaty):
- .init_section: 136 bytes (placed right after the .text section in memory, in FLASH)
- .ARM.exidx: 8 bytes (placed right after the .init_section section in memory, in FLASH)
It’s quite interesting how they are placed exactly where the linker script has asked them to be. For example, the .data section gets its run-time address in SRAM (where, in this case, for the MATEK H743 mini, there was plenty of space), while its initial values have to be stored in FLASH and copied into SRAM at start-up, since SRAM does not keep its contents without power.
Here’s the whole output in case you are curious. Feel free to compare the address ranges and check where each section is located:
    FILE MAP:
    0000000-0000034          52    [ELF Header]
    0000034-00000b4         128    [ELF Program Headers]
    00000b4-0010000       65356    [Unmapped]
    0010000-0185ce0     1531104    .text
    0185ce0-0185d68         136    .init_section
    0185d68-0185d70           8    .ARM.exidx
    0185d70-0190000       41616    [Unmapped]
    0190000-0190d90        3472    .data
    0190d90-0190ddc          76    .comment
    0190ddc-0190e11          53    .ARM.attributes
    0190e11-02afa0b     1174522    .debug_abbrev
    02afa0b-10f247d    14953074    .debug_info
    10f247d-141ce70     3320307    .debug_line
    141ce70-1433530       91840    .debug_aranges
    1433530-15cccaa     1677178    .debug_str
    15cccaa-193c2b5     3601931    .debug_loc
    193c2b5-1a0be50      850843    .debug_ranges
    1a0be50-1a4d5a8      268120    .debug_frame
    1a4d5a8-1aa60c8      363296    .symtab
    1aa60c8-1af28a5      313309    .strtab
    1af28a5-1af2978         211    .shstrtab
    1af2978-1af2c98         800    [ELF Section Headers]

    VM MAP:
    00000000-08020000   134348800    [-- Nothing mapped --]
    08020000-08195ce0     1531104    .text
    08195ce0-08195d68         136    .init_section
    08195d68-08195d70           8    .ARM.exidx
    08195d70-24000000   468099728    [-- Nothing mapped --]
    24000000-24000d90        3472    .data
    24000d90-24000dc0          48    [-- Nothing mapped --]
    24000dc0-2400b078       41656    .bss
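If you want to do that comparison programmatically, here is a small sketch of my own (not taken from PX4 or bloaty) that checks which memory region an address range from the VM map falls into. The region bases and sizes are my assumption of the STM32H743 memory map (2 MiB FLASH at 0x08000000, 512 KiB AXI SRAM at 0x24000000); please verify them against the board's linker script before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { REGION_NONE, REGION_FLASH, REGION_AXI_SRAM } region_t;

typedef struct {
    region_t id;
    uint32_t base;
    uint32_t size;
} mem_region_t;

/* Assumed STM32H743 regions; check against the board's script.ld. */
static const mem_region_t regions[] = {
    { REGION_FLASH,    0x08000000u, 2u * 1024u * 1024u },
    { REGION_AXI_SRAM, 0x24000000u, 512u * 1024u },
};

/* Size in bytes of a section given its [start, end) range from the VM map. */
uint32_t section_size(uint32_t start, uint32_t end)
{
    assert(start <= end);
    return end - start;
}

/* Return the region that fully contains [start, end), or REGION_NONE. */
region_t region_of(uint32_t start, uint32_t end)
{
    assert(start <= end);

    for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
        uint32_t region_end = regions[i].base + regions[i].size;

        if (start >= regions[i].base && end <= region_end) {
            return regions[i].id;
        }
    }

    return REGION_NONE;
}
```

With the numbers from the VM map, region_of(0x08020000, 0x08195ce0) gives REGION_FLASH for .text and region_of(0x24000dc0, 0x2400b078) gives REGION_AXI_SRAM for .bss. Note that .text starts at 0x08020000 rather than 0x08000000, which leaves the first 128 KiB of FLASH free, presumably for the bootloader.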
So we now have a rough idea of how important the linker script is for defining the overall memory structure. But which code actually gets executed when the MCU powers up?
The first big job at start-up is to get the RAM into a usable state: the initial values of the .data section are copied from FLASH into RAM, and the .bss section is zeroed. This is all handled by NuttX itself in its own start-up sequence, and is explained very well here: https://cwiki.apache.org/confluence/display/NUTTX/NuttX+Initialization+Sequence
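The RAM setup step described above can be sketched roughly as follows. This is a simplified, host-runnable illustration of the general technique, not the NuttX implementation: on the real target the pointers would come from linker-script symbols (as far as I recall, NuttX uses names like _eronly, _sdata, _edata, _sbss and _ebss), whereas here they are plain function parameters with names I picked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Prepare RAM before any C code relies on static variables:
 *   1. copy the initial values of .data from their load address (FLASH)
 *      to their run-time address (RAM)
 *   2. fill .bss with zeros
 *
 * All ranges are [start, end) and word-aligned.
 */
void init_ram_sections(const uint32_t *data_load,   /* where .data is stored in FLASH */
                       uint32_t *data_start, uint32_t *data_end,
                       uint32_t *bss_start, uint32_t *bss_end)
{
    assert(data_load != NULL);
    assert(data_start <= data_end && bss_start <= bss_end);

    const uint32_t *src = data_load;

    for (uint32_t *dst = data_start; dst < data_end; dst++) {
        *dst = *src++;
    }

    for (uint32_t *dst = bss_start; dst < bss_end; dst++) {
        *dst = 0;
    }
}
```

On the target this has to run before anything touches a global or static variable, which is why it sits at the very beginning of the start-up code.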
The implementation of the STM32H7 start-up sequence can be found here:
After the RAM copying is complete, NuttX then initializes the clock, Floating Point Unit, etc. Then, it calls the “stm32_boardinitialize” function, which is implemented in the PX4 domain.
So this is the part in the bootloader of the Matek H743 mini board that gets executed, which only configures the USB connection (as that’s the only thing we need while in bootloader, to re-flash the board):
And since for this board, the timer hook is enabled in the NuttX defconfig for the bootloader:
The “board_timerhook” implementation gets called on every timer interrupt in the bootloader:
Which then controls the LED to show whether the bootloader is active:
However, the actual bootloader main function is in a totally separate place (although I agree it is confusing to have the bootloader-related NuttX function implementations in “bootloader_main.c” under the targets, haha).
First, the fact that the bootloader_main function is used as the initialization entry point is defined in the NuttX defconfig for the bootloader:
So it is in fact, this function defined in “platforms/nuttx/src/bootloader/stm/stm32_common/main.c”:
Here, we really get into the details, but a few important steps are:
- board general init for GPIO pins
- Clock initialization (implementation by NuttX)
And then, it will consider all the possible firmware-update related scenarios, which are:
- Check the force-bootloader pin status
- Check USB connection
- Check USART pin status
And if any of them indicates that something may be trying to update the firmware, it calls the bootloader function in “bl.c”; if that times out, the launch of the normal firmware continues.
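The decision described above can be summarized in a small sketch. This is my own simplification for illustration, not the code from “main.c”: the struct, the function name and the timeout value are all hypothetical, and the real bootloader treats the individual cases in more detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UPLOAD_TIMEOUT_MS 5000u  /* hypothetical value, for illustration only */

/* Snapshot of the three firmware-update indicators checked at boot. */
typedef struct {
    bool force_pin_asserted;
    bool usb_connected;
    bool usart_active;
} boot_inputs_t;

typedef struct {
    bool     enter_bootloader;  /* wait for an uploader before booting the app */
    uint32_t timeout_ms;        /* how long to wait before giving up           */
} boot_decision_t;

/* If anything hints at a pending firmware update, wait for it (with a timeout). */
boot_decision_t decide_boot(const boot_inputs_t *in)
{
    assert(in != NULL);

    boot_decision_t decision = { false, 0u };

    if (in->force_pin_asserted || in->usb_connected || in->usart_active) {
        decision.enter_bootloader = true;
        decision.timeout_ms = UPLOAD_TIMEOUT_MS;
    }

    return decision;
}
```

When enter_bootloader is false, or the wait expires without a command from an uploader, the flow continues towards launching the normal firmware.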
And this is the final function that handles all the firmware update protocol part:
Here you can see how the full chip erase command gets processed, for example:
So that’s all cool, but how does the bootloader then jump to the PX4 main function when no one is trying to upgrade the firmware? That happens if either none of the upgrade conditions above were met, or the timeout was reached without any valid command reaching the board.
The answer to that is in the bootloader main function again:
It calls “jump_to_app”, which is also defined in the “bl.c” file:
Here it carefully checks the validity of the app’s base address, inspects the Table of Contents (TOC) saved in Flash (details can be fine-tuned via the “hw_config.h” file under the board directory: https://github.com/PX4/PX4-Autopilot/blob/95b30056794b47bb415f4c1d96028ec77a567446/boards/matek/h743/src/hw_config.h#L98), and so on.
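To give a feel for the kind of base-address checks involved, here is a hedged, host-runnable sketch. On a Cortex-M the first word of the app's vector table is the initial stack pointer and the second word is the reset handler (the entry point), so a bootloader can sanity-check those two words before jumping. The constants and the function name are my own assumptions for this example (the load address is taken from the start of .text in the VM map, the flash end assumes a 2 MiB part), and the TOC handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define APP_LOAD_ADDRESS 0x08020000u  /* start of .text in the VM map            */
#define FLASH_END        0x08200000u  /* assumption: 2 MiB of FLASH at 0x08000000 */

/* vec[0] = initial stack pointer, vec[1] = reset handler address. */
bool app_looks_bootable(const uint32_t *vec)
{
    assert(vec != NULL);

    /* Erased flash reads back as all ones: nothing has been programmed. */
    if (vec[0] == 0xffffffffu) {
        return false;
    }

    /* The entry point must lie inside the application's flash area. */
    if (vec[1] < APP_LOAD_ADDRESS || vec[1] >= FLASH_END) {
        return false;
    }

    return true;
}
```

Only if checks like these pass does it make sense to hand control over to the application.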
Then after de-initializing the clocks, and the board, the actual jump to the app is made:
So the actual PX4 starting function (at least part of it), in terms of initialization of the PX4 system, can be found here:
And that seems to be called from the “board_app_initialize” function of the per-target implementation in “init.c”, like here:
However, I wasn’t able to definitively come up with a clear sequence of calls that leads to this yet. It seems to be somehow related to “nsh_initialize”, the NuttX console, but I am hesitant to believe that’s the case, since the shell doesn’t seem like a necessity for PX4 (at least it can run without the shell, I think).
I guess I can cover that in a follow up post / edit this!