Bare Metal Blinky on STM32
Goal: Blink an LED on an STM32F4 board using only register writes — no HAL, no libraries, no IDE. Understand every byte between power-on and the LED toggling.
Prerequisites: Microcontroller Architecture, GPIO and Digital IO, Memory-Mapped IO, Makefiles and Build Systems
What “Bare Metal” Means
No operating system, no runtime, no standard library. Your code is the first and only thing that runs after the CPU comes out of reset. You control the clock, the GPIO, and the timing — by writing to specific memory addresses.
Step 1: The Startup Code
The CPU needs two things from the vector table at address 0x00000000:
- Entry 0: initial stack pointer
- Entry 1: address of the reset handler (your entry point)
// startup.c
#include <stdint.h>
extern uint32_t _estack; // defined by linker script
extern void main(void);
// Minimal vector table — just enough to boot
__attribute__((section(".isr_vector")))
uint32_t *vectors[] = {
    (uint32_t *)&_estack, // initial stack pointer
    (uint32_t *)main,     // reset handler → jump to main
};
After power-on, the Cortex-M loads the stack pointer from vectors[0] and jumps to vectors[1]. That’s it: you’re running.
Step 2: Register Definitions
Instead of using vendor headers, define just what we need:
// main.c
#include <stdint.h>
// Register addresses (STM32F411, Nucleo-64 board)
#define RCC_BASE 0x40023800
#define RCC_AHB1ENR (*(volatile uint32_t *)(RCC_BASE + 0x30))
#define GPIOA_BASE 0x40020000
#define GPIOA_MODER (*(volatile uint32_t *)(GPIOA_BASE + 0x00))
#define GPIOA_ODR (*(volatile uint32_t *)(GPIOA_BASE + 0x14))
// Bit positions
#define RCC_AHB1ENR_GPIOAEN (1 << 0)
#define GPIO_MODER_PIN5_OUT (1 << 10) // MODER5 = 01 (output)
#define GPIO_ODR_PIN5 (1 << 5)
Every peripheral is controlled by reading and writing specific addresses (see Memory-Mapped IO). The volatile qualifier tells the compiler that each access has side effects, so it must not optimize these reads and writes away.
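The MODER5 value above is one case of a general pattern: on STM32F4 GPIO ports, pin n owns a two-bit mode field at bits 2n+1:2n of MODER. Here is a minimal sketch of a helper built on that pattern. The names gpio_mode_t and gpio_set_mode are made up for this example (they are not vendor names), and the register is passed as a pointer so you can also try the function on an ordinary variable on your PC.

```c
#include <stdint.h>

/* Two-bit MODER field values on STM32F4 GPIO ports. */
typedef enum {
    GPIO_MODE_INPUT  = 0x0, /* 00 */
    GPIO_MODE_OUTPUT = 0x1, /* 01 */
    GPIO_MODE_ALT    = 0x2, /* 10 */
    GPIO_MODE_ANALOG = 0x3  /* 11 */
} gpio_mode_t;

/* Hypothetical helper: set the mode of one pin (0..15) with a single
 * read-modify-write, leaving every other pin's field untouched. */
static inline void gpio_set_mode(volatile uint32_t *moder, unsigned pin,
                                 gpio_mode_t mode)
{
    unsigned shift = pin * 2u;          /* pin n -> bits 2n+1:2n */
    uint32_t value = *moder;            /* read */
    value &= ~(UINT32_C(3) << shift);   /* clear the two-bit field */
    value |= (uint32_t)mode << shift;   /* insert the new mode */
    *moder = value;                     /* write back */
}
```

With the macros from this step, the call would look like gpio_set_mode(&GPIOA_MODER, 5, GPIO_MODE_OUTPUT).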
Step 3: Main — Blink the LED
The Nucleo-F411RE has an LED on PA5. Turn it on and off with a delay loop:
static void delay(volatile uint32_t count) {
    while (count--) {} // volatile keeps the optimizer from removing the loop
}
void main(void) {
    // 1. Enable GPIOA clock (without it, writes to GPIO registers are ignored)
    RCC_AHB1ENR |= RCC_AHB1ENR_GPIOAEN;
    // 2. Set PA5 to output mode (MODER5 = 01)
    GPIOA_MODER &= ~(3 << 10); // clear MODER5 bits
    GPIOA_MODER |= GPIO_MODER_PIN5_OUT;
    // 3. Blink forever
    while (1) {
        GPIOA_ODR ^= GPIO_ODR_PIN5; // toggle PA5
        delay(500000); // roughly 500ms at the default 16MHz HSI (uncalibrated)
    }
}
Key insight: the first thing you must do is enable the peripheral clock via RCC. To save power, the GPIO peripheral’s clock is gated off after reset, and writes to GPIOA registers have no effect until it is enabled.
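The same clock-first rule applies to every GPIO port, not just GPIOA. Here is a small sketch of a per-port enable helper, assuming the STM32F411 layout of RCC_AHB1ENR in which GPIOA through GPIOE occupy bits 0 through 4. The name gpio_clock_enable is made up for this example, and the register is passed as a pointer so the logic can be checked on a plain variable.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: enable the clock for GPIO port A..E (index 0..4)
 * and report whether the enable bit reads back as set.
 * Assumes the STM32F411 RCC_AHB1ENR layout (GPIOA = bit 0 ... GPIOE = bit 4). */
static inline bool gpio_clock_enable(volatile uint32_t *ahb1enr,
                                     unsigned port_index)
{
    if (port_index > 4u) {
        return false; /* outside the A..E range this sketch covers */
    }
    uint32_t mask = UINT32_C(1) << port_index;
    *ahb1enr |= mask;               /* set only this port's enable bit */
    return (*ahb1enr & mask) != 0u; /* read back so the caller can confirm */
}
```

On the board, gpio_clock_enable(&RCC_AHB1ENR, 0) would do the same job as step 1 of main, and index 2 would cover GPIOC for the button exercise.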
Step 4: Linker Script
The linker script tells the toolchain where to place code and data in the MCU’s memory:
/* stm32f411.ld */
MEMORY
{
    FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 512K
    SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 128K
}
_estack = ORIGIN(SRAM) + LENGTH(SRAM); /* stack starts at top of SRAM */
SECTIONS
{
    .isr_vector : { *(.isr_vector) } > FLASH
    .text : { *(.text*) } > FLASH
    .rodata : { *(.rodata*) } > FLASH
    .data : { *(.data*) } > SRAM AT > FLASH
    .bss : { *(.bss*) } > SRAM
}
The vector table goes first in Flash (address 0x08000000, aliased to 0x00000000 at boot). Code and constants follow in Flash. Variables live in SRAM.
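One consequence of this layout is worth spelling out. The script gives .data a run address in SRAM and a load address in Flash, but the startup code in Step 1 jumps straight to main, so nothing copies those initial values or clears .bss. That is harmless for this program because it has no global variables. Here is a sketch of the step a fuller reset handler would perform. The function name init_memory and the symbols _sidata, _sdata, _edata, _sbss and _ebss are assumptions for this example: the linker script above does not define them and would have to be extended to export them.

```c
#include <stdint.h>

/* Hypothetical helper: copy initialized data from its load address (Flash)
 * to its run address (SRAM), then zero the .bss range.
 * All ranges are assumed to be word-aligned. */
static void init_memory(const uint32_t *load,
                        uint32_t *data_start, uint32_t *data_end,
                        uint32_t *bss_start, uint32_t *bss_end)
{
    while (data_start < data_end) {
        *data_start++ = *load++;   /* .data: Flash image -> SRAM */
    }
    while (bss_start < bss_end) {
        *bss_start++ = 0u;         /* .bss: C expects zero-initialized memory */
    }
}

/* On the target, a reset handler might call it like this, assuming the
 * linker script exported these (hypothetical) boundary symbols:
 *
 *   extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
 *   init_memory(&_sidata, &_sdata, &_edata, &_sbss, &_ebss);
 *   main();
 */
```

Taking pointers as parameters keeps the copy logic separate from the linker symbols, so it can be tried on ordinary arrays before it ever runs on the board.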
Step 5: Makefile
CC = arm-none-eabi-gcc
CFLAGS = -mcpu=cortex-m4 -mthumb -nostdlib -g -O0 -Wall
all: blinky.bin
blinky.elf: main.c startup.c stm32f411.ld
	$(CC) $(CFLAGS) -T stm32f411.ld -o $@ main.c startup.c
blinky.bin: blinky.elf
	arm-none-eabi-objcopy -O binary $< $@
flash: blinky.bin
	openocd -f interface/stlink.cfg -f target/stm32f4x.cfg \
		-c "program blinky.bin 0x08000000 verify reset exit"
clean:
	rm -f blinky.elf blinky.bin
.PHONY: all flash clean
# Install toolchain (Ubuntu/Debian)
sudo apt install gcc-arm-none-eabi openocd
make
make flash # program the board
What each flag does
- -mcpu=cortex-m4 -mthumb: generate Thumb instructions for the Cortex-M4
- -nostdlib: do not link the standard startup files or the standard library (we have no OS to support them)
- -T stm32f411.ld: use our linker script for the memory layout
Step 6: Verify
After flashing, the LED on PA5 should blink at ~1 Hz. If it doesn’t:
- LED doesn’t light at all: check the RCC clock enable — the GPIO won’t respond without it
- LED is always on or off: check MODER bits — make sure it’s set to output (01), not alternate function (10) or analog (11)
- Crashes immediately: check the vector table — wrong stack pointer or reset handler address causes a hard fault
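For the third symptom, you can inspect the first two words of blinky.bin on your PC before flashing. Below is a sketch of such a check, assuming the memory map from the linker script in Step 4 (512K of Flash at 0x08000000, 128K of SRAM at 0x20000000). The function names are made up for this example, and the caller is assumed to have already read the file into a byte buffer.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Memory map taken from the linker script in Step 4. */
#define FLASH_START UINT32_C(0x08000000)
#define FLASH_SIZE  (512u * 1024u)
#define SRAM_START  UINT32_C(0x20000000)
#define SRAM_SIZE   (128u * 1024u)

/* Read a little-endian 32-bit word from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical check: do the first two vector table entries look sane? */
static bool vectors_look_valid(const uint8_t *image, size_t len)
{
    if (len < 8u) {
        return false; /* too short to hold both entries */
    }
    uint32_t sp    = read_le32(image);     /* entry 0: initial stack pointer */
    uint32_t reset = read_le32(image + 4); /* entry 1: reset handler */

    /* The stack pointer should be word-aligned and lie in SRAM
     * (the top-of-SRAM address itself is allowed). */
    bool sp_ok = (sp % 4u == 0u) &&
                 sp > SRAM_START && sp <= SRAM_START + SRAM_SIZE;

    /* The handler address must have bit 0 set (Thumb state) and point
     * into Flash, past the two vector entries. */
    uint32_t addr = reset & ~UINT32_C(1);
    bool reset_ok = (reset & 1u) != 0u &&
                    addr >= FLASH_START + 8u &&
                    addr < FLASH_START + FLASH_SIZE;

    return sp_ok && reset_ok;
}
```

A passing check does not prove the program is correct, but a failing one points directly at the linker script or the vector table definition.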
# Debug with GDB over OpenOCD
openocd -f interface/stlink.cfg -f target/stm32f4x.cfg &
arm-none-eabi-gdb blinky.elf
(gdb) target remote :3333
(gdb) monitor reset halt
(gdb) break main
(gdb) continue
(gdb) print /x *(uint32_t*)0x40020000 # read GPIOA_MODER
What Just Happened
You wrote a program that runs directly on silicon with zero abstraction layers:
Your code → ARM instructions in Flash → Cortex-M4 CPU → AHB bus → GPIO peripheral → pin voltage → LED
Every register write travels through the bus matrix to the peripheral. The CPU doesn’t “know” about LEDs — it just writes a 1 to a memory address that happens to control a physical pin.
Exercises
- SysTick delay: Replace the crude loop delay with the SysTick timer for precise 500ms intervals. Configure SysTick to count down from (16000000 / 1000) - 1 for a 1ms tick at 16MHz HSI.
- Button input: Read the user button (PC13 on Nucleo). Configure PC13 as input with pull-up. Toggle the LED on button press.
- Multiple LEDs: If you have an external LED on another pin (e.g., PB0), blink two LEDs alternately.
- PLL clock setup: Configure the PLL to run the CPU at 100MHz instead of 16MHz. Recalculate the delay loop or SysTick reload value.
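The reload formula from the SysTick exercise generalizes to any core clock, which matters once the PLL exercise changes the frequency. Here is a sketch of the arithmetic only (no register writes), assuming the SysTick reload register is 24 bits wide as on Cortex-M4. The name systick_reload is made up for this example.

```c
#include <stdint.h>

#define SYSTICK_MAX_RELOAD UINT32_C(0x00FFFFFF) /* reload register is 24 bits wide */

/* Hypothetical helper: reload value for the requested tick rate,
 * or 0 if the rate cannot be represented in the 24-bit reload field. */
static uint32_t systick_reload(uint32_t core_hz, uint32_t ticks_per_second)
{
    if (ticks_per_second == 0u) {
        return 0u; /* avoid dividing by zero */
    }
    uint32_t cycles = core_hz / ticks_per_second; /* core cycles per tick */
    if (cycles < 2u || cycles - 1u > SYSTICK_MAX_RELOAD) {
        return 0u; /* period too short or too long for the counter */
    }
    return cycles - 1u; /* the counter counts from reload down to 0 */
}
```

For a 1ms tick, systick_reload(16000000, 1000) gives 15999 on the 16MHz HSI and systick_reload(100000000, 1000) gives 99999 at 100MHz. Asking for a single tick per second at 100MHz returns 0 because the period no longer fits in 24 bits.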
Next: 08 - UART Serial Console from Scratch — add serial output so you can printf from bare metal.