Memory-Mapped IO

On microcontrollers, peripheral hardware registers occupy specific addresses in the same address space as RAM and Flash. Reading from or writing to those addresses directly controls hardware — no special IO instructions needed.

Why It Matters

Every line of embedded C that configures a GPIO pin, starts a timer, or reads an ADC is ultimately a memory-mapped register access. Understanding the address space layout, why volatile is mandatory, and how to safely manipulate register bits is the foundation of bare-metal programming.

How It Works

Address Space Layout

The MCU memory map places peripherals at fixed addresses defined by the chip vendor. A simplified view of the STM32F4 map (reserved gaps and other buses omitted):

0xFFFF_FFFF ┌────────────────────┐
            │  Cortex-M internals│  NVIC, SysTick, SCB
0xE000_0000 ├────────────────────┤
            │  AHB1 peripherals  │  GPIO, DMA, RCC
0x4002_0000 ├────────────────────┤
            │  APB2 peripherals  │  SPI1, USART1, TIM1, ADC
0x4001_0000 ├────────────────────┤
            │  APB1 peripherals  │  TIM2, SPI2, USART2, I2C1
0x4000_0000 ├────────────────────┤
            │  SRAM              │  Variables, stack, heap
0x2000_0000 ├────────────────────┤
            │  Flash             │  Code, vector table, constants
0x0800_0000 └────────────────────┘
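
As a rough reading aid for the map above, the sketch below classifies a 32-bit address into one of the regions in the diagram. The enum and function names are hypothetical, and the boundaries follow the simplified diagram only; a real part has reserved gaps and additional buses between these ranges, so this is not a device description.

```c
#include <stdint.h>

/* Hypothetical region tags that mirror the simplified diagram above. */
typedef enum {
    REGION_UNKNOWN,
    REGION_FLASH,
    REGION_SRAM,
    REGION_APB1,
    REGION_APB2,
    REGION_AHB1,
    REGION_CORE
} mem_region_t;

/* Classify an address using only the boundaries drawn in the diagram.
 * Checks run from the highest boundary down, so the first match wins. */
static inline mem_region_t classify_address(uint32_t addr)
{
    if (addr >= UINT32_C(0xE0000000)) return REGION_CORE;
    if (addr >= UINT32_C(0x40020000)) return REGION_AHB1;
    if (addr >= UINT32_C(0x40010000)) return REGION_APB2;
    if (addr >= UINT32_C(0x40000000)) return REGION_APB1;
    if (addr >= UINT32_C(0x20000000)) return REGION_SRAM;
    if (addr >= UINT32_C(0x08000000)) return REGION_FLASH;
    return REGION_UNKNOWN;
}
```

For example, classify_address(0x40020014) yields REGION_AHB1, which is where the GPIO registers live.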

Each peripheral has a base address and a set of registers at fixed offsets. For example, GPIOA on STM32F4:

GPIOA base = 0x4002_0000
  +0x00 MODER   (mode register)
  +0x04 OTYPER  (output type)
  +0x08 OSPEEDR (speed)
  +0x0C PUPDR   (pull-up/down)
  +0x10 IDR     (input data, read-only)
  +0x14 ODR     (output data)
  +0x18 BSRR    (bit set/reset)
  +0x1C LCKR    (configuration lock)
  +0x20 AFR[0]  (alternate function low)
  +0x24 AFR[1]  (alternate function high)
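
The base-plus-offset scheme above translates directly into address arithmetic. The following sketch uses the GPIOA base and offsets from the listing; the macro and function names are made up for illustration and are not taken from any vendor header.

```c
#include <stdint.h>

/* Values from the listing above; the names are illustrative only. */
#define GPIOA_BASE_ADDR   0x40020000UL
#define GPIO_MODER_OFFSET 0x00U
#define GPIO_IDR_OFFSET   0x10U
#define GPIO_ODR_OFFSET   0x14U
#define GPIO_BSRR_OFFSET  0x18U

/* Absolute address of a register = peripheral base + register offset. */
static inline uintptr_t reg_address(uintptr_t base, uint32_t offset)
{
    return base + offset;
}

/* Turn a base and offset into a dereferenceable register pointer.
 * With a real peripheral base this is only meaningful on the target;
 * dereferencing such an address on a PC would fault. */
static inline volatile uint32_t *reg_pointer(uintptr_t base, uint32_t offset)
{
    return (volatile uint32_t *)reg_address(base, offset);
}
```

On the target, *reg_pointer(GPIOA_BASE_ADDR, GPIO_ODR_OFFSET) would access the word at 0x4002_0014.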

The volatile Keyword

volatile tells the compiler: “this value can change at any time outside the program’s visible control flow.” Without it, the compiler may optimize away reads or writes that are actually essential.

#define BUSY_FLAG (1U << 0)   // illustrative bit position; check the reference manual

// BUG: compiler may read the register once, hoist the load, and loop forever
uint32_t *status_cached = (uint32_t *)0x40004000;
while (*status_cached & BUSY_FLAG) {}   // may never re-read from hardware
 
// CORRECT: volatile forces a real memory read every iteration
volatile uint32_t *status = (volatile uint32_t *)0x40004000;
while (*status & BUSY_FLAG) {}   // actually polls the register

Three places where volatile is needed:

  1. Hardware registers: the value changes because hardware updates it
  2. Variables shared with ISRs: the ISR modifies the variable asynchronously
  3. Variables shared between threads or tasks: another context modifies the value. Here volatile alone is not enough, because it guarantees neither atomicity nor ordering; in an RTOS context, proper synchronization primitives are needed as well
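
To illustrate the second case, here is a minimal sketch of a flag shared between an interrupt handler and main-line code. The handler name, the variables, and the stand-in data register are all hypothetical; on real hardware the handler name must match the vector table entry and the data would come from a peripheral register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a peripheral data register (hypothetical). */
static volatile uint32_t fake_adc_data;

/* Shared between the ISR and main-line code, so both are volatile. */
static volatile bool     sample_ready;
static volatile uint32_t latest_sample;

/* Hypothetical interrupt handler: copy the sample, then raise the flag. */
void adc_irq_handler(void)
{
    latest_sample = fake_adc_data;
    sample_ready  = true;
}

/* Main-line poll: returns true and stores the sample if one is pending. */
bool try_take_sample(uint32_t *out)
{
    if (!sample_ready) {
        return false;
    }
    /* Clear the flag before reading, so a sample that arrives in between
     * raises the flag again instead of having its notification erased. */
    sample_ready = false;
    *out = latest_sample;
    return true;
}
```

Without volatile, an optimizing compiler would be free to load sample_ready once and never check it again inside a polling loop. Note that volatile does not make multi-word updates atomic, so larger shared structures still need a critical section.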

Register Access Patterns

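// 1U keeps the shifts unsigned; (1 << 31) on a signed int is undefined behavior

// Set bits (OR) -- turn on specific bits, leave others unchanged
REG |= (1U << BIT);
 
// Clear bits (AND NOT) -- turn off specific bits
REG &= ~(1U << BIT);
 
// Toggle bits (XOR)
REG ^= (1U << BIT);
 
// Read-modify-write a multi-bit field (mask the new value so it cannot spill over)
REG = (REG & ~FIELD_MASK) | (((uint32_t)new_value << FIELD_POS) & FIELD_MASK);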
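
The same patterns can be wrapped in small inline helpers so that the mask arithmetic lives in one place. This is a sketch with hypothetical helper names; each helper takes a pointer to the register, which lets it be exercised against an ordinary variable on a host machine as well as against a real peripheral address.

```c
#include <stdint.h>

/* Set every bit that is 1 in mask; other bits are unchanged. */
static inline void reg_set_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;
}

/* Clear every bit that is 1 in mask; other bits are unchanged. */
static inline void reg_clear_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg &= ~mask;
}

/* Invert every bit that is 1 in mask; other bits are unchanged. */
static inline void reg_toggle_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg ^= mask;
}

/* Replace a multi-bit field. field_mask is already shifted into place;
 * value is the unshifted field value. */
static inline void reg_write_field(volatile uint32_t *reg, uint32_t field_mask,
                                   unsigned field_pos, uint32_t value)
{
    uint32_t tmp = *reg;                        /* exactly one read  */
    tmp &= ~field_mask;                         /* clear the field   */
    tmp |= (value << field_pos) & field_mask;   /* insert new value  */
    *reg = tmp;                                 /* exactly one write */
}
```

These helpers are still read-modify-write sequences, so they are not interrupt-safe on their own.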

Problem with read-modify-write: if an ISR modifies the same register between the read and the write, the ISR’s change is lost when the stale value is written back. Solutions:

  • Use atomic bit-set/clear registers (like GPIO BSRR) when available
  • Disable interrupts around the read-modify-write (costs a few cycles and adds a little interrupt latency)
  • On Cortex-M3/M4 parts that implement it, use bit-banding (see below); it is not available on Cortex-M0/M0+ or Cortex-M7
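
The lost-update hazard and the write-only set/reset alternative can both be shown on a host machine with a small model. Everything here is a simulation under stated assumptions: fake_odr stands in for an output data register, simulated_isr() plays the interrupting code, and bsrr_write() models a BSRR-style register in which the low 16 bits set and the high 16 bits reset the corresponding output bits.

```c
#include <stdint.h>

static volatile uint32_t fake_odr;   /* stand-in for an output data register */

/* Plays the role of an ISR that turns on bit 3. */
static void simulated_isr(void)
{
    fake_odr |= (1U << 3);
}

/* Read-modify-write that is interrupted between the read and the write. */
static void rmw_set_bit0_interrupted(void)
{
    uint32_t tmp = fake_odr;     /* read                               */
    simulated_isr();             /* interrupt lands here               */
    tmp |= (1U << 0);            /* modify the now-stale copy          */
    fake_odr = tmp;              /* write back: the ISR's bit 3 is lost */
}

/* Model of a BSRR-style write: bits 0-15 set, bits 16-31 reset.
 * In hardware this update happens inside the peripheral as part of a
 * single bus write; the caller never reads the output register. */
static void bsrr_write(uint32_t value)
{
    uint32_t set_mask   = value & 0xFFFFU;
    uint32_t reset_mask = value >> 16;
    fake_odr = (fake_odr & ~reset_mask) | set_mask;
}

/* Same scenario using the set/reset register instead of read-modify-write. */
static void bsrr_set_bit0_interrupted(void)
{
    simulated_isr();             /* interrupt timing no longer matters */
    bsrr_write(1U << 0);         /* only bit 0 is affected             */
}
```

Starting from zero, the interrupted read-modify-write leaves only bit 0 set, while the set/reset version leaves both bit 0 and the ISR's bit 3 set.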

Bit-Banding on Cortex-M3/M4

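Cortex-M3 and M4 devices that implement the optional bit-band feature map each bit in the first 1 MB of the SRAM and peripheral regions to its own 32-bit word in an alias region. A write to the alias word sets or clears that single bit atomically: the bus hardware performs the read-modify-write, so software issues only one store.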

Peripheral bit-band region:  0x4000_0000 - 0x400F_FFFF (1MB)
Peripheral bit-band alias:   0x4200_0000 - 0x43FF_FFFF (32MB)

Alias address = 0x4200_0000 + (byte_offset * 32) + (bit_number * 4)
  where byte_offset = register address - 0x4000_0000

Example: set bit 5 of GPIOA->ODR (0x4002_0014)
  byte_offset = 0x4002_0014 - 0x4000_0000 = 0x0002_0014
  alias = 0x4200_0000 + (0x20014 * 32) + (5 * 4)
        = 0x4200_0000 + 0x0040_0280 + 0x14 = 0x4240_0294

  *(volatile uint32_t *)0x42400294 = 1;  // atomic set bit 5
  *(volatile uint32_t *)0x42400294 = 0;  // atomic clear bit 5

From the software side each write is a single store instruction: no read-modify-write sequence in code, and no need to disable interrupts. That makes it useful for bits shared between main-line code and ISRs.
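
The alias calculation is easy to get wrong by hand, so it can help to let the compiler do it. The sketch below implements the formula above for the peripheral region; the macro and function names are hypothetical, and it assumes a Cortex-M3/M4 device that actually implements bit-banding.

```c
#include <stdint.h>

#define PERIPH_BB_REGION_BASE UINT32_C(0x40000000)  /* start of 1 MB bit-band region */
#define PERIPH_BB_ALIAS_BASE  UINT32_C(0x42000000)  /* start of 32 MB alias region   */

/* Alias word address for one bit of a register in the peripheral
 * bit-band region. reg_addr must lie in 0x4000_0000..0x400F_FFFF and
 * bit must be 0..31 for a word register; neither is checked here. */
static inline uint32_t periph_bitband_alias(uint32_t reg_addr, uint32_t bit)
{
    uint32_t byte_offset = reg_addr - PERIPH_BB_REGION_BASE;
    return PERIPH_BB_ALIAS_BASE + (byte_offset * 32U) + (bit * 4U);
}

/* On the target, writing one bit then becomes a single word store.
 * Do not call this on a host machine: the address is not mapped there. */
static inline void periph_bitband_write(uint32_t reg_addr, uint32_t bit,
                                        uint32_t value)
{
    volatile uint32_t *alias =
        (volatile uint32_t *)(uintptr_t)periph_bitband_alias(reg_addr, bit);
    *alias = value & 1U;   /* only bit 0 of the written word is significant */
}
```

For bit 5 of GPIOA->ODR at 0x4002_0014, periph_bitband_alias(0x40020014, 5) evaluates to 0x4240_0294.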

Peripheral Register Struct Pattern

Vendor headers (CMSIS) define structs that overlay the register layout onto the base address. This is the standard way to access registers in STM32 code:

// From stm32f4xx.h (simplified)
typedef struct {
    volatile uint32_t MODER;    // offset 0x00
    volatile uint32_t OTYPER;   // offset 0x04
    volatile uint32_t OSPEEDR;  // offset 0x08
    volatile uint32_t PUPDR;    // offset 0x0C
    volatile uint32_t IDR;      // offset 0x10
    volatile uint32_t ODR;      // offset 0x14
    volatile uint32_t BSRR;     // offset 0x18
    volatile uint32_t LCKR;     // offset 0x1C
    volatile uint32_t AFR[2];   // offset 0x20-0x24
} GPIO_TypeDef;
 
#define GPIOA ((GPIO_TypeDef *)0x40020000)
 
// Now register access is clean and type-safe:
GPIOA->MODER |= (1U << 10);  // same as *(volatile uint32_t *)0x40020000 |= ...
GPIOA->BSRR = (1U << 5);     // set PA5

Every field is volatile because it maps to hardware. The struct layout must match the register offsets exactly; because every member is a 32-bit word, the compiler lays the members out sequentially with no padding. See C Language Essentials and Pointers and Memory for the underlying pointer mechanics.
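
Because a mismatch between the struct and the hardware layout fails silently, it can be worth checking the offsets at compile time. The following sketch repeats the struct under a hypothetical name (GPIO_Regs, to avoid clashing with the vendor's GPIO_TypeDef) and verifies it with C11 static assertions; the gpio_set_pin helper is likewise illustrative.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

typedef struct {
    volatile uint32_t MODER;    /* 0x00 */
    volatile uint32_t OTYPER;   /* 0x04 */
    volatile uint32_t OSPEEDR;  /* 0x08 */
    volatile uint32_t PUPDR;    /* 0x0C */
    volatile uint32_t IDR;      /* 0x10 */
    volatile uint32_t ODR;      /* 0x14 */
    volatile uint32_t BSRR;     /* 0x18 */
    volatile uint32_t LCKR;     /* 0x1C */
    volatile uint32_t AFR[2];   /* 0x20, 0x24 */
} GPIO_Regs;

/* Compilation fails if the layout ever drifts from the register map. */
static_assert(offsetof(GPIO_Regs, IDR)  == 0x10, "IDR offset");
static_assert(offsetof(GPIO_Regs, ODR)  == 0x14, "ODR offset");
static_assert(offsetof(GPIO_Regs, BSRR) == 0x18, "BSRR offset");
static_assert(offsetof(GPIO_Regs, AFR)  == 0x20, "AFR offset");
static_assert(sizeof(GPIO_Regs) == 0x28, "GPIO block size");

/* Drive one pin high through the set half of BSRR. Taking the block as
 * a pointer lets the same code run against the real port on the target
 * or against a plain struct in a host-side test. */
static inline void gpio_set_pin(GPIO_Regs *port, unsigned pin)
{
    port->BSRR = 1U << pin;
}
```

Passing the port as a pointer is a design choice rather than a requirement: it keeps the helper testable off-target and works for any GPIO port that shares this layout.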

Code Example

Defining custom peripheral register accessors without vendor headers:

// Manual register definitions (when no CMSIS header available)
#define PERIPH_BASE     0x40000000UL
#define GPIOA_BASE      (PERIPH_BASE + 0x00020000UL)
#define GPIOA_MODER     (*(volatile uint32_t *)(GPIOA_BASE + 0x00))
#define GPIOA_ODR       (*(volatile uint32_t *)(GPIOA_BASE + 0x14))
#define GPIOA_BSRR      (*(volatile uint32_t *)(GPIOA_BASE + 0x18))
 
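// Use exactly like struct access (assumes the GPIOA clock is already enabled in RCC):
GPIOA_MODER = (GPIOA_MODER & ~(3U << 10)) | (1U << 10);  // PA5 mode field = 01 (output)
GPIOA_BSRR = (1U << 5);      // set PA5
GPIOA_BSRR = (1U << 21);     // clear PA5 (bit 5 + 16)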
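
The magic numbers in the example (10, 5, 21) all derive from the pin number: on STM32F4 each pin owns a 2-bit field in MODER, plus one set bit and one reset bit in BSRR. A small sketch of that arithmetic follows, with hypothetical helper names, written as pure functions so it can be checked on a host.

```c
#include <stdint.h>

#define GPIO_MODE_INPUT  0U   /* MODER field value 00 */
#define GPIO_MODE_OUTPUT 1U   /* MODER field value 01 */

/* New MODER value with the 2-bit field for `pin` replaced by `mode`.
 * The old field is cleared first, so the result does not depend on
 * what mode the pin was in before. pin must be 0..15. */
static inline uint32_t moder_with_mode(uint32_t moder, unsigned pin,
                                       uint32_t mode)
{
    unsigned pos = pin * 2U;
    return (moder & ~(3U << pos)) | ((mode & 3U) << pos);
}

/* BSRR value that sets `pin` (low half-word). */
static inline uint32_t bsrr_set_mask(unsigned pin)
{
    return 1U << pin;
}

/* BSRR value that resets `pin` (high half-word, bit pin + 16). */
static inline uint32_t bsrr_reset_mask(unsigned pin)
{
    return 1U << (pin + 16U);
}
```

With these, the PA5 setup could read GPIOA_MODER = moder_with_mode(GPIOA_MODER, 5, GPIO_MODE_OUTPUT); followed by GPIOA_BSRR = bsrr_set_mask(5); to drive the pin high.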