Boot and Firmware
May 16, 2026·14 min read·advanced
This chapter covers what happens between power-on and the OS being in control. Everything that runs before the kernel — the BIOS or UEFI on x86, ARM Trusted Firmware on AArch64, OpenSBI on RISC-V — is firmware. Firmware initializes hardware, locates and loads the OS, and provides services the OS needs to manage hardware whose details vary across platforms.
Boot and firmware are perhaps the least glamorous parts of a CPU's lifecycle, but understanding them is essential. Most subtle hardware bugs are encountered first by firmware writers. Most security guarantees about a running system rest on firmware having been correct. And the boundary between firmware and OS shapes how portable the OS can be.
01. What "Boot" Means
Boot is the process of bringing a system from power-off to a running OS. It involves:
- Power-on reset: hardware comes up with architecturally defined initial state.
- Hardware initialization: clocks, voltage regulators, DRAM training, peripherals.
- Code loading: progressively larger and more capable firmware images, each loading and verifying the next.
- OS handoff: firmware loads the kernel image, sets up its environment, and transfers control.
- OS initialization: the kernel takes over, completes hardware setup at its level, and starts user space.
The hardware-firmware-OS boundary is not always sharp. Firmware often remains resident, providing runtime services to the OS (UEFI runtime services on x86; ATF SMC handlers on AArch64; OpenSBI on RISC-V).
02. Reset and Initial Execution
When the power button is pressed and voltages stabilize, the CPU comes out of reset. What happens depends on the ISA.
x86-64 Reset
On x86 reset:
- The CPU starts in real mode (16-bit, no MMU, no protection).
- CS:IP = 0xF000:0xFFF0, with the hidden CS base set to 0xFFFF0000 instead of the usual 16 times the selector.
- All registers have defined initial values per the architecture spec.
- A single core (the BSP, Bootstrap Processor) runs; others are held in reset.
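Because the CS base at reset is 0xFFFF0000, the first instruction is fetched from physical address 0xFFFFFFF0, 16 bytes below the 4 GB boundary, where the firmware flash is mapped on modern systems. This is a historical-compatibility quirk: the original 8086 reset vector sat 16 bytes below 1 MB, at 0xFFFF0. With only 16 bytes available, the code at the reset vector is a stub, and modern systems immediately jump from it into the main UEFI image.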
The transitions from real mode to protected mode (32-bit) to long mode (64-bit) are the firmware's responsibility. By the time UEFI's main code is running, the CPU is in long mode with paging enabled.
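The reset-vector arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the constants are the reset values described above.

```python
def real_mode_address(selector: int, offset: int) -> int:
    """Ordinary real-mode translation: segment base is selector * 16."""
    return (selector << 4) + offset


def reset_fetch_address(cs_base: int, ip: int) -> int:
    """At reset the hidden CS base is used as-is, not derived from the selector."""
    return (cs_base + ip) & 0xFFFFFFFF


RESET_CS_SELECTOR = 0xF000
RESET_CS_BASE = 0xFFFF0000  # hidden CS base loaded by the reset logic
RESET_IP = 0xFFF0

legacy = real_mode_address(RESET_CS_SELECTOR, RESET_IP)
actual = reset_fetch_address(RESET_CS_BASE, RESET_IP)

print(hex(legacy))            # 0xffff0    (16 bytes below 1 MB)
print(hex(actual))            # 0xfffffff0 (16 bytes below 4 GB)
print(0x1_0000_0000 - actual)  # 16
```

The first far jump reloads CS, after which the base follows the normal real-mode rule again, which is one reason the reset stub is so short.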
AArch64 Reset
On AArch64 reset:
- The CPU starts in EL3 (the highest exception level), assuming EL3 is implemented.
- The PC is set to the address held in RVBAR_EL3 (the Reset Vector Base Address Register), which is fixed by the hardware design.
- The MMU is off, caches are off.
- A single core runs; others are typically held off (or in a low-power state managed by the power controller).
EL3 firmware (ARM Trusted Firmware, ATF) does the initial setup, then drops to EL2 for hypervisor code (or to EL1 directly if no hypervisor) to run the OS.
Some systems implement EL3 conditionally; on those, reset starts at the highest implemented EL.
RISC-V Reset
On RISC-V reset:
- The CPU starts in M-mode (machine mode).
- The PC is set to a platform-defined reset vector (often 0x1000 or 0x80000000, depending on the SoC).
- On many platforms all harts come out of reset together; one is designated the boot hart (by a fixed ID or a software lottery) and the others spin in a holding pattern.
M-mode firmware (commonly OpenSBI) takes over, initializes the platform, sets up S-mode environment, and drops to S-mode for the OS.
03. Hardware Initialization
Before any code can run usefully, the firmware must bring hardware to a working state.
DRAM Training
DRAM is the most demanding piece. Modern DDR4 and DDR5 require careful calibration:
- Determine the DIMMs present (read SPD ROM, JEDEC standardized).
- Set memory controller parameters (timings, voltages).
- Train read and write data eyes (find the timing window where data transfers reliably).
- Calibrate ZQ termination, on-die termination, etc.
Training can take hundreds of milliseconds. On reboot, results are often cached so successive boots are faster ("fastboot" memory training).
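The core idea of eye training can be sketched as a one-dimensional search: sweep a delay setting, record which taps transfer data reliably, and park in the middle of the widest passing window. The sketch below is a toy model, not any memory controller's algorithm; real training runs per lane and per rank and also sweeps reference voltage. The function name and the pass/fail list are assumptions for illustration.

```python
def find_eye_center(results: list[bool]) -> tuple[int, int]:
    """Return (center_tap, width) of the widest run of passing delay taps."""
    best_start, best_len = 0, 0
    run_start = None
    # A trailing False acts as a sentinel so a run ending at the last tap is closed.
    for tap, ok in enumerate(results + [False]):
        if ok and run_start is None:
            run_start = tap
        elif not ok and run_start is not None:
            run_len = tap - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_len == 0:
        raise ValueError("no passing delay setting found")
    return best_start + (best_len - 1) // 2, best_len


# Hypothetical sweep: taps 2..6 pass, tap 8 passes in isolation.
sweep = [False, False, True, True, True, True, True, False, True, False]
center, width = find_eye_center(sweep)
print(center, width)  # 4 5
```

Choosing the center rather than the first passing tap leaves margin on both sides for temperature and voltage drift.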
Clock and Power Setup
Modern SoCs have dozens of clock domains, all derived from a few crystal oscillators via PLLs (phase-locked loops). The firmware programs the clock tree to provide each block with its required frequency.
Voltage regulators are similarly programmable. The firmware brings up rails in the right order with the right voltages. Some rails depend on others and some have specific timing requirements; getting the sequence wrong can damage hardware.
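Rail ordering is a dependency problem, so one way to reason about it is as a topological sort. The sketch below uses Python's standard `graphlib`; the rail names and their dependencies are invented for illustration and do not describe any particular SoC. Real sequencing also involves delays and voltage ramps, which are omitted here.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical rails: each key lists the rails that must be stable first.
rail_deps = {
    "vdd_core": {"vdd_aon"},
    "vdd_io": {"vdd_aon"},
    "vdd_ddr": {"vdd_core", "vdd_io"},
}


def power_on_order(deps: dict[str, set[str]]) -> list[str]:
    """Return one valid bring-up order, or raise if the dependencies loop."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        raise ValueError(f"rail dependencies contain a cycle: {exc.args[1]}") from exc


order = power_on_order(rail_deps)
print(order)  # always-on rail first, DDR rail last
```

The relative order of `vdd_core` and `vdd_io` is not constrained by this table, so any order that respects the listed dependencies is acceptable.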
Peripherals
The firmware enables and minimally configures peripherals it needs for boot: a UART (for serial console), a storage controller (to load the next-stage payload), perhaps a network interface (for PXE / network boot).
Most peripherals are left for the OS to configure. The firmware only does what's necessary for boot.
04. Boot Stages
Modern boot is multi-stage. Each stage is small, focused, and verifies the next stage before loading it. This pattern provides:
- Trust: verification at each step roots the chain in immutable on-die ROM.
- Capability growth: later stages have more memory, more drivers, more flexibility.
- Recoverability: if a stage fails, an earlier stage can attempt recovery.
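The verify-then-load pattern can be modeled in a few lines. This is a deliberately simplified sketch: it uses bare SHA-256 digests where real firmware would normally check signatures, and the image layout (payload followed by a 32-byte digest of the next image) is an assumption made up for this example.

```python
import hashlib
from typing import Optional

DIGEST_LEN = 32


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_image(code: bytes, next_image: Optional[bytes]) -> bytes:
    """Append the digest of the next stage (zeros for the final stage)."""
    trailer = digest(next_image) if next_image is not None else bytes(DIGEST_LEN)
    return code + trailer


def boot(root_digest: bytes, images: list[bytes]) -> int:
    """Verify each image before 'running' it; return how many stages passed."""
    trusted = root_digest
    for index, image in enumerate(images):
        if digest(image) != trusted:
            raise RuntimeError(f"stage {index} failed verification")
        # The verified image tells us what to trust next.
        trusted = image[-DIGEST_LEN:]
    return len(images)


# Build the chain back to front so each stage can embed its successor's digest.
kernel = build_image(b"kernel", None)
loader = build_image(b"loader", kernel)
first = build_image(b"first-stage", loader)
root = digest(first)  # stands in for the value fixed in ROM or fuses

print(boot(root, [first, loader, kernel]))  # 3
```

Because each digest lives inside an image that has already been verified, changing any later stage breaks the chain at the point where it is loaded.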
x86-64 Boot Stages
The x86-64 boot is conceptually:
- Reset vector / SEC (Security): tiny code in on-die ROM or write-protected flash. Establishes initial trust, transitions out of real mode if needed.
- PEI (Pre-EFI Initialization): brings up DRAM, basic hardware, transitions from cache-as-RAM to real DRAM.
- DXE (Driver Execution Environment): the bulk of UEFI. Loads drivers, builds protocols, populates the EFI services tables.
- BDS (Boot Device Selection): user-facing boot menu, selects the OS loader.
- TSL (Transient System Load): the OS loader runs (e.g., GRUB or the Windows boot manager).
- RT (Runtime): firmware exits boot services, OS takes over, but firmware retains a small runtime services region.
This is the PI (Platform Initialization) specification's framework. UEFI is the API exposed to OS loaders and the OS.
AArch64 Boot Stages
The ARM platform's typical boot:
- BL1 (Boot Loader stage 1): tiny code in on-die ROM, runs at EL3. Initializes minimal state, locates BL2.
- BL2 (Boot Loader stage 2): loads BL31, BL32 (optional secure-world OS), and BL33 (non-secure firmware/bootloader). In the default Trusted Firmware flow BL2 runs at Secure EL1; some platforms run it at EL3 instead.
- BL31 (EL3 Runtime Firmware): the ARM Trusted Firmware runtime. Handles SMC calls, power management, secure-world transitions. Stays resident.
- BL32 (Secure World OS / TEE): optional. OP-TEE or a vendor TEE OS.
- BL33 (Non-secure firmware): UEFI (e.g., EDK2-based) or U-Boot. Eventually loads the OS.
This separation between BL31 (always in EL3) and BL33 (runs in EL2 or below, can be UEFI or U-Boot) matches the secure / non-secure split.
RISC-V Boot Stages
Typical RISC-V boot:
- ZSBL (Zeroth-Stage Boot Loader): tiny ROM code, often just enough to load FSBL from flash or storage.
- FSBL (First-Stage Boot Loader): brings up DRAM, loads OpenSBI.
- OpenSBI (M-mode firmware): provides SBI services, drops to S-mode.
- U-Boot (S-mode bootloader, optional): for systems that need bootloader features (network boot, complex device discovery).
- Linux kernel (S-mode): the OS.
The exact stages vary; some systems combine ZSBL and FSBL; some boot Linux directly from OpenSBI without U-Boot.
05. UEFI
The Unified Extensible Firmware Interface is the dominant firmware standard for x86-64 and increasingly for AArch64. It replaced the legacy BIOS on x86.
UEFI provides:
- Boot Services: APIs for loading executables, accessing devices, allocating memory, manipulating files. Available only before the OS calls ExitBootServices().
- Runtime Services: a small set of services (variable access, time, reset) that remain available after the OS is running.
- Variable storage: NVRAM-backed key-value pairs for persistent configuration (boot order, secure boot keys).
- Protocols: an object-oriented model for drivers and services.
- GUID-named everything: every protocol, variable, and image has a globally unique identifier.
- PE/COFF executables: UEFI images use the PE/COFF format, the same executable format Windows uses.
- GPT partition tables: these replace the MBR scheme.
A UEFI OS loader is a regular UEFI application, loaded by the firmware, calling boot services to read the kernel from disk, then calling ExitBootServices() and jumping to the kernel.
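As a concrete look at variable storage, the sketch below decodes the `BootOrder` variable as Linux exposes it through efivarfs, where each file holds a 4-byte attribute word followed by the variable data. That file layout is a Linux detail and is treated as an assumption here; the sample bytes are made up rather than read from a real machine.

```python
import struct

# GUID of the EFI global variable namespace; on Linux the file would be
# /sys/firmware/efi/efivars/BootOrder-<this GUID>.
EFI_GLOBAL_VARIABLE = "8be4df61-93ca-11d2-aa0d-00e098032b8c"


def parse_boot_order(raw: bytes) -> tuple[int, list[str]]:
    """Split efivarfs content into (attributes, ['Boot0001', ...])."""
    if len(raw) < 4 or (len(raw) - 4) % 2:
        raise ValueError("unexpected BootOrder length")
    (attributes,) = struct.unpack_from("<I", raw, 0)
    count = (len(raw) - 4) // 2
    entries = struct.unpack_from(f"<{count}H", raw, 4)  # little-endian UINT16s
    return attributes, [f"Boot{n:04X}" for n in entries]


# Hypothetical content: attributes 0x7, then entries 0x0001, 0x0000, 0x0010.
sample = struct.pack("<I3H", 0x7, 0x0001, 0x0000, 0x0010)
attrs, order = parse_boot_order(sample)
print(hex(attrs), order)  # 0x7 ['Boot0001', 'Boot0000', 'Boot0010']
```

Each name in the result refers to another variable (`Boot0001` and so on) that describes one boot option.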
The ACPI tables that UEFI provides describe the hardware to the OS:
- MADT (Multiple APIC Description Table): processors and interrupt controllers.
- DSDT (Differentiated System Description Table): a bytecode (AML) describing devices and power management.
- SRAT (System Resource Affinity Table): NUMA topology.
- MCFG: PCI Express memory-mapped configuration space.
- ... and dozens of others.
ACPI is an old, complex, and influential standard. The OS must parse and follow ACPI's rules to manage devices and power correctly. ACPI's bytecode (AML) is essentially a small interpreted programming language, with a runtime in the kernel; this has been criticized but is too entrenched to replace.
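Every ACPI table listed above starts with the same 36-byte header, and a table is valid only if all of its bytes sum to zero modulo 256. The sketch below parses that header and checks the sum. The example table (signature `XMPL`, OEM strings, payload) is fabricated for illustration and is not a real ACPI table.

```python
import struct

# Common ACPI table header: 36 bytes, little-endian, no padding.
# signature, length, revision, checksum, OEM ID, OEM table ID,
# OEM revision, creator ID, creator revision
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")
CHECKSUM_OFFSET = 9


def parse_acpi_header(table: bytes) -> dict:
    (signature, length, revision, checksum, oem_id, oem_table_id,
     _oem_rev, _creator_id, _creator_rev) = ACPI_HEADER.unpack_from(table)
    return {
        "signature": signature.decode("ascii"),
        "length": length,
        "revision": revision,
        "checksum": checksum,
        "oem_id": oem_id.decode("ascii").rstrip(),
        "oem_table_id": oem_table_id.decode("ascii").rstrip(),
    }


def checksum_ok(table: bytes) -> bool:
    """All bytes of the table, checksum byte included, must sum to 0 mod 256."""
    return (sum(table) & 0xFF) == 0


def make_example_table(payload: bytes) -> bytes:
    """Build a made-up table and fix its checksum byte."""
    length = ACPI_HEADER.size + len(payload)
    raw = bytearray(ACPI_HEADER.pack(
        b"XMPL", length, 1, 0, b"OEMID ", b"TABLEID ", 1, b"TEST", 1) + payload)
    raw[CHECKSUM_OFFSET] = (-sum(raw)) & 0xFF
    return bytes(raw)


table = make_example_table(b"\x01\x02\x03")
header = parse_acpi_header(table)
print(header["signature"], header["length"], checksum_ok(table))  # XMPL 39 True
```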
06. Device Tree
The alternative to ACPI is the Device Tree (DT), originating in Open Firmware (Sun and IBM workstations, then Apple's New World ROMs, then Linux on PowerPC and ARM).
A device tree is a structured description of the hardware: a tree of nodes representing buses, devices, memory regions, interrupts. Each node has properties (name, compatible string, registers, interrupts). The OS reads the device tree and instantiates drivers for each compatible node.
```dts
/ {
    cpus {
        cpu@0 {
            compatible = "arm,cortex-a72";
            reg = <0>;
            ...
        };
    };
    memory@40000000 {
        device_type = "memory";
        reg = <0x40000000 0x80000000>;
    };
    soc {
        uart@7e201000 {
            compatible = "arm,pl011";
            reg = <0x7e201000 0x200>;
            interrupts = <0 153 4>;
        };
        ...
    };
};
```
DT is text (.dts) compiled to a binary blob (.dtb) loaded by the firmware and passed to the kernel.
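The binary blob begins with a fixed header of ten big-endian 32-bit words, starting with the magic value 0xd00dfeed. The sketch below validates and decodes that header. The sample bytes are synthetic values chosen for the example, not taken from a real board's blob.

```python
import struct

FDT_MAGIC = 0xD00DFEED
FDT_HEADER = struct.Struct(">10I")  # ten big-endian 32-bit fields
FDT_FIELDS = (
    "magic", "totalsize", "off_dt_struct", "off_dt_strings",
    "off_mem_rsvmap", "version", "last_comp_version",
    "boot_cpuid_phys", "size_dt_strings", "size_dt_struct",
)


def parse_fdt_header(blob: bytes) -> dict:
    """Decode the flattened device tree header, rejecting bad magic."""
    if len(blob) < FDT_HEADER.size:
        raise ValueError("blob too short for an FDT header")
    header = dict(zip(FDT_FIELDS, FDT_HEADER.unpack_from(blob)))
    if header["magic"] != FDT_MAGIC:
        raise ValueError(f"bad magic {header['magic']:#x}")
    return header


# Synthetic header: 256-byte blob, version 17, compatible back to version 16.
sample = FDT_HEADER.pack(FDT_MAGIC, 0x100, 0x38, 0xC0, 0x28, 17, 16, 0, 0x40, 0x88)
hdr = parse_fdt_header(sample)
print(hex(hdr["magic"]), hdr["totalsize"], hdr["version"])  # 0xd00dfeed 256 17
```

A kernel or bootloader performs the same magic and size checks before walking the structure and strings blocks at the offsets given in the header.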
DT is dominant on embedded ARM, appears on some AArch64 servers, and is the standard on RISC-V. ACPI is dominant on x86. Both attempt to solve the same problem: how does the firmware tell the OS what hardware exists?
The two coexist on AArch64. ARM servers that follow the SBSA/SBBR server specifications use ACPI; embedded boards use DT. Linux supports both.
07. Secure Boot
Secure boot is the mechanism that ensures the system boots only firmware and OS components signed by trusted keys. The chain:
- Hardware has an immutable root of trust (a key embedded in fuses or on-die ROM).
- Immutable boot code verifies the first-stage firmware's signature against this root.
- The first-stage verifies the next-stage firmware before loading.
- Each stage verifies the next, building a chain of trust.
- The OS loader is verified before being loaded.
- The OS verifies kernel modules, drivers, and applications according to its own policy.
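UEFI Secure Boot uses X.509 certificates and a key hierarchy held in authenticated firmware variables. The Platform Key (PK), usually owned by the platform vendor, authorizes updates to the Key Exchange Keys (KEKs); KEKs in turn authorize updates to the allowed and forbidden signature databases (db and dbx), against which boot images are checked. Microsoft signs Windows boot loaders; Linux distributions typically ship shim, a small Microsoft-signed loader that verifies the next stage against the distribution's own keys.
ARM platforms running Trusted Firmware use Trusted Board Boot, with similar concepts: BL1 verifies BL2 against a root-of-trust public key whose hash is typically held in on-die OTP fuses; BL2 verifies BL31, BL32, and BL33; and so on.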
RISC-V's secure boot is platform-specific; OpenSBI supports verification of payloads, but the key infrastructure depends on the SoC.
08. Measured Boot and TPM
Closely related: measured boot. Each stage hashes the next stage and extends the hash into a Trusted Platform Module (TPM) PCR (Platform Configuration Register) before transferring control. After boot, the PCRs hold a cumulative hash of every binary that ran. Remote attestation can compare the PCRs to expected values to verify system integrity.
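The extend operation is what makes PCRs cumulative: the new value is the hash of the old value concatenated with the new measurement, so a PCR can never be set directly. Below is a minimal model using SHA-256, with made-up stage contents. It illustrates the arithmetic only and does not talk to a TPM.

```python
import hashlib


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """PCR_new = SHA-256(PCR_old || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()


def measure_chain(images: list[bytes]) -> bytes:
    """Start from an all-zero PCR and extend it with each image's digest."""
    pcr = bytes(32)
    for image in images:
        pcr = extend(pcr, hashlib.sha256(image).digest())
    return pcr


# Hypothetical stage contents.
good = measure_chain([b"stage1", b"stage2", b"kernel"])
swapped = measure_chain([b"stage2", b"stage1", b"kernel"])
modified = measure_chain([b"stage1", b"stage2", b"kernel-patched"])

print(good.hex())
print(good == swapped, good == modified)  # False False
```

Because the final value depends on every measurement and on their order, a verifier that knows the expected images can recompute the PCR and compare it with the attested one.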
The TPM is a separate chip (or, increasingly, a firmware TPM running in a TEE). It provides:
- PCRs for measurement.
- Sealed storage (data can be encrypted such that it only decrypts when PCRs match).
- Attestation keys for proving identity remotely.
- Random number generation.
TPM 2.0 is the current standard. It is required for Windows 11 and is common in enterprise Linux deployments.
09. Server vs. Embedded Boot
Server boot (x86-64 with UEFI, ARM with SBBR/SBSA):
- POST takes seconds to tens of seconds (DRAM training, PCIe enumeration, network probe).
- Multiple boot devices possible (network, local disk, USB).
- Out-of-band management (BMC, IPMI, Redfish) provides remote control.
- ACPI tables describe hardware to OS.
- Measured boot common in enterprise deployments.
Embedded boot (small ARM, RISC-V microcontrollers, custom SoCs):
- Boot can take only milliseconds (no DRAM training needed if using on-chip SRAM; minimal probing).
- Usually a single, fixed boot path.
- Often boots from on-chip flash directly to firmware then OS or RTOS.
- Device tree describes hardware.
- Field updates require A/B partitions or recovery modes.
The boot mechanisms scale across this range, but the trade-offs are very different at each end.
10. Watchdogs and Recovery
Robust boot includes recovery mechanisms:
- Watchdog timers: hardware that resets the system if not "kicked" within a deadline. Catches hangs in firmware or kernel.
- A/B partitions: two copies of firmware/OS; if a new image fails to boot, fall back to the old one.
- Recovery mode: a minimal known-good firmware image that can be invoked by a special boot input (button, timing, jumper).
- Roll-back protection: firmware enforces version monotonicity to prevent downgrade attacks.
Modern Android phones, Chromebooks, and many IoT devices implement these patterns rigorously. Servers tend to rely more on operator intervention.
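How A/B fallback and roll-back protection interact can be shown with a small slot-selection routine. This is a generic sketch loosely modeled on common A/B schemes, not the logic of any particular bootloader; the `Slot` fields, the retry counter, and `min_version` are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    name: str
    version: int
    tries_left: int   # boot attempts remaining for an unconfirmed image
    successful: bool  # set by the OS once it has booted and confirmed health


def choose_slot(slots: list[Slot], min_version: int) -> Optional[Slot]:
    """Pick the newest bootable slot; None means 'enter recovery mode'."""
    candidates = [
        s for s in slots
        if s.version >= min_version              # roll-back protection
        and (s.successful or s.tries_left > 0)   # not yet given up on
    ]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda s: s.version)
    if not chosen.successful:
        chosen.tries_left -= 1  # spend one attempt before jumping to it
    return chosen


slots = [Slot("A", 5, 0, True), Slot("B", 6, 2, False)]
picks = [choose_slot(slots, min_version=5).name for _ in range(3)]
print(picks)  # ['B', 'B', 'A']
```

In this model the new image in slot B gets two attempts; if the OS never marks it successful, the next boot falls back to slot A. A watchdog reset is what typically turns a hang into another pass through this routine.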
11. OS Handoff
When firmware transfers control to the OS, what state must be in place?
x86-64 (UEFI handoff):
- CPU in long mode, paging enabled.
- Firmware page tables identity-map memory (virtual address equals physical address); the OS will install its own page tables.
- ACPI tables in memory at known locations.
- Memory map (a list of regions: usable, reserved, ACPI data, etc.).
- BSP running; APs (other cores) parked.
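One of the first things a kernel does with the memory map is work out which regions it may use. The sketch below models that with a simplified descriptor of (type, start, page count); the type numbers follow the UEFI memory type enumeration, while the region list itself is invented for illustration.

```python
from enum import IntEnum
from typing import NamedTuple

PAGE_SIZE = 4096  # UEFI memory map entries count 4 KiB pages


class MemType(IntEnum):
    RESERVED = 0
    LOADER_CODE = 1
    LOADER_DATA = 2
    BOOT_SERVICES_CODE = 3
    BOOT_SERVICES_DATA = 4
    CONVENTIONAL = 7
    ACPI_RECLAIM = 9


class Region(NamedTuple):
    type: MemType
    start: int
    pages: int


# Types a kernel can treat as free RAM once boot services have exited
# (loader regions only after it has finished with their contents).
RECLAIMABLE = {
    MemType.LOADER_CODE, MemType.LOADER_DATA,
    MemType.BOOT_SERVICES_CODE, MemType.BOOT_SERVICES_DATA,
    MemType.CONVENTIONAL,
}


def usable_bytes(memory_map: list[Region]) -> int:
    return sum(r.pages * PAGE_SIZE for r in memory_map if r.type in RECLAIMABLE)


# Hypothetical memory map.
memory_map = [
    Region(MemType.CONVENTIONAL, 0x0000_0000, 160),
    Region(MemType.RESERVED, 0x000A_0000, 96),
    Region(MemType.BOOT_SERVICES_DATA, 0x0010_0000, 256),
    Region(MemType.ACPI_RECLAIM, 0x0020_0000, 16),
    Region(MemType.CONVENTIONAL, 0x0021_0000, 1024),
]
print(usable_bytes(memory_map))  # 5898240
```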
AArch64 (UEFI or U-Boot):
- CPU in EL2 or EL1 (hypervisor or kernel level).
- MMU initially off (kernel turns it on).
- Device tree blob pointer in x0.
- BSP running; APs spinning at known address (or to be brought up via PSCI).
RISC-V (OpenSBI handoff):
- CPU in S-mode.
- Hartid in a0.
- Device tree blob pointer in a1.
- All harts in known states (boot hart at kernel entry; others stopped, brought up via SBI HSM calls).
12. Bringing Up Other Cores
The boot CPU is one core. Modern systems have many. The other cores must be brought online by the OS or firmware.
x86-64: the boot core sends an INIT IPI then a STARTUP IPI (SIPI) to each AP. The SIPI carries an 8-bit vector that selects a 4 KB-aligned page below 1 MB holding the AP's startup code. The AP wakes up in real mode at physical address vector × 0x1000, transitions to long mode, and joins the running kernel.
AArch64: PSCI (Power State Coordination Interface). The OS calls the PSCI CPU_ON function via SMC (or HVC when it runs under a hypervisor); ATF (in EL3) brings the target core out of reset, points it at a kernel-supplied entry, and the target core begins executing kernel code.
RISC-V: SBI HSM (Hart State Management). The OS calls sbi_hart_start with the target hart ID, an entry address, and an opaque parameter. OpenSBI brings the hart up; the hart begins executing at the given address.
In all cases, secondary cores execute a minimal startup sequence (set up stack, initialize per-core data, configure CPU registers) and join the scheduler.
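For the x86-64 case, the relationship between a startup IPI vector and the address where the AP begins executing is simple arithmetic: the 8-bit vector names a 4 KB page below 1 MB. The helper names below are made up for this sketch, and 0x8000 is just an example trampoline address.

```python
def sipi_vector_for(entry: int) -> int:
    """Return the SIPI vector for a real-mode trampoline at `entry`."""
    if not 0 <= entry < 0x100000:
        raise ValueError("trampoline must be below 1 MB")
    if entry % 0x1000:
        raise ValueError("trampoline must be 4 KB aligned")
    return entry >> 12


def sipi_start_state(vector: int) -> tuple[int, int]:
    """Return the (CS selector, IP) an AP starts with for this vector."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector is 8 bits")
    return vector << 8, 0


vector = sipi_vector_for(0x8000)
cs, ip = sipi_start_state(vector)
print(vector, hex(cs), hex((cs << 4) + ip))  # 8 0x800 0x8000
```

This constraint is why kernels copy a small real-mode trampoline into low memory before waking the other cores.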
13. Suspend and Resume
The inverse of boot, sort of: bringing a running system to a low-power state and back.
- Suspend to RAM (ACPI S3): most hardware powered off; DRAM in self-refresh; CPU off; wake on specific events (power button, lid open, RTC alarm, USB, network). Resume time: a second or less.
- Suspend to Disk (S4 / Hibernate): write RAM contents to disk; full power-off. Resume time: several seconds to a minute (read RAM image back).
- Modern Standby (S0ix on x86, deep idle on ARM; Linux calls the equivalent suspend-to-idle): the system nominally stays in S0 with the CPU and most devices in their deepest idle states, so it can wake quickly to handle events.
The resume path is essentially a partial boot, with the OS taking shortcuts (skipping things it knows are still configured) and restoring its own state from saved structures.
14. Summary
Firmware and boot bridge raw silicon and a running OS. The journey from reset vector to userspace login involves multiple stages, each with growing capabilities and increasingly complete views of the system. Each ISA's boot ecosystem differs in details — UEFI on x86, ATF on AArch64, OpenSBI on RISC-V — but the structure is similar: small, trustworthy initial code; progressively more capable stages; verification at each step; eventual handoff to the OS with describable hardware state.
Hardware description (ACPI on x86, Device Tree on ARM/RISC-V) lets the OS adapt to varied hardware without recompilation. Secure boot and TPMs root trust in hardware. Resume is a partial boot that skips whatever is still configured. None of this is glamorous, but all of it is essential.
The next chapter looks at one of the most consequential developments in modern systems software: virtualization. We'll see how hardware extensions (Intel VT-x, AMD SVM, ARM virtualization, RISC-V H extension) enable running multiple OSes on one machine, and how hypervisors leverage them.