System Calls

The thin boundary between user space and kernel — how syscalls work under the hood on x86-64 Linux.

Tags: syscall, kernel, linux, x86-64

What Is a System Call?

A system call (syscall) is a controlled entry point into the kernel. User-space code runs unprivileged (ring 3 on x86-64), so it can’t touch hardware, change page mappings, or talk to the network directly; it must ask the kernel to do so on its behalf via a syscall.
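To make "asking the kernel" concrete, here is a minimal sketch in C, assuming Linux with glibc or musl. It fetches the process ID twice: once through the generic syscall(2) function with the raw syscall number, and once through the ordinary libc wrapper. The function names pid_via_raw_syscall and pid_via_wrapper are hypothetical and exist only for illustration.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>       /* syscall(), getpid() */

/* Ask the kernel for our PID through the generic syscall(2) entry point,
 * passing the syscall number explicitly. */
long pid_via_raw_syscall(void) {
    return syscall(SYS_getpid);
}

/* Ask for the same thing through the usual libc wrapper. */
long pid_via_wrapper(void) {
    return (long)getpid();
}
```

Both functions should report the same PID, since the wrapper ultimately makes the same request of the kernel.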

The x86-64 Syscall Path

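  1. User code loads the syscall number into rax and up to six arguments into rdi, rsi, rdx, r10, r8, r9. The fourth argument goes in r10 rather than rcx (which the ordinary function-call convention would use) because the syscall instruction overwrites rcx.
  2. The syscall instruction switches to kernel mode (ring 0), saves the return address (the rip of the next instruction) into rcx and rflags into r11, then jumps to the address held in the LSTAR MSR (MSR_LSTAR), which the kernel has set to its syscall entry point.
  3. The kernel's entry code bounds-checks rax and dispatches to the matching handler, conceptually sys_call_table[rax].
  4. The handler's return value is placed in rax (a value from -4095 to -1 is a negated errno), and sysret returns to user mode, restoring rip from rcx and rflags from r11.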
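
; write(1, buf, len); NASM syntax, buf is assumed to be defined elsewhere
mov rax, 1          ; __NR_write
mov rdi, 1          ; fd = stdout
lea rsi, [rel buf]  ; buffer address (RIP-relative)
mov rdx, 13         ; length in bytes
syscall             ; rax = bytes written or -errno; rcx and r11 are clobbered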
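The same register protocol can be sketched from C with GCC/Clang inline assembly. This is an illustration under stated assumptions, not how libc implements its wrappers: raw_syscall3 is a hypothetical helper, it handles at most three arguments, and on non-x86-64 targets it falls back to syscall(2) while imitating the raw -errno return convention.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue a syscall with up to three arguments and
 * return whatever the kernel left in rax (a result, or a negated errno). */
long raw_syscall3(long nr, long a1, long a2, long a3) {
#if defined(__x86_64__)
    long ret;
    __asm__ volatile(
        "syscall"
        : "=a"(ret)                           /* result comes back in rax   */
        : "0"(nr), "D"(a1), "S"(a2), "d"(a3)  /* rax, rdi, rsi, rdx         */
        : "rcx", "r11", "memory");            /* syscall overwrites rcx/r11 */
    return ret;
#else
    /* Assumption: elsewhere, go through libc and mimic the raw convention. */
    long ret = syscall(nr, a1, a2, a3);
    return ret < 0 ? -(long)errno : ret;
#endif
}
```

For example, raw_syscall3(SYS_write, 1, (long)"hi\n", 3) writes three bytes to stdout, while the same call on an invalid descriptor such as -1 returns -EBADF rather than setting errno, because translating negative results into errno is the libc wrapper's job.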

vDSO: Avoiding the Kernel

Some “syscalls” (e.g., gettimeofday, clock_gettime) are performance-critical and only read kernel-maintained data. For these, the kernel maps a small shared object (the vDSO) into every process's address space. It contains user-space implementations of those calls, so in the common case no ring transition is needed.
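One way to see the vDSO from inside a process is to ask for its load address, which the kernel advertises in the auxiliary vector under AT_SYSINFO_EHDR. The sketch below assumes Linux with a libc that provides getauxval (glibc 2.16 or later, or musl); vdso_status and monotonic_ns are hypothetical names. Whether a given clock_gettime call really stays in user space depends on the libc, the clock ID, and the clock source, so treat monotonic_ns as a call that is typically served by the vDSO rather than guaranteed to be.

```c
#define _GNU_SOURCE
#include <elf.h>        /* AT_SYSINFO_EHDR on some libcs */
#include <string.h>
#include <sys/auxv.h>   /* getauxval */
#include <time.h>

/* Returns 1 if a vDSO is advertised and begins with the ELF magic,
 * 0 if an address is advertised but does not look like ELF,
 * -1 if the kernel did not advertise a vDSO at all. */
int vdso_status(void) {
    const unsigned char *base =
        (const unsigned char *)getauxval(AT_SYSINFO_EHDR);
    if (base == NULL)
        return -1;
    return memcmp(base, "\177ELF", 4) == 0 ? 1 : 0;
}

/* Reads the monotonic clock in nanoseconds, or -1 on failure.
 * On typical x86-64 Linux systems libc routes this through the vDSO. */
long long monotonic_ns(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

Running a program that calls monotonic_ns under strace is a quick check: if the vDSO path is taken, no clock_gettime line shows up in the trace.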

Cost of a Syscall

A bare syscall/sysret round trip costs roughly 50–100 ns on modern hardware, and more when mitigations such as kernel page-table isolation are enabled. Factor in cache/TLB pollution from the kernel entry and the real cost is often higher.
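A rough way to check that figure on your own machine is to time a cheap syscall in a loop. The sketch below is a crude estimate, not a rigorous benchmark: ns_per_syscall is a hypothetical helper, it uses getpid as the cheap syscall, and the result includes the libc syscall() wrapper, loop overhead, and whatever mitigations the running kernel has enabled.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock nanoseconds per getpid syscall over `iters` calls.
 * Returns -1.0 if iters is not positive or the clock cannot be read. */
double ns_per_syscall(long iters) {
    struct timespec start, end;
    if (iters <= 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (long i = 0; i < iters; i++)
        syscall(SYS_getpid);  /* enters the kernel on every iteration */
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;
    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
                        (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iters;
}
```

Calling ns_per_syscall(1000000) a few times and taking the lowest value gives a more stable number than a single run. Expect noticeably higher results inside virtual machines or sandboxes that intercept syscalls.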

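Optimization: Batch operations amortize syscall overhead across many I/O operations. readv/writev transfer several buffers in a single call, and io_uring lets a process queue many requests in rings shared with the kernel and submit them together.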
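As a small illustration of batching, the sketch below sends three separate buffers with one writev call instead of three write calls. The names write_three and writev_roundtrip_ok are hypothetical; the second function exists only to exercise the first through a pipe, and the strings are arbitrary sample data.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>   /* writev, struct iovec */
#include <unistd.h>

/* Write three NUL-terminated strings to fd with a single writev call.
 * Returns the byte count reported by the kernel, or -1 on error. */
ssize_t write_three(int fd, const char *a, const char *b, const char *c) {
    struct iovec iov[3] = {
        { .iov_base = (void *)a, .iov_len = strlen(a) },
        { .iov_base = (void *)b, .iov_len = strlen(b) },
        { .iov_base = (void *)c, .iov_len = strlen(c) },
    };
    return writev(fd, iov, 3);  /* one kernel entry for all three buffers */
}

/* Sends three strings through a pipe and returns 1 if they arrive
 * concatenated in order, 0 otherwise. */
int writev_roundtrip_ok(void) {
    int fds[2];
    char got[32] = {0};
    if (pipe(fds) != 0)
        return 0;
    ssize_t sent = write_three(fds[1], "sys", "call", "s\n");
    ssize_t n = read(fds[0], got, sizeof got - 1);
    close(fds[0]);
    close(fds[1]);
    return sent == 9 && n == 9 && strcmp(got, "syscalls\n") == 0;
}
```

The saving here is two kernel entries. With many small buffers per request the effect grows, which is the same idea io_uring pushes further by batching whole operations.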