The GMP model (Goroutine, Machine, Processor) describes Go's runtime scheduler, which multiplexes many goroutines onto a small number of OS threads and balances load by work stealing, enabling very high concurrency at low overhead.
G (Goroutine): the unit of work with its own stack and state
M (Machine/OS thread): the entity that actually executes code on the CPU
P (Processor): a logical processor that an M must hold to run Go code; it owns a local run queue, and the number of Ps is set by GOMAXPROCS
Each P runs one G at a time on one M, so GOMAXPROCS caps parallelism (goroutines executing Go code simultaneously), not concurrency (goroutines in flight)
Global run queue: shared overflow queue used when a P's local queue is full; every P also checks it periodically so goroutines waiting there are not starved
Work stealing: an idle P takes half of the runnable goroutines from another P's local queue, giving automatic load balancing
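Preemption: cooperative at function calls (via the stack check in the function prologue); since Go 1.14 also asynchronous (signal-based on Unix-like systems), so a tight loop with no calls can no longer starve other goroutines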
Blocking syscall: when a G blocks in a syscall, its M blocks with it; the P is handed off to another M (idle or newly created) so other goroutines keep running
Network I/O: Go's netpoller parks goroutines awaiting I/O without blocking an OS thread