The Write-Ahead Log (WAL) in MongoDB, known as the journal, is a durability mechanism that records every write operation to a disk-based log before it is applied to the main data files, ensuring that committed data can be recovered after a crash.
Write-ahead logging is a foundational technique for ensuring data durability in database systems, and MongoDB implements it through a component called the journal. The core principle is simple but critical: before any change is made to the main database files, a detailed record of that change is first written to a sequential, append-only log file on disk. If the system crashes before the change has been permanently applied to the data files, the change is not lost, because it can be recovered from the log.
MongoDB's storage engine, WiredTiger, uses a combination of the journal and checkpoints to achieve durability. Here’s the step-by-step workflow:
Write to Journal: When a write operation occurs, MongoDB first writes the changes to an in-memory journal buffer. This buffer is flushed to the on-disk journal files periodically: every 100 ms by default in current releases, configurable through storage.journal.commitIntervalMs. The j: true option in the write concern forces a sync to disk before the write is acknowledged.
Apply to Memory (Cache): After being recorded in the journal, the changes are applied to the in-memory data structures (the WiredTiger cache), and the affected pages are marked as 'dirty', meaning they differ from what is in the on-disk data files. The cache is where active data is manipulated for high performance.
Checkpoint (Flush to Data Files): Periodically, by default every 60 seconds, MongoDB creates a checkpoint. A checkpoint flushes all the dirty data from the in-memory cache to the on-disk data files (collection-*.wt and index-*.wt), creating a consistent snapshot of the database.
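The three steps above can be sketched as a toy model in Python. This is an illustration of the general flow under simplifying assumptions, not WiredTiger's actual implementation: the class name ToyStore, the file names journal.log and data.json, and the JSON record format are all hypothetical, and a single JSON file stands in for the collection and index data files.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy model of the journal -> cache -> checkpoint flow (hypothetical)."""

    def __init__(self, directory):
        self.journal_path = os.path.join(directory, "journal.log")
        self.data_path = os.path.join(directory, "data.json")
        self.journal_buffer = []  # journal records not yet on disk
        self.cache = {}           # in-memory view of the data
        self.dirty = set()        # keys changed since the last checkpoint

    def write(self, key, value, j=False):
        # Step 1: record the change in the in-memory journal buffer.
        self.journal_buffer.append({"key": key, "value": value})
        # Step 2: apply the change to the cache and mark it dirty.
        self.cache[key] = value
        self.dirty.add(key)
        # j=True mimics a journaled write concern: sync before acknowledging.
        if j:
            self.flush_journal()

    def flush_journal(self):
        # Append buffered records to the on-disk journal and force them to disk.
        with open(self.journal_path, "a", encoding="utf-8") as f:
            for record in self.journal_buffer:
                f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.journal_buffer.clear()

    def checkpoint(self):
        # Step 3: write a consistent snapshot of the cache to the data file.
        self.flush_journal()
        tmp_path = self.data_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.cache, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.data_path)  # swap in the new snapshot
        # Records covered by the checkpoint are no longer needed for recovery.
        open(self.journal_path, "w").close()
        self.dirty.clear()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = ToyStore(d)
        store.write("a", 1)          # buffered only, not yet durable
        store.write("b", 2, j=True)  # both records are now in the journal
        store.checkpoint()           # snapshot written, journal truncated
```

In a real deployment the flush and the checkpoint run on background timers; here they are explicit method calls so the ordering is easy to follow.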
A sudden power failure can halt a database at any moment, potentially between checkpoints. This is where the journal's role becomes crucial. When MongoDB restarts after a crash, it enters a recovery process. The system first identifies the last successful checkpoint in the data files. Then it reads the journal files and replays all write operations that were recorded after that checkpoint. This applies any writes that had reached the on-disk journal but had not yet been flushed to the data files, bringing the database back to a consistent state. Every write acknowledged with j: true survives; a write acknowledged without it can still be lost if the crash happens before its journal record is flushed from the in-memory buffer, so the exposure is bounded by roughly one journal flush interval.
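The replay logic can be illustrated with a short Python sketch. It assumes a hypothetical layout, not MongoDB's real on-disk format: the last checkpoint is a JSON file, and the journal is a text file with one JSON record per line. The function names recover and demo are invented for this example.

```python
import json
import os
import tempfile


def recover(data_path, journal_path):
    """Load the last checkpoint, then replay journal records on top of it."""
    state = {}
    if os.path.exists(data_path):
        with open(data_path, encoding="utf-8") as f:
            state = json.load(f)  # state as of the last checkpoint
    if os.path.exists(journal_path):
        with open(journal_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn record from the crash: stop replaying
                state[record["key"]] = record["value"]
    return state


def demo():
    """Simulate a crash after a checkpoint and two journaled writes."""
    with tempfile.TemporaryDirectory() as d:
        data_path = os.path.join(d, "data.json")
        journal_path = os.path.join(d, "journal.log")
        with open(data_path, "w", encoding="utf-8") as f:
            json.dump({"a": 1}, f)  # contents of the last checkpoint
        with open(journal_path, "w", encoding="utf-8") as f:
            f.write(json.dumps({"key": "b", "value": 2}) + "\n")
            f.write(json.dumps({"key": "a", "value": 3}) + "\n")
            f.write('{"key": "c", "val')  # write interrupted by the crash
        return recover(data_path, journal_path)


if __name__ == "__main__":
    print(demo())  # {'a': 3, 'b': 2}
```

The two complete journal records are applied on top of the checkpoint, while the half-written record for key "c" is discarded, which mirrors how an unflushed write is the only thing lost.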