//! # Journaling for Crash Consistency.
//!
//! File systems must ensure data consistency in the presence of crashes or
//! power failures. When a system crash or power failure occurs, in-progress
//! file operations may leave the file system in an inconsistent state, where
//! metadata and data blocks are only partially updated. This can lead to file
//! corruption, orphaned blocks, or even complete data loss. Thus, modern file
//! systems must guard against these scenarios to ensure durability and
//! recoverability.
//!
//! To address this, modern file systems employ **journaling**. Journaling
//! provides crash consistency by recording intended changes to a special log
//! (called the journal) before applying them to the main file system. In the
//! event of a crash, the journal can be replayed to recover to a consistent
//! state. This significantly reduces the risk of data corruption and allows
//! faster recovery after unclean shutdowns, without the need for full
//! file system checks.
//!
//! In this approach, all intended updates, such as block allocations, inode
//! changes, or directory modifications, are first written to the
//! **journal**. Only after the log is safely persisted to disk are the actual
//! file system structures updated. This method provides a clear "intent
//! before action" protocol, making recovery predictable and bounded.
//!
//! ## Journaling in KeOS
//!
//! To explore the fundamentals of crash-consistent file systems, **KeOS
//! implements a minimal metadata journaling mechanism** using the well-known
//! technique of **write-ahead logging**. This mechanism ensures that
//! updates to file system structures are made durable and recoverable.
//!
//! The journaling mechanism is anchored by a **journal superblock**, which
//! includes a `commited` flag. This flag indicates whether the journal area
//! currently holds valid, committed journal data that has not yet been
//! checkpointed.
//!
//! Journaling in KeOS is structured around four key stages: **metadata
//! updates**, **commit**, **checkpoint**, and **recovery**.
//!
//! ### 1. Metadata Updates
//!
//! In KeOS, journaling is tightly integrated with the [`RunningTransaction`]
//! struct, which acts as the central abstraction for managing write-ahead
//! logging of file system changes. All journaled operations must be serialized
//! through this structure to ensure consistency.
//!
//! Internally, a [`RunningTransaction`] holds the `SpinLock` guard of the
//! journal for its whole lifetime, enforcing **global serialization** of
//! journal writes. This design guarantees that only one transaction may be in
//! progress at any given time, preventing concurrent updates to the same
//! block, which could otherwise result in a corrupted or inconsistent state.
//!
//! Crucially, KeOS uses Rust’s strong type system to enforce this safety at
//! compile time: without access to an active [`RunningTransaction`], it is
//! **impossible** to write metadata blocks. All metadata modifications must be
//! submitted explicitly via the `submit()` method, which stages the changes for
//! journaling.
//!
//! If you forget to submit modified blocks through [`RunningTransaction`], the
//! kernel will **panic** with a clear error message, catching the issue early
//! and avoiding silent corruption. This design provides both safety and
//! transparency, making metadata updates robust and auditable.
//!
//! ### 2. Commit Phase: [`RunningTransaction::commit`]
//!
//! In the commit phase, KeOS records all pending modifications to a dedicated
//! **journal area** before applying them to their actual on-disk locations.
//!
//! A transaction begins with a **`TxBegin` block**, which contains a list of
//! logical block addresses that describe where the updates will eventually be
//! written. This is followed by the **journal data blocks**, which contain the
//! actual contents to be written to the specified logical blocks. Once all data
//! blocks have been written, a **`TxEnd` block** is appended to mark the
//! successful conclusion of the transaction. This write-ahead logging
//! discipline guarantees that no update reaches the main file system until its
//! full intent is safely recorded in the journal.
//!
//! You write journal blocks through the [`JournalWriter`] struct. This
//! structure carries a marker type that represents the current stage of the
//! commit phase, forcing you to write journal blocks in the correct order.
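//!
//! The type-state idea can be sketched in isolation. The example below is a
//! minimal illustration, not the KeOS implementation: `StagedWriter`,
//! `Begin`, `Body`, and `End` are hypothetical names, and the "journal" is
//! just an in-memory `Vec`. Each stage method consumes `self` and returns the
//! writer for the next stage, so calling the stages out of order fails to
//! compile.
//!
//! ```rust
//! use core::marker::PhantomData;
//!
//! // Hypothetical stage markers; they mirror, but are not, the KeOS types.
//! pub struct Begin;
//! pub struct Body;
//! pub struct End;
//!
//! /// Toy writer that appends records to an in-memory log.
//! pub struct StagedWriter<Stage> {
//!     log: Vec<&'static str>,
//!     _stage: PhantomData<Stage>,
//! }
//!
//! impl StagedWriter<Begin> {
//!     /// Creates a writer in the first stage.
//!     pub fn new() -> Self {
//!         StagedWriter { log: Vec::new(), _stage: PhantomData }
//!     }
//!
//!     /// Consumes the `Begin` writer and returns a `Body` writer.
//!     pub fn write_begin(mut self) -> StagedWriter<Body> {
//!         self.log.push("begin");
//!         StagedWriter { log: self.log, _stage: PhantomData }
//!     }
//! }
//!
//! impl StagedWriter<Body> {
//!     /// Consumes the `Body` writer and returns an `End` writer.
//!     pub fn write_body(mut self) -> StagedWriter<End> {
//!         self.log.push("body");
//!         StagedWriter { log: self.log, _stage: PhantomData }
//!     }
//! }
//!
//! impl StagedWriter<End> {
//!     /// Final stage: appends the end record and returns the finished log.
//!     pub fn write_end(mut self) -> Vec<&'static str> {
//!         self.log.push("end");
//!         self.log
//!     }
//! }
//! ```
//!
//! Under these assumptions,
//! `StagedWriter::<Begin>::new().write_begin().write_body().write_end()`
//! is the only call order that type-checks.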
//!
//! ### 3. Checkpoint Phase: [`Journal::checkpoint`]
//!
//! After a transaction is fully committed, the system proceeds to
//! **checkpoint** the journal. During checkpointing, the journaled data blocks
//! are copied from the journal area to their final destinations in the main
//! file system (i.e., to the logical block addresses specified in the `TxBegin`
//! block).
//!
//! Once all modified blocks have been written to their final locations, the
//! system clears the journal by resetting the `commited` flag in the journal
//! superblock. This indicates that the journal no longer needs to be replayed
//! after a crash.
//!
//! In modern file systems, checkpointing is typically performed
//! **asynchronously** in the background to minimize the latency of system calls
//! like `write()` or `fsync()`. This allows the file system to acknowledge the
//! operation as complete once the journal is committed, without waiting for the
//! final on-disk update.
//!
//! However, for simplicity in this project, **checkpointing is done
//! synchronously**: the file system waits until all journaled updates are
//! copied to their target locations before clearing the journal. This
//! simplifies correctness, avoids the need for background threads or
//! deferred work mechanisms, and reduces the work of maintaining a consistent
//! view between the disk and committed data.
//!
//! ### 4. Recovery: [`Journal::recovery`]
//!
//! If a crash occurs before the checkpointing phase completes, KeOS
//! **recovers** the file system during the next boot. It begins by inspecting
//! the journal superblock to determine whether a committed transaction exists.
//!
//! If the `commited` flag is set and a valid `TxBegin`/`TxEnd` pair is
//! present, this indicates a completed transaction whose changes have not yet
//! been checkpointed. In this case, KeOS retries the **checkpointing**. If the
//! journal is not marked as committed, the system discards the journal
//! entirely. This rollback ensures consistency by ignoring partially written
//! or aborted transactions.
//!
//! This recovery approach is both **bounded** and **idempotent**: it scans only
//! the small, fixed-size journal area, avoiding a costly full file system
//! traversal, and it can safely be retried without side effects if it is
//! interrupted again.
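//!
//! The replay-or-discard decision and its idempotence can be illustrated with
//! a small in-memory model. This is only a sketch under stated assumptions:
//! `ToyJournal`, `checkpoint`, and `recover` are hypothetical names, each
//! "block" is a single byte, and validation of the `TxBegin`/`TxEnd` pair is
//! left out. It does not reflect the KeOS on-disk layout or I/O interface.
//!
//! ```rust
//! /// Hypothetical in-memory model of the journal.
//! pub struct ToyJournal {
//!     /// Mirrors the committed flag of the journal superblock.
//!     pub committed: bool,
//!     /// Journaled `(target block index, new contents)` pairs.
//!     pub entries: Vec<(usize, u8)>,
//! }
//!
//! /// Copies every journaled entry to its target block, then clears the flag.
//! pub fn checkpoint(journal: &mut ToyJournal, disk: &mut [u8]) {
//!     if journal.committed {
//!         for &(lba, value) in &journal.entries {
//!             disk[lba] = value;
//!         }
//!         // Only after all copies are done is the journal marked clean.
//!         journal.committed = false;
//!     }
//! }
//!
//! /// Replays a committed journal; discards an uncommitted one.
//! pub fn recover(journal: &mut ToyJournal, disk: &mut [u8]) {
//!     if journal.committed {
//!         checkpoint(journal, disk);
//!     }
//!     // Either way, the journal contents are no longer needed.
//!     journal.entries.clear();
//! }
//! ```
//!
//! Because the flag is cleared only after every entry has been copied, a
//! crash in the middle of `checkpoint` leaves the flag set, and running
//! `recover` again rewrites the same values to the same blocks.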
//!
//! ## Implementation Requirements
//! You need to implement the following:
//!   - [`Journal::recovery`]
//!   - [`Journal::checkpoint`]
//!   - [`JournalWriter::<TxBegin>::write_tx_begin`]
//!   - [`JournalWriter::<Block>::write_blocks`]
//!   - [`JournalWriter::<TxEnd>::write_tx_end`]
//!
//! After implementing these functions, move on to the last [`section`] of
//! KeOS.
//!
//! [`section`]: mod@crate::advanced_file_structs

use crate::ffs::{
    FastFileSystemInner, JournalIO, LogicalBlockAddress,
    disk_layout::{JournalSb, JournalTxBegin, JournalTxEnd},
};
use alloc::{boxed::Box, vec::Vec};
use core::cell::RefCell;
use keos::{KernelError, sync::SpinLockGuard};

/// A structure representing the journal metadata used for crash consistency.
///
/// Journaling allows the file system to recover from crashes by recording
/// changes in a write-ahead log before committing them to the main file system.
/// This ensures that partially written operations do not corrupt the file
/// system state.
///
/// The `Journal` struct encapsulates the journaling superblock.
/// It is responsible for managing the checkpointing process, which commits
/// durable changes and clears completed transactions.
///
/// # Fields
/// - `sb`: The journal superblock, containing configuration and state of the
///   journal.
pub struct Journal {
    /// Journal superblock.
    pub sb: Box<JournalSb>,
}

impl Journal {
    /// Recovers any committed but not yet checkpointed transactions from the
    /// journal.
    ///
    /// This function is invoked during file system startup to ensure
    /// metadata consistency in the event of a system crash or power failure.
    /// It scans the on-disk journal area for valid transactions and re-applies
    /// them to the file system metadata.
    ///
    /// If no complete transaction is detected, the journal is left unchanged.
    /// If a partial or corrupt transaction is found, it is safely discarded.
    ///
    /// # Parameters
    /// - `ffs`: A reference to the core file system state, used to apply
    ///   recovered metadata.
    /// - `io`: The journal I/O interface used to read journal blocks and
    ///   perform recovery writes.
    ///
    /// # Returns
    /// - `Ok(())` if recovery completed successfully or no action was needed.
    /// - `Err(KernelError)` if an unrecoverable error occurred during recovery.
    pub fn recovery(
        &mut self,
        ffs: &FastFileSystemInner,
        io: &JournalIO,
    ) -> Result<(), KernelError> {
        todo!()
    }

    /// Commits completed journal transactions to the file system.
    ///
    /// This method performs the **checkpoint** operation: it flushes completed
    /// transactions from the journal into the main file system, ensuring their
    /// effects are permanently recorded.
    ///
    /// # Parameters
    /// - `ffs`: A reference to the file system core (`FastFileSystemInner`),
    ///   needed to apply changes to metadata blocks.
    /// - `io`: An object for performing I/O operations related to the journal.
    /// - `debug_journal`: If true, enables debug logging for checkpointing.
    ///
    /// # Returns
    /// - `Ok(())`: If checkpointing succeeds and all transactions are flushed.
    /// - `Err(KernelError)`: If I/O or consistency errors are encountered.
    pub fn checkpoint(
        &mut self,
        ffs: &FastFileSystemInner,
        io: &JournalIO,
        debug_journal: bool,
    ) -> Result<(), KernelError> {
        if self.sb.commited != 0 {
            let mut block = Box::new([0; 4096]);
            let tx_begin = JournalTxBegin::from_io(io, ffs.journal().start + 1)?;
            if debug_journal {
                println!("[FFS-Journal]: Transaction #{} [", tx_begin.tx_id);
            }
            for (idx, slot) in tx_begin.lbas.iter().enumerate() {
                if let Some(slot) = slot {
                    if debug_journal {
                        println!("[FFS-Journal]:      #{:04}: {:?},", idx, slot);
                    }
                    todo!();
                } else {
                    break;
                }
            }
            if debug_journal {
                println!("[FFS-Journal]: ] Checkpointed.");
            }
            self.sb.commited = 0;
            self.sb.writeback(io, ffs)?;
        }
        Ok(())
    }
}

/// Represents an in-progress file system transaction using write-ahead
/// journaling.
///
/// A `RunningTransaction` buffers metadata updates to disk blocks before they
/// are permanently written, ensuring crash consistency. When a transaction is
/// committed, the buffered blocks are flushed to the journal area first. Once
/// the journal write completes, the updates are applied to the actual metadata
/// locations on disk.
///
/// Transactions are used to group file system changes atomically: either all
/// updates in a transaction are committed, or none are, preventing partial
/// updates.
///
/// # Fields
/// - `tx`: A buffer that stores staged metadata writes as a list of (LBA, data)
///   tuples.
/// - `journal`: A locked handle to the global `Journal`, used during commit.
/// - `tx_id`: Unique identifier for the current transaction.
/// - `io`: The journal I/O interface used for block-level reads/writes.
/// - `debug_journal`: Enables logging of journal operations for debugging.
/// - `ffs`: A reference to the file system's core structure.
pub struct RunningTransaction<'a> {
    tx: RefCell<Vec<(LogicalBlockAddress, Box<[u8; 4096]>)>>,
    journal: Option<SpinLockGuard<'a, Journal>>,
    tx_id: u64,
    io: Option<JournalIO<'a>>,
    debug_journal: bool,
    pub ffs: &'a FastFileSystemInner,
}

impl<'a> RunningTransaction<'a> {
    /// Begins a new journaled transaction.
    ///
    /// Initializes the transaction state and prepares to buffer metadata
    /// writes.
    ///
    /// # Parameters
    /// - `name`: A label for the transaction, useful for debugging.
    /// - `ffs`: The file system core structure.
    /// - `io`: The journal I/O interface for block operations.
    /// - `debug_journal`: Enables verbose logging if set to `true`.
    #[inline]
    pub fn begin(
        name: &str,
        ffs: &'a FastFileSystemInner,
        io: JournalIO<'a>,
        debug_journal: bool,
    ) -> Self {
        let mut journal = ffs.journal.as_ref().map(|journal| journal.lock());
        let tx_id = journal
            .as_mut()
            .map(|j| {
                let tx_id = j.sb.tx_id;
                j.sb.tx_id += 1;
                tx_id
            })
            .unwrap_or(0);
        if debug_journal && journal.is_some() {
            println!("[FFS-Journal]: Transaction #{} \"{}\" [", tx_id, name);
        }
        RunningTransaction {
            tx: RefCell::new(Vec::new()),
            journal,
            io: Some(io),
            tx_id,
            debug_journal,
            ffs,
        }
    }

    /// Buffers a metadata block modification for inclusion in the transaction.
    ///
    /// The actual write is deferred until `commit()` is called.
    ///
    /// # Parameters
    /// - `lba`: The logical block address where the metadata will eventually be
    ///   written.
    /// - `data`: A boxed page of data representing the new metadata contents.
    /// - `ty`: The type name of the metadata, as a string (for debugging).
    #[inline]
    pub fn write_meta(&self, lba: LogicalBlockAddress, data: Box<[u8; 4096]>, ty: &str) {
        if self.debug_journal {
            println!(
                "[FFS-Journal]:      #{:04}: {:20} - {:?},",
                self.tx.borrow_mut().len(),
                ty.split(":").last().unwrap_or("?"),
                lba
            );
        }
        self.tx.borrow_mut().push((lba, data));
    }

    /// Commits the transaction to the journal and applies changes to disk.
    ///
    /// This method performs the following steps:
    /// 1. Writes all staged metadata blocks to the journal region on disk.
    /// 2. Updates the journal superblock.
    /// 3. Checkpoints the journal.
    ///
    /// # Returns
    /// - `Ok(())`: If the transaction was successfully committed and
    ///   checkpointed.
    /// - `Err(KernelError)`: If an I/O or consistency error occurred.
    pub fn commit(mut self) -> Result<(), KernelError> {
        // A real file system applies further optimizations to reduce disk I/O,
        // such as merging writes to the same LBA within a journal into one block.
        let (io, tx, journal, tx_id, ffs, debug_journal) = (
            self.io.take().unwrap(),
            core::mem::take(&mut *self.tx.borrow_mut()),
            self.journal.take(),
            self.tx_id,
            self.ffs,
            self.debug_journal,
        );

        if let Some(journal) = journal {
            if debug_journal {
                println!("[FFS-Journal]: ] Commited.");
            }
            let (mut journal, io) = JournalWriter::new(tx, journal, io, ffs, tx_id)
                .write_tx_begin()?
                .write_blocks()?
                .write_tx_end()?;

            // In a real file system, checkpointing is performed asynchronously
            // by a kernel thread.
            //
            // However, to keep the implementation simple, the journaled updates
            // are checkpointed synchronously right after the commit.
            let result = journal.checkpoint(ffs, &io, debug_journal);
            journal.unlock();
            result
        } else {
            // When journaling is not supported, write the metadata directly to
            // its final locations.
            for (lba, block) in tx.into_iter() {
                io.write_metadata_block(lba, block.as_array().unwrap())?;
            }
            Ok(())
        }
    }
}

impl Drop for RunningTransaction<'_> {
    fn drop(&mut self) {
        if let Some(journal) = self.journal.take() {
            journal.unlock();
        }
    }
}

/// Marker type for the first phase of a journal commit: TxBegin.
///
/// Used with [`JournalWriter`] to enforce commit stage ordering via the type
/// system.
pub struct TxBegin {}

/// Marker type for the second phase of a journal commit: writing the metadata
/// blocks.
///
/// Ensures that [`JournalWriter::write_tx_begin`] must be called before
/// [`JournalWriter::write_blocks`].
pub struct Block {}

/// Marker type for the final phase of a journal commit: TxEnd.
///
/// Ensures that [`JournalWriter::write_blocks`] is completed before finalizing
/// the transaction.
pub struct TxEnd {}

/// A staged writer for committing a transaction to the journal.
///
/// `JournalWriter` uses a type-state pattern to enforce the correct sequence of
/// journal writes:
/// - `JournalWriter<TxBegin>`: Can only call [`JournalWriter::write_tx_begin`].
/// - `JournalWriter<Block>`: Can only call [`JournalWriter::write_blocks`].
/// - `JournalWriter<TxEnd>`: Can only call [`JournalWriter::write_tx_end`].
///
/// This staged API ensures that transactions are written in the correct order
/// and prevents accidental misuse.
pub struct JournalWriter<'a, WriteTarget> {
    /// Staged list of (LBA, data) pairs representing metadata blocks to commit.
    tx: Vec<(LogicalBlockAddress, Box<[u8; 4096]>)>,

    /// A lock-protected handle to the journal structure.
    journal: SpinLockGuard<'a, Journal>,

    /// I/O interface for reading/writing disk blocks.
    io: JournalIO<'a>,

    /// Reference to the file system's core state.
    ffs: &'a FastFileSystemInner,

    /// Unique identifier of the transaction.
    tx_id: u64,

    /// Internal index tracking progress through `tx`.
    index: usize,

    /// Phantom data used to track the current commit stage.
    _write_target: core::marker::PhantomData<WriteTarget>,
}

impl<'a> JournalWriter<'a, TxBegin> {
    /// Creates a new `JournalWriter` in the initial `TxBegin` stage.
    ///
    /// This prepares the writer for the staged commit sequence of the given
    /// transaction.
    ///
    /// # Parameters
    /// - `tx`: The list of metadata blocks to be written.
    /// - `journal`: A locked handle to the global journal state.
    /// - `io`: The disk I/O interface.
    /// - `ffs`: A reference to the file system.
    /// - `tx_id`: A unique ID assigned to the transaction.
    ///
    /// # Returns
    /// A `JournalWriter` instance in the `TxBegin` state.
    pub fn new(
        tx: Vec<(LogicalBlockAddress, Box<[u8; 4096]>)>,
        journal: SpinLockGuard<'a, Journal>,
        io: JournalIO<'a>,
        ffs: &'a FastFileSystemInner,
        tx_id: u64,
    ) -> Self {
        Self {
            tx,
            journal,
            io,
            ffs,
            tx_id,
            index: 0,
            _write_target: core::marker::PhantomData,
        }
    }

    /// Writes the `TxBegin` marker to the journal.
    ///
    /// This signals the start of a journaled transaction. Must be called before
    /// writing the data blocks.
    ///
    /// # Returns
    /// A `JournalWriter` in the `Block` stage.
    pub fn write_tx_begin(mut self) -> Result<JournalWriter<'a, Block>, KernelError> {
        let mut tx_begin = JournalTxBegin::new(self.tx_id);
        todo!();
        Ok(JournalWriter {
            tx: self.tx,
            journal: self.journal,
            ffs: self.ffs,
            io: self.io,
            tx_id: self.tx_id,
            index: self.index,
            _write_target: core::marker::PhantomData,
        })
    }
}

impl<'a> JournalWriter<'a, Block> {
    /// Writes all staged metadata blocks to the journal.
    ///
    /// Each block is written sequentially to a dedicated journal area.
    /// This must be called after `write_tx_begin()` and before finalizing with
    /// `write_tx_end()`.
    ///
    /// # Returns
    /// A `JournalWriter` in the `TxEnd` stage.
    pub fn write_blocks(mut self) -> Result<JournalWriter<'a, TxEnd>, KernelError> {
        todo!();
        Ok(JournalWriter {
            tx: self.tx,
            journal: self.journal,
            ffs: self.ffs,
            io: self.io,
            tx_id: self.tx_id,
            index: self.index,
            _write_target: core::marker::PhantomData,
        })
    }
}

impl<'a> JournalWriter<'a, TxEnd> {
    /// Writes the `TxEnd` block and completes the transaction by updating the
    /// journal superblock.
    ///
    /// This signals a successfully completed transaction and allows recovery
    /// mechanisms to apply the journal contents to the actual file system
    /// metadata.
    ///
    /// # Returns
    /// - The locked journal and I/O handle, to checkpoint the journal.
    /// - `Err(KernelError)` if the final commit stage fails.
    pub fn write_tx_end(
        mut self,
    ) -> Result<(SpinLockGuard<'a, Journal>, JournalIO<'a>), KernelError> {
        let tx_end = JournalTxEnd::new(self.tx_id);
        // In a real file system, this TxEnd block is usually omitted to reduce
        // disk I/O.
        todo!();

        // Mark the transaction as committed in the JournalSb.
        let Self {
            mut journal,
            io,
            ffs,
            ..
        } = self;
        journal.sb.commited = 1;
        match journal.sb.writeback(&io, ffs) {
            Ok(_) => Ok((journal, io)),
            Err(e) => {
                journal.unlock();
                Err(e)
            }
        }
    }
}
560        }
561    }
562}