// keos_project5/ffs/journal.rs

//! # Journaling for Crash Consistency.
//!
//! File systems must ensure data consistency in the presence of crashes or
//! power failures. When a system crash or power failure occurs, in-progress
//! file operations may leave the file system in an inconsistent state, where
//! metadata and data blocks are only partially updated. This can lead to file
//! corruption, orphaned blocks, or even complete data loss. Thus, modern file
//! systems must guard against these scenarios to ensure durability and
//! recoverability.
//!
//! To address this, modern file systems employ **journaling**. Journaling
//! provides crash consistency by recording intended changes to a special log
//! (called the **journal**) before applying them to the main file system. In
//! the event of a crash, the journal can be replayed to recover to a
//! consistent state. This significantly reduces the risk of data corruption
//! and allows faster recovery after unclean shutdowns, without the need for
//! full file system checks.
//!
//! In this approach, all intended updates, such as block allocations, inode
//! changes, or directory modifications, are first written to the journal.
//! Only after the log is safely persisted to disk are the actual file system
//! structures updated. This method provides a clear "intent before action"
//! protocol, making recovery predictable and bounded.
//!
//! ## Journaling in KeOS
//!
//! To explore the fundamentals of crash-consistent file systems, **KeOS
//! implements a minimal metadata journaling mechanism** using the well-known
//! technique of **write-ahead logging**. This mechanism ensures that
//! updates to file system structures are made durable and recoverable.
//!
//! The journaling mechanism is anchored by a **journal superblock**, which
//! includes a `commited` flag. This flag indicates whether the journal area
//! currently holds valid, committed journal data that has not yet been
//! checkpointed.
//!
//! Journaling in KeOS is structured around four key stages: **metadata
//! updates**, **commit**, **checkpoint**, and **recovery**.
//!
//! ### 1. Metadata Updates
//!
//! In KeOS, journaling is tightly integrated with the [`RunningTransaction`]
//! struct, which acts as the central abstraction for managing write-ahead
//! logging of file system changes. All journaled operations must be serialized
//! through this structure to ensure consistency.
//!
//! Internally, [`RunningTransaction`] holds the `SpinLock` that protects the
//! journal and its superblock, enforcing **global serialization** of journal
//! writes. This design guarantees that only one transaction may be in progress
//! at any given time, preventing concurrent updates to the same block, which
//! could otherwise result in a corrupted or inconsistent state.
//!
//! Crucially, KeOS uses Rust’s strong type system to enforce this safety at
//! compile time: without access to an active [`RunningTransaction`], it is
//! **impossible** to write metadata blocks. All metadata modifications must be
//! submitted explicitly via the `submit()` method, which stages the changes for
//! journaling.
//!
//! If you forget to submit modified blocks through [`RunningTransaction`], the
//! kernel will **panic** with a clear error message, catching the issue early
//! and avoiding silent corruption. This design provides both safety and
//! transparency, making metadata updates robust and auditable.
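//!
//! The idea that metadata can only be written through a transaction can be
//! sketched as follows. This is a simplified illustration, not the KeOS API:
//! `ToyTx`, `ToyInode`, and the signature of `submit()` shown here are
//! hypothetical, and the sketch uses `std` collections for brevity.
//!
//! ```rust
//! use std::cell::RefCell;
//!
//! /// Hypothetical stand-in for a running transaction: it only buffers writes.
//! pub struct ToyTx {
//!     staged: RefCell<Vec<(u64, Vec<u8>)>>,
//! }
//!
//! impl ToyTx {
//!     pub fn begin() -> Self {
//!         ToyTx { staged: RefCell::new(Vec::new()) }
//!     }
//!
//!     /// Stages a block; nothing reaches the "disk" until `commit`.
//!     pub fn write_meta(&self, lba: u64, data: Vec<u8>) {
//!         self.staged.borrow_mut().push((lba, data));
//!     }
//!
//!     /// Consumes the transaction and hands back the staged writes in order.
//!     pub fn commit(self) -> Vec<(u64, Vec<u8>)> {
//!         self.staged.into_inner()
//!     }
//! }
//!
//! /// Hypothetical metadata object stored at block `lba`.
//! pub struct ToyInode {
//!     pub lba: u64,
//!     pub size: u32,
//! }
//!
//! impl ToyInode {
//!     /// Persisting the inode requires a `&ToyTx`, so a caller without an
//!     /// active transaction has no way to write this metadata.
//!     pub fn submit(&self, tx: &ToyTx) {
//!         tx.write_meta(self.lba, self.size.to_le_bytes().to_vec());
//!     }
//! }
//! ```
//!
//! Because `submit` takes the transaction by reference, the compiler rejects
//! any metadata write attempted outside of a transaction in this sketch.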
//!
//! ### 2. Commit Phase: [`RunningTransaction::commit`]
//!
//! In the commit phase, KeOS records all pending modifications to a dedicated
//! **journal area** before applying them to their actual on-disk locations.
//!
//! A transaction begins with a **`TxBegin` block**, which contains a list of
//! logical block addresses that describe where the updates will eventually be
//! written. This is followed by the **journal data blocks**, which contain the
//! actual contents to be written to the specified logical blocks. Once all data
//! blocks have been written, a **`TxEnd` block** is appended to mark the
//! successful conclusion of the transaction. This write-ahead logging
//! discipline guarantees that no update reaches the main file system until its
//! full intent is safely recorded in the journal.
//!
//! You write journal blocks with the [`JournalWriter`] struct. It is
//! parameterized by a marker type that represents the current stage of the
//! commit phase, which forces you to write journal blocks in the correct
//! order.
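//!
//! The staged writer can be illustrated with a minimal type-state sketch.
//! The names `ToyWriter`, `Begin`, `Data`, and `End` are hypothetical
//! stand-ins, and the "journal" is just an in-memory list of strings; the
//! real [`JournalWriter`] writes disk blocks instead.
//!
//! ```rust
//! use std::marker::PhantomData;
//!
//! // Hypothetical stage markers, one per commit step.
//! pub struct Begin;
//! pub struct Data;
//! pub struct End;
//!
//! /// Toy writer that appends journal records to an in-memory log.
//! pub struct ToyWriter<Stage> {
//!     log: Vec<String>,
//!     _stage: PhantomData<Stage>,
//! }
//!
//! impl ToyWriter<Begin> {
//!     pub fn new() -> Self {
//!         ToyWriter { log: Vec::new(), _stage: PhantomData }
//!     }
//!
//!     /// Records the destination addresses, then moves to the `Data` stage.
//!     pub fn write_tx_begin(mut self, lbas: &[u64]) -> ToyWriter<Data> {
//!         self.log.push(format!("TxBegin{:?}", lbas));
//!         ToyWriter { log: self.log, _stage: PhantomData }
//!     }
//! }
//!
//! impl ToyWriter<Data> {
//!     /// Records every data block, then moves to the `End` stage.
//!     pub fn write_blocks(mut self, blocks: &[&str]) -> ToyWriter<End> {
//!         for block in blocks {
//!             self.log.push(format!("Block({})", block));
//!         }
//!         ToyWriter { log: self.log, _stage: PhantomData }
//!     }
//! }
//!
//! impl ToyWriter<End> {
//!     /// Appends the end marker and returns the finished log.
//!     pub fn write_tx_end(mut self) -> Vec<String> {
//!         self.log.push("TxEnd".to_string());
//!         self.log
//!     }
//! }
//! ```
//!
//! Each method consumes `self` and returns the writer in the next stage, so
//! calling, say, `write_tx_end` on a `ToyWriter<Begin>` does not compile.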
//!
//! ### 3. Checkpoint Phase: [`Journal::checkpoint`]
//!
//! After a transaction is fully committed, the system proceeds to
//! **checkpoint** the journal. During checkpointing, the journaled data blocks
//! are copied from the journal area to their final destinations in the main
//! file system (i.e., to the logical block addresses specified in the `TxBegin`
//! block).
//!
//! Once all modified blocks have been written to their final locations, the
//! system clears the journal by resetting the `commited` flag in the journal
//! superblock. This indicates that the journal holds nothing to replay if a
//! crash occurs afterwards.
//!
//! In modern file systems, checkpointing is typically performed
//! **asynchronously** in the background to minimize the latency of system calls
//! like `write()` or `fsync()`. This allows the file system to acknowledge the
//! operation as complete once the journal is committed, without waiting for the
//! final on-disk update.
//!
//! However, for simplicity in this project, **checkpointing is done
//! synchronously**: the file system waits until all journaled updates are
//! copied to their target locations before clearing the journal. This
//! simplifies correctness, avoids the need for background threads or
//! deferred work mechanisms, and reduces the work needed to maintain a
//! consistent view between the disk and committed data.
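//!
//! The checkpoint step can be pictured with a small in-memory model. This is
//! only a sketch under simplifying assumptions: `ToyDisk` and its fields are
//! hypothetical, one `u8` stands in for a whole block, and no real I/O or
//! error handling is performed.
//!
//! ```rust
//! /// Hypothetical in-memory model of the disk; one `u8` models one block.
//! pub struct ToyDisk {
//!     /// Models the `commited` flag of the journal superblock.
//!     pub commited: bool,
//!     /// Models the LBA list stored in the `TxBegin` block.
//!     pub lbas: Vec<usize>,
//!     /// Models the journal data blocks, in the same order as `lbas`.
//!     pub journal_data: Vec<u8>,
//!     /// Models the main file system area, indexed by LBA.
//!     pub blocks: Vec<u8>,
//! }
//!
//! /// Copies every journaled block to its destination, then clears the flag.
//! pub fn checkpoint(disk: &mut ToyDisk) {
//!     if !disk.commited {
//!         return; // Nothing committed, so there is nothing to copy.
//!     }
//!     for (slot, &lba) in disk.lbas.iter().enumerate() {
//!         disk.blocks[lba] = disk.journal_data[slot];
//!     }
//!     // Clear the flag only after all blocks reached their final location.
//!     disk.commited = false;
//! }
//! ```
//!
//! Clearing the flag last is what keeps a crash in the middle of the copy
//! loop harmless in this model: the journal is still marked as committed and
//! can be copied again.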
//!
//! ### 4. Recovery: [`Journal::recovery`]
//!
//! If a crash occurs before the checkpointing phase completes, KeOS
//! **recovers** the file system during the next boot. It begins by inspecting
//! the journal superblock to determine whether a committed transaction exists.
//!
//! If the `commited` flag is set and a valid `TxBegin`/`TxEnd` pair is
//! present, this indicates a committed transaction whose changes have not yet
//! been fully checkpointed. In this case, KeOS retries the **checkpointing**.
//! If the journal is not marked as committed, the system discards the journal
//! entirely. This rollback ensures consistency by ignoring partially written
//! or aborted transactions.
//!
//! This recovery approach is both **bounded** and **idempotent**: it scans only
//! the small, fixed-size journal area, avoiding costly full file system
//! traversal, and it can safely retry recovery without side effects if
//! interrupted again.
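//!
//! The recovery decision and its idempotence can be sketched on a toy model.
//! `ToyFs` and `recover` are hypothetical names used only for illustration;
//! one `u8` models one block, and validation of the `TxBegin`/`TxEnd` pair is
//! left out.
//!
//! ```rust
//! /// Hypothetical in-memory model; one `u8` models one block.
//! pub struct ToyFs {
//!     /// Models the `commited` flag of the journal superblock.
//!     pub commited: bool,
//!     /// Journaled updates as (destination LBA, new contents) pairs.
//!     pub journal: Vec<(usize, u8)>,
//!     /// Models the main file system area, indexed by LBA.
//!     pub blocks: Vec<u8>,
//! }
//!
//! /// Replays a committed journal, or discards an uncommitted one.
//! pub fn recover(fs: &mut ToyFs) {
//!     if fs.commited {
//!         // Rewriting a block that was already checkpointed stores the same
//!         // contents again, so replaying after a partial checkpoint is safe.
//!         for &(lba, data) in &fs.journal {
//!             fs.blocks[lba] = data;
//!         }
//!         fs.commited = false;
//!     }
//!     // In both cases the journal contents are no longer needed.
//!     fs.journal.clear();
//! }
//! ```
//!
//! Running `recover` a second time finds the flag cleared and changes
//! nothing, which is the idempotence property described above.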
//!
//! ## Implementation Requirements
//! You need to implement the following:
//!   - [`Journal::recovery`]
//!   - [`Journal::checkpoint`]
//!   - [`JournalWriter::<TxBegin>::write_tx_begin`]
//!   - [`JournalWriter::<Block>::write_blocks`]
//!   - [`JournalWriter::<TxEnd>::write_tx_end`]
//!
//! After implementing these functions, move on to the last [`section`] of
//! KeOS.
//!
//! [`section`]: mod@crate::advanced_file_structs

use crate::ffs::{
    FastFileSystemInner, JournalIO, LogicalBlockAddress,
    disk_layout::{JournalSb, JournalTxBegin, JournalTxEnd},
};
use alloc::{boxed::Box, vec::Vec};
use core::cell::RefCell;
use keos::{KernelError, sync::SpinLockGuard};

/// A structure representing the journal metadata used for crash consistency.
///
/// Journaling allows the file system to recover from crashes by recording
/// changes in a write-ahead log before committing them to the main file system.
/// This ensures that partially written operations do not corrupt the file
/// system state.
///
/// The `Journal` struct encapsulates the journal superblock. It is
/// responsible for managing the checkpointing process, which applies
/// committed changes to the main file system and clears completed
/// transactions.
///
/// # Fields
/// - `sb`: The journal superblock, containing configuration and state of the
///   journal.
pub struct Journal {
    /// Journal superblock.
    pub sb: Box<JournalSb>,
}

impl Journal {
    /// Recovers any committed but not yet checkpointed transactions from the
    /// journal.
    ///
    /// This function is invoked during file system startup to ensure
    /// metadata consistency in the event of a system crash or power failure.
    /// It scans the on-disk journal area for valid transactions and re-applies
    /// them to the file system metadata.
    ///
    /// If no complete transaction is detected, the journal is left unchanged.
    /// If a partial or corrupt transaction is found, it is safely discarded.
    ///
    /// # Parameters
    /// - `ffs`: A reference to the core file system state, used to apply
    ///   recovered metadata.
    /// - `io`: The journal I/O interface used to read journal blocks and
    ///   perform recovery writes.
    ///
    /// # Returns
    /// - `Ok(())` if recovery completed successfully or no action was needed.
    /// - `Err(KernelError)` if an unrecoverable error occurred during recovery.
    pub fn recovery(
        &mut self,
        ffs: &FastFileSystemInner,
        io: &JournalIO,
    ) -> Result<(), KernelError> {
        todo!()
    }

    /// Checkpoints committed journal transactions to the file system.
    ///
    /// This method performs the **checkpoint** operation: it flushes committed
    /// transactions from the journal into the main file system, ensuring their
    /// effects are permanently recorded.
    ///
    /// # Parameters
    /// - `ffs`: A reference to the file system core (`FastFileSystemInner`),
    ///   needed to apply changes to metadata blocks.
    /// - `io`: An object for performing I/O operations related to the journal.
    /// - `debug_journal`: If true, enables debug logging for checkpointing.
    ///
    /// # Returns
    /// - `Ok(())`: If checkpointing succeeds and all transactions are flushed.
    /// - `Err(KernelError)`: If I/O or consistency errors are encountered.
    pub fn checkpoint(
        &mut self,
        ffs: &FastFileSystemInner,
        io: &JournalIO,
        debug_journal: bool,
    ) -> Result<(), KernelError> {
        if self.sb.commited != 0 {
            let mut block = Box::new([0; 4096]);
            let tx_begin = JournalTxBegin::from_io(io, ffs.journal().start + 1)?;
            if debug_journal {
                println!("[FFS-Journal]: Transaction #{} [", tx_begin.tx_id);
            }
            for (idx, slot) in tx_begin.lbas.iter().enumerate() {
                if let Some(slot) = slot {
                    if debug_journal {
                        println!("[FFS-Journal]:      #{:04}: {:?},", idx, slot);
                    }
                    todo!();
                } else {
                    break;
                }
            }
            if debug_journal {
                println!("[FFS-Journal]: ] Checkpointed.");
            }
            self.sb.commited = 0;
            self.sb.writeback(io, ffs)?;
        }
        Ok(())
    }
}

/// Represents an in-progress file system transaction using write-ahead
/// journaling.
///
/// A `RunningTransaction` buffers metadata updates to disk blocks before they
/// are permanently written, ensuring crash consistency. When a transaction is
/// committed, the buffered blocks are flushed to the journal area first. Once
/// the journal write completes, the updates are applied to the actual metadata
/// locations on disk.
///
/// Transactions are used to group file system changes atomically: either all
/// updates in a transaction are committed, or none are, preventing partial
/// updates.
///
/// # Fields
/// - `tx`: A buffer that stores staged metadata writes as a list of (LBA, data)
///   tuples.
/// - `journal`: A locked handle to the global `Journal`, used during commit.
/// - `tx_id`: Unique identifier for the current transaction.
/// - `io`: The journal I/O interface used for block-level reads/writes.
/// - `debug_journal`: Enables logging of journal operations for debugging.
/// - `ffs`: A reference to the file system's core structure.
pub struct RunningTransaction<'a> {
    tx: RefCell<Vec<(LogicalBlockAddress, Box<[u8; 4096]>)>>,
    journal: Option<SpinLockGuard<'a, Journal>>,
    tx_id: u64,
    io: Option<JournalIO<'a>>,
    debug_journal: bool,
    pub ffs: &'a FastFileSystemInner,
}

impl<'a> RunningTransaction<'a> {
    /// Begins a new journaled transaction.
    ///
    /// Initializes the transaction state and prepares to buffer metadata
    /// writes.
    ///
    /// # Parameters
    /// - `name`: A label for the transaction, useful for debugging.
    /// - `ffs`: The file system core structure.
    /// - `io`: The journal I/O interface for block operations.
    /// - `debug_journal`: Enables verbose logging if set to `true`.
    #[inline]
    pub fn begin(
        name: &str,
        ffs: &'a FastFileSystemInner,
        io: JournalIO<'a>,
        debug_journal: bool,
    ) -> Self {
        let mut journal = ffs.journal.as_ref().map(|journal| journal.lock());
        let tx_id = journal
            .as_mut()
            .map(|j| {
                let tx_id = j.sb.tx_id;
                j.sb.tx_id += 1;
                tx_id
            })
            .unwrap_or(0);
        if debug_journal && journal.is_some() {
            println!("[FFS-Journal]: Transaction #{} \"{}\" [", tx_id, name);
        }
        RunningTransaction {
            tx: RefCell::new(Vec::new()),
            journal,
            io: Some(io),
            tx_id,
            debug_journal,
            ffs,
        }
    }

    /// Buffers a metadata block modification for inclusion in the transaction.
    ///
    /// The actual write is deferred until `commit()` is called.
    ///
    /// # Parameters
    /// - `lba`: The logical block address where the metadata will eventually be
    ///   written.
    /// - `data`: A boxed page of data representing the new metadata contents.
    /// - `ty`: The type name of the metadata, as a string (for debugging).
    #[inline]
    pub fn write_meta(&self, lba: LogicalBlockAddress, data: Box<[u8; 4096]>, ty: &str) {
        if self.debug_journal {
            println!(
                "[FFS-Journal]:      #{:04}: {:20} - {:?},",
                self.tx.borrow_mut().len(),
                ty.split(":").last().unwrap_or("?"),
                lba
            );
        }
        self.tx.borrow_mut().push((lba, data));
    }

    /// Commits the transaction to the journal and applies changes to disk.
    ///
    /// This method performs the following steps:
    /// 1. Writes all staged metadata blocks to the journal region on disk.
    /// 2. Updates the journal superblock.
    /// 3. Checkpoints the journal.
    ///
    /// # Returns
    /// - `Ok(())`: If the transaction was successfully committed and
    ///   checkpointed.
    /// - `Err(KernelError)`: If an I/O or consistency error occurred.
    pub fn commit(mut self) -> Result<(), KernelError> {
        // In a real file system, there are further optimizations to reduce disk
        // I/O, such as merging writes to the same LBA within a journal into one
        // block.
        let (io, tx, journal, tx_id, ffs, debug_journal) = (
            self.io.take().unwrap(),
            core::mem::take(&mut *self.tx.borrow_mut()),
            self.journal.take(),
            self.tx_id,
            self.ffs,
            self.debug_journal,
        );

        if let Some(journal) = journal {
            if debug_journal {
                println!("[FFS-Journal]: ] Commited.");
            }
            let (mut journal, io) = JournalWriter::new(tx, journal, io, ffs, tx_id)
                .write_tx_begin()?
                .write_blocks()?
                .write_tx_end()?;

            // In a real file system, checkpointing is performed asynchronously
            // by a kernel thread.
            //
            // However, to keep the implementation simple, the journaled updates
            // are checkpointed synchronously right after the commit.
            let result = journal.checkpoint(ffs, &io, debug_journal);
            journal.unlock();
            result
        } else {
            // When journaling is not supported, write the metadata directly to
            // its final locations.
            for (lba, block) in tx.into_iter() {
                io.write_metadata_block(lba, block.as_array().unwrap())?;
            }
            Ok(())
        }
    }
}

impl Drop for RunningTransaction<'_> {
    fn drop(&mut self) {
        if let Some(journal) = self.journal.take() {
            journal.unlock();
        }
    }
}

/// Marker type for the first phase of a journal commit: TxBegin.
///
/// Used with [`JournalWriter`] to enforce commit stage ordering via the type
/// system.
pub struct TxBegin {}

/// Marker type for the second phase of a journal commit: writing the metadata
/// blocks.
///
/// Ensures that [`JournalWriter::write_tx_begin`] must be called before
/// [`JournalWriter::write_blocks`].
pub struct Block {}

/// Marker type for the final phase of a journal commit: TxEnd.
///
/// Ensures that [`JournalWriter::write_blocks`] is completed before finalizing
/// the transaction.
pub struct TxEnd {}

/// A staged writer for committing a transaction to the journal.
///
/// `JournalWriter` uses a type-state pattern to enforce the correct sequence of
/// journal writes:
/// - `JournalWriter<TxBegin>`: Can only call [`JournalWriter::write_tx_begin`].
/// - `JournalWriter<Block>`: Can only call [`JournalWriter::write_blocks`].
/// - `JournalWriter<TxEnd>`: Can only call [`JournalWriter::write_tx_end`].
///
/// This staged API ensures that transactions are written in the correct order
/// and prevents accidental misuse.
pub struct JournalWriter<'a, WriteTarget> {
    /// Staged list of (LBA, data) pairs representing metadata blocks to commit.
    tx: Vec<(LogicalBlockAddress, Box<[u8; 4096]>)>,

    /// A lock-protected handle to the journal structure.
    journal: SpinLockGuard<'a, Journal>,

    /// I/O interface for reading/writing disk blocks.
    io: JournalIO<'a>,

    /// Reference to the filesystem's core state.
    ffs: &'a FastFileSystemInner,

    /// Unique identifier of the transaction.
    tx_id: u64,

    /// Internal index tracking progress through `tx`.
    index: usize,

    /// Phantom data used to track the current commit stage.
    _write_target: core::marker::PhantomData<WriteTarget>,
}

impl<'a> JournalWriter<'a, TxBegin> {
    /// Creates a new `JournalWriter` in the initial `TxBegin` stage.
    ///
    /// This prepares the writer for the staged commit sequence of the given
    /// transaction.
    ///
    /// # Parameters
    /// - `tx`: The list of metadata blocks to be written.
    /// - `journal`: A locked handle to the global journal state.
    /// - `io`: The disk I/O interface.
    /// - `ffs`: A reference to the file system.
    /// - `tx_id`: A unique ID assigned to the transaction.
    ///
    /// # Returns
    /// A `JournalWriter` instance in the `TxBegin` state.
    pub fn new(
        tx: Vec<(LogicalBlockAddress, Box<[u8; 4096]>)>,
        journal: SpinLockGuard<'a, Journal>,
        io: JournalIO<'a>,
        ffs: &'a FastFileSystemInner,
        tx_id: u64,
    ) -> Self {
        Self {
            tx,
            journal,
            io,
            ffs,
            tx_id,
            index: 0,
            _write_target: core::marker::PhantomData,
        }
    }

    /// Writes the `TxBegin` marker to the journal.
    ///
    /// This signals the start of a journaled transaction. Must be called before
    /// writing the data blocks.
    ///
    /// # Returns
    /// A `JournalWriter` in the `Block` stage.
    pub fn write_tx_begin(mut self) -> Result<JournalWriter<'a, Block>, KernelError> {
        let mut tx_begin = JournalTxBegin::new(self.tx_id);
        todo!();
        Ok(JournalWriter {
            tx: self.tx,
            journal: self.journal,
            ffs: self.ffs,
            io: self.io,
            tx_id: self.tx_id,
            index: self.index,
            _write_target: core::marker::PhantomData,
        })
    }
}

impl<'a> JournalWriter<'a, Block> {
    /// Writes all staged metadata blocks to the journal.
    ///
    /// Each block is written sequentially to a dedicated journal area.
    /// This must be called after `write_tx_begin()` and before finalizing with
    /// `write_tx_end()`.
    ///
    /// # Returns
    /// A `JournalWriter` in the `TxEnd` stage.
    pub fn write_blocks(mut self) -> Result<JournalWriter<'a, TxEnd>, KernelError> {
        todo!();
        Ok(JournalWriter {
            tx: self.tx,
            journal: self.journal,
            ffs: self.ffs,
            io: self.io,
            tx_id: self.tx_id,
            index: self.index,
            _write_target: core::marker::PhantomData,
        })
    }
}

impl<'a> JournalWriter<'a, TxEnd> {
    /// Writes the `TxEnd` block and completes the transaction by updating the
    /// journal superblock.
    ///
    /// This signals a successfully completed transaction and allows recovery
    /// mechanisms to apply the journal contents to the actual file system
    /// metadata.
    ///
    /// # Returns
    /// - `Ok((journal, io))`: The locked journal and the I/O handle, which are
    ///   then used to checkpoint the journal.
    /// - `Err(KernelError)` if the final commit stage fails.
    pub fn write_tx_end(
        mut self,
    ) -> Result<(SpinLockGuard<'a, Journal>, JournalIO<'a>), KernelError> {
        let tx_end = JournalTxEnd::new(self.tx_id);
        // In real file systems, this TxEnd block is usually omitted to reduce
        // disk I/O.
        todo!();

        // Mark the transaction as committed in the JournalSb.
        let Self {
            mut journal,
            io,
            ffs,
            ..
        } = self;
        journal.sb.commited = 1;
        match journal.sb.writeback(&io, ffs) {
            Ok(_) => Ok((journal, io)),
            Err(e) => {
                journal.unlock();
                Err(e)
            }
        }
    }
}