//! # File state of a process.
//!
//! One of the kernel's primary responsibilities is managing process states.
//! A process is an instance of a program being executed, abstracting a machine
//! by encompassing various states like memory allocation, CPU registers, and
//! the files it operates on. These process states are crucial for the kernel to
//! allocate resources, prioritize tasks, and manage the process lifecycle
//! (including creation, execution, suspension, and termination). The kernel
//! processes system calls by evaluating the current state of the associated
//! processes, checking resource availability, and ensuring that the requested
//! operation is carried out safely and efficiently. Among these states, this
//! project focuses on the kernel's interaction with the file system.
//!
//! ## File
//!
//! A **file** primarily refers to an interface for accessing disk-based data.
//! At its core, a file serves as a sequential stream of bytes. There are two
//! primary types of files in most file systems:
//!
//! - **Regular files**: These contain user or system data, typically organized
//!   as a sequence of bytes. They can store text, binary data, executable code,
//!   and more. Regular files are the most common form of file used by
//!   applications for reading and writing data.
//!
//! - **Directories**: A directory is a special kind of file that contains
//!   mappings from human-readable names (filenames) to other files or
//!   directories. Directories form the backbone of the file system’s
//!   hierarchical structure, allowing files to be organized and accessed via
//!   paths.
//!
//! Processes interact with files through **file descriptors**, which serve
//! as handles to open file objects. File descriptors provide an indirection
//! layer that allows user programs to perform operations like reading, writing,
//! seeking, and closing, without exposing the internal details of file objects.
//! This indirection plays a **crucial security role**: actual file objects
//! reside in kernel space and are never directly accessible from user
//! space. By using descriptors as opaque references, the operating system
//! enforces strict isolation between user and kernel memory, preventing
//! accidental or malicious tampering with sensitive kernel-managed resources.
//!
//! File descriptors are small integer values, typically starting from 0, that
//! index into the process's file descriptor table. This table holds references
//! to open file objects, including metadata like the file's location, access
//! mode (e.g., read or write), and other details necessary for I/O operations.
//! When a process issues a file operation (e.g., reading, writing, or seeking),
//! it provides the appropriate file descriptor as an argument to the system
//! call. The kernel uses this descriptor to access the corresponding entry in
//! the table and perform the requested operation.
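//!
//! The descriptor lookup described above can be sketched as follows. This is
//! a minimal, standalone illustration and not the KeOS implementation: `Fd`,
//! `OpenFile`, `Error`, and `lookup` are hypothetical stand-ins, and the
//! sketch uses `std` for brevity.
//!
//! ```rust
//! use std::collections::BTreeMap;
//!
//! // Hypothetical descriptor newtype (stand-in for a kernel fd type).
//! #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
//! struct Fd(i32);
//!
//! // Hypothetical open-file entry: access mode plus current offset.
//! #[derive(Debug, PartialEq)]
//! struct OpenFile {
//!     writable: bool,
//!     position: usize,
//! }
//!
//! // Hypothetical error type for this sketch.
//! #[derive(Debug, PartialEq)]
//! enum Error {
//!     BadFileDescriptor,
//! }
//!
//! // Resolve a raw descriptor number passed by the user to a table entry.
//! // An unknown number is reported as an error instead of panicking.
//! fn lookup(table: &mut BTreeMap<Fd, OpenFile>, raw: i32) -> Result<&mut OpenFile, Error> {
//!     table.get_mut(&Fd(raw)).ok_or(Error::BadFileDescriptor)
//! }
//! ```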
//!
//! ## "Everything is a File"
//!
//! Beyond disk-based data, the **file** abstraction is applied uniformly
//! across a wide range of system resources. "Everything is a file" is a
//! Unix-inspired design principle that simplifies system interaction by
//! treating various resources (including devices, sockets, and processes) as
//! files. While not an absolute rule, this philosophy influences many
//! Unix-based systems, encouraging the representation of objects as file
//! descriptors and enabling interaction through standard I/O operations. This
//! approach provides a unified and consistent way to handle different types of
//! system objects.
//!
//! A key aspect of this principle is the existence of **standard file
//! descriptors**:
//! - **Standard Input (stdin) - File Descriptor 0**: Used for reading input
//!   data (e.g., keyboard input or redirected file input).
//! - **Standard Output (stdout) - File Descriptor 1**: Used for writing output
//!   data (e.g., printing to the terminal or redirecting output to a file).
//! - **Standard Error (stderr) - File Descriptor 2**: Used for writing error
//!   messages separately from standard output.
//!
//! Another important mechanism following this design is the **pipe**, which
//! allows interprocess communication by connecting the output of one process to
//! the input of another. Pipes function as a buffer between processes,
//! facilitating seamless data exchange without requiring intermediate storage
//! in a file. For example, executing:
//!
//! ```sh
//! ls | grep "file"
//! ```
//! connects the `ls` command’s output to the `grep` command’s input through a
//! pipe.
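//!
//! The buffering and end-of-file behavior of a pipe can be sketched with the
//! standard library's channel. This is only an illustration under stated
//! assumptions: `std::sync::mpsc` stands in for a kernel channel, and
//! `pipe_read` is a hypothetical helper, not part of the KeOS `channel` module.
//!
//! ```rust
//! use std::sync::mpsc::{Receiver, TryRecvError};
//!
//! // Read up to `buf.len()` bytes from the receiving end of the channel.
//! // Blocks until at least one byte is available, and returns 0 (EOF) once
//! // every sender has been dropped and no buffered bytes remain.
//! fn pipe_read(rx: &Receiver<u8>, buf: &mut [u8]) -> usize {
//!     if buf.is_empty() {
//!         return 0;
//!     }
//!     // Wait for the first byte; an error means all writers are gone.
//!     match rx.recv() {
//!         Ok(byte) => buf[0] = byte,
//!         Err(_) => return 0,
//!     }
//!     let mut n = 1;
//!     // Take whatever else is already buffered, without blocking.
//!     while n < buf.len() {
//!         match rx.try_recv() {
//!             Ok(byte) => {
//!                 buf[n] = byte;
//!                 n += 1;
//!             }
//!             Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
//!         }
//!     }
//!     n
//! }
//! ```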
//!
//! ## Files in KeOS
//!
//! You need to extend KeOS to support the following system calls with a file
//! abstraction:
//! - [`open`]: Open a file.
//! - [`read`]: Read data from a file.
//! - [`write`]: Write data to a file.
//! - [`close`]: Close an open file.
//! - [`seek`]: Set the file pointer to a specific position.
//! - [`tell`]: Get the current position of the file.
//! - [`pipe`]: Create an interprocess communication channel.
//!
//! To manage file-related state, KeOS keeps a per-process structure called
//! [`FileStruct`], which corresponds to the Linux kernel's
//! `struct files_struct`. Through this struct, you need to manage file
//! descriptors that represent open files within a process. Since many system
//! interactions are built around file descriptors, understanding this principle
//! will help you design efficient and flexible system call handlers for file
//! operations.
//!
//! You need to implement system call handlers with the [`FileStruct`] struct,
//! which manages file states for a process. For example, it contains the
//! current working directory of the process (cwd) and a table of file
//! descriptors, which maps each file descriptor (fd) to a [`File`] that wraps
//! a specific [`FileKind`] state. When invoking system calls, you must update
//! these file states accordingly, ensuring the correct file state is used for
//! each operation. To store the mapping between file descriptors and file
//! states, KeOS utilizes the `BTreeMap` provided by the
//! [`alloc::collections`] module. You might refer to the [`channel`] and
//! [`teletype`] modules for implementing stdio and channel I/O.
//!
//! As mentioned before, the kernel requires careful **error handling**. The
//! kernel must ensure that errors are reported in a stable and reliable
//! manner without causing system crashes.
//!
//! ## User Memory Access
//! The kernel **MUST NOT** trust user input. A user might maliciously or
//! mistakenly pass invalid values as system call arguments. If such an input
//! is an invalid memory address or a kernel address, accessing the address
//! directly can lead to security threats.
//!
//! To safely interact with user-space memory when handling a system call, KeOS
//! provides the [`uaccess`] module:
//! - [`UserPtrRO`]: A read-only user-space pointer, used for safely retrieving
//!   structured data from user memory.
//! - [`UserPtrWO`]: A write-only user-space pointer, used for safely writing
//!   structured data back to user memory.
//! - [`UserCString`]: Read null-terminated strings from user-space (e.g., file
//!   paths).
//! - [`UserU8SliceRO`]: Read byte slices from user-space (e.g., the source
//!   buffer of a `write` system call).
//! - [`UserU8SliceWO`]: Write byte slices to user-space (e.g., the destination
//!   buffer of a `read` system call).
//!
//! These types help prevent unsafe memory access and ensure proper bounds
//! checking before performing read/write operations. When an error occurs
//! during the check, the method returns an `Err` with
//! [`KernelError::BadAddress`]. You can simply combine the `?` operator with
//! these methods to propagate the error to the system call entry. Therefore,
//! **you should never use `unsafe` code directly for accessing user-space
//! memory**. Instead, utilize these safe abstractions, which provide built-in
//! validation and access control, reducing the risk of undefined behavior,
//! security vulnerabilities, and kernel crashes.
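//!
//! The validate-then-propagate pattern can be sketched as follows. Everything
//! here is a hypothetical stand-in rather than the `uaccess` API: `USER_TOP`,
//! `check_user_range`, `sys_write_len`, and the local `Error` type are
//! assumptions made for illustration, and a real check would also consult the
//! page tables, which this sketch omits.
//!
//! ```rust
//! // Hypothetical error type for this sketch.
//! #[derive(Debug, PartialEq)]
//! enum Error {
//!     BadAddress,
//! }
//!
//! // Hypothetical upper bound (exclusive) of the user address space.
//! const USER_TOP: u64 = 0x0000_8000_0000_0000;
//!
//! // Reject null, overflowing, or kernel-space address ranges.
//! fn check_user_range(addr: u64, len: u64) -> Result<(), Error> {
//!     let end = addr.checked_add(len).ok_or(Error::BadAddress)?;
//!     if addr == 0 || end > USER_TOP {
//!         return Err(Error::BadAddress);
//!     }
//!     Ok(())
//! }
//!
//! // A handler-shaped function: `?` forwards a failed check to the caller,
//! // so no user memory is touched when validation fails.
//! fn sys_write_len(buf_addr: u64, count: u64) -> Result<u64, Error> {
//!     check_user_range(buf_addr, count)?;
//!     Ok(count)
//! }
//! ```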
//!
//! #### Implementation Requirements
//! You need to implement the following:
//! - [`FileStruct::install_file`]
//! - [`FileStruct::open`]
//! - [`FileStruct::read`]
//! - [`FileStruct::write`]
//! - [`FileStruct::seek`]
//! - [`FileStruct::tell`]
//! - [`FileStruct::close`]
//! - [`FileStruct::pipe`]
//!
//! This ends project 1.
//!
//! [`open`]: FileStruct::open
//! [`read`]: FileStruct::read
//! [`write`]: FileStruct::write
//! [`seek`]: FileStruct::seek
//! [`tell`]: FileStruct::tell
//! [`close`]: FileStruct::close
//! [`pipe`]: FileStruct::pipe
//! [`uaccess`]: keos::syscall::uaccess
//! [`UserPtrRO`]: keos::syscall::uaccess::UserPtrRO
//! [`UserPtrWO`]: keos::syscall::uaccess::UserPtrWO
//! [`UserCString`]: keos::syscall::uaccess::UserCString
//! [`UserU8SliceRO`]: keos::syscall::uaccess::UserU8SliceRO
//! [`UserU8SliceWO`]: keos::syscall::uaccess::UserU8SliceWO
//! [`alloc::collections`]: <https://doc.rust-lang.org/alloc/collections/index.html>

use crate::syscall::SyscallAbi;
use alloc::collections::BTreeMap;
use keos::{
    KernelError,
    fs::{Directory, RegularFile},
    syscall::flags::FileMode,
};
#[cfg(doc)]
use keos::{channel, teletype};

/// The type of a file in the filesystem.
///
/// This enum provides a way to distinguish between regular files and special
/// files like standard input (stdin), standard output (stdout), standard error
/// (stderr), and interprocess communication (IPC) channels such as pipes.
/// It allows the system to treat these different types of files accordingly
/// when performing file operations like reading, writing, or seeking.
#[derive(Clone)]
pub enum FileKind {
    /// A regular file on the filesystem.
    RegularFile {
        /// A [`RegularFile`] object, which holds the underlying kernel
        /// structure that represents the actual file in the kernel's
        /// memory. This structure contains additional metadata about
        /// the file, such as its name.
        file: RegularFile,
        /// The current position in the file (offset).
        ///
        /// This field keeps track of the current position of the file pointer
        /// within the file. The position is measured in bytes from the
        /// beginning of the file. It is updated whenever a read or write
        /// operation is performed, allowing the system to track where
        /// the next operation will occur.
        ///
        /// Example: If the file's position is 100, the next read or write
        /// operation will begin at byte 100.
        position: usize,
    },
    /// A directory of the filesystem.
    ///
    /// This variant represents a directory in the filesystem. Unlike regular
    /// files, directories serve as containers for other files and
    /// directories. Operations on directories typically include listing
    /// contents, searching for files, and navigating file structures.
    Directory {
        /// The underlying [`Directory`] object.
        dir: Directory,
        /// The current position in the directory (offset).
        ///
        /// This field is used internally by the readdir() function to track
        /// how many entries have been read.
        position: usize,
    },
    /// A special file for standard input/output streams.
    ///
    /// This variant represents standard I/O streams like stdin, stdout, and
    /// stderr. These are not associated with physical files on disk but are
    /// used for interaction between processes and the console or terminal.
    ///
    /// - **Standard Input (`stdin`)**: Used to receive user input.
    /// - **Standard Output (`stdout`)**: Used to display process output.
    /// - **Standard Error (`stderr`)**: Used to display error messages.
    Stdio,
    /// A receive endpoint for interprocess communication (IPC).
    ///
    /// This variant represents a receiving channel in an IPC mechanism,
    /// commonly used for message-passing between processes. It
    /// acts as a read-only endpoint, allowing a process to receive data
    /// from a corresponding [`FileKind::Tx`] (transmit) channel.
    ///
    /// Data sent through the corresponding [`FileKind::Tx`] endpoint is
    /// buffered and can be read asynchronously using this receiver. Once
    /// all [`FileKind::Tx`] handles are closed, reads will return an
    /// end-of-file (EOF) indication.
    ///
    /// This is useful for implementing features like pipes, message queues, or
    /// event notifications.
    Rx(keos::channel::Receiver<u8>),
    /// A transmit endpoint for interprocess communication (IPC).
    ///
    /// This variant represents a sending channel in an IPC mechanism. It serves
    /// as a write-only endpoint, allowing a process to send data to a
    /// corresponding [`FileKind::Rx`] (receive) channel.
    ///
    /// Data written to this [`FileKind::Tx`] endpoint is buffered until it is
    /// read by the corresponding [`FileKind::Rx`] endpoint. If no receiver
    /// exists, writes may block or fail depending on the system's IPC
    /// behavior.
    ///
    /// This is commonly used in pipes, producer-consumer queues, and task
    /// synchronization mechanisms.
    Tx(keos::channel::Sender<u8>),
}

/// The [`File`] struct represents an abstraction over a file descriptor in the
/// operating system.
///
/// This struct encapsulates information about an open file, including its
/// access mode and other metadata necessary for performing file operations
/// such as reading, writing, and seeking. It also holds a reference to the
/// kernel's underlying file structure ([`FileKind`]), allowing the operating
/// system to perform actual file operations on the filesystem.
///
/// The [`File`] struct is used to track the state of an open file, ensuring
/// that the correct file operations are executed and resources are managed
/// efficiently.
#[derive(Clone)]
pub struct File {
    /// The access mode of the file (e.g., read, write, read/write).
    ///
    /// [`FileMode`] is used by a user program to tell the kernel how to open
    /// the file, and records internally which operations are allowed on the
    /// file.
    ///
    /// Refer to [`FileMode`] for details.
    pub mode: FileMode,

    /// The kernel file structure.
    ///
    /// This field contains the underlying representation of the file within the
    /// operating system kernel. It holds the kernel's metadata for the
    /// file, such as its name, permissions, and the actual file object used
    /// to perform system-level file operations.
    ///
    /// The [`FileKind`] enum allows this field to represent either a regular
    /// file ([`FileKind::RegularFile`]) or a special file such as standard
    /// input/output ([`FileKind::Stdio`]).
    pub file: FileKind,
}

/// Represents an index into a process’s file descriptor table.
///
/// In most operating systems, each process maintains a **file descriptor
/// table** that maps small integers (file descriptors) to open file objects.
/// A [`FileDescriptor`] is a wrapper around an `i32` that provides
/// stronger type safety when handling these indices in the kernel.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
pub struct FileDescriptor(pub i32);

/// The [`FileStruct`] represents the filesystem state for a specific
/// process, which corresponds to the Linux kernel's `struct files_struct`.
///
/// This struct encapsulates information about the current state of the
/// filesystem for the process, including the management of open
/// files, their positions, and the operations that can be performed on them.
/// The [`FileStruct`] is responsible for keeping track of file descriptors and
/// their associated file states, ensuring that file operations (like
/// reading, writing, seeking, and closing) are executed correctly and
/// efficiently.
///
/// It also provides a mechanism for the operating system to manage and interact
/// with files in a multi-process environment, allowing for
/// process-local filesystem management.
///
/// # Filesystem State
///
/// The filesystem state refers to the set of files currently open for a given
/// process. This includes managing the file descriptors (unique
/// identifiers for open files), file positions, and ensuring that resources are
/// freed once a file is closed.
#[derive(Clone)]
pub struct FileStruct {
    /// The current working directory of the process.
    pub cwd: Directory,
    /// The file descriptor table of the process.
    pub files: BTreeMap<FileDescriptor, File>,
}

impl Default for FileStruct {
    fn default() -> Self {
        Self::new()
    }
}

impl FileStruct {
    /// Creates a new instance of [`FileStruct`].
    ///
    /// This function initializes a new filesystem state, typically when a
    /// process starts or when a fresh file operation is needed.
    ///
    /// # Returns
    ///
    /// Returns a new [`FileStruct`] struct, representing a clean slate for the
    /// filesystem state. The clean state must initialize the STDIN, STDOUT,
    /// STDERR.
    pub fn new() -> Self {
        let mut this = Self {
            cwd: keos::fs::FileSystem::root(),
            files: BTreeMap::new(),
        };
        this.install_file(File {
            mode: FileMode::Read,
            file: FileKind::Stdio,
        })
        .unwrap();
        this.install_file(File {
            mode: FileMode::Write,
            file: FileKind::Stdio,
        })
        .unwrap();
        this.install_file(File {
            mode: FileMode::Write,
            file: FileKind::Stdio,
        })
        .unwrap();
        this
    }

    /// Installs a [`File`] into the process’s file descriptor table.
    ///
    /// This method assigns the lowest available file descriptor number to
    /// `file` and returns it as a [`FileDescriptor`].
    /// The descriptor can then be used by the process to perform I/O operations
    /// such as `read`, `write`, `stat`, or `close`.
    ///
    /// # Errors
    /// - Returns [`KernelError::TooManyOpenFile`] if the process already has
    ///   more than **1024 open files**, meaning no additional descriptors are
    ///   available.
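    ///
    /// One possible way to find the lowest free descriptor is sketched below on
    /// a plain `BTreeMap<i32, T>`. The names `lowest_free` and `MAX_FILES` are
    /// hypothetical, the exact boundary condition for the limit is an
    /// assumption, and the sketch uses `std` so that it stands alone.
    ///
    /// ```rust
    /// use std::collections::BTreeMap;
    ///
    /// // Hypothetical table capacity for this sketch.
    /// const MAX_FILES: usize = 1024;
    ///
    /// // Return the smallest non-negative key absent from `table`, or `None`
    /// // when the table is already at capacity.
    /// fn lowest_free<T>(table: &BTreeMap<i32, T>) -> Option<i32> {
    ///     if table.len() >= MAX_FILES {
    ///         return None;
    ///     }
    ///     // Keys iterate in ascending order, so the first gap is the answer.
    ///     let mut candidate = 0;
    ///     for &fd in table.keys() {
    ///         if fd != candidate {
    ///             break;
    ///         }
    ///         candidate += 1;
    ///     }
    ///     Some(candidate)
    /// }
    /// ```
    ///
    /// Because closed descriptors leave gaps, scanning for the first gap lets
    /// a freed number be reused before the table grows.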
    pub fn install_file(&mut self, file: File) -> Result<FileDescriptor, KernelError> {
        todo!()
    }

    /// Opens a file.
    ///
    /// This function handles the system call for opening a file, including
    /// checking if the file exists, and setting up the file's access mode
    /// (e.g., read, write, or append). It modifies the [`FileStruct`] by
    /// associating the file with the current process and prepares the file
    /// for subsequent operations.
    ///
    /// # Errors
    /// - Returns [`KernelError::InvalidArgument`] if unexpected access mode
    ///   is provided.
    /// - Propagates any errors from underlying APIs (e.g. [`uaccess`](keos::syscall::uaccess)).
    ///
    /// # Syscall API
    /// ```c
    /// int open(const char *pathname, int flags);
    /// ```
    /// - `pathname`: Path to the file to be opened.
    /// - `flags`: Specifies the access mode. The possible values are:
    ///   - `O_RDONLY` (0): The file is opened for read only.
    ///   - `O_WRONLY` (1): The file is opened for write only.
    ///   - `O_RDWR`   (2): The file is opened for both read and write.
    ///
    /// Returns the corresponding file descriptor number for the opened file.
    pub fn open(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Reads data from an open file.
    ///
    /// This function implements the system call for reading from an open file.
    /// It reads up to a specified number of bytes from the file and returns
    /// them to the user. The current file position is adjusted accordingly.
    ///
    /// # Errors
    /// - Returns [`KernelError::IsDirectory`] if the specified file is a directory.
    /// - Returns [`KernelError::BrokenPipe`] if the specified file is a disconnected
    ///   interprocess communication channel.
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor is
    ///   invalid.
    ///
    /// # Syscall API
    /// ```c
    /// ssize_t read(int fd, void *buf, size_t count);
    /// ```
    /// - `fd`: File descriptor of the file to read from.
    /// - `buf`: Buffer to store the data read from the file.
    /// - `count`: Number of bytes to read.
    ///
    /// Returns the actual number of bytes read.
    pub fn read(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Writes data to an open file.
    ///
    /// This function implements the system call for writing data to a file. It
    /// writes a specified number of bytes to the file, starting from the
    /// current file position. The file's state is updated accordingly.
    ///
    /// # Errors
    /// - Returns [`KernelError::IsDirectory`] if the specified file is a directory.
    /// - Returns [`KernelError::BrokenPipe`] if the specified file is a disconnected
    ///   interprocess communication channel.
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor is
    ///   invalid.
    /// - Propagates any errors from underlying APIs (e.g. [`uaccess`](keos::syscall::uaccess)).
    ///
    /// # Syscall API
    /// ```c
    /// ssize_t write(int fd, const void *buf, size_t count);
    /// ```
    /// - `fd`: File descriptor of the file to write to.
    /// - `buf`: Buffer containing the data to be written.
    /// - `count`: Number of bytes to write.
    ///
    /// Returns the number of bytes written.
    pub fn write(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Seeks to a new position in the file.
    ///
    /// This function implements the system call for moving the file pointer to
    /// a specified position within the file. The position can be set
    /// relative to the beginning, current position, or end of the file.
    ///
    /// # Errors
    /// - Returns [`KernelError::InvalidArgument`] if the calculated position is
    ///   invalid.
    /// - Returns [`KernelError::InvalidArgument`] if the specified file is not a
    ///   [`FileKind::RegularFile`].
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor
    ///   is invalid.
    /// - Propagates any errors from underlying APIs (e.g. [`uaccess`](keos::syscall::uaccess)).
    ///
    /// # Syscall API
    /// ```c
    /// off_t seek(int fd, off_t offset, int whence);
    /// ```
    /// - `fd`: File descriptor of the file to seek in.
    /// - `offset`: Number of bytes to move the file pointer.
    /// - `whence`: Specifies how the offset is to be interpreted. Common values
    ///   are:
    ///   - `SEEK_SET` (0): The offset is relative to the beginning of the file.
    ///   - `SEEK_CUR` (1): The offset is relative to the current file position.
    ///   - `SEEK_END` (2): The offset is relative to the end of the file.
    ///
    /// Returns the new position of the file descriptor after moving it.
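    ///
    /// The position arithmetic can be sketched as below. This is a standalone
    /// illustration under assumptions: `new_position` and the local `Error`
    /// type are hypothetical, and the sketch only rejects an unknown `whence`
    /// and negative or overflowing results. Whether a position past the end of
    /// the file counts as invalid is left open here.
    ///
    /// ```rust
    /// // Hypothetical error type for this sketch.
    /// #[derive(Debug, PartialEq)]
    /// enum Error {
    ///     InvalidArgument,
    /// }
    ///
    /// // Compute the new offset for `whence`
    /// // (0 = SEEK_SET, 1 = SEEK_CUR, 2 = SEEK_END).
    /// fn new_position(
    ///     current: usize,
    ///     size: usize,
    ///     offset: isize,
    ///     whence: usize,
    /// ) -> Result<usize, Error> {
    ///     let base = match whence {
    ///         0 => 0,
    ///         1 => current,
    ///         2 => size,
    ///         _ => return Err(Error::InvalidArgument),
    ///     };
    ///     // `checked_add_signed` yields `None` on overflow or a negative result.
    ///     base.checked_add_signed(offset).ok_or(Error::InvalidArgument)
    /// }
    /// ```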
    pub fn seek(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Tells the current position in the file.
    ///
    /// This function implements the system call for retrieving the current file
    /// pointer position. It allows the program to know where in the file
    /// the next operation will occur.
    ///
    /// # Errors
    /// - Returns [`KernelError::InvalidArgument`] if the specified file is not a
    ///   [`FileKind::RegularFile`].
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor
    ///   is invalid.
    ///
    /// # Syscall API
    /// ```c
    /// off_t tell(int fd);
    /// ```
    /// - `fd`: File descriptor of the file.
    ///
    /// Returns the position of the file descriptor.
    pub fn tell(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Closes an open file.
    ///
    /// This function implements the system call for closing an open file.
    ///
    /// # Errors
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor
    ///   is invalid.
    ///
    /// # Syscall API
    /// ```c
    /// int close(int fd);
    /// ```
    /// - `fd`: File descriptor to close.
    ///
    /// Returns 0 on success.
    pub fn close(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Creates an interprocess communication channel between two file
    /// descriptors.
    ///
    /// Pipes are unidirectional communication channels commonly used for IPC.
    /// Data written to `pipefd[1]` can be read from `pipefd[0]`.
    ///
    /// A process that reads from a pipe must wait if there are no bytes to be
    /// read.
    ///
    /// # Syscall API
    /// ```c
    /// int pipe(int pipefd[2]);
    /// ```
    /// - `pipefd`: An array of two file descriptors, where `pipefd[0]` is for
    ///   reading and `pipefd[1]` is for writing.
    ///
    /// Returns 0 on success.
    pub fn pipe(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }
}