keos_project1/file_struct.rs
//! # File state of a process.
//!
//! One of the kernel's primary responsibilities is managing process states.
//! A process is an instance of a program being executed, abstracting a machine
//! by encompassing various states like memory allocation, CPU registers, and
//! the files it operates on. These process states are crucial for the kernel to
//! allocate resources, prioritize tasks, and manage the process lifecycle
//! (including creation, execution, suspension, and termination). The kernel
//! processes system calls by evaluating the current state of the associated
//! processes, checking resource availability, and ensuring that the requested
//! operation is carried out safely and efficiently. Among these states, this
//! project focuses on the kernel's interaction with the file system.
//!
//! ## File
//!
//! A **file** primarily refers to an interface for accessing disk-based data.
//! At its core, a file serves as a sequential stream of bytes. There are two
//! primary types of files in most file systems:
//!
//! - **Regular files**: These contain user or system data, typically organized
//!   as a sequence of bytes. They can store text, binary data, executable code,
//!   and more. Regular files are the most common form of file used by
//!   applications for reading and writing data.
//!
//! - **Directories**: A directory is a special kind of file that contains
//!   mappings from human-readable names (filenames) to other files or
//!   directories. Directories form the backbone of the file system’s
//!   hierarchical structure, allowing files to be organized and accessed via
//!   paths.
//!
//! Processes interact with files through **file descriptors**, which serve
//! as handles to open file objects. File descriptors provide an indirection
//! layer that allows user programs to perform operations like reading, writing,
//! seeking, and closing, without exposing the internal details of file objects.
//! This indirection also plays a **crucial security role**: actual file objects
//! reside in kernel space and are never directly accessible from user
//! space. By using descriptors as opaque references, the operating system
//! enforces strict isolation between user and kernel memory, preventing
//! accidental or malicious tampering with sensitive kernel-managed resources.
//!
//! File descriptors are small integer values, typically starting from 0, that
//! index into the process's file descriptor table. This table holds references
//! to open file objects, including metadata like the file's location, access
//! mode (e.g., read or write), and other details necessary for I/O operations.
//! When a process issues a file operation (e.g., reading, writing, or seeking),
//! it provides the appropriate file descriptor as an argument to the system
//! call. The kernel uses this descriptor to access the corresponding entry in
//! the table and perform the requested operation.
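//!
//! The lookup described above can be sketched as follows. This is a minimal
//! illustration that uses `std` so it can run on its own; `Entry`,
//! `SketchError`, and `lookup` are hypothetical names introduced for this
//! example and are not part of KeOS.
//!
//! ```rust
//! use std::collections::BTreeMap;
//!
//! /// Hypothetical per-descriptor state: only a byte offset is tracked here.
//! #[derive(Debug, PartialEq)]
//! struct Entry {
//!     position: usize,
//! }
//!
//! /// Hypothetical error type standing in for the kernel's error enum.
//! #[derive(Debug, PartialEq)]
//! enum SketchError {
//!     BadFileDescriptor,
//! }
//!
//! /// Resolves `fd` to its table entry, or reports an error if `fd` is not open.
//! fn lookup(table: &mut BTreeMap<i32, Entry>, fd: i32) -> Result<&mut Entry, SketchError> {
//!     table.get_mut(&fd).ok_or(SketchError::BadFileDescriptor)
//! }
//! ```
//!
//! Returning a `Result` lets a system call handler reject an unknown
//! descriptor with `?` rather than panicking inside the kernel.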
//!
//! ## "Everything is a File"
//!
//! Beyond disk-based data, the **file** abstraction is applied uniformly
//! across a wide range of system resources. "Everything is a file" is a
//! Unix-inspired design principle that simplifies system interaction by
//! treating various resources (including devices, sockets, and processes) as
//! files. While not an absolute rule, this philosophy influences many
//! Unix-like systems, encouraging the representation of objects as file
//! descriptors and enabling interaction through standard I/O operations. This
//! approach provides a unified and consistent way to handle different types of
//! system objects.
//!
//! A key aspect of this principle is the existence of **standard file
//! descriptors**:
//! - **Standard Input (stdin) - File Descriptor 0**: Used for reading input
//!   data (e.g., keyboard input or redirected file input).
//! - **Standard Output (stdout) - File Descriptor 1**: Used for writing output
//!   data (e.g., printing to the terminal or redirecting output to a file).
//! - **Standard Error (stderr) - File Descriptor 2**: Used for writing error
//!   messages separately from standard output.
//!
//! Another important mechanism following this design is the **pipe**, which
//! allows interprocess communication by connecting the output of one process to
//! the input of another. Pipes function as a buffer between processes,
//! facilitating seamless data exchange without requiring intermediate storage
//! in a file. For example, executing:
//!
//! ```sh
//! ls | grep "file"
//! ```
//! connects the `ls` command’s output to the `grep` command’s input through a
//! pipe.
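//!
//! The buffering and end-of-stream behavior of a pipe can be imitated in
//! ordinary Rust with `std::sync::mpsc`. This is only an analogy, under the
//! assumption that a byte channel behaves like a pipe; it is not the
//! `keos::channel` API, and `read_to_end` is a hypothetical helper.
//!
//! ```rust
//! use std::sync::mpsc;
//!
//! /// Collects bytes from the read end until every write end has been dropped.
//! fn read_to_end(rx: &mpsc::Receiver<u8>) -> Vec<u8> {
//!     let mut out = Vec::new();
//!     // `recv` blocks while the channel is empty and returns `Err` once all
//!     // senders are gone, which plays the role of end-of-file here.
//!     while let Ok(byte) = rx.recv() {
//!         out.push(byte);
//!     }
//!     out
//! }
//! ```
//!
//! The writer never needs to know who consumes the bytes, which is the same
//! decoupling that lets `ls` and `grep` cooperate through a pipe.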
//!
//! ## Files in KeOS
//!
//! You need to extend KeOS to support the following system calls with a file
//! abstraction:
//! - [`open`]: Open a file.
//! - [`read`]: Read data from a file.
//! - [`write`]: Write data to a file.
//! - [`close`]: Close an open file.
//! - [`seek`]: Set the file pointer to a specific position.
//! - [`tell`]: Get the current position of the file.
//! - [`pipe`]: Create an interprocess communication channel.
//!
//! To manage file state, KeOS keeps a per-process structure called
//! [`FileStruct`], which corresponds to the Linux kernel's
//! `struct files_struct`. Through this struct, you need to manage the file
//! descriptors that represent open files within a process. Since many system
//! interactions are built around file descriptors, understanding this principle
//! will help you design efficient and flexible system call handlers for file
//! operations.
//!
//! You need to implement the system call handlers on the [`FileStruct`]
//! struct, which manages the file state of a process. For example, it contains
//! the current working directory of the process (`cwd`) and a table of file
//! descriptors, which maps each file descriptor (fd) to a [`File`] holding a
//! specific [`FileKind`] state. When invoking system calls, you must update
//! these file states accordingly, ensuring the correct file state is used for
//! each operation. To store the mapping between file descriptors and file
//! states, KeOS utilizes the `BTreeMap` provided by the
//! [`alloc::collections`] module. You might refer to the [`channel`] and
//! [`teletype`] modules for implementing stdio and channel I/O.
//!
//! As mentioned before, the kernel requires careful **error handling**. It
//! must ensure that errors are reported in a stable and reliable manner
//! without crashing the system.
//!
//! ## User Memory Access
//! The kernel **MUST NOT** trust user input. A user program might maliciously
//! or mistakenly pass invalid values as system call arguments. If such an
//! argument is an invalid memory address or a kernel address, accessing it
//! directly can lead to security vulnerabilities.
//!
//! To safely interact with user-space memory when handling system calls, KeOS
//! provides the [`uaccess`] module:
//! - [`UserPtrRO`]: A read-only user-space pointer, used for safely retrieving
//!   structured data from user memory.
//! - [`UserPtrWO`]: A write-only user-space pointer, used for safely writing
//!   structured data back to user memory.
//! - [`UserCString`]: Read null-terminated strings from user-space (e.g., file
//!   paths).
//! - [`UserU8SliceRO`]: Read byte slices from user-space (e.g., the source
//!   buffer of a `write` call).
//! - [`UserU8SliceWO`]: Write byte slices to user-space (e.g., the destination
//!   buffer of a `read` call).
//!
//! These types help prevent unsafe memory access and ensure proper bounds
//! checking before performing read/write operations. When an error occurs
//! during the check, the method returns `Err` with
//! [`KernelError::BadAddress`]. You can simply combine the `?` operator with
//! these methods to propagate the error to the system call entry. Therefore,
//! **you should never use `unsafe` code directly for accessing user-space
//! memory**. Instead, utilize these safe abstractions, which provide built-in
//! validation and access control, reducing the risk of undefined behavior,
//! security vulnerabilities, and kernel crashes.
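//!
//! The validate-then-propagate pattern can be sketched as below. The types are
//! hypothetical stand-ins, not the real `uaccess` types: `UserSlice`,
//! `SketchError`, `USER_LIMIT`, and `handler` are assumptions made for this
//! example, and the address limit is an arbitrary illustrative value.
//!
//! ```rust
//! /// Hypothetical error type standing in for the kernel's error enum.
//! #[derive(Debug, PartialEq)]
//! enum SketchError {
//!     BadAddress,
//! }
//!
//! /// Assumed boundary: addresses at or above this value belong to the kernel.
//! const USER_LIMIT: u64 = 0x0000_8000_0000_0000;
//!
//! /// Hypothetical handle describing a user-supplied buffer.
//! struct UserSlice {
//!     addr: u64,
//!     len: u64,
//! }
//!
//! impl UserSlice {
//!     /// Rejects null, overflowing, and kernel-space ranges.
//!     fn check(&self) -> Result<(), SketchError> {
//!         let end = self.addr.checked_add(self.len).ok_or(SketchError::BadAddress)?;
//!         if self.addr == 0 || end > USER_LIMIT {
//!             return Err(SketchError::BadAddress);
//!         }
//!         Ok(())
//!     }
//! }
//!
//! /// A handler-shaped function: `?` forwards the error to the caller.
//! fn handler(buf: UserSlice) -> Result<u64, SketchError> {
//!     buf.check()?;
//!     Ok(buf.len)
//! }
//! ```
//!
//! Because validation lives behind one fallible method, the handler body
//! stays free of `unsafe` and of manual address arithmetic.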
//!
//! #### Implementation Requirements
//! You need to implement the following:
//! - [`FileStruct::install_file`]
//! - [`FileStruct::open`]
//! - [`FileStruct::read`]
//! - [`FileStruct::write`]
//! - [`FileStruct::seek`]
//! - [`FileStruct::tell`]
//! - [`FileStruct::close`]
//! - [`FileStruct::pipe`]
//!
//! This concludes Project 1.
//!
//! [`open`]: FileStruct::open
//! [`read`]: FileStruct::read
//! [`write`]: FileStruct::write
//! [`seek`]: FileStruct::seek
//! [`tell`]: FileStruct::tell
//! [`close`]: FileStruct::close
//! [`pipe`]: FileStruct::pipe
//! [`uaccess`]: keos::syscall::uaccess
//! [`UserPtrRO`]: keos::syscall::uaccess::UserPtrRO
//! [`UserPtrWO`]: keos::syscall::uaccess::UserPtrWO
//! [`UserCString`]: keos::syscall::uaccess::UserCString
//! [`UserU8SliceRO`]: keos::syscall::uaccess::UserU8SliceRO
//! [`UserU8SliceWO`]: keos::syscall::uaccess::UserU8SliceWO
//! [`alloc::collections`]: <https://doc.rust-lang.org/alloc/collections/index.html>

use crate::syscall::SyscallAbi;
use alloc::collections::BTreeMap;
use keos::{
    KernelError,
    fs::{Directory, RegularFile},
    syscall::flags::FileMode,
};
#[cfg(doc)]
use keos::{channel, teletype};

/// The type of a file in the filesystem.
///
/// This enum provides a way to distinguish between regular files and special
/// files like standard input (stdin), standard output (stdout), standard error
/// (stderr), and interprocess communication (IPC) channels such as pipes.
/// It allows the system to treat these different types of files accordingly
/// when performing file operations like reading, writing, or seeking.
#[derive(Clone)]
pub enum FileKind {
    /// A regular file on the filesystem.
    RegularFile {
        /// A [`RegularFile`] object, which holds the underlying kernel
        /// structure that represents the actual file in the kernel's
        /// memory. This structure contains additional metadata about
        /// the file, such as its name.
        file: RegularFile,
        /// The current position in the file (offset).
        ///
        /// This field keeps track of the current position of the file pointer
        /// within the file. The position is measured in bytes from the
        /// beginning of the file. It is updated whenever a read or write
        /// operation is performed, allowing the system to track where
        /// the next operation will occur.
        ///
        /// Example: If the file's position is 100, the next read or write
        /// operation will begin at byte 100.
        position: usize,
    },
    /// A directory of the filesystem.
    ///
    /// This variant represents a directory in the filesystem. Unlike regular
    /// files, directories serve as containers for other files and
    /// directories. Operations on directories typically include listing
    /// contents, searching for files, and navigating file structures.
    Directory {
        /// The underlying [`Directory`] object.
        dir: Directory,
        /// The current position in the directory (offset).
        ///
        /// This field is used internally by the `readdir()` function to track
        /// how many entries have already been read.
        position: usize,
    },
    /// A special file for standard input/output streams.
    ///
    /// This variant represents standard I/O streams like stdin, stdout, and
    /// stderr. These are not associated with physical files on disk but are
    /// used for interaction between processes and the console or terminal.
    ///
    /// - **Standard Input (`stdin`)**: Used to receive user input.
    /// - **Standard Output (`stdout`)**: Used to display process output.
    /// - **Standard Error (`stderr`)**: Used to display error messages.
    Stdio,
    /// A receive endpoint for interprocess communication (IPC).
    ///
    /// This variant represents a receiving channel in an IPC mechanism,
    /// commonly used for message-passing between processes. It
    /// acts as a read-only endpoint, allowing a process to receive data
    /// from a corresponding [`FileKind::Tx`] (transmit) channel.
    ///
    /// Data sent through the corresponding [`FileKind::Tx`] endpoint is
    /// buffered and can be read asynchronously using this receiver. Once
    /// all [`FileKind::Tx`] handles are closed, reads will return an
    /// end-of-file (EOF) indication.
    ///
    /// This is useful for implementing features like pipes, message queues, or
    /// event notifications.
    Rx(keos::channel::Receiver<u8>),
    /// A transmit endpoint for interprocess communication (IPC).
    ///
    /// This variant represents a sending channel in an IPC mechanism. It serves
    /// as a write-only endpoint, allowing a process to send data to a
    /// corresponding [`FileKind::Rx`] (receive) channel.
    ///
    /// Data written to this [`FileKind::Tx`] endpoint is buffered until it is
    /// read by the corresponding [`FileKind::Rx`] endpoint. If no receiver
    /// exists, writes may block or fail depending on the system's IPC
    /// behavior.
    ///
    /// This is commonly used in pipes, producer-consumer queues, and task
    /// synchronization mechanisms.
    Tx(keos::channel::Sender<u8>),
}

/// The [`File`] struct represents an abstraction over a file descriptor in the
/// operating system.
///
/// This struct encapsulates information about an open file: its access mode
/// and the other metadata necessary for performing file operations
/// such as reading, writing, and seeking. It also holds a reference to the
/// kernel's underlying file structure ([`FileKind`]), allowing the operating
/// system to perform actual file operations on the filesystem.
///
/// The [`File`] struct is used to track the state of an open file, ensuring
/// that the correct file operations are executed and resources are managed
/// efficiently.
#[derive(Clone)]
pub struct File {
    /// The access mode of the file (e.g., read, write, read/write).
    ///
    /// A user program uses [`FileMode`] to tell the kernel how to open the
    /// file, and the kernel records it to decide which operations are allowed
    /// on the file.
    ///
    /// Refer to [`FileMode`] for details.
    pub mode: FileMode,

    /// The kernel file structure.
    ///
    /// This field contains the underlying representation of the file within the
    /// operating system kernel. It holds the kernel's metadata for the
    /// file, such as its name, permissions, and the actual file object used
    /// to perform system-level file operations.
    ///
    /// The [`FileKind`] enum allows this field to represent either a regular
    /// file ([`FileKind::RegularFile`]) or a special file such as standard
    /// input/output ([`FileKind::Stdio`]).
    pub file: FileKind,
}

/// Represents an index into a process’s file descriptor table.
///
/// In most operating systems, each process maintains a **file descriptor
/// table** that maps small integers (file descriptors) to open file objects.
/// A [`FileDescriptor`] is a wrapper around an `i32` that provides
/// stronger type safety when handling these indices in the kernel.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
pub struct FileDescriptor(pub i32);

/// The [`FileStruct`] represents the filesystem state for a specific
/// process, which corresponds to the Linux kernel's `struct files_struct`.
///
/// This struct encapsulates information about the current state of the
/// filesystem for the process, including the management of open
/// files, their positions, and the operations that can be performed on them.
/// The [`FileStruct`] is responsible for keeping track of file descriptors and
/// their associated file states, ensuring that file operations (like
/// reading, writing, seeking, and closing) are executed correctly and
/// efficiently.
///
/// It also provides a mechanism for the operating system to manage and interact
/// with files in a multi-process environment, allowing for
/// process-local filesystem management.
///
/// # Filesystem State
///
/// The filesystem state refers to the set of files currently open for a given
/// process. This includes managing the file descriptors (unique
/// identifiers for open files), file positions, and ensuring that resources are
/// freed once a file is closed.
#[derive(Clone)]
pub struct FileStruct {
    /// The current working directory of the process.
    pub cwd: Directory,
    /// The file descriptor table of the process.
    pub files: BTreeMap<FileDescriptor, File>,
}

impl Default for FileStruct {
    fn default() -> Self {
        Self::new()
    }
}

impl FileStruct {
    /// Creates a new instance of [`FileStruct`].
    ///
    /// This function initializes a new filesystem state, typically when a
    /// process starts or when a fresh file operation is needed.
    ///
    /// # Returns
    ///
    /// Returns a new [`FileStruct`] struct, representing a clean slate for the
    /// filesystem state. The clean state must have STDIN, STDOUT, and STDERR
    /// installed.
    pub fn new() -> Self {
        let mut this = Self {
            cwd: keos::fs::FileSystem::root(),
            files: BTreeMap::new(),
        };
        // Standard input: the first installed file.
        this.install_file(File {
            mode: FileMode::Read,
            file: FileKind::Stdio,
        })
        .unwrap();
        // Standard output: the second installed file.
        this.install_file(File {
            mode: FileMode::Write,
            file: FileKind::Stdio,
        })
        .unwrap();
        // Standard error: the third installed file.
        this.install_file(File {
            mode: FileMode::Write,
            file: FileKind::Stdio,
        })
        .unwrap();
        this
    }

    /// Installs a [`File`] into the process’s file descriptor table.
    ///
    /// This method assigns the lowest available file descriptor number to
    /// `file` and returns it as a [`FileDescriptor`].
    /// The descriptor can then be used by the process to perform I/O operations
    /// such as `read`, `write`, `stat`, or `close`.
    ///
    /// # Errors
    /// - Returns [`KernelError::TooManyOpenFile`] if the process already has
    ///   more than **1024 open files**, meaning no additional descriptors are
    ///   available.
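    ///
    /// One possible way to pick the lowest available number is sketched below.
    /// It is an illustration rather than the required implementation: it uses
    /// `std` so it can run on its own, `lowest_free_fd` and `MAX_OPEN_FILES`
    /// are hypothetical names, and treating a table of 1024 entries as full is
    /// an assumption about how the limit above is enforced.
    ///
    /// ```rust
    /// use std::collections::BTreeMap;
    ///
    /// /// Assumed capacity of the descriptor table.
    /// const MAX_OPEN_FILES: usize = 1024;
    ///
    /// /// Returns the smallest non-negative key missing from `table`, or
    /// /// `None` when the table is already at capacity.
    /// fn lowest_free_fd<V>(table: &BTreeMap<i32, V>) -> Option<i32> {
    ///     if table.len() >= MAX_OPEN_FILES {
    ///         return None;
    ///     }
    ///     // Keys are visited in ascending order, so the first mismatch
    ///     // between the key and the running counter marks the lowest gap.
    ///     let mut candidate = 0;
    ///     for &fd in table.keys() {
    ///         if fd != candidate {
    ///             break;
    ///         }
    ///         candidate += 1;
    ///     }
    ///     Some(candidate)
    /// }
    /// ```
    ///
    /// Reusing the lowest gap is what makes a descriptor freed by `close`
    /// available again to the next installed file.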
    pub fn install_file(&mut self, file: File) -> Result<FileDescriptor, KernelError> {
        todo!()
    }

    /// Opens a file.
    ///
    /// This function handles the system call for opening a file, including
    /// checking if the file exists, and setting up the file's access mode
    /// (e.g., read, write, or append). It modifies the [`FileStruct`] by
    /// associating the file with the current process and prepares the file
    /// for subsequent operations.
    ///
    /// # Errors
    /// - Returns [`KernelError::InvalidArgument`] if an unexpected access mode
    ///   is provided.
    /// - Propagates any errors from underlying APIs (e.g. [`uaccess`](keos::syscall::uaccess)).
    ///
    /// # Syscall API
    /// ```c
    /// int open(const char *pathname, int flags);
    /// ```
    /// - `pathname`: Path to the file to be opened.
    /// - `flags`: Specifies the access mode. The possible values are:
    ///   - `O_RDONLY` (0): The file is opened for read only.
    ///   - `O_WRONLY` (1): The file is opened for write only.
    ///   - `O_RDWR` (2): The file is opened for both read and write.
    ///
    /// Returns the corresponding file descriptor number for the opened file.
    pub fn open(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Reads data from an open file.
    ///
    /// This function implements the system call for reading from an open file.
    /// It reads up to a specified number of bytes from the file and returns
    /// them to the user. The current file position is adjusted accordingly.
    ///
    /// # Errors
    /// - Returns [`KernelError::IsDirectory`] if the specified file is a directory.
    /// - Returns [`KernelError::BrokenPipe`] if the specified file is a disconnected
    ///   interprocess communication channel.
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor is
    ///   invalid.
    ///
    /// # Syscall API
    /// ```c
    /// ssize_t read(int fd, void *buf, size_t count);
    /// ```
    /// - `fd`: File descriptor of the file to read from.
    /// - `buf`: Buffer to store the data read from the file.
    /// - `count`: Number of bytes to read.
    ///
    /// Returns the actual number of bytes read.
    pub fn read(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Writes data to an open file.
    ///
    /// This function implements the system call for writing data to a file. It
    /// writes a specified number of bytes to the file, starting from the
    /// current file position. The file's state is updated accordingly.
    ///
    /// # Errors
    /// - Returns [`KernelError::IsDirectory`] if the specified file is a directory.
    /// - Returns [`KernelError::BrokenPipe`] if the specified file is a disconnected
    ///   interprocess communication channel.
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor is
    ///   invalid.
    /// - Propagates any errors from underlying APIs (e.g. [`uaccess`](keos::syscall::uaccess)).
    ///
    /// # Syscall API
    /// ```c
    /// ssize_t write(int fd, const void *buf, size_t count);
    /// ```
    /// - `fd`: File descriptor of the file to write to.
    /// - `buf`: Buffer containing the data to be written.
    /// - `count`: Number of bytes to write.
    ///
    /// Returns the number of bytes written.
    pub fn write(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Seeks to a new position in the file.
    ///
    /// This function implements the system call for moving the file pointer to
    /// a specified position within the file. The position can be set
    /// relative to the beginning, current position, or end of the file.
    ///
    /// # Errors
    /// - Returns [`KernelError::InvalidArgument`] if the calculated position is
    ///   invalid.
    /// - Returns [`KernelError::InvalidArgument`] if the specified file is not a
    ///   [`FileKind::RegularFile`].
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor is
    ///   invalid.
    /// - Propagates any errors from underlying APIs (e.g. [`uaccess`](keos::syscall::uaccess)).
    ///
    /// # Syscall API
    /// ```c
    /// off_t seek(int fd, off_t offset, int whence);
    /// ```
    /// - `fd`: File descriptor of the file to seek in.
    /// - `offset`: Number of bytes to move the file pointer.
    /// - `whence`: Specifies how the offset is to be interpreted. Common values
    ///   are:
    ///   - `SEEK_SET` (0): The offset is relative to the beginning of the file.
    ///   - `SEEK_CUR` (1): The offset is relative to the current file position.
    ///   - `SEEK_END` (2): The offset is relative to the end of the file.
    ///
    /// Returns the new position of the file descriptor after moving it.
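    ///
    /// The arithmetic behind the three `whence` values can be sketched as
    /// below. `seek_target` is a hypothetical helper, not part of KeOS, and
    /// the sketch assumes that only negative or overflowing results count as
    /// invalid; whether positions past the end of the file are accepted is
    /// left open here.
    ///
    /// ```rust
    /// use std::convert::TryFrom;
    ///
    /// /// Computes the position after a seek, or `None` if `whence` is
    /// /// unknown or the result is negative or overflows.
    /// fn seek_target(whence: i32, offset: i64, current: usize, len: usize) -> Option<usize> {
    ///     let base = match whence {
    ///         0 => 0,       // SEEK_SET: from the beginning of the file
    ///         1 => current, // SEEK_CUR: from the current position
    ///         2 => len,     // SEEK_END: from the end of the file
    ///         _ => return None,
    ///     };
    ///     // Checked arithmetic prevents a hostile offset from wrapping around.
    ///     let base = i64::try_from(base).ok()?;
    ///     let target = base.checked_add(offset)?;
    ///     usize::try_from(target).ok()
    /// }
    /// ```
    ///
    /// A `None` result would correspond to the invalid-position case listed
    /// under the errors above.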
    pub fn seek(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Tells the current position in the file.
    ///
    /// This function implements the system call for retrieving the current file
    /// pointer position. It allows the program to know where in the file
    /// the next operation will occur.
    ///
    /// # Errors
    /// - Returns [`KernelError::InvalidArgument`] if the specified file is not a
    ///   [`FileKind::RegularFile`].
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor is
    ///   invalid.
    ///
    /// # Syscall API
    /// ```c
    /// off_t tell(int fd);
    /// ```
    /// - `fd`: File descriptor of the file.
    ///
    /// Returns the position of the file descriptor.
    pub fn tell(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Closes an open file.
    ///
    /// This function implements the system call for closing an open file.
    ///
    /// # Errors
    /// - Returns [`KernelError::BadFileDescriptor`] if the specified file descriptor is
    ///   invalid.
    ///
    /// # Syscall API
    /// ```c
    /// int close(int fd);
    /// ```
    /// - `fd`: File descriptor to close.
    ///
    /// Returns 0 on success.
    pub fn close(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }

    /// Creates an interprocess communication channel between two file
    /// descriptors.
    ///
    /// Pipes are unidirectional communication channels commonly used for IPC.
    /// Data written to `pipefd[1]` can be read from `pipefd[0]`.
    ///
    /// A process that reads from a pipe must wait if there are no bytes to be
    /// read.
    ///
    /// # Syscall API
    /// ```c
    /// int pipe(int pipefd[2]);
    /// ```
    /// - `pipefd`: An array of two file descriptors, where `pipefd[0]` is for
    ///   reading and `pipefd[1]` is for writing.
    ///
    /// Returns 0 on success.
    pub fn pipe(&mut self, abi: &SyscallAbi) -> Result<usize, KernelError> {
        todo!()
    }
}