//! ## ELF Loading
//!
//! When you run a program on a modern operating system, the kernel needs to
//! know **how to take the program stored on disk and place it into memory so it
//! can start running**. The file format that describes this mapping is called
//! **ELF (Executable and Linkable Format)**.
//!
//! ![ELF](<https://raw.githubusercontent.com/casys-kaist/KeOS/79a689838dc34de607bb8d1beb89aa5535cd4af2/images/image.png>)
//!
//! Think of ELF as a "blueprint" for a program’s memory layout. It tells the
//! kernel where each part of the program (code, data, uninitialized variables)
//! should go in memory, what permissions they need (read, write, execute), and
//! where the program should begin execution.
//!
//! An ELF file contains:
//! - **ELF header** – a small table of contents that points to the rest of the
//!   file and gives the program’s entry point (where execution starts).
//! - **Program headers** – a list of **segments**, each describing a chunk of
//!   the file that should be loaded into memory.
//!
//! In KeOS, apart from the entry point in the ELF header, we only care about
//! the **program headers**, because they tell us how to build the process’s
//! memory image. Each program header (`Phdr`) describes one segment:
//! - **Virtual address** ([`Phdr::p_vaddr`]) – Where in memory the segment
//!   should go.
//! - **Memory size** ([`Phdr::p_memsz`]) – How big the memory region should be
//!   in total.
//! - **File size** ([`Phdr::p_filesz`]) – How many bytes come from the file.
//! - **File offset** ([`Phdr::p_offset`]) – Where in the file the bytes are
//!   stored.
//! - **Permissions** ([`Phdr::p_flags`]) – What permissions the region needs.
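//!
//! As a rough illustration of the last field, the sketch below decodes the
//! `p_flags` bits into read/write/execute booleans. The bit values are the
//! ones defined by the ELF specification. `SegmentPerm` and `decode_flags` are
//! hypothetical names used only in this example; KeOS has its own `Permission`
//! type, which this sketch does not try to reproduce.
//!
//! ```rust
//! /// Segment permission bits from the ELF specification.
//! const PF_X: u32 = 0x1; // executable
//! const PF_W: u32 = 0x2; // writable
//! const PF_R: u32 = 0x4; // readable
//!
//! /// Hypothetical stand-in for a page permission type (not the KeOS one).
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! pub struct SegmentPerm {
//!     pub read: bool,
//!     pub write: bool,
//!     pub exec: bool,
//! }
//!
//! /// Translates raw `p_flags` bits into a `SegmentPerm`.
//! pub fn decode_flags(p_flags: u32) -> SegmentPerm {
//!     SegmentPerm {
//!         read: p_flags & PF_R != 0,
//!         write: p_flags & PF_W != 0,
//!         exec: p_flags & PF_X != 0,
//!     }
//! }
//! ```
//!
//! With these definitions, `decode_flags(PF_R | PF_X)` describes a readable,
//! executable, non-writable segment, which is the usual setting for code.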
//!
//! You can iterate through the [`Phdr`]s with [`Elf::phdrs()`], which returns
//! an iterator over [`Phdr`] entries. The most important type of header is
//! [`PType::Load`], meaning “this segment must be loaded into memory.” To load
//! it, KeOS does the following:
//! 1. Allocate memory at the given virtual address using [`MmStruct::do_mmap`].
//! 2. Copy `p_filesz` bytes from the ELF file (starting at `p_offset`) into
//!    that memory.
//! 3. If `p_memsz > p_filesz`, fill the extra space with zeros. This is how ELF
//!    represents the **`.bss` section**, which holds uninitialized global
//!    variables.
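//!
//! Steps 2 and 3 can be sketched on plain byte buffers, without any paging.
//! This is only an illustration under simplifying assumptions: `file` stands in
//! for the ELF file contents, the returned `Vec<u8>` stands in for the mapped
//! region, and `segment_image` is a hypothetical helper, not a KeOS function.
//!
//! ```rust
//! /// Builds the in-memory image of one loadable segment.
//! ///
//! /// Returns `None` if the header is inconsistent (`p_filesz > p_memsz`) or
//! /// if the requested byte range lies outside `file`.
//! pub fn segment_image(
//!     file: &[u8],
//!     p_offset: usize,
//!     p_filesz: usize,
//!     p_memsz: usize,
//! ) -> Option<Vec<u8>> {
//!     if p_filesz > p_memsz {
//!         return None;
//!     }
//!     let end = p_offset.checked_add(p_filesz)?;
//!     let src = file.get(p_offset..end)?;
//!
//!     // Start from `p_memsz` zero bytes, so the tail beyond `p_filesz`
//!     // (the `.bss` part) is already zero-filled.
//!     let mut image = vec![0u8; p_memsz];
//!     // Copy the file-backed part to the front of the region.
//!     image[..p_filesz].copy_from_slice(src);
//!     Some(image)
//! }
//! ```
//!
//! For a six-byte file `[1, 2, 3, 4, 5, 6]`, calling
//! `segment_image(&file, 2, 3, 5)` yields `[3, 4, 5, 0, 0]`: three bytes from
//! the file followed by two zero bytes.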
//!
//! By repeating this for every loadable segment, the kernel reconstructs the
//! program’s expected memory image: code in `.text`, constants in `.rodata`,
//! variables in `.data`, and zero-initialized memory in `.bss`. When this is
//! done, the program’s virtual memory matches exactly what the compiler and
//! linker prepared, and the kernel can safely jump to the entry point to start
//! execution.
//!
//! There are some pitfalls when loading an ELF file:
//!  - Mappings must start at a page-aligned address. If `p_vaddr` is not
//!    page-aligned, round it down and adjust the file offset and sizes
//!    accordingly.
//!  - Ensure segments do not overwrite existing mappings like the stack or
//!    kernel memory.
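//!
//! One possible reading of the first pitfall is sketched below. It assumes
//! 4 KiB pages and treats all values as plain `usize`; `AlignedSegment` and
//! `align_segment` are hypothetical names for this example only, and the real
//! loader may organize the adjustment differently.
//!
//! ```rust
//! /// Page size assumed for this sketch (4 KiB).
//! const PAGE_SIZE: usize = 0x1000;
//!
//! /// Segment bounds after rounding the start down to a page boundary.
//! #[derive(Debug, PartialEq, Eq)]
//! pub struct AlignedSegment {
//!     pub vaddr: usize,
//!     pub offset: usize,
//!     pub filesz: usize,
//!     pub memsz: usize,
//!     /// Number of whole pages needed to cover `memsz` bytes.
//!     pub pages: usize,
//! }
//!
//! /// Rounds `p_vaddr` down to a page boundary and shifts the other fields to
//! /// match. Returns `None` on arithmetic overflow or underflow, which
//! /// indicates a malformed header.
//! pub fn align_segment(
//!     p_vaddr: usize,
//!     p_offset: usize,
//!     p_filesz: usize,
//!     p_memsz: usize,
//! ) -> Option<AlignedSegment> {
//!     // Distance from the preceding page boundary to `p_vaddr`.
//!     let delta = p_vaddr & (PAGE_SIZE - 1);
//!     // Starting `delta` bytes lower means reading `delta` bytes earlier in
//!     // the file and covering `delta` more bytes in memory.
//!     let offset = p_offset.checked_sub(delta)?;
//!     let filesz = p_filesz.checked_add(delta)?;
//!     let memsz = p_memsz.checked_add(delta)?;
//!     // Round the total size up to whole pages.
//!     let pages = memsz.checked_add(PAGE_SIZE - 1)? / PAGE_SIZE;
//!     Some(AlignedSegment {
//!         vaddr: p_vaddr - delta,
//!         offset,
//!         filesz,
//!         memsz,
//!         pages,
//!     })
//! }
//! ```
//!
//! Note that `vaddr + filesz` is unchanged by the adjustment, so the end of
//! the file-backed data stays where the program header placed it.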
//!
//! ## State on Program Startup
//!
//! The KeOS user-space C library (`kelibc`) defines `_start()`, located in
//! `kelibc/entry.c`, as the program entry point. It calls `main()` and
//! exits when `main()` returns. The kernel must set up the registers and the
//! user program's stack correctly before execution, passing arguments according
//! to the standard calling convention. [`Registers`] holds the CPU state at
//! program launch, including the instruction pointer, stack pointer, and
//! general-purpose registers.
//!
//! **Example command:** `/bin/ls -l foo bar`
//!
//! 1. Split the command into words: `"/bin/ls"`, `"-l"`, `"foo"`, `"bar"`.
//! 2. Copy the argument strings to the top of the stack (order does not
//!    matter).
//! 3. Align the stack pointer down to an 8-byte boundary (for performance),
//!    then push the null sentinel (`argv[argc] = NULL`) followed by the string
//!    addresses in reverse order, so that `argv[0]` ends up at the lowest
//!    address.
//! 4. Set `%rdi = argc` (argument count) and `%rsi = argv` (address of
//!    `argv[0]` on the stack).
//! 5. Push a fake return address so that the stack has the layout of an
//!    ordinary function call.
//!
//! **Example stack layout before execution:**
//!
//! | Address    | Name             | Data        | Type        |
//! | ---------- | ---------------- | ----------- | ----------- |
//! | 0x4747fffc | argv\[3\]\[...\] | 'bar\0'     | char\[4\]   |
//! | 0x4747fff8 | argv\[2\]\[...\] | 'foo\0'     | char\[4\]   |
//! | 0x4747fff5 | argv\[1\]\[...\] | '-l\0'      | char\[3\]   |
//! | 0x4747ffed | argv\[0\]\[...\] | '/bin/ls\0' | char\[8\]   |
//! | 0x4747ffe8 | word-align       | 0           | uint8_t\[\] |
//! | 0x4747ffe0 | argv\[4\]        | 0           | char *      |
//! | 0x4747ffd8 | argv\[3\]        | 0x4747fffc  | char *      |
//! | 0x4747ffd0 | argv\[2\]        | 0x4747fff8  | char *      |
//! | 0x4747ffc8 | argv\[1\]        | 0x4747fff5  | char *      |
//! | 0x4747ffc0 | argv\[0\]        | 0x4747ffed  | char *      |
//! | 0x4747ffb8 | return address   | 0           | void (*) () |
//!
//! The stack pointer (`rsp`) is initialized to `0x4747ffb8`. The first two
//! argument registers, `%rdi` and `%rsi`, should be `4` and `0x4747ffc0`,
//! respectively. The user program stack always starts at `0x47480000` in KeOS
//! and always grows downward.
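//!
//! The layout in the table can be reproduced with a small simulation. This is a
//! sketch under stated assumptions, not the KeOS implementation: `SimStack` is
//! a hypothetical stand-in backed by a byte buffer instead of user pages, it
//! covers a single assumed 4 KiB page below the stack top, and its method names
//! are chosen for this example only.
//!
//! ```rust
//! /// Top of the user stack in KeOS, as stated above.
//! const STACK_TOP: usize = 0x4748_0000;
//! /// Size of the simulated stack region (one 4 KiB page, an assumption).
//! const PAGE_SIZE: usize = 0x1000;
//!
//! /// Hypothetical stack simulator backed by a plain byte buffer.
//! pub struct SimStack {
//!     /// Current stack pointer (a simulated user address).
//!     pub sp: usize,
//!     /// Backing bytes for `[STACK_TOP - PAGE_SIZE, STACK_TOP)`.
//!     page: Vec<u8>,
//! }
//!
//! impl SimStack {
//!     pub fn new() -> Self {
//!         SimStack { sp: STACK_TOP, page: vec![0; PAGE_SIZE] }
//!     }
//!
//!     /// Moves `sp` down by `bytes.len()`, copies `bytes` there, and returns
//!     /// the new `sp`. Panics if the data does not fit in the page.
//!     pub fn push_bytes(&mut self, bytes: &[u8]) -> usize {
//!         self.sp -= bytes.len();
//!         let start = self.sp - (STACK_TOP - PAGE_SIZE);
//!         self.page[start..start + bytes.len()].copy_from_slice(bytes);
//!         self.sp
//!     }
//!
//!     /// Pushes `s` with a trailing NUL byte and returns the string address.
//!     pub fn push_str(&mut self, s: &str) -> usize {
//!         self.push_bytes(&[0]);
//!         self.push_bytes(s.as_bytes())
//!     }
//!
//!     /// Pushes one 8-byte little-endian word.
//!     pub fn push_word(&mut self, value: u64) -> usize {
//!         self.push_bytes(&value.to_le_bytes())
//!     }
//!
//!     /// Rounds `sp` down to a multiple of `to` (which must be a power of two).
//!     pub fn align(&mut self, to: usize) {
//!         self.sp &= !(to - 1);
//!     }
//!
//!     /// Reads back the 8-byte word stored at `addr`.
//!     pub fn read_word(&self, addr: usize) -> u64 {
//!         let start = addr - (STACK_TOP - PAGE_SIZE);
//!         let mut buf = [0u8; 8];
//!         buf.copy_from_slice(&self.page[start..start + 8]);
//!         u64::from_le_bytes(buf)
//!     }
//! }
//!
//! /// Lays out `args` as in the table and returns `(stack, rsp, rdi, rsi)`.
//! pub fn build_stack(args: &[&str]) -> (SimStack, usize, usize, usize) {
//!     let mut stack = SimStack::new();
//!     // Push the strings last-to-first so the addresses match the table.
//!     let addrs: Vec<usize> =
//!         args.iter().rev().map(|a| stack.push_str(a)).collect();
//!     stack.align(8);
//!     stack.push_word(0); // argv[argc] = NULL
//!     // `addrs` holds the last argument first, so `argv[0]` is pushed last
//!     // and lands at the lowest address.
//!     for &addr in &addrs {
//!         stack.push_word(addr as u64);
//!     }
//!     let argv = stack.sp;
//!     stack.push_word(0); // fake return address
//!     let rsp = stack.sp;
//!     (stack, rsp, args.len(), argv)
//! }
//! ```
//!
//! Calling `build_stack(&["/bin/ls", "-l", "foo", "bar"])` in this simulation
//! gives `rsp = 0x4747ffb8`, `rdi = 4`, and `rsi = 0x4747ffc0`, matching the
//! values above.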
//!
//! [`StackBuilder`] is a utility for constructing user-space stacks. It
//! provides:
//!
//! - [`StackBuilder::push_usize`] – Pushes a `usize` value (e.g., pointers like
//!   `argv[]`).
//! - [`StackBuilder::push_str`] – Pushes a null-terminated string and returns
//!   its address.
//! - [`StackBuilder::align`] – Aligns the stack pointer for proper memory
//!   access.
//!
//! You can use these methods to set up the stack in
//! [`LoadContext::build_stack`].
//!
//! ### Launching a Process
//!
//! After loading the program into memory and setting up the stack, the kernel
//! switches to user mode and begins execution. This is done using
//! [`Registers::launch`], which causes the CPU to change the privilege level
//! to user mode and start executing from `rip`. You don't need to implement
//! this functionality; it is already implemented outside this module.
//!
//! ## Implementation Requirements
//! You need to implement the following:
//! - [`Phdr::permission`]
//! - [`StackBuilder::push_bytes`]
//! - [`LoadContext::load_phdr`]
//! - [`LoadContext::build_stack`]
//!
//! This concludes Project 2.
//!
//! [`Registers::launch`]: ../../keos/syscall/struct.Registers.html#method.launch

#[allow(dead_code)]
pub mod elf;
pub mod stack_builder;

use crate::{mm_struct::MmStruct, pager::Pager};
#[cfg(doc)]
use elf::Phdr;
use elf::{Elf, PType};
#[cfg(doc)]
use keos::mm::page_table::Permission;
use keos::{
    KernelError,
    addressing::{PAGE_MASK, Va},
    fs::RegularFile,
    syscall::Registers,
};
use stack_builder::StackBuilder;

/// A context that holds the necessary state for loading and initializing a user
/// program.
///
/// `LoadContext` is used while loading an ELF binary into memory. It
/// encapsulates both the memory layout for the program and its initial register
/// state, allowing the loader to fully prepare the user-space execution
/// context.
pub struct LoadContext<P: Pager> {
    /// Virtual memory layout for the new user program.
    pub mm_struct: MmStruct<P>,
    /// Initial CPU register values for the user process, including the
    /// instruction pointer.
    pub regs: Registers,
}

impl<P: Pager> LoadContext<P> {
    /// Loads program headers ([`Phdr`]s) from an ELF binary into memory.
    ///
    /// This function iterates over the ELF program headers and maps the
    /// corresponding segments into the process's memory space. It ensures
    /// that each segment is correctly mapped according to its permissions
    /// and alignment requirements.
    ///
    /// # Parameters
    /// - `elf`: The ELF binary representation containing program headers.
    ///
    /// # Returns
    /// - `Ok(())` on success, indicating that all segments were successfully
    ///   loaded.
    /// - `Err(KernelError)` if any error occurs during the loading process,
    ///   such as an invalid memory mapping, insufficient memory, or an
    ///   unsupported segment type.
    ///
    /// # Behavior
    /// - Iterates over all program headers using [`Elf::phdrs`].
    /// - Maps each segment into memory if its type is [`PType::Load`].
    /// - Applies appropriate memory permissions using [`Phdr::permission`].
    /// - Ensures proper alignment and memory allocation before mapping.
    pub fn load_phdr(&mut self, elf: Elf) -> Result<(), KernelError> {
        // Highest address reached by file-backed data across all loaded
        // segments.
        let mut bss = Va::new(0).unwrap();

        for phdr in elf.phdrs().map_err(|_| KernelError::InvalidArgument)? {
            if phdr.type_ == PType::Load {
                let (vaddr, memsz, filesz, fileofs, perm): (Va, _, _, _, _) =
                    (todo!(), todo!(), todo!(), todo!(), phdr.permission());
                bss = bss.max(vaddr + filesz as usize);
                todo!()
            }
        }

        // If the file-backed data ends in the middle of a page, zero the rest
        // of that page so that `.bss` starts out zero-filled.
        if bss.into_usize() & PAGE_MASK != 0 {
            self.mm_struct
                .get_user_page_and(bss, |mut page, _| {
                    page.inner_mut()[bss.into_usize() & PAGE_MASK..].fill(0);
                })
                .unwrap();
        }
        Ok(())
    }

    /// Builds a user stack and initializes it with arguments.
    ///
    /// This function sets up a new stack for the process by allocating memory,
    /// pushing program arguments (`argv`), and preparing the initial register
    /// state.
    ///
    /// # Parameters
    /// - `arguments`: A slice of string slices representing the command-line
    ///   arguments (`argv`).
    ///
    /// # Returns
    /// - `Ok(())` on success, indicating that the stack has been built
    ///   correctly.
    /// - `Err(KernelError)` if an error occurs during memory allocation or
    ///   argument copying.
    ///
    /// # Behavior
    /// - Pushes the argument strings onto the stack.
    /// - Sets up `argv` and `argc` for the process.
    /// - Aligns the stack pointer to ensure proper function call execution.
    /// - Updates `self.regs` with the initial stack pointer (`rsp`), the
    ///   argument count (`argc`), and the address of `argv`.
    ///
    /// # Safety
    /// - The function must be called before transferring control to user space.
    /// - The memory layout should follow the standard calling convention for
    ///   argument passing.
    pub fn build_stack(&mut self, arguments: &[&str]) -> Result<(), KernelError> {
        let Self {
            mm_struct: mm_state,
            regs,
        } = self;
        let mut builder = StackBuilder::new(mm_state)?;
        todo!()
    }

    /// Creates a new memory state and initializes a user process from an ELF
    /// executable.
    ///
    /// This function loads an ELF binary into memory, sets up the program
    /// headers, and constructs the initial user stack with arguments. It
    /// also prepares the register state for execution.
    ///
    /// # Parameters
    /// - `file`: A reference to the ELF executable file.
    /// - `args`: A slice of string slices representing the command-line
    ///   arguments (`argv`).
    ///
    /// # Returns
    /// - `Ok(Self)` on success, where the returned context holds:
    ///   - `mm_struct`, the initialized memory state.
    ///   - `regs`, the initial register values.
    /// - `Err(KernelError)` if an error occurs while loading the ELF file or
    ///   setting up memory.
    ///
    /// # Behavior
    /// - Parses the ELF file and validates its format.
    /// - Loads program headers ([`PType::Load`]) into memory.
    /// - Allocates and builds the user stack.
    /// - Initializes the register state (`rip` -> entry point, `rsp` -> stack
    ///   pointer, arg1 -> the number of arguments, arg2 -> address of the
    ///   argument vector).
    pub fn load(mut self, file: &RegularFile, args: &[&str]) -> Result<Self, KernelError> {
        if let Some(elf) = elf::Elf::from_file(file) {
            *self.regs.rip() = elf.header.e_entry as usize;
            self.load_phdr(elf)?;
            self.build_stack(args)?;

            Ok(self)
        } else {
            Err(KernelError::InvalidArgument)
        }
    }
}