Module loader


§ELF Loading.

When you run a program on a modern operating system, the kernel needs to know how to take the program stored on disk and place it into memory so it can start running. The file format that describes this mapping is called ELF (Executable and Linkable Format).

ELF

Think of ELF as a “blueprint” for a program’s memory layout. It tells the kernel where each part of the program (code, data, uninitialized variables) should go in memory, what permissions they need (read, write, execute), and where the program should begin execution.

An ELF file contains:

  • ELF header – a small table of contents that points to the rest of the file and gives the program’s entry point (where execution starts).
  • Program headers – a list of segments, each describing a chunk of the file that should be loaded into memory.
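To make the "table of contents" idea concrete, here is a minimal sketch that reads the entry point and the location of the program header table straight from the bytes of a 64-bit little-endian ELF file. The byte offsets come from the standard ELF64 header layout. The names ElfHeaderInfo and parse_elf64_header are hypothetical and are not part of KeOS, whose elf module already does this parsing for you.

```rust
/// Fields of interest from an ELF64 header.
/// Hypothetical helper type for illustration; not the KeOS `Elf` type.
#[derive(Debug, PartialEq)]
pub struct ElfHeaderInfo {
    pub entry: u64,     // e_entry: virtual address where execution starts
    pub phoff: u64,     // e_phoff: file offset of the program header table
    pub phentsize: u16, // e_phentsize: size of one program header entry
    pub phnum: u16,     // e_phnum: number of program header entries
}

fn u64_at(image: &[u8], off: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&image[off..off + 8]);
    u64::from_le_bytes(b)
}

fn u16_at(image: &[u8], off: usize) -> u16 {
    let mut b = [0u8; 2];
    b.copy_from_slice(&image[off..off + 2]);
    u16::from_le_bytes(b)
}

/// Parse the parts of a 64-bit little-endian ELF header used for loading.
/// Returns `None` if the image is too short or is not such a file.
pub fn parse_elf64_header(image: &[u8]) -> Option<ElfHeaderInfo> {
    // An ELF64 header is 64 bytes long and starts with the magic "\x7fELF".
    if image.len() < 64 || !image.starts_with(b"\x7fELF") {
        return None;
    }
    // Byte 4 (EI_CLASS) is 2 for 64-bit; byte 5 (EI_DATA) is 1 for little-endian.
    if image[4] != 2 || image[5] != 1 {
        return None;
    }
    Some(ElfHeaderInfo {
        entry: u64_at(image, 24),
        phoff: u64_at(image, 32),
        phentsize: u16_at(image, 54),
        phnum: u16_at(image, 56),
    })
}
```

The program header table then occupies phnum entries of phentsize bytes each, starting at file offset phoff.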

In KeOS, we only care about the program headers, because they tell us how to build the process’s memory image. Each program header (Phdr) says:

  • Virtual address (Phdr::p_vaddr) – Where in memory the segment should go.
  • Memory size (Phdr::p_memsz) – How big the memory region should be in total.
  • File size (Phdr::p_filesz) – How many bytes come from the file.
  • File offset (Phdr::p_offset) – Where in the file the bytes are stored.
  • Permissions (Phdr::p_flags) – What permissions the region needs.
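As an illustration of the permissions field, the sketch below decodes a raw p_flags word into read/write/execute booleans using the standard ELF bit values (PF_X = 1, PF_W = 2, PF_R = 4). It assumes the flags are available as a plain u32; the actual type returned by Phdr::p_flags in KeOS may differ, and SegmentPerms and decode_p_flags are hypothetical names.

```rust
// Standard ELF segment permission bits.
pub const PF_X: u32 = 0x1; // executable
pub const PF_W: u32 = 0x2; // writable
pub const PF_R: u32 = 0x4; // readable

/// Hypothetical decoded view of a segment's permissions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SegmentPerms {
    pub read: bool,
    pub write: bool,
    pub exec: bool,
}

/// Decode a raw `p_flags` value into individual permission booleans.
pub fn decode_p_flags(p_flags: u32) -> SegmentPerms {
    SegmentPerms {
        read: p_flags & PF_R != 0,
        write: p_flags & PF_W != 0,
        exec: p_flags & PF_X != 0,
    }
}
```

A typical code segment has flags PF_R | PF_X, while a data segment has PF_R | PF_W.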

You can iterate through the Phdrs with Elf::phdrs(), which returns an iterator over Phdr entries. The most important type of header is PType::Load, meaning “this segment must be loaded into memory.” To load it, KeOS does the following:

  1. Allocate memory at the given virtual address using MmStruct::do_mmap.
  2. Copy p_filesz bytes from the ELF file (starting at p_offset) into that memory.
  3. If p_memsz > p_filesz, fill the extra space with zeros. This is how ELF represents the .bss section, which holds uninitialized global variables.
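The copy-then-zero-fill rule from steps 2 and 3 can be sketched on plain byte buffers. This is a simplified model only: it builds the segment's contents in a Vec instead of mapping real pages with MmStruct::do_mmap, and LoadSegment and segment_image are hypothetical names introduced for the example.

```rust
/// Hypothetical, simplified view of one loadable program header.
pub struct LoadSegment {
    pub p_offset: usize, // where the segment's bytes start in the file
    pub p_filesz: usize, // how many bytes come from the file
    pub p_memsz: usize,  // total size of the segment in memory
}

/// Build the in-memory contents of one segment: `p_filesz` bytes copied
/// from the file, followed by zeros up to `p_memsz` (the .bss part).
/// Returns `None` if the header is inconsistent with the file.
pub fn segment_image(file: &[u8], seg: &LoadSegment) -> Option<Vec<u8>> {
    // A segment can never take more bytes from the file than it occupies in memory.
    if seg.p_filesz > seg.p_memsz {
        return None;
    }
    // Reject headers that point outside the file.
    let end = seg.p_offset.checked_add(seg.p_filesz)?;
    let from_file = file.get(seg.p_offset..end)?;

    // Start with all zeros, then overwrite the prefix with file data.
    let mut mem = vec![0u8; seg.p_memsz];
    mem[..seg.p_filesz].copy_from_slice(from_file);
    Some(mem)
}
```

Validating p_offset and p_filesz against the file length before copying matters in a kernel, since the ELF file comes from an untrusted user.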

By repeating this for every loadable segment, the kernel reconstructs the program’s expected memory image: code in .text, constants in .rodata, variables in .data, and zero-initialized memory in .bss. When this is done, the program’s virtual memory matches exactly what the compiler and linker prepared, and the kernel can safely jump to the entry point to start execution.

There are some pitfalls when loading an ELF file:

  • The address you map must be page-aligned. If p_vaddr is not, round it down to a page boundary and adjust the file offset and sizes accordingly.
  • Ensure segments do not overwrite existing mappings like the stack or kernel memory.
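Both pitfalls come down to a little address arithmetic, sketched below. The example assumes a 4 KiB page size and introduces hypothetical names (PageRange, page_align_segment, ranges_overlap); none of them are KeOS APIs.

```rust
/// Assumed page size for this sketch.
pub const PAGE_SIZE: u64 = 0x1000;

/// Hypothetical result of page-aligning one segment.
#[derive(Debug, PartialEq)]
pub struct PageRange {
    pub map_start: u64,   // page-aligned start of the mapping
    pub file_offset: u64, // file offset corresponding to `map_start`
    pub map_len: u64,     // mapping length, rounded up to whole pages
    pub lead: u64,        // bytes between `map_start` and `p_vaddr`
}

/// Round a segment's start down to a page boundary and adjust the
/// file offset and length to match. Returns `None` on arithmetic
/// overflow or if `p_offset` is smaller than the rounding amount.
pub fn page_align_segment(p_vaddr: u64, p_offset: u64, p_memsz: u64) -> Option<PageRange> {
    let lead = p_vaddr % PAGE_SIZE;
    let map_start = p_vaddr - lead;
    let file_offset = p_offset.checked_sub(lead)?;
    let span = lead.checked_add(p_memsz)?;
    let map_len = span.checked_add(PAGE_SIZE - 1)? & !(PAGE_SIZE - 1);
    Some(PageRange { map_start, file_offset, map_len, lead })
}

/// True if the half-open ranges [a, a + a_len) and [b, b + b_len) intersect.
pub fn ranges_overlap(a: u64, a_len: u64, b: u64, b_len: u64) -> bool {
    a < b.saturating_add(b_len) && b < a.saturating_add(a_len)
}
```

For example, a segment with p_vaddr = 0x401234, p_offset = 0x1234, and p_memsz = 0x2000 would be mapped at 0x401000 from file offset 0x1000 with a length of 0x3000 bytes. Before mapping, ranges_overlap can be used to compare that range against regions that are already in use.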

§State on Program Startup

The KeOS user-space C library (kelibc) defines _start(), located in kelibc/entry.c, as the program entry point. It calls main() and exits when main() returns. Before execution, the kernel must set up the registers and the user program’s stack correctly, passing arguments according to the standard calling convention. Registers holds the CPU state used when launching a program, including the instruction pointer, the stack pointer, and the general-purpose registers.

Example command: /bin/ls -l foo bar

  1. Split the command into words: "/bin/ls", "-l", "foo", "bar".
  2. Copy the argument strings to the top of the stack (order does not matter).
  3. Push their addresses, followed by a null sentinel (argv[argc] = NULL).
    • Before pushing the addresses, align the stack pointer down to an 8-byte boundary for performance.
  4. Set %rdi = argc (argument count) and %rsi = argv (argument array).
  5. Push a fake return address to maintain stack integrity.

Example stack layout before execution:

Address     Name            Data          Type
0x4747fffc  argv[3][…]      ‘bar\0’       char[4]
0x4747fff8  argv[2][…]      ‘foo\0’       char[4]
0x4747fff5  argv[1][…]      ‘-l\0’        char[3]
0x4747ffed  argv[0][…]      ‘/bin/ls\0’   char[8]
0x4747ffe8  word-align      0             uint8_t[]
0x4747ffe0  argv[4]         0             char *
0x4747ffd8  argv[3]         0x4747fffc    char *
0x4747ffd0  argv[2]         0x4747fff8    char *
0x4747ffc8  argv[1]         0x4747fff5    char *
0x4747ffc0  argv[0]         0x4747ffed    char *
0x4747ffb8  return address  0             void (*) ()

The stack pointer (rsp) is initialized to 0x4747ffb8. The first two arguments, %rdi and %rsi, should be 4 and 0x4747ffc0, respectively. The user program stack always starts at 0x47480000 in KeOS, and always grows downward.
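The layout above can be reproduced with a small simulation that builds the stack contents in an ordinary byte vector. This is only a sketch of the pushing order and address arithmetic: it does not use the real StackBuilder API or write to user memory, and Sim, BuiltStack, and build_arg_stack are hypothetical names.

```rust
/// Top of the user stack in KeOS, as stated above.
pub const STACK_TOP: u64 = 0x4748_0000;

/// Hypothetical result of the simulation.
pub struct BuiltStack {
    pub bytes: Vec<u8>, // stack contents covering [rsp, STACK_TOP)
    pub rsp: u64,       // initial stack pointer
    pub argc: u64,      // value for %rdi
    pub argv: u64,      // value for %rsi
}

/// A downward-growing stack modeled as a byte vector.
struct Sim {
    sp: u64,
    bytes: Vec<u8>,
}

impl Sim {
    /// Push `data` below the current stack pointer and return its address.
    fn push(&mut self, data: &[u8]) -> u64 {
        let mut grown = data.to_vec();
        grown.extend_from_slice(&self.bytes);
        self.bytes = grown;
        self.sp -= data.len() as u64;
        self.sp
    }
}

pub fn build_arg_stack(args: &[&str]) -> BuiltStack {
    let mut sim = Sim { sp: STACK_TOP, bytes: Vec::new() };

    // Copy the NUL-terminated strings, last argument first, so that
    // argv[0] ends up at the lowest address as in the table.
    let mut addrs = Vec::with_capacity(args.len());
    for arg in args.iter().rev() {
        let mut s = arg.as_bytes().to_vec();
        s.push(0);
        addrs.push(sim.push(&s));
    }
    addrs.reverse(); // addrs[i] is now the address of argv[i]

    // Align the stack pointer down to an 8-byte boundary.
    let pad = (sim.sp % 8) as usize;
    sim.push(&vec![0u8; pad]);

    // Push the null sentinel argv[argc], then the pointers in reverse order.
    sim.push(&0u64.to_le_bytes());
    for addr in addrs.iter().rev() {
        sim.push(&addr.to_le_bytes());
    }
    let argv = sim.sp;

    // Push the fake return address.
    sim.push(&0u64.to_le_bytes());

    BuiltStack { bytes: sim.bytes, rsp: sim.sp, argc: args.len() as u64, argv }
}
```

Calling build_arg_stack(&["/bin/ls", "-l", "foo", "bar"]) yields rsp = 0x4747ffb8, argc = 4, and argv = 0x4747ffc0, matching the table.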

StackBuilder is a utility for constructing user-space stacks. It provides:

You can use these methods to set up the stack in LoadContext::build_stack.

§Launching a Process

After loading the program into memory and setting up the stack, the kernel switches to user mode and begins execution. This is done using Registers::launch. Calling Registers::launch causes the CPU to change the privilege level to user mode and start executing from rip. You don’t need to implement this functionality, as it is already implemented outside this module.

§Implementation Requirements

You need to implement the following:

This ends Project 2.

Modules§

elf
Utility for parsing ELF files.
stack_builder
StackBuilder, a utility for constructing a user-space stack layout.

Structs§

LoadContext
A context that holds the necessary state for loading and initializing a user program.