Thursday, 21 March 2013

Anatomy of Linux loadable kernel modules


For details, check out the link:
http://www.ibm.com/developerworks/linux/library/l-lkm/index.html?ca=dgr-lnxw07LinuxLKM
The Linux kernel is what's known as a monolithic kernel, which means that the majority of the operating system functionality is called the kernel and runs in a privileged mode. This differs from a micro-kernel, which runs only basic functionality as the kernel (inter-process communication [IPC], scheduling, basic input/output [I/O], memory management) and pushes other functionality outside the privileged space (drivers, network stack, file systems). You'd think that Linux is then a very static kernel, but in fact it's quite the opposite. Linux can be dynamically altered at run time through the use of Linux kernel modules (LKMs).
Linux is not the only monolithic kernel that can be dynamically altered (and it wasn't the first). You'll find loadable module support in Berkeley Software Distribution (BSD) variants, Sun Solaris, older kernels such as OpenVMS, and other popular operating systems such as Microsoft® Windows® and Apple Mac OS X.

Dynamically alterable means that you can load new functionality into the kernel, unload functionality from the kernel, and even add new LKMs that use other LKMs. The advantage of LKMs is that you can minimize the memory footprint of a kernel, loading only those elements that are needed (which can be an important feature in embedded systems).
An LKM has some fundamental differences from elements that are compiled directly into the kernel and also from typical programs. A typical program has a main function, whereas an LKM has a module entry and exit function (in version 2.6, you can name these functions anything you wish). The entry function is called when the module is inserted into the kernel, and the exit function is called when it's removed. Because the entry and exit functions are user defined, the module_init and module_exit macros exist to define which functions these are. An LKM also includes a set of required and optional module macros. These define the license of the module, the module's author, a description of the module, and more. Figure 1 provides a view of a very simple LKM.

Figure 1. Source view of a simple LKM 
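To make that concrete, here is a minimal sketch of such a module, assuming a 2.6-style build environment. The hello name, messages, and author string are illustrative and not the exact module shown in Figure 1.

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

/* Entry function: called when the module is inserted into the kernel. */
static int __init hello_init(void)
{
    printk(KERN_INFO "hello: module loaded\n");
    return 0;    /* returning 0 indicates successful initialization */
}

/* Exit function: called when the module is removed from the kernel. */
static void __exit hello_exit(void)
{
    printk(KERN_INFO "hello: module removed\n");
}

module_init(hello_init);    /* declare the module entry point */
module_exit(hello_exit);    /* declare the module exit point  */

/* Module macros; these end up in the .modinfo section described later. */
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Example Author");
MODULE_DESCRIPTION("A trivial example loadable kernel module");

Built against the kernel source tree, a module like this produces a kernel object (.ko) file that the standard tools described next operate on.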
The version 2.6 Linux kernel provides a new (simpler) method for building LKMs. When built, you can use the typical user tools for managing modules (though the internals have changed): the standard insmod (installing an LKM), rmmod (removing an LKM), modprobe (wrapper for insmod and rmmod), depmod (to create module dependencies), and modinfo (to find the values for module macros). For more information on building LKMs for the version 2.6 kernel, check out Resources.
An LKM is nothing more than a special Executable and Linkable Format (ELF) object file. Typically, object files are linked to resolve their symbols and result in an executable. But because an LKM can't resolve its symbols until it's loaded into the kernel, the LKM remains an ELF object. You can use standard object tools on LKMs (which for version 2.6 have the suffix .ko, for kernel object). For example, if you used the objdump utility on an LKM, you'd find several familiar sections, such as .text (instructions), .data (initialized data), and .bss (Block Started by Symbol, or uninitialized data).
You'll also find additional sections in a module to support its dynamic nature. The .init.text section contains the module_init code, and the .exit.text section contains the module_exit code (see Figure 2). The .modinfo section contains the various macro text indicating module license, author, description, and so on.

Figure 2. An example of an LKM with various ELF sections 
So, with that introduction to the basics of LKMs, let's dig in to see how modules get into the kernel and are managed internally.
The process of module loading begins in user space with insmod (insert module). The insmod command defines the module to load and invokes the init_module user-space system call to begin the loading process. The insmod command for the version 2.6 kernel has become extremely simple (70 lines of code) based on a change to do more work in the kernel. Rather than insmod doing any of the symbol resolution that's necessary (working with kerneld), the insmod command simply copies the module binary into the kernel through the init_module function, where the kernel takes care of the rest.
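A rough user-space sketch of that flow follows, assuming a module file named hello.ko in the current directory (the file name is illustrative). It reads the module image into memory and hands it to the kernel through the init_module system call, which is essentially what insmod boils down to.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    const char *path = "hello.ko";    /* hypothetical module file */
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0 || fstat(fd, &st) < 0) {
        perror(path);
        return 1;
    }

    void *image = malloc(st.st_size);

    /* Read the whole ELF object into user memory (a robust tool would
     * loop until all bytes have been read). */
    if (!image || read(fd, image, st.st_size) != st.st_size) {
        perror("read");
        return 1;
    }
    close(fd);

    /* Hand the image to the kernel; the empty string means no module
     * parameters. Requires CAP_SYS_MODULE (typically root). */
    if (syscall(SYS_init_module, image, (unsigned long)st.st_size, "") != 0) {
        perror("init_module");
        return 1;
    }

    free(image);
    return 0;
}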
The init_module function works through the system call layer and into the kernel to a kernel function called sys_init_module (see Figure 3). This is the main function for module loading, making use of numerous other functions to do the difficult work. Similarly, the rmmod command results in a system call for delete_module, which eventually finds its way into the kernel with a call to sys_delete_module to remove the module from the kernel.

Figure 3. Primary commands and functions involved in module loading and unloading 
During module load and unload, the module subsystem maintains a simple set of state variables to indicate the operation of a module. If the module is being loaded, then the state is MODULE_STATE_COMING. If the module has been loaded and is available, it is MODULE_STATE_LIVE. Otherwise, if the module is being unloaded, then the state is MODULE_STATE_GOING.
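These states are declared as a simple enumeration in the module header; the sketch below reflects roughly how the version 2.6 kernel declares it in ./linux/include/linux/module.h (the comments are added here for illustration).

enum module_state
{
    MODULE_STATE_LIVE,      /* module is loaded and available      */
    MODULE_STATE_COMING,    /* module is in the process of loading */
    MODULE_STATE_GOING,     /* module is being unloaded            */
};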
Let's now look at the internal functions for module loading (see Figure 4). When the kernel function sys_init_module is called, it begins with a permissions check to see whether the caller can actually perform this operation (through the capable function). Then, the load_module function is called, which takes care of the mechanical work to bring the module into the kernel and perform the necessary plumbing (I review this shortly). The load_module function returns a module reference that refers to the newly loaded module. This module is loaded onto a doubly linked list of all modules in the system, and any threads currently waiting for module state change are notified through the notifier list. Finally, the module's init() function is called, and the module's state is updated to indicate that it is loaded and live.

Figure 4. The internal (simplified) module loading process 
The internal details of module loading are ELF module parsing and manipulation. The load_module function (which resides in ./linux/kernel/module.c) begins by allocating a block of temporary memory to hold the entire ELF module. The ELF module is then read from user space into the temporary memory using copy_from_user. As an ELF object, this file has a very specific structure that can be easily parsed and validated.
The next step is to perform a set of sanity checks on the loaded image (is it a valid ELF file? is it defined for the current architecture? and so on). When these sanity checks are passed, the ELF image is parsed and a set of convenience variables are created for each section header to simplify their access later. Because the ELF objects are based at offset 0 (until relocation), the convenience variables include the relative offset into the temporary memory block. During the process of creating the convenience variables, the ELF section headers are also validated to ensure that a valid module is being loaded.
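The sketch below shows the flavor of those sanity checks in ordinary user-space C, assuming a 64-bit ELF image and an x86-64 build; the function name and the exact set of checks are illustrative, not the literal kernel code.

#include <elf.h>
#include <string.h>

static int elf_header_looks_sane(const Elf64_Ehdr *hdr, unsigned long len)
{
    if (len < sizeof(*hdr))
        return 0;                                  /* too small to be an ELF file        */
    if (memcmp(hdr->e_ident, ELFMAG, SELFMAG) != 0)
        return 0;                                  /* missing the "\177ELF" magic        */
    if (hdr->e_type != ET_REL)
        return 0;                                  /* a module is a relocatable object   */
    if (hdr->e_machine != EM_X86_64)
        return 0;                                  /* built for a different architecture */
    if (hdr->e_shoff > len)
        return 0;                                  /* section headers outside the image  */
    return 1;
}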
Any optional module arguments are loaded from user space into another allocated block of kernel memory (step 4), and the module state is updated to indicate that it's being loaded (MODULE_STATE_COMING). If per-CPU data is needed (as determined in the section header checks), a per-CPU block is allocated.
In the prior steps, the module sections were loaded into kernel (temporary) memory, and it is also known which sections are persistent and which can be removed. The next step (7) is to allocate the final location for the module in memory and move the necessary sections (indicated in the ELF headers by SHF_ALLOC, or the sections that occupy memory during execution). Another allocation is then performed, sized for the required sections of the module. Each section in the temporary ELF block is iterated, and those that need to be around for execution are copied into the new block. This is followed by some additional housekeeping. Symbol resolution also occurs, which can resolve to symbols that are resident in the kernel (compiled into the kernel image) or symbols that are transient (exported from other modules).
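The sketch below illustrates the "copy what must live in memory" idea in user-space terms, again assuming a 64-bit ELF image. The function name is illustrative, and the real kernel code additionally separates init-only sections from persistent ones.

#include <elf.h>
#include <stdint.h>
#include <string.h>

/* final_block is assumed to already be large enough to hold every
 * SHF_ALLOC section of the module. */
static void copy_alloc_sections(char *final_block, const char *tmp_image,
                                const Elf64_Ehdr *hdr)
{
    const Elf64_Shdr *shdrs = (const Elf64_Shdr *)(tmp_image + hdr->e_shoff);
    uint64_t offset = 0;

    for (unsigned int i = 1; i < hdr->e_shnum; i++) {
        const Elf64_Shdr *s = &shdrs[i];
        uint64_t align = s->sh_addralign ? s->sh_addralign : 1;

        if (!(s->sh_flags & SHF_ALLOC))
            continue;                              /* not needed at run time        */

        offset = (offset + align - 1) & ~(align - 1);    /* honor section alignment */

        if (s->sh_type != SHT_NOBITS)              /* .bss has no bytes in the file */
            memcpy(final_block + offset, tmp_image + s->sh_offset, s->sh_size);

        offset += s->sh_size;
    }
}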
Each remaining section of the new module is then iterated, and relocations are performed. This step is architecture dependent and therefore relies on helper functions defined for that architecture (./linux/arch/<arch>/kernel/module.c). Finally, the instruction cache is flushed (because the temporary .text sections were used), a bit more housekeeping is performed (freeing the temporary module memory, setting up sysfs), and load_module finally returns the new module.
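For a sense of what one of those architecture helpers does, here is a hedged sketch of applying the simplest x86-64 relocation type: the resolved symbol value plus the addend is written at the location the relocation entry points to. The function name is illustrative, and a real architecture helper handles many more relocation types.

#include <elf.h>
#include <stdint.h>

static void apply_one_rela(char *section_base, const Elf64_Rela *rel,
                           uint64_t symbol_value)
{
    uint64_t *where = (uint64_t *)(section_base + rel->r_offset);

    if (ELF64_R_TYPE(rel->r_info) == R_X86_64_64)
        *where = symbol_value + rel->r_addend;     /* direct 64-bit address */
}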
Unloading the module is essentially a mirror of the load process, except that several sanity checks must occur to ensure safe removal of the module. Unloading a module begins in user space with the invocation of the rmmod (remove module) command. Inside the rmmod command, a system call is made to delete_module, which eventually results in a call to sys_delete_module inside the kernel (recall from Figure 3). Figure 5 illustrates the basic operation of the module removal process.

Figure 5. The internal (simplified) module unloading process 
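On the user-space side, the core of rmmod is little more than a single system call; a minimal sketch follows, with the module name hello being illustrative.

#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    /* O_NONBLOCK asks the kernel to fail immediately rather than wait
     * if the module is still in use. Requires CAP_SYS_MODULE. */
    if (syscall(SYS_delete_module, "hello", O_NONBLOCK) != 0) {
        perror("delete_module");
        return 1;
    }
    return 0;
}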
When the kernel function sys_delete_module is invoked (with the name of the module to be removed passed in as the argument), the first step is to ensure that the caller has permission. Next, a list is checked to see whether any other modules depend on this module. There is a list called modules_which_use_me that contains an element per dependent module. If this list is empty, no module dependencies exist and the module is a candidate for removal (otherwise, an error is returned). The next test is to see whether the module is live. Nothing prohibits a user from calling rmmod on a module that's currently being installed, so this check ensures that the module has finished loading. After a few more housekeeping checks, the penultimate step is to call the module's exit function (provided within the module itself). Finally, the free_module function is called.
When free_module is called, the module has been found to be safely removable. No dependencies exist on the module, and the process of cleaning up the kernel for this module can begin. This process begins by removing the module from the various lists that it was placed on during installation (sysfs, the module list, and so on). Next, an architecture-specific cleanup routine is invoked (which can be found in ./linux/arch/<arch>/kernel/module.c). The modules that this module depended on are then iterated, and this module is removed from their dependents lists. Finally, with the cleanup complete from the kernel's perspective, the various memory that was allocated for the module is freed, including the argument memory, per-CPU memory, and the module ELF memory (core and init).
In many applications, the ability to dynamically load modules is important, but once loaded, it's not necessary for the modules to be unloaded. This allows the kernel to be dynamic at startup (loading modules based on the devices that are found) but not dynamic throughout operation. If you never need to unload a module after it's loaded, you can make several optimizations to reduce the amount of code needed for module management. You can "unset" the kernel configuration option CONFIG_MODULE_UNLOAD to remove a considerable amount of kernel functionality related to module unloading.
