230 likes | 265 Views
The kernel’s task list. Introduction to process descriptors and their related data-structures for Linux kernel version 2.6.10. Multi-tasking. Modern operating systems allow multiple users to share a computer’s resources Users are allowed to run multiple tasks
E N D
The kernel’s task list Introduction to process descriptors and their related data-structures for Linux kernel version 2.6.10
Multi-tasking • Modern operating systems allow multiple users to share a computer’s resources • Users are allowed to run multiple tasks • The OS kernel must protect each task from interference by other tasks, while allowing every task to take its turn using some of the processor’s available time
Stacks and task-descriptors • To manage multitasking, the OS needs to use a data-structure which can keep track of every task’s progress and usage of the computer’s available resourcres (physical memory, open files, pending signals, etc.) • Such a data-structure is called a ‘process descriptor’ – every active task needs one • Every task needs its own ‘private’ stack
What’s on a program’s stack? Upon entering ‘main()’: • A program’s exit-address is on user stack • Command-line arguments on user stack • Environment variables are on user stack During execution of ‘main()’: • Function parameters and return-addresses • Storage locations for ‘automatic’ variables
Entering the kernel… A user process enters ‘kernel-mode’: • when it decides to execute a system-call • when it is ‘interrupted’ (e.g. by the timer) • when ‘exceptions’ occur (e.g. divide by 0)
Switching to a different stack • Entering kernel-mode involves not only a ‘privilege-level transition’ (from level 3 to level 0), but also a stack-area ‘switch’ • This is necessary for robustness: e.g., user-mode stack might be exhausted • This is desirable for security: e.g, privileged data might be accessible
What’s on the kernel stack? Upon entering kernel-mode: • task’s registers are saved on kernel stack (e.g., address of task’s user-mode stack) During execution of kernel functions: • Function parameters and return-addresses • Storage locations for ‘automatic’ variables
Supporting structures • So every task, in addition to having its own code and data, will also have a stack-area that is located in user-space, plus another stack-area that is located in kernel-space • Each task also has a process-descriptor which is accessible only in kernel-space
A task’s virtual-memory layout Process descriptor and kernel-mode stack User space Kernel space Privilege-level 0 User-mode stack-area Privilege-level 3 Task’s code and data
Something new in 2.6 • Linux uses part of a task’s kernel-stack page-frame to store ‘thread information’ • The thread-info includes a pointer to the task’s process-descriptor data-structure Task’s kernel-stack struct task_struct Task’s process-descriptor Task’s thread-info kernel page-frame
Tasks have ’states’ • From kernel-header: <linux/sched.h> • #define TASK_RUNNING 0 • #define TASK_INTERRUPTIBLE 1 • #define TASK_UNINTERRUPTIBLE 2 • #define TASK_ZOMBIE 4 • #define TASK_STOPPED 8
Fields in a process-descriptor struct task_struct { volatile long state; struct thread_into *thread_info; unsigned long flags; struct mm_struct *mm; pid_t pid; char comm[16]; /* plus many other fields */ };
Finding a task’s ‘thread-info’ • During a task’s execution in kernel-mode, it’s very quick to find that task’s thread-info object • Just use two assembly-language instructions: movl $0xFFFFF000, %eax andl %esp, %eax Ok, now %eax = the thread-info’s base-address There’s a macro that implements this computation
Finding the task-descriptor • Use a macro ‘current_thread_info()’ to get a pointer to the ‘thread_info’ structure: struct thread_info *info = current_thread_info(); • Then one more step gets you the address of the task’s process-descriptor: struct task_struct *task = info->task; • You can also use ‘current’ to perform this two-step assignment: task = current;
Parenthood • New tasks get created by calling ‘fork()’ • Old tasks get terminated by calling ‘exit()’ • When ‘fork()’ is called, two tasks return • One task is known as the ‘parent’ process • And the other is called the ‘child’ process • The kernel keeps track of this relationship
A parent can have many children • If a user task calls ‘fork()’ twice, that will create two distinct ‘child’ processes • These children are called ‘siblings’ • Kernel track of all this with lists of pointers
Parenthood relationships P1 P2 P3 P4 P5 See “Linux Kernel Programming” (Chapter 3) for additional details
The kernel’s ‘task-list’ • Kernel keeps a list of process descriptors • A ‘doubly-linked’ circular list is used • The ‘init_task’ serves as a fixed header • Other tasks inserted/deleted dynamically • Tasks have forward & backward pointers, implemented as fields in the ‘tasks’ field • To go forward: task = next_task( task ); • To go backward: task = prev_task( task );
Doubly-linked circular list next_task init_task (pid=0) newest task … prev_task
Demo • We can write a module that lets us create a pseudo-file (named ‘/proc/tasklist’) for viewing the list of all currently active tasks • Our ‘tasklist.c’ module shows the name and process-ID of each task, along with the task’s current state • Use the command: $ cat /proc/tasklist
In-class exercise #1 • Different versions of the 2.6 Linux kernel use slightly different definitions for these task-related kernel data-structures (e.g, our 2.6.10 kernel uses a smaller-sized ‘thread-info’ structure than 2.6.9 did) • So can you write an installable kernel module that will tell you: • the size of a ‘task_struct’ object (in bytes)? • the size of a ‘thread_info’ object (in bytes)?
‘Kernel threads’ • Some tasks don’t have a page-directory of their own – because they don’t need one • They can just ‘borrow’ the page-dirtectory that belongs to another task • These ‘kernel thread’ tasks will have an NULL value (i.e., zero) stored in the ‘mm’ field of their ‘task_struct’ descriptor
In-class exercise #2 • Can you modify our ‘tasklist.c’ module so it will display a list of only those tasks which are kernel threads?