This lab is split into four parts. In the first part, you will add multiprocessor support to mCertiKOS. Next, you will implement a preemptive scheduler. Then, you will make some designated parts of the kernel preemptible by turning on interrupts during those parts of the kernel code. Last, you will be asked to design and implement a solution to the producer-consumer problem (also known as the bounded-buffer problem) using shared objects and condition variables. Unlike previous labs, you are allowed to make arbitrary changes to the kernel source, including adding new source files, modifying any existing files, and removing existing files if necessary.
* Since you are now dealing with concurrent threads, debugging may be significantly more difficult. START EARLY!
/u/318/code/lab4/ on the courselab machines. You do not need to copy in any additional files this time.
Include the following information in the file called README: the name and netid of both partners (the submitting partner should put their information first), a brief description of what you have implemented, what works and what you did not get to work, and whether you attempted the extra credit.

When you are ready to submit, compress your lab4 code directory into a tarball with:

tar -czvf lab4.tar.gz lab4

and submit it on Dropbox.
Exercise 1
We encourage you to (re)read Section 2.3 of the textbook "Interprocess Communication". In particular, the sections describing Mutexes, Condition Variables, Monitors, and respective implementations of the Bounded Buffer / Producer Consumer problem will be most helpful.
In order to facilitate testing implementations of various topics
introduced in this assignment (e.g., multicore, preemption in both
kernel and user), we have developed an initial test setup for you. If
you check out user/pingpong/ping.c and user/pingpong/pong.c, you will see that we have written two user programs that call two system calls, produce and consume, at different speeds.
These system calls, defined in kern/trap/TSyscall/TSyscall.c, simply print out 5 numbers in a loop.
In kern/init/init.c, we created two producer processes on CPU #1 and two consumer processes on CPU #2.
With this setup, you will see different levels of message interleaving as you implement more and more parts of this assignment. For example, once you have implemented the first part (multicore support), you will see that the producer outputs and consumer outputs can interleave arbitrarily. However, you will notice that only one user process on each processor ever gets started, and if you look at the outputs from one particular CPU alone, they are perfectly ordered, as shown in the sample output below.
*As in the previous assignments, the provided program will not run until you implement required modules.
CPU1: process ping1 1 is created.
CPU1: process ping2 3 is created.
CPU2: process pong1 2 is created.
From cpu 1: ping started.
CPU2: process pong2 4 is created.
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 1
From cpu 2: pong started.
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 4
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 2
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 3
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 4
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 4
...
You may have already guessed why the other process for each processor does not get any chance to run.
If you check out user/lib/entry.S, you will see that we no longer wrap each user program in an infinite sequence of yield calls. Instead, every process just spins once its main function returns. With the current cooperative multitasking scheme, the first process on each processor will never release the CPU to the other process, since there is no explicit call to the function yield.
To fix this issue, we need a preemptive scheduler, which we ask you to implement in part 2 of this
assignment. After part 2, you will see the outputs from both user processes on each CPU, and the
set of outputs from each process may alternate, meaning the processes get preempted during their
executions.
...
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 4
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 0
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 2
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 3
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 2: Consumed 4
From cpu 1: ping started.
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 4
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 4
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 4: Produced 4
From cpu 2: pong started.
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 3: Consumed 0
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 3: Consumed 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 3: Consumed 2
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 3: Consumed 3
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 3: Consumed 4
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 4
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 1: Produced 1
...
So, this is much better. On the other hand, if you check each individual process's output, the printed numbers from 0 to 4 are always consecutive, which means this part of the code was never preempted. This is because we always disable interrupts when the user traps into the kernel (check the first line of the implementation of the assembly function _alltraps in kern/dev/idt.S). Since interrupts are always turned off inside the kernel, when the system call handlers (e.g., sys_produce or sys_consume) print out the numbers 0-4 in a loop, the timer interrupt cannot occur, so they cannot be preempted.
In part 3, you will be asked to enable interrupts in the system call handlers for the producer and consumer.
After part 3, you should see that all the printed numbers are randomly interleaved as expected:
CPU1: process ping1 2 is created.
CPU2: process pong1 1 is created.
CPU2: process pong2 4 is created.
CPU1: process ping2 3 is created.
From cpu 2: pong started.
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 0
From cpu 1: ping started.
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 1
From cpu 2: pong started.
From cpu 1: ping started.
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 0
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 2
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 3: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 3: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 2: Produced 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 2: Produced 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 2
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 3
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 4
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 0
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 4
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 0
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 1
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 2: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 2: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 2: Produced 4
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 2
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 3
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 2
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 3
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 4
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 0
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 3: Produced 2
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 3: Produced 3
[D] kern/trap/TSyscall/TSyscall.c:147: CPU 1: Process 3: Produced 4
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 1: Consumed 1
[D] kern/trap/TSyscall/TSyscall.c:158: CPU 2: Process 4: Consumed 4
...
We have developed this simple test setup so that you can test your implementations in the first three parts before you dive into the actual implementations of producer-consumer in part 4. If you start directly on part 4 without fully testing the first three parts step by step, debugging could become a nightmare because it may not be clear where the bugs are located. In part 4, you can freely design your kernel structure.
We do not provide any test scripts for this part of the assignment. For some parts, writing comprehensive test cases would leak too much information about the required implementation. For many other parts, you should by now be quite familiar with how to write good test scripts, so we no longer provide starter scripts as in previous assignments. On the other hand, we encourage you to write your own test scripts to challenge your implementation. The more of your own code is covered by your test cases, the less likely it is that bugs will be found during the actual grading.
We are going to make mCertiKOS support "symmetric multiprocessing" (SMP), a multiprocessor model in which all CPUs have equivalent access to system resources such as memory and I/O buses. While all CPUs are functionally identical in SMP, during the boot process they can be classified into two types: the bootstrap processor (BSP) is responsible for initializing the system and for booting the operating system; and the application processors (APs) are activated by the BSP only after the operating system is up and running. Which processor is the BSP is determined by the hardware and the BIOS. Up to this point, all your existing mCertiKOS code has been running on the BSP.
In an SMP system, each CPU has an accompanying
local APIC (LAPIC) unit.
The LAPIC units are responsible
for delivering interrupts throughout the system.
The LAPIC also provides its connected CPU with a unique identifier.
An implementation of the LAPIC driver can be found in kern/dev/lapic.h
and kern/dev/lapic.c
. It is ok if you do not understand the code completely.
Before booting up APs,
the BSP should first collect information about the multiprocessor system,
such as the total number of CPUs, their APIC IDs
and the MMIO address of the LAPIC unit.
This is done by functions defined in kern/dev/pcpu_mp.c
.
Don't panic if you cannot understand some of the code in the file.
Once you've skimmed through the file, further read kern/pcpu/PCPUIntro/PCPUIntro.c
,
kern/pcpu/PCPUInit/PCPUInit.c
, and kern/init/init.c
to try to understand
the control flow transfer during the bootstrap of APs.
When writing a multiprocessor OS, it is important to distinguish between per-CPU state that is private to each processor, and global state that the whole system shares.
Here is the per-CPU state you should be aware of:
- Per-CPU kernel data: the pcpu structure defined in kern/pcpu/PCPUIntro/PCPUIntro.c.
- Per-CPU kernel stacks: struct kstack bsp_kstack[NUM_CPUS], defined in kern/clib/kstack.h, reserves space for NUM_CPUS's worth of kernel bootstrap stacks. Once the kernel is started, each kernel thread uses its own stack, allocated in struct kstack proc_kstack[NUM_IDS], defined in the same file.
- Per-CPU TSS and TSS descriptor: a per-CPU task state segment (TSS) is also needed in order to specify where each CPU's kernel stack lives. You can refer to kern/lib/seg.c for how the TSS is initialized for each CPU.
- Per-CPU current thread ID: since each CPU can run a different user process simultaneously, we need to remember the ID of the currently running thread on each CPU. Read kern/thread/PCurID/PCurID.c for more details.
- Per-CPU thread queues: using a single thread ready queue would eliminate concurrency in the scheduler, since at any moment at most one CPU's scheduler could access the ready queue. Instead, we allocate one ready queue for each scheduler on each processor. Check kern/thread/PTQueueIntro/PTQueueIntro.c for more details.
- Per-CPU trap handlers: because multiple CPUs can trap into the kernel simultaneously, we need to register the appropriate trap handlers for each CPU. Part of the code is implemented in kern/trap/TTrapInit/TTrapInit.c.
Exercise 2
Complete the implementation of trap_init in kern/trap/TTrapInit/TTrapInit.c by registering the appropriate trap handler for each trap number. You can refer to kern/dev/intr.h for the list of trap numbers.
We are now done with initializing the per-CPU state in the kernel. Before going any further, we need to address race conditions that arise when multiple CPUs run kernel code simultaneously. We would like to prevent the situation where multiple processes try to write to global, system-shared state at the same time. The simplest way to achieve this is to use a big kernel lock: a single global lock that is held whenever a process enters kernel mode and released when the process returns to user mode. In this model, processes in user mode can run concurrently on any available CPUs, but no more than one process can run in kernel mode at a time; any other process that tries to enter kernel mode is forced to wait. The big kernel lock is simple and easy to use. Nevertheless, it eliminates all concurrency in kernel mode. Most modern operating systems instead use different locks to protect different parts of their shared state, an approach called fine-grained locking. Fine-grained locking can increase performance significantly, but it is more difficult to implement and more error-prone. mCertiKOS implements fine-grained locking. In this section, you will be asked to add locks to various places in the kernel to ensure the correctness of the concurrent mCertiKOS kernel.
Exercise 3
This exercise is to help you get familiar with the locking implemented in mCertiKOS. You do not need to write any code for this exercise. We have already implemented a spinlock for you, which can be found in
kern/lib/spinlock.c
. In any of the subsequent exercises, you are free to use our implementation of spinlock wherever it is convenient. We have also added locks to protect the physical memory allocators and the virtual memory allocators. Go through the relevant code to make sure you understand them.
Exercise 4
If you have not done so already, read the following code carefully to understand how the pieces relate to each other: kern/dev/serial.c, kern/dev/console.c, kern/lib/dprintf.c, and kern/lib/debug.c. Once you have figured out how they work together, add locks as appropriate to their implementations. You should try to use multiple locks to improve performance, i.e., we do not want two unrelated jobs waiting on the same lock. This is a warning that simply adding locks to every function WILL NOT WORK.
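One plausible structure (an assumption about a reasonable design, not the required answer) is to layer the locks: each output path protects its own shared buffer with its own lock, while dprintf holds a separate, higher-level lock for a whole formatted message so concurrent messages cannot interleave character by character. A single-threaded sketch with hypothetical names (console_putc, dprintf_str) and locks modeled as flags:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of lock layering on the print path. The real files
 * use their own identifiers; locks are modeled as plain flags because
 * this demo is single-threaded. */
static int console_lock;   /* protects the shared console buffer only */
static int dprintf_lock;   /* held for a whole message, so lines don't mix */

static char console_buf[256];
static size_t console_len;

static void console_putc(char c)
{
    console_lock = 1;               /* spinlock_acquire(&console_lock); */
    console_buf[console_len++] = c; /* the only state this lock guards */
    console_lock = 0;               /* spinlock_release(&console_lock); */
}

static void dprintf_str(const char *s)
{
    dprintf_lock = 1;               /* acquire once per message, not per char */
    while (*s)
        console_putc(*s++);
    dprintf_lock = 0;
}
```

The point of the two distinct locks is exactly the exercise's warning: a job that only touches, say, the serial port should never have to wait on the console lock, and vice versa.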
This completes part 1. Compile and run your OS. Check the section on Test the User-Process Setup above to see the expected outcome.

Exercise 5
Similarly, check out the scheduler implementation in
kern/thread/PThread/PThread.c
, and add locks in appropriate places.
In this part of the assignment, you will modify the kernel to preempt uncooperative processes.
As described in the section on Test the User-Process Setup,
up until now, the first process on each CPU simply spins forever in a
tight loop (see user/lib/entry.S
)
after the main function returns, and thus, the second process never gains the CPU.
This is obviously not an ideal situation
in terms of protecting the system
from bugs or malicious code in user-mode processes,
because any user-mode process can bring the whole system to a halt
simply by getting into an infinite loop
and never giving back the CPU.
In order to allow the kernel to preempt a running process,
forcefully retaking control of the CPU from it,
the mCertiKOS kernel
must support external hardware interrupts from the clock hardware.
External interrupts (i.e., device interrupts) are referred to as IRQs.
There are 16 possible IRQs, numbered 0 through 15.
The mapping from IRQ number to IDT entry is not fixed.
intr_init_idt
in kern/dev/intr.c
maps IRQs 0-15
to IDT entries T_IRQ0
through T_IRQ0+15
.
In kern/dev/intr.h
, T_IRQ0
is defined to be decimal 32.
Thus the IDT entries 32-47 correspond to the IRQs 0-15.
For example, the clock interrupt is IRQ 0.
Thus, idt[T_IRQ0+0] (i.e., idt[32]) contains
the address of the clock's interrupt handler routine in the kernel.
This T_IRQ0
is chosen so that the device interrupts
do not overlap with the processor exceptions,
which could obviously cause confusion.
(In fact, in the early days of PCs running MS-DOS,
the T_IRQ0
effectively was zero,
which caused massive confusion
between handling hardware interrupts and handling processor exceptions!)
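The mapping described above is just an offset, and writing it down as code makes it easy to check (the IRQ_SERIAL = 4 value below reflects the conventional COM1 IRQ line; the helper name is illustrative):

```c
#include <assert.h>

/* IRQ-to-IDT-vector mapping as described in kern/dev/intr.h:
 * T_IRQ0 is 32, so IRQ n lands at IDT entry 32 + n, keeping device
 * interrupts clear of processor exception vectors 0-31. */
enum { T_IRQ0 = 32, IRQ_TIMER = 0, IRQ_SERIAL = 4 };

static int irq_to_vector(int irq)
{
    return T_IRQ0 + irq;   /* valid for irq in 0..15 */
}
```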
In this part of the assignment, we make a key simplification: external device interrupts are always disabled when in the kernel, but they are always enabled when in user space. We will relax this restriction by allowing external interrupts in some parts of the kernel in Part 3.
External interrupts are controlled by the FL_IF
flag bit
of the %eflags
register.
When this bit is set, external interrupts are enabled.
Once a user process traps into the kernel, we immediately disable interrupts (check the cli instruction in the first line of the function _alltraps in kern/dev/idt.S; cli is the x86 instruction that disables interrupts).
Note that by the time we reach _alltraps, the CPU has already saved the old %eflags register in the current trap frame, pushed onto the current thread's kernel stack as specified in the TSS. Thus, when we return to the user with the iret instruction, the %eflags register is restored with the original value saved in the trap frame, and so interrupts are turned back on when we reach user space.
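Concretely, FL_IF is bit 9 (0x200) of %eflags, and iret restores the saved %eflags wholesale, so whether interrupts come back on is simply whether that bit was set in the saved value. A tiny sketch (the field is passed as a plain integer here; the real trap frame layout is in the kernel headers):

```c
#include <assert.h>
#include <stdint.h>

#define FL_IF 0x00000200u   /* interrupt-enable bit of %eflags (bit 9) */

/* Would returning to this saved context re-enable interrupts?
 * iret restores the saved %eflags as-is, so the answer is just whether
 * FL_IF was set when the CPU pushed the trap frame. */
static int returns_with_interrupts_enabled(uint32_t saved_eflags)
{
    return (saved_eflags & FL_IF) != 0;
}
```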
We need to program the hardware to generate clock interrupts periodically, which will force control back to the kernel where we can switch control to a different user process.
The call to lapic_init, which we have written for you in kern/dev/lapic.c, sets up the clock and the interrupt controller to generate interrupts.
You now need to write the code to handle these interrupts.
Now, check whether your current program execution meets the expected output described in Test the User-Process Setup.

Exercise 6
Modify the function timer_intr_handler in kern/trap/TTrapHandler/TTrapHandler.c to switch to another thread every SCHED_SLICE milliseconds, defined in kern/lib/thread.h. The constant LAPIC_TIMER_INTR_FREQ (kern/dev/lapic.h) defines the frequency of the LAPIC timer interrupt (# interrupts per second). You should first implement a function named sched_update in kern/thread/PThread/PThread.c that records (for each CPU) the number of milliseconds that have elapsed since the last thread switch. Once that reaches SCHED_SLICE milliseconds, you should yield the current thread to another ready thread by calling thread_yield, and reset the counter. Add locking statements as necessary. Finally, in timer_intr_handler, you can simply invoke this new function.
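One possible shape for sched_update is sketched below. The per-CPU counter array, the ms-per-tick computation, and the constant values are all assumptions made so the sketch runs standalone; the locking is shown only as comments, and thread_yield is stubbed out with a counter:

```c
#include <assert.h>

/* Assumed stand-ins for kern/lib/thread.h and kern/dev/lapic.h values. */
#define NUM_CPUS 8
#define SCHED_SLICE 20              /* ms per time slice (assumed value) */
#define LAPIC_TIMER_INTR_FREQ 1000  /* timer interrupts per second (assumed) */

static unsigned elapsed_ms[NUM_CPUS];  /* per-CPU ms since last switch */
static int yield_count;                /* stands in for thread_yield() */

static void thread_yield(void) { yield_count++; }

/* Called from timer_intr_handler on every LAPIC timer tick. */
static void sched_update(int cpu)
{
    /* spinlock_acquire(&sched_lk[cpu]);  -- protect the per-CPU counter */
    elapsed_ms[cpu] += 1000 / LAPIC_TIMER_INTR_FREQ;  /* ms per tick */
    if (elapsed_ms[cpu] >= SCHED_SLICE) {
        elapsed_ms[cpu] = 0;       /* reset before giving up the CPU */
        /* spinlock_release(&sched_lk[cpu]); */
        thread_yield();
        return;
    }
    /* spinlock_release(&sched_lk[cpu]); */
}

/* Helper for experimentation: deliver n ticks to one CPU. */
static int run_ticks(int cpu, int n)
{
    for (int i = 0; i < n; i++)
        sched_update(cpu);
    return yield_count;
}
```

With a 1000 Hz timer and a 20 ms slice, the sketch yields exactly once every 20 ticks, which is the behavior Exercise 6 asks for.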
Exercise 7
Modify syscall_dispatch to temporarily enable interrupts during the executions of sys_produce and sys_consume. You will find intr_local_enable and intr_local_disable, defined in kern/dev/intr.h, helpful.
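The change in syscall_dispatch might look like the following sketch, where the interrupt state is modeled by a flag and the syscall numbers and handler bodies are assumptions for illustration:

```c
#include <assert.h>

/* Hypothetical syscall numbers; the real ones live in kern/lib/syscall.h. */
enum { SYS_produce = 1, SYS_consume = 2, SYS_other = 3 };

static int intr_enabled;                      /* models the FL_IF state */
static void intr_local_enable(void)  { intr_enabled = 1; }
static void intr_local_disable(void) { intr_enabled = 0; }

static int ran_with_interrupts;               /* what the handler observed */
static void sys_produce(void) { ran_with_interrupts = intr_enabled; }
static void sys_consume(void) { ran_with_interrupts = intr_enabled; }
static void sys_other(void)   { ran_with_interrupts = intr_enabled; }

/* Enable interrupts only around the two designated handlers. */
static void syscall_dispatch(int nr)
{
    switch (nr) {
    case SYS_produce:
        intr_local_enable();
        sys_produce();
        intr_local_disable();  /* restore the interrupts-off kernel invariant */
        break;
    case SYS_consume:
        intr_local_enable();
        sys_consume();
        intr_local_disable();
        break;
    default:
        sys_other();           /* everything else stays uninterruptible */
    }
}
```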
There are a couple of things that you need to consider when you turn on the interrupts in the kernel.
First, our old way of saving/restoring the trap frames (see assignment 3) no longer works.
In the last assignment, when a user program traps into the kernel, in the function trap
,
we save the current trap frame into memory (uctx_pool
). From then on, whenever we need to
read the values in the trap frame (to retrieve the system call number and function arguments), or
write values to the trap frame (to set the error number and return values), we directly manipulate
uctx_pool
, instead of the actual trap frame pushed onto
the kernel stack. Later when we return to the user, we restore all
the user state from uctx_pool as well. Doing so has some advantages. For example, the trap frame is saved in memory instead of on the stack, so it is more secure; e.g., it cannot be overwritten by an unexpected stack overflow. Furthermore, the trap frame is saved in global memory, which can be accessed by any part of the kernel code, so you do not have to pass trap frame pointers along everywhere, making the implementation cleaner. Also, passing pointers to stack-allocated variables in C is generally considered dangerous.
However, this scheme no longer works as soon as you enable interrupts in any part of the kernel.
Consider the scenario where the user called the produce
system call, and now we are in the
function trap
. Then we save the current trap frame into uctx_pool
, then
dispatch the request to the handler sys_produce
. At this moment, you reenable interrupts,
and the kernel execution is interrupted by a timer interrupt. You are now back in the function
trap
again to handle this timer interrupt. Note that here we have nested interrupts,
a timer interrupt nested inside a user system call. Now in this trap
function, you
save the current trap frame into uctx_pool
... wait, you overwrote the existing trap frame
for the outer system call!
You may think that this is easy to solve: you just need to define a
kictx_pool
and save the trap frame into this location if you have detected that the interrupt
was triggered inside the kernel (assuming there's an easy way to detect this). This seems to work because
in the current setup, the maximum depth of nested interrupt calls is only two. This is because we
always turn off the interrupt whenever there is a trap, and we only temporarily reenable it for the
system call handlers of producer and consumer. Thus, when there is a timer interrupt in the kernel,
it goes to the interrupt handler and this part of the code cannot be interrupted.
Unfortunately, this solution does not work either. The simple reason is that it does not restore
the correct kernel stack pointer. Check the function proc_start_user
in kern/proc/PProc/PProc.c
on how we went back to the user from the kernel. We did it by calling trap_return
to the
saved trap frame in uctx_pool
. Now check the implementation of trap_return
in kern/dev/idt.S
. Specifically, check the first line in the function body. We are setting
the current %esp
register to the address of the argument (uctx_pool
entry).
Then we restore various registers for the user program through the pop
instructions.
As you may have already noticed, we are changing our kernel stack pointer to the address of one of the
entries in uctx_pool
. Previously, this was OK because nested interrupts were not possible, and
every trap was triggered by the user. Thus, inside the trap handler, it was OK to "ruin" the kernel stack pointer,
because we are about to return to the user, and the next time there is another trap from the user, the CPU will set the kernel stack pointer back to the appropriate STACK_TOP configured in the TSS!
However, if an interrupt is triggered in the kernel, you definitely do not want to change the stack
pointer to a random memory location.
Well, you may now think that this problem can be solved if we can somehow manage to set the %esp
register back to the original value in the case that we are returning back to the kernel instead of the user.
This may work, but the implementation would not be trivial. Even worse, if we allowed nested interrupts of greater depth, e.g., if we enabled interrupts in the interrupt handlers themselves, then you would need to turn the kictx_pool entries into linked lists.
Instead of tackling all these complications, we have made a simple decision. We just use the original trap frame
saved in the stack and pass the trap frame pointers along. If you check out the current implementation, you will
notice that we only use uctx_pool
for process creation. In the function trap
,
we no longer save the current trap frame into memory. Instead, we pass the trap frame pointer to all the trap handling functions, and those functions further pass the pointer to subsequent functions, such as the ones that retrieve the system call arguments. After the trap is handled completely, we simply call trap_return on the current trap frame on the stack. This way, after the trap returns, the stack will be the same as before.
And it works perfectly with nested interrupts, regardless of nesting depth. With nested interrupts, the trap
frames just pile up onto the current stack and will eventually return back to the original stack as the interrupts
are handled.
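The pointer-passing scheme can be mimicked in plain C: each trap, nested or not, gets its own frame on the current stack, handlers receive a pointer to their own frame, and returning simply unwinds. The frame layout and names below are toy stand-ins for the real structures:

```c
#include <assert.h>
#include <stdint.h>

/* A toy trap frame; the real one in mCertiKOS has many more fields. */
struct trapframe {
    uint32_t trapno;
    uint32_t eax;     /* syscall return value goes here */
};

static int max_depth;

/* Handle one trap; 'depth' simulates a nested timer interrupt arriving
 * while the outer system call handler runs with interrupts enabled. */
static void trap(struct trapframe *tf, int depth)
{
    if (depth > max_depth)
        max_depth = depth;
    if (depth < 2) {
        /* Nested interrupt: a fresh frame is pushed on the SAME stack,
         * so the outer frame below it is left untouched. */
        struct trapframe nested = { .trapno = 32 /* timer */, .eax = 0 };
        trap(&nested, depth + 1);
    }
    tf->eax = 100 + tf->trapno;   /* write the result into OUR frame only */
}
```

Because every handler writes only through its own frame pointer, the inner timer interrupt cannot clobber the outer system call's saved state, which is exactly why this scheme survives nesting where the uctx_pool approach did not.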
So now we have solved the trap frame saving/restoring issue for
you. But there will be another issue. Try to run your program, and
you will most likely notice that your kernel hangs in the middle of some calls to the print function. So what else could go wrong by
turning on interrupts for these two fixed places in the kernel?
The issue comes from the print calls inside sys_produce and sys_consume. Consider the case where you are in the
middle of executing one of those print statements. At this moment,
your thread holds the debugging lock used by the print statement. Then
your execution is preempted, and the scheduler schedules another
thread. That thread may be executing something other than producer or
consumer system call handling code (thus the interrupt is off) and it
may also want to print something to the screen. Since the debugging spinlock is held by another thread, and interrupts are turned off, the current thread will spin forever hoping that some other thread will release the lock, but the thread that holds the lock will never get a chance to run unless the current thread gives up the CPU.
The main problem here is that we intended to turn on interrupts only within those two system call handlers. Therefore, the rest of the kernel modules (including the debugging modules) were written under the assumption that interrupts are always turned off.
Thus, the obvious solution is to turn off interrupts temporarily whenever sys_produce or sys_consume tries to call any other kernel function, and to turn interrupts back on once the function returns.
At this stage, you should be able to run the program and see the desired output (see the section on Test the User-Process Setup). Lastly, here's one more improvement we can add to make our kernel safer:

Exercise 8
Modify
sys_produce and sys_consume to temporarily disable interrupts whenever they call any printing function, and to turn interrupts back on afterwards.
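Inside the handler, that pattern looks roughly like the sketch below. The interrupt state is modeled by a flag and the print function by a recording stub so the sketch runs standalone; all names besides sys_produce are illustrative:

```c
#include <assert.h>

static int intr_enabled = 1;   /* handler body runs interrupts-on (Part 3) */
static void intr_local_enable(void)  { intr_enabled = 1; }
static void intr_local_disable(void) { intr_enabled = 0; }

static int printed_with_intr_off = 1; /* did every print run interrupts-off? */
static int print_calls;

/* Stand-in for the kernel's debug print: records whether it was called
 * with interrupts disabled, i.e., without the spin-forever hazard. */
static void kern_print(int value)
{
    (void)value;
    print_calls++;
    if (intr_enabled)
        printed_with_intr_off = 0;
}

/* Sketch of the Exercise 8 pattern inside sys_produce. */
static void sys_produce(void)
{
    for (int i = 0; i < 5; i++) {
        intr_local_disable();   /* don't hold the print lock preemptibly */
        kern_print(i);
        intr_local_enable();    /* the rest of the loop stays preemptible */
    }
}
```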
Exercise 9
An operating system should guarantee that, under all possible arguments, no system call fails ungracefully. Thus, the system call handlers are responsible for performing argument validity checks and returning the appropriate error code if any of the checks fail. Currently in mCertiKOS, it is still possible for a call to sys_spawn to go wrong. We have added three more error codes in kern/lib/syscall.h: E_EXCEEDS_QUOTA, E_MAX_NUM_CHILDEN_REACHED, and E_INVAL_CHILD_ID. They should be self-explanatory from their names. You are asked to enhance the implementation of sys_spawn to perform argument checking that detects possible errors and sets the appropriate error codes, making sure that calls to sys_spawn, under all possible arguments, never fail ungracefully. You should be able to do this solely by changing the implementation of sys_spawn.
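A hedged sketch of the kind of checks sys_spawn might perform is below. The error code values, the quota and child-count field names, the limits, and the argument meanings are all assumptions about the real kernel structures; only the three error code names come from kern/lib/syscall.h:

```c
#include <assert.h>

/* Hypothetical error code values; kern/lib/syscall.h defines the real
 * ones (the names E_EXCEEDS_QUOTA, E_MAX_NUM_CHILDEN_REACHED, and
 * E_INVAL_CHILD_ID are from the lab text). */
enum { E_SUCC = 0, E_EXCEEDS_QUOTA = 1,
       E_MAX_NUM_CHILDEN_REACHED = 2, E_INVAL_CHILD_ID = 3 };

#define MAX_CHILDREN 3          /* assumed per-process child limit */
#define NUM_IDS      64         /* assumed number of process slots */

struct proc {
    unsigned pages_avail;       /* remaining memory quota (assumed field) */
    unsigned nchildren;         /* children spawned so far (assumed field) */
};

/* Validate a spawn request before doing any real work. */
static int sys_spawn_check(const struct proc *parent, unsigned quota,
                           unsigned child_id)
{
    if (quota > parent->pages_avail)
        return E_EXCEEDS_QUOTA;           /* can't give what we don't have */
    if (parent->nchildren >= MAX_CHILDREN)
        return E_MAX_NUM_CHILDEN_REACHED;
    if (child_id >= NUM_IDS)
        return E_INVAL_CHILD_ID;          /* slot index out of range */
    return E_SUCC;
}
```

The important property is that every failure path returns an error code before any state has been modified, so a bad argument can never leave the kernel half-way through a spawn.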
From Wikipedia:
In computing, the producer–consumer problem (also known as the bounded-buffer problem) is a classic example of a multi-process synchronization problem. The problem describes two processes, the producer and the consumer, who share a common, fixed-size buffer used as a queue. The producer's job is to generate a piece of data, put it into the buffer and start again. At the same time, the consumer is consuming the data (i.e., removing it from the buffer) one piece at a time. The problem is to make sure that the producer won't try to add data into the buffer if it's full and that the consumer won't try to remove data from an empty buffer.
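For reference, the textbook monitor-style solution looks like the following user-level C sketch using POSIX threads. In mCertiKOS you will build the equivalent out of spinlocks and your own condition-variable implementation rather than pthreads, so this is background, not the assignment's code:

```c
#include <assert.h>
#include <pthread.h>

/* Classic bounded buffer guarded by one mutex and two condition variables. */
#define BUF_SIZE 4

static int buf[BUF_SIZE];
static int head, tail, count;
static pthread_mutex_t lk = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

static void produce(int item)
{
    pthread_mutex_lock(&lk);
    while (count == BUF_SIZE)              /* buffer full: block */
        pthread_cond_wait(&not_full, &lk);
    buf[tail] = item;
    tail = (tail + 1) % BUF_SIZE;
    count++;
    pthread_cond_signal(&not_empty);       /* wake one blocked consumer */
    pthread_mutex_unlock(&lk);
}

static int consume(void)
{
    pthread_mutex_lock(&lk);
    while (count == 0)                     /* buffer empty: block */
        pthread_cond_wait(&not_empty, &lk);
    int item = buf[head];
    head = (head + 1) % BUF_SIZE;
    count--;
    pthread_cond_signal(&not_full);        /* wake one blocked producer */
    pthread_mutex_unlock(&lk);
    return item;
}
```

Note the while loops around the waits: a woken thread must recheck its condition, because another thread may have changed the buffer between the signal and the wakeup.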
In this part of the assignment, you are asked to implement a
particular version of the producer-consumer problem. We have
implemented produce
and consume
as system
calls and created multiple user processes to invoke these system
calls. Recall that we have created two producer user processes on CPU #1 and two consumer user processes on CPU #2 (check kern/init/init.c). When the buffer is full, any process that calls the produce system call should be put into a waiting list, to be woken up later by consumers when the buffer is no longer full. A similar blocking mechanism should be implemented for the consumers as well, i.e., the consumers should be blocked while the buffer is empty. You should implement this using shared objects and condition variables.
Exercise 10
Implement the appropriate condition variables, and implement a bounded buffer as a shared object. You can use the provided spinlock implementation in kern/lib/spinlock.c. Then modify the sys_produce and sys_consume system call handlers to manipulate the bounded buffer accordingly. Delete the existing code in those two functions. Change user/include/syscall.h and user/lib/proc.c accordingly if necessary. Feel free to change the signatures of relevant functions and to define new functions as needed. You should add appropriate (easy to understand) debugging outputs (either in the system call handlers, or in the user program, or in both) to illustrate that your implementation actually works with all the settings you've implemented in Parts 1-3. Explain your implementation in the README file.
Design Review
At this point, you should be able to find the answer to the following questions. Please come prepared to answer them in your design review:
- Explain how to use condition variables and locks to implement a bounded buffer.
- Provide pseudocode for the implementations of sys_produce and sys_consume, using the above bounded buffer.
This completes the lab.
Make sure you pass all of your custom tests, and don't forget to edit your README file. Make sure your output for the producer-consumer problem matches your expectations. Compress your lab4
directory with
tar
and submit to
Dropbox when ready.