Let's try to develop something that has fewer disadvantages.
One thing to notice is that we can get a simpler solution to
the critical section problem if we have a "fancy"
instruction implemented by the computer's hardware. Our
text gives examples of what can be done with an atomic
test-and-set instruction or an atomic swap instruction.
DEFINITION OF THE TestAndSet INSTRUCTION
This must be implemented as an atomic operation.
boolean TestAndSet(boolean &target)
{
boolean rv=target; /* make copy */
target=true; /* set */
return rv ; /* return copy */
}
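For concreteness, here is one way the pseudo code above can be realized with C11 atomics. This is a sketch, not the hardware instruction itself: atomic_exchange stands in for the atomic test-and-set, storing true and returning the old value in one indivisible step.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Sketch: TestAndSet built on C11 atomic_exchange. The exchange
   writes true and returns the previous value atomically. */
bool TestAndSet(atomic_bool *target)
{
    return atomic_exchange(target, true);
}
```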
Code as simple as this:
---------------------
shared boolean locked=false;
---------------------
do
{
while (TestAndSet(locked))
/* do nothing */ ;
criticalSection(me) ;
locked=false ;
remainderSection(me) ;
} while(1) ;
---------------------
implements mutual exclusion for a set of n processes. (Each process
Pi executes the code above with me == i.)
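A runnable version of that loop is sketched below using C11 atomics and POSIX threads. The thread count, iteration count, and shared counter are illustrative; the counter update plays the role of the critical section, and the loop body matches the TestAndSet spin loop above.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <pthread.h>
#include <stddef.h>

#define N 4          /* number of "processes" (threads) -- illustrative */
#define ITERS 10000  /* iterations per thread -- illustrative */

static atomic_bool locked = false;  /* shared lock flag */
static long counter = 0;            /* shared data guarded by the lock */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++)
    {
        while (atomic_exchange(&locked, true))  /* TestAndSet(locked) */
            /* do nothing */ ;
        counter++;                     /* criticalSection(me) */
        atomic_store(&locked, false);  /* locked = false */
        /* remainderSection(me) would go here */
    }
    return NULL;
}

long run_demo(void)
{
    pthread_t t[N];
    for (int i = 0; i < N; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    return counter;  /* N * ITERS if mutual exclusion held */
}
```

If the TestAndSet loop failed to provide mutual exclusion, lost updates would make the final counter smaller than N * ITERS.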
We can satisfy all requirements for a solution to the critical
section problem with code like this:
---------------------
shared boolean waiting[n]={false,...,false};
shared boolean locked=false;
---------------------
void SolveCS(int me)
{
local boolean wasLocked ;
local int you;
do
{
waiting[me]=true;
wasLocked=true;
while( waiting[me] && wasLocked )
wasLocked = TestAndSet(locked) ;
waiting[me]=false;
criticalSection(me) ;
you=(me+1)%n ;
while ( (you!=me) && (!waiting[you]) )
you = (you+1)%n;
if (you==me) locked=false ;
else waiting[you]=false ;
remainderSection(me) ;
} while(1) ;
}
---------------------
The code above is not shorter than the bakery algorithm, but it is easier
to understand.
Implementation
With a queuing semaphore we can create a simple solution to a critical
section problem - one that does not require busy waiting.
The queuing semaphore is a special data structure. The data part consists
of an integer value and a list. It might be represented this way:
typedef struct
{
int value ;
struct process *L ;
} semaphore ;
The semaphore data structure requires two operations, wait() and
signal(), which must be implemented atomically. The following
pseudo code describes what the operations do, but does not give any
clue about how to implement the operations atomically:
void wait(semaphore S)
{
S.value--;
if (S.value<0)
{
add this process to S.L;
block() ;
}
}
void signal(semaphore S)
{
S.value++;
if (S.value<=0)
{
remove a process P from S.L;
wakeup(P);
}
}
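One common way to realize this pseudo code in user space is sketched below. This is an assumption, not the text's implementation: it borrows a pthread mutex for the required atomicity, a condition variable's internal queue in place of the explicit list L, and a wakeups counter so that only a process actually signaled proceeds (guarding against spurious condition-variable wakeups).

```c
#include <pthread.h>

typedef struct
{
    int value;            /* the semaphore's integer value */
    int wakeups;          /* pending wakeups not yet consumed */
    pthread_mutex_t m;    /* makes wait()/signal() atomic */
    pthread_cond_t  cv;   /* its queue plays the role of the list L */
} semaphore;

void sem_init_q(semaphore *S, int value)
{
    S->value = value;
    S->wakeups = 0;
    pthread_mutex_init(&S->m, NULL);
    pthread_cond_init(&S->cv, NULL);
}

void sem_wait_q(semaphore *S)
{
    pthread_mutex_lock(&S->m);
    S->value--;
    if (S->value < 0)
    {
        /* "add this process to S.L; block()" -- cond_wait releases the
           mutex and sleeps until a signal() wakes this process */
        do {
            pthread_cond_wait(&S->cv, &S->m);
        } while (S->wakeups == 0);
        S->wakeups--;
    }
    pthread_mutex_unlock(&S->m);
}

void sem_signal_q(semaphore *S)
{
    pthread_mutex_lock(&S->m);
    S->value++;
    if (S->value <= 0)
    {
        /* "remove a process P from S.L; wakeup(P)" */
        S->wakeups++;
        pthread_cond_signal(&S->cv);
    }
    pthread_mutex_unlock(&S->m);
}
```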
One can implement block() and wakeup(P) as system calls. A
call to block() would put the calling process to sleep. The OS would
get control of the CPU and put the process that called block() into a
special sleep queue. The sleep queue is a data structure that the OS
maintains. A process is not runnable while in the sleep queue. A call to
wakeup(P) would cause the OS to get control of the CPU and to remove
P from the sleep queue and return it to the ready queue.
On a uniprocessor, one can implement wait() and signal()
(without any busy waiting) as system calls. The OS can guarantee the atomic
execution of wait() and signal() if it does two things while
executing the code of wait() or signal():
- refuse to relinquish the CPU, and
- mask interrupts
Under those circumstances nothing can "sneak in" and run in the
CPU until after the wait() or signal() has completed.
Unfortunately the method described above is hard to generalize to a
multiprocessor platform. We would have to guarantee that no code running on
any of the other CPUs would do anything to "conflict" with the critical
section of code executing the wait() or the signal().
However on a multiprocessor, we could implement wait() and
signal() using one of the software solutions we examined earlier in
the chapter. For example, wait() could be implemented like this:
typedef struct
{
boolean waiting[n] ;
boolean lock ;
int value ;
struct process *L ;
} semaphore ;
void wait(semaphore S, int me)
{
boolean willBlock=false ;
boolean wasLocked ;
/* Entry Code for making wait() atomic */
S.waiting[me]=true;
wasLocked=true;
while(S.waiting[me] && wasLocked) wasLocked=TestAndSet(S.lock);
S.waiting[me]=false;
S.value--;
if (S.value<0)
{
add this process to S.L;
willBlock=true ;
}
/* Exit Code for making wait() atomic */
int you=(me+1)%n ;
while ( (you != me) && !S.waiting[you] ) you=(you+1)%n ;
if (you==me)
{
if (willBlock) block(S.lock,false);
else S.lock=false ;
}
else
{
if (willBlock) block(S.waiting[you],false);
else S.waiting[you]=false ;
}
}
Note that the code above employs a modified version of the block() system
call. The meaning of block(x,v) is "block the process making this call
and then set the variable x equal to the value v."
Why do we have to change the form of block()?
Basically it is due to a problem that comes up when a process P executing a
wait() needs to block. In that case P must both block and
perform the exit code. Unfortunately, no matter what order P tries to perform
these actions in, it will do something wrong.
If P blocks it can't do anything next, so it can't execute the exit
code.
Consider that if P does not set one of the flags to false -- S.lock or
S.waiting[you] -- then none of the other processes using the semaphore will be
able to perform a signal() or a wait(). All progress of the
group of processes will stop. In particular, no process will ever wake P up.
On the other hand it is not acceptable for P to set one of the flags to false
first and then block. The problem is that another process Q might execute a
signal() and a wakeup(P) before P is able to block.
Therefore, depending on exactly how wakeup() works on the system, P
could "miss" its wakeup. P might wake up later when some other process
executes a signal(), or it might never wake up. Either way, a lost
wakeup can cause processes to malfunction.
The solution we employ here is to take the responsibility away from the
process P and place it with the OS. The OS sets the flag to false after
blocking P.
Note that the solution we posed for the multiprocessor does require some busy
waiting. Generally, however, the amount of time spent busy waiting will be
negligible: there are only a few instructions in the wait and signal code,
and processes busy wait only while waiting to perform those short sequences
of instructions.
Contrast that with the case of such code as that below. Here some of the
critical sections could be very long. There is the potential, for example,
that one process will execute for a very long time in its critical section
while several other processes busy wait the whole time.
void SolveCS(int me)
{
local boolean wasLocked ;
local int you;
do
{
waiting[me]=true;
wasLocked=true;
while( waiting[me] && wasLocked )
wasLocked = TestAndSet(lock) ;
waiting[me]=false;
criticalSection(me) ; /* could be very long */
you=(me+1)%n ;
while ( (you!=me) && (!waiting[you]) )
you = (you+1)%n;
if (you==me) lock=false ;
else waiting[you]=false ;
remainderSection(me) ;
} while(1) ;
}
In the version of the code below, which implements wait and signal as
described above for the multiprocessor case, processes are blocked
most of the time while waiting to enter their critical sections. They
busy wait only briefly while executing wait() and
signal().
As a result there is no significant busy waiting in this
solution.
---------------------
shared semaphore mutex ;
---------------------
void SolveCS(int me)
{
do
{
wait (mutex) ;
criticalSection(me) ; /* could be very long */
signal (mutex) ;
remainderSection(me) ;
} while(1) ;
}
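POSIX semaphores provide exactly this wait/signal interface (sem_wait and sem_post). The sketch below maps the loop above onto them; the shared counter, thread count, and iteration count are illustrative stand-ins for the critical and remainder sections.

```c
#include <semaphore.h>
#include <pthread.h>
#include <stddef.h>

#define N 4          /* number of "processes" (threads) -- illustrative */
#define ITERS 10000  /* iterations per thread -- illustrative */

static sem_t mutex;       /* shared binary semaphore */
static long shared = 0;   /* data updated in the critical section */

static void *process(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++)
    {
        sem_wait(&mutex);   /* wait(mutex) */
        shared++;           /* criticalSection(me) -- could be very long */
        sem_post(&mutex);   /* signal(mutex) */
        /* remainderSection(me) would go here */
    }
    return NULL;
}

long run_cs_demo(void)
{
    pthread_t t[N];
    sem_init(&mutex, 0, 1);  /* initial value 1 => mutual exclusion */
    for (int i = 0; i < N; i++)
        pthread_create(&t[i], NULL, process, NULL);
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    return shared;  /* N * ITERS if the semaphore enforced exclusion */
}
```

A thread that finds the semaphore at 0 is blocked by the kernel rather than spinning, which is the "no significant busy waiting" property described above.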