Formal design of a distributed mutual exclusion protocol
We consider the problem of mutual exclusion in distributed systems which cannot provide an atomic test-and-set instruction. We use model-checking to design an algorithm that solves this problem. We start from a design that we prove incorrect. Then, we use the counter-examples provided by the TLC model-checker to improve the design, until we reach an algorithm that matches the specification.
A first proposal
The solution below is a first attempt to design a distributed mutual exclusion algorithm. To make the problem simpler, we consider only 2 processes (i.e. N=2
). We assume that the processes can share memory as this can be implemented in a distributed system.
------------------------------- MODULE mutex1 -------------------------------
EXTENDS Naturals, TLC
CONSTANTS N (* Number of processes MUST BE = 2 *)
(*
--algorithm Mutex1 {
variables flags = [ i \in 1..N |-> FALSE ]; (* flags[i]=TRUE when process
i wants to enter CS *)
process (P \in 1..N)
{
loop: while (TRUE) {
test: await( \A q \in 1..N: flags[q]=FALSE );
set: flags[self] := TRUE;
cs: skip; (* In CS *)
release: flags[self] := FALSE;
}
}
}
*)
\* BEGIN TRANSLATION
\* END TRANSLATION
(* Propositional variables *)
CS(i) == pc[i] = "cs"
(* There is at most one process in the critical section *)
Mutex == (\A p \in 1..N: \A q \in (1..N \{p}): ~(CS(p) /\ CS(q)))
=============================================================================
- Model-check the algorithm with respect to the invariant
Mutex
, as well as absence of deadlock. Which one of the two properties is violated? - We propose to fix the algorithm by swapping the statements that correspond to labels
test
andset
. Justify this proposition with respect to the counter-example provided by the model-checker.
A second proposal
The second proposal implements the modification suggested above. In order to enter the critical section, a process first advertises that it requires access to the critical section, then it checks that no other process wants to access the critical section.
------------------------------- MODULE mutex2 -------------------------------
EXTENDS Naturals, TLC
CONSTANTS N (* Number of processes MUST BE = 2 *)
(*
--algorithm Mutex2 {
variables flags = [ i \in 1..N |-> FALSE ]; (* flags[i]=TRUE when process i wants to enter CS *)
process (P \in 1..N)
{
loop: while (TRUE) {
set: flags[self] := TRUE;
test: await( \A q \in (1..N \ {self}): flags[q]=FALSE );
cs: skip; (* In CS *)
release: flags[self] := FALSE;
}
}
}
*)
\* BEGIN TRANSLATION
\* END TRANSLATION
(* Propositional variables *)
CS(i) == pc[i] = "cs"
(* There is at most one process in the critical section *)
Mutex == (\A p \in 1..N: \A q \in (1..N \{p}): ~(CS(p) /\ CS(q)))
=============================================================================
- Model-check the algorithm w.r.t. invariant
Mutex
, as well as for absence of deadlock. Which one of the two properties is violated? Explain why.
Peterson’s algorithm for 2 processes
We consider Peterson’s mutual exclusion algorithm below (see Wikipedia page for extra information).
The algorithm below shows Peterson’s algorithm for process i
:
do forever {
flag[i] := true
turn := i
wait until (turn != i) or (forall j != i, flag[j] = false)
// in critical section
flag[i] := false
}
The algorithm above uses two shared variables:
flag
which is an array indexed from1
toN
, initialised tofalse
turn
which is an integer variable with values in0..N
, initialised to0
In the entering phase of the algorithm, a process first requests access to the critical section by setting its flag. Then, the turn
variable is used to arbitrate between processes that require access concurrently. If turn=i
, then process i
gives priority to the other process. Thus process i
can enter the critical section if either no other process requires access, or if turn != i
.
- Justify the solution proposed by Peterson with respect to the counter-example provided by TLC for algorithm
Mutex2
. - Write a PlusCal model of Peterson’s algorithm including the invariant
Mutex
(each line can be counted as an atomic instruction) - Check that the algorithm is correct (
Mutex
and absence of deadlocks) forN=2
processes - The algorithm is incorrect for
N=3
processes. Use the model-checker to get a counter-example. From the counter-example, tell which one of the two properties is violated, and explain why.