Labelled Transition Systems

Last time, we started talking about intensional aspects of programs because we wanted a different kind of comparison than the usual notion of observational equivalence. Of course, the languages we've looked at so far admit only very trivial observations, in the sense that a program either runs to completion, producing a value of ground type, or else runs forever. A critical question is how to construct models that interact with the outside world, for then we have many new forms of observation.

To begin with, let's consider a core functional language with support for recursive functions (so we can possibly loop forever) as well as the ability to print an integer:

    e ::= i | x | \x.e | e1 e2 | fix f(x).e | () | print e

Now the obvious notion of program equivalence that we would like to talk about is that e1 should be considered equal to e2 when they print out the same sequence of values. In particular, not all diverging computations should be considered equivalent, because now we have some way to distinguish them, assuming that we can observe the integers being printed.

One way to model this is to move to a labelled transition system. The basic idea is to fix an alphabet L of labels (which need not be finite) and to define our evaluation relation -> as a relation on triples of an expression, a label, and another expression. For instance, in the language above, we might pick as our labels:

    L ::= i | .

and then define evaluation to be of the form e -i-> e' or e -.-> e'. Here, a label of "i" represents that the program output the value i, whereas a label of "." represents that the program took a step but produced no output.
As an example, a CBV version of the language above might have an evaluation semantics that looks like this:

    (\x.e) v -.-> e[v/x]

    fix f(x).e -.-> \x.e[fix f(x).e/f]

    print i -i-> ()

       e1 -L-> e1'
    -----------------
    e1 e2 -L-> e1' e2

      e2 -L-> e2'
    ---------------
    v e2 -L-> v e2'

          e -L-> e'
    ---------------------
    print e -L-> print e'

Given an expression e, we say the trace of e is a (possibly infinite) sequence of pairs:

    (e,L), (e1,L1), (e2,L2), (e3,L3), ...

such that

    e -L-> e1 -L1-> e2 -L2-> e3 -L3-> ...

To avoid having to deal with the distinction between finite and infinite sequences, it's usual to pad finite sequences in some way to make everything infinite. For instance, we could add a special label Term reflecting termination, take the final value v, and define v -Term-> v. Technically, we'd need to add side conditions of the form (L != Term) to the congruence rules above to prevent "local" infinite loops on values.

More formally, we can think of Trace[e] as a function T from a natural number to a pair of an expression and a label with the properties that:

    (a) #1(T(0)) = e
    (b) for all i >= 0, if T(i) = (ei,Li) and Li != Term then T(i+1) = (ei+1,Li+1) for some ei+1 and Li+1 such that ei -Li-> ei+1
    (c) if T(i) = (v,Term) then v is a value and for all j >= i, T(j) = (v,Term)

We can think of a trace T as defining an infinite stream of expressions and labels. To that end, it's useful to define some meta-functions such as:

    head(T)   = T(0)
    tail(T)   = fn i => T(i+1)
    labels(T) = fn i => #2(T(i))

Now we would like to set up some notion of equality that equates e1 and e2 when they produce the same outputs. One definition is of course to just say:

    e1 == e2 iff labels(Trace[e1]) = labels(Trace[e2])

but this forces the two expressions to take exactly the same number of steps between outputs. If we want to ignore this level of detail, we need to first filter out all of the "." labels.
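The rules and the labels(Trace[e]) function above can be sketched as a small interpreter. This is only an illustration under stated assumptions: the tuple encoding of terms, the names `step`, `subst` and `labels`, and the use of the strings "." and "Term" for labels are all made up here, and substitution assumes closed programs so that no renaming is needed. Since a trace is infinite, `labels` returns only a finite prefix.

```python
# Terms are tuples (a hypothetical encoding, not part of the notes):
# ("int", i), ("var", x), ("lam", x, e), ("app", e1, e2),
# ("fix", f, x, e), ("unit",), ("print", e)
SILENT, TERM = ".", "Term"

def is_value(e):
    return e[0] in ("int", "lam", "unit")

def subst(e, x, v):
    """e[v/x]; assumes v is closed, so no renaming is needed."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "lam":
        return e if e[1] == x else ("lam", e[1], subst(e[2], x, v))
    if tag == "fix":
        return e if x in (e[1], e[2]) else ("fix", e[1], e[2], subst(e[3], x, v))
    if tag == "app":
        return ("app", subst(e[1], x, v), subst(e[2], x, v))
    if tag == "print":
        return ("print", subst(e[1], x, v))
    return e  # int, unit

def step(e):
    """Return (label, e') with e -label-> e', or None if no rule applies."""
    tag = e[0]
    if tag == "fix":  # fix f(x).e -.-> \x.e[fix f(x).e/f]
        _, f, x, body = e
        return SILENT, ("lam", x, body if x == f else subst(body, f, e))
    if tag == "app":
        _, e1, e2 = e
        if not is_value(e1):  # congruence: the label is passed through untouched
            r = step(e1)
            return r and (r[0], ("app", r[1], e2))
        if not is_value(e2):
            r = step(e2)
            return r and (r[0], ("app", e1, r[1]))
        if e1[0] == "lam":  # (\x.e) v -.-> e[v/x]
            return SILENT, subst(e1[2], e1[1], e2)
    if tag == "print":
        arg = e[1]
        if arg[0] == "int":  # print i -i-> ()
            return arg[1], ("unit",)
        if not is_value(arg):
            r = step(arg)
            return r and (r[0], ("print", r[1]))
    return None

def labels(e, n):
    """First n labels of Trace[e], padded with Term once e is a value."""
    out = []
    while len(out) < n:
        if is_value(e):
            out.append(TERM)
            continue
        r = step(e)
        if r is None:  # stuck term: stop early rather than pad
            break
        out.append(r[0])
        e = r[1]
    return out

# (fix f(x). f (print 1)) ()  loops forever, printing 1 on each iteration.
FIX = ("fix", "f", "x", ("app", ("var", "f"), ("print", ("int", 1))))
loop = ("app", FIX, ("unit",))
prefix = labels(loop, 10)
outputs = [l for l in prefix if l != SILENT]  # drop the silent steps
```

Here `prefix` interleaves silent steps with the output label 1, and `outputs` keeps only the printed integers. Filtering a finite prefix like this is of course only an approximation of filtering the infinite label sequence.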
One idea is to define:

    e1 == e2 iff labels(Trace[e1]) =~= labels(Trace[e2])

and then take =~= to be defined as follows, using our head and tail functions:

    T1 =~= T2 iff    head(T1) = head(T2) and tail(T1) =~= tail(T2)
                  or head(T1) = . and tail(T1) =~= T2
                  or head(T2) = . and T1 =~= tail(T2)

Of course, the problem with this definition of equality is that, well, it's not a definition! Rather, it's an equation. Now it's possible to write a definition of "filter" that works on these sequences and then use standard equality after filtering, but it's instructive to think about how we might "solve" for the definition of =~= as given above.

If your first thought is to compute a fixed point, you're on the right track. If you then proceed to define (writing P(X) for the set of subsets of X, since F maps relations to relations):

    LabSeq = Nat -> Label

    F : P(LabSeq x LabSeq) -> P(LabSeq x LabSeq)

    F(S) = { (T1,T2) |    head(T1) = head(T2) and (tail(T1),tail(T2)) in S
                       or head(T1) = . and (tail(T1),T2) in S
                       or head(T2) = . and (T1,tail(T2)) in S }

then you're also on the right track. However, if you try to define:

    =~=  =def=  U_i>=0 F^i({})        [wrong]

then you're in bad shape. The problem is that, of course, F({}) = {}, since each of the clauses in F demands that we build a pair of sequences out of a pre-existing pair of sequences in the input. It's worth remarking, however, that {} is a fixed point of F. It's just not the fixed point we're looking for...

The problem is that we were trying to define =~= as the least fixed point of F. If instead we take =~= to be the *greatest* fixed point of F (assuming one exists and it's unique), then perhaps that would work?
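To experiment with F concretely, one option (an assumption of this sketch, not something the notes do) is to restrict attention to eventually periodic label sequences, each named by a state with a head label and a tail state. F then acts on a finite set of pairs, so iterating it from either end is computable. The tables `LABEL` and `NEXT` and the helper `iterate` are hypothetical names for the illustration.

```python
from itertools import product

# Hypothetical finite encoding: each state names an infinite label sequence,
# given by its head label (LABEL) and the state naming its tail (NEXT).
#   s0 = 1 . 1 . 1 . ...     s1 = . 1 . 1 . 1 ...
#   t0 = 1 1 1 ...           u0 = 2 2 2 ...
LABEL = {"s0": 1, "s1": ".", "t0": 1, "u0": 2}
NEXT = {"s0": "s1", "s1": "s0", "t0": "t0", "u0": "u0"}

def F(S):
    """One application of F to a set S of pairs of states."""
    out = set()
    for a, b in product(LABEL, repeat=2):
        same_head = LABEL[a] == LABEL[b] and (NEXT[a], NEXT[b]) in S
        skip_left = LABEL[a] == "." and (NEXT[a], b) in S
        skip_right = LABEL[b] == "." and (a, NEXT[b]) in S
        if same_head or skip_left or skip_right:
            out.add((a, b))
    return out

def iterate(S):
    """Apply F repeatedly until the set stops changing."""
    while True:
        nxt = F(S)
        if nxt == S:
            return S
        S = nxt

from_bottom = iterate(set())                        # start from {}
from_top = iterate(set(product(LABEL, repeat=2)))   # start from all pairs
```

On this example, iterating from the empty set stays empty, while iterating downward from the set of all pairs settles on a relation that equates s0, s1 and t0 with one another but keeps u0 apart from all three.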
The answer is yes, and we can solve for the gfp by simply starting off with the set of all pairs of label sequences, then repeatedly applying F, and then intersecting:

    =~=  =def=  Intersect_i>=0 F^i(U)    where U = { (T1,T2) | (T1,T2) in LabSeq x LabSeq }

Think about what this will do:

    F^0(U) = U = { (T1,T2) }    <---- every pair of label sequences is equated at this level
    F^1(U) = { (L::T1,L::T2), (.::T1,T2), (T1,.::T2) | (T1,T2) in U }
    F^2(U) = { (L::T1,L::T2), (.::T1,T2), (T1,.::T2) | (T1,T2) in F^1(U) }
    ...

So at each level, we're adding stuff on to the front of the streams that would make them equivalent if only their tails were equivalent. By the time we intersect all of them, we have filtered out all of the pairs that have inequivalent tails.

This is an example of co-induction, because we're starting from a bigger set (all pairs of label sequences) and filtering things out with the F function as a sort of consistency check. Of course, it's crucial here that F is a monotone function, else there's no guarantee that we will ever converge.

Anyway, the key thing is that we are forced to manipulate infinite sequences when we want to model interacting computations (e.g., a server) that aren't supposed to terminate. Using labelled transition relations is the obvious choice, and then we're usually going to want to define a relatively coarse notion of trace equivalence which allows us to insert or delete so-called "silent transitions".

------------------------------------------------------------------

A key advantage of labelled transition systems is that they can be used to help modularize the "configuration" information in an operational semantics. As a simple example, recall that to model something like refs, we defined configurations to be of the form (H,e), where H is a mapping from locations to values. For instance, we defined:

    (H,ref v) -> (H+{p->v},p)

    (H+{p->v'},p := v) -> (H+{p->v},())

    (H+{p->v},!p) -> (H+{p->v},v)

and so forth.
What's so annoying about this is that we have to carry the heap around even when we're not manipulating refs. An alternative way to phrase the language is as follows:

    ref v -alloc(p,v)-> p

    !p -read(p,v)-> v

    p := v -write(p,v)-> ()

That is, we put the information that we need to communicate to some other part of the program state in the label. Most of the congruence rules can then treat labels abstractly. For instance, we may have:

       e1 -L-> e1'
    -----------------
    e1 e2 -L-> e1' e2

      e2 -L-> e2'
    ---------------
    v e2 -L-> v e2'

    (\x.e) v -.-> e[v/x]

Now we can define transitions for configurations as follows:

      e -alloc(p,v)-> e'
    ------------------------
    (H,e) -.-> (H+{p->v},e')

           e -read(p,v)-> e'
    -------------------------------
    (H+{p->v},e) -.-> (H+{p->v},e')

           e -write(p,v)-> e'
    --------------------------------
    (H+{p->v'},e) -.-> (H+{p->v},e')

Although this seems like a lot of duplication, it has its advantages. For instance, suppose we wish to add support for printing to the language. Then we don't have to modify the current rules, but rather only add:

    print i -print(i)-> ()

        e -print(i)-> e'
    ------------------------
    (H,e) -print(i)-> (H,e')

This style of interaction, where all of the "state" of the abstract machine is placed in the labels, is common in process calculi such as CCS or the pi-calculus. For example, suppose we wished to add some Concurrent ML-style primitives for communication. In particular, suppose we add:

    e ::= ... | newchan | c | send e1 e2 | receive e | fork e

with the intention that fork e starts a new thread executing e, newchan allocates a fresh communication channel, send c v sends the value v on the channel c, blocking until a receiver accepts, and receive c blocks until some sender offers a value on c, which it then returns.
We might add to our language so far:

    newchan -newchan(c)-> c

    send c v -send(c,v)-> ()

    receive c -receive(c,v)-> v

    fork e -fork(e)-> ()

and then add the following:

                e1 -fork(e)-> e2
    ----------------------------------------
    P || fork e1 -.-> P || fork e2 || fork e

          P1 -.-> P1'
    -----------------------
    P1 || P2 -.-> P1' || P2

    P1 -send(c,v)-> P1'    P2 -receive(c,v)-> P2'
    ---------------------------------------------
              P1 || P2 -.-> P1' || P2'

    P || fork v -.-> P

              (H,e) -.-> (H',e')
    --------------------------------------
    (H,P || fork e) -.-> (H',P || fork e')

               P1 -.-> P1'
    ---------------------------------
    (H, P1 || P2) -.-> (H, P1' || P2)

So the configurations of this machine include a global heap H together with a list of processes P1 || ... || Pn, where each process is of the form fork e. Normally, I would also need to add congruence rules for the processes (e.g., that we consider || associative and commutative), but we might just separate that out as follows:

    P1 == P2    P2 -L-> P3    P3 == P4
    ----------------------------------
               P1 -L-> P4

and define == to reflect these facts. The key idea is that you can modularize the treatment of the threads and the refs and the I/O, etc.
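The modularity claim can be illustrated for the ref and print fragment with a rough sketch. It rests on several assumptions that are not in the notes: terms are encoded as tuples, the expression language is cut down to ref, !, := and print with no variables or functions, and instead of having the expression guess p and v in the label, `step` returns a request label together with a hypothetical `resume` callback that the configuration layer calls with its answer. The point to notice is that `step` never mentions the heap, and printing is handled by one extra case in `run`.

```python
# Terms (hypothetical encoding): ("int", i), ("loc", p), ("unit",),
# ("ref", e), ("get", e) for !e, ("set", e1, e2) for e1 := e2, ("print", e)

def is_value(e):
    return e[0] in ("int", "loc", "unit")

def step(e):
    """Expression-level step: returns (label, resume). No heap in sight."""
    # Congruence: step the leftmost non-value subterm, passing its label
    # through abstractly and rebuilding the surrounding term afterwards.
    for i, sub in enumerate(e[1:], start=1):
        if isinstance(sub, tuple) and not is_value(sub):
            label, resume = step(sub)
            return label, lambda ans, i=i, resume=resume: e[:i] + (resume(ans),) + e[i + 1:]
    tag = e[0]
    if tag == "ref":    # ref v -alloc-> p, where p is supplied by the heap layer
        return ("alloc", e[1]), lambda p: ("loc", p)
    if tag == "get":    # !p -read-> v, where v is supplied by the heap layer
        return ("read", e[1][1]), lambda v: v
    if tag == "set":    # p := v -write-> ()
        return ("write", e[1][1], e[2]), lambda _: ("unit",)
    if tag == "print":  # print i -print(i)-> ()
        return ("print", e[1][1]), lambda _: ("unit",)
    raise ValueError("no rule applies")

def run(e):
    """Configuration-level loop: interprets labels against a heap."""
    heap, output = {}, []
    while not is_value(e):
        label, resume = step(e)
        kind = label[0]
        if kind == "alloc":
            p = len(heap)            # fresh location
            heap[p] = label[1]
            e = resume(p)
        elif kind == "read":
            e = resume(heap[label[1]])
        elif kind == "write":
            heap[label[1]] = label[2]
            e = resume(None)
        elif kind == "print":        # the only addition needed for I/O
            output.append(label[1])
            e = resume(None)
    return heap, output, e

# print (!(ref 5))
heap1, out1, res1 = run(("print", ("get", ("ref", ("int", 5)))))
# (ref 1) := 2
heap2, out2, res2 = run(("set", ("ref", ("int", 1)), ("int", 2)))
```

Threads and channels could presumably be layered on in the same way, with `run` holding a pool of expressions and matching send labels against receive labels, but that is left out here.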