Library imp

Import libraries for arithmetic, strings, and booleans
Require Import Arith.
Require Import String.
Require Import Bool.
Turn on implicit arguments -- not really needed here.
Set Implicit Arguments.
To use the string notation, we need to open this scope.
Local Open Scope string_scope.

We will epresent variables as strings.
Definition var : Type := string.

Arithmetic expressions, this time including variables.
Inductive aexp : Type :=
| AVar : var -> aexp
| ANum : nat -> aexp
| APlus : aexp -> aexp -> aexp
| AMinus : aexp -> aexp -> aexp
| AMult : aexp -> aexp -> aexp.

Boolean expressions.
Inductive bexp : Type :=
| Bool : bool -> bexp
| Eq : aexp -> aexp -> bexp
| Lte : aexp -> aexp -> bexp
| And : bexp -> bexp -> bexp
| Or : bexp -> bexp -> bexp.
Commands.
Inductive com : Type :=
| Skip : com
| Assign : var -> aexp -> com
| Seq : com -> com -> com
| If : bexp -> com -> com -> com
| While : bexp -> com -> com.
States or memories, provide a value for each variable. We can model this with a function from variables to natural numbers.
Definition state : Type := var -> nat.
Lookup the value of a variable in a given state.
Definition lookup(x:var)(s:state) : nat := s x.
Update the value associted with a variable and return a new state.
Definition update(x:var)(v:nat)(s:state) : state :=
  fun y => if string_dec x y then v else s y.

Evaluate an arithmetic expression in a given state.
Fixpoint aeval (a:aexp)(s:state) : nat :=
  match a with
    | AVar x => lookup x s
    | ANum n => n
    | APlus a1 a2 => (aeval a1 s) + (aeval a2 s)
    | AMinus a1 a2 => (aeval a1 s) - (aeval a2 s)
    | AMult a1 a2 => (aeval a1 s) * (aeval a2 s)
  end.

Evaluate a boolean expression in a given state.
Fixpoint beval (b:bexp)(s:state) : bool :=
  match b with
    | Bool x => x
    | Eq a1 a2 =>
      if eq_nat_dec (aeval a1 s) (aeval a2 s) then true else false
    | Lte a1 a2 =>
      if le_gt_dec (aeval a1 s) (aeval a2 s) then true else false
    | And b1 b2 => (beval b1 s) && (beval b2 s)
    | Or b1 b2 => (beval b1 s) || (beval b2 s)
  end.

Perform constant folding on an arithmetic expression.
Fixpoint opt_aexp(a:aexp) : aexp :=
  match a with
    | APlus a1 a2 =>
      match opt_aexp a1, opt_aexp a2 with
        | ANum n1, ANum n2 => ANum (n1 + n2)
        | a1', a2' => APlus a1' a2'
      end
    | AMinus a1 a2 =>
      match opt_aexp a1, opt_aexp a2 with
        | ANum n1, ANum n2 => ANum (n1 - n2)
        | a1', a2' => AMinus a1' a2'
      end
    | AMult a1 a2 =>
      match opt_aexp a1, opt_aexp a2 with
        | ANum n1, ANum n2 => ANum (n1 * n2)
        | a1', a2' => AMult a1' a2'
      end
    | a' => a'
  end.

Exercise: Let us prove that the constant folding is correct.
Lemma opt_aexp_corr : forall s a, aeval (opt_aexp a) s = aeval a s.
Proof.
  induction a ; simpl ; auto ; intros ;
    destruct (opt_aexp a1) ; simpl in * ; try congruence ;
      destruct (opt_aexp a2) ; simpl in * ; try congruence.
Qed.

Next, we try to define an evaluator for commands.
Definition ceval'(ceval : com -> state -> state)(c:com)(s:state) : state :=
  match c with
    | Skip => s
    | Assign x a => update x (aeval a s) s
    | Seq c1 c2 => ceval c2 (ceval c1 s)
    | If b c1 c2 => if (beval b s) then ceval c1 s else ceval c2 s
    | While b c => ceval (If b (Seq c (While b c)) Skip) s
  end.

Unfortunately, we can't use a Fixpoint to define this, as the case for While recurses with a bigger command, not a smaller one.

Section BAD.
If Coq provided a general fixpoint combinator, then we could "tie the knot" on our command evaluator.
  Variable ffix : forall A:Type, (A -> A) -> A.
  Definition wishful_thinking_ceval := ffix ceval'.

Alas, if we had a general fixpoint for all types, then we could also prove 3 = 4 or in fact any statement. That is, a general fixpoint function would render the logic inconsistent.
  Lemma three_equals_four : 3 = 4.
    destruct (ffix (fun p:False => p)).
  Qed.
End BAD.

So we will use an alternative kind of model based on predicates. In particular, we will define a new predicate that relates a command, an input state, and an output state. The predicate will provide "proof constructors". In this way, we can encode traditional inference rules, which are just a formalism for inductively defining some relation. So to argue that a command c takes us from state s1 to s2, we have to use these constructors to build a proof that cevals c s1 s2.
Inductive ceval : com -> state -> state -> Prop :=
| Ceval_skip :
  forall s, ceval Skip s s
| Ceval_assign :
  forall x a s, ceval (Assign x a) s (update x (aeval a s) s)
| Ceval_seq :
  forall c1 c2 s1 s2 s3,
    ceval c1 s1 s2 ->
    ceval c2 s2 s3 ->
    ceval (Seq c1 c2) s1 s3
| Ceval_if_true :
  forall b c1 c2 s1 s2,
    beval b s1 = true ->
    ceval c1 s1 s2 ->
    ceval (If b c1 c2) s1 s2
| Ceval_if_false :
  forall b c1 c2 s1 s2,
    beval b s1 = false ->
    ceval c2 s1 s2 ->
    ceval (If b c1 c2) s1 s2
| Ceval_while :
  forall b c s1 s2,
    ceval (If b (Seq c (While b c)) Skip) s1 s2 ->
    ceval (While b c) s1 s2.

Here is a little command which computes n factorial and stores the result in x, clobbering n along the way.
Definition fact_com :=
  Seq (Assign "x" (ANum 1))
      (While (Lte (ANum 1) (AVar "n"))
        (Seq (Assign "x" (AMult (AVar "x") (AVar "n")))
             (Assign "n" (AMinus (AVar "n") (ANum 1))))).

We're going to prove that if we start in state s1 and run fact_com, then we get state s2.
Definition s1 : state := update "n" 3 (fun _ => 0).
Definition s2 : state := update "n" 0 (update "x" 6 (fun _ => 0)).

Coq doesn't natively allow us to conclude that f1 = f2 when forall x, f1 x = f2 x. But the axiom is consistent with Coq's logic. So we will import this fact from a library, as it will be needed in our reasoning below.
Require Import Program.

These two lemmas will help us to reason about states -- this one says that we can ignore earlier updates to the same variable.
Lemma update_x_x : forall x i j s,
  update x i (update x j s) = update x i s.
Proof.
  intros. unfold update.
  extensionality y.
  destruct (string_dec x y). auto. auto.
Qed.

This lemma is related, but lets us skip over an intermediate update.
Lemma update_n_x_n : forall x n i j k s,
  update x i (update n j (update x k s)) = update x i (update n j s).
Proof.
  intros. unfold update.
  extensionality y.
  destruct (string_dec x y). auto. auto.
Qed.

Finally, here we construct a proof that fact_com, when run in state s1, produces state s2. Most of the steps involve eapplying the appropriate proof constructor. We use eapply instead of apply so that we don't have to guess intermediate states -- rather, Coq inserts a meta-level unification variable and then resolves this variable later on.
Lemma fact_3_is_6 : ceval fact_com s1 s2.
Proof.
  unfold fact_com.
  eapply Ceval_seq.
  eapply Ceval_assign. simpl.
  eapply Ceval_while.
  eapply Ceval_if_true. auto.
  eapply Ceval_seq.
  eapply Ceval_seq.
  eapply Ceval_assign.
  eapply Ceval_assign.
  eapply Ceval_while.
  eapply Ceval_if_true. simpl. auto.
  eapply Ceval_seq.
  eapply Ceval_seq.
  eapply Ceval_assign.
  eapply Ceval_assign.
  eapply Ceval_while.
  eapply Ceval_if_true. simpl. auto.
  eapply Ceval_seq.
  eapply Ceval_seq.
  eapply Ceval_assign.
  eapply Ceval_assign.
  eapply Ceval_while. simpl.
  eapply Ceval_if_false. simpl. auto.
  unfold s2. unfold s1. simpl.
At this point, we would like to use the Ceval_skip proof rule, but Coq can't natively determine that the input state is equal to the output state. However, they are extensionally equal, and we can use our lemmas above to show this.
  repeat rewrite update_n_x_n. repeat rewrite update_x_x. rewrite update_n_x_n.
Here, Coq can see that they are the same, so we can complete the proof with an application of the Ceval_skip rule.
  eapply Ceval_skip.
Qed.

If we print out the proof, we see that it's just a big "datatype" tree built from our proof rules. We could've built this tree by hand, but that would be very painful...
Print fact_3_is_6.

In fact, we can automate most of the proof by employing a tactic. This tactic, simplify_ceval, looks at a the current goal state. If we need to construct a proof of ceval for Skip, Assign, or While, we just try applying the appropriate constructor via the econstructor tactic. The case for If is a little harder, as there are two rules we might apply, and we don't really know which one to apply without doing a bit more work. So here, we just try the Ceval_if_true rule first, which generates two goals -- one to show that the boolean expression evaluates to "true" and another that argues about the "true" branch of the conditional. We use auto to try to discharge the first case -- if we're successful, we'll never hit the fail. Otherwise, this whole tactic will fail and we will back-track (restoring the proof state) and try the Ceval_if_false case.
Ltac simplify_ceval :=
  match goal with
    | [ |- ceval (Seq _ _) _ _ ] => econstructor
    | [ |- ceval (Assign _ _) _ _] => econstructor
    | [ |- ceval (While _ _) _ _] => econstructor
    | [ |- ceval Skip _ _] => econstructor
    | [ |- ceval (If _ _ _) _ _ ] => eapply Ceval_if_true ; [auto ; fail | idtac]
    | [ |- ceval (If _ _ _) _ _ ] => eapply Ceval_if_false ; [auto ; fail | idtac]
    | [ |- _ ] => rewrite update_n_x_n
    | [ |- _ ] => rewrite update_x_x
  end.

Now our proof can just repeatedly call simplify_ceval to generate the whole proof for us.
Lemma fact_3_is_6' : ceval fact_com s1 s2.
Proof.
  unfold fact_com, s1.
  repeat (simpl ; simplify_ceval).
Qed.

Let us now do a proof that involves tearing apart another proof. In this case, we are proving that Seq associates to the right. We proceed by inverting the known facts (inversion) until we have enough primitive facts that we can then build a proof of the desired goal.
Lemma seq_assoc_r : forall c1 c2 c3 s1 s2,
  ceval (Seq (Seq c1 c2) c3) s1 s2 ->
  ceval (Seq c1 (Seq c2 c3)) s1 s2.
Proof.
  intros.
  inversion H.
  subst.
  clear H.
  inversion H2.
  subst.
  clear H2.
  econstructor. eauto.
  econstructor. eauto.
  auto.
Qed.

Fixpoint not_in(x:var)(a:aexp) : Prop :=
  match a with
    | AVar y => if string_dec x y then False else True
    | ANum n => True
    | APlus a1 a2 => not_in x a1 /\ not_in x a2
    | AMinus a1 a2 => not_in x a1 /\ not_in x a2
    | AMult a1 a2 => not_in x a1 /\ not_in x a2
  end.

Lemma not_in_aeval : forall x n s a,
  not_in x a ->
  aeval a s = aeval a (update x n s).
Proof.
  induction a ; simpl ; unfold update, lookup ;
    try (intros ; destruct H ; rewrite (IHa1 H) ; rewrite (IHa2 H0) ; auto) ;
    try (destruct (string_dec x v)) ; tauto.
Qed.

Lemma update_comm : forall x i y j s,
  x <> y ->
  update x i (update y j s) = update y j (update x i s).
Proof.
  unfold update. intros.
  extensionality z.
  destruct (string_dec x z) ; destruct (string_dec y z) ; auto.
  congruence.
Qed.

Lemma assign_comm : forall x ax y ay s1 s2,
  x <> y ->
  not_in x ay ->
  not_in y ax ->
  ceval (Seq (Assign x ax) (Assign y ay)) s1 s2 ->
  ceval (Seq (Assign y ay) (Assign x ax)) s1 s2.
Proof.
  Ltac helper H :=
    inversion H ; subst ; clear H.
  intros.
  helper H2.
  helper H5.
  helper H8.
  econstructor. econstructor.
  rewrite <- (not_in_aeval x _ _ ay H0).
  assert (update x (aeval ax (update y (aeval ay s3) s3)) (update y (aeval ay s3) s3) =
          update y (aeval ay s3) (update x (aeval ax s3) s3)).
  rewrite <- (not_in_aeval y _ _ ax H1).
  rewrite update_comm ; auto.
  rewrite <- H2.
  constructor.
Qed.

Definition aexp_eq : forall (a1 a2:aexp), {a1=a2} + {a1<>a2}.
  pose (string_dec).
  pose (eq_nat_dec).
  decide equality.
Qed.

Fixpoint opt_bexp(b:bexp) : bexp :=
  match b with
    | Eq a1 a2 =>
      match opt_aexp a1, opt_aexp a2 with
        | ANum n1, ANum n2 => Bool (if eq_nat_dec n1 n2 then true else false)
        | a1', a2' => Eq a1' a2'
      end
    | Lte a1 a2 =>
      match opt_aexp a1, opt_aexp a2 with
        | ANum n1, ANum n2 => Bool (if le_gt_dec n1 n2 then true else false)
        | a1', a2' => Lte a1' a2'
      end
    | And b1 b2 =>
      match opt_bexp b1, opt_bexp b2 with
        | Bool b1, Bool b2 => Bool (b1 && b2)
        | b1', b2' => And b1' b2'
      end
    | Or b1 b2 =>
      match opt_bexp b1, opt_bexp b2 with
        | Bool b1, Bool b2 => Bool (b1 || b2)
        | b1', b2' => Or b1' b2'
      end
    | Bool b => Bool b
  end.

Lemma opt_bexp_corr : forall s b, beval (opt_bexp b) s = beval b s.
Proof.
  induction b ; simpl ; auto ;
    try (rewrite <- (opt_aexp_corr s a) ;
         rewrite <- (opt_aexp_corr s a0) ;
           destruct (opt_aexp a) ; simpl ; auto ;
             destruct (opt_aexp a0) ; simpl ; auto) ;
  (rewrite <- IHb1 ;
  rewrite <- IHb2 ;
    destruct (opt_bexp b1) ; simpl ; auto ;
      destruct (opt_bexp b2) ; simpl ; auto).
Qed.

In class, Evan asked if we could prove that a diverging computation has no proof. Turns out this is provable, but we have to strengthen our induction hypothesis to handle the "intermediate" proof goals that can arise. Now that my medication has worn off, I can do this :-)
Definition loop := While (Bool true) Skip.

Fixpoint active_loop(c:com) : Prop :=
  c = While (Bool true) Skip \/
  c = If (Bool true) (Seq Skip (While (Bool true) Skip)) Skip \/
  c = Seq Skip (While (Bool true) Skip).

Lemma loop_no_proof' :
  forall c1 s1 s2,
    ceval c1 s1 s2 ->
    active_loop c1 -> False.
Proof.
  Ltac s := match goal with
              | [ H : _ \/ _ |- _ ] => destruct H ; try congruence
              | [ H : _ = _ |- _ ] => injection H ; intros ; subst ; clear H
            end.
  induction 1 ; simpl ; intros ; repeat s.
  apply IHceval2. left. auto.
  apply IHceval. right. right. auto.
  simpl in H. congruence.
  apply IHceval. right. left. auto.
Qed.

Lemma loop_no_proof : forall s1 s2, ceval loop s1 s2 -> False.
Proof.
  intros.
  eapply loop_no_proof' ; eauto.
  unfold loop. left ; auto.
Qed.

So we might try to define diverging computation/state pairs as any such pair for which there does not exist an output state that we can derive:
Definition diverges : com -> state -> Prop :=
  fun c s => forall s', ceval c s s' -> False.

Then we might like to claim that for every command c and starting state s, either it evaluates to some state s' or else diverges. Alas, this would require excluded middle...



Exercise: Extend the optimizer and proof of correctness so that we reduce:
  • x + 0 = x
  • 0 + x = x
  • 0 * x = 0
  • x * 0 = 0
  • 1 * x = x
  • x * 1 = x

Exericse: Write an optimizer for commands which uses your aexp and bexp optimizers and prove it correct.

The ceval relation evaluates a command in one fell swoop. But notice that we can't say anything about commands that might run forever (e.g., "while true do skip"). That's because we cannot construct a finite derivation/proof that the command evaluates to some result.

An alternative semantics, called a "small-step" semantics, will let us say something about such diverging computations. It will also allow us to model some things, such as concurrency in a way that's difficult to do with large-step models.

We begin by defining a cstep function which takes a command and a state, and produces an optional command and state. Intuitively, the cstep function runs the command for one step, if there's any step to be taken. If there's no step to be taken, then we return None. Otherwise, we return the continuation (what command to be run next) and the resulting state we get from running one step of the command.
Fixpoint cstep (c:com) (s:state) : option (com * state) :=
  match c with
    | Skip => None
    | Assign x a => Some (Skip, update x (aeval a s) s)
    | Seq c1 c2 =>
      match cstep c1 s with
        | None => Some (c2, s)
        | Some (c1',s') => Some (Seq c1' c2, s')
      end
    | If b c1 c2 => if beval b s then Some(c1,s) else Some(c2,s)
    | While b c => Some (If b (Seq c (While b c)) Skip, s)
  end.

An alternative would be to define a cstep using an inductive definition as in:
Inductive cstep' : com -> state -> com -> state -> Prop :=
| Cstep_assign : forall x a s, cstep' (Assign x a) s Skip (update x (aeval a s) s)
| Cseq_skip : forall c s, cstep' (Seq Skip c) s c s
| Cseq_step : forall c1 s c1' s' c2, cstep' c1 s c1' s' -> cstep' (Seq c1 c2) s (Seq c1' c2) s'
| Cseq_if_true : forall b c1 c2 s, beval b s = true -> cstep' (If b c1 c2) s c1 s
| Cseq_if_false : forall b c1 c2 s, beval b s = false -> cstep' (If b c1 c2) s c2 s
| Cwhile : forall b c s, cstep' (While b c) s (If b (Seq c (While b c)) Skip) s.

Now we can define a csteps relation as follows. This should be equivalent to our cevals relation (and is as we prove below.)
Inductive csteps : com -> state -> state -> Prop :=
| Csteps_skip : forall s,
  csteps Skip s s
| Csteps_step : forall c1 s1 c2 s2 s3,
  cstep c1 s1 = Some (c2,s2) ->
  csteps c2 s2 s3 ->
  csteps c1 s1 s3.

Alternatively, we can can use cstep':
Inductive csteps' : com -> state -> state -> Prop :=
| Csteps'_skip : forall s, csteps' Skip s s
| Csteps'_step : forall c1 s1 c2 s2 s3,
  cstep' c1 s1 c2 s2 ->
  csteps' c2 s2 s3 ->
  csteps' c1 s1 s3.

In the proofs below, I'm going to use this helper tactic.
Ltac h :=
  match goal with
    | [ H : None = Some _ |- _ ] => congruence
    | [ H : Some _ = None |- _ ] => congruence
    | [ H : Some _ = Some _ |- _ ] => injection H ; clear H ; intros ; subst
    | [ H : ceval Skip _ _ |- _ ] => inversion H ; clear H ; subst
    | [ H : ceval (Seq _ _) _ _ |- _ ] => inversion H ; clear H ; subst
    | [ p : prod _ _ |- _ ] => destruct p
    | _ => constructor ; auto ; fail
  end.

I want to show that ceval c s1 s2 implies csteps c s1 s2. To do so, I'm going to proceed by induction on the derivation of ceval c s1 s2. (Trying to induct on c won't be any good, as the command gets larger in the case of a while-loop.) Even by doing induction on the derivation of ceval c s1 s2, the Seq case is somewhat problematic. I need the following lemma, which shows that if I have csteps c1 s1 s2 and csteps c2 s2 s3, then I can conclude csteps (Seq c1 c2) s1 s3. To prove this, I'll proceed by induction on the derivation of csteps c1 s1 s2.
Lemma csteps_seq : forall c2 s2 s3,
  csteps c2 s2 s3 ->
  forall c1 s1,
    csteps c1 s1 s2 -> csteps (Seq c1 c2) s1 s3.
Proof.
  induction 2.
  econstructor ; simpl ; eauto.
  econstructor ; simpl. rewrite H0 ; eauto. auto.
Qed.

With that lemma in hand, things become much easier:
Lemma csteps_ceval : forall c s1 s2,
  ceval c s1 s2 -> csteps c s1 s2.
Proof.
  induction 1.
  constructor.
  econstructor ; simpl ; auto ; constructor.
  eapply csteps_seq ; eauto.
  econstructor. simpl. rewrite H. eauto. auto.
  econstructor. simpl. rewrite H. eauto. auto.
  econstructor. simpl. eauto. auto.
Qed.

At this point, we've shown that if we run c in state s1 using the big-step semantics to get state s2, then we can run c in state s1 using the small-step semantics to also get state s2. Let us now prove the other direction -- that any derivation using the small step semantics can be simulated with the big-step. That is, csteps c s1 s2 -> cevals c s1 s2. Again, we'll proceed by induction on the derivation of csteps c s1 s2. But as before, we're going to need a couple of lemmas to push this through.

This first lemma argues that cstep is only undefined on the Skip command. At first blush, this is obvious, but it really requires an induction to show this. In general, I tend to break inductions out as separate lemmas... Notice here that I use the helper tactic h.
Lemma cstep_none : forall s c, cstep c s = None -> c = Skip.
Proof.
  induction c ; simpl ; intros ; repeat h. destruct (cstep c1 s) ; repeat h.
  destruct (beval b s) ; h.
Qed.

This next lemma is crucial for reasoning about csteps derivations that end with the Csteps_step rule. In essence, it translates the rule to use ceval instead of csteps: if c1 in s1 steps to c2 and s2, and c2 in s2 evaluates to s3, then c1 in s1 evaluates to s3. So in essence, it's composing a step with the big-step evaluation rules. Try to prove our main lemma below without this, and see why we need this nested induction...
Lemma ceval_cstep : forall c1 s1 c2 s2 s3,
  cstep c1 s1 = Some (c2, s2) ->
  ceval c2 s2 s3 -> ceval c1 s1 s3.
Proof.
  induction c1 ; simpl ; intros ; repeat h.
  remember (cstep c1_1 s3) as x. destruct x. repeat h.
  generalize (IHc1_1 s3 c s4 s6 (eq_sym Heqx) H2). intro. econstructor ; eauto.
  rewrite (cstep_none s3 c1_1 (eq_sym Heqx)) in *. repeat h.
  econstructor ; [ econstructor | eauto ].
  remember (beval b s3) as x. destruct x ; repeat h. eapply Ceval_if_false ; auto.
Qed.

Once we have the ceval_cstep lemma, the proof is easy.
Lemma ceval_csteps : forall c s1 s2,
  csteps c s1 s2 -> ceval c s1 s2.
Proof.
  induction 1. constructor. eapply ceval_cstep ; eauto.
Qed.

So in the end, we can conclude that the two semantics are equivalent:
Lemma ceval_equiv_csteps : forall c s1 s2,
  ceval c s1 s2 <-> csteps c s1 s2.
Proof.
  intros ; split ; [ apply csteps_ceval | apply ceval_csteps ].
Qed.

Exercises:

1) reformulate the evaluation of boolean and arithmetic expressions using:
  • large step semantics
  • small-step semantics
  • prove the equivalence of your semantics to the functional/denotational semantics.
2) Try re-proving the correctness of the arithmetic optimizer using your small-step semantics. Is it easier or harder than the functional semantics?

3) Consider adding a parallel statement Par s1 s2 which is meant to execute s1 and s2 in parallel. We can model that by interleaving the evaluation of the two statements. Explain why adding Par s1 s2 isn't really faithful when we use our denotational and big-step models, but can be faithful with a small-step model.