Ce module définit un compilateur pour le langage IMP. Il produit du code pour une machine virtuelle (un petit sous-ensemble de la machine virtuelle Java JVM). Nous prouvons que ce compilateur est correct car il préserve la sémantique des programmes source.

Require Import Coq.Program.Equality. (* Pour faciliter certaines preuves. *)
Require Import Bool. (* La théorie des booléens *)
Require Import List. (* La théorie des listes *)
Require Import Arith. (* La théorie des entiers naturels (type nat) *)
Require Import ZArith. (* La théorie des entiers relatifs (type Z) *)
Require Import ZOdiv. (* Division entière avec arrondi vers zéro *)
Require Import Sequences. (* Une théorie des suites de transitions *)
Require Import IMP. (* Syntaxe et sémantique d'IMP *)

Open Scope nat_scope. (* Interpréter +, -, etc, comme les opérations du type nat *)

1. La machine virtuelle.

La machine opère sur un code c (une liste fixée d'instructions) et trois composants qui varient au cours du temps:

un compteur de programme (PC), dénotant une position dans le code c
un état mémoire qui donne les valeurs entières des variables
une pile d'évaluation, contenant des valeurs entières.

Le jeu d'instructions de la machine.

Inductive instruction: Type :=
  | Iconst: Z -> instruction (* empiler l'entier n sur la pile *)
  | Ivar: ident -> instruction (* empiler la valeur de la variable x *)
  | Isetvar: ident -> instruction (* dépiler un entier et l'affecter à la variable x *)
  | Iadd: instruction (* dépiler n2, dépiler n1, empiler n1+n2 *)
  | Isub: instruction (* dépiler n2, dépiler n1, empiler n1-n2 *)
  | Imul: instruction (* dépiler n2, dépiler n1, empiler n1*n2 *)
  | Idiv: instruction (* dépiler n2, dépiler n1, empiler n1/n2 *)
  | Ibranch_forward: nat -> instruction (* sauter ofs instructions en avant *)
  | Ibranch_backward: nat -> instruction (* sauter ofs instructions en arrière *)
  | Ibeq: nat -> instruction (* dépiler n2, dépiler n1, sauter ofs en avant si n1=n2 *)
  | Ibne: nat -> instruction (* dépiler n2, dépiler n1, sauter ofs en avant si n1<>n2 *)
  | Ible: nat -> instruction (* dépiler n2, dépiler n1, sauter ofs en avant si n1<=n2 *)
  | Ibgt: nat -> instruction (* dépiler n2, dépiler n1, sauter ofs en avant si n1>n2 *)
  | Ihalt: instruction. (* arrêter la machine *)

Definition code := list instruction.

code_at C pc = Some i si i est l'instruction à la position pc dans la liste d'instructions C.

Fixpoint code_at (C: code) (pc: nat) : option instruction :=
  match C, pc with
  | nil, _ => None
  | i :: C', O => Some i
  | i :: C', S pc' => code_at C' pc'
  end.

Une pile de la machine est une liste de valeurs entières. n :: s est la pile obtenue en empilant n au sommet de la pile s.

Definition stack := list Z.

L'état de la machine est un triplet (PC, pile, état mémoire).

Definition machine_state := (nat * stack * store)%type.

La sémantique de la machine est donnée en style "petits pas", comme une relation de transition entre deux états machine. On a une règle de transition pour chaque sorte d'instruction, à l'exception de Ihalt qui ne fait pas de transition.

Inductive transition (C: code): machine_state -> machine_state -> Prop :=
  | trans_const: forall pc stk s n,
      code_at C pc = Some(Iconst n) ->
      transition C (pc, stk, s) (pc + 1, n :: stk, s)
  | trans_var: forall pc stk s x n,
      code_at C pc = Some(Ivar x) ->
      s x = Some n ->
      transition C (pc, stk, s) (pc + 1, n :: stk, s)
  | trans_setvar: forall pc stk s x n,
      code_at C pc = Some(Isetvar x) ->
      transition C (pc, n :: stk, s) (pc + 1, stk, update x n s)
  | trans_add: forall pc stk s n1 n2,
      code_at C pc = Some(Iadd) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc + 1, (n1 + n2)%Z :: stk, s)
  | trans_sub: forall pc stk s n1 n2,
      code_at C pc = Some(Isub) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc + 1, (n1 - n2)%Z :: stk, s)
  | trans_mul: forall pc stk s n1 n2,
      code_at C pc = Some(Imul) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc + 1, (n1 * n2)%Z :: stk, s)
  | trans_div: forall pc stk s n1 n2,
      code_at C pc = Some(Idiv) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc + 1, (n1 / n2)%Z :: stk, s)
  | trans_branch_forward: forall pc stk s ofs pc',
      code_at C pc = Some(Ibranch_forward ofs) ->
      pc' = pc + 1 + ofs ->
      transition C (pc, stk, s) (pc', stk, s)
  | trans_branch_backward: forall pc stk s ofs pc',
      code_at C pc = Some(Ibranch_backward ofs) ->
      pc' = pc + 1 - ofs ->
      transition C (pc, stk, s) (pc', stk, s)
  | trans_beq: forall pc stk s ofs n1 n2 pc',
      code_at C pc = Some(Ibeq ofs) ->
      pc' = (if Z_eq_dec n1 n2 then pc + 1 + ofs else pc + 1) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc', stk, s)
  | trans_bne: forall pc stk s ofs n1 n2 pc',
      code_at C pc = Some(Ibne ofs) ->
      pc' = (if Z_eq_dec n1 n2 then pc + 1 else pc + 1 + ofs) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc', stk, s)
  | trans_ble: forall pc stk s ofs n1 n2 pc',
      code_at C pc = Some(Ible ofs) ->
      pc' = (if Z_le_dec n1 n2 then pc + 1 + ofs else pc + 1) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc', stk, s)
  | trans_bgt: forall pc stk s ofs n1 n2 pc',
      code_at C pc = Some(Ibgt ofs) ->
      pc' = (if Z_le_dec n1 n2 then pc + 1 else pc + 1 + ofs) ->
      transition C (pc, n2 :: n1 :: stk, s) (pc', stk, s).

Il y a plusieurs situation où la machine se bloque (erreur à l'exécution) et où aucune transition ne peut s'effectuer. Par exemple:

Si le PC sort du code pc >= length C, on a code_at C pc = None et aucune règle de transition ne s'applique
S'il n'y a pas assez de valeurs sur la pile pour les instructions qui dépilent une ou deux valeurs. Exemple: Iadd sur une pile vide.

Cependant, il y a un cas où la machine est plus tolérante que le langage IMP: l'instruction Idiv ne plante pas si le diviseur est 0. Dans ce cas, le résultat est 0 car c'est ainsi que Coq définit la division par zéro.

Comme toujours en sémantique "petits pas", nous formons des suites de transitions de la machine pour définir le comportement d'un code. Toutes les exécutions commencent avec pc = 0 et une pile vide. La machine s'arrête sans erreur lorsque pc pointe vers une instruction Ihalt et la pile est vide.

Definition mach_terminates (C: code) (final_store: store) :=
  exists pc,
  code_at C pc = Some Ihalt /\
  star (transition C) (0, nil, initial_store) (pc, nil, final_store).

La divergence (non-terminaison) correspond à une suite infinie de transitions de la machine.

Definition mach_diverges (C: code) :=
infseq (transition C) (0, nil, initial_store).

Le "plantage" correspond à une suite finie de transitions qui aboutit sur un état non final et qui ne peut pas faire de transitions.

Definition mach_runtime_error (C: code) :=
  exists pc, exists stk, exists m,
  star (transition C) (0, nil, initial_store) (pc, stk, m)
  /\ irred (transition C) (pc, stk, m)
  /\ (code_at C pc <> Some Ihalt \/ stk <> nil).

2. Le compilateur

Le code engendré pour une expression arithmétique a

s'exécute en séquence (pas de branchements)
dépose la valeur de a au sommet de la pile
préserve l'état mémoire.

C'est la traduction bien connue en "notation polonaise inverse" (RPN). Note: :: est le "cons" des listes et ++ est la concaténation de listes.

Fixpoint compile_aexp (a: aexp) : code :=
  match a with
  | Const n => Iconst n :: nil
  | Var v => Ivar v :: nil
  | Plus a1 a2 => compile_aexp a1 ++ compile_aexp a2 ++ Iadd :: nil
  | Minus a1 a2 => compile_aexp a1 ++ compile_aexp a2 ++ Isub :: nil
  | Times a1 a2 => compile_aexp a1 ++ compile_aexp a2 ++ Imul :: nil
  | Div a1 a2 => compile_aexp a1 ++ compile_aexp a2 ++ Idiv :: nil
  end.

Exemples de compilations.

Compute (compile_aexp (Plus (Var vx) (Const 1))).

Compute (compile_aexp (Times (Var vy) (Plus (Var vx) (Const 1)))).

Le code compile_bexp b cond ofs engendré pour une expression booléenne b

saute par dessus les ofs instructions qui le suivent si b s'évalue en le booléen cond
s'exécute en séquence si b s'évalue en la négation de cond
préserve la pile et l'état mémoire.

Les transparents expliquent les mystérieux branchements relatifs!

Fixpoint compile_bexp (b: bexp) (cond: bool) (ofs: nat) : code :=
  match b with
  | TRUE =>
      if cond then Ibranch_forward ofs :: nil else nil
  | FALSE =>
      if cond then nil else Ibranch_forward ofs :: nil
  | Eq a1 a2 =>
      compile_aexp a1 ++ compile_aexp a2 ++
      (if cond then Ibeq ofs :: nil else Ibne ofs :: nil)
  | Le a1 a2 =>
      compile_aexp a1 ++ compile_aexp a2 ++
      (if cond then Ible ofs :: nil else Ibgt ofs :: nil)
  | Not b1 =>
      compile_bexp b1 (negb cond) ofs
  | And b1 b2 =>
      let c2 := compile_bexp b2 cond ofs in
      let c1 := compile_bexp b1 false (if cond then length c2 else ofs + length c2) in
      c1 ++ c2
  end.

Exemples.

Compute (compile_bexp (Eq (Var vx) (Const 1)) true 42).

Compute (compile_bexp (And (Le (Const 1) (Var vx)) (Le (Var vx) (Const 10))) false 42).

Compute (compile_bexp (Not (And TRUE FALSE)) true 42).

Le code produit pour une commande c

modifie l'état mémoire comme prescrit par la sémantique de c
préserve la pile
peut contenir des branchements, mais termine toujours sur l'instruction qui suit immédiatement le code engendré.

À nouveau, on se reportera aux transparents pour une explication des offsets de branchements.

Fixpoint compile_com (c: command) : code :=
  match c with
  | Skip =>
      nil
  | Assign id a =>
      compile_aexp a ++ Isetvar id :: nil
  | Seq c1 c2 =>
      compile_com c1 ++ compile_com c2
  | If b ifso ifnot =>
      let code_ifso := compile_com ifso in
      let code_ifnot := compile_com ifnot in
      compile_bexp b false (length code_ifso + 1)
      ++ code_ifso
      ++ Ibranch_forward (length code_ifnot)
      :: code_ifnot
  | While b body =>
      let code_body := compile_com body in
      let code_test := compile_bexp b false (length code_body + 1) in
      code_test
      ++ code_body
      ++ Ibranch_backward (length code_test + length code_body + 1)
      :: nil
  end.

Le code compilé pour un programme complet p (une commande) est similaire, mais termine proprement sur une instruction Ihalt.

Definition compile_program (p: command) : code :=
compile_com p ++ Ihalt :: nil.

Exemples de compilations:

Compute (compile_program (Assign vx (Plus (Var vx) (Const 1)))).

Compute (compile_program (While TRUE Skip)).

Compute (compile_program (If (Eq (Var vx) (Const 1)) (Assign vx (Const 0)) Skip)).

3. Preuve de préservation sémantique utilisant la sémantique naturelle

Résultats auxiliaires sur les suites d'instructions.

Pour raisonner sur l'exécution du code produit par le compilateur, nous avons besoin d'examiner des morceaux de code C2 qui sont à la position pc dans un code plus grand C = C1 ++ C2 ++ C3. Cette situation est capturée par le prédicat codeseq_at C pc C2 ci-dessous.

Inductive codeseq_at: code -> nat -> code -> Prop :=
  | codeseq_at_intro: forall C1 C2 C3 pc,
      pc = length C1 ->
      codeseq_at (C1 ++ C2 ++ C3) pc C2.

Nous montrons un ensemble de propriétés évidentes de code_at et de codeseq_at, que nous mettons dans une "hint database" pour que Coq puisse les utiliser automatiquement.

Lemma code_at_app:
  forall i c2 c1 pc,
  pc = length c1 ->
  code_at (c1 ++ i :: c2) pc = Some i.

Proof.

induction c1; simpl; intros; subst pc; auto.
Qed.

Lemma codeseq_at_head:
  forall C pc i C',
  codeseq_at C pc (i :: C') ->
  code_at C pc = Some i.

Proof.

intros. inv H. simpl. apply code_at_app. auto.
Qed.

Lemma codeseq_at_tail:
  forall C pc i C',
  codeseq_at C pc (i :: C') ->
  codeseq_at C (pc + 1) C'.

Proof.

  intros. inv H.
  change (C1 ++ (i :: C') ++ C3)
    with (C1 ++ (i :: nil) ++ C' ++ C3).
  rewrite <- app_ass. constructor. rewrite app_length. auto.
Qed.

Lemma codeseq_at_app_left:
  forall C pc C1 C2,
  codeseq_at C pc (C1 ++ C2) ->
  codeseq_at C pc C1.

Proof.

intros. inv H. rewrite app_ass. constructor. auto.
Qed.

Lemma codeseq_at_app_right:
  forall C pc C1 C2,
  codeseq_at C pc (C1 ++ C2) ->
  codeseq_at C (pc + length C1) C2.

Proof.

intros. inv H. rewrite app_ass. rewrite <- app_ass. constructor. rewrite app_length. auto.
Qed.

Lemma codeseq_at_app_right2:
  forall C pc C1 C2 C3,
  codeseq_at C pc (C1 ++ C2 ++ C3) ->
  codeseq_at C (pc + length C1) C2.

Proof.

intros. inv H. repeat rewrite app_ass. rewrite <- app_ass. constructor. rewrite app_length. auto.
Qed.

Hint Resolve codeseq_at_head codeseq_at_tail codeseq_at_app_left codeseq_at_app_right codeseq_at_app_right2: codeseq.

Ltac normalize :=
repeat rewrite app_length; repeat rewrite plus_assoc; simpl.

Correction du code produit pour les expressions

Rappelons-nous la spécification informelle que nous avions donnée pour le code engendré pour une expression arithmétique a. Il est censé:

s'exécuter en séquence (pas de branchements)
déposer la valeur de a au sommet de la pile
préserver l'état mémoire.

Nous prouvons maintenant que le code compile_aexp a remplit ce contrat. La preuve est une récurrence élégante sur la dérivation d'évaluation eval m a n.

Lemma compile_aexp_correct:
  forall C m a n, eval m a n ->
  forall pc stk,
  codeseq_at C pc (compile_aexp a) ->
  star (transition C)
       (pc, stk, m)
       (pc + length (compile_aexp a), n :: stk, m).

Proof.

  induction 1; simpl; intros.
  apply star_one. apply trans_const. eauto with codeseq.
  apply star_one. eapply trans_var; eauto with codeseq.
  eapply star_trans.
  apply IHeval1. eauto with codeseq.
  eapply star_trans.
  apply IHeval2. eauto with codeseq.
  apply star_one. normalize. apply trans_add. eauto with codeseq.
  eapply star_trans.
  apply IHeval1. eauto with codeseq.
  eapply star_trans.
  apply IHeval2. eauto with codeseq.
  apply star_one. normalize. apply trans_sub. eauto with codeseq.
  eapply star_trans.
  apply IHeval1. eauto with codeseq.
  eapply star_trans.
  apply IHeval2. eauto with codeseq.
  apply star_one. normalize. apply trans_mul. eauto with codeseq.
  eapply star_trans.
  apply IHeval1. eauto with codeseq.
  eapply star_trans.
  apply IHeval2. eauto with codeseq.
  apply star_one. normalize. apply trans_div. eauto with codeseq.
Qed.

Voici la preuve correspondante pour la compilation des expressions booléennes.

Lemma compile_bexp_correct:
  forall C m b v, beval m b v ->
  forall cond ofs pc stk,
  codeseq_at C pc (compile_bexp b cond ofs) ->
  star (transition C)
       (pc, stk, m)
       (pc + length (compile_bexp b cond ofs) + if eqb v cond then ofs else 0, stk, m).

Proof.

  induction 1; simpl; intros.
  destruct cond; simpl.
  apply star_one. apply trans_branch_forward with ofs. eauto with codeseq. auto.
  repeat rewrite plus_0_r. apply star_refl.
  destruct cond; simpl.
  repeat rewrite plus_0_r. apply star_refl.
  apply star_one. apply trans_branch_forward with ofs. eauto with codeseq. auto.
  eapply star_trans.
  apply compile_aexp_correct with (a := a1); eauto with codeseq.
  eapply star_trans.
  apply compile_aexp_correct with (a := a2); eauto with codeseq.
  apply star_one. normalize.
  destruct cond.
  apply trans_beq with ofs. eauto with codeseq.
  destruct (Z_eq_dec n1 n2); try congruence. simpl. omega.
  apply trans_bne with ofs. eauto with codeseq.
  destruct (Z_eq_dec n1 n2); try congruence. simpl. omega.
  eapply star_trans.
  apply compile_aexp_correct with (a := a1); eauto with codeseq.
  eapply star_trans.
  apply compile_aexp_correct with (a := a2); eauto with codeseq.
  apply star_one. normalize.
  destruct cond.
  apply trans_beq with ofs. eauto with codeseq.
  destruct (Z_eq_dec n1 n2); try congruence. simpl. omega.
  apply trans_bne with ofs. eauto with codeseq.
  destruct (Z_eq_dec n1 n2); try congruence. simpl. omega.
  eapply star_trans.
  apply compile_aexp_correct with (a := a1); eauto with codeseq.
  eapply star_trans.
  apply compile_aexp_correct with (a := a2); eauto with codeseq.
  apply star_one. normalize.
  destruct cond.
  apply trans_ble with ofs. eauto with codeseq.
  destruct (Z_le_dec n1 n2). simpl. omega. exfalso; omega.
  apply trans_bgt with ofs. eauto with codeseq.
  destruct (Z_le_dec n1 n2). simpl. omega. exfalso; omega.
  eapply star_trans.
  apply compile_aexp_correct with (a := a1); eauto with codeseq.
  eapply star_trans.
  apply compile_aexp_correct with (a := a2); eauto with codeseq.
  apply star_one. normalize.
  destruct cond.
  apply trans_ble with ofs. eauto with codeseq.
  destruct (Z_le_dec n1 n2). exfalso; omega. simpl. omega.
  apply trans_bgt with ofs. eauto with codeseq.
  destruct (Z_le_dec n1 n2). exfalso; omega. simpl. omega.
  replace (eqb (negb bv1) cond)
     with (eqb bv1 (negb cond)).
  apply IHbeval; auto.
  destruct bv1; destruct cond; auto.
  set (code_b2 := compile_bexp be2 cond ofs) in *.
  set (ofs' := if cond then length code_b2 else ofs + length code_b2) in *.
  set (code_b1 := compile_bexp be1 false ofs') in *.
  apply star_trans with (pc + length code_b1 + (if eqb bv1 false then ofs' else 0), stk, m).
  apply IHbeval1. eauto with codeseq.
  destruct cond.
    destruct bv1; simpl.
    normalize. rewrite plus_0_r. apply IHbeval2. eauto with codeseq.
    normalize. rewrite plus_0_r. apply star_refl.
    destruct bv1; simpl.
    normalize. rewrite plus_0_r. apply IHbeval2. eauto with codeseq.
    replace ofs' with (length code_b2 + ofs). normalize. apply star_refl.
    unfold ofs'; omega.
Qed.

Correction du code compilé pour une commande qui termine

Lemma compile_com_correct_terminating:
  forall C m c m',
  exec m c m' ->
  forall stk pc,
  codeseq_at C pc (compile_com c) ->
  star (transition C)
       (pc, stk, m)
       (pc + length (compile_com c), stk, m').

Proof.

  induction 1; intros stk pc AT.
  simpl in *. rewrite plus_0_r. apply star_refl.
  simpl in *.
  eapply star_trans. apply compile_aexp_correct; eauto with codeseq.
  apply star_one. normalize. apply trans_setvar. eauto with codeseq.
  simpl in *.
  eapply star_trans. apply IHexec1. eauto with codeseq.
  normalize. apply IHexec2. eauto with codeseq.
  simpl in *.
  set (code1 := compile_com ifso) in *.
  set (codeb := compile_bexp b false (length code1 + 1)) in *.
  set (code2 := compile_com ifnot) in *.
  eapply star_trans.
  apply compile_bexp_correct with (b := b) (v := true) (cond := false) (ofs := length code1 + 1);
  eauto with codeseq.
  simpl. rewrite plus_0_r. fold codeb. normalize.
  eapply star_trans. apply IHexec. eauto with codeseq.
  apply star_one. eapply trans_branch_forward. eauto with codeseq. omega.
  simpl in *.
  set (code1 := compile_com ifso) in *.
  set (codeb := compile_bexp b false (length code1 + 1)) in *.
  set (code2 := compile_com ifnot) in *.
  eapply star_trans.
  apply compile_bexp_correct with (b := b) (v := false) (cond := false) (ofs := length code1 + 1);
  eauto with codeseq.
  simpl. fold codeb. normalize.
  replace (pc + length codeb + length code1 + S(length code2))
     with (pc + length codeb + length code1 + 1 + length code2).
  apply IHexec. eauto with codeseq. omega.
  simpl in *.
  eapply star_trans.
  apply compile_bexp_correct with (b := b) (v := false) (cond := false) (ofs := length (compile_com body) + 1);
  eauto with codeseq.
  simpl. normalize. apply star_refl.
  apply star_trans with (pc, stk, m').
  simpl in *.
  eapply star_trans.
  apply compile_bexp_correct with (b := b) (v := true) (cond := false) (ofs := length (compile_com body) + 1);
  eauto with codeseq.
  simpl. rewrite plus_0_r.
  eapply star_trans. apply IHexec1. eauto with codeseq.
  apply star_one.
  eapply trans_branch_backward. eauto with codeseq. omega.
  apply IHexec2. auto.
Qed.

Theorem compile_program_correct_terminating:
  forall c final_store,
  exec initial_store c final_store ->
  mach_terminates (compile_program c) final_store.

Proof.

  intros. unfold compile_program. red.
  exists (length (compile_com c)); split.
  apply code_at_app. auto.
  apply compile_com_correct_terminating with (pc := 0). auto.
  apply codeseq_at_intro with (C1 := nil). auto.
Qed.

Nous avons prouvé que si le programme IMP source termine sans erreur, un des comportements possibles de la machine virtuelle lorsqu'elle exécute le code compilé est de terminer sans erreur sur le même état mémoire final. Il nous reste à montrer que la machine ne peut rien faire d'autre, comme par exemple diverger ou bien se bloquer ou bien terminer sur un' autre état mémoire final. C'est une conséquence du fait que la machine est déterministe: au plus une seule transition est possible à partir d'un état donné.

Lemma machine_deterministic:
forall C S1 S2 S3, transition C S1 S2 -> transition C S1 S3 -> S2 = S3.

Proof.

  intros until S3; induction 1; intros T2; inv T2; try congruence.
  replace ofs0 with ofs; auto. congruence.
  replace ofs0 with ofs; auto. congruence.
  replace ofs0 with ofs; auto. congruence.
  replace ofs0 with ofs; auto. congruence.
Qed.

Theorem mach_terminates_unique_behavior:
  forall C final_store,
  mach_terminates C final_store ->
  ~mach_diverges C /\ ~mach_runtime_error C /\ forall m, mach_terminates C m -> m = final_store.

Proof.

  intros. red in H. destruct H as [pc [A B]].
  assert (I: irred (transition C) (pc, nil, final_store)).
    red; intros. red; intros. inv H; congruence.
  split. unfold mach_diverges; red; intros.
  eapply infseq_finseq_excl. eexact (machine_deterministic C). eauto. eauto. eauto.
  split. unfold mach_runtime_error. red; intros [pc' [stk' [m' [P [Q R]]]]].
  assert ((pc, nil, final_store) = (pc', stk', m')).
    eapply finseq_unique. eexact (machine_deterministic C). eauto. eauto. eauto. eauto.
  inv H. intuition congruence.
  intros m' [pc' [P Q]].
  assert (J: irred (transition C) (pc', nil, m')).
    red; intros. red; intros. inv H; congruence.
  assert ((pc, @nil Z, final_store) = (pc', nil, m')).
    eapply finseq_unique. eexact (machine_deterministic C). eauto. eauto. eauto. eauto.
  congruence.
Qed.

4. Preuve de préservation sémantique utilisant la sémantique à transitions

Nous aimerions étendre le résultat de la section 3 aux programmes source qui divergent: si le programme IMP ne termine pas, la machine exécutant son code compilé devrait faire une infinité de transitions. Pour ce faire, nous laissons tomber la sémantique naturelle et utilisons la sémantique à transitions d'IMP à sa place. Cela nous permet de montrer un résultat de simulation "en avant" entre IMP et la machine: chaque transition de l'exécution du programme IMP est simulée par zéro, une ou plusieurs transitions de la machine.

Pour énoncer la simulation en avant, il nous faut définir une relation entre états IMP et états de la machine qui est préservée par l'exécution. Soient (c, k, m) un état d'exécution IMP et (pc, stk, m') un état de la machine. Certaines conditions sont évidentes:

Les états mémoire doivent être les mêmes: m' = m.
Le code machine à la position pc doit être le code compilé pour la commande c: codeseq_at C pc (compile_com c).
La pile doit être vide: stk = nil. (La pile est toujours vide sauf pendant l'évaluation d'une expression.)

Ce qui n'est pas évident du tout est de savoir quoi supposer sur la continuation k. Intuitivement, lorsque la machine finit d'exécuter le code compilé pour la commande c, c'est-à-dire lorsqu'elle atteint le PC pc + length(compile_com c), elle devrait continuer en exécutant des instructions qui réalisent la continuation k, puis finir sur une instruction Ihalt. Nous formalisons cette intuition par le prédicat inductif suivant:

Inductive compile_cont (C: code): continuation -> nat -> Prop :=
  | ccont_stop: forall pc,
      code_at C pc = Some Ihalt ->
      compile_cont C Kstop pc
  | ccont_seq: forall c k pc pc',
      codeseq_at C pc (compile_com c) ->
      pc' = pc + length (compile_com c) ->
      compile_cont C k pc' ->
      compile_cont C (Kseq c k) pc
  | ccont_while: forall b c k pc ofs pc' pc'',
      code_at C pc = Some(Ibranch_backward ofs) ->
      pc' = pc + 1 - ofs ->
      codeseq_at C pc' (compile_com (While b c)) ->
      pc'' = pc' + length (compile_com (While b c)) ->
      compile_cont C k pc'' ->
      compile_cont C (Kwhile b c k) pc
  | ccont_branch: forall ofs k pc pc',
      code_at C pc = Some(Ibranch_forward ofs) ->
      pc' = pc + 1 + ofs ->
      compile_cont C k pc' ->
      compile_cont C k pc.

Nous sommes prêts à prouver une simulation "en avant" à l'aide d'un diagramme de simulation de la forme suivante:

                      match_states
     c / k / m  ----------------------- machstate
       |                                   |
       |                                   | + ou ( * et |c',k'| < |c,k| )
       |                                   |
       v                                   v
    c' / k' / m' ----------------------- machstate'
                      match_states

Hypothèses:

À gauche: une transition dans la sémantique "petits pas" de IMP.
En haut: l'invariant match_state défini ci-dessous. Conclusions:
En bas: l'invariant match_state qui doit être préservé.
À droite: ou bien
- une ou plusieurs transitions de la machine virtuelle
- ou zéro, une ou plusieurs transitions de la machine virtuelle + une preuve que la mesure de (c,k) décroît strictement.
La mesure est nécessaire pour montrer que la machine ne peut pas "bégayer" infiniment, c'est-à-dire n'effectuer aucune transition alors que le programme source diverge.

Inductive match_state (C: code): state -> machine_state -> Prop :=
  | match_state_intro:
      forall c k m pc
        (CODEAT: codeseq_at C pc (compile_com c))
        (CONTAT: compile_cont C k (pc + length (compile_com c))),
      match_state C (c, k, m) (pc, nil, m).

La mesure "anti-bégaiement" est une fonction des tailles de la commande et de la continuation courantes.

Fixpoint com_size (c: command) : nat :=
  match c with
  | Skip => 1
  | Assign id a => 1
  | Seq c1 c2 => com_size c1 + com_size c2 + 1
  | If b ifso ifnot => com_size ifso + com_size ifnot + 1
  | While b body => com_size body + 1
  end.

Remark com_size_nonzero: forall c, com_size c > 0.

Proof.

induction c; simpl; omega.
Qed.

Fixpoint cont_size (k: continuation) : nat :=
  match k with
  | Kstop => 0
  | Kseq c k' => com_size c + cont_size k'
  | Kwhile b c k' => cont_size k'
  end.

Definition measure (s: state) : nat :=
  match s with (c, k, m) => com_size c + cont_size k end.

Quelques lemmes techniques pour faciliter la preuve de simulation.

Lemma compile_cont_Kstop_inv:
  forall C pc m,
  compile_cont C Kstop pc ->
  exists pc',
  star (transition C) (pc, nil, m) (pc', nil, m)
  /\ code_at C pc' = Some Ihalt.

Proof.

  intros. dependent induction H.
  exists pc; split. apply star_refl. auto.
  destruct IHcompile_cont as [pc'' [A B]].
  exists pc''; split; auto. eapply star_step; eauto. eapply trans_branch_forward; eauto.
Qed.

Lemma compile_cont_Kseq_inv:
  forall C c k pc m,
  compile_cont C (Kseq c k) pc ->
  exists pc',
  star (transition C) (pc, nil, m) (pc', nil, m)
  /\ codeseq_at C pc' (compile_com c)
  /\ compile_cont C k (pc' + length(compile_com c)).

Proof.

  intros. dependent induction H.
  exists pc; split. apply star_refl. split; congruence.
  destruct (IHcompile_cont _ _ (refl_equal _)) as [pc'' [A [B D]]].
  exists pc''; split; auto. eapply star_step; eauto. eapply trans_branch_forward; eauto.
Qed.

Lemma compile_cont_Kwhile_inv:
  forall C b c k pc m,
  compile_cont C (Kwhile b c k) pc ->
  exists pc',
  plus (transition C) (pc, nil, m) (pc', nil, m)
  /\ codeseq_at C pc' (compile_com (While b c))
  /\ compile_cont C k (pc' + length(compile_com (While b c))).

Proof.

  intros. dependent induction H.
  exists pc'; split.
  apply plus_one. eapply trans_branch_backward; eauto.
  split; congruence.
  destruct (IHcompile_cont _ _ _ (refl_equal _)) as [pc'' [A [B D]]].
  exists pc''; split; auto. eapply plus_left. eapply trans_branch_forward; eauto. apply plus_star; auto.
Qed.

Remark code_at_inv:
forall C pc i, code_at C pc = Some i -> exists C1, exists C2, C = C1 ++ C2 /\ length C1 = pc.

Proof.

  induction C; simpl; intros.
  inv H.
  destruct pc. inv H. exists (@nil instruction); exists (i :: C); auto.
  destruct (IHC _ _ H) as [C1 [C2 [A B]]].
  exists (a :: C1); exists C2; split. simpl; congruence. simpl; congruence.
Qed.

Remark code_at_codeseq:
forall C pc i, code_at C pc = Some i -> codeseq_at C pc nil.

Proof.

intros. destruct (code_at_inv _ _ _ H) as [C1 [C2 [A B]]].
subst. change C2 with (nil ++ C2). constructor. auto.
Qed.

Lemma match_state_skip:
  forall C k m pc,
  compile_cont C k pc ->
  match_state C (Skip, k, m) (pc, nil, m).

Proof.

  intros C.
  assert (forall k pc, compile_cont C k pc -> codeseq_at C pc nil).
    induction 1.
    eapply code_at_codeseq; eauto.
    change (compile_com c) with (nil ++ compile_com c) in H. eauto with codeseq.
    eapply code_at_codeseq; eauto.
    eapply code_at_codeseq; eauto.
  intros. constructor. simpl. eauto. simpl. rewrite plus_0_r; auto.
Qed.

Nous pouvons maintenant énoncer et prouver le diagramme de simulation.

Lemma step_simulation:
  forall C s1 s2,
  step s1 s2 ->
  forall S1,
  match_state C s1 S1 ->
  exists S2,
  (plus (transition C) S1 S2 \/ (measure s2 < measure s1 /\ star (transition C) S1 S2))
  /\ match_state C s2 S2.

Proof.

  induction 1; intros S1 MS; inv MS; simpl in CODEAT; simpl in CONTAT.
  econstructor; split.
  left. eapply star_plus_trans.
  apply compile_aexp_correct; eauto with codeseq.
  apply plus_one. apply trans_setvar. eauto with codeseq.
  apply match_state_skip. rewrite app_length in CONTAT. simpl in CONTAT.
  rewrite plus_assoc in CONTAT. auto.
  econstructor; split.
  right; split. simpl. omega.
  apply star_refl. constructor. eauto with codeseq.
  eapply ccont_seq; eauto with codeseq.
  rewrite app_length in CONTAT. rewrite plus_assoc in CONTAT. auto.
  econstructor; split.
  right; split.
  unfold measure; simpl. omega.
  apply compile_bexp_correct with (b := b) (v := true) (cond := false) (ofs := length (compile_com ifso) + 1);
  eauto with codeseq.
  simpl. rewrite plus_0_r.
  econstructor. eauto with codeseq.
  eapply ccont_branch. eauto with codeseq. eauto.
  repeat rewrite app_length in CONTAT.
  simpl in CONTAT. change (S (length (compile_com ifnot))) with (1 + length (compile_com ifnot)) in CONTAT.
  repeat rewrite plus_assoc in CONTAT. auto.
  econstructor; split.
  right; split.
  unfold measure; simpl. omega.
  apply compile_bexp_correct with (b := b) (v := false) (cond := false) (ofs := length (compile_com ifso) + 1);
  eauto with codeseq.
  simpl. constructor. repeat rewrite plus_assoc. eauto with codeseq.
  repeat rewrite plus_assoc.
  repeat rewrite app_length in CONTAT.
  simpl in CONTAT. change (S (length (compile_com ifnot))) with (1 + length (compile_com ifnot)) in CONTAT.
  repeat rewrite plus_assoc in CONTAT. auto.
  set (C1 := compile_com body) in *.
  set (C0 := compile_bexp b false (length C1 + 1)) in *.
  set (ofs := length C0 + length C1 + 1) in *.
  econstructor; split.
  right; split.
  simpl. generalize (com_size_nonzero body). omega.
  apply compile_bexp_correct with (b := b) (v := false) (cond := false) (ofs := length C1 + 1);
  eauto with codeseq.
  simpl. fold C0. repeat rewrite app_length in CONTAT.
  simpl in CONTAT. rewrite plus_assoc in CONTAT.
  apply match_state_skip. auto.
  set (C1 := compile_com body) in *.
  set (C0 := compile_bexp b false (length C1 + 1)) in *.
  set (ofs := length C0 + length C1 + 1) in *.
  econstructor; split.
  right; split. simpl. omega.
  apply compile_bexp_correct with (b := b) (v := true) (cond := false) (ofs := length C1 + 1);
  eauto with codeseq.
  simpl. rewrite plus_0_r.
  econstructor. eauto with codeseq.
  fold C0. fold C1. eapply ccont_while. eauto with codeseq. eauto.
  replace (pc + length C0 + length C1 + 1 - ofs) with pc. exact CODEAT.
  unfold ofs; omega.
  eauto. simpl. fold C1; fold C0; fold ofs.
  replace (pc + length C0 + length C1 + 1 - ofs) with pc. exact CONTAT.
  unfold ofs; omega.
  rewrite plus_0_r in CONTAT.
  destruct (compile_cont_Kseq_inv _ _ _ _ m CONTAT) as [pc' [A [B D]]].
  econstructor; split.
  right; split. simpl; omega. eauto.
  constructor; auto.
  rewrite plus_0_r in CONTAT.
  destruct (compile_cont_Kwhile_inv _ _ _ _ _ m CONTAT) as [pc' [A [B D]]].
  econstructor; split. left; eauto. constructor; auto.
Qed.

En corollaire, nous obtenons le diagramme ci-dessous, qui fournit une nouvelle preuve de préservation sémantique pour les programmes IMP qui terminent:

                      match_states
     c / k / m  ----------------------- machstate
       |                                    |
     * |                                    | *
       |                                    |
       v                                    v
    c' / k' / m' ----------------------- machstate'
                     match_states

Lemma step_simulation_star:
  forall C cs1 cs2,
  star step cs1 cs2 ->
  forall S1,
  match_state C cs1 S1 ->
  exists S2, star (transition C) S1 S2 /\ match_state C cs2 S2.

Proof.

  induction 1; intros.
  exists S1; split. apply star_refl. auto.
  destruct (step_simulation _ _ _ H _ H1) as [S2 [EXEC1 MS1]].
  destruct (IHstar _ MS1) as [S3 [EXEC2 MS2]].
  exists S3; split.
  eapply star_trans; eauto.
  destruct EXEC1 as [PLUS | [MEAS STAR]].
  inv PLUS. econstructor; eauto.
  auto.
  auto.
Qed.

Lemma match_initial_state:
forall c m,
match_state (compile_program c) (c, Kstop, m) (0, nil, m).

Proof.

  intros. constructor.
  change (compile_program c) with (nil ++ compile_com c ++ Ihalt :: nil). constructor. auto.
  simpl. unfold compile_program. constructor. apply code_at_app. auto.
Qed.

Theorem compile_program_correct_terminating_2:
  forall c m,
  prog_terminates c m ->
  mach_terminates (compile_program c) m.

Proof.

  intros. red in H.
  generalize (match_initial_state c initial_store); intro.
  destruct (step_simulation_star _ _ _ H _ H0) as [S2 [STAR MS]].
  inv MS.
  simpl in CONTAT. rewrite plus_0_r in CONTAT.
  destruct (compile_cont_Kstop_inv _ _ m CONTAT) as [pc' [A B]].
  red. exists pc'; split. auto. eapply star_trans; eauto.
Qed.

Pour montrer la préservation sémantique des programmes qui divergent, nous montrons tout d'abord que la machine peut toujours avancer si le programme source diverge. La preuve est par récurrence sur le nombre maximal m de "bégaiements" que la machine peut faire avant d'effectuer au moins une transition.

Lemma step_simulation_productive:
  forall C m cs1 S1,
  measure cs1 < m ->
  infseq step cs1 ->
  match_state C cs1 S1 ->
  exists cs2, exists S2,
     infseq step cs2
  /\ plus (transition C) S1 S2
  /\ match_state C cs2 S2.

Proof.

  induction m; intros.
  elimtype False; omega.
  inv H0. rename b into cs2.
  destruct (step_simulation _ _ _ H2 _ H1) as [S2 [EXEC1 MS1]].
  destruct EXEC1 as [PLUS | [MEAS STAR]].
  exists cs2; exists S2; auto.
  inv STAR.
  apply IHm with cs2. omega. auto. auto.
  exists cs2; exists S2; intuition. econstructor; eauto.
Qed.

Il s'ensuit que la machine effectue une infinité de transitions si le programme source IMP effectue une infinité de transitions.

Lemma step_simulation_infseq:
  forall C cs S,
  infseq step cs ->
  match_state C cs S ->
  infseq (transition C) S.

Proof.

  intros.
  apply infseq_coinduction_principle_2 with (X := fun S => exists cs, infseq step cs /\ match_state C cs S).
  intros. destruct H1 as [cs1 [A B]].
  destruct (step_simulation_productive C (measure cs1 + 1) cs1 a) as [cs2 [S2 [P [Q R]]]].
  omega. auto. auto.
  exists S2; split. auto. exists cs2; auto.
  exists cs; auto.
Qed.

Cela prouve que le compilateur est correct pour les programmes IMP qui divergent.

Theorem compile_program_correct_diverging:
  forall c,
  prog_diverges c ->
  mach_diverges (compile_program c).

Proof.

  intros. red. red in H.
  apply step_simulation_infseq with (c, Kstop, initial_store).
  auto. apply match_initial_state.
Qed.

Module IMPcompiler

1. La machine virtuelle.

2. Le compilateur

3. Preuve de préservation sémantique utilisant la sémantique naturelle

Résultats auxiliaires sur les suites d'instructions.

Correction du code produit pour les expressions

Correction du code compilé pour une commande qui termine

4. Preuve de préservation sémantique utilisant la sémantique à transitions