More on Impredicativity:
Last time, I talked about the difference between predicative and
impredicative polymorphism: at issue, when we introduce type
variables, is what they range over. If
they range over all types, then we inevitably get a circular
definition of some sort and we must somehow break the circularity
to get a well-formed interpretation of the types.
For predicative polymorphism, we broke the cycle by stratifying
the types. This corresponds to the semantics of ML.
For impredicative polymorphism, I started to give a model that
was aimed at proving strong normalization of reduction. But I
think it's easier to understand the simpler model where we only
worry about evaluation. In either setting, the idea is to define
outright (without appealing to the definition of the types) some
universe over which the type variables will range.
In this setting, it helps to move to a proof system where terms
are not decorated by types. In particular, let us consider this
slightly different version of F2:
D;G |- x : G(x)
D;G |- i : int
D;G |- e1 : t'->t D;G |- e2 : t'
-----------------------------------
D;G |- e1 e2 : t
D |- t1 : * D;G,x:t1 |- e : t2
--------------------------------- (x not in G)
D;G |- \x.e : t1 -> t2
D;G |- e : All a.t D |- t' : *
---------------------------------
D;G |- e : t{t'/a}
D,a;G |- e : t D |- G ok
------------------------------ (a not in D)
D;G |- e : All a.t
D |- {} : ok
D |- G : ok D |- t : *
------------------------- (x not in G)
D |- G,x:t : ok
The only differences are that:
(a) we don't put types on arguments to functions
(b) we don't have explicit type abstraction (/\a.e)
(c) we don't have explicit instantiation (e t)
Rather, this information is encoded completely in the typing proof.
Of course, type checking is no longer syntax directed, but it's
relatively easy to see that D;G |- e : t in the original system
iff D;G |- erase(e) : t in the new system.
One other note: in the case of polymorphic generalization,
we need to make sure that the context G stays well-formed
when we locally abstract the type variable a. So the
judgment D |- G ok checks that a does not occur free in the
context G.
With these preliminaries out of the way, I'd like to note
that the set VAL of all closed, (undecorated/erased) values can
be defined outright without appealing to the typing rules.
We're going to use 2^VAL as our universe and construct
an interpretation of types that maps us into this pre-defined
universe. We'll start with a value interpretation, parameterized
by a d which maps type variables to elements of 2^VAL:
V : Type -> (Tvar -> 2^VAL) -> 2^VAL
V[int]d = { i }
V[a]d = d(a)
V[t1 -> t2]d = { v | FV(v) = {} ^ All v1 in V[t1]d. v v1 in C[t2]d }
V[All a.t]d = intersect(S in 2^VAL)(V[t]d[a|->S])
the definition makes use of C[t]d which generates a set of
(closed) expressions:
C[t]d = { e | e ->* v and v in V[t]d }
with the property that every expression reduces to a (closed)
value in the right set.
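The relation e ->* v can be made concrete with a small call-by-value evaluator over erased, closed terms (again using a hypothetical tuple encoding; since we only ever substitute closed values, variable capture is not an issue):

```python
def subst(e, x, v):
    """e{v/x}, assuming v is closed (so no capture can occur)."""
    tag = e[0]
    if tag == 'var':
        return v if e[1] == x else e
    if tag == 'int':
        return e
    if tag == 'lam':                    # ('lam', x, body)
        return e if e[1] == x else ('lam', e[1], subst(e[2], x, v))
    if tag == 'app':
        return ('app', subst(e[1], x, v), subst(e[2], x, v))
    raise ValueError('unknown term: %r' % (tag,))

def evaluate(e):
    """Big-step call-by-value: returns the v with e ->* v (may diverge)."""
    if e[0] == 'app':
        f = evaluate(e[1])              # f must be ('lam', x, body)
        v = evaluate(e[2])
        return evaluate(subst(f[2], f[1], v))
    return e                            # ints and lambdas are values

# ((\x.x) 5) ->* 5, so (\x.x) 5 is in C[int]:
assert evaluate(('app', ('lam', 'x', ('var', 'x')), ('int', 5))) == ('int', 5)
```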
This interpretation is well-founded because every time we
go around the V->C->V loop, the type gets smaller, and because
the intersection operation chooses sets from something we've
already defined: 2^VAL.
We need to argue that V[t]d indeed produces something in 2^VAL
to make sure that our quantifiers cover all of the types,
but it's pretty easy to see that we're always generating
sets of closed values for V[t].
Again, we need to extend the relation to open terms and do
so by defining:
V[G]d = { g : Var -> VAL | All x in Dom(G).g(x) in V[G(x)]d }
Finally, we need to establish that every well-typed expression
is in the set that we claim it should be:
Thm: If D;G |- e : t, then for all d and g in V[G]d,
g(e) in C[t]d.
The proof is extremely straightforward, but it's worth looking
at the case for polymorphic instantiation:
case: D;G |- e : All a.t D |- t' : *
-----------------------------------
D;G |- e : t{t'/a}
Pick d : Tvar -> 2^VAL and g in V[G]d. We must show
g(e) in C[t{t'/a}]d. We will need to establish the following
lemma:
Lemma: V[t]d[a|->V[t']d] = V[t{t'/a}]d
which is a straightforward induction on t, using the definition
of substitution.
With the lemma, it suffices to show g(e) in C[t]d[a|->V[t']d].
Now by induction, we have g(e) in C[All a.t]d. This means
that g(e) ->* v and v in V[All a.t]d = intersect(S in 2^VAL)(V[t]d[a->S]).
Picking S = V[t']d it then follows that v in V[t]d[a|->S].
Note that in this step, we're using the fact that V[t']d
is one of the possible things that we can bind to the type
variable a.
It's also worth looking at the proof for polymorphic generalization:
case: D,a;G |- e : t D |- G ok
-----------------------------
D;G |- e : All a.t
Pick d : Tvar -> 2^VAL and g in V[G]d. We must show
g(e) in C[All a.t]d. Again, we will need a lemma to show:
Lemma: if D |- G ok and g in V[G]d, then for all a not in D and
S in 2^VAL, g in V[G]d[a|->S].
That is, it's okay to remap a type variable to a different set of
values as long as the substitution doesn't depend upon that type
variable.
Now by induction, we have for all d' and g' in V[G]d', that g'(e) in
C[t]d'. By the lemma, for all S, g in V[G]d[a->S]. So it follows
that g(e) in C[t]d[a->S] for all S. In turn, this means that g(e) ->*
v in V[t]d[a->S]. And thus g(e) ->* v in V[All a.t]d and so g(e) in
C[All a.t]d.
The other cases follow in a straightforward fashion. As a corollary
of the theorem, we have:
Corr: If |- e : t, then e ->* v and v in V[t]d for any d.
That is, we've proven that every well-typed F2 term terminates
and produces a value of the right type (i.e., programs don't
go "wrong" as Milner puts it.)
But we can use the interpretation to argue other things. For
instance, recall that we defined void, the empty type, as
All a.a because I claimed there was no closed value with this
type. We can see this because:
V[All a.a] = intersect(S in 2^VAL).S
The empty set is an element of 2^VAL, so the intersection must
be empty!
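This intersection argument is easy to check concretely on a finite stand-in for VAL (the real VAL is infinite; the toy set here is purely illustrative): because the empty set is among the S we intersect over, nothing survives.

```python
from itertools import chain, combinations

VAL = ['v0', 'v1', 'v2']   # toy, finite stand-in for the closed values

def powerset(xs):
    """All subsets of xs, i.e. 2^VAL for the toy universe."""
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# V[All a.a] = intersection over every S in 2^VAL of S
interp = set(VAL)
for S in powerset(VAL):
    interp &= set(S)       # the first S is the empty set, so this empties interp

assert interp == set()     # no closed value inhabits All a.a
```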
We also picked unit to be All a.a->a. Note that:
V[All a.a->a] = intersect(S in 2^VAL).{v | All v1 in S.v v1 ->* v2 in S}
Here, when we pick S to be empty, the expression:
{v | All v1 in S.v v1 ->* v2 in S}
simplifies to all VAL because there is no v1 in S, so every value
trivially satisfies the constraints. But we must look at all possible
sets of closed values. Notice that for any v', picking S = {v'} we
have:
{ v | All v1 in S.v v1 ->* v2 in S} =
{ v | v v' ->* v' }
That is, v must behave like the identity function. So we know that
if there's anything in V[All a.a->a], it must behave like the identity
function. Of course, the intersection could be empty, so we need to
exhibit at least one v that behaves like the identity function.
That is, we must show \x.x in V[All a.a->a]. But this is easy
since we can construct a proof that |- \x.x : All a.a->a.
So to summarize, it is possible to construct a model that avoids
the circularity, but it hinges on constructing some suitable
universe up front, that over-approximates the interpretation of
the types. In this example, we picked sets of closed values.
In the strong-normalization case, we picked sets of saturated
expressions. The particular choice of over-approximation is
tied to what you're trying to prove.
When we interpret for-all types, we are forced to intersect over
all of the possible approximates which is actually a much stronger
condition than what is required. For instance, our interpretation
of forall ranges over some types that we can't even write down
(e.g., the positive integers or singleton types.) However, this
is a feature in that we can use these non-standard "types" to
argue about the properties of polymorphic functions. For instance,
we used singleton types to argue that indeed the only closed values with
type forall a.a->a are equivalent to the identity function.
The power of this technique starts to become really apparent when
we interpret types not as unary relations, but rather as binary
relations. Consider, for a moment a simple signature such as:
type bool
val true : bool
val false : bool
val if : All 'a.bool -> 'a -> 'a -> 'a
We can represent a "structure" that implements this signature
using an existential:
Exists t.(t * t * All a.t -> a -> a -> a)
Eliminating the existential, we see that a client of the package
must look like this:
boolclient : All t.((t * t * (All a.t->a->a->a)) -> t')
for some t'. Now consider two implementations of the signature:
imp1: we represent booleans using integers with 1 for true, and 0 for false.
imp2: we represent booleans using integers with 0 for false, and any non-zero
value for true.
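Concretely, the two implementations might look like this (a sketch in Python rather than ML; the dictionary encoding and the choice of 7 for imp2's true are made up here):

```python
# imp1: true is exactly 1, false is 0
imp1 = {
    'true':  1,
    'false': 0,
    'if':    lambda b, x, y: x if b == 1 else y,
}

# imp2: false is 0, true is any non-zero value (we pick 7)
imp2 = {
    'true':  7,
    'false': 0,
    'if':    lambda b, x, y: x if b != 0 else y,
}

# A client that only uses the signature's operations cannot tell them apart:
def client(imp):
    return imp['if'](imp['true'], 10, imp['if'](imp['false'], 20, 30))

assert client(imp1) == client(imp2) == 10
```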
Can the client tell which implementation we're using? Intuitively,
the answer is no --- this is the power of type abstraction, namely
representation independence. We can prove that indeed, the
client will behave the same by setting up a type-indexed *binary*
relation:
V[a]d = d(a)
V[int]d = {(i,i)}
V[t1->t2]d = {(v1,v2) | All (v1',v2') in V[t1]d.(v1 v1',v2 v2') in C[t2]d }
V[All a.t]d = intersect(P in 2^(VAL x VAL)).V[t]d[a|->P]
Now suppose we want to prove that, to *any* client, they can't observe
whether we're using imp1 or imp2. It suffices to construct a binary
logical relation that relates two t values i and j:
V[t] = { (i,j) | (i = 1 and j != 0) or (i = 0 and j = 0) }
This could be our value interpretation of the abstract type for
booleans. Then we just need to show that the two implementations
are related at t * t * (All a.t->a->a->a) under this interpretation;
the theorem then tells us that the client's results are related at t'.
In particular, if t' is int, then running under one implementation
yields the same answer as running under the other implementation.
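We can spot-check the relational argument (a finite sketch only; the real proof quantifies over all clients and all related inputs): the implementations' constants are pairwise related by the relation above, and their if operations map related booleans plus equal integer branches to equal integer results.

```python
# The chosen relation on representations: (i, j) are "the same boolean"
def related(i, j):
    return (i == 1 and j != 0) or (i == 0 and j == 0)

# imp1: true is exactly 1; imp2: true is any non-zero value (7 here)
true1, false1 = 1, 0
if1 = lambda b, x, y: x if b == 1 else y
true2, false2 = 7, 0
if2 = lambda b, x, y: x if b != 0 else y

# The constants are related components of the two tuples:
assert related(true1, true2) and related(false1, false2)

# And 'if' sends related booleans and equal int branches to equal ints:
for b1, b2 in [(true1, true2), (false1, false2)]:
    assert if1(b1, 10, 20) == if2(b2, 10, 20)
```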