Explaining higher-rank and impredicative polymorphism, simply
I recently wrote a type checker for one of my projects-- specifically, one implementing higher-ranked polymorphism. It's based on this paper by Jana Dunfield and Neel Krishnaswami, if you're curious. I realized, while writing it, that its grammar for types offers an easy-to-digest explanation of the different varieties of parametric polymorphism.
For a first jab at this, you could think of predicative, rank-2, higher-rank, and impredicative polymorphism as being arranged according to the scale below, where the direction of the arrow denotes an increasing number of programs that will be accepted by each kind of type checker. For example, the set of programs accepted by a rank-2 polymorphic type checker should be a strict superset of those accepted by a predicative polymorphic type checker. The classic Hindley-Milner system used by OCaml, Rust, and others is an example of the latter.
Okay, so we've established that in some vague way, rank-2 polymorphism is 'more powerful' than predicative polymorphism. How, exactly? It's easiest to explain, I think, from the perspective of how the type checker sees your program. I'll use OCaml snippets throughout-- you might want to brush up on its syntax if you're unfamiliar (but if you know another functional language, they should be pretty readable regardless).
Desugaring polymorphic types
Firstly, it's important to understand something about the syntax sugar
generally applied to polymorphic type signatures. Take, for instance, the
following two definitions for id
, the identity function:
let id: 'a -> 'a = fun a -> a let id': 'a . 'a -> 'a = fun a -> a
The difference is minimal-- there's an extra 'a
with a dot after it in
the type for id'
. Semantically, both types for the identity function
are equivalent. The second type is the desugaring of the first, revealing the
binder for 'a
. Usually, the language hides these quantifiers from you,
but they can be written out explicitly. The quantifier binds a type variable,
and when the function is called, it's set to the type of the corresponding
argument-- it's how you'd normally think about using generics. What this
means is that you can instantiate 'a
s to any type, which is why
the quantifier is called a universal quantifier (∀). The differences
between the various points on the scale of parametric polymorphism lie in
where you're allowed to put these quantifiers.
How a type checker sees things: starting simple
The following type definitions are lifted more-or-less straightforwardly from the checker I wrote in Rust (I've translated the enum definitions to OCaml for consistency). They're what define the structure of the language's types.
type polytype = | Forall of var list * polytype | Monotype of monotype
type monotype = | MVar of var | Int | Bool | String | List of monotype | Func of monotype list * monotype
The constructors of both polytype
and monotype
look like the
types you've seen before. For instance, this type:
'a 'b . 'a -> 'b
Would be represented as:
Forall ([Var "a"; Var "b"], Monotype (Func ([MVar (Var "a")], MVar (Var "b"))))
polytype
is the important one here: it's what determines where your
type system will be placed on the scale. Anything that can take a
polytype
argument (only other polytype
s, generally) can
contain quantifiers. The question of which types can contain
quantifiers (distinct from being quantified over) is what distinguishes
each kind parametric polymorphism from another.
In this case, observe that the only type constructors which can contain
Forall
s are... other Forall
s. This is what's known as a
predicative polymorphic type system. Here, all the
quantifiers in a type, if any, have to appear at the beginning. Observe that
once we create a Monotype
, there's no way to create another
Forall
inside it-- Monotype
s cannot contain quantifiers. So any
predicative polymorphic type must look something like this, with all the
quantifiers out front:
'a 'b . 'a -> 'b
Adding higher-rank polymorphism
Now look at the updated definition of polytype
below.
type polytype = | Forall of var list * polytype | Func of polytype list * polytype | Monotype of monotype
It turns out I lied a little when I said I lifted the type definitions in the
prior section from my type checker. The checker I wrote supports
higher-rank polymorphism, which means that the function type
constructor (sometimes called ->
) is a polytype
, as in this
example. What this means is that function types can contain quantifiers!
So, pretending for a moment that OCaml supports higher-rank polymorphism, we can write things like this:
let int_and_string: ('a . 'a -> 'a) -> int * string = fun arg -> (arg 1, arg "hey!")
What does this get us? Let's try writing the predicative polymorphic version:
let int_and_string: 'a . ('a -> 'a) -> int * string = fun arg -> (arg 1, arg "hey!")
Unfortunately, we get an error:
fun arg -> (arg 1, arg "hey") ^^^^^ Error: This expression has type string but an expression was expected of type int
What happened? OCaml inferred at the site of arg
's application to
1
that 'a
should be instantiated to int
. For the body of
int_and_string
, then, 'a
became int
. Or that's what you
might think-- we're actually still going to get an error from this even
if we don't give arg
an argument of a different type:
let int_and_string: 'a . ('a -> 'a) -> int = fun arg -> arg 1
Again, we get an error. I've simplified this error from the one OCaml actually gives:
fun arg -> arg 1 ^^^ Error: [The argument has type int but an expression was expected of type 'a]
Wait, what? But 'a
is a type variable! We can instantiate it to
int
, right? The crux of the issue here is that arg
is not
itself polymorphic: it merely has input and return types that are described
in terms of int_and_string
's polymorphic type variable. By the time
the predicative polymorphic variant of int_and_string
has been called,
'a
has already been instantiated to something-- we can't do so again!
On the other hand, when the higher-rank polymorphic version of
int_and_string
is called, 'a
has not been instantiated yet.
There's still a quantifier that we can instantiate ourselves, to anything
we'd like-- in other words, arg
is polymorphic.
More succinctly, the type of arg
in the predicative polymorphic case is:
'a -> 'a
Where 'a
is some (unknown to us) type, not any type.
Meanwhile, the type of arg
in the higher-rank case is:
'a . 'a -> 'a
Where 'a
has not been bound yet, so the 'a
s in the input and
output types represent any type.
What about rank-2 polymorphism? Basically, think of
predicative polymorphic functions as rank-1 polymorphic. Rank-2
polymorphic functions take rank-1 polymorphic functions as arguments. The
higher-rank polymorphic version of int_and_string
is a rank-2
polymorphic function for this reason.
It's not very interesting for our purposes here, but some languages call it out specially because it retains decidable type inference. In general, for anything beyond rank-2, full type inference isn't decidable (but, of course, you can still have inference for non-higher-rank-polymorphic functions).
Impredicativity
Finally, here's what polytype
would look like for a type checker that
supports impredicative lists:
type polytype = | Forall of var list * polytype | Func of polytype list * polytype | List of polytype | Monotype of monotype
It hasn't changed much from the higher-rank variant, and follows a similar pattern-- we've added another parametric type constructor, this time for lists. A type system possesses impredicativity when quantifiers are allowed anywhere-- particularly, in the constructors of any type. Higher-rank polymorphism, as it turns out, is just a special case of impredicativity that permits quantifiers in function type constructors.
OCaml supports a limited form of impredicativity for records (and object
methods). What this means is that, in OCaml's type checker, records are a
polytype
-- they can contain Forall
s:
type my_record_type = { f: 'a . 'a -> 'a }
You can actually use this to emulate higher-ranked polymorphism, by creating functions that take these records as arguments.
And that's it! Once you've understood the significance of quantifiers and where you're allowed to put them, the rest of it falls into place.
Finally, if I've made any mistakes here, please don't hesitate to reach out!