## Explaining higher-rank and impredicative polymorphism, simply

I recently wrote a type checker for one of my projects-- specifically, one implementing higher-ranked polymorphism. It's based on this paper by Jana Dunfield and Neel Krishnaswami, if you're curious. I realized, while writing it, that its grammar for types offers an easy-to-digest explanation of the different varieties of parametric polymorphism.

For a first jab at this, you could think of *predicative*, *rank-2*,
*higher-rank*, and *impredicative* polymorphism as being arranged
according to the scale below, where the direction of the arrow denotes an
increasing number of programs that will be accepted by each kind of type
checker. For example, the set of programs accepted by a *rank-2*
polymorphic type checker should be a strict superset of those accepted by a
*predicative* polymorphic type checker. The
classic
Hindley-Milner system used by OCaml, Rust, and others is an example of the
latter.

Okay, so we've established that in some vague way, rank-2 polymorphism is 'more powerful' than predicative polymorphism. How, exactly? It's easiest to explain, I think, from the perspective of how the type checker sees your program. I'll use OCaml snippets throughout-- you might want to brush up on its syntax if you're unfamiliar (but if you know another functional language, they should be pretty readable regardless).

### Desugaring polymorphic types

Firstly, it's important to understand something about the syntax sugar
generally applied to polymorphic type signatures. Take, for instance, the
following two definitions for `id`

, the identity function:

let id: 'a -> 'a = fun a -> a let id': 'a . 'a -> 'a = fun a -> a

The difference is minimal-- there's an extra `'a`

with a dot after it in
the type for `id'`

. Semantically, both types for the identity function
are equivalent. The second type is the desugaring of the first, revealing the
binder for `'a`

. Usually, the language hides these quantifiers from you,
but they can be written out explicitly. The quantifier binds a type variable,
and when the function is called, it's set to the type of the corresponding
argument-- it's how you'd normally think about using generics. What this
means is that you can instantiate `'a`

s to *any* type, which is why
the quantifier is called a *universal quantifier* (∀). The differences
between the various points on the scale of parametric polymorphism lie in
where you're allowed to put these quantifiers.

### How a type checker sees things: starting simple

The following type definitions are lifted more-or-less straightforwardly from the checker I wrote in Rust (I've translated the enum definitions to OCaml for consistency). They're what define the structure of the language's types.

type polytype = | Forall of var list * polytype | Monotype of monotype

type monotype = | MVar of var | Int | Bool | String | List of monotype | Func of monotype list * monotype

The constructors of both `polytype`

and `monotype`

look like the
types you've seen before. For instance, this type:

'a 'b . 'a -> 'b

Would be represented as:

Forall ([Var "a"; Var "b"], Monotype (Func ([MVar (Var "a")], MVar (Var "b"))))

`polytype`

is the important one here: it's what determines where your
type system will be placed on the scale. Anything that can take a
`polytype`

argument (only other `polytype`

s, generally) can
*contain quantifiers*. The question of which types can *contain
quantifiers* (distinct from *being quantified over*) is what distinguishes
each kind parametric polymorphism from another.

In this case, observe that the only type constructors which can contain
`Forall`

s are... other `Forall`

s. This is what's known as a
predicative polymorphic type system. Here, all the
quantifiers in a type, if any, have to appear at the beginning. Observe that
once we create a `Monotype`

, there's no way to create another
`Forall`

inside it-- `Monotype`

s cannot contain quantifiers. So any
predicative polymorphic type must look something like this, with all the
quantifiers out front:

'a 'b . 'a -> 'b

### Adding higher-rank polymorphism

Now look at the updated definition of `polytype`

below.

type polytype = | Forall of var list * polytype | Func of polytype list * polytype | Monotype of monotype

It turns out I lied a little when I said I lifted the type definitions in the
prior section from my type checker. The checker I wrote supports
higher-rank polymorphism, which means that the function type
constructor (sometimes called `->`

) is a `polytype`

, as in this
example. What this means is that function types can contain quantifiers!

So, pretending for a moment that OCaml supports higher-rank polymorphism, we can write things like this:

let int_and_string: ('a . 'a -> 'a) -> int * string = fun arg -> (arg 1, arg "hey!")

What does this get us? Let's try writing the predicative polymorphic version:

let int_and_string: 'a . ('a -> 'a) -> int * string = fun arg -> (arg 1, arg "hey!")

Unfortunately, we get an error:

fun arg -> (arg 1, arg "hey") ^^^^^ Error: This expression has type string but an expression was expected of type int

What happened? OCaml inferred at the site of `arg`

's application to
`1`

that `'a`

should be instantiated to `int`

. For the body of
`int_and_string`

, then, `'a`

became `int`

. Or that's what you
might think-- we're actually *still* going to get an error from this even
if we don't give `arg`

an argument of a different type:

let int_and_string: 'a . ('a -> 'a) -> int = fun arg -> arg 1

Again, we get an error. I've simplified this error from the one OCaml actually gives:

fun arg -> arg 1 ^^^ Error: [The argument has type int but an expression was expected of type 'a]

Wait, what? But `'a`

is a type variable! We can instantiate it to
`int`

, right? The crux of the issue here is that `arg`

is *not*
itself polymorphic: it merely has input and return types that are described
in terms of `int_and_string`

's polymorphic type variable. By the time
the predicative polymorphic variant of `int_and_string`

has been called,
`'a`

has already been instantiated to something-- we can't do so again!
On the other hand, when the higher-rank polymorphic version of
`int_and_string`

is called, `'a`

has not been instantiated yet.
There's still a quantifier that we can instantiate ourselves, to anything
we'd like-- in other words, `arg`

is polymorphic.

More succinctly, the type of `arg`

in the predicative polymorphic case is:

'a -> 'a

Where `'a`

is *some* (unknown to us) type, not *any* type.
Meanwhile, the type of `arg`

in the higher-rank case is:

'a . 'a -> 'a

Where `'a`

has not been bound yet, so the `'a`

s in the input and
output types represent *any* type.

What about rank-2 polymorphism? Basically, think
of predicative polymorphic functions as *rank-1 polymorphic*. Rank-2
polymorphic functions take rank-1 polymorphic functions as arguments.
The higher-rank polymorphic version of `int_and_string`

is a rank-2
polymorphic function for this reason.

It's not very interesting for our purposes here, but some languages call it out specially because it retains decidable type inference. In general, for anything beyond rank-2, full type inference isn't decidable (but, of course, you can still have inference for non-higher-rank-polymorphic functions).

### Impredicativity

Finally, here's what `polytype`

would look like for a type checker that
supports *impredicative lists*:

type polytype = | Forall of var list * polytype | Func of polytype list * polytype | List of polytype | Monotype of monotype

It hasn't changed much from the higher-rank variant, and follows a similar
pattern-- we've added another parametric type constructor, this time for
lists. A type system possesses impredicativity when
*quantifiers are allowed anywhere*-- particularly, in the constructors of
any type. Higher-rank polymorphism, as it turns out, is just a special case
of impredicativity that permits quantifiers in function type constructors.

OCaml supports a limited form of impredicativity for records (and object
methods). What this means is that, in OCaml's type checker, records are a
`polytype`

-- they can contain `Forall`

s:

type my_record_type = { f: 'a . 'a -> 'a }

You can actually use this to emulate higher-ranked polymorphism, by creating functions that take these records as arguments.

And that's it! Once you've understood the significance of quantifiers and where you're allowed to put them, the rest of it falls into place.

Finally, if I've made any mistakes here, please don't hesitate to reach out!