Explaining higher-rank and impredicative polymorphism, simply

I recently wrote a type checker for one of my projects—specifically, one implementing higher-ranked polymorphism. It's based on this paper by Jana Dunfield and Neel Krishnaswami, if you're curious. I realized, while writing it, that its grammar for types offers an easy-to-digest explanation of the different varieties of parametric polymorphism.

For a first jab at this, you could think of predicative, rank-2, higher-rank, and impredicative polymorphism as being arranged according to the scale below, where the direction of the arrow denotes an increasing number of programs that will be accepted by each kind of type checker. For example, the set of programs accepted by a rank-2 polymorphic type checker should be a strict superset of those accepted by a predicative polymorphic type checker. The classic Hindley-Milner system used by OCaml, Rust, and others is an example of the latter.

Okay, so we've established that in some vague way, rank-2 polymorphism is 'more powerful' than predicative polymorphism. How, exactly? It's easiest to explain, I think, from the perspective of how the type checker sees your program. I'll use OCaml snippets throughout—you might want to brush up on its syntax if you're unfamiliar (but if you know another functional language, they should be pretty readable regardless).

Desugaring polymorphic types

Firstly, it's important to understand something about the syntax sugar generally applied to polymorphic type signatures. Take, for instance, the following two definitions for id, the identity function:

let id: 'a -> 'a = fun a -> a
let id': 'a . 'a -> 'a = fun a -> a

The difference is minimal—there's an extra 'a with a dot after it in the type for id'. Semantically, both types for the identity function are equivalent. The second type is the desugaring of the first, revealing the binder for 'a. Usually, the language hides these quantifiers from you, but they can be written out explicitly. The quantifier binds a type variable, and when the function is called, it's set to the type of the corresponding argument—it's how you'd normally think about using generics. What this means is that you can instantiate 'as to any type, which is why the quantifier is called a universal quantifier (∀). The differences between the various points on the scale of parametric polymorphism lie in where you're allowed to put these quantifiers.

How a type checker sees things: starting simple

The following type definitions are lifted more-or-less straightforwardly from the checker I wrote in Rust (I've translated the enum definitions to OCaml for consistency). They're what define the structure of the language's types.

type polytype =
| Forall of var list * polytype
| Monotype of monotype

type monotype =
| MVar of var
| Int
| Bool
| String
| List of monotype
| Func of monotype list * monotype

The constructors of both polytype and monotype look like the types you've seen before. For instance, this type:

'a 'b . 'a -> 'b

Would be represented as:

Forall ([Var "a"; Var "b"], Monotype (Func ([MVar (Var "a")], MVar (Var "b"))))

polytype is the important one here: it's what determines where your type system will be placed on the scale. Anything that can take a polytype argument (only other polytypes, generally) can contain quantifiers. The question of which types can contain quantifiers (distinct from being quantified over) is what distinguishes each kind parametric polymorphism from another.

In this case, observe that the only type constructors which can contain Foralls are... other Foralls. This is what's known as a predicative polymorphic type system. Here, all the quantifiers in a type, if any, have to appear at the beginning. Observe that once we create a Monotype, there's no way to create another Forall inside it—Monotypes cannot contain quantifiers. So any predicative polymorphic type must look something like this, with all the quantifiers out front:

'a 'b . 'a -> 'b

Adding higher-rank polymorphism

Now look at the updated definition of polytype below.

type polytype =
| Forall of var list * polytype
| Func of polytype list * polytype
| Monotype of monotype

It turns out I lied a little when I said I lifted the type definitions in the prior section from my type checker. The checker I wrote supports higher-rank polymorphism, which means that the function type constructor (sometimes called ->) is a polytype, as in this example. What this means is that function types can contain quantifiers!

So, pretending for a moment that OCaml supports higher-rank polymorphism, we can write things like this:

let int_and_string: ('a . 'a -> 'a) -> int * string = fun arg -> (arg 1, arg "hey!")

What does this get us? Let's try writing the predicative polymorphic version:

let int_and_string: 'a . ('a -> 'a) -> int * string = fun arg -> (arg 1, arg "hey!")

Unfortunately, we get an error:

fun arg -> (arg 1, arg "hey")
                       ^^^^^
Error: This expression has type string but an expression was expected of type int

What happened? OCaml inferred at the site of arg's application to 1 that 'a should be instantiated to int. For the body of int_and_string, then, 'a became int. Or that's what you might think—we're actually still going to get an error from this even if we don't give arg an argument of a different type:

let int_and_string: 'a . ('a -> 'a) -> int = fun arg -> arg 1

Again, we get an error. I've simplified this error from the one OCaml actually gives:

fun arg -> arg 1
              ^^^
Error: [The argument has type int but an expression was expected of type 'a]

Wait, what? But 'a is a type variable! We can instantiate it to int, right? The crux of the issue here is that arg is not itself polymorphic: it merely has input and return types that are described in terms of int_and_string's polymorphic type variable. By the time the predicative polymorphic variant of int_and_string has been called, 'a has already been instantiated to something—we can't do so again! On the other hand, when the higher-rank polymorphic version of int_and_string is called, 'a has not been instantiated yet. There's still a quantifier that we can instantiate ourselves, to anything we'd like—in other words, arg is polymorphic.

More succinctly, the type of arg in the predicative polymorphic case is:

'a -> 'a

Where 'a is some (unknown to us) type, not any type. Meanwhile, the type of arg in the higher-rank case is:

'a . 'a -> 'a

Where 'a has not been bound yet, so the 'as in the input and output types represent any type.

What about rank-2 polymorphism? Basically, think of predicative polymorphic functions as rank-1 polymorphic. Rank-2 polymorphic functions take rank-1 polymorphic functions as arguments. The higher-rank polymorphic version of int_and_string is a rank-2 polymorphic function for this reason.

It's not very interesting for our purposes here, but some languages call it out specially because it retains decidable type inference. In general, for anything beyond rank-2, full type inference isn't decidable (but, of course, you can still have inference for non-higher-rank-polymorphic functions).

Impredicativity

Finally, here's what polytype would look like for a type checker that supports impredicative lists:

type polytype =
| Forall of var list * polytype
| Func of polytype list * polytype
| List of polytype
| Monotype of monotype

It hasn't changed much from the higher-rank variant, and follows a similar pattern—we've added another parametric type constructor, this time for lists. A type system possesses impredicativity when quantifiers are allowed anywhere—particularly, in the constructors of any type. Higher-rank polymorphism, as it turns out, is just a special case of impredicativity that permits quantifiers in function type constructors.

OCaml supports a limited form of impredicativity for records (and object methods). What this means is that, in OCaml's type checker, records are a polytype—they can contain Foralls:

type my_record_type = { f: 'a . 'a -> 'a }

You can actually use this to emulate higher-ranked polymorphism, by creating functions that take these records as arguments.

And that's it! Once you've understood the significance of quantifiers and where you're allowed to put them, the rest of it falls into place.

Finally, if I've made any mistakes here, please don't hesitate to reach out!