$$ \newcommand{\bone}{\mathbf{1}} \newcommand{\bbeta}{\mathbf{\beta}} \newcommand{\bdelta}{\mathbf{\delta}} \newcommand{\bepsilon}{\mathbf{\epsilon}} \newcommand{\blambda}{\mathbf{\lambda}} \newcommand{\bomega}{\mathbf{\omega}} \newcommand{\bpi}{\mathbf{\pi}} \newcommand{\bphi}{\mathbf{\phi}} \newcommand{\bvphi}{\mathbf{\varphi}} \newcommand{\bpsi}{\mathbf{\psi}} \newcommand{\bsigma}{\mathbf{\sigma}} \newcommand{\btheta}{\mathbf{\theta}} \newcommand{\btau}{\mathbf{\tau}} \newcommand{\ba}{\mathbf{a}} \newcommand{\bb}{\mathbf{b}} \newcommand{\bc}{\mathbf{c}} \newcommand{\bd}{\mathbf{d}} \newcommand{\be}{\mathbf{e}} \newcommand{\boldf}{\mathbf{f}} \newcommand{\bg}{\mathbf{g}} \newcommand{\bh}{\mathbf{h}} \newcommand{\bi}{\mathbf{i}} \newcommand{\bj}{\mathbf{j}} \newcommand{\bk}{\mathbf{k}} \newcommand{\bell}{\mathbf{\ell}} \newcommand{\bm}{\mathbf{m}} \newcommand{\bn}{\mathbf{n}} \newcommand{\bo}{\mathbf{o}} \newcommand{\bp}{\mathbf{p}} \newcommand{\bq}{\mathbf{q}} \newcommand{\br}{\mathbf{r}} \newcommand{\bs}{\mathbf{s}} \newcommand{\bt}{\mathbf{t}} \newcommand{\bu}{\mathbf{u}} \newcommand{\bv}{\mathbf{v}} \newcommand{\bw}{\mathbf{w}} \newcommand{\bx}{\mathbf{x}} \newcommand{\by}{\mathbf{y}} \newcommand{\bz}{\mathbf{z}} \newcommand{\bA}{\mathbf{A}} \newcommand{\bB}{\mathbf{B}} \newcommand{\bC}{\mathbf{C}} \newcommand{\bD}{\mathbf{D}} \newcommand{\bE}{\mathbf{E}} \newcommand{\bF}{\mathbf{F}} \newcommand{\bG}{\mathbf{G}} \newcommand{\bH}{\mathbf{H}} \newcommand{\bI}{\mathbf{I}} \newcommand{\bJ}{\mathbf{J}} \newcommand{\bK}{\mathbf{K}} \newcommand{\bL}{\mathbf{L}} \newcommand{\bM}{\mathbf{M}} \newcommand{\bN}{\mathbf{N}} \newcommand{\bP}{\mathbf{P}} \newcommand{\bQ}{\mathbf{Q}} \newcommand{\bR}{\mathbf{R}} \newcommand{\bS}{\mathbf{S}} \newcommand{\bT}{\mathbf{T}} \newcommand{\bU}{\mathbf{U}} \newcommand{\bV}{\mathbf{V}} \newcommand{\bW}{\mathbf{W}} \newcommand{\bX}{\mathbf{X}} \newcommand{\bY}{\mathbf{Y}} \newcommand{\bZ}{\mathbf{Z}} \newcommand{\bsa}{\boldsymbol{a}} \newcommand{\bsb}{\boldsymbol{b}} \newcommand{\bsc}{\boldsymbol{c}} \newcommand{\bsd}{\boldsymbol{d}} \newcommand{\bse}{\boldsymbol{e}} \newcommand{\bsoldf}{\boldsymbol{f}} \newcommand{\bsg}{\boldsymbol{g}} \newcommand{\bsh}{\boldsymbol{h}} \newcommand{\bsi}{\boldsymbol{i}} \newcommand{\bsj}{\boldsymbol{j}} \newcommand{\bsk}{\boldsymbol{k}} \newcommand{\bsell}{\boldsymbol{\ell}} \newcommand{\bsm}{\boldsymbol{m}} \newcommand{\bsn}{\boldsymbol{n}} \newcommand{\bso}{\boldsymbol{o}} \newcommand{\bsp}{\boldsymbol{p}} \newcommand{\bsq}{\boldsymbol{q}} \newcommand{\bsr}{\boldsymbol{r}} \newcommand{\bss}{\boldsymbol{s}} \newcommand{\bst}{\boldsymbol{t}} \newcommand{\bsu}{\boldsymbol{u}} \newcommand{\bsv}{\boldsymbol{v}} \newcommand{\bsw}{\boldsymbol{w}} \newcommand{\bsx}{\boldsymbol{x}} \newcommand{\bsy}{\boldsymbol{y}} \newcommand{\bsz}{\boldsymbol{z}} \newcommand{\bsA}{\boldsymbol{A}} \newcommand{\bsB}{\boldsymbol{B}} \newcommand{\bsC}{\boldsymbol{C}} \newcommand{\bsD}{\boldsymbol{D}} \newcommand{\bsE}{\boldsymbol{E}} \newcommand{\bsF}{\boldsymbol{F}} \newcommand{\bsG}{\boldsymbol{G}} \newcommand{\bsH}{\boldsymbol{H}} \newcommand{\bsI}{\boldsymbol{I}} \newcommand{\bsJ}{\boldsymbol{J}} \newcommand{\bsK}{\boldsymbol{K}} \newcommand{\bsL}{\boldsymbol{L}} \newcommand{\bsM}{\boldsymbol{M}} \newcommand{\bsN}{\boldsymbol{N}} \newcommand{\bsP}{\boldsymbol{P}} \newcommand{\bsQ}{\boldsymbol{Q}} \newcommand{\bsR}{\boldsymbol{R}} \newcommand{\bsS}{\boldsymbol{S}} \newcommand{\bsT}{\boldsymbol{T}} \newcommand{\bsU}{\boldsymbol{U}} \newcommand{\bsV}{\boldsymbol{V}} \newcommand{\bsW}{\boldsymbol{W}} \newcommand{\bsX}{\boldsymbol{X}} \newcommand{\bsY}{\boldsymbol{Y}} \newcommand{\bsZ}{\boldsymbol{Z}} \newcommand{\calA}{\mathcal{A}} \newcommand{\calB}{\mathcal{B}} \newcommand{\calC}{\mathcal{C}} \newcommand{\calD}{\mathcal{D}} \newcommand{\calE}{\mathcal{E}} \newcommand{\calF}{\mathcal{F}} \newcommand{\calG}{\mathcal{G}} \newcommand{\calH}{\mathcal{H}} \newcommand{\calI}{\mathcal{I}} \newcommand{\calJ}{\mathcal{J}} \newcommand{\calK}{\mathcal{K}} \newcommand{\calL}{\mathcal{L}} \newcommand{\calM}{\mathcal{M}} \newcommand{\calN}{\mathcal{N}} \newcommand{\calO}{\mathcal{O}} \newcommand{\calP}{\mathcal{P}} \newcommand{\calQ}{\mathcal{Q}} \newcommand{\calR}{\mathcal{R}} \newcommand{\calS}{\mathcal{S}} \newcommand{\calT}{\mathcal{T}} \newcommand{\calU}{\mathcal{U}} \newcommand{\calV}{\mathcal{V}} \newcommand{\calW}{\mathcal{W}} \newcommand{\calX}{\mathcal{X}} \newcommand{\calY}{\mathcal{Y}} \newcommand{\calZ}{\mathcal{Z}} \newcommand{\R}{\mathbb{R}} \newcommand{\C}{\mathbb{C}} \newcommand{\N}{\mathbb{N}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\F}{\mathbb{F}} \newcommand{\Q}{\mathbb{Q}} \DeclareMathOperator*{\argmax}{arg\,max} \DeclareMathOperator*{\argmin}{arg\,min} \newcommand{\nnz}[1]{\mbox{nnz}(#1)} \newcommand{\dotprod}[2]{\langle #1, #2 \rangle} \newcommand{\ignore}[1]{} \let\Pr\relax \DeclareMathOperator*{\Pr}{\mathbf{Pr}} \newcommand{\E}{\mathbb{E}} \DeclareMathOperator*{\Ex}{\mathbf{E}} \DeclareMathOperator*{\Var}{\mathbf{Var}} \DeclareMathOperator*{\Cov}{\mathbf{Cov}} \DeclareMathOperator*{\stddev}{\mathbf{stddev}} \DeclareMathOperator*{\avg}{avg} \DeclareMathOperator{\poly}{poly} \DeclareMathOperator{\polylog}{polylog} \DeclareMathOperator{\size}{size} \DeclareMathOperator{\sgn}{sgn} \DeclareMathOperator{\dist}{dist} \DeclareMathOperator{\vol}{vol} \DeclareMathOperator{\spn}{span} \DeclareMathOperator{\supp}{supp} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator{\Tr}{Tr} \DeclareMathOperator{\codim}{codim} \DeclareMathOperator{\diag}{diag} \newcommand{\PTIME}{\mathsf{P}} \newcommand{\LOGSPACE}{\mathsf{L}} \newcommand{\ZPP}{\mathsf{ZPP}} \newcommand{\RP}{\mathsf{RP}} \newcommand{\BPP}{\mathsf{BPP}} \newcommand{\P}{\mathsf{P}} \newcommand{\NP}{\mathsf{NP}} \newcommand{\TC}{\mathsf{TC}} \newcommand{\AC}{\mathsf{AC}} \newcommand{\SC}{\mathsf{SC}} \newcommand{\SZK}{\mathsf{SZK}} \newcommand{\AM}{\mathsf{AM}} \newcommand{\IP}{\mathsf{IP}} \newcommand{\PSPACE}{\mathsf{PSPACE}} \newcommand{\EXP}{\mathsf{EXP}} \newcommand{\MIP}{\mathsf{MIP}} \newcommand{\NEXP}{\mathsf{NEXP}} \newcommand{\BQP}{\mathsf{BQP}} \newcommand{\distP}{\mathsf{dist\textbf{P}}} \newcommand{\distNP}{\mathsf{dist\textbf{NP}}} \newcommand{\eps}{\epsilon} \newcommand{\lam}{\lambda} \newcommand{\dleta}{\delta} \newcommand{\simga}{\sigma} \newcommand{\vphi}{\varphi} \newcommand{\la}{\langle} \newcommand{\ra}{\rangle} \newcommand{\wt}[1]{\widetilde{#1}} \newcommand{\wh}[1]{\widehat{#1}} \newcommand{\ol}[1]{\overline{#1}} \newcommand{\ul}[1]{\underline{#1}} \newcommand{\ot}{\otimes} \newcommand{\zo}{\{0,1\}} \newcommand{\co}{:} %\newcommand{\co}{\colon} \newcommand{\bdry}{\partial} \newcommand{\grad}{\nabla} \newcommand{\transp}{^\intercal} \newcommand{\inv}{^{-1}} \newcommand{\symmdiff}{\triangle} \newcommand{\symdiff}{\symmdiff} \newcommand{\half}{\tfrac{1}{2}} \newcommand{\bbone}{\mathbbm 1} \newcommand{\Id}{\bbone} \newcommand{\SAT}{\mathsf{SAT}} \newcommand{\bcalG}{\boldsymbol{\calG}} \newcommand{\calbG}{\bcalG} \newcommand{\bcalX}{\boldsymbol{\calX}} \newcommand{\calbX}{\bcalX} \newcommand{\bcalY}{\boldsymbol{\calY}} \newcommand{\calbY}{\bcalY} \newcommand{\bcalZ}{\boldsymbol{\calZ}} \newcommand{\calbZ}{\bcalZ} $$

Universal Types

Universal types are very useful for performing generic programming, which allows you to use the same code over different types. For instance, the C++ STL (Standard Template Library) allows you to work on things like containers over any arbitrary type. You would surely not want to re-implement the logic for every concrete type that you use. Such a feature is known as parametric polymorphism. It is different from the other kind of polymorphism normally found in object-oriented languages that allows for overloading and run-time dispatch, which is known as ad-hoc polymorphism.

To formally introduce and discuss universal types, we work in the context of System F with universal (or polymorphic) types. System F is an extension of the simply-typed lambda calculus with universal quantification over types. The typing rules for System F are given below:

\begin{prooftree} \RightLabel{(var)} \AxiomC{$x:T \in \Gamma $} \UnaryInfC{$\Gamma \vdash x : T$} \end{prooftree}

\begin{prooftree} \RightLabel{(abs)} \AxiomC{$\Gamma, x : T_1 \vdash t_2: T_2 $} \UnaryInfC{$\Gamma \vdash \lambda x : T_1.t_2 : T_1 \rightarrow T_2$} \end{prooftree}

\begin{prooftree} \RightLabel{(app)} \AxiomC{$\Gamma \vdash t_1 : T_{11} \rightarrow T_{12}$} \AxiomC{$\Gamma \vdash t_2 : T_{11} $} \BinaryInfC{$\Gamma \vdash t_1\ t_2 : T_{12} $} \end{prooftree}

\begin{prooftree} \RightLabel{(tabs)} \AxiomC{$\Gamma, X \vdash t_2 : T_2$} \UnaryInfC{$\Gamma \vdash \lambda X.t_2 : \forall X.T_2$} \end{prooftree}

\begin{prooftree} \RightLabel{(tapp)} \AxiomC{$\Gamma \vdash t_1 : \forall X.T_{12} $} \UnaryInfC{$\Gamma \vdash t_1\ [T_2] : [X \rightarrow T_2] T_{12} $} \end{prooftree}

The first three rules are our old friends from the simply-typed lambda calculus, corresponding to variable typing, lambda abstraction, and lambda application. The rules \(\textsf{tabs}\) and \(\textsf{tapp}\) (type abstraction and type application) are new. We use capitalized variable names in order to denote that it is a type variable. So \(\textsf{tabs}\) says that if we add a type \(X\) in our context and we can derive a term \(t_2\) of type \(T_2\) from it, then we can produce an abstraction that takes in any type \(X\) and returns a term of type \(T_2\) when applied. The standard technique of applying alpha conversion applies in the case that there are conflicting variable names in nested closures.

As an example, consider the empty context \(\Gamma = \cdot\). If we add a type variable \(X\) to it, we can derive the polymorphic identity function \(\lambda X. \lambda x : X.x\) with type \(\forall X.X \rightarrow X\).

\(\textsf{tapp}\) says that given a lambda abstraction \(t_1\) over types, we can apply it on any type \(T_2\) to obtain a result that have instances of \(X\) substituted by \(T_2\). We put square brackets around \(T_2\) on the bottom left to make explicit that this is a type application, while the bracket notation on the right refers to substitution. Continuing off from our previous example, we can perform the application \(\lambda X. \lambda x [\textsf{Nat}]\) to specialize our polymorphic function to be \(\textsf{Nat} \rightarrow \textsf{Nat}\).

Since we allow quantification over types, System F is a second-order lambda-calculus. In fact, we can go further and quantify over types of types, which is known as kinds. You can then say why not go further and take the type over kinds, and if repeat this process indefinitely you will actually achieve the language known as System \(\text{F}_\omega\)!

Languages from the ML family provides parametric polymorphism. However, it seems like not all is well. Consider the following piece of innocent OCaml code that simply defines the identity function and then tries to call it with two values of different types:

1
2
let id : 'a -> 'a = fun x -> x in
(id 0, id "hello")

Despite how we seemingly typed it to use any generic type variable, the compiler complains to us:

1
2
3
4
5
File "./ident.ml", line 2, characters 10-17:
2 | (id 0, id "hello")
^^^^^^^
Error: This expression has type string but an expression was expected of type
int

This is because OCaml uses the Hindley-Milner type inference algorithm in order to perform type reconstruction. From a high level point of view, it produces constraints on the types based on how they are used, and then tries to unify these constraints in the most general way possible. As such, it discovers both constraints \('a = \textsf{int}\) and also \('a = \textsf{string}\), which cannot be unified (on the other hand, something like \(X = Y, Y = \textsf{int}\) can be unified). If we want the example to work, we actually need to provide more information to the compiler and tell it that we actually want \('a\) to be universally quantified by using polymorphic type annotations:

1
2
let id : 'a. 'a -> 'a = fun x -> x in
(id 0, id "hello")

Notice how the typing rule 'a. 'a -> 'a closely resembles our type \(\forall X.X \rightarrow X\) in System F.

Another way to achieve the same thing is by using explicit type parameters, which now really looks like our term in System F, \(\lambda A. \lambda x.x\):

1
2
let id = fun (type a) (x : a) -> x in
(id 0, id "hello")

The first approach (polymorphic type annotations) is more idiomatic as explicit type parameters could cause issues when the function is recursive (see the following post if you want more information).

However, was this failure just a limitation of our type inference algorithm, and might there be a smarter way such that type inference in a language with first-class parametric polymorphism is possible? Interestingly enough, this was a very difficult question to answer, and was an open problem for around 20 years before Joe Wells proved that it is actually undecidable. Therefore, as a compromise, OCaml uses a weaker form of polymorphism known as let-polymorphism. Let-polymorphism essentially substitutes the binding into the resulting expression itself before typechecking, which allows each instance to be constrained independently of the others.