Here’s the syntax for the simply-typed lambda calculus (STLC):
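The original figure isn’t reproduced here, but the standard STLC grammar (terms are variables, applications, and annotated abstractions; types are primitives and function types) looks like this:

```latex
\begin{aligned}
e    &::= x \mid e\; e \mid \lambda x{:}\tau.\, e \\
\tau &::= \iota \mid \tau \to \tau
\end{aligned}
```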
Here are the typing rules, written in the usual natural deduction style:
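Again the figure isn’t reproduced here, but the usual presentation of the three rules is:

```latex
\frac{\Gamma(x) = \tau}{\Gamma \vdash x : \tau}\;\textsc{Var}
\qquad
\frac{\Gamma \vdash f : \tau \to \tau' \quad \Gamma \vdash p : \tau}
     {\Gamma \vdash f\; p : \tau'}\;\textsc{App}
\qquad
\frac{\Gamma, x{:}\tau \vdash e : \tau'}
     {\Gamma \vdash \lambda x{:}\tau.\, e : \tau \to \tau'}\;\textsc{Lam}
```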
It turns out that these rules are enough to provide not only typechecking but also type inference, if you feed them into Prolog – though that’s far from a coincidence. Type inference for STLC is possible, and Prolog was the result of decades of research into finding proofs (in this case a typing – a proof that a given term has a given type) for problems stated in this style. It’s even less of a coincidence because STLC admits principal typings – all proofs that a term has a given type are equivalent.
Rewriting the Rules
Suppose we deliberately ignore the existence of Prolog (but not of unification, as it’ll turn out) – how can we write the rules to make it more obvious what to do?
Well, the good news is that they’re already syntax-directed. Given a particular term, we can always see which rule to try applying based on its outermost structure. That makes it fairly easy to typecheck by working through a term, though there’s a hidden condition buried in there. Let’s tease that out and see if it helps:
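The rewritten rule isn’t reproduced here, but based on the description below it looks something like this reconstruction, with the hidden equality condition (red in the original) made explicit and the type metavariables (blue) each appearing only once above the line aside from that condition:

```latex
\frac{\Gamma \vdash f : {\color{blue}\tau_f}
      \quad \Gamma \vdash p : {\color{blue}\tau_p}
      \quad {\color{red}\tau_f = \tau_p \to \tau_r}}
     {\Gamma \vdash f\; p : {\color{blue}\tau_r}}\;\textsc{App}
```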
The main difference is in the App rule. The red equality condition wasn’t there before, and aside from it the blue variables only appear above the line once. When we dispatch the conditions in black, we don’t have to worry about them interacting directly; we only have to worry about the red condition. If you’re only checking in a known context, then the condition can be shown by a syntactic equality check. The principal typings property should be pretty clear at this point – there’s barely a decision to be made that shows up in a completed proof tree.
Checking for a Checker
Time to code those rules up, then. Starting with datatypes for the syntax:
```haskell
data Term ty = Var TermIdentifier
             | App (Term ty) (Term ty)
             | Lam TermIdentifier ty (Term ty)

data ObjType fix = Prim TypeIdentifier
                 | Fun fix fix
  deriving Show
```
`Term` is parametric in the type of type annotations, to make it easier to handle type inference. The types use a two-level fixpoint encoding as described here, to help reuse the code a little lower down. To build an entire type we use a type-level fixpoint operator, and construct like so:
```haskell
Fun (Fix $ Prim "Integer")
    (Fix $ Fun (Fix $ Prim "Integer") (Fix $ Prim "Integer"))
```
Or equivalently:
```haskell
(Fix $ Prim "Integer") `Fun`
  (Fix $ (Fix $ Prim "Integer") `Fun` (Fix $ Prim "Integer"))
```
This represents `Integer -> (Integer -> Integer)`, and has the type `ObjType (Fix ObjType)`.
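For reference, the `Fix` operator used in those constructions is the usual type-level fixpoint; a minimal sketch (with `TypeIdentifier` assumed to be `String`, which the post doesn’t pin down):

```haskell
-- The standard type-level fixpoint operator; assumed equivalent to the
-- one from the linked two-level types article.
newtype Fix f = Fix (f (Fix f))

-- TypeIdentifier is assumed to be String here.
data ObjType fix = Prim String
                 | Fun fix fix

-- Integer -> (Integer -> Integer), with the Fix wrappers written out
intToIntToInt :: ObjType (Fix ObjType)
intToIntToInt =
  Fun (Fix (Prim "Integer"))
      (Fix (Fun (Fix (Prim "Integer")) (Fix (Prim "Integer"))))
```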
Given that, here’s a typechecker:
```haskell
type OType = ObjType (Fix ObjType)
type OEnvironment = Map TermIdentifier OType

check :: OEnvironment -> Term OType -> OType
check env (Var i) = case lookup i env of
  Nothing -> error $ "Unbound variable " ++ i
  Just v  -> v
check env (App f p) =
  let t_f = check env f
      t_p = check env p
  in case t_f of
       Fun (Fix t_p') (Fix r)
         | t_p == t_p' -> r
         | otherwise   -> error "Parameter mismatch"
       _ -> error "Applied a non-function"
check env (Lam i ty t) =
  let r = check (insert i ty env) t
  in Fun (Fix ty) (Fix r)
```
Aside from the `Fix` constructors, this is entirely standard[^1] – so at least my rewrite of the typechecking rules works!
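To see the checker in action, here’s a self-contained version run on the identity function at `Integer` (with `TermIdentifier` and `TypeIdentifier` assumed to be `String`, and `Eq`/`Show` for `Fix` derived standalone, since the post leaves those details to the tarball):

```haskell
{-# LANGUAGE FlexibleContexts, StandaloneDeriving, UndecidableInstances #-}
import Data.Map (Map, empty, insert, lookup)
import Prelude hiding (lookup)

type TermIdentifier = String  -- assumed; not pinned down in the post
type TypeIdentifier = String

newtype Fix f = Fix (f (Fix f))
deriving instance Eq   (f (Fix f)) => Eq   (Fix f)
deriving instance Show (f (Fix f)) => Show (Fix f)

data Term ty = Var TermIdentifier
             | App (Term ty) (Term ty)
             | Lam TermIdentifier ty (Term ty)

data ObjType fix = Prim TypeIdentifier
                 | Fun fix fix
  deriving (Show, Eq)

type OType = ObjType (Fix ObjType)
type OEnvironment = Map TermIdentifier OType

check :: OEnvironment -> Term OType -> OType
check env (Var i) = case lookup i env of
  Nothing -> error $ "Unbound variable " ++ i
  Just v  -> v
check env (App f p) =
  let t_f = check env f
      t_p = check env p
  in case t_f of
       Fun (Fix t_p') (Fix r)
         | t_p == t_p' -> r
         | otherwise   -> error "Parameter mismatch"
       _ -> error "Applied a non-function"
check env (Lam i ty t) =
  let r = check (insert i ty env) t
  in Fun (Fix ty) (Fix r)

-- \x : Integer. x  checks to  Integer -> Integer
identityTy :: OType
identityTy = check empty (Lam "x" (Prim "Integer") (Var "x"))
-- identityTy == Fun (Fix (Prim "Integer")) (Fix (Prim "Integer"))
```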
Inference
As for inference, it turns out this more or less solves that too – all we have to do is treat the red condition as a constraint, and the syntax-directed rules as a constraint functional program. This can be encoded into the language from the earlier article, but it’s neater not to. If the term checks, then there’s a best solution that’s unique up to metavariable naming. This lets us hang on to principal typings, using the typing rules and metavariable assignments[^2]. We also gain the means to deal with unknown contexts, by inferring the necessary bindings.
Here’s a type inferrer for known contexts:
```haskell
type MType = MetaType ObjType
type MEnvironment = Map TermIdentifier MType

infer :: MEnvironment -> Term MType -> ConstraintMonad MType
infer env (Var i) = case lookup i env of
  Nothing -> throwError $ "Unbound variable " ++ i
  Just v  -> return v
infer env (App f p) = do
  t_f <- infer env f
  t_p <- infer env p
  t_r <- newVar
  t_f `eqConstraint` O (t_p `Fun` t_r)
  return t_r
infer env (Lam i ty b) = do
  t_b <- infer (insert i ty env) b
  return $ O $ ty `Fun` t_b
```
Here we’ve got error handling pushed into the `ConstraintMonad`, but there are no major wins from it in a toy implementation. Constraint handling makes the inferrer itself as easy to write as the typechecker, though. The `O` constructor, taken from the `Unification` module linked to below, is used to lift object-level terms to the constraint system’s meta level. The inferrer returns meta-level terms – hence the `O`s in the bottom line and in the call to `eqConstraint`.
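The real `ConstraintMonad` lives in the `Unification` module in the tarball; purely as an illustration of what the inferrer above relies on, here is a minimal hypothetical stand-in – a `State`-over-`Either` stack with naive substitution-map unification (no occurs check) providing `newVar`, `eqConstraint`, and `O`:

```haskell
{-# LANGUAGE FlexibleContexts, StandaloneDeriving, UndecidableInstances #-}
import Control.Monad.Except
import Control.Monad.State
import qualified Data.Map as Map

type TypeIdentifier = String  -- assumed, as in the post's datatypes
data ObjType fix = Prim TypeIdentifier | Fun fix fix
  deriving (Show, Eq)

-- Meta-level types: a metavariable, or a lifted object-level layer
data MetaType f = MetaVar Int | O (f (MetaType f))
deriving instance Show (f (MetaType f)) => Show (MetaType f)
deriving instance Eq   (f (MetaType f)) => Eq   (MetaType f)

type MType = MetaType ObjType

-- Fresh-variable counter plus the substitution built so far
type ConstraintMonad = StateT (Int, Map.Map Int MType) (Either String)

newVar :: ConstraintMonad MType
newVar = do (n, s) <- get; put (n + 1, s); return (MetaVar n)

-- Chase a metavariable through the substitution to its representative
resolve :: MType -> ConstraintMonad MType
resolve (MetaVar v) = do
  (_, s) <- get
  maybe (return (MetaVar v)) resolve (Map.lookup v s)
resolve t = return t

-- Record an equality constraint by unifying the two sides
-- (occurs check omitted for brevity)
eqConstraint :: MType -> MType -> ConstraintMonad ()
eqConstraint a b = do
  a' <- resolve a
  b' <- resolve b
  case (a', b') of
    (MetaVar v, MetaVar w) | v == w -> return ()
    (MetaVar v, t) -> bind v t
    (t, MetaVar v) -> bind v t
    (O (Prim m), O (Prim n))
      | m == n    -> return ()
      | otherwise -> throwError ("Cannot unify " ++ m ++ " with " ++ n)
    (O (Fun p1 r1), O (Fun p2 r2)) ->
      eqConstraint p1 p2 >> eqConstraint r1 r2
    _ -> throwError "Cannot unify a function with a primitive"
  where bind v t = modify (\(n, s) -> (n, Map.insert v t s))

-- e.g. constrain a fresh variable to equal Integer -> Bool
example :: Either String MType
example = flip evalStateT (0, Map.empty) $ do
  v <- newVar
  v `eqConstraint` O (O (Prim "Integer") `Fun` O (Prim "Bool"))
  resolve v
```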
Reading for Context
It deviates a little from a straightforward encoding of the typing rules, but it’s also common to use a `ReaderT` to handle the context, like so:
```haskell
inferReader :: MEnvironment -> Term MType -> ConstraintMonad MType
inferReader env t = runReaderT (infer' t) env
  where
    infer' (Var i) = do
      env <- ask
      case lookup i env of
        Nothing -> throwError $ "Unbound variable " ++ i
        Just v  -> return v
    infer' (App f p) = do
      t_f <- infer' f
      t_p <- infer' p
      t_r <- lift newVar
      lift $ t_f `eqConstraint` (O (t_p `Fun` t_r))
      return t_r
    infer' (Lam i ty b) = do
      t_b <- local (insert i ty) (infer' b)
      return $ O $ ty `Fun` t_b
```
At this scale the `Reader` doesn’t look like that big a saving at first glance. Most of the context/environment passing is gone, but having to retrieve it to do lookups feels just as bad, because it adds a line (it’s much harder to mess up, though!). It does lead on to an idea I’m going to save for the next post: if we’re handling the context in the monad, can we treat it as part of the constraint system, and if so, can we figure out anything useful from it?
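For what it’s worth, that extra ask-then-case line can be folded away with `asks`; a sketch, with `lookupVar` a hypothetical helper and `Either String` standing in for the post’s `ConstraintMonad`:

```haskell
import Control.Monad.Reader
import qualified Data.Map as Map

-- Hypothetical helper: look a variable up in the ambient environment,
-- failing in the underlying monad if it's unbound.
lookupVar :: String -> ReaderT (Map.Map String t) (Either String) t
lookupVar i =
  maybe (lift (Left ("Unbound variable " ++ i))) return
    =<< asks (Map.lookup i)

demo :: Either String Int
demo = runReaderT (lookupVar "x") (Map.fromList [("x", 42)])
-- demo == Right 42
```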
Trekking Through the Tarball
Here’s a quick rundown of what’s in the source that accompanies this post:
- Unification.hs: A slightly simplified, slower version of the unification framework from Generic Unification via Two-Level Types and Parameterized Modules, including some minimal support for error reporting.
- STLCLang.hs: Datatypes for STLC’s syntax, a helper class (`IsType`) for building polymorphic type values that work at object or meta level, and a pretty printer for type values.
- STLCChecker.hs: The typecheckers from above, plus the constraint monad the inferrers run in. Use `runSubstPretty` to invoke the checkers and pretty-print the result.
- STLCTerms.hs: A toy environment and some test terms to check.
Links
- Accompanying source
- What Are Principal Typings and What Are They Good For?
- The Essence of Principal Typings

[^1]: The `lookup` call is legit due to the structural rules, which are left implicit. ↩
[^2]: Traditionally, an unconstrained metavariable becomes a type variable, which has much the same role as a primitive type in STLC. ↩
Comments

For a simplified and faster version of that work, see my unification-fd package. No need to roll your own 🙂
Thanks, I should’ve linked to it. I’ve seen it before and chose to roll my own anyway, but I’m heading into different (more constraint-based) territory in the long run, so at least I’m going to invent a different kind of wheel!
I’ve also been doing some constraint-based typing. You are probably already aware of Bastiaan Heeren’s work? http://www.staff.science.uu.nl/~heere112/phdthesis/index.html
I was hanging on to that link (and related work at Utrecht) until I’d used more than simple equality constraints, but it doesn’t hurt for someone else to mention it! I haven’t implemented any of it myself yet, but that’s partly down to my low rate of programming in general. I’ll be building some of the framework to help me head off partly in that direction this year, though. Have you tried it? How did you get on?
Oops! Sorry if I spilled the beans too early.
I’ve implemented constraint-based typing for Hindley-Milner. It appears to be necessary for a paper I’m working on. It’s quite straightforward once you understand how the monomorphic constraints work.
Isn’t it just? Some HM variants are on my to-do list here; I find the distinction between concrete type variables and metavariables to be much neater than alternative descriptions, even if newbies occasionally get scared by the proliferation of syntax. On the whole, formality’s very much scaffolding rather than restraint for me.
I’ll look forward to seeing what you’re doing with it, anyway!
Can anyone suggest some introductory books for type theory (and its notation), please? I’m already familiar with functional programming languages (Haskell) and even a dependently-typed one (Idris), but I want to learn more about the maths behind it…
The one I learned from initially was Pierce’s Types and Programming Languages (and later on the sequel, Advanced Topics in Types and Programming Languages). TaPL doesn’t discuss dependently-typed languages much (ATTaPL has a chapter on them), but it treats the other half of the lambda cube thoroughly, and also covers issues like subtyping, recursion and Hindley-Milner-style let polymorphism well.
I’m sure others have their own recommendations, especially for dealing with Type Theory proper as opposed to type systems.