Here’s the syntax for the simply-typed lambda calculus (STLC):
Here are the typing rules, written in the usual natural deduction style:
It turns out that these rules are not only enough to provide typechecking but also type inference if you feed them into Prolog, though that’s far from a coincidence. Type inference for STLC is possible, and Prolog was the result of decades of research into finding proofs (in this case a typing – a proof that a given term has a given type) for problems stated in this style. It’s even less of a coincidence because STLC admits principal typings – so all proofs that a term has a given type are equivalent.
Rewriting the Rules
Suppose we deliberately ignore the existance of Prolog (but not of unification, as it’ll turn out) – how can we write the rules to make it more obvious what to do?
Well, the good news is that they’re already syntax-directed. Given a particular term, we can always see which rule to try applying based on its outermost structure. That makes it fairly easy to do typechecking working through a term, though there’s a hidden condition buried in there. Let’s tease that out and see if it helps:
The main difference is in the
App rule. The red equality condition wasn’t there before, and aside from it the blue variables only appear above the line once. When we dispatch the conditions in black, we don’t have to worry about them interacting directly, we only have to worry about the red condition. If you’re only checking in a known context then the condition can be shown by a syntactic equality check. The principal typings property should be pretty clear at this point – there’s barely a decision to be made that shows up in a completed proof tree.
Checking for a Checker
Time to code those rules up, then. Starting with datatypes for the syntax:
data Term ty = Var TermIdentifier | App (Term ty) (Term ty) | Lam TermIdentifier ty (Term ty) data ObjType fix = Prim TypeIdentifier | Fun fix fix deriving Show
Term is parametric in the type of type annotations to make it easier to handle type inference. The types use a two-level fixpoint encoding as described here to help reusing the code a little lower down. To build an entire type we use a type-level fixpoint operator, and construct like so:
Fun (Fix $ Prim "Integer") (Fix $ Fun (Fix $ Prim "Integer") (Fix $ Prim "Integer") )
(Fix $ Prim "Integer") `Fun` (Fix $ (Fix $ Prim "Integer") `Fun` (Fix $ Prim "Integer"))
Integer -> (Integer -> Integer), and has the type
ObjType (Fix ObjType).
Given that, here’s a typechecker:
type OType = ObjType (Fix ObjType) type OEnvironment = Map TermIdentifier OType check :: OEnvironment -> Term OType -> OType check env (Var i) = case lookup i env of Nothing -> error $ "Unbound variable " ++ i Just v -> v check env (App f p) = let t_f = check env f t_p = check env p in case t_f of Fun (Fix t_p') (Fix r) | t_p == t_p' -> r | otherwise -> error "Parameter mismatch" _ -> error "Applied a non-function" check env (Lam i ty t) = let r = check (insert i ty env) t in Fun (Fix ty) (Fix r)
Aside from the
Fix constructors, this is entirely standard 1 – so at least my rewrite of the typechecking rules works!
As for inference, it turns out this more or less solves it too – all we have to do is treat the red condition as a constraint and the syntax-directed rules as a constraint functional program. This can be encoded into the language from the earlier article, but it’s neater not to. If the term checks then there’s a best solution that’s unique up to metavariable naming. This lets us hang on to principal typings, using the typing rules and metavariable assignments 2. We also gain the means to deal with unknown contexts, by inferring the necessary bindings.
Here’s a type inferrer for known contexts:
type MType = MetaType ObjType type MEnvironment = Map TermIdentifier MType infer :: MEnvironment -> Term (MType) -> ConstraintMonad MType infer env (Var i) = case lookup i env of Nothing -> throwError $ "Unbound variable " ++ i Just v -> return v infer env (App f p) = do t_f <- infer env f t_p <- infer env p t_r <- newVar t_f `eqConstraint` O (t_p `Fun` t_r) return t_r infer env (Lam i ty b) = do t_b <- infer (insert i ty env) b return $ O $ ty `Fun` t_b
Here we’ve got error handling pushed into the
ConstraintMonad, but no major wins for it in a toy implementation. Constraint handling makes the inferrer itself as easy to write as the typechecker though. The
O constructor, taken from the
Unification module linked to below, is used to lift object-level terms to the constraint system’s meta level. The inferrer returns meta-level terms – hence
Os in the bottom line and in the call to
Reading for Context
It deviates a little from a straightforward encoding of the typing rules, but it’s also common to use a
ReaderT to handle the context like so:
inferReader :: MEnvironment -> Term MType -> ConstraintMonad MType inferReader env t = runReaderT (infer' t) env where infer' (Var i) = do env <- ask case lookup i env of Nothing -> throwError $ "Unbound variable " ++ i Just v -> return v infer' (App f p) = do t_f <- infer' f t_p <- infer' p t_r <- lift newVar lift $ t_f `eqConstraint` (O (t_p `Fun` t_r)) return t_r infer' (Lam i ty b) = do t_b <- local (insert i ty) (infer' b) return $ O $ ty `Fun` t_b
At this scale the
Reader doesn’t look like that big a saving at first glance. Most of the context/environment passing is gone, but having to retrieve it to do lookups feels just as bad because it adds a line (much harder to mess up though!). It does lead onto an idea I’m going to save for the next post though – if we’re handling the context in the monad, can we treat it as part of the constraint system and if so can we figure out anything useful from it?
Trekking Through the Tarball
Here’s a quick rundown of what’s in the source that accompanies this post:
- Unification.hs: A slightly simplified, slower version of the unification framework from Generic Unification via Two-Level Types and Parameterized Modules, including some minimal support for error reporting.
- STLCLang.hs: Datatypes for STLC’s syntax, a helper class (
IsType) for building polymorphic type values that work at object or meta level, and a pretty printer for type values.
- STLCChecker.hs: The typecheckers from above, plus the constraint monad the inferrers run in. Use
runSubstPrettyto invoke the checkers and pretty print the result.
- STLCTerms.hs: A toy environment and some test terms to check.
- Accompanying source
- What Are Principal Typings and What Are They Good For?
- The Essence of Principal Typings