flippac A wandering hacker, lost in a maze of abstractions all alike 2019-03-20T20:02:30Z http://flippac.wordpress.com/feed/atom/ WordPress.com Philippa Cowderoy https://flippac.wordpress.com <![CDATA[The Simply-typed Lambda Calculus with Constraints]]> http://flippac.wordpress.com/?p=109 2012-02-26T20:11:26Z 2012-02-23T12:27:47Z Continue reading ]]> Here’s the syntax for the simply-typed lambda calculus (STLC): Here are the typing rules, written in the usual natural deduction style: It turns out that these rules are not only enough to provide typechecking but also type inference if you feed them into Prolog, though that’s far from a coincidence. Type inference for STLC is possible, and Prolog was the result of decades of research into finding proofs (in this case a typing – a proof that a given term has a given type) for problems stated in this style. It’s even less of a coincidence because STLC admits principal typings – so all proofs that a term has a given type are equivalent.

## Rewriting the Rules

Suppose we deliberately ignore the existance of Prolog (but not of unification, as it’ll turn out) – how can we write the rules to make it more obvious what to do?

Well, the good news is that they’re already syntax-directed. Given a particular term, we can always see which rule to try applying based on its outermost structure. That makes it fairly easy to do typechecking working through a term, though there’s a hidden condition buried in there. Let’s tease that out and see if it helps: The main difference is in the `App` rule. The red equality condition wasn’t there before, and aside from it the blue variables only appear above the line once. When we dispatch the conditions in black, we don’t have to worry about them interacting directly, we only have to worry about the red condition. If you’re only checking in a known context then the condition can be shown by a syntactic equality check. The principal typings property should be pretty clear at this point – there’s barely a decision to be made that shows up in a completed proof tree.

## Checking for a Checker

Time to code those rules up, then. Starting with datatypes for the syntax:

```data Term ty = Var TermIdentifier |
App (Term ty) (Term ty) |
Lam TermIdentifier ty (Term ty)

data ObjType fix = Prim TypeIdentifier | Fun fix fix
deriving Show```

`Term` is parametric in the type of type annotations to make it easier to handle type inference. The types use a two-level fixpoint encoding as described here to help reusing the code a little lower down. To build an entire type we use a type-level fixpoint operator, and construct like so:

```Fun (Fix \$ Prim "Integer")
(Fix \$ Fun (Fix \$ Prim "Integer")
(Fix \$ Prim "Integer")
)```

Or equivalently:

```(Fix \$ Prim "Integer")
`Fun`
(Fix \$ (Fix \$ Prim "Integer") `Fun` (Fix \$ Prim "Integer"))```

This represents `Integer -> (Integer -> Integer)`, and has the type `ObjType (Fix ObjType)`.

Given that, here’s a typechecker:

```type OType = ObjType (Fix ObjType)
type OEnvironment = Map TermIdentifier OType

check :: OEnvironment -> Term OType -> OType
check env (Var i) = case lookup i env of
Nothing -> error \$ "Unbound variable " ++ i
Just v -> v
check env (App f p) = let t_f = check env f
t_p = check env p
in case t_f of
Fun (Fix t_p') (Fix r)
| t_p == t_p' -> r
| otherwise -> error "Parameter mismatch"
_ -> error "Applied a non-function"
check env (Lam i ty t) = let r = check (insert i ty env) t
in Fun (Fix ty) (Fix r)```

Aside from the `Fix` constructors, this is entirely standard 1 – so at least my rewrite of the typechecking rules works!

## Inference

As for inference, it turns out this more or less solves it too – all we have to do is treat the red condition as a constraint and the syntax-directed rules as a constraint functional program. This can be encoded into the language from the earlier article, but it’s neater not to. If the term checks then there’s a best solution that’s unique up to metavariable naming. This lets us hang on to principal typings, using the typing rules and metavariable assignments 2. We also gain the means to deal with unknown contexts, by inferring the necessary bindings.

Here’s a type inferrer for known contexts:

```type MType = MetaType ObjType
type MEnvironment = Map TermIdentifier MType

infer :: MEnvironment -> Term (MType) -> ConstraintMonad MType
infer env (Var i) = case lookup i env of
Nothing -> throwError \$ "Unbound variable " ++ i
Just v -> return v
infer env (App f p) = do t_f <- infer env f
t_p <- infer env p
t_r <- newVar
t_f `eqConstraint` O (t_p `Fun` t_r)
return t_r
infer env (Lam i ty b) = do t_b <- infer (insert i ty env) b
return \$ O \$ ty `Fun` t_b```

Here we’ve got error handling pushed into the `ConstraintMonad`, but no major wins for it in a toy implementation. Constraint handling makes the inferrer itself as easy to write as the typechecker though. The `O` constructor, taken from the `Unification` module linked to below, is used to lift object-level terms to the constraint system’s meta level. The inferrer returns meta-level terms – hence `O`s in the bottom line and in the call to `eqConstraint`.

## Reading for Context

It deviates a little from a straightforward encoding of the typing rules, but it’s also common to use a `ReaderT` to handle the context like so:

```inferReader :: MEnvironment -> Term MType -> ConstraintMonad MType
inferReader env t = runReaderT (infer' t) env
where infer' (Var i) = do env <- ask
case lookup i env of
Nothing -> throwError \$ "Unbound variable " ++ i
Just v -> return v
infer' (App f p) = do t_f <- infer' f
t_p <- infer' p
t_r <- lift newVar
lift \$ t_f `eqConstraint` (O (t_p `Fun` t_r))
return t_r
infer' (Lam i ty b) = do t_b <- local (insert i ty) (infer' b)
return \$ O \$ ty `Fun` t_b```

At this scale the `Reader` doesn’t look like that big a saving at first glance. Most of the context/environment passing is gone, but having to retrieve it to do lookups feels just as bad because it adds a line (much harder to mess up though!). It does lead onto an idea I’m going to save for the next post though – if we’re handling the context in the monad, can we treat it as part of the constraint system and if so can we figure out anything useful from it?

## Trekking Through the Tarball

Here’s a quick rundown of what’s in the source that accompanies this post:

• Unification.hs: A slightly simplified, slower version of the unification framework from Generic Unification via Two-Level Types and Parameterized Modules, including some minimal support for error reporting.
• STLCLang.hs: Datatypes for STLC’s syntax, a helper class (`IsType`) for building polymorphic type values that work at object or meta level, and a pretty printer for type values.
• STLCChecker.hs: The typecheckers from above, plus the constraint monad the inferrers run in. Use `runSubstPretty` to invoke the checkers and pretty print the result.
• STLCTerms.hs: A toy environment and some test terms to check.

1. The `lookup` call is legit due to the structural rules which are left implicit 2. Traditionally, an unconstrained metavariable becomes a type variable, which has much the same role as a primitive type in STLC ]]>
8
Philippa Cowderoy https://flippac.wordpress.com <![CDATA[Object and Meta, Typefully]]> http://flippac.wordpress.com/?p=88 2011-11-09T20:28:29Z 2011-11-08T02:02:33Z Continue reading ]]> In my previous post on constraint programming, I took the following term type as an object language and started doing meta-level work with it:

`data Term = Term !Label ![Term]`

I handled the meta level in a manner that isn’t particularly typeful:

`data Term = Term !Label ![Term] | Metavar Integer`

That’s okay for a quick and dirty solver, but there are better options if you want a bit more safety!

quick note: I leave the !s out in all these representations because the one in `![Term]` doesn’t actually do that much good (it only achieves head-strictness) so I figured it’s easier to stick to the fast and loose approach

## Just use two types

```data ObjectTerm = ObjectTerm Label [ObjectTerm]
data MetaTerm = MetaTerm Label [MetaTerm] | Metavar Integer

liftToMeta :: ObjectTerm -> MetaTerm
liftToMeta (ObjectTerm l cs) = MetaTerm l (map liftToMeta cs)

concretise :: MetaTerm -> Maybe ObjectTerm
concretise (MetaTerm l cs) = return . ObjectTerm l =<< mapM concretise cs
concretise Metavar{} = Nothing```

This one’s obvious enough. Given only Haskell 2010 type classes to abstract across the equivalents of the `Term` constructor this can be rather painful to work with because your polymorphic constructor is no good for pattern matching. With view patterns or the upcoming pattern synonyms things become more comfortable though.

```data TermView children = Term Label [children] | NotTerm

class Term t where
term :: Label -> [t] -> t
matchTerm :: t -> TermView t

instance Term ObjectTerm where
term = ObjectTerm
matchTerm (ObjectTerm l cs) = Term l cs

instance Term MetaTerm where
term = MetaTerm
matchTerm (MetaTerm l cs) = Term l cs
matchTerm _ = NotTerm

add :: (Term l, Term r) => l -> r -> r
add (matchTerm -> Term "Z" []) r = r
add (matchTerm -> Term "S" [n]) r = term "S" \$ [add n r]```

This example only uses view patterns, but pattern synonyms would allow patterns that’re indistinguishable from ordinary ones. Note the type of `add` though (equivalent to the inferred one) – it can lead to unexpected annotation requirements.

One of the great things about GADTs is that you can use phantom types to mark out what might otherwise be subtypes of the GADT – and pattern-matching to perform the equivalent of a safe cast. The bad news is that they don’t play quite so nicely with the rose tree representation: you can’t prove anything without a separate leaf constructor, which makes a bit of a mess of the original code.

```data Concrete
data Meta
data Term a where
Leaf :: Label -> Term a
Branch :: Label -> [Term a] -> Term a
Metavar :: Integer -> Term Meta

concretise :: Term a -> Maybe (Term Concrete)
concretise (Leaf l) = return \$ Leaf l
concretise (Branch l cs) = return . Branch l =<< mapM concretise cs
concretise Metavar{} = Nothing```

The `concretise` function attempts to prove that a `Term` can be `Concrete` via pattern-matching – `Metavar` is always `Term Meta`, which effectively taints the entire term it appears in. The `concretise` function still ends up rebuilding all the `Branch` nodes though.

An alternative type for concretise is `Term a -> Maybe (Term b)`, and if we’re comfortable with rank-n types we can use `(forall a. Term a)` as an equivalent to `Term Concrete` that’s readily usable as `Term Meta` as well. This means we already get `liftToMeta` for free if we keep types polymorphic until we care, but failing that it’s just a tree traversal rebuilding the same term at a different type.

### Quick and Dirty Optimisation

The version of concretise above rebuilds an entire term just to change its type, but GHC already has the same underlying representations for them. I don’t know a good way of carrying coercions to Concrete or a type variable around, but we do know something the compiler doesn’t – and this means we can cheat.

```import Unsafe.Coerce

concretise :: Term a -> Maybe (Term Concrete)
concretise l@Leaf{} = return \$ unsafeCoerce l
concretise b@(Branch l cs) = return \$ unsafeCoerce b =<< mapM_ concretise cs
concretise Metavar{} = Nothing```

## Two level types

There’s an alternative approach using two mutually recursive datatypes, which when generalised are parametric in each other. One type handles the per-node structure of a term, while the other handles the recursive structure – and via polymorphism, any additional structure that doesn’t belong to object terms.

```data Term wrapper = Term Label [wrapper]

newtype Fix a = Fix { unfix :: (a (Fix a)) }
data Meta object = O (object (Meta object)) | Metavar Integer

liftToMeta :: Fix Term -> Meta Term
liftToMeta (Fix (Term l cs)) = O \$ Term l (map liftToMeta cs)

concretise :: Meta Term -> Maybe (Fix Term)
concretise (O (Term l cs)) = return . Fix . Term l =<< mapM concretise cs
concretise Metavar{} = Nothing```

`Fix` is a type-level fixpoint operator – so it provides recursive structure in types polymorphic in their own children, and nothing else. The `O` constructor in `Meta` has much the same structure, but `Meta` also handles metavariables. As `Fix` and `Meta` have completely different internal representations, `liftToMeta` and `concretise` both require rebuilding the entire term again.

This approach has the major advantage that it starts to enable a degree of generic programming – we can reuse `Fix` and `Meta` for a variety of different object node types. The classic example is Tim Sheard‘s functional pearl on generic unification, which is implemented in the unification-fd package.

It may also be appropriate to use type classes etc much as in the initial example – possibly using one class generalising `Term` and a second one generalising `Fix` and `Meta`, such that an application of the `Fix`/`Meta` generalisation is also a generalisation of `Term`. I haven’t tried it so far though, and it’s possible that the increased annotation burden isn’t worth it.

## Chaining Types and the Expression Problem

Having gone as far as two level types, we can try a refactoring:

```data ObjectTermNode fix = Term Label [fix]
data MetaTermNode chain fix = O (chain fix) | Metavar Integer

newtype Fix a = Fix { unfix :: (a (Fix a)) }

type ObjectTerm = Fix ObjectTermNode
type MetaTerm = Fix (MetaTermNode ObjectTermNode)```

Now the node info for metaterms is also separated from the recursive structure. This generalises to as many different node types as we like, though the collection of constructors used to pass up the chain will grow – clearly this cries out for a type class to match each node type. I’m hoping that with ConstraintKinds someone’ll figure out a good way to write instances that generate the chaining behaviour with only a linear instance count too, but as I’m sticking with the Haskell Platform at present I’m not able to have a go myself yet.

This solution has the advantange of generalising to the Expression Problem: we can add new node types (with corresponding type classes) and new functions reasonably orthogonally. Similarly, other solutions to the Expression Problem are solutions to the problem discussed in this post.

]]>
4
Philippa Cowderoy https://flippac.wordpress.com <![CDATA[Really Only Relative]]> http://flippac.wordpress.com/?p=83 2011-11-09T20:28:56Z 2011-11-08T01:59:53Z Continue reading ]]> In Introduction to Constraint Programming I made something of a mistake. It was a remarkably easy one to make, because I stepped outside the bounds of the type system’s help – and especially that of those who’ve declared interfaces before me.

## Finding and Fixing the Mistake

The centrepiece of the article is an embedded language, which I’d initially written as just an applicative functor and quickly decided needed more power so I added a bind-like construct as well. The functor at its heart only operates on a subcategory of that of Haskell values – defined by a typeclass called `Embeddable`:

```class Embeddable a where
subst :: Assignment -> a -> a

instance Embeddable () where
subst _ v = v

instance Embeddable Term where
subst = substTerm

instance (Embeddable a, Embeddable b) => Embeddable (a -> b) where
subst = const id```

So we’ve got two ground types in the category – `()` and `Term` – and arbitrary-order functions on them. What we haven’t got is the target of the functor, `Embeddable a => Code a` – which means that while the `Bind` constructor types fine and so does its implementations, we can’t write the `join` function that provides the difference between an applicative functor and a monad in Haskell. We can pass around higher-order Haskell functions, but our embedded language is strictly first-order!

Oops? We can fix this by adding a new instance:

```instance Embeddable a => Embeddable (Code a) where
subst _ c = c ```

What made this mistake easy to make, of course, was not having a `Monad`-like typeclass with a defaulted join I was writing an instance for. That said, I did discover something new to me: I can have a "first-order-only monad" with the freedom of binding structure that comes with it. If I’m willing to invoke the relevant GHC flags, it can even play along with do notation.

## What Did We Have Before?

This new structure is interesting – `>>=` still doesn’t admit self-analysis, but if I just want to know that there’s no passing around of computations happening in an EDSL then it’s a lot more pleasant to use than an `Arrow`. The first piece of good news is that there’s already somewhere for it to fit in on HackageDB – the RMonad package. Of course, this implies that not every `RMonad` is a monad.

For something with more theory developed, I believe it’s a Haskell encoding of a Relative Monad. A typeclass for them in Haskell using ConstraintKinds can be found in this draft paper: Subcategories in Haskell.

]]>
0
Philippa Cowderoy https://flippac.wordpress.com <![CDATA[Introduction to Constraint Programming]]> http://flippac.wordpress.com/?p=20 2011-11-09T20:27:55Z 2011-10-16T16:19:45Z Continue reading ]]> In forthcoming posts I’m going to be making heavy use of constraint programming, so I’m going to do an intro to constraint programming as I see it first so everyone knows what I’m talking about before I start making controversial statements. The code’s all in Haskell, which is my usual language of choice. If you’re feeling pedantic, assume I’ve got the basics imported (Data.Map, Control.Monad etc) and hidden stuff that’d conflict with the only things that typecheck – there’s a full source file to play with at the bottom if you want to go further.

## Some basic definitions

Constraint programming isn’t a paradigm itself, but it can be viewed as a "paradigm transformer" – there are such things as Constraint Logic Programming, Constraint Functional Programming and Constraint Imperative Programming. What distinguishes them from their underlying paradigms is the ability to work with constraints on some kind of data, usually to either find a solution to those constraints or check whether such a solution can exist. Time to define a few terms (warning: I may be using them a little idiosyncratically or insufficiently generally!)

• Constraint Domain: a specific setting for constraint problems. A type or types of data that may be constrained and the constraints (possibly: primitive constraints) that are allowed on them.
• Constraint: a predicate that may or may not be satisfied by the values it constrains.
• Constraint Problem: a specific problem within a constraint domain, with specific constraints.
• Constraint Space: an abstract space defined by the variables in a constraint problem. For example, given three real numbers x, y and z and inequalities on them, the canonical constraint space for them is $\mathbb{R}^3$. Inequalities ‘cut’ the space in half, one side satisfiable and the other unsatisfiable – how’s that for intuition? In some constraint domains, there may not be an easily-envisageable constraint space in that the set or universe of potential solutions may not be particularly spatial, or there may not be a space that’s particularly intuitive to work with. Depending on your background we’ll be dealing with one of those in this article.

There are further concepts to come, but I’m going to leave those until they come up – time for a worked example.

## Behold my domain!

Okay, first let’s pick a constraint domain. I’m going to go for a classic: equality on the Herbrand universe of terms. Logic programming fans will recognise this as the implicit constraint domain of conventional logic programming ala Prolog. A datatype:

```data Term = Term !Label ![Term]
deriving Show```

Note the strictness: we only want to deal with finite terms. `Label` can be anything that provides an infinite alphabet, equality and is a member of `Show`. Now we need terms with variables so we can start talking about non-trivial constraints. There are more typeful ways of doing this, but I’m going to go with modifying `Term` to this:

```data Term = Term !Label ![Term] | Metavar Integer
deriving Show```

Why `Metavar`? Well, the language of constraints-on-`Term`s-with-variables is a metalanguage for talking about our original `Term`s – so those variables are metavariables, not variables of the `Term` language per se. This is also why there are more typeful ways to handle the problem: the code above doesn’t have a type of concrete `Term`s without metavariables in them any more. It’ll do for today though. If we have variables, we’re going to need some substitution machinery for them:

```type Assignment = Map Integer Term

substTerm a (Term l cs) = Term l \$ map (substTerm a) cs
substTerm a m@(Metavar i) = case lookup i a of
Nothing -> m
Just t -> substTerm a t```

Astute readers will notice that an assignment is equivalent to a special case of a set of equality constraints where one side is a lone metavariable. It’s important that we don’t allow circular references into our `Assignment`s.

## It’s all Greek to the machine

So now we’ve got that much, let’s build ourselves an EDSL! You may ask where the constraints are: hold that thought for now. Instead we’re going to aim for an initial implementation that solves all constraints as it goes ala Prolog. I’d also like the language to automatically substitute in solutions as it goes so that when you run it you get out a concrete solution straight away. That means we need this typeclass and the following instances:

```class Embeddable a where
subst :: Assignment -> a -> a

instance Embeddable () where
subst _ v = v

instance Embeddable Term where
subst = substTerm

instance (Embeddable a, Embeddable b) => Embeddable (a -> b) where
subst = const id```

Why `Embeddable`? These are the Haskell values we’re going to allow embedded into our own language, itself embedded back into Haskell – think `Applicative`‘s pure or `Monad`‘s return. Most of the EDSLs we think to grace with the name are actually mutually embedded with their host language to some extent – it’s just so convenient to steal from! I’m a fan of deep embeddings when I’m explaining and/or working on the semantics of an EDSL, so syntax next:

```data Code a where
Pure :: Embeddable a => a -> Code a
Ap :: Embeddable b => Code (a -> b) -> Code a -> Code b
Bind :: (Embeddable a, Embeddable b) => Code a -> (a -> Code b) -> Code b
NewVar :: Code Term
Equate :: Term -> Term -> Code Term```

If you didn’t see this coming, we’re effectively building an applicative functor and a monad – only on a subcategory of haskell’s defined by `Embeddable`, instead of on the entire of it. Why do I have both `Ap` and `Bind`, analogues of `<*>` and `>>=`? Mostly historical accident, but I also wanted to highlight a design decision – different levels of power are appropriate to different constraint problems, as we’ll see with one implementation below. `NewVar` is used to generate fresh metavariables as we can’t do it automatically, whereas `Equate` is used to assert equality constraints. At this point we’re ready for multiple implementations of the language’s semantics. Here’s one for starters:

```runOnFly :: Assignment -> Integer -> Code a ->
Either UnificationError (a, Assignment, Integer)
runOnFly a v (Pure t) = return (subst a t, a, v)
runOnFly a v (Ap f p) = do (rf, a', v') <- runOnFly a v f
(rp, a'', v'') <- runOnFly a' v' p
return (subst a'' (rf rp), a'', v'')
runOnFly a v (Bind p f) = do (rp, a', v') <- runOnFly a v p
runOnFly a' v' (f \$ subst a' rp)
runOnFly a v NewVar = return (Metavar v, a, v+1)
runOnFly a v (Equate l r) = case unify a l r of
Left e -> Left e
Right (u,a') -> return (u, a', v)```

The parameters `a` and `v` are the current `Assignment` containing the solution-to-date during runtime and the first free metavariable counting from 0, both of which are handled as if they’re state. The implementation uses the `Either` monad, as it’s entirely possible for a new constraint to be unsatisfiable given the current `Assignment` and make the problem the program describes impossible to solve. Arguably I should’ve newtyped up a couple of `StateT`s on top of `Either` – if only so I can’t mess up the state handling! `NewVar`‘s case uses simple addition, whereas `Equate` calls a unification algorithm on the two `Term`s in the context of the current `Assignment`. Unification’s a standard way to solve this kind of constraint, but it’s pretty tedious so I won’t list the code here – see the bottom for a source file including it as well as everything else in this article. Oh yes, I almost missed it in amongst detailing the obvious: check out the call to `subst` in the `Pure`, `Ap` and `Bind` cases! This is what we gave up our `Monad` and `Applicative` instances for, but it’s kinda neat, huh? GHC 7.4’s new ConstraintKinds feature gives us a good way to combine typeclass-defined subcategories with instances of other typeclasses – hopefully we’ll see appropriate generalisations of `Functor` and the rest standardised on Hackage soon so we can perform such tricks freely.

## The decomposing remains of a problem?

There’s another possible set of semantics for this language though: rather than solving constraints as we go, we could just use the language for constraint generation and let the user pass the resulting problem to satisfiability checkers, solvers and other analyses as appropriate. It’s a lot easier to "debug" the satisfiability of a constraint problem without having to deal with the program that generated it as well, so this is an idea with genuine merit. We’re going to have to talk about constraints explicitly in the code now:

```data Constraint = Term := Term
deriving Show```

I’m going to be lazy and use lists as a representation for constraint sets. We don’t have to use sets at all though – if we treat equality constraints as predicates that are true or false for specific concrete assignments to all the metavariables, we can also treat the entire constraint problem as one big constraint – the conjunction of all the constraints that would otherwise form the constraint set. To go a step further, constraint domains are (or correspond to) intentionally-limited logics! This is a theme I’ll be using in later posts. Code time!

```runGathering :: Integer -> Code a -> (a, Integer, [Constraint])
runGathering v (Pure t) = (t, v, [])
runGathering v (Ap f p) = let (rf, v', csf) = runGathering v f
(rp, v'', csp) = runGathering v' p
in (rf rp, v'', csf ++ csp)
runGathering v (Bind p f) = let (rp, v', cs) = runGathering v p
(rf, v'', cs') = runGathering v' (f rp)
in (rf, v'', cs ++ cs')
runGathering v (NewVar) = (Metavar v, v + 1, [])
runGathering v (Equate l r) = (l, v, [l := r])```

This time around, I should’ve used a `StateT` for the metavariable counter and a `WriterT` to collect `Constraint`s in. This isn’t just a new operational behaviour for the same language, arguably it’s a different language. Why? Because `Bind` and `Ap` see different results than they would with `runOnFly` – they don’t get to observe the results of solving.

## Decision time!

Prolog fans will be wondering something by now. Our `runOnFly` implementation might smell a little bit familiar, but it can’t branch – where’s the disjunction? Well, there’s no reason we can’t have it and still be doing constraint programming, so here goes:

```data Code a where
...
Or :: Embeddable a => Code a -> Code a -> Code a

...
runOnFly a v (Or l r) = case (runOnFly a v l) of
r@Right{} -> r
Left{} -> runOnFly a v r```

Nothing special here – except that `runGathering` can’t handle `Or` because it doesn’t do any solving to branch on. This suggests a new language feature and a third language implementation – why not leave when to solve up to the program?

```data Code a where
...
Solve :: Code ()

...
data ConstraintSet = CS {assumptions :: Assignment,
constraints :: [Constraint]}
deriving Show

emptyConstraintSet = CS Data.Map.empty []

solveConstraints :: ConstraintSet -> Either UnificationError ConstraintSet
solveConstraints (CS a cs) = solveConstraints' a cs
where solveConstraints' a [] = return (CS a [])
solveConstraints' a ((l := r):cs) = do (u, a') <- unify a l r
solveConstraints' a' cs```

What’s with `ConstraintSet`? Well, it’s partitioned into an `Assignment` of solved "constraints" and a list of unsolved ones. When we call `solveConstraints`, it attempts to ‘move’ the unsolved ones into the solved set by unifying terms and seeing how that changes the `Assignment`. Feel free to complain at my use of general recursion!

```run :: ConstraintSet -> Integer -> Code a ->
Either UnificationError (a, ConstraintSet, Integer)
run cs v (Pure t) = return (subst (assumptions cs) t, cs, v)
run cs v (Ap f p) = do (rf, cs', v') <- run cs v f
(rp, cs'', v'') <- run cs' v' p
return (subst (assumptions cs'') (rf rp),
cs'',
v'')
run cs v (Bind p f) = do (rp, cs', v') <- run cs v p
run cs' v' (f \$ subst (assumptions cs') rp)
run cs v NewVar = return (Metavar v, cs, v+1)
run cs v (Equate l r) = let cs' = CS (assumptions cs)
(l := r : constraints cs)
in return (subst (assumptions cs) l, cs', v)
run cs v (Or l r) = case (run cs v l) of
result@Right{} -> result
Left{} -> run cs v r
run cs v Solve = do cs' <- solveConstraints cs
return ((), cs', v)```

Bet you’re wishing I was using monad transformers by now, huh? Now we pass along a `ConstraintSet` in a state-like fashion. The constraints half of it behaves in a `State`-like fashion throughout – this is a design decision, we could have made Solve take a computation as a parameter and only solved the constraints generated inside it instead. In fact, either way round you can find use cases that the decision breaks. I’m not going to implement further here, but clearly there are cases where you might want to both pass constraint problems around first-class (which we get for free) and access the current constraint problem/set (another command for our language).

## A little test

Finally, we’d better find some problems to solve, hadn’t we?

```{- Test code - let's do a little bit of boring Peano arithmetic!
No logic programming here, the constraints are just pattern-matching.
Feel free to write it the other way if you can stomach the awful syntax. -}

solve t = t `Bind` \result -> (Solve `Bind` (const \$ Pure result))

{- Peano numbers, matching, addition -}

succ t = Term "S" [t]
zero = Term "Z" []

integerToNatTerm :: Integer -> Term
integerToNatTerm i | i > 0 = succ (integerToNatTerm (i - 1))
| i == 0 = zero
| otherwise = error "integerToNatTerm: negative parameter"

natTermToInteger :: Term -> Integer
natTermToInteger (Term "Z" []) = 0
natTermToInteger (Term "S" [n]) = 1 + natTermToInteger n
natTermToInteger _ = error "natTermToInteger: parameter isn't a peano natural"

matchZero t = solve \$ Equate t zero
matchSucc t = NewVar `Bind`
(\n -> solve (Equate t (succ n)) `Bind`
(const \$ Pure n)
)

add l r = (matchZero l `Bind` (const \$ Pure r))
`Or`
(matchSucc l `Bind`
\l' -> add l' (succ r)
)```

All that code just to do a little lousy addition?! I’m afraid so. Evidence that it works:

```*Herbrand> let t = add (integerToNatTerm 3) (integerToNatTerm 4)
*Herbrand> run emptyConstraintSet 0 (solve t)
Right (Term "S" [Term "S" [Term "S" [Term "S"
[Term "S" [Term "S" [Term "S" [Term "Z" []]]]]
]]],
CS {assumptions = fromList [(0,Term "S" [Term "S" [Term "Z" []]]),
(1,Term "S" [Term "Z" []]),
(2,Term "Z" [])],
constraints = []},
3)```

You can thank me for the manually added layout later. Feel free to feed the result term into `natTermToInteger` to confirm that it is indeed 7. If we’d written our addition in a Prolog-like style, asserting a relation rather than solving a function, that would have arguably embedded something interesting: a non-primitive constraint. Admittedly not a particularly strong argument given that the constraint system itself not only doesn’t have an entity matching the addition but also doesn’t have the power to encode addition without the externally-imposed branching of our program. But it’s a nice idea for other constraint systems, no?

## Closing up

So what did we learn? We learned about constraint domains (of which there are many and I covered exactly one), constraint problems, constraint spaces and constraints themselves. We learned that a problem may or may not be satisfiable. We didn’t really talk enough about solving them, but we know it’s about finding values for all those variables. We also learned that constraints can be treated as little logics – or not so little. Most importantly we learned that all this stuff exists and (check the links) has been worked on heavily – that there’s something worth learning about! Hopefully this’s been worth your time – catch you around.

• Herbrand.hs – Haskell source file for everything in this article, plus all the bits I skipped
• Monadic Constraint Programming package. I haven’t tried this myself, but I was impressed by the talk at Anglohaskell – well worth a look, and perhaps a quick search for the related papers.
• Wikipedia articles:
• Concepts, Techniques, and Models of Computer Programming by Peter Van Roy and Seif Haridi is an excellent book in general and has a good introduction to constraint programming. I don’t entirely agree with their notion of programming paradigms (it rules out half of why I’m interested in pure functional languages for example), but it’s certainly an advancement.
• The Haskell implementation Helium is designed for beginners and aims to produce better error messages. The team’s research can be found here, and takes a constraint-based approach.
• The HM(X) type system extends Hindley-Milner type inference by parameterising it on a constraint domain – examples include dimension analysis for physics and extensible records. There’s a chapter on it in ATTAPL, with an extended version available here.
]]>
7
Philippa Cowderoy https://flippac.wordpress.com <![CDATA[Hello again, cruel world!]]> http://flippac.wordpress.com/?p=6 2011-09-14T02:50:06Z 2011-09-14T02:50:06Z Continue reading ]]> So I figured I should set up an academic/FP/etc blog and get writing again – preferably one where I didn’t have to do the sysadmin work. I also didn’t particularly want to hang on to the previous title I guess – I don’t think there’s anything wrong with my past self’s sense of humour, but I suppose I’m a bit more wary of reactions these days. Especially as I’m hoping to be talking about restraintconstraint programming a lot!

I’ve not been too active with such things the last few years due to quite a collection of personal circumstances, but I’ve definitely got things to say if only I can get round to them. I don’t have a lot of in-person contact prodding me about such things, so a little online cheerleading would be a big help – hopefully I’ll have more up soon.

]]>
0