At the Haskell Workshop 2007 Neil Mitchell presented an approach to generic traversals of data that does not need “Scary Types”. He restricted his Uniplate library compared to existing approaches and was able to give an implementation based on a simple Haskell type class.

This post rephrases Neil’s enlightening ideas and simplifies (a bit) the implementation and use of his library. In the following, I will explain the ideas in my own words. I have implemented a Haskell library Traversal.hs for generic traversals and will comment on similarities and differences after documenting my library by example.

## Features of the `Traversal`

library

The main feature of the library is to support to query and manipulate children of tree-structured values. With little help from the programmer, the library provides powerful and flexible queries and transformations of tree structured values. Powerful, because sophisticated functions are implemented in the library independently of a concrete datatype, so the programmer does not need to write such often tedious traversal code herself. Flexible, because the provided functions allow to focus on the interesting part of the data and use default behaviour for other parts, so the programmer can later refine her datatypes without having to adapt transformations that are not concerned with the change.

Finally, the approach is comnpletely lightweight. The library compiles with both hugs and ghc – the currently most widely used Haskell systems. It uses multi-parameter type classes to provide a bit more general interface. An implementation within Haskell98 is very useful already and can be obtained by specializing some type signatures in the presented library.

## A central type class

In order to use the library for her own datatype, the programmer has to make it an instance of the type class `Traversable`

:

class Traversable a b where scrap :: a -> ([b],[b] -> a)

This type class specifies a single function `scrap`

that selects a list of children from an arbitrary value and provides a means to replace the children of the given value. The main restriction of this approach is that all children must have the same type. Thanks to multi-parameter type-classes, the type of the children can differ from the type of the “mother”. Although the approach is not as general as other approaches to generic traversals it is general enough for many practical examples.

Assume we have defined types for names and expressions and define a type for bindings based on these:

type Binds = [(Name,Exp)]

We can make this type traversable as follows:

instance Traversable Binds Exp where scrap bs = let (xs,es) = unzip bs in (es, zip xs)

We decided to consider expressions – not names – children of bindings. We can use library functions to compute the list of expressions and replace bound expressions in a list of bindings:

boundExps :: Binds -> [Exp] boundExps = children replaceBoundExps :: Binds -> [Exp] -> Binds replaceBoundExps = replaceChildren

The employed library functions have the following types:

children :: Traversable a b => a -> [b] replaceChildren :: Traversable a b => a -> [b] -> a

They are simply defined by projecting to the two components of the result of the function `scrap`

that is defined in the `Traversable`

class.

Honestly, this is a boring example. There are more interesting functions defined in the library that are based on these “single-step operations”. Let’s consider these!

## A datatype for expressions

As a running example, we will use a datatype for expressions defined as follows.

type Name = String data Exp = Cons Name | Var Name | Lam Name Exp | Exp :@: Exp | Let Binds Exp

As this is a recursive datatype, we can make it an instance of `Traversable`

by considering direct subexpressions children of an expression:

instance Traversable Exp Exp where scrap (Lam x e) = ([e], \ [e'] -> Lam x e') scrap (e1 :@: e2) = ([e1,e2], \ [e1',e2'] -> e1' :@: e2') scrap (Let bs e) = let (es,rep) = scrap bs in (e:es, \ (e':es') -> Let (rep es') e') scrap e = ([], \_ -> e)

Note, how we can reuse the instance `Traversable Binds Exp`

to implement the `Let`

-case for the `scrap`

function. The library function

family :: Traversable a a => a -> [a]

computes the list of the given value, its children, those children and so on, if the given value is traversable and its type equals the type of its children. Applied to an expression, it yields the list of all subexpressions of the given expression. We can use this function, e.g., to compute the names of all variables that are used in an expression:

usedVars :: Exp -> [Name] usedVars e = [ n | Var n ← family e ]

Note that this function mentions only a single constructor of the `Exp`

datatype. If we want to add new kinds of expressions – for example, case expressions for pattern matching – only the instance declaration for `Traversable`

has to be adapted – `usedVars`

does not have to be changed.

## A simple transformation

We cannot only query subexpressions but also transform them with user defined functions. Assume we want to eliminate simple let expressions. A let expression is simple, if the bound expressions do not use the introduced variables:

isSimpleLet :: Exp -> Bool isSimpleLet (Let bs _) = null (intersect xs (concatMap usedVars es)) where xs = map fst bs es = boundExps bs isSimpleLet _ = False

Simple let expressions can be replaced by a lambda expression as follows:

elimSimpleLet :: Exp -> Exp elimSimpleLet (Let bs e) = foldl (:@:) (foldr Lam e xs) es where xs = map fst bs es = boundExps bs

In order to apply this transformation to every simple let expression in a given expression we employ the library function

mapFamily :: Traversable a a => (a -> a) -> a -> a

that maps the given function on all family members of the given value. With the help of `mapFamily`

, eliminating simple let expressions is straightforward:

elimSimpleLets :: Exp -> Exp elimSimpleLets = mapFamily elim where elim e | isSimpleLet e = elimSimpleLet e | otherwise = e

Note again, that we don’t have to change anything, if we decide to add new constructors to the expression datatype.

## Refining the transformation

The specification of simple let expressions can be improved. For example, the following let expression could be considered simple, but is not by the current definition of `isSimpleLet`

:

let x=\x.x in x

Although `x`

appears in the list of variables used in `\x.x`

we can safely eliminate this let expression, because `x`

does not occur *freely* in `\x.x`

. If we redefine `isSimpleLet`

as

isSimpleLet :: Exp -> Bool isSimpleLet (Let bs _) = null (intersect xs (concatMap freeVars es)) where xs = map fst bs es = boundExps bs isSimpleLet _ = False

then the above let expression is simplified to

(\x.x) (\x.x)

But how can we define the function `freeVars`

?

## A monad for scoping

An occurrence of a variable is called *free* if the variable is not in scope, i.e., bound by a let or lambda expression, when it is referenced. We use the following monad with two auxiliary functions to model scoping:

newtype Scoped a = Scoped { runScoped :: [Name] -> a } instance Monad Scoped where return x = Scoped (const x) a >>= f = Scoped (\ns -> runScoped (f (runScoped a ns)) ns) isInScope :: Name -> Scoped Bool isInScope n = Scoped (n`elem`) extendScope :: [Name] -> Scoped a -> Scoped a extendScope ns scpd = Scoped (runScoped scpd . (ns++))

The function `extendScope`

is special because it takes a monadic action and runs it in a different scope. This is more natural than using a state monad because variables that are bound in one subexpression can occur freely in another.

The function `freeVars`

can now be defined in terms of a monadic action that computes free variables:

freeVars :: Exp -> [Name] freeVars e = runScoped (freeVarsScoped e) []

The function `freeVarsScoped`

has the structure of a generalized fold, i.e., it matches all alternatives of the `Exp`

datatype, calls itself recursively on subexpressions, and recombines the results to yield the overall result.

This is a very frequent scheme – many transformations are defined like this. However, if we mention all constructors explicitely in the definition of `freeVarsScoped`

, this results in a poorly extensible implementation. We would have to change `freeVarsScoped`

whenever we change the definition of `Exp`

.

As we are only interested in the `Var`

, `Lam`

and `Let`

cases, we are only willing to change `freeVarsScoped`

when one of these constructors changes. We don’t care about the others.

Of course, the `Traversal`

library comes to the rescue. It provides a function

fold :: Traversable a a => (a -> [r] -> r) -> a -> r

that can be employed to define `freeVarsScoped`

as follows:

freeVarsScoped :: Exp -> Scoped [Name] freeVarsScoped = fold vars where vars (Var n) cs = do inScope ← isInScope n if inScope then return [] else return [n] vars (Lam n _) cs = extendScope [n] (descend cs) vars (Let bs _) cs = extendScope (map fst bs) (descend cs) vars _ cs = descend cs descend = liftM concat . sequence

We only explicitely mention the constructors that we are interested in. All other cases are summarized by the local function `descend`

that combines the results of the different recursive calls that are hidden in the implementation of `fold`

.

## Comparison with Neil’s Uniplate library

The main difference between the `Traversal`

and the `Uniplate`

library is that the former defines a single type class `Traversable`

and the latter defines two type classes:

class Uniplate a where uniplate :: a -> ([a], [a] -> a) class Uniplate b => Biplate a b where biplate :: a -> ([b], [b] -> a)

You can see that the type of `uniplate`

is a special case of the type of `biplate`

with `a=b`

.

The function `family`

is called `universe`

in Uniplate:

universe :: Uniplate a => a -> [a]

There is a similar function `universeBi`

for the `Biplate`

class that is called `childFamilies`

in the Traversal library:

universeBi :: Biplate a b => a -> [b] childFamilies :: (Traversable a b, Traversable b b) => a -> [b]

The functions `family`

and `childFamilies`

can be implemented as mutually recursive functions quite naturally:

family :: Traversable a a => a -> [a] family a = a : childFamilies a childFamilies :: (Traversable a b, Traversable b b) => a -> [b] childFamilies = concatMap family . children

The function `uniplate`

cannot be implemented in terms of `uniplateBi`

because an instance `Uniplate a`

does not imply `Biplate a a`

.

Moreover, Neil seems to have invested much more thoughts in making his implementation efficient. I’m wondering whether his optimizations could be imitated within the presented single type-class design.

Another topic for future investigations is to derive instances of the class `Traversable`

automatically like it is possible with Uniplate. Until now, I did not feel the need for implementing this for my own projects because the instances are not very difficult to write by hand.

## Try it out!

I did not describe the whole interface of the library. So watch Traversal.hs for more information.