### Table of Contents

# Implementing Least-Strict Natural Numbers

## Introduction

On this page I want to summarize some insights about the behaviour of natural number implementations (in respect to non-strictness). I have gained these by using a tool called StrictCheck. StrictCheck was developed by Olaf Chitil and is supposed to test whether a function is unnecessarily strict. In the long term my goal is to improve the suggestions made by StrictCheck because in some cases StrictCheck is too ambitious. More Information about the development of StrictCheck can be found on a separate page. This page is supposed to be some kind of motivation for further development of StrictCheck. It shows that StrictCheck can be used to gain surprising insights about functions. For example the multiplication of Peano numbers like it is implemented naively is not least-strict. Furthermore the examples should demonstrate that StrictCheck is especially useful in the area of functional logic programming.

This work also initiated an implementation of natural numbers by a binary representation in Haskell. This implementation is similar to the implementation presented in Section “Binary Numbers”. This implementation is efficient in comparison to Peano numbers and is still less strict than the external numbers. A cabal package can be found at hackageDB.

The idea is to identify functions that are not least-strict. A function is not least-strict if we can alter it such that it behaves equally for total values but yields a more defined result for a partial value. In a functional logic programming language it is very important that functions are least-strict because least-strict functions result in the smallest possible search space when we use free variables.

## Peano Numbers

We start with an implementation of Peano numbers.

data Peano = Z | S Peano

We will see that even for this trivial data type a naive implementation is not least-strict.

### Multiplication

We implement a addition and on the basis of this a multiplication for Peano numbers.

(+) :: Peano -> Peano -> Peano Z + y = y S x + y = S (x + y) (*) :: Peano -> Peano -> Peano Z * _ = Z S x * y = y + (x * y)

The following example is supposed to demonstrate that even the implementation of simple Peano arithmetic is not least strict if we are not careful. The Curry function `check`

takes a number `n`

and checks whether there is another number `m`

such that `m * n = 1`

.

check :: Peano -> Peano check n | m * n == S Z = m where m free

This function guesses a Peano number `m`

and checks whether it yields 1 when multiplied by `n`

. If we apply `check`

to `Z`

the function does not terminate. If we swap the arguments of `(*)`

in `check`

it yields `no more solutions`

because there is no natural number `m`

with the property `0 * m = 1`

. But there is also no number `m`

with the property `m * 0 = 1`

.

Although least-strictness is more important in a functional logic language the multiplication shows up an undesirable behaviour in a functional programming language, too. For example we can define the infinite Peano number as follows.

infinity :: Peano infinity = S infinity

While the evaluation of `Z * infinity`

yields `Z`

the evaluation of `infinity * Z`

does not terminiate.

So, let's have a closer look at the behaviour of `(*)`

. We use StrictCheck to check whether `(*)`

is least-strict.

Failed: f _|_ Z = _|_ but should be Z Failed: f (S _|_) Z = _|_ but should be Z Failed: f (S (S _|_)) Z = _|_ but should be Z

Note that we can check only a finite number of partial values and there are more examples where `(*)`

is not least-strict. These results state that an application of `(*)`

to `_|_`

and `Z`

yields `_|_`

but should yield `Z`

. Sadly this is not possible (at least in Haskell) because StrictCheck also demands that `Z * _|_`

yields 'Z'. But also in the context of Curry we are only interested in deterministic functions. We do not want to introduce non-determinism if it is not necessary. Therefore we would prefer StrictCheck to take this into account. I want to address exactly this problem in the future.

So, let's take a look at the second counter example and check whether we can satisfy this one. The operation (+) evaluates its first argument to HNF to yield a HNF. Therefore the expression `y + (x * y)`

demands the evaluation of `y`

to yield a HNF. The recursive call to `(*)`

demands the evaluation of its first argument. Therefore it would be cleverer to switch the arguments of the recursive call of `(*)`

.

(*) :: Peano -> Peano -> Peano Z * _ = Z S x * y = y + (y * x)

We check this new implementation with StrictCheck.

Failed: f _|_ Z = _|_ but should be Z

That is the other two cases are least-strict now. Furthermore now the evaluation of `check Z`

yields `no more solutions`

and `infinity * Z`

yields `Z`

.

### Comparison

We want to take a look at the implementation of comparison functions for Peano numbers now. First we define a comparison of two Peano numbers which yields a value of type `Ordering`

.

compare :: Peano -> Peano -> Ordering compare Z Z = EQ compare Z (S _) = LT compare (S _) Z = GT compare (S x) (S y) = compare x y

On basis of this function we can define `(<)`

, `(<=)`

, `(>)`

and `(>=)`

. For example `(>)`

is defined as follows.

(>) :: Peano -> Peano -> Bool x > y = compare x y == GT

In Haskell these functions belong to the type class `Ord`

. It is sufficient to only define `compare`

. In this case `(<)`

is exactly defined as stated above. So let's check `(<)`

for least-strictness.

Failed: f _|_ Z = _|_ but should be False Failed: f (S _|_) (S Z) = _|_ but should be False

Well, the first case sounds reasonable. If we check whether anything is smaller than `Z`

we know this is false. If we take a look at `compare`

we see that we preform pattern matching on both arguments. But we do not have to. If we would define `(>)`

without using `compare`

it would look like the following.

(>) :: Peano -> Peano -> Bool Z > _ = False S _ > Z = True S x > S y = x > y

This function is least-strict. It looks very similar to `compare`

but we merge the first two cases of `compare`

. Because we do not want to define all the comparison functions this way we define a special version of `compare`

called `compareLT`

. This function yields `LT`

instead of `EQ`

in case the two arguments are equal.

compareLT :: Peano -> Peano -> Ordering compareLT Z _ = LT compareLT (S _) Z = GT compareLT (S x) (S y) = compareLT x y

Now we can define a least-strict version of all the comparison functions on basis of `compareLT`

.

x > y = compareLT x y == GT x >= y = compareLT y x == LT x < y = compareLT y x == GT x <= y = compareLT x y == LT

## Binary Numbers

Peano numbers are nice if you want to use narrowing but they are very inefficient if you want to do arithmetic. Therefore Curry provides an implementation of natural numbers by a binary representation. The implementation in Curry differs slightly from the one presented here.

data Nat1 = IHi | O Nat | I Nat

On the basis of this data type we implement natural numbers including zero (which nobody really likes but apparently everybodty needs ;) ).

data Nat = Zero | Pos Nat1

On the basis of this data type we implement signed integers.

data Int = Neg Nat1 | Pos0 Nat

### Multiplication

The multiplication of two binary numbers can be implemented as follows.

(*) :: Nat1 -> Nat1 -> Nat1 IHi * y = y I x * y = y + O (x * y) O x * y = O (x * y)

This implementation has a similar drawback (with respect to least-strictness) as the implementation of the multiplication of Peano numbers. The addition evaluates both its arguments to HNF to yield a HNF. Therefore it is better with respect to least-strictness to swap the arguments of the recursive call of `(*)`

. The following implementation is least-strict.

(*) :: Nat1 -> Nat1 -> Nat1 IHi * y = y I x * y = y + O (y * x) O x * y = O (x * y)

Note that it is not a good idea to swap the second recursive call of `(*)`

. Note furthermore that this implementation is not the canonical one and it is very difficult to see whether this function is least-strict. Therefore a supporting tool would be very valuable.

### Minus

(-) :: Nat1 -> Nat1 -> Int IHi - y = inc (Neg y) O x - IHi = Pos (pred (O x)) I x - IHi = Pos (O x) O x - O y = o (x - y) O x - I y = dec (o (x - y)) I x - O y = inc (o (x - y)) I x - I y = o (x - y) o :: Int -> Int o (Neg n) = Neg (O n) o (Pos0 n) = Pos0 (case n of Zero -> Zero Pos m -> Pos (O m))

(-) :: Nat1 -> Nat1 -> Int x - y = case minus x y of Pos0 n -> Pos0 n Neg IHi -> zero Neg n -> Neg (pred n) minus :: Nat1 -> Nat1 -> Int minus IHi y = Neg y minus x@(O _) IHi = pos (pred x) minus (I x) IHi = pos (O x) minus (O x) (O y) = incNeg (o (minus x y)) minus (I x) (I y) = incNeg (o (minus x y)) minus (O x) (I y) = decNeg (o (minus x y)) minus (I x) (O y) = decNeg (o (negate (minus y x))) incNeg :: Int -> Int incNeg (Neg n) = Neg (pred n) incNeg (Pos0 n) = Pos0 n decNeg :: Int -> Int decNeg (Neg n) = Neg n decNeg (Pos0 n) = Pos0 (pred n)

### Division

A simple implementation of the division of two natural numbers is implemented as following.

divmodNat :: Nat -> Nat -> (Nat,Nat) divmodNat _ Zero = error "divide by zero" divmodNat Zero _ = (Zero,Zero) divmodNat (Pos x) (Pos y) = divmodNat1 x y divmodNat1 :: Nat1 -> Nat1 -> (Nat,Nat) divmodNat1 x y | x<y = (Zero,Pos x) | otherwise = (succ q,r) where (q,r) = divmodNat (Pos x-Pos y) (Pos y)

You can implement the test whether a number is even or not on basis of `divmodNat`

. In Haskell you have actually no choice as the function `even`

is implemented as following by demanding its argument to be a member of the type class `Integral`

. For natural numbers `mod`

and `rem`

are the same.

even :: Nat -> Bool even n = n `rem` Pos (O IHi) == Zero

Note that it is furthermore not possible to define `'even`

for the data type `Nat`

in Haskell as it does not support a zero.

Let's test whether a number is even or not. In our representation we can check this by simply taking a look at the first bit. But with the above implementation the expression `even (Pos (O _|_))`

is evaluated to `_|_`

. But we improve this implementation as follows.

divmodNat :: Nat -> Nat -> (Int,Int) divmodNat x IHi = (Pos x,Zero) divmodNat (O x) (O y) = (q,o r) where (q,r) = divmodNat x y divmodNat (I x) (O y) = (q,i r) where (q,r) = divmodNat x y divmodNat x y | x<y = (Zero,Pos x) | otherwise = (succ q,r) where (q,r) = divmodNat (Pos x-Pos y) (Pos y) o :: Int -> Int o Zero = Zero o (Pos n) = Pos (O n) i :: Int -> Int i Zero = Pos IHi i (Pos n) = Pos (I n)

When we now check `even (Pos (O _|_))`

we get `True`

as we have expected.

### Comparison

The following shows a canonical way of implementing the `compare`

function for the `Nat`

data type.

compare :: Nat -> Nat -> Ordering compare x y = cmpNat EQ x y cmpNat :: Ordering -> Nat -> Nat -> Ordering cmpNat eq IHi IHi = eq cmpNat _ IHi (O _) = LT cmpNat _ IHi (I _) = LT cmpNat _ (O _) IHi = GT cmpNat _ (I _) IHi = GT cmpNat eq (O x) (O y) = cmpNat eq x y cmpNat eq (I x) (I y) = cmpNat eq x y cmpNat _ (O x) (I y) = cmpNat LT x y cmpNat _ (I x) (O y) = cmpNat GT x y

Instead of using a case distinction on the result of the recursive call it uses an extra parameter. When we check this function using StrictCheck we get the following result.

Failed: f (2*_|_+1) (2*1) = _|_ but should be GT Failed: f (2*1) (2*_|_+1) = _|_ but should be LT

Are these counter examples satisfiable? Let's take a look at more counter examples.

Failed: f (2*(2*_|_+1)) (2*(2*1)) = _|_ but should be GT Failed: f (2*(2*_|_+1)) (2*(2*1)+1) = _|_ but should be GT

These all have in common that you do not have to evaluate the second argument if the other-one is `IHi`

. And you do not have to because you know from a less significant bit that one of the two numbers is less than the other. In the definition of `compare`

if the first argument is `IHi`

we check the other one. But we do not have to if the numbers differ in one of the previous bits. Therefore we define a function `compareLT`

like we did for Peano numbers which yields `LT`

in the case that its arguments are equal. This way we can merge the first three cases of the function and do not have to pattern match on the second argument. Instead of defining a similar function `compareGT`

we use a function `invOrd`

which changes `LT`

to `GT`

and `LT`

to `GT`

.

compareLT :: Nat1 -> Nat1 -> Ordering comapreLT IHi _ = LT compareLT (O _) IHi = GT compareLT (I _) IHi = GT compareLT (O x) (O y) = compareLT x y compareLT (I x) (I y) = compareLT x y compareLT (O x) (I y) = compareLT x y compareLT (I x) (O y) = invOrd (compareLT y x)

On basis of `compareLT`

we define `compare`

.

compare :: Nat1 -> Nat1 -> Ordering compare IHi IHi = EQ compare IHi (O _) = LT compare IHi (I _) = LT compare (O _) IHi = GT compare (I _) IHi = GT compare (O x) (O y) = compare x y compare (I x) (I y) = compare x y compare (O x) (I y) = compareLT x y compare (I x) (O y) = invOrd (compareLT y x)

We can reduce the code duplication by introducing a functional parameter. When we check the new implementation of `compare`

there are no more counterexamples. Another way of implementing this is using minus and checking the sign of the result.

### Performance

The following Curry function can be used to generate Pythagorean triples.

pyt :: (Nat,Nat,Nat) pyt | a*a+b*b==c*c = (a,b,c) where a,b,c free testPyt n = take n (allValuesB (searchTree pyt))

no of triples | old implementation | lest-strict implementation |
---|---|---|

100 | 3.2s | 1.2s |

200 | 21.8s | 3.9s |

300 | 46.2s | 13.8s |

## Conclusion

If you want to define a function which is as non-strict as possible you have to invest brain power. Furthermore it is very difficult to reason about the non-strictness of a function. Even for simple functions like the multiplication of Peano numbers the canonical implementation is not least-strict. There are good reasons to be least-strict in a functional programming language (for example infinite data structures like `infinity`

). But there are even better reasons to be least-strict in a functional logic programming language.

Right now you have to interpret the results of StrictCheck which is not easy. Therefore we have to improve the tool to make applicable for everybody.