# using automated testing to find an unsatisfiable boolean formula that is not easily detected as such by a simple SAT solver

``import System ( getArgs )``

The other day I wrote a simple, incremental solver for boolean satisfiability.

``import Data.Boolean.SatSolver``

Today I want to search for a smallest unsatifiable boolean formula that this solver only identifies as such after backtracking. I'll use SmallCheck to find one.

``import Test.SmallCheck``

The idea is strikingly simple: state that there is no such formula and SmallCheck will shatter this claim showing you the smallest counter example.

``audaciousClaim :: Boolean -> PropertyaudaciousClaim formula = null (solver >>= solve) ==> null solver where solver = assertTrue formula newSatSolver``

In English: If we cannot find a solution by solving the formula, then asserting the formula without solving afterwards must already fail.

The SAT solver performs deterministic simplification steps when asserting formulas that might detect unsatisfiability without backtracking. The operation `solve` searches for solutions non-deterministically and there might be formulas whose unsatisfiability can only be detected via unsuccessful search.

To enable SmallCheck to find such a formula, we need to inform it about the type of boolean formulas.

``instance Serial Boolean where  series = cons1 Var        \/ cons0 Yes        \/ cons0 No        \/ cons1 Not        \/ cons2 (:&&:)        \/ cons2 (:||:)  coseries = undefined``

Now, SmallCheck is ready to go:

``shatterClaim :: IO ()shatterClaim = smallCheckI audaciousClaim``

The function `smallCheckI` will interactively check the claim for boolean formulas of increasing depth. I am running this function while I'm writing this paragraph -- it's searching at depth four. Here are the results for all depths up to three:

``````Depth 0:
Completed 2 test(s) without failure.
But 1 did not meet ==> condition.
Deeper? y
Depth 1:
Completed 13 test(s) without failure.
But 7 did not meet ==> condition.
Deeper? y
Depth 2:
Completed 356 test(s) without failure.
But 193 did not meet ==> condition.
Deeper? y
Depth 3:
Completed 253835 test(s) without failure.
But 137561 did not meet ==> condition.
Deeper? y
``````

From 253835 formulas at depth three 137561 are satisfiable, thats a bit more than half of them. All of the remaining 116274 unsatisfiable formulas are detected as such without backtracking.

SmallChek is running in GHCi now for 15 minutes, still at depth four reaching three million tests. Let's see whether compiled code is faster.

``````main :: IO ()
main = getArgs >>= flip depthCheck audaciousClaim . read . head
``````

Bingo! Compiling this with `-O2` and passing depth four when calling the program yields a counter example in an instant.

``````Depth 4:
Failed test no. 6600270. Test values follow.
Not (Var 0 :&&: Var 0) :&&: (Var (-1) :&&: (Var 0 :||: Var 0))
``````

Hence, ¬(x ∧ x) ∧ y ∧ (x ∨ x) is a smallest formula that is unsatisfiable and not detected as such by my solver.

That is a bit weird because both x ∧ x and x ∨ x can be easily simplified. Let's be suspicious and check the simplified formula ¬x ∧ y ∧ x by hand. We use -1 as encoding for y like in the counter example although that should not matter.

`````` > assertTrue (Not (Var 0) :&&: Var (-1) :&&: Var 0) newSatSolver
[]
``````

Indeed, no backtracking is necessary.

The solver works on conjunctive normal forms (CNF) of boolean formulas. The CNF of the counterexample is x ∨ ¬x) ∧ y ∧ (x ∨ x). Seems like I can easily improve the solver by eliminating duplicate literals from clauses. I'll send an update to Hackage.

After the fix the previously found conter example is none anymore. The (compiled) search at depth four does not yield a counter example within one hour.

Instead of generating arbitrary boolean formulas, let's directly create only conjunctive normal forms. In order to do so, we define a newtype for CNFs and a custom `Serial` instance.

``newtype CNF = CNF { clauses :: [Clause] }newtype Clause = Clause { literals :: [Literal] }newtype Literal = Literal { literal :: Boolean }``
``instance Serial CNF where  series = map CNF . series  coseries = undefined``
``instance Serial Clause where  series = map Clause . series  coseries = undefined``
``instance Serial Literal where  series depth = map Literal \$    concat [ [Var x, Not (Var x)] | x <- series (depth-1) ]  coseries = undefined``

We convert CNFs to feed them to the solver.

``cnfFormula :: CNF -> BooleancnfFormula = foldr (:&&:) Yes           . map (foldr (:||:) No . map literal . literals)           . clausesinstance Show CNF where show = show . cnfFormula``

We restate our claim as property on conjunctive normal forms in order to check only those.

``cnfClaim :: CNF -> PropertycnfClaim cnf = null (solver >>= solve) ==> null solver where solver = assertTrue (cnfFormula cnf) newSatSolver``

Let's try to shatter this one and redefine the main function.

``main :: IO ()main = getArgs >>= flip smallCheck cnfClaim . read . head``

Waiting for SmallCheck to generate counter examples leaves enough time to come up with one by hand: (x ∨ y) ∧ (x ∨ ¬y) ∧ (¬x ∨ y) ∧ (¬x ∨ ¬y) is unsatisfiable but is not detected as such by my solver without guessing. Probably it is the smallest formula with this property.. Does SmallCheck find it? Or an even smaller one?

At depth six the are still no counter examples:

``````Depth 6:
Completed 97589351 test(s) without failure.
But 25183775 did not meet ==> condition.
``````

Searching the almost 0.1 billion candidates took less than 10 minutes on my machine. At depth seven, SmallCheck eats all my memory. Sadly.

As we know, that we only need two variables, we can alter the `Serial` instance for literals accordingly:

``instance Serial Literal where  series _ = map Literal \$    concat [ [Var x, Not (Var x)] | x <- [0,1] ]  coseries = undefined``

There are still no conter examples until depth four..

``````Depth 0:
Completed 1 test(s) without failure.
But 1 did not meet ==> condition.
Depth 1:
Completed 2 test(s) without failure.
But 1 did not meet ==> condition.
Depth 2:
Completed 11 test(s) without failure.
But 5 did not meet ==> condition.
Depth 3:
Completed 232 test(s) without failure.
But 93 did not meet ==> condition.
Depth 4:
Completed 19721 test(s) without failure.
But 7221 did not meet ==> condition.
``````

Hey, none at depth five either:

``````Depth 5:
Completed 6724862 test(s) without failure.
But 2342549 did not meet ==> condition.
``````

I am eagerly waiting for the search on depth six to reveal one.

``````Depth 6:

``````

``````  Failed test no. 47219388. Test values follow.
(Var 0 :||: (Var 0 :||: (Var 0 :||: (Var 0 :||: (Var 1 :||: No))))) :&&: ((Var 0 :||: (Var 0 :||: (Var 0 :||: (Not (Var 1) :||: No)))) :&&: ((Not (Var 0) :||: (Not (Var 0) :||: (Var 1 :||: No))) :&&: ((Not (Var 0) :||: (Not (Var 1) :||: No)) :&&: Yes)))
``````

There it is! It looks larger than the one we created by hand but it only adds redundant literals. If we remove them, the formular reads (x ∨ y) ∧ (x ∨ ¬y) ∧ (¬x ∨ y) ∧ (¬x ∨ ¬y), exactly the formula we envisioned earlier.

Why do we get all these redundant literals? Apparently, according to SmallCheck the counter example has the same depth as the simplified formula and happens to be found earlier in a depth-first traversal.

## What I learned

Searching for a smallest unsatisfiable formula where my solver needs backtracking to detect its unsatisfiability I opted for SmallCheck to automate the search. It pointed out a missed opportunity for simplification in my solver where deterministic simplification was not possible only because of duplicated literals in clauses. However, waiting for counter examples to be generated took so long that there was enough time to come up with the desired formula by hand. At least, this formula was confirmed by SmallCheck after improving the generation of candidate formulas manually (but only after recognizing that two variables are sufficient).