This is the third post in a series of posts:

- Fast Sudoku Solver in Haskell #1: A Simple Solution
- Fast Sudoku Solver in Haskell #2: A 200x Faster Solution
- Fast Sudoku Solver in Haskell #3: Picking the Right Data Structures

Discuss this post on r/haskell.

Sudoku is a number placement puzzle. It consists of a 9x9 grid which is to be filled with digits from 1 to 9 such that each row, each column and each of the nine 3x3 sub-grids contain all the digits. Some of the cells of the grid come pre-filled and the player has to fill the rest.

In the previous post, we improved the performance of the simple Sudoku solver by implementing a new strategy to prune cells. This new strategy found the digits which occurred uniquely, in pairs, or in triplets and fixed the cells to those digits. It led to a speedup of about 200x over our original naive solution. This is our current run^{1} time for solving all the 49151 17-clue puzzles:

```
$ cat sudoku17.txt | time stack exec sudoku > /dev/null
258.97 real 257.34 user 1.52 sys
```

Let’s try to improve this time.^{2}

Instead of trying to guess how to improve the performance of our solution, let’s be methodical about it. We start with profiling the code to find the bottlenecks. Let’s compile and run the code with profiling flags:

```
$ stack build --profile
$ head -1000 sudoku17.txt | stack exec -- sudoku +RTS -p > /dev/null
```

This generates a `sudoku.prof`

file with the profiling output. Here are the top seven *Cost Centres*^{3} from the file (cleaned for brevity):

Cost Centre | Src | %time | %alloc |
---|---|---|---|

`exclusivePossibilities` |
Sudoku.hs:(49,1)-(62,26) | 18.9 | 11.4 |

`pruneCellsByFixed.pruneCell` |
Sudoku.hs:(75,5)-(76,36) | 17.7 | 30.8 |

`exclusivePossibilities.\.\` |
Sudoku.hs:55:38-70 | 11.7 | 20.3 |

`fixM.\` |
Sudoku.hs:13:27-65 | 10.7 | 0.0 |

`==` |
Sudoku.hs:15:56-57 | 5.6 | 0.0 |

`pruneGrid'` |
Sudoku.hs:(103,1)-(106,64) | 5.0 | 6.7 |

`pruneCellsByFixed` |
Sudoku.hs:(71,1)-(76,36) | 4.5 | 5.0 |

`exclusivePossibilities.\` |
Sudoku.hs:58:36-68 | 3.4 | 2.5 |

*Cost Centre* points to a function, either named or anonymous. *Src* gives the line and column numbers of the source code of the function. *%time* and *%alloc* are the percentages of time spent and memory allocated in the function, respectively.

We see that `exclusivePossibilities`

and the nested functions inside it take up almost 34% time of the entire run time. Second biggest bottleneck is the `pruneCell`

function inside the `pruneCellsByFixed`

function.

We are going to look at `exclusivePossibilities`

later. For now, it is easy to guess the possible reason for `pruneCell`

taking so much time. Here’s the code for reference:

```
pruneCellsByFixed :: [Cell] -> Maybe [Cell]
pruneCellsByFixed cells = traverse pruneCell cells
where
fixeds = [x | Fixed x <- cells]
pruneCell (Possible xs) = makeCell (xs Data.List.\\ fixeds)
pruneCell x = Just x
```

`pruneCell`

uses `Data.List.\\`

to find the difference of the cell’s possible digits and the fixed digits in the cell’s block. In Haskell, lists are implemented as singly linked lists. So, finding the difference or intersection of two lists is O(n^{2}), that is, quadratic asymptotic complexity. Let’s tackle this bottleneck first.

What is a efficient data structure for finding differences and intersections? Why, a *Set* of course! A Set stores unique values and provides fast operations for testing membership of its elements. If we use a Set to represent the possible values of cells instead of a List, the program should run faster. Since the possible values are already unique (`1`

–`9`

), it should not break anything.

Haskell comes with a bunch of Set implementations:

`Data.Set`

which is a generic data structure implemented as self-balancing binary search tree.`Data.HashSet`

which is a generic data structure implemented as hash array mapped trie.`Data.IntSet`

which is a specialized data structure for integer values, implemented as radix tree.

However, a much faster implementation is possible for our particular use-case. We can use a *BitSet*.

A BitSet uses bits to represent unique members of a Set. We map values to particular bits using some function. If the bit corresponding to a particular value is set to 1 then the value is present in the Set, else it is not. So, we need as many bits in a BitSet as the number of values in our domain, which makes is difficult to use for generic problems. But, for our Sudoku solver, we need to store only the digits `1`

–`9`

in the Set, which make BitSet very suitable for us. Also, the Set operations on BitSet are implemented using bit-level instructions in hardware, making them much faster than those on the other data structure listed above.

In Haskell, we can use the `Data.Word`

module to represent a BitSet. Specifically, we can use the `Data.Word.Word16`

type which has sixteen bits because we need only nine bits to represent the nine digits. The bit-level operations on `Word16`

are provided by the `Data.Bits`

module.

First, we replace List with `Word16`

in the `Cell`

type and add a helper function:

```
data Cell = Fixed Data.Word.Word16
| Possible Data.Word.Word16
deriving (Show, Eq)
setBits :: Data.Word.Word16 -> [Data.Word.Word16] -> Data.Word.Word16
setBits = Data.List.foldl' (Data.Bits..|.)
```

Then we replace `Int`

related operations with bit related ones in the read and show functions:

```
readGrid :: String -> Maybe Grid
readGrid s
| length s == 81 =
traverse (traverse readCell) . Data.List.Split.chunksOf 9 $ s
| otherwise = Nothing
where
allBitsSet = 1022
readCell '.' = Just $ Possible allBitsSet
readCell c
| Data.Char.isDigit c && c > '0' =
Just . Fixed . Data.Bits.bit . Data.Char.digitToInt $ c
| otherwise = Nothing
showGrid :: Grid -> String
showGrid = unlines . map (unwords . map showCell)
where
showCell (Fixed x) = show . Data.Bits.countTrailingZeros $ x
showCell _ = "."
showGridWithPossibilities :: Grid -> String
showGridWithPossibilities = unlines . map (unwords . map showCell)
where
showCell (Fixed x) = (show . Data.Bits.countTrailingZeros $ x) ++ " "
showCell (Possible xs) =
"[" ++
map (\i -> if Data.Bits.testBit xs i
then Data.Char.intToDigit i
else ' ')
[1..9]
++ "]"
```

We set the same bits as the digits to indicate the presence of the digits in the possibilities. For example, for digit `1`

, we set the bit 1 so that the resulting `Word16`

is `0000 0000 0000 0010`

or 2. This also means, for fixed cells, the value is count of the zeros from right.

The change in the `exclusivePossibilities`

function is pretty minimal:

```
-exclusivePossibilities :: [Cell] -> [[Int]]
+exclusivePossibilities :: [Cell] -> [Data.Word.Word16]
exclusivePossibilities row =
row
& zip [1..9]
& filter (isPossible . snd)
& Data.List.foldl'
(\acc ~(i, Possible xs) ->
- Data.List.foldl'
- (\acc' x -> Map.insertWith prepend x [i] acc')
- acc
- xs)
+ Data.List.foldl'
+ (\acc' x -> if Data.Bits.testBit xs x
+ then Map.insertWith prepend x [i] acc'
+ else acc')
+ acc
+ [1..9])
Map.empty
& Map.filter ((< 4) . length)
& Map.foldlWithKey' (\acc x is -> Map.insertWith prepend is [x] acc) Map.empty
& Map.filterWithKey (\is xs -> length is == length xs)
& Map.elems
+ & map (Data.List.foldl' Data.Bits.setBit Data.Bits.zeroBits)
where
prepend ~[y] ys = y:ys
```

In the nested folding step, instead of folding over the possible values of the cells, now we fold over the digits from `1`

to `9`

and insert the entry in the map if the bit corresponding to the digit is set in the possibilities. And as the last step, we convert the exclusive possibilities to `Word16`

by folding them, starting with zero. As example in the *REPL* should be instructive:

```
*Main> poss = Data.List.foldl' Data.Bits.setBit Data.Bits.zeroBits
*Main> row = [Possible $ poss [4,6,9], Fixed $ poss [1], Fixed $ poss [5], Possible $ poss [6,9], Fixed $ poss [7], Possible $ poss [2,3,6,8,9], Possible $ poss [6,9], Possible $ poss [2,3,6,8,9], Possible $ poss [2,3,6,8,9]]
*Main> putStr $ showGridWithPossibilities [row]
[ 4 6 9] 1 5 [ 6 9] 7 [ 23 6 89] [ 6 9] [ 23 6 89] [ 23 6 89]
*Main> exclusivePossibilities row
[16,268]
*Main> [poss [4], poss [8,3,2]]
[16,268]
```

This is the same example row as the last time. And it returns same results, excepts as a list of `Word16`

now.

Now, we change `makeCell`

to use bit operations instead of list ones:

```
makeCell :: Data.Word.Word16 -> Maybe Cell
makeCell ys
| ys == Data.Bits.zeroBits = Nothing
| Data.Bits.popCount ys == 1 = Just $ Fixed ys
| otherwise = Just $ Possible ys
```

And we change cell pruning functions too:

```
pruneCellsByFixed :: [Cell] -> Maybe [Cell]
pruneCellsByFixed cells = traverse pruneCell cells
where
- fixeds = [x | Fixed x <- cells]
+ fixeds = setBits Data.Bits.zeroBits [x | Fixed x <- cells]
- pruneCell (Possible xs) = makeCell (xs Data.List.\\ fixeds)
+ pruneCell (Possible xs) =
+ makeCell (xs Data.Bits..&. Data.Bits.complement fixeds)
pruneCell x = Just x
pruneCellsByExclusives :: [Cell] -> Maybe [Cell]
pruneCellsByExclusives cells = case exclusives of
[] -> Just cells
_ -> traverse pruneCell cells
where
exclusives = exclusivePossibilities cells
- allExclusives = concat exclusives
+ allExclusives = setBits Data.Bits.zeroBits exclusives
pruneCell cell@(Fixed _) = Just cell
pruneCell cell@(Possible xs)
| intersection `elem` exclusives = makeCell intersection
| otherwise = Just cell
where
- intersection = xs `Data.List.intersect` allExclusives
+ intersection = xs Data.Bits..&. allExclusives
```

Notice how the list difference and intersection functions are replaced by `Data.Bits`

functions. Specifically, list difference is replace by bitwise-and of the bitwise-complement, and list intersection is replaced by bitwise-and.

We make a one-line change in the `isGridInvalid`

function to find empty possible cells using bit ops:

```
isGridInvalid :: Grid -> Bool
isGridInvalid grid =
any isInvalidRow grid
|| any isInvalidRow (Data.List.transpose grid)
|| any isInvalidRow (subGridsToRows grid)
where
isInvalidRow row =
let fixeds = [x | Fixed x <- row]
- emptyPossibles = [x | Possible x <- row, null x]
+ emptyPossibles = [() | Possible x <- row, x == Data.Bits.zeroBits]
in hasDups fixeds || not (null emptyPossibles)
hasDups l = hasDups' l []
hasDups' [] _ = False
hasDups' (y:ys) xs
| y `elem` xs = True
| otherwise = hasDups' ys (y:xs)
```

And finally, we change the `nextGrids`

functions to use bit operations:

```
nextGrids :: Grid -> (Grid, Grid)
nextGrids grid =
let (i, first@(Fixed _), rest) =
fixCell
. Data.List.minimumBy (compare `Data.Function.on` (possibilityCount . snd))
. filter (isPossible . snd)
. zip [0..]
. concat
$ grid
in (replace2D i first grid, replace2D i rest grid)
where
possibilityCount (Possible xs) = Data.Bits.popCount xs
possibilityCount (Fixed _) = 1
fixCell ~(i, Possible xs) =
let x = Data.Bits.countTrailingZeros xs
in case makeCell (Data.Bits.clearBit xs x) of
Nothing -> error "Impossible case"
Just cell -> (i, Fixed (Data.Bits.bit x), cell)
replace2D :: Int -> a -> [[a]] -> [[a]]
replace2D i v =
let (x, y) = (i `quot` 9, i `mod` 9) in replace x (replace y (const v))
replace p f xs = [if i == p then f x else x | (x, i) <- zip xs [0..]]
```

`possibilityCount`

now uses `Data.Bits.popCount`

to count the number of bits set to 1. `fixCell`

now chooses the first set bit from right as the digit to fix. Rest of the code stays the same. Let’s build and run it:

```
$ stack build
$ cat sudoku17.txt | time stack exec sudoku > /dev/null
69.44 real 69.12 user 0.37 sys
```

Wow! That is almost 3.7x faster than the previous solution. It’s a massive win! But let’s not be content yet. To the profiler again!^{4}

Running the profiler again gives us these top six culprits:

Cost Centre | Src | %time | %alloc |
---|---|---|---|

`exclusivePossibilities` |
Sudoku.hs:(57,1)-(74,26) | 25.2 | 16.6 |

`exclusivePossibilities.\.\` |
Sudoku.hs:64:23-96 | 19.0 | 32.8 |

`fixM.\` |
Sudoku.hs:15:27-65 | 12.5 | 0.1 |

`pruneCellsByFixed` |
Sudoku.hs:(83,1)-(88,36) | 5.9 | 7.1 |

`pruneGrid'` |
Sudoku.hs:(115,1)-(118,64) | 5.0 | 8.6 |

Hurray! `pruneCellsByFixed.pruneCell`

has disappeared from the list of top bottlenecks. Though `exclusivePossibilities`

still remains here as expected.

`exclusivePossibilities`

is a big function. The profiler does not really tell us which parts of it are the slow ones. That’s because by default, the profiler only considers functions as *Cost Centres*. We need to give it hints for it to be able to find bottlenecks inside functions. For that, we need to insert *Cost Centre* annotations in the code:

```
exclusivePossibilities :: [Cell] -> [Data.Word.Word16]
exclusivePossibilities row =
row
& ({-# SCC "EP.zip" #-} zip [1..9])
& ({-# SCC "EP.filter" #-} filter (isPossible . snd))
& ({-# SCC "EP.foldl" #-} Data.List.foldl'
(\acc ~(i, Possible xs) ->
Data.List.foldl'
(\acc' n -> if Data.Bits.testBit xs n
then Map.insertWith prepend n [i] acc'
else acc')
acc
[1..9])
Map.empty)
& ({-# SCC "EP.Map.filter1" #-} Map.filter ((< 4) . length))
& ({-# SCC "EP.Map.foldl" #-}
Map.foldlWithKey'
(\acc x is -> Map.insertWith prepend is [x] acc)
Map.empty)
& ({-# SCC "EP.Map.filter2" #-}
Map.filterWithKey (\is xs -> length is == length xs))
& ({-# SCC "EP.Map.elems" #-} Map.elems)
& ({-# SCC "EP.map" #-}
map (Data.List.foldl' Data.Bits.setBit Data.Bits.zeroBits))
where
prepend ~[y] ys = y:ys
```

Here, `{-# SCC "EP.zip" #-}`

is a *Cost Centre* annotation. `"EP.zip"`

is the name we choose to give to this *Cost Centre*.

After profiling the code again, we get a different list of bottlenecks:

Cost Centre | Src | %time | %alloc |
---|---|---|---|

`exclusivePossibilities.\.\` |
Sudoku.hs:(64,23)-(66,31) | 19.5 | 31.4 |

`fixM.\` |
Sudoku.hs:15:27-65 | 13.1 | 0.1 |

`pruneCellsByFixed` |
Sudoku.hs:(85,1)-(90,36) | 5.4 | 6.8 |

`pruneGrid'` |
Sudoku.hs:(117,1)-(120,64) | 4.8 | 8.3 |

`EP.zip` |
Sudoku.hs:59:27-36 | 4.3 | 10.7 |

`EP.Map.filter1` |
Sudoku.hs:70:35-61 | 4.2 | 0.5 |

`chunksOf` |
Data/List/Split/Internals.hs:(514,1)-(517,49) | 4.1 | 7.4 |

`exclusivePossibilities.\` |
Sudoku.hs:71:64-96 | 4.0 | 3.4 |

`EP.filter` |
Sudoku.hs:60:30-54 | 2.9 | 3.4 |

`EP.foldl` |
Sudoku.hs:(61,29)-(69,15) | 2.8 | 1.8 |

`exclusivePossibilities` |
Sudoku.hs:(57,1)-(76,26) | 2.7 | 1.9 |

`chunksOf.splitter` |
Data/List/Split/Internals.hs:(516,3)-(517,49) | 2.5 | 2.7 |

So almost one-fifth of the time is actually going in this nested one-line anonymous function inside `exclusivePossibilities`

:

But we are going to ignore it for now.

If we look closely, we also find that around 17% of the run time now goes into list traversal and manipulation. This is in the functions `pruneCellsByFixed`

, `pruneGrid'`

, `chunksOf`

and `chunksOf.splitter`

, where the first two are majorly list traversal and transposition, and the last two are list splitting. Maybe it is time to get rid of lists altogether?

Vector is a Haskell library for working with arrays. It implements very performant operations for integer-indexed array data. Unlike the lists in Haskell which are implemented as singly linked lists, vectors are stored in a contiguous set of memory locations. This makes random access to the elements a constant time operation. The memory overhead per additional item in vectors is also much smaller. Lists allocate memory for each item in the heap and have pointers to the memory locations in nodes, leading to a lot of wasted memory in holding pointers. On the other hand, operations on lists are lazy, whereas, operations on vectors are strict, and this may need to useless computation depending on the use-case^{5}.

In our current code, we represent the grid as a list of lists of cells. All the pruning operations require us to traverse the grid list or the row lists. We also need to transform the grid back-and-forth for being able to use the same pruning operations for rows, columns and sub-grids. The pruning of cells and the choosing of pivot cells also requires us to replace cells in the grid with new ones, leading to a lot of list traversals.

To prevent all this linear-time list traversals, we can replace the nested list of lists with a single vector. Then all we need to do it to go over the right parts of this vector, looking up and replacing cells as needed. Since both lookups and updates on vectors are constant time, this should lead to a speedup.

Let’s start by changing the grid to a vector of cells.:

```
data Cell = Fixed Data.Word.Word16
| Possible Data.Word.Word16
deriving (Show, Eq)
type Grid = Data.Vector.Vector Cell
```

Since we plan to traverse different parts of the same vector, let’s define these different parts first:

```
type CellIxs = [Int]
fromXY :: (Int, Int) -> Int
fromXY (x, y) = x * 9 + y
allRowIxs, allColIxs, allSubGridIxs :: [CellIxs]
allRowIxs = [getRow i | i <- [0..8]]
where getRow n = [ fromXY (n, i) | i <- [0..8] ]
allColIxs = [getCol i | i <- [0..8]]
where getCol n = [ fromXY (i, n) | i <- [0..8] ]
allSubGridIxs = [getSubGrid i | i <- [0..8]]
where getSubGrid n = let (r, c) = (n `quot` 3, n `mod` 3)
in [ fromXY (3 * r + i, 3 * c + j) | i <- [0..2], j <- [0..2] ]
```

We define a type for cell indices as a list of integers. Then we create three lists of cell indices: all row indices, all column indices, and all sub-grid indices. Let’s check these out in the *REPL*:

```
*Main> Control.Monad.mapM_ print allRowIxs
[0,1,2,3,4,5,6,7,8]
[9,10,11,12,13,14,15,16,17]
[18,19,20,21,22,23,24,25,26]
[27,28,29,30,31,32,33,34,35]
[36,37,38,39,40,41,42,43,44]
[45,46,47,48,49,50,51,52,53]
[54,55,56,57,58,59,60,61,62]
[63,64,65,66,67,68,69,70,71]
[72,73,74,75,76,77,78,79,80]
*Main> Control.Monad.mapM_ print allColIxs
[0,9,18,27,36,45,54,63,72]
[1,10,19,28,37,46,55,64,73]
[2,11,20,29,38,47,56,65,74]
[3,12,21,30,39,48,57,66,75]
[4,13,22,31,40,49,58,67,76]
[5,14,23,32,41,50,59,68,77]
[6,15,24,33,42,51,60,69,78]
[7,16,25,34,43,52,61,70,79]
[8,17,26,35,44,53,62,71,80]
*Main> Control.Monad.mapM_ print allSubGridIxs
[0,1,2,9,10,11,18,19,20]
[3,4,5,12,13,14,21,22,23]
[6,7,8,15,16,17,24,25,26]
[27,28,29,36,37,38,45,46,47]
[30,31,32,39,40,41,48,49,50]
[33,34,35,42,43,44,51,52,53]
[54,55,56,63,64,65,72,73,74]
[57,58,59,66,67,68,75,76,77]
[60,61,62,69,70,71,78,79,80]
```

We can verify manually that these indices are correct.

Read and show functions are easy to change for vector:

```
readGrid :: String -> Maybe Grid
readGrid s
- | length s == 81 = traverse (traverse readCell) . Data.List.Split.chunksOf 9 $ s
+ | length s == 81 = Data.Vector.fromList <$> traverse readCell s
| otherwise = Nothing
where
allBitsSet = 1022
readCell '.' = Just $ Possible allBitsSet
readCell c
| Data.Char.isDigit c && c > '0' =
Just . Fixed . Data.Bits.bit . Data.Char.digitToInt $ c
| otherwise = Nothing
showGrid :: Grid -> String
-showGrid = unlines . map (unwords . map showCell)
+showGrid grid =
+ unlines . map (unwords . map (showCell . (grid !))) $ allRowIxs
where
showCell (Fixed x) = show . Data.Bits.countTrailingZeros $ x
showCell _ = "."
showGridWithPossibilities :: Grid -> String
-showGridWithPossibilities = unlines . map (unwords . map showCell)
+showGridWithPossibilities grid =
+ unlines . map (unwords . map (showCell . (grid !))) $ allRowIxs
where
showCell (Fixed x) = (show . Data.Bits.countTrailingZeros $ x) ++ " "
showCell (Possible xs) =
"[" ++
map (\i -> if Data.Bits.testBit xs i
then Data.Char.intToDigit i
else ' ')
[1..9]
++ "]"
```

`readGrid`

simply changes to work on a single vector of cells instead of a list of lists. Show functions have a pretty minor change to do lookups from a vector using the row indices and the `(!)`

function. The `(!)`

function is the vector indexing function which is similar to the `(!!)`

function, except it executes in constant time.

The pruning related functions are rewritten for working with vectors:

```
replaceCell :: Int -> Cell -> Grid -> Grid
replaceCell i c g = g Data.Vector.// [(i, c)]
pruneCellsByFixed :: Grid -> CellIxs -> Maybe Grid
pruneCellsByFixed grid cellIxs =
Control.Monad.foldM pruneCell grid . map (\i -> (i, grid ! i)) $ cellIxs
where
fixeds = setBits Data.Bits.zeroBits [x | Fixed x <- map (grid !) cellIxs]
pruneCell g (_, Fixed _) = Just g
pruneCell g (i, Possible xs)
| xs' == xs = Just g
| otherwise = flip (replaceCell i) g <$> makeCell xs'
where
xs' = xs Data.Bits..&. Data.Bits.complement fixeds
pruneCellsByExclusives :: Grid -> CellIxs -> Maybe Grid
pruneCellsByExclusives grid cellIxs = case exclusives of
[] -> Just grid
_ -> Control.Monad.foldM pruneCell grid . zip cellIxs $ cells
where
cells = map (grid !) cellIxs
exclusives = exclusivePossibilities cells
allExclusives = setBits Data.Bits.zeroBits exclusives
pruneCell g (_, Fixed _) = Just g
pruneCell g (i, Possible xs)
| intersection == xs = Just g
| intersection `elem` exclusives =
flip (replaceCell i) g <$> makeCell intersection
| otherwise = Just g
where
intersection = xs Data.Bits..&. allExclusives
pruneCells :: Grid -> CellIxs -> Maybe Grid
pruneCells grid cellIxs =
fixM (flip pruneCellsByFixed cellIxs) grid
>>= fixM (flip pruneCellsByExclusives cellIxs)
```

All the three functions now take the grid and the cell indices instead of a list of cells, and use the cell indices to lookup the cells from the grid. Also, instead of using the `traverse`

function as earlier, now we use the `Control.Monad.foldM`

function to fold over the cell-index-and-cell tuples in the context of the `Maybe`

monad, making changes to the grid directly.

We use the `replaceCell`

function to replace cells at an index in the grid. It is a simple wrapper over the vector update function `Data.Vector.//`

. Rest of the code is same in essence, except a few changes to accommodate the changed function parameters.

`pruneGrid'`

function does not need to do transpositions and back-transpositions anymore as now we use the cell indices to go over the right parts of the grid vector directly:

```
pruneGrid' :: Grid -> Maybe Grid
pruneGrid' grid =
Control.Monad.foldM pruneCells grid allRowIxs
>>= flip (Control.Monad.foldM pruneCells) allColIxs
>>= flip (Control.Monad.foldM pruneCells) allSubGridIxs
```

Notice that the `traverse`

function here is also replaced by the `Control.Monad.foldM`

function.

Similarly, the grid predicate functions change a little to go over a vector instead of a list of lists:

```
isGridFilled :: Grid -> Bool
-isGridFilled grid = null [ () | Possible _ <- concat grid ]
+isGridFilled = not . Data.Vector.any isPossible
isGridInvalid :: Grid -> Bool
isGridInvalid grid =
- any isInvalidRow grid
- || any isInvalidRow (Data.List.transpose grid)
- || any isInvalidRow (subGridsToRows grid)
+ any isInvalidRow (map (map (grid !)) allRowIxs)
+ || any isInvalidRow (map (map (grid !)) allColIxs)
+ || any isInvalidRow (map (map (grid !)) allSubGridIxs)
```

And finally, we change the `nextGrids`

function to replace the list related operations with the vector related ones:

```
nextGrids :: Grid -> (Grid, Grid)
nextGrids grid =
let (i, first@(Fixed _), rest) =
fixCell
- . Data.List.minimumBy
+ . Data.Vector.minimumBy
(compare `Data.Function.on` (possibilityCount . snd))
- . filter (isPossible . snd)
- . zip [0..]
- . concat
+ . Data.Vector.imapMaybe
+ (\j cell -> if isPossible cell then Just (j, cell) else Nothing)
$ grid
- in (replace2D i first grid, replace2D i rest grid)
+ in (replaceCell i first grid, replaceCell i rest grid)
```

We also switch the `replace2D`

function which went over the entire list of lists of cells to replace a cell, with the vector-based `replaceCell`

function.

All the required changes are done. Let’s do a run:

```
$ stack build
$ cat sudoku17.txt | time stack exec sudoku > /dev/null
88.53 real 88.16 user 0.41 sys
```

Oops! Instead of getting a speedup, our vector-based code is actually 1.3x slower than the list-based code. How did this happen? Time to bust out the profiler again!

`(==)`

Profiling the current code gives us the following hotspots:

Cost Centre | Src | %time | %alloc |
---|---|---|---|

`>>=` |
Data/Vector/Fusion/Util.hs:36:3-18 | 52.2 | 51.0 |

`basicUnsafeIndexM` |
Data/Vector.hs:278:3-62 | 22.2 | 20.4 |

`exclusivePossibilities` |
Sudoku.hs:(75,1)-(93,26) | 6.8 | 8.3 |

`exclusivePossibilities.\.\` |
Sudoku.hs:83:23-96 | 3.8 | 8.8 |

`pruneCellsByFixed.fixeds` |
Sudoku.hs:105:5-77 | 2.0 | 1.7 |

We see a sudden appearance of `(>>=)`

from the `Data.Vector.Fusion.Util`

module at the top of the list, taking more than half of the run time. For more clues, we dive into the detailed profiler report and find this bit:

Cost Centre | Src | %time | %alloc |
---|---|---|---|

`pruneGrid` |
Sudoku.hs:143:1-27 | 0.0 | 0.0 |

`fixM` |
Sudoku.hs:16:1-65 | 0.1 | 0.0 |

`fixM.\` |
Sudoku.hs:16:27-65 | 0.2 | 0.1 |

`==` |
Data/Vector.hs:287:3-50 | 1.0 | 1.4 |

`>>=` |
Data/Vector/Fusion/Util.hs:36:3-18 | 51.9 | 50.7 |

`basicUnsafeIndexM` |
Data/Vector.hs:278:3-62 | 19.3 | 20.3 |

Here, the indentation indicated nesting of operations. We see that both the `(>>=)`

and `basicUnsafeIndexM`

functions — which together take around three-quarter of the run time — are being called from the `(==)`

function in the `fixM`

function^{6}. It seems like we are checking for equality too many times. Here’s the usage of the `fixM`

for reference:

```
pruneCells :: Grid -> CellIxs -> Maybe Grid
pruneCells grid cellIxs =
fixM (flip pruneCellsByFixed cellIxs) grid
>>= fixM (flip pruneCellsByExclusives cellIxs)
pruneGrid :: Grid -> Maybe Grid
pruneGrid = fixM pruneGrid'
```

In `pruneGrid`

, we run `pruneGrid'`

till the resultant grid settles, that is, the grid computed in a particular iteration is **equal to** the grid in the previous iteration. Interestingly, we do the same thing in `pruneCells`

too. We equate **the whole grid** to check for settling of each block of cells. This is the reason of the slowdown.

Why did we add `fixM`

in the `pruneCells`

function at all? Quoting from the previous post,

We need to run

`pruneCellsByFixed`

and`pruneCellsByExclusives`

repeatedly using`fixM`

because an unsettled row can lead to wrong solutions.Imagine a row which just got a

`9`

fixed because of`pruneCellsByFixed`

. If we don’t run the function again, the row may be left with one non-fixed cell with a`9`

. When we run this row through`pruneCellsByExclusives`

, it’ll consider the`9`

in the non-fixed cell as aSingleand fix it. This will lead to two`9`

s in the same row, causing the solution to fail.

So the reason we added `fixM`

is that, we run the two pruning strategies one-after-another. That way, they see the cells in the same block in different states. If we were to merge the two pruning functions into a single one such that they work in lockstep, we would not need to run `fixM`

at all!

With this idea, we rewrite `pruneCells`

as a single function:

```
pruneCells :: Grid -> CellIxs -> Maybe Grid
pruneCells grid cellIxs = Control.Monad.foldM pruneCell grid cellIxs
where
cells = map (grid !) cellIxs
exclusives = exclusivePossibilities cells
allExclusives = setBits Data.Bits.zeroBits exclusives
fixeds = setBits Data.Bits.zeroBits [x | Fixed x <- cells]
pruneCell g i =
pruneCellByFixed g (i, g ! i) >>= \g' -> pruneCellByExclusives g' (i, g' ! i)
pruneCellByFixed g (_, Fixed _) = Just g
pruneCellByFixed g (i, Possible xs)
| xs' == xs = Just g
| otherwise = flip (replaceCell i) g <$> makeCell xs'
where
xs' = xs Data.Bits..&. Data.Bits.complement fixeds
pruneCellByExclusives g (_, Fixed _) = Just g
pruneCellByExclusives g (i, Possible xs)
| null exclusives = Just g
| intersection == xs = Just g
| intersection `elem` exclusives =
flip (replaceCell i) g <$> makeCell intersection
| otherwise = Just g
where
intersection = xs Data.Bits..&. allExclusives
```

We have merged the two pruning functions almost blindly. The important part here is the nested `pruneCell`

function which uses monadic bind `(>>=)`

to ensure that cells fixed in the first step are seen by the next step. Merging the two functions ensures that both strategies will see same *Exclusives* and *Fixeds*, thereby running in lockstep.

Let’s try it out:

```
$ stack build
$ cat sudoku17.txt | time stack exec sudoku > /dev/null
57.67 real 57.12 user 0.46 sys
```

Ah, now it’s faster than the list-based implementation by 1.2x^{7}. Let’s see what the profiler says:

Cost Centre | Src | %time | %alloc |
---|---|---|---|

`exclusivePossibilities.\.\` |
Sudoku.hs:82:23-96 | 15.7 | 33.3 |

`pruneCells` |
Sudoku.hs:(101,1)-(126,53) | 9.6 | 6.8 |

`pruneCells.pruneCell` |
Sudoku.hs:(108,5)-(109,83) | 9.5 | 2.1 |

`basicUnsafeIndexM` |
Data/Vector.hs:278:3-62 | 9.4 | 0.5 |

`pruneCells.pruneCell.\` |
Sudoku.hs:109:48-83 | 7.6 | 2.1 |

`pruneCells.cells` |
Sudoku.hs:103:5-40 | 7.1 | 10.9 |

`exclusivePossibilities.\` |
Sudoku.hs:87:64-96 | 3.5 | 3.8 |

`EP.Map.filter1` |
Sudoku.hs:86:35-61 | 3.0 | 0.6 |

`>>=` |
Data/Vector/Fusion/Util.hs:36:3-18 | 2.8 | 2.0 |

`replaceCell` |
Sudoku.hs:59:1-45 | 2.5 | 1.1 |

`EP.filter` |
Sudoku.hs:78:30-54 | 2.4 | 3.3 |

`primitive` |
Control/Monad/Primitive.hs:195:3-16 | 2.3 | 6.5 |

The double nested anonymous function mentioned before is still the biggest culprit but `fixM`

has disappeared from the list. Let’s tackle `exclusivePossibilities`

now.

Here’s `exclusivePossibilities`

again for reference:

```
exclusivePossibilities :: [Cell] -> [Data.Word.Word16]
exclusivePossibilities row =
row
& zip [1..9]
& filter (isPossible . snd)
& Data.List.foldl'
(\acc ~(i, Possible xs) ->
Data.List.foldl'
(\acc' n -> if Data.Bits.testBit xs n
then Map.insertWith prepend n [i] acc'
else acc')
acc
[1..9])
Map.empty
& Map.filter ((< 4) . length)
& Map.foldlWithKey'(\acc x is -> Map.insertWith prepend is [x] acc) Map.empty
& Map.filterWithKey (\is xs -> length is == length xs)
& Map.elems
& map (Data.List.foldl' Data.Bits.setBit Data.Bits.zeroBits)
where
prepend ~[y] ys = y:ys
```

Let’s zoom into lines 6–14. Here, we do a fold with a nested fold over the non-fixed cells of the given block to accumulate the mapping from the digits to the indices of the cells they occur in. We use a `Data.Map.Strict`

map as the accumulator. If a digit is not present in the map as a key then we add a singleton list containing the corresponding cell index as the value. If the digit is already present in the map then we prepend the cell index to the list of indices for the digit. So we end up “mutating” the map repeatedly.

Of course, it’s not actual mutation because the map data structure we are using is immutable. Each change to the map instance creates a new copy with the addition, which we thread through the fold operation, and we get the final copy at the end. This may be the reason of the slowness in this section of the code.

What if, instead of using an immutable data structure for this, we used a mutable one? But how can we do that when we know that Haskell is a pure language? Purity means that all code must be referentially transparent, and mutability certainly isn’t. It turns out, there is an escape hatch to mutability in Haskell. Quoting the relevant section from the book Real World Haskell:

Haskell provides a special monad, named

`ST`

, which lets us work safely with mutable state. Compared to the`State`

monad, it has some powerful added capabilities.

- We can
thawan immutable array to give a mutable array; modify the mutable array in place; and freeze a new immutable array when we are done.- We have the ability to use
mutable references. This lets us implement data structures that we can modify after construction, as in an imperative language. This ability is vital for some imperative data structures and algorithms, for which similarly efficient purely functional alternatives have not yet been discovered.

So if we use a mutable map in the `ST`

monad, we may be able to get rid of this bottleneck. But, we can actually do better! Since the keys of our map are digits `1`

–`9`

, we can use a mutable vector to store the indices. In fact, we can go one step even further and store the indices as a BitSet as `Word16`

because they also range from 1 to 9, and are unique for a block. This lets us use an unboxed mutable vector. What is *unboxing* you ask? Quoting from the GHC docs:

Most types in GHC are boxed, which means that values of that type are represented by a pointer to a heap object. The representation of a Haskell

`Int`

, for example, is a two-word heap object. An unboxed type, however, is represented by the value itself, no pointers or heap allocation are involved.

When combined with vector, unboxing of values means the whole vector is stored as single byte array, avoiding pointer redirections completely. This is more memory efficient and allows better usage of caches^{8}. Let’s rewrite `exclusivePossibilities`

using `ST`

and unboxed mutable vectors.

First we write the core of this operation, the function `cellIndicesList`

which take a list of cells and returns the digit to cell indices mapping. The mapping is returned as a list. The zeroth value in this list is the indices of the cells which have `1`

as a possible digit, and so on. The indices themselves are packed as BitSets. If the bit 1 is set then the first cell has a particular digit. Let’s say it returns `[0,688,54,134,0,654,652,526,670]`

. In 10-bit binary it is:

`[0000000000, 1010110000, 0000110110, 0010000110, 0000000000, 1010001110, 1010001100, 1000001110, 1010011110]`

We can arrange it in a table for further clarity:

Digits | Cell 9 | Cell 8 | Cell 7 | Cell 6 | Cell 5 | Cell 4 | Cell 3 | Cell 2 | Cell 1 |
---|---|---|---|---|---|---|---|---|---|

1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

2 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 |

3 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 |

4 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |

5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

6 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 |

7 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 |

8 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |

9 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 |

If the value of the intersection of a particular digit and a particular cell index in the table is set to 1, then the digit is a possibility in the cell, else it is not. Here’s the code:

```
cellIndicesList :: [Cell] -> [Data.Word.Word16]
cellIndicesList cells =
Data.Vector.Unboxed.toList $ Control.Monad.ST.runST $ do
vec <- Data.Vector.Unboxed.Mutable.replicate 9 Data.Bits.zeroBits
ref <- Data.STRef.newSTRef (1 :: Int)
Control.Monad.forM_ cells $ \cell -> do
i <- Data.STRef.readSTRef ref
case cell of
Fixed _ -> return ()
Possible xs -> Control.Monad.forM_ [0..8] $ \d ->
Control.Monad.when (Data.Bits.testBit xs (d+1)) $
Data.Vector.Unboxed.Mutable.unsafeModify vec (`Data.Bits.setBit` i) d
Data.STRef.writeSTRef ref (i+1)
Data.Vector.Unboxed.unsafeFreeze vec
```

The whole mutable code runs inside the `runST`

function. `runST`

take an operation in `ST`

monad and executes it, making sure that the mutable references created inside it cannot escape the scope of `runST`

. This is done using a type-system trickery called Rank-2 types.

Inside the `ST`

operation, we start with creating a mutable vector of `Word16`

s of size 9 with all its values initially set to zero. We also initialize a mutable reference to keep track of the cell index we are on. Then we run two nested for loops, going over each cell and each digit `1`

–`9`

, setting the right bit of the right index of the mutable vector. During this, we mutate the vector directly using the `Data.Vector.Unboxed.Mutable.unsafeModify`

function. At the end of the `ST`

operation, we freeze the mutable vector to return an immutable version of it. Outside `runST`

, we convert the immutable vector to a list. Notice how this code is quite similar to how we’d write it in imperative programming languages like C or Java^{9}.

It is easy to use this function now to rewrite `exclusivePossibilities`

:

```
exclusivePossibilities :: [Cell] -> [Data.Word.Word16]
exclusivePossibilities row =
row
- & zip [1..9]
- & filter (isPossible . snd)
- & Data.List.foldl'
- (\acc ~(i, Possible xs) ->
- Data.List.foldl'
- (\acc' n -> if Data.Bits.testBit xs n
- then Map.insertWith prepend n [i] acc'
- else acc')
- acc
- [1..9])
- Map.empty
+ & cellIndicesList
+ & zip [1..9]
- & Map.filter ((< 4) . length)
- & Map.foldlWithKey' (\acc x is -> Map.insertWith prepend is [x] acc) Map.empty
- & Map.filterWithKey (\is xs -> length is == length xs)
+ & filter (\(_, is) -> let p = Data.Bits.popCount is in p > 0 && p < 4)
+ & Data.List.foldl' (\acc (x, is) -> Map.insertWith prepend is [x] acc) Map.empty
+ & Map.filterWithKey (\is xs -> Data.Bits.popCount is == length xs)
& Map.elems
& map (Data.List.foldl' Data.Bits.setBit Data.Bits.zeroBits)
where
prepend ~[y] ys = y:ys
```

We replace the nested two-fold operation with `cellIndicesList`

. Then we replace some map related function with the corresponding list ones because `cellIndicesList`

returns a list. We also replace the `length`

function call on cell indices with `Data.Bits.popCount`

function call as the indices are represented as `Word16`

now.

That is it. Let’s build and run it now:

```
$ stack build
$ cat sudoku17.txt | time stack exec sudoku > /dev/null
35.04 real 34.84 user 0.24 sys
```

That’s a 1.6x speedup over the map-and-fold based version. Let’s check what the profiler has to say:

Cost Centre | Src | %time | %alloc |
---|---|---|---|

`cellIndicesList.\.\` |
Sudoku.hs:(88,11)-(89,81) | 10.7 | 6.0 |

`primitive` |
Control/Monad/Primitive.hs:195:3-16 | 7.9 | 6.9 |

`pruneCells` |
Sudoku.hs:(113,1)-(138,53) | 7.5 | 6.4 |

`cellIndicesList` |
Sudoku.hs:(79,1)-(91,40) | 7.4 | 10.1 |

`basicUnsafeIndexM` |
Data/Vector.hs:278:3-62 | 7.3 | 0.5 |

`pruneCells.pruneCell` |
Sudoku.hs:(120,5)-(121,83) | 6.8 | 2.0 |

`exclusivePossibilities` |
Sudoku.hs:(94,1)-(104,26) | 6.5 | 9.7 |

`pruneCells.pruneCell.\` |
Sudoku.hs:121:48-83 | 6.1 | 2.0 |

`cellIndicesList.\` |
Sudoku.hs:(83,42)-(90,37) | 5.5 | 3.5 |

`pruneCells.cells` |
Sudoku.hs:115:5-40 | 5.0 | 10.4 |

The run time is spread quite evenly over all the functions now and there are no hotspots anymore. We stop optimizating at this point^{10}. Let’s see how far we have come up.

Below is a table showing the speedups we got with each new implementation:

Implementation | Run Time (s) | Incremental Speedup | Cumulative Speedup |
---|---|---|---|

Simple | 47450 | 1x | 1x |

Exclusive Pruning | 258.97 | 183.23x | 183x |

BitSet | 69.44 | 3.73x | 683x |

Vector | 57.67 | 1.20x | 823x |

Mutable Vector | 35.04 | 1.65x | 1354x |

The first improvement over the simple solution got us the most major speedup of 183x. After that, we followed the profiler, fixing bottlenecks by using the right data structures. We got quite significant speedup over the naive list-based solution, leading to drop in the run time from 259 seconds to 35 seconds. In total, we have done more than a thousand times improvement in the run time since the first solution!

In this post, we improved upon our list-based Sudoku solution from the last time. We profiled the code at each step, found the bottlenecks and fixed them by choosing the right data structure for the case. We ended up using BitSets and Vectors — both immutable and mutable varieties — for the different parts of the code. Finally, we sped up our program by 7.4 times. Can we go even faster? How about using all those other CPU cores which have been lying idle? Come back for the next post in this series where we’ll explore the parallel programming facilities in Haskell. The code till now is available here. Discuss this post on r/haskell or leave a comment.

All the runs were done on my MacBook Pro from 2014 with 2.2 GHz Intel Core i7 CPU and 16 GB memory.↩

A lot of the code in this post references the code from the previous posts, including showing diffs. So, please read the previous posts if you have not already done so.↩

Notice the British English spelling of the word “Centre”. GHC was originally developed in University of Glasgow in Scotland.↩

The code for the BitSet based implementation can be found here.↩

This article on School of Haskell goes into details about performance of vectors vs. lists. There are also these benchmarks for sequence data structures in Haskell: lists, vectors, seqs, etc.↩

We see Haskell’s laziness at work here. In the code for the

`fixM`

function, the`(==)`

function is nested inside the`(>>=)`

function, but because of laziness, they are actually evaluated in the reverse order. The evaluation of parameters for the`(==)`

function causes the`(>>=)`

function to be evaluated.↩The code for the vector based implementation can be found here.↩

Unboxed vectors have some restrictions on the kind of values that can be put into them but

`Word16`

already follows those restrictions so we are good.↩Haskell can be a pretty good imperative programming language using the

`ST`

monad. This article shows how to implement some algorithms which require mutable data structures in Haskell.↩The code for the mutable vector based implementation can be found here.↩

If you liked this post, please leave a comment.

]]>This is the second post in a series of posts:

- Fast Sudoku Solver in Haskell #1: A Simple Solution
- Fast Sudoku Solver in Haskell #2: A 200x Faster Solution
- Fast Sudoku Solver in Haskell #3: Picking the Right Data Structures

Discuss this post on r/haskell.

Sudoku is a number placement puzzle. It consists of a 9x9 grid which is to be filled with digits from 1 to 9 such that each row, each column and each of the nine 3x3 sub-grids contain all the digits. Some of the cells of the grid come pre-filled and the player has to fill the rest.

In the previous post, we implemented a simple Sudoku solver without paying much attention to its performance characteristics. We ran^{1} some of 17-clue puzzles^{2} through our program to see how fast it was:

```
$ head -n100 sudoku17.txt | time stack exec sudoku
... output omitted ...
116.70 real 198.09 user 94.46 sys
```

So, it took about 117 seconds to solve one hundred puzzles. At this speed, it would take about 16 hours to solve all the 49151 puzzles contained in the file. This is way too slow. We need to find ways to make it faster. Let’s go back to the drawing board.

In a Sudoku puzzle, we have a partially filled 9x9 grid which we have to fill completely while following the constraints of the game.

```
+-------+-------+-------+
| . . . | . . . | . 1 . |
| 4 . . | . . . | . . . |
| . 2 . | . . . | . . . |
+-------+-------+-------+
| . . . | . 5 . | 4 . 7 |
| . . 8 | . . . | 3 . . |
| . . 1 | . 9 . | . . . |
+-------+-------+-------+
| 3 . . | 4 . . | 2 . . |
| . 5 . | 1 . . | . . . |
| . . . | 8 . 6 | . . . |
+-------+-------+-------+
A sample puzzle
+-------+-------+-------+
| 6 9 3 | 7 8 4 | 5 1 2 |
| 4 8 7 | 5 1 2 | 9 3 6 |
| 1 2 5 | 9 6 3 | 8 7 4 |
+-------+-------+-------+
| 9 3 2 | 6 5 1 | 4 8 7 |
| 5 6 8 | 2 4 7 | 3 9 1 |
| 7 4 1 | 3 9 8 | 6 2 5 |
+-------+-------+-------+
| 3 1 9 | 4 7 5 | 2 6 8 |
| 8 5 6 | 1 2 9 | 7 4 3 |
| 2 7 4 | 8 3 6 | 1 5 9 |
+-------+-------+-------+
and its solution
```

Earlier, we followed a simple pruning algorithm which removed all the solved (or *fixed*) digits from neighbours of the fixed cells. We repeated the pruning till the fixed and non-fixed values in the grid stopped changing (or the grid *settled*). Here’s an example of a grid before pruning:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] | [123456789] 1 [123456789] |
| 4 [123456789] [123456789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123456789] 2 [123456789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123456789] [123456789] [123456789] | [123456789] 5 [123456789] | 4 [123456789] 7 |
| [123456789] [123456789] 8 | [123456789] [123456789] [123456789] | 3 [123456789] [123456789] |
| [123456789] [123456789] 1 | [123456789] 9 [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3 [123456789] [123456789] | 4 [123456789] [123456789] | 2 [123456789] [123456789] |
| [123456789] 5 [123456789] | 1 [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123456789] [123456789] [123456789] | 8 [123456789] 6 | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

And here’s the same grid when it settles after repeated pruning:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 56789] [ 3 6789] [ 3 567 9] | [ 23 567 9] [ 234 6 8 ] [ 2345 789] | [ 56789] 1 [ 23456 89] |
| 4 [1 3 6789] [ 3 567 9] | [ 23 567 9] [123 6 8 ] [123 5 789] | [ 56789] [ 23 56789] [ 23 56 89] |
| [1 56789] 2 [ 3 567 9] | [ 3 567 9] [1 34 6 8 ] [1 345 789] | [ 56789] [ 3456789] [ 3456 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 2 6 9] [ 3 6 9] [ 23 6 9] | [ 23 6 ] 5 [123 8 ] | 4 [ 2 6 89] 7 |
| [ 2 567 9] [ 4 67 9] 8 | [ 2 67 ] [12 4 6 ] [12 4 7 ] | 3 [ 2 56 9] [12 56 9] |
| [ 2 567 ] [ 34 67 ] 1 | [ 23 67 ] 9 [ 234 78 ] | [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3 [1 6 89] [ 6 9] | 4 7 [ 5 9] | 2 [ 56 89] [1 56 89] |
| [ 2 6789] 5 [ 2 4 67 9] | 1 [ 23 ] [ 23 9] | [ 6789] [ 34 6789] [ 34 6 89] |
| [12 7 9] [1 4 7 9] [ 2 4 7 9] | 8 [ 23 ] 6 | [1 5 7 9] [ 345 7 9] [1 345 9] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

We see how the possibilities conflicting with the fixed values are removed. We also see how some of the non-fixed cells turn into fixed ones as all their other possible values get eliminated.

This simple strategy follows directly from the constraints of Sudoku. But, are there more complex strategies which are implied indirectly?

Let’s have a look at this sample row captured from a solution in progress:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| 4 [ 2 6 89] 7 | 3 [ 2 56 9] [12 56 9] | [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

Notice how the sixth cell is the only one with `1`

as a possibility in it. It is obvious that we should fix the sixth cell to `1`

as we cannot place `1`

in any other cell in the row. Let’s call this the *Singles*^{3} scenario.

But, our current solution will not fix the sixth cell to `1`

till one of these cases arise:

- all other possibilities of the cell are pruned away, or,
- the cell is chosen as pivot in the
`nextGrids`

function and`1`

is chosen as the value to fix.

This may take very long and lead to a longer solution time. Let’s assume that we recognize the Singles scenario while pruning cells and fix the cell to `1`

right then. That would cut down the search tree by a lot and make the solution much faster.

It turns out, we can generalize this pattern. Let’s check out this sample row from middle of a solution:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [1 4 9] 3 [1 4567 9] | [1 4 89] [1 4 6 89] [1 4 6 89] | [1 4 89] 2 [1 456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

It is a bit difficult to notice with the naked eye but there’s something special here too. The digits `5`

and `7`

occur only in the third and the ninth cells. Though they are accompanied by other digits in those cells, they are not present in any other cells. This means, we can place `5`

and `7`

either in the third or the ninth cell and no other cells. This implies that we can prune the third and ninth cells to have only `5`

and `7`

like this:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [1 4 9] 3 [ 5 7 ] | [1 4 89] [1 4 6 89] [1 4 6 89] | [1 4 89] 2 [ 5 7 ] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

This is the *Twins* scenario. As we can imagine, this pattern extends to groups of three digits and beyond. When three digits can be found only in three cells in a block, it is the *Triplets* scenario, as in the example below:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 45 7 ] [ 45 7 ] [ 5 7 ] | 2 [ 3 5 89] 6 | 1 [ 34 89] [ 34 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

In this case, the triplet digits are `3`

, `8`

and `9`

. And as before, we can prune the block by fixing these digits in their cells:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 45 7 ] [ 45 7 ] [ 5 7 ] | 2 [ 3 89] 6 | 1 [ 3 89] [ 3 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

Let’s call these three scenarios *Exclusives* in general.

We can extend this to *Quadruplets* scenario and further. But such scenarios occur rarely in a 9x9 Sudoku puzzle. Trying to find them may end up being more computationally expensive than the benefit we may get in solution time speedup by finding them.

Now that we have discovered these new strategies to prune cells, let’s implement them in Haskell.

We can implement the three new strategies to prune cells as one function for each. However, we can actually implement all these strategies in a single function. But, this function is a bit more complex than the previous pruning function. So first, let’s try to understand its working using tables. Let’s take this sample row:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 4 6 9] 1 5 | [ 6 9] 7 [ 23 6 89] | [ 6 9] [ 23 6 89] [ 23 6 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

First, we make a table mapping the digits to the cells in which they occur, excluding the fixed cells:

Digit | Cells |
---|---|

2 | 6, 8, 9 |

3 | 6, 8, 9 |

4 | 1 |

6 | 1, 4, 6, 7, 8, 9 |

8 | 6, 8, 9 |

9 | 1, 4, 6, 7, 8, 9 |

Then, we flip this table and collect all the digits that occur in the same set of cells:

Cells | Digits |
---|---|

1 | 4 |

6, 8, 9 | 2, 3, 8 |

1, 4, 6, 7, 8, 9 | 6, 9 |

And finally, we remove the rows of the table in which the count of the cells is not the same as the count of the digits:

Cells | Digits |
---|---|

1 | 4 |

6, 8, 9 | 2, 3, 8 |

Voilà! We have found a Single `4`

and a set of Triplets `2`

, `3`

and `8`

. You can go over the puzzle row and verify that this indeed is the case.

Translating this logic to Haskell is quite easy now:

```
isPossible :: Cell -> Bool
isPossible (Possible _) = True
isPossible _ = False
exclusivePossibilities :: [Cell] -> [[Int]]
exclusivePossibilities row =
-- input
row
-- [Possible [4,6,9], Fixed 1, Fixed 5, Possible [6,9], Fixed 7, Possible [2,3,6,8,9],
-- Possible [6,9], Possible [2,3,6,8,9], Possible [2,3,6,8,9]]
-- step 1
& zip [1..9]
-- [(1,Possible [4,6,9]),(2,Fixed 1),(3,Fixed 5),(4,Possible [6,9]),(5,Fixed 7),
-- (6,Possible [2,3,6,8,9]),(7,Possible [6,9]),(8,Possible [2,3,6,8,9]),
-- (9,Possible [2,3,6,8,9])]
-- step 2
& filter (isPossible . snd)
-- [(1,Possible [4,6,9]),(4,Possible [6,9]),(6,Possible [2,3,6,8,9]),
-- (7,Possible [6,9]), (8,Possible [2,3,6,8,9]),(9,Possible [2,3,6,8,9])]
-- step 3
& Data.List.foldl'
(\acc ~(i, Possible xs) ->
Data.List.foldl' (\acc' x -> Map.insertWith prepend x [i] acc') acc xs)
Map.empty
-- fromList [(2,[9,8,6]),(3,[9,8,6]),(4,[1]),(6,[9,8,7,6,4,1]),(8,[9,8,6]),
-- (9,[9,8,7,6,4,1])]
-- step 4
& Map.filter ((< 4) . length)
-- fromList [(2,[9,8,6]),(3,[9,8,6]),(4,[1]),(8,[9,8,6])]
-- step 5
& Map.foldlWithKey'(\acc x is -> Map.insertWith prepend is [x] acc) Map.empty
-- fromList [([1],[4]),([9,8,6],[8,3,2])]
-- step 6
& Map.filterWithKey (\is xs -> length is == length xs)
-- fromList [([1],[4]),([9,8,6],[8,3,2])]
-- step 7
& Map.elems
-- [[4],[8,3,2]]
where
prepend ~[y] ys = y:ys
```

We extract the `isPossible`

function to the top level from the `nextGrids`

function for reuse. Then we write the `exclusivePossibilities`

function which finds the Exclusives in the input row. This function is written using the reverse application operator `(&)`

^{4} instead of the usual `($)`

operator so that we can read it from top to bottom. We also show the intermediate values for a sample input after every step in the function chain.

The nub of the function lies in step 3 (pun intended). We do a nested fold over all the non-fixed cells and all the possible digits in them to compute the map^{5} which represents the first table. Thereafter, we filter the map to keep only the entries with length less than four (step 4). Then we flip it to create a new map which represents the second table (step 5). Finally, we filter the flipped map for the entries where the cell count is same as the digit count (step 6) to arrive at the final table. The step 7 just gets the values in the map which is the list of all the Exclusives in the input row.

To start with, we extract some reusable code from the previous `pruneCells`

function and rename it to `pruneCellsByFixed`

:

```
makeCell :: [Int] -> Maybe Cell
makeCell ys = case ys of
[] -> Nothing
[y] -> Just $ Fixed y
_ -> Just $ Possible ys
pruneCellsByFixed :: [Cell] -> Maybe [Cell]
pruneCellsByFixed cells = traverse pruneCell cells
where
fixeds = [x | Fixed x <- cells]
pruneCell (Possible xs) = makeCell (xs Data.List.\\ fixeds)
pruneCell x = Just x
```

Now we write the `pruneCellsByExclusives`

function which uses the `exclusivePossibilities`

function to prune the cells:

```
pruneCellsByExclusives :: [Cell] -> Maybe [Cell]
pruneCellsByExclusives cells = case exclusives of
[] -> Just cells
_ -> traverse pruneCell cells
where
exclusives = exclusivePossibilities cells
allExclusives = concat exclusives
pruneCell cell@(Fixed _) = Just cell
pruneCell cell@(Possible xs)
| intersection `elem` exclusives = makeCell intersection
| otherwise = Just cell
where
intersection = xs `Data.List.intersect` allExclusives
```

`pruneCellsByExclusives`

works exactly as shown in the examples above. We first find the list of Exclusives in the given cells. If there are no Exclusives, there’s nothing to do and we just return the cells. If we find any Exclusives, we `traverse`

the cells, pruning each cell to only the intersection of the possible digits in the cell and Exclusive digits. That’s it! We reuse the `makeCell`

function to create a new cell with the intersection.

As the final step, we rewrite the `pruneCells`

function by combining both the functions.

```
fixM :: (Eq t, Monad m) => (t -> m t) -> t -> m t
fixM f x = f x >>= \x' -> if x' == x then return x else fixM f x'
pruneCells :: [Cell] -> Maybe [Cell]
pruneCells cells = fixM pruneCellsByFixed cells >>= fixM pruneCellsByExclusives
```

We have extracted `fixM`

as a top level function from the `pruneGrid`

function. Just like the `pruneGrid'`

function, we need to use monadic bind (`>>=`

) to chain the two pruning steps. We also use `fixM`

to apply each step repeatedly till the pruned cells settle^{6}.

No further code changes are required. It is time to check out the improvements.

Let’s build the program and run the exact same number of puzzles as before:

```
$ head -n100 sudoku17.txt | time stack exec sudoku
... output omitted ...
0.53 real 0.58 user 0.23 sys
```

Woah! It is way faster than before. Let’s solve all the puzzles now:

```
$ cat sudoku17.txt | time stack exec sudoku > /dev/null
282.98 real 407.25 user 109.27 sys
```

So it is took about 283 seconds to solve all the 49151 puzzles. The speedup is about 200x^{7}. That’s about 5.8 milliseconds per puzzle.

Let’s do a quick profiling to see where the time is going:

```
$ stack build --profile
$ head -n1000 sudoku17.txt | stack exec -- sudoku +RTS -p > /dev/null
```

This generates a file named `sudoku.prof`

with the profiling results. Here are the top five most time-taking functions (cleaned for brevity):

Cost Center | Source | %time | %alloc |
---|---|---|---|

`exclusivePossibilities` |
(49,1)-(62,26) | 17.6 | 11.4 |

`pruneCellsByFixed.pruneCell` |
(75,5)-(76,36) | 16.9 | 30.8 |

`exclusivePossibilities.\.\` |
55:38-70 | 12.2 | 20.3 |

`fixM.\` |
13:27-65 | 10.0 | 0.0 |

`==` |
15:56-57 | 7.2 | 0.0 |

Looking at the report, my guess is that a lot of time is going into list operations. Lists are known to be inefficient in Haskell so maybe we should switch to some other data structures?

As per the comment below by Chris Casinghino, I ran both the versions of code without the `-threaded`

, `-rtsopts`

and `-with-rtsopts=-N`

options. The time for previous post’s code:

```
$ head -n100 sudoku17.txt | time stack exec sudoku
... output omitted ...
96.54 real 95.90 user 0.66 sys
```

And the time for this post’s code:

```
$ cat sudoku17.txt | time stack exec sudoku > /dev/null
258.97 real 257.34 user 1.52 sys
```

So, both the versions run about 10% faster without the threading options. I suspect this has something to do with GHC’s parallel GC as described in this post. So for now, I’ll keep threading disabled.

In this post, we improved upon our simple Sudoku solution from the last time. We discovered and implemented a new strategy to prune cells, and we achieved a 200x speedup. But profiling shows that we still have many possibilities for improvements. We’ll work on that and more in the upcoming posts in this series. The code till now is available here. Discuss this post on r/haskell or leave a comment.

All the runs were done on my MacBook Pro from 2014 with 2.2 GHz Intel Core i7 CPU and 16 GB memory.↩

At least 17 cells must be pre-filled in a Sudoku puzzle for it to have a unique solution. So 17-clue puzzles are the most difficult of all puzzles. This paper by McGuire, Tugemann and Civario gives the proof of the same.↩

“Single” as in “Single child”↩

Reverse application operation is not used much in Haskell. But it is the preferred way of function chaining in some other functional programming languages like Clojure, FSharp, and Elixir.↩

We use Data.Map.Strict as the map implementation.↩

We need to run

`pruneCellsByFixed`

and`pruneCellsByExclusives`

repeatedly using`fixM`

because an unsettled row can lead to wrong solutions.Imagine a row which just got a

`9`

fixed because of`pruneCellsByFixed`

. If we don’t run the function again, the row may be left with one non-fixed cell with a`9`

. When we run this row through`pruneCellsByExclusives`

, it’ll consider the`9`

in the non-fixed cell as a Single and fix it. This will lead to two`9`

s in the same row, causing the solution to fail.↩Speedup calculation: 116.7 / 100 * 49151 / 282.98 = 202.7↩

If you liked this post, please leave a comment.

]]>Haskell is a purely functional programming language. It is a good choice to solve Sudoku given the problem’s combinatorial nature. The aim of this series of posts is to write a **fast** Sudoku solver in Haskell. We’ll focus on both implementing the solution and making it efficient, step-by-step, starting with a slow but simple solution in this post^{1}.

This is the first post in a series of posts:

- Fast Sudoku Solver in Haskell #1: A Simple Solution
- Fast Sudoku Solver in Haskell #2: A 200x Faster Solution
- Fast Sudoku Solver in Haskell #3: Picking the Right Data Structures

Discuss this post on r/haskell.

Solving Sudoku is a constraint satisfaction problem. We are given a partially filled grid which we have to fill completely such that each of the following constraints are satisfied:

- Each of the nine rows must have all the digits, from 1 to 9.
- Each of the nine columns must have all the digits, from 1 to 9.
- Each of the nine 3x3 sub-grids must have all the digits, from 1 to 9.

```
+-------+-------+-------+
| . . . | . . . | . 1 . |
| 4 . . | . . . | . . . |
| . 2 . | . . . | . . . |
+-------+-------+-------+
| . . . | . 5 . | 4 . 7 |
| . . 8 | . . . | 3 . . |
| . . 1 | . 9 . | . . . |
+-------+-------+-------+
| 3 . . | 4 . . | 2 . . |
| . 5 . | 1 . . | . . . |
| . . . | 8 . 6 | . . . |
+-------+-------+-------+
A sample puzzle
+-------+-------+-------+
| 6 9 3 | 7 8 4 | 5 1 2 |
| 4 8 7 | 5 1 2 | 9 3 6 |
| 1 2 5 | 9 6 3 | 8 7 4 |
+-------+-------+-------+
| 9 3 2 | 6 5 1 | 4 8 7 |
| 5 6 8 | 2 4 7 | 3 9 1 |
| 7 4 1 | 3 9 8 | 6 2 5 |
+-------+-------+-------+
| 3 1 9 | 4 7 5 | 2 6 8 |
| 8 5 6 | 1 2 9 | 7 4 3 |
| 2 7 4 | 8 3 6 | 1 5 9 |
+-------+-------+-------+
and its solution
```

Each cell in the grid is member of one row, one column and one sub-grid (called *block* in general). Digits in the pre-filled cells impose constraints on the rows, columns, and sub-grids they are part of. For example, if a cell contains `1`

then no other cell in that cell’s row, column or sub-grid can contain `1`

. Given these constraints, we can devise a simple algorithm to solve Sudoku:

- Each cell contains either a single digit or has a set of possible digits. For example, a grid showing the possibilities of all non-filled cells for the sample puzzle above:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] | [123456789] 1 [123456789] |
| 4 [123456789] [123456789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123456789] 2 [123456789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123456789] [123456789] [123456789] | [123456789] 5 [123456789] | 4 [123456789] 7 |
| [123456789] [123456789] 8 | [123456789] [123456789] [123456789] | 3 [123456789] [123456789] |
| [123456789] [123456789] 1 | [123456789] 9 [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3 [123456789] [123456789] | 4 [123456789] [123456789] | 2 [123456789] [123456789] |
| [123456789] 5 [123456789] | 1 [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123456789] [123456789] [123456789] | 8 [123456789] 6 | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

- If a cell contains a digit, remove that digit from the list of the possible digits from all its neighboring cells. Neighboring cells are the other cells in the given cell’s row, column and sub-grid. For example, the grid after removing the fixed value
`4`

of the row-2-column-1 cell from its neighboring cells:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123 56789] [123 56789] [123 56789] | [123456789] [123456789] [123456789] | [123456789] 1 [123456789] |
| 4 [123 56789] [123 56789] | [123 56789] [123 56789] [123 56789] | [123 56789] [123 56789] [123 56789] |
| [123 56789] 2 [123 56789] | [123456789] [123456789] [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [123 56789] [123456789] [123456789] | [123456789] 5 [123456789] | 4 [123456789] 7 |
| [123 56789] [123456789] 8 | [123456789] [123456789] [123456789] | 3 [123456789] [123456789] |
| [123 56789] [123456789] 1 | [123456789] 9 [123456789] | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3 [123456789] [123456789] | 4 [123456789] [123456789] | 2 [123456789] [123456789] |
| [123 56789] 5 [123456789] | 1 [123456789] [123456789] | [123456789] [123456789] [123456789] |
| [123 56789] [123456789] [123456789] | 8 [123456789] 6 | [123456789] [123456789] [123456789] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

- Repeat the previous step for all the cells that are have been solved (or
*fixed*), either pre-filled or filled in the previous iteration of the solution. For example, the grid after removing all fixed values from all non-fixed cells:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 56789] [ 3 6789] [ 3 567 9] | [ 23 567 9] [ 234 678 ] [ 2345 789] | [ 56789] 1 [ 23456 89] |
| 4 [1 3 6789] [ 3 567 9] | [ 23 567 9] [123 678 ] [123 5 789] | [ 56789] [ 23 56789] [ 23 56 89] |
| [1 56789] 2 [ 3 567 9] | [ 3 567 9] [1 34 678 ] [1 345 789] | [ 56789] [ 3456789] [ 3456 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 2 6 9] [ 3 6 9] [ 23 6 9] | [ 23 6 ] 5 [123 8 ] | 4 [ 2 6 89] 7 |
| [ 2 567 9] [ 4 67 9] 8 | [ 2 67 ] [12 4 67 ] [12 4 7 ] | 3 [ 2 56 9] [12 56 9] |
| [ 2 567 ] [ 34 67 ] 1 | [ 23 67 ] 9 [ 234 78 ] | [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3 [1 6789] [ 67 9] | 4 7 [ 5 7 9] | 2 [ 56789] [1 56 89] |
| [ 2 6789] 5 [ 2 4 67 9] | 1 [ 23 7 ] [ 23 7 9] | [ 6789] [ 34 6789] [ 34 6 89] |
| [12 7 9] [1 4 7 9] [ 2 4 7 9] | 8 [ 23 7 ] 6 | [1 5 7 9] [ 345 7 9] [1 345 9] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

- Continue till the grid
*settles*, that is, there are no more changes in the possibilities of any cells. For example, the settled grid for the current iteration:

```
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 56789] [ 3 6789] [ 3 567 9] | [ 23 567 9] [ 234 6 8 ] [ 2345 789] | [ 56789] 1 [ 23456 89] |
| 4 [1 3 6789] [ 3 567 9] | [ 23 567 9] [123 6 8 ] [123 5 789] | [ 56789] [ 23 56789] [ 23 56 89] |
| [1 56789] 2 [ 3 567 9] | [ 3 567 9] [1 34 6 8 ] [1 345 789] | [ 56789] [ 3456789] [ 3456 89] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| [ 2 6 9] [ 3 6 9] [ 23 6 9] | [ 23 6 ] 5 [123 8 ] | 4 [ 2 6 89] 7 |
| [ 2 567 9] [ 4 67 9] 8 | [ 2 67 ] [12 4 6 ] [12 4 7 ] | 3 [ 2 56 9] [12 56 9] |
| [ 2 567 ] [ 34 67 ] 1 | [ 23 67 ] 9 [ 234 78 ] | [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ] |
+-------------------------------------+-------------------------------------+-------------------------------------+
| 3 [1 6 89] [ 6 9] | 4 7 [ 5 9] | 2 [ 56 89] [1 56 89] |
| [ 2 6789] 5 [ 2 4 67 9] | 1 [ 23 ] [ 23 9] | [ 6789] [ 34 6789] [ 34 6 89] |
| [12 7 9] [1 4 7 9] [ 2 4 7 9] | 8 [ 23 ] 6 | [1 5 7 9] [ 345 7 9] [1 345 9] |
+-------------------------------------+-------------------------------------+-------------------------------------+
```

- Once the grid settles, choose one of the non-fixed cells following some strategy. Select one of the digits from all the possibilities of the cell, and fix (assume) the cell to have that digit. Go back to step 1 and repeat.
- The elimination of possibilities may result in inconsistencies. For example, you may end up with a cell with no possibilities. In such a case, discard that branch of solution, and backtrack to last point where you fixed a cell. Choose a different possibility to fix and repeat.
- If at any point the grid is completely filled, you’ve found the solution!
- If you exhaust all branches of the solution then the puzzle is unsolvable. This can happen if it starts with cells pre-filled wrongly.

This algorithm is actually a Depth-First Search on the state space of the grid configurations. It guarantees to either find a solution or prove a puzzle to be unsolvable.

We start with writing types to represent the cells and the grid:

A cell is either fixed with a particular digit or has a set of digits as possibilities. So it is natural to represent it as a sum type with `Fixed`

and `Possible`

constructors. A row is a list of cells and a grid is a list of rows.

We’ll take the input puzzle as a string of 81 characters representing the cells, left-to-right and top-to-bottom. An example is:

`.......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6...`

Here, `.`

represents an non-filled cell. Let’s write a function to read this input and parse it to our `Grid`

data structure:

```
readGrid :: String -> Maybe Grid
readGrid s
| length s == 81 = traverse (traverse readCell) . Data.List.Split.chunksOf 9 $ s
| otherwise = Nothing
where
readCell '.' = Just $ Possible [1..9]
readCell c
| Data.Char.isDigit c && c > '0' = Just . Fixed . Data.Char.digitToInt $ c
| otherwise = Nothing
```

`readGrid`

return a `Just grid`

if the input is correct, else it returns a `Nothing`

. It parses a `.`

to a `Possible`

cell with all digits as possibilities, and a digit char to a `Fixed`

cell with that digit. Let’s try it out in the *REPL*:

```
*Main> Just grid = readGrid ".......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> mapM_ print grid
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 1,Possible [1,2,3,4,5,6,7,8,9]]
[Fixed 4,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Fixed 2,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 5,Possible [1,2,3,4,5,6,7,8,9],Fixed 4,Possible [1,2,3,4,5,6,7,8,9],Fixed 7]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 8,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 3,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 1,Possible [1,2,3,4,5,6,7,8,9],Fixed 9,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Fixed 3,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 4,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 2,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Fixed 5,Possible [1,2,3,4,5,6,7,8,9],Fixed 1,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
[Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Fixed 8,Possible [1,2,3,4,5,6,7,8,9],Fixed 6,Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9],Possible [1,2,3,4,5,6,7,8,9]]
```

The output is a bit unreadable but correct. We can write a few functions to clean it up:

```
showGrid :: Grid -> String
showGrid = unlines . map (unwords . map showCell)
where
showCell (Fixed x) = show x
showCell _ = "."
showGridWithPossibilities :: Grid -> String
showGridWithPossibilities = unlines . map (unwords . map showCell)
where
showCell (Fixed x) = show x ++ " "
showCell (Possible xs) =
(++ "]")
. Data.List.foldl' (\acc x -> acc ++ if x `elem` xs then show x else " ") "["
$ [1..9]
```

Back to the *REPL* again:

```
*Main> Just grid = readGrid ".......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> putStrLn $ showGrid grid
. . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 . . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .
```

```
*Main> putStrLn $ showGridWithPossibilities grid
[123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] 1 [123456789]
4 [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789]
[123456789] 2 [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] [123456789]
[123456789] [123456789] [123456789] [123456789] 5 [123456789] 4 [123456789] 7
[123456789] [123456789] 8 [123456789] [123456789] [123456789] 3 [123456789] [123456789]
[123456789] [123456789] 1 [123456789] 9 [123456789] [123456789] [123456789] [123456789]
3 [123456789] [123456789] 4 [123456789] [123456789] 2 [123456789] [123456789]
[123456789] 5 [123456789] 1 [123456789] [123456789] [123456789] [123456789] [123456789]
[123456789] [123456789] [123456789] 8 [123456789] 6 [123456789] [123456789] [123456789]
```

The output is more readable now. We see that, at the start, all the non-filled cells have all the digits as possible values. We’ll use these functions for debugging as we go forward. We can now start solving the puzzle.

We can remove the digits of fixed cells from their neighboring cells, one cell as a time. But, it is faster to find all the fixed digits in a row of cells and remove them from the possibilities of all the non-fixed cells of the row, at once. Then we can repeat this *pruning* step for all the rows of the grid (and columns and sub-grids too! We’ll see how).

```
pruneCells :: [Cell] -> Maybe [Cell]
pruneCells cells = traverse pruneCell cells
where
fixeds = [x | Fixed x <- cells]
pruneCell (Possible xs) = case xs Data.List.\\ fixeds of
[] -> Nothing
[y] -> Just $ Fixed y
ys -> Just $ Possible ys
pruneCell x = Just x
```

`pruneCells`

prunes a list of cells as described before. We start with finding the fixed digits in the list of cells. Then we go over each non-fixed cells, removing the fixed digits we found, from their possible values. Two special cases arise:

- If pruning results in a cell with no possible digits, it is a sign that this branch of search has no solution and hence, we return a
`Nothing`

in that case. - If only one possible digit remains after pruning, then we turn that cell into a fixed cell with that digit.

We use the `traverse`

function for pruning the cells so that a `Nothing`

resulting from pruning one cell propagates to the entire list.

Let’s take it for a spin in the *REPL*:

```
*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> putStr $ showGridWithPossibilities $ [head grid] -- first row of the grid
6 [123456789] [123456789] [123456789] [123456789] [123456789] [123456789] 1 [123456789]
*Main> putStr $ showGridWithPossibilities [fromJust $ pruneCells $ head grid] -- same row after pruning
6 [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] 1 [ 2345 789]
```

It works! `6`

and `1`

are removed from the possibilities of the other cells. Now we are ready for …

Pruning a grid requires us to prune each row, each column and each sub-grid. Let’s try to solve it in the *REPL* first:

```
*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = traverse pruneCells grid
*Main> putStr $ showGridWithPossibilities grid'
6 [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] [ 2345 789] 1 [ 2345 789]
4 [123 56789] [123 56789] [123 56789] [123 56789] [123 56789] [123 56789] [123 56789] [123 56789]
[1 3456789] 2 [1 3456789] [1 3456789] [1 3456789] [1 3456789] [1 3456789] [1 3456789] [1 3456789]
[123 6 89] [123 6 89] [123 6 89] [123 6 89] 5 [123 6 89] 4 [123 6 89] 7
[12 4567 9] [12 4567 9] 8 [12 4567 9] [12 4567 9] [12 4567 9] 3 [12 4567 9] [12 4567 9]
[ 2345678 ] [ 2345678 ] 1 [ 2345678 ] 9 [ 2345678 ] [ 2345678 ] [ 2345678 ] [ 2345678 ]
3 [1 56789] [1 56789] 4 [1 56789] [1 56789] 2 [1 56789] [1 56789]
[ 234 6789] 5 [ 234 6789] 1 [ 234 6789] [ 234 6789] [ 234 6789] [ 234 6789] [ 234 6789]
[12345 7 9] [12345 7 9] [12345 7 9] 8 [12345 7 9] 6 [12345 7 9] [12345 7 9] [12345 7 9]
```

By `traverse`

-ing the grid with `pruneCells`

, we are able to prune each row, one-by-one. Since pruning a row doesn’t affect another row, we don’t have to pass the resulting rows between each pruning step. That is to say, `traverse`

is enough for us, we don’t need `foldl`

here.

How do we do the same thing for columns now? Since our representation for the grid is rows-first, we first need to convert it to a columns-first representation. Luckily, that’s what `Data.List.transpose`

function does:

```
*Main> Just grid = readGrid "693784512487512936125963874932651487568247391741398625319475268856129743274836159"
*Main> putStr $ showGrid grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
*Main> putStr $ showGrid $ Data.List.transpose grid
6 4 1 9 5 7 3 8 2
9 8 2 3 6 4 1 5 7
3 7 5 2 8 1 9 6 4
7 5 9 6 2 3 4 1 8
8 1 6 5 4 9 7 2 3
4 2 3 1 7 8 5 9 6
5 9 8 4 3 6 2 7 1
1 3 7 8 9 2 6 4 5
2 6 4 7 1 5 8 3 9
```

Pruning columns is easy now:

```
*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = fmap Data.List.transpose . traverse pruneCells . Data.List.transpose $ grid
*Main> putStr $ showGridWithPossibilities grid'
6 [1 34 6789] [ 234567 9] [ 23 567 9] [1234 678 ] [12345 789] [1 56789] 1 [123456 89]
4 [1 34 6789] [ 234567 9] [ 23 567 9] [1234 678 ] [12345 789] [1 56789] [ 23456789] [123456 89]
[12 5 789] 2 [ 234567 9] [ 23 567 9] [1234 678 ] [12345 789] [1 56789] [ 23456789] [123456 89]
[12 5 789] [1 34 6789] [ 234567 9] [ 23 567 9] 5 [12345 789] 4 [ 23456789] 7
[12 5 789] [1 34 6789] 8 [ 23 567 9] [1234 678 ] [12345 789] 3 [ 23456789] [123456 89]
[12 5 789] [1 34 6789] 1 [ 23 567 9] 9 [12345 789] [1 56789] [ 23456789] [123456 89]
3 [1 34 6789] [ 234567 9] 4 [1234 678 ] [12345 789] 2 [ 23456789] [123456 89]
[12 5 789] 5 [ 234567 9] 1 [1234 678 ] [12345 789] [1 56789] [ 23456789] [123456 89]
[12 5 789] [1 34 6789] [ 234567 9] 8 [1234 678 ] 6 [1 56789] [ 23456789] [123456 89]
```

First, we `transpose`

the grid to convert the columns into rows. Then, we prune the rows by `traverse`

-ing `pruneCells`

over them. And finally, we turn the rows back into columns by `transpose`

-ing the grid back again. The last `transpose`

needs to be `fmap`

-ped because `traverse pruneCells`

returns a `Maybe`

.

Pruning sub-grids is a bit trickier. Following the same idea as pruning columns, we need two functions to transform the sub-grids into rows and back. Let’s write the first one:

```
subGridsToRows :: Grid -> Grid
subGridsToRows =
concatMap (\rows -> let [r1, r2, r3] = map (Data.List.Split.chunksOf 3) rows
in zipWith3 (\a b c -> a ++ b ++ c) r1 r2 r3)
. Data.List.Split.chunksOf 3
```

And try it out:

```
*Main> Just grid = readGrid "693784512487512936125963874932651487568247391741398625319475268856129743274836159"
*Main> putStr $ showGrid grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
*Main> putStr $ showGrid $ subGridsToRows grid
6 9 3 4 8 7 1 2 5
7 8 4 5 1 2 9 6 3
5 1 2 9 3 6 8 7 4
9 3 2 5 6 8 7 4 1
6 5 1 2 4 7 3 9 8
4 8 7 3 9 1 6 2 5
3 1 9 8 5 6 2 7 4
4 7 5 1 2 9 8 3 6
2 6 8 7 4 3 1 5 9
```

You can go over the code and the output and make yourself sure that it works. Also, it turns out that we don’t need to write the back-transform function. `subGridsToRows`

is its own back-transform:

```
*Main> putStr $ showGrid grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
*Main> putStr $ showGrid $ subGridsToRows $ subGridsToRows $ grid
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
```

Nice! Now writing the sub-grid pruning function is easy:

```
*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = fmap subGridsToRows . traverse pruneCells . subGridsToRows $ grid
*Main> putStr $ showGridWithPossibilities grid'
6 [1 3 5 789] [1 3 5 789] [123456789] [123456789] [123456789] [ 23456789] 1 [ 23456789]
4 [1 3 5 789] [1 3 5 789] [123456789] [123456789] [123456789] [ 23456789] [ 23456789] [ 23456789]
[1 3 5 789] 2 [1 3 5 789] [123456789] [123456789] [123456789] [ 23456789] [ 23456789] [ 23456789]
[ 234567 9] [ 234567 9] [ 234567 9] [1234 678 ] 5 [1234 678 ] 4 [12 56 89] 7
[ 234567 9] [ 234567 9] 8 [1234 678 ] [1234 678 ] [1234 678 ] 3 [12 56 89] [12 56 89]
[ 234567 9] [ 234567 9] 1 [1234 678 ] 9 [1234 678 ] [12 56 89] [12 56 89] [12 56 89]
3 [12 4 6789] [12 4 6789] 4 [ 23 5 7 9] [ 23 5 7 9] 2 [1 3456789] [1 3456789]
[12 4 6789] 5 [12 4 6789] 1 [ 23 5 7 9] [ 23 5 7 9] [1 3456789] [1 3456789] [1 3456789]
[12 4 6789] [12 4 6789] [12 4 6789] 8 [ 23 5 7 9] 6 [1 3456789] [1 3456789] [1 3456789]
```

It works well. Now we can string together these three steps to prune the entire grid. We also have to make sure that result of pruning each step is fed into the next step. This is so that the fixed cells created into one step cause more pruning in the further steps. We use monadic bind (`>>=`

) for that. Here’s the final code:

```
pruneGrid' :: Grid -> Maybe Grid
pruneGrid' grid =
traverse pruneCells grid
>>= fmap Data.List.transpose . traverse pruneCells . Data.List.transpose
>>= fmap subGridsToRows . traverse pruneCells . subGridsToRows
```

And the test:

```
*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = pruneGrid' grid
*Main> putStr $ showGridWithPossibilities grid'
6 [ 3 789] [ 3 5 7 9] [ 23 5 7 9] [ 234 78 ] [ 2345 789] [ 5 789] 1 [ 2345 89]
4 [1 3 789] [ 3 5 7 9] [ 23 567 9] [123 678 ] [123 5 789] [ 56789] [ 23 56789] [ 23 56 89]
[1 5 789] 2 [ 3 5 7 9] [ 3 567 9] [1 34 678 ] [1 345 789] [ 56789] [ 3456789] [ 3456 89]
[ 2 9] [ 3 6 9] [ 23 6 9] [ 23 6 ] 5 [123 8 ] 4 [ 2 6 89] 7
[ 2 5 7 9] [ 4 67 9] 8 [ 2 67 ] [12 4 67 ] [12 4 7 ] 3 [ 2 56 9] [12 56 9]
[ 2 5 7 ] [ 34 67 ] 1 [ 23 67 ] 9 [ 234 78 ] [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ]
3 [1 6789] [ 67 9] 4 7 [ 5 7 9] 2 [ 56789] [1 56 89]
[ 2 789] 5 [ 2 4 67 9] 1 [ 23 7 ] [ 23 7 9] [ 6789] [ 34 6789] [ 34 6 89]
[12 7 9] [1 4 7 9] [ 2 4 7 9] 8 [ 23 7 ] 6 [1 5 7 9] [ 345 7 9] [1 345 9]
*Main> putStr $ showGrid grid
6 . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 . . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .
*Main> putStr $ showGrid grid'
6 . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 7 . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .
```

We can clearly see the massive pruning of possibilities all around the grid. We also see a `7`

pop up in the row-7-column-5 cell. This means that we can prune the grid further, until it settles. If you are familiar with Haskell, you may recognize this as trying to find a fixed point for the `pruneGrid'`

function, except in a monadic context. It is simple to implement:

```
pruneGrid :: Grid -> Maybe Grid
pruneGrid = fixM pruneGrid'
where
fixM f x = f x >>= \x' -> if x' == x then return x else fixM f x'
```

The crux of this code is the `fixM`

function. It takes a monadic function `f`

and an initial value, and recursively calls itself till the return value settles. Let’s do another round in the *REPL*:

```
*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = pruneGrid grid
*Main> putStr $ showGridWithPossibilities grid'
6 [ 3 789] [ 3 5 7 9] [ 23 5 7 9] [ 234 8 ] [ 2345 789] [ 5 789] 1 [ 2345 89]
4 [1 3 789] [ 3 5 7 9] [ 23 567 9] [123 6 8 ] [123 5 789] [ 56789] [ 23 56789] [ 23 56 89]
[1 5 789] 2 [ 3 5 7 9] [ 3 567 9] [1 34 6 8 ] [1 345 789] [ 56789] [ 3456789] [ 3456 89]
[ 2 9] [ 3 6 9] [ 23 6 9] [ 23 6 ] 5 [123 8 ] 4 [ 2 6 89] 7
[ 2 5 7 9] [ 4 67 9] 8 [ 2 67 ] [12 4 6 ] [12 4 7 ] 3 [ 2 56 9] [12 56 9]
[ 2 5 7 ] [ 34 67 ] 1 [ 23 67 ] 9 [ 234 78 ] [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ]
3 [1 6 89] [ 6 9] 4 7 [ 5 9] 2 [ 56 89] [1 56 89]
[ 2 789] 5 [ 2 4 67 9] 1 [ 23 ] [ 23 9] [ 6789] [ 34 6789] [ 34 6 89]
[12 7 9] [1 4 7 9] [ 2 4 7 9] 8 [ 23 ] 6 [1 5 7 9] [ 345 7 9] [1 345 9]
```

We see that `7`

in the row-7-column-5 cell is eliminated from all its neighboring cells. We can’t prune the grid anymore. Now it is time to make the choice.

One the grid is settled, we need to choose a non-fixed cell and make it fixed by assuming one of its possible values. This gives us two grids, next in the state-space of the solution search:

- one which has this chosen cell fixed to this chosen digit, and,
- the other in which the chosen cell has all the other possibilities except the one we chose to fix.

We call this function, `nextGrids`

:

```
nextGrids :: Grid -> (Grid, Grid)
nextGrids grid =
let (i, first@(Fixed _), rest) =
fixCell
. Data.List.minimumBy (compare `Data.Function.on` (possibilityCount . snd))
. filter (isPossible . snd)
. zip [0..]
. concat
$ grid
in (replace2D i first grid, replace2D i rest grid)
where
isPossible (Possible _) = True
isPossible _ = False
possibilityCount (Possible xs) = length xs
possibilityCount (Fixed _) = 1
fixCell (i, Possible [x, y]) = (i, Fixed x, Fixed y)
fixCell (i, Possible (x:xs)) = (i, Fixed x, Possible xs)
fixCell _ = error "Impossible case"
replace2D :: Int -> a -> [[a]] -> [[a]]
replace2D i v =
let (x, y) = (i `quot` 9, i `mod` 9) in replace x (replace y (const v))
replace p f xs = [if i == p then f x else x | (x, i) <- zip xs [0..]]
```

We choose the non-fixed cell with least count of possibilities as the pivot. This strategy make sense intuitively, as with a cell with fewest possibilities, we have the most chance of being right when assuming one. Fixing a non-fixed cell leads to one of the two cases:

- the cell has only two possible values, resulting in two fixed cells, or,
- the cell has more than two possible values, resulting in one fixed and one non-fixed cell.

Then all we are left with is replacing the non-fixed cell with its fixed and fixed/non-fixed choices, which we do with some math and some list traversal. A quick check on the *REPL*:

```
*Main> Just grid = readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> Just grid' = pruneGrid grid
*Main> putStr $ showGridWithPossibilities grid'
6 [ 3 789] [ 3 5 7 9] [ 23 5 7 9] [ 234 8 ] [ 2345 789] [ 5 789] 1 [ 2345 89]
4 [1 3 789] [ 3 5 7 9] [ 23 567 9] [123 6 8 ] [123 5 789] [ 56789] [ 23 56789] [ 23 56 89]
[1 5 789] 2 [ 3 5 7 9] [ 3 567 9] [1 34 6 8 ] [1 345 789] [ 56789] [ 3456789] [ 3456 89]
[ 2 9] [ 3 6 9] [ 23 6 9] [ 23 6 ] 5 [123 8 ] 4 [ 2 6 89] 7
[ 2 5 7 9] [ 4 67 9] 8 [ 2 67 ] [12 4 6 ] [12 4 7 ] 3 [ 2 56 9] [12 56 9]
[ 2 5 7 ] [ 34 67 ] 1 [ 23 67 ] 9 [ 234 78 ] [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ]
3 [1 6 89] [ 6 9] 4 7 [ 5 9] 2 [ 56 89] [1 56 89]
[ 2 789] 5 [ 2 4 67 9] 1 [ 23 ] [ 23 9] [ 6789] [ 34 6789] [ 34 6 89]
[12 7 9] [1 4 7 9] [ 2 4 7 9] 8 [ 23 ] 6 [1 5 7 9] [ 345 7 9] [1 345 9]
*Main> -- the row-4-column-1 cell is the first cell with only two possibilities, [2, 9].
*Main> -- it is chosen as the pivot cell to find the next grids.
*Main> (grid1, grid2) = nextGrids grid'
*Main> putStr $ showGridWithPossibilities grid1
6 [ 3 789] [ 3 5 7 9] [ 23 5 7 9] [ 234 8 ] [ 2345 789] [ 5 789] 1 [ 2345 89]
4 [1 3 789] [ 3 5 7 9] [ 23 567 9] [123 6 8 ] [123 5 789] [ 56789] [ 23 56789] [ 23 56 89]
[1 5 789] 2 [ 3 5 7 9] [ 3 567 9] [1 34 6 8 ] [1 345 789] [ 56789] [ 3456789] [ 3456 89]
2 [ 3 6 9] [ 23 6 9] [ 23 6 ] 5 [123 8 ] 4 [ 2 6 89] 7
[ 2 5 7 9] [ 4 67 9] 8 [ 2 67 ] [12 4 6 ] [12 4 7 ] 3 [ 2 56 9] [12 56 9]
[ 2 5 7 ] [ 34 67 ] 1 [ 23 67 ] 9 [ 234 78 ] [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ]
3 [1 6 89] [ 6 9] 4 7 [ 5 9] 2 [ 56 89] [1 56 89]
[ 2 789] 5 [ 2 4 67 9] 1 [ 23 ] [ 23 9] [ 6789] [ 34 6789] [ 34 6 89]
[12 7 9] [1 4 7 9] [ 2 4 7 9] 8 [ 23 ] 6 [1 5 7 9] [ 345 7 9] [1 345 9]
*Main> putStr $ showGridWithPossibilities grid2
6 [ 3 789] [ 3 5 7 9] [ 23 5 7 9] [ 234 8 ] [ 2345 789] [ 5 789] 1 [ 2345 89]
4 [1 3 789] [ 3 5 7 9] [ 23 567 9] [123 6 8 ] [123 5 789] [ 56789] [ 23 56789] [ 23 56 89]
[1 5 789] 2 [ 3 5 7 9] [ 3 567 9] [1 34 6 8 ] [1 345 789] [ 56789] [ 3456789] [ 3456 89]
9 [ 3 6 9] [ 23 6 9] [ 23 6 ] 5 [123 8 ] 4 [ 2 6 89] 7
[ 2 5 7 9] [ 4 67 9] 8 [ 2 67 ] [12 4 6 ] [12 4 7 ] 3 [ 2 56 9] [12 56 9]
[ 2 5 7 ] [ 34 67 ] 1 [ 23 67 ] 9 [ 234 78 ] [ 56 8 ] [ 2 56 8 ] [ 2 56 8 ]
3 [1 6 89] [ 6 9] 4 7 [ 5 9] 2 [ 56 89] [1 56 89]
[ 2 789] 5 [ 2 4 67 9] 1 [ 23 ] [ 23 9] [ 6789] [ 34 6789] [ 34 6 89]
[12 7 9] [1 4 7 9] [ 2 4 7 9] 8 [ 23 ] 6 [1 5 7 9] [ 345 7 9] [1 345 9]
```

We have implemented parts of our algorithm till now. Now we’ll put everything together to solve the puzzle. First, we need to know if we are done or have messed up:

```
isGridFilled :: Grid -> Bool
isGridFilled grid = null [ () | Possible _ <- concat grid ]
isGridInvalid :: Grid -> Bool
isGridInvalid grid =
any isInvalidRow grid
|| any isInvalidRow (Data.List.transpose grid)
|| any isInvalidRow (subGridsToRows grid)
where
isInvalidRow row =
let fixeds = [x | Fixed x <- row]
emptyPossibles = [x | Possible x <- row, null x]
in hasDups fixeds || not (null emptyPossibles)
hasDups l = hasDups' l []
hasDups' [] _ = False
hasDups' (y:ys) xs
| y `elem` xs = True
| otherwise = hasDups' ys (y:xs)
```

`isGridFilled`

returns whether a grid is filled completely by checking it for any `Possible`

cells. `isGridInvalid`

checks if a grid is invalid because it either has duplicate fixed cells in any block or has any non-fixed cell with no possibilities.

Writing the `solve`

function is almost trivial now:

```
solve :: Grid -> Maybe Grid
solve grid = pruneGrid grid >>= solve'
where
solve' g
| isGridInvalid g = Nothing
| isGridFilled g = Just g
| otherwise =
let (grid1, grid2) = nextGrids g
in solve grid1 <|> solve grid2
```

We prune the grid as before and pipe it to the helper function `solve'`

. `solve'`

bails with a `Nothing`

if the grid is invalid, or returns the solved grid if it is filled completely. Otherwise, it finds the next two grids in the search tree and solves them recursively with backtracking by calling the `solve`

function. Backtracking here is implemented by the using the `Alternative`

(`<|>`

) implementation of the `Maybe`

type^{2}. It takes the second branch in the computation if the first branch returns a `Nothing`

.

Whew! That took us long. Let’s put it to the final test now:

```
*Main> Just grid =
readGrid "6......1.4.........2...........5.4.7..8...3....1.9....3..4..2...5.1........8.6..."
*Main> putStr $ showGrid grid
6 . . . . . . 1 .
4 . . . . . . . .
. 2 . . . . . . .
. . . . 5 . 4 . 7
. . 8 . . . 3 . .
. . 1 . 9 . . . .
3 . . 4 . . 2 . .
. 5 . 1 . . . . .
. . . 8 . 6 . . .
*Main> Just grid' = solve grid
*Main> putStr $ showGrid grid'
6 9 3 7 8 4 5 1 2
4 8 7 5 1 2 9 3 6
1 2 5 9 6 3 8 7 4
9 3 2 6 5 1 4 8 7
5 6 8 2 4 7 3 9 1
7 4 1 3 9 8 6 2 5
3 1 9 4 7 5 2 6 8
8 5 6 1 2 9 7 4 3
2 7 4 8 3 6 1 5 9
```

It works! Let’s put a quick `main`

wrapper around `solve`

to call it from the command line:

```
main :: IO ()
main = do
inputs <- lines <$> getContents
Control.Monad.forM_ inputs $ \input ->
case readGrid input of
Nothing -> putStrLn "Invalid input"
Just grid -> case solve grid of
Nothing -> putStrLn "No solution found"
Just grid' -> putStrLn $ showGrid grid'
```

And now, we can invoke it from the command line:

```
$ echo ".......12.5.4............3.7..6..4....1..........8....92....8.....51.7.......3..." | stack exec sudoku
3 6 4 9 7 8 5 1 2
1 5 2 4 3 6 9 7 8
8 7 9 1 2 5 6 3 4
7 3 8 6 5 1 4 2 9
6 9 1 2 4 7 3 8 5
2 4 5 3 8 9 1 6 7
9 2 3 7 6 4 8 5 1
4 8 6 5 1 2 7 9 3
5 1 7 8 9 3 2 4 6
```

And, we are done.

If you want to play with different puzzles, the file here lists some of the toughest ones. Let’s run^{3} some of them through our program to see how fast it is:

```
$ head -n100 sudoku17.txt | time stack exec sudoku
... output omitted ...
116.70 real 198.09 user 94.46 sys
```

It took about 117 seconds to solve a hundred puzzles, so, about 1.2 seconds per puzzle. This is pretty slow but we’ll get around to making it faster in the subsequent posts.

In this rather verbose article, we learned how to write a simple Sudoku solver in Haskell step-by-step. In the later parts of this series, we’ll delve into profiling the solution and figuring out better algorithms and data structures to solve Sudoku more efficiently. The code till now is available here. Discuss this post on r/haskell or leave a comment.

If you liked this post, please leave a comment.

]]>