## Solving Sudoku with linear algebra

UPDATE: A real mathematician provides a general solution here. Go read that article…

I recently started hearing more and more about Sudoku from people and decided to check it out [Wikipedia]. After playing a few rounds of it, I thought it would be fun to write a computer program to solve it. So, I whipped up a quick program that will solve all of the easy level Sudokus at Web Sudoku. However, it failed for more complex ones. I realized that a game tree approach with some pruning rules would probably be the way to go to have a computer solve Sudoku.

However, I had also considered solving Sudoku with linear algebra but I had an feeling that it just wouldn’t work. A colleague of mine thought this might be possible as well, so I decided to demonstrate why it doesn’t work.

Given the following Sudoku [via Web Sudoku]

3 | 7 | 8 | 1 | |||||

2 | 9 | |||||||

1 | 6 | 9 | 8 | 8 | ||||

8 | 7 | 1 | ||||||

7 | 6 | 3 | 4 | |||||

4 | 6 | 2 | 9 | |||||

5 | 3 | 8 | 1 | 7 | ||||

8 | 3 | |||||||

6 | 1 | 2 | 8 |

Fill remaining spaces with variables

A0 | 3 | 7 | A3 | A4 | 8 | A6 | A7 | 1 |

B0 | B1 | 2 | B3 | B4 | B5 | B6 | 9 | B8 |

1 | C1 | 6 | C3 | 9 | C5 | 8 | C7 | 8 |

D0 | D1 | D2 | 8 | 7 | 1 | D6 | D7 | D8 |

7 | 6 | E2 | 3 | E4 | 4 | E6 | E7 | E8 |

4 | F1 | F2 | 6 | 2 | 9 | F6 | F7 | F8 |

5 | G1 | 3 | G3 | 8 | G5 | 1 | G7 | 7 |

H0 | 8 | H2 | H3 | H4 | H5 | 3 | H7 | H8 |

6 | I1 | I2 | 1 | I4 | I5 | 2 | 8 | I8 |

Now create an equality constraint for all rows, columns and regions

Sum of groups: 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 45

A0 + 3 + 7 + A3 + A4 + 8 + A6 + A7 + 1 = 45

B0 + B1 + 2 + B3 + B4 + B5 + B6 + 9 + B8 = 45

1 + C1 + 6 + C3 + 9 + C5 + 8 + C7 + 8 = 45

D0 + D1 + D2 + 8 + 7 + 1 + D6 + D7 + D8 = 45

7 + 6 + E2 + 3 + E4 + 4 + E6 + E7 + E8 = 45

4 + F1 + F2 + 6 + 2 + 9 + F6 + F7 + F8 = 45

5 + G1 + 3 + G3 + 8 + G5 + 1 + G7 + 7 = 45

H0 + 8 + H2 + H3 + H4 + H5 + 3 + H7 + H8 = 45

6 + I1 + I2 + 1 + I4 + I5 + 2 + 8 + I8 = 45

A0 + B0 + 1 + D0 + 7 + 4 + 5 + H0 + 6 = 45

3 + B1 + C1 + D1 + 6 + F1 + G1 + 8 + I1 = 45

7 + 2 + 6 + D2 + E2 + F2 + 3 + H2 + I2 = 45

A3 + B3 + C3 + 8 + 3 + 6 + G3 + H3 + 1 = 45

A4 + B4 + 9 + 7 + E4 + 2 + 8 + H4 + I4 = 45

8 + B5 + C5 + 1 + 4 + 9 + G5 + H5 + I5 = 45

A6 + B6 + 8 + D6 + E6 + F6 + 1 + 3 + 2 = 45

1 + B8 + 8 + D8 + E8 + F8 + 7 + H8 + I8 = 45

A0 + 3 + 7 + B0 + B1 + 2 + 1 + C1 + 6 = 45

A3 + A4 + 8 + B3 + B4 + B5 + C3 + 9 + C5 = 45

A6 + A7 + 1 + B6 + 9 + B8 + 8 + C7 + 8 = 45

D0 + D1 + D2 + 7 + 6 + E2 + 4 + F1 + F2 = 45

8 + 7 + 1 + 3 + E4 + 4 + 6 + 2 + 9 = 45

D6 + D7 + D8 + E6 + E7 + E8 + F6 + F7 + F8 = 45

5 + G1 + 3 + H0 + 8 + H2 + 6 + I1 + I2 = 45

G3 + 8 + G5 + H3 + H4 + H5 + 1 + I4 + I5 = 45

1 + G7 + 7 + 3 + H7 + H8 + 2 + 8 + I8 = 45

This gives you 26 equations and 48 variables. Not enough to solve with simple linear algebra, but might be enough to constrain guesses (prune branches) in a game tree approach.

This is by no means a rigorous proof. You could, in theory, add further constraints that would allow Sudoku to be solved with matrix algebra, but I’m not sure. For instance, you could certainly add a whole slew of inequality constraints such as “0 < A1 < 10” or “A0 A1 … A7”. However, I’m not sure that these expressions can be converted to a linear algebra approach, though they do make good game tree pruning rules.

luke prestonsaid, on November 18, 2011 at 8:18 pmI also thought about this approach. It can’t be done not because there aren’t enough equations to variables but because the equations aren’t independent. Hence we can’t use this approach to solve it. I came to this conclusion because I used the same idea, then like you found there weren’t enough equations. But if you 2 regions or more for your equations (this sums to 90 obviously) it is possible to find enough equations for all variables. I went and tried to solve the problem but was coming stuck. I think that it is because the equations aren’t independent. Can’t work it out but why don’t you try doing this. You could also add rows together. No I don’t think it would work, nor could it ever work using those sort of “simple” methods since all regions, rows and columns all depend on each other for their solutions. That is my guess if nothing else anyway.