Constructing a minimal Sudoku puzzle

ATTENTION: You are viewing a page formatted for mobile devices; to view the full web page, click HERE.

Other Software > Developer's Corner

<< < (2/4) > >>

Deozaan:
"Unit: Any nine squares that compose a row, column, or box. Each square has exactly 3 units." <-- do you mean each square is a MEMBER of exactly 3 units.
-mouser (July 28, 2021, 11:23 AM)
--- End quote ---

Yes, that's a better way of phrasing it. I'll edit the post to reflect that. :Thmbsup:

As I understand the basic gist of the algorithm is that you incrementally blank out squares, and check to make sure (by running a solver) that there is still only a unique solution to the puzzle with the square blanked out. If there are multiple solutions with that square blanked out, you must force it to be a given, and try looking for another square to blank out (dig a hole).

So, this algorithm guarantees that the puzzle, with its givens and its holes (blanks) is UNIQUELY SOLVABLE.

But it doesn't really guarantee that the puzzle is MINIMAL. I don't think it claims to, does it?

It's not doing a brute force search for all POSSIBLE ways to dig holes.. It is in fact going through an ORDERED list of possible squares to dig, checking them one at a time, and when it finds one it can dig (leave blank), it makes it blank going forward.

This does not guarantee a minimal set of givens. Depending on what holes (blanks) you decide on early, you may be forced into having more or less holes in the rest of the puzzle.

You could use it to find minimal puzzles if you ran it once for every possible sequence of potential holes, and kept the one that resulted in the fewest givens.

---

Does that make sense? Or am I missing something?

I just think it's important to understand that the algorithm doesn't seem to be promising to generate minimal puzzles.-mouser (July 28, 2021, 12:00 PM)
--- End quote ---

I was operating under the idea that, of all the number of clues given at the start of the puzzle, if removing any one of them results in multiple solutions, then that puzzle is a minimal puzzle. But you make a convincing case that there is a distinction between a truly minimal puzzle and the conception I had of it, and there could possibly be (and almost certainly is) a different arrangement/number of givens that would result in the same unique solution using the digging holes method.

And now that I realize this, I'm interested in what it would take to attain your definition of a truly minimal puzzle. But at the same time, I think the definition I've been using until now is good enough for my purposes with this project.

For reference, I got the idea of a minimal puzzle from a paper by Andrew C. Stuart (PDF link), and there may have been something lost in my paraphrasing of his definition:

The fifth criterion is that the puzzle should be minimal. That is all the numbers have been removed so that the bare minimum of clues remain make a single solution puzzle. Interestingly mathematicians have worked out that 39 numbers is the most number of clues that a minimal puzzle can have.

[...]

Given a filled board I then start subtracting numbers to make the puzzle. To maintain symmetry either two or four numbers that are diagonally opposite each other must be removed at the same time. For the first twenty or so subtractions four numbers can be removed. Beyond that the chance of four numbers leaving a single solution puzzle get slimmer so two at a time are subtracted. A low target of 20 clues is set and by 30 the remaining numbers are tested individually to see if they can be removed safely. After each subtraction the puzzle is tested to see if it retains a single solution. If this fails the numbers are restored and a different quad, pair or single subtraction is tried.-https://www.sudokuwiki.org/Sudoku_Creation_and_Grading.pdf
--- End quote ---

So that's the "original" source of the term "minimal puzzle" that I'm aware of. And it doesn't seem to contradict my interpretation of it.

Maybe im missing something since it seems like it would break everything if it was missing, or maybe you didn't show all the code, but in find_givens() function, line 21 starts a loop checking if there are solutions with the alternative digits.

In the loop you force the test value, and check if there is a solution..
And at the end of the loop if you DID find other solutions, you force the value back and then set it as a given square.

BUT what you dont do is at the end of that if you DONT find any solutions, is FORCE that value to a hole (value 0).. so in the code you show above it will be left as bad value (the last value you checked in the loop), rather than marked as a hole.
Did you just forget to show that code or am i missing something?
-mouser (July 28, 2021, 12:12 PM)
--- End quote ---

You're right. You're missing something that I failed to mention.

For the sake of simplicity, I lied a little in my description of the problem, because I failed to understand the ramifications of the differences I swept under the rug. I'm not actually using a true array to store the squares. I'm using a special array type in GDScript called a PoolIntArray, which is optimized for memory usage and is passed by value instead of reference. I thought both of these characteristics were important for improving the speed of the algorithm, but didn't really see it worth mentioning this special type. (I've updated the original post to include this detail)

But because it's passed by value instead of reference, there's no need to backtrack the changes because the PoolIntArray from the level above is unchanged. What that means is that I actually don't even need to force the value back to its original value when a solution is found. I just need to mark it as given in the "given" array (which is a true array).

Forcing it back to its original value is an artifact in the code from when I originally wrote the functions using a two-dimensional array to store the grid. I thought it was so slow because not only did I have to duplicate the array each time I tried to solve it recursively, I had to do a deep copy to ensure the nested arrays didn't get set to bad values. So when I went to change the code to use a normal array I found the PoolIntArray and thought that it was killing two birds with one stone: No more deep copying, and memory usage optimization! I managed to remove the code to re-dig the hole when no solution was found, but accidentally left in the code that re-sets the square to the original value when a solution is found.

Here is one in Python that does the same thing you're trying to do (I think)

https://www.101computing.net/sudoku-generator-algorithm/

Not sure if that helps...
-wraith808 (July 28, 2021, 01:05 PM)
--- End quote ---

Thanks. I'll take a look at that.

mouser:
Your solve() function looks a little problematic to me.

First, just as a matter of clarity -- you are returning an int variable "solutions" which seems like its suggesting the number of solutions found, but this isnt quite right is it.. You really just want to return a bool of true if a solution was found, or false if not. The recursive calls must eventually end in a true or a false, which gets propagated back up to be returned. You never want to find more than 1 solution, and your algorithm explicitly stops searching once it finds one solution. Having it look like its adding up solution counts makes it harder to understand. This makes lines like 40 particularly confusing since you really should be saying "return false" -- there is no way to get there with a solutions>0.

And now looking more into the loop, the idea is to recursively fill in one value, and then call solve on the new grid with the filled in value, each recursive call filling in one additional square of the grid.
(you are doing some optimization on what ORDER you try to fill in squares based on looking at ones with least options first -- maybe document the reasoning and approach more; that's not important).

So the idea of this recursion is that it should return FALSE if you ever call solve() with a grid where there is a hole (blank square) that CANNOT be filled in. That false will return to the caller who will continue looping checking for a different value to put into the square before it recursively checks for another solution. That looks right to me.

And then you need it to return TRUE when it has SOLVED the grid COMPLETELY.

But looking at the code, i see what MIGHT be a problem.

Your last lines of solve say "
--- ---# if we got this far then we've solved it! solutions += 1 return solutions"

But let's look how we can get down there.
One way is if there are no more holes -- that's what we want. When there are no holes, return TRUE, we found a solution.
But also i think you can get there if your PRELIMINARY sorting operations identifying possibilities creates a situation where some illegal spot has no good options. this should NOT be returning a true case(!)

I think the better solution would be to set some flag line at line 14: flagFoundHole = true;

then somewhere before line 18 you could just say

--- ---If (!flagFoundHole) {
// no holes found, so this represents a solution!
return true;
}

mouser:
You know the famous Knuth programming quote: "premature optimization is the root of all evil" (which I don't agree with -- I think code duplication gets that spot).

My first strategy for trying to understand code like this would be to remove wherever possible all the clever optimization stuff. Like the lines 2-26 in solve serve only the purpose of reordering the indices to check to get faster results; i'd try to get rid of that kind of thing until i knew it all worked well first.

After it all works then worry about speeding it up.

mouser:
I know I just said to avoid premature optimization, but in terms of speed, I think your original code with ONE grid passed by reference and resetting values that don't lead to success actually seems like it would be a lot faster and memory friendly than the alternative... Because you are doing a depth first search it seems like that would work just fine.

Something to consider after you get it all working :)

Deozaan:
After it all works then worry about speeding it up.-mouser (July 28, 2021, 01:49 PM)
--- End quote ---

That's where I am (I think).

I had it working (I think). But it was too slow to be useable in a real-time game. So now I'm trying to speed it up. And breaking it in the process.

And that's also why some of the code doesn't make much sense. Originally, I was counting solutions. The solve() function was flexible and could be used to find just one solution, two solutions (quitting after it realized the puzzle wasn't unique), or all solutions. Then I realized I didn't really care about finding all solutions. And then I realized I could tweak my other code so that it only needed to find a single solution to know whether or not it was unique. So the solution count stayed in there even though it didn't need to count anymore. Oops! My bad.

But looking at the code, i see what MIGHT be a problem.

Your last lines of solve say "
--- ---# if we got this far then we've solved it! solutions += 1 return solutions"

But let's look how we can get down there.
One way is if there are no more holes -- that's what we want. When there are no holes, return TRUE, we found a solution.
But also i think you can get there if your PRELIMINARY sorting operations identifying possibilities creates a situation where some illegal spot has no good options. this should NOT be returning a true case(!)-mouser (July 28, 2021, 01:45 PM)
--- End quote ---

I think you're onto something. I was writing up a response about how I didn't see how that was possible, but in the process of explaining my viewpoint there was a momentary flash of neurons connecting in such a way as to realize that you may be right. But then those neurons went on break or something, because I don't think I fully comprehend it yet. So I'm not fully convinced that you're right (or in what way you're right). But now my confidence that my initial viewpoint was right is very low. :D

Navigation

[0] Message Index

[#] Next page

[*] Previous page

Go to full version