And here's a screenshot of the tool running in "guessing" mode with that same die.
On the top left you see the live camera image, with an inset view of the extracted foreground-facing die label.
On the bottom left you see the training image for the 20 that the program saw in a previous labeling/training stage.
Note that it has ONE photo of EACH die face that it uses as its prototype -- there is no training procedure per se, it's just one example of each die face. In fact, as you can see in this case, it's not even of the same die, just another d20 with a similar font.
Then on the right-hand side you see the results of trying to match against its stored set of 20 prototypes, showing the top 4 candidate matches, each rotated and scaled in an attempt to find the best-aligning match. The right-hand window of each pair shows the "difference mask" after some processing. So you can see how the "20" is the best match, which you can see reported at the bottom. (The confidence score reflects the difference between the best match score and the next-best match score.)
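
To make the matching idea concrete, here is a rough sketch of how "one prototype per face, pick the best aligned match, report the margin as confidence" could look. This is not the tool's actual code: the names (`rotate90`, `difference`, `match_score`, `guess`) are made up for illustration, the images are tiny square binary grids, and the alignment search only tries the four 90-degree rotations, whereas the real tool rotates and scales more finely.

```python
from typing import Dict, List, Tuple

# Assumption: a square binary image, 1 = ink, 0 = background.
Image = List[List[int]]


def rotate90(img: Image) -> Image:
    """Rotate a square binary image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def difference(a: Image, b: Image) -> int:
    """Count the pixels where two images disagree (the area of a 'difference mask')."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def match_score(observed: Image, prototype: Image) -> float:
    """Best similarity in [0, 1] over the four 90-degree rotations of the prototype."""
    total = len(observed) * len(observed[0])
    best = 0.0
    candidate = prototype
    for _ in range(4):
        best = max(best, 1.0 - difference(observed, candidate) / total)
        candidate = rotate90(candidate)
    return best


def guess(
    observed: Image, prototypes: Dict[int, Image], top_n: int = 4
) -> Tuple[int, float, List[Tuple[float, int]]]:
    """Rank all prototypes; return (best label, confidence, top candidates).

    Confidence is the gap between the best and the next-best score,
    so a clear winner gives a large value and a near-tie gives a small one.
    """
    ranked = sorted(
        ((match_score(observed, proto), label) for label, proto in prototypes.items()),
        reverse=True,
    )
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    confidence = ranked[0][0] - runner_up
    return ranked[0][1], confidence, ranked[:top_n]
```

Using the margin between first and second place, rather than the raw best score, means a face that matches two prototypes almost equally well gets flagged as uncertain even if both scores are high.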
