That image with the rotated shapes is just a little graphic I'm using to test my rotation and normalization function -- it's not meant to be instructive of anything, just a fun visual.
With regard to identifying dice faces, there are many alternatives, and different techniques needed for different dice.
Some examples:
For D6 dice with pips, the whole thing is quite trivial -- you can just COUNT the circular pips of a reasonable size on the die face, and as long as the die is reasonably centered under the camera, the other faces don't interfere much.
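To make the pip-counting idea concrete, here is a minimal sketch in plain Python. It assumes the die face has already been thresholded into a binary grid (1 = dark pip pixel), and the names `count_pips`, `min_area` and `max_area` are just made up for illustration. It only filters blobs by size; a real version would probably also check that each blob is roughly circular.

```python
from collections import deque

def count_pips(mask, min_area=3, max_area=50):
    """Count 4-connected blobs of 1-cells whose area lies in [min_area, max_area].

    `mask` is a list of equal-length rows of 0/1 values (an assumed,
    already-thresholded image of the die face).
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one blob and measure its area in pixels.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Ignore specks of noise and anything too big to be a pip.
            if min_area <= area <= max_area:
                count += 1
    return count
```

The size limits are what throw out camera noise and large shadows, so they would need tuning to the camera distance and resolution.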
Now for dice with more sides and printed labels, things get considerably more complicated because other faces (in addition to the top face) are visible to the camera, so those have to be dealt with.
This is a case where die geometry can help a lot -- the more you can figure out about the geometry of the top face, the better you can isolate it from its neighbors. Unfortunately it's not an easy problem, and the further the die is off center the harder it gets. I can get pretty good results by identifying the CENTER of the die and then finding the graphic nearest that center.
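A rough sketch of the "graphic nearest the center" heuristic, under some assumptions: the candidate graphics have already been segmented into blobs (each one a list of `(y, x)` pixel coordinates), and the die center has been estimated somehow, for example as the centroid of the die's outline. The function names here are hypothetical.

```python
import math

def centroid(pixels):
    """Mean (y, x) position of a blob given as a list of (y, x) pixels."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)

def graphic_nearest_center(blobs, die_center):
    """Return the blob whose centroid is closest to the estimated die center."""
    def distance(blob):
        cy, cx = centroid(blob)
        return math.hypot(cy - die_center[0], cx - die_center[1])
    return min(blobs, key=distance)
```

This works best when the die is close to centered under the camera; the further off center it is, the more likely a neighboring face's label ends up nearest the estimated center.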
In addition, there is no simple way to identify which graphic is on the face, as there is when counting pip blobs.
In fact, the difference between a 6 and a 9 is often a tiny little dot.
However, for dice which are using numerical labels, you COULD train a learning algorithm (neural network, support-vector-machine, etc.) to identify numbers (from 1-20 let's say). You might have to train several models based on different fonts, etc. but it's doable.
Now there are other kinds of dice which have CUSTOM graphics on each face (not numerical labels); for those, training a learning algorithm to do classification on a new kind of die would be much more painful for an end user. It could be done, but it would be INCREDIBLY time consuming for both the person and the computer doing the initial training and hand LABELING.
The alternative with custom graphics is something like k-means clustering, where I tell the computer there are 6 custom die faces, and it can roll the die a few hundred times and try to separate all the top-face graphics it sees into 6 "classes". It doesn't know what the faces mean; it just has some metric of similarity that lets it group them so that the same face reliably ends up in the same group.
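For illustration, here is a tiny plain-Python k-means sketch. It assumes each captured face has already been turned into a fixed-length feature vector that does not change with rotation (that part is not shown), and it uses squared Euclidean distance as the similarity metric. The farthest-point initialization is my own choice to keep the sketch deterministic, not something k-means requires.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Group `points` into k clusters; returns one cluster label per point."""
    # Farthest-point initialization (an assumption made for determinism):
    # start from the first point, then keep adding the point that is
    # farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

The labels that come out are arbitrary group numbers; someone still has to say once which group corresponds to which face.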
What I'm playing with in the image you see above with the ROTATED versions is this latter idea -- extracting the foreground graphic and scoring it against previously seen graphics -- because this method is theoretically the most universal and requires no training.
So basically I try to identify the foreground graphic and normalize its size. Then, given any two captures, I score their bitwise similarity (considering rotations).
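Here is a simplified sketch of that normalize-then-score idea, not the actual implementation. It assumes the foreground graphic is already a non-empty binary grid, and it only tries the four quarter-turn rotations; a real die can land at any angle, so a full version would have to test many finer angles. All function names are hypothetical.

```python
def normalize(mask, size=8):
    """Crop to the foreground's bounding box, pad it to a square, and
    resample to size x size with nearest-neighbor sampling."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    top, left = rows[0], cols[0]
    height, width = rows[-1] - top + 1, cols[-1] - left + 1
    side = max(height, width)
    # Center the bounding box inside a side x side square so that the
    # aspect ratio survives the rescale.
    y0 = top - (side - height) // 2
    x0 = left - (side - width) // 2
    out = []
    for r in range(size):
        y = y0 + r * side // size
        out_row = []
        for c in range(size):
            x = x0 + c * side // size
            inside = 0 <= y < len(mask) and 0 <= x < len(mask[0])
            out_row.append(1 if inside and mask[y][x] else 0)
        out.append(out_row)
    return out

def rotate90(grid):
    """Rotate a grid a quarter turn clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def bit_similarity(a, b):
    """Fraction of cells where two same-sized binary grids agree."""
    total = len(a) * len(a[0])
    same = sum(x == y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return same / total

def best_similarity(a, b):
    """Best bitwise agreement between a and the four quarter turns of b."""
    best = 0.0
    candidate = b
    for _ in range(4):
        best = max(best, bit_similarity(a, candidate))
        candidate = rotate90(candidate)
    return best
```

Two captures of the same face should score close to 1.0 at some rotation, while different faces should stay noticeably lower at every rotation; where to put the cutoff is something to find by experiment.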