It's been some years since I was involved in this area, but your requirement sounds fairly typical.
My experience was limited to implementing applications that could:
- Scan a known (predefined) form layout.
- Capture handwritten and/or MICR characters on the form.
- OCR the characters captured.
- Output the data from the relevant fields on the form into a database.
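To make the steps above concrete, here is a minimal sketch in Python. It is not the system I worked on. The layout dictionary, the field names, the `forms` table and the `stub_ocr` function are all made up for illustration, and the "image" is just a list of text rows so the example runs without a scanner or an OCR engine. In a real application you would pass pixel data and swap `stub_ocr` for a call into whatever OCR engine you pick.

```python
import sqlite3

# Hypothetical layout for a known form: field name -> (top, left, bottom, right).
FORM_LAYOUT = {
    "account_number": (0, 0, 1, 8),
    "amount": (1, 0, 2, 6),
}


def crop(image, box):
    """Return the part of `image` (a list of rows) that lies inside `box`."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def stub_ocr(region):
    """Placeholder recognizer (assumption): the rows here are already text."""
    return "".join(region).strip()


def process_form(image, layout, recognize, conn):
    """Crop each predefined field, recognize it, and store one database row."""
    fields = {name: recognize(crop(image, box)) for name, box in layout.items()}
    conn.execute(
        "INSERT INTO forms (account_number, amount) "
        "VALUES (:account_number, :amount)",
        fields,
    )
    return fields
```

To try it, create an in-memory database with `sqlite3.connect(":memory:")`, create a `forms` table with `account_number` and `amount` columns, and call `process_form(image, FORM_LAYOUT, stub_ocr, conn)`. Keeping the layout as data rather than code means a new form type only needs a new dictionary.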
Accuracy was very important, because the data captured in the process was financial transaction data (it was for a bank).
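One common way to catch misread digits in this kind of data is a check-digit test on the recognized value. As an illustration only (an assumption on my part, not a description of what our system did), the sketch below checks the 3-7-1 weighted checksum used by 9-digit US ABA routing numbers, which appear on the MICR line of US cheques. The function name is hypothetical, and other countries and other fields use different schemes.

```python
def aba_checksum_ok(routing_number: str) -> bool:
    """Return True if a 9-digit string passes the ABA 3-7-1 checksum."""
    if len(routing_number) != 9:
        return False
    if not (routing_number.isascii() and routing_number.isdigit()):
        return False
    d = [int(ch) for ch in routing_number]
    # Weights 3, 7, 1 repeat across the nine digits.
    total = (
        3 * (d[0] + d[3] + d[6])
        + 7 * (d[1] + d[4] + d[7])
        + (d[2] + d[5] + d[8])
    )
    return total % 10 == 0
```

For example, `aba_checksum_ok("021000021")` returns True, while the same number with the last digit misread as 7 fails the check. A field that fails can be routed to a person for manual keying instead of going straight into the database.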
I think this is the sort of thing you are after. Voice output would probably be a secondary step, added after the above process.
I did a quick Google search for "OCR form reader", and at the head of the list was Recogniform Desktop Reader.
Hope this helps or is of use.