Speech to Text Software?

Renegade:
Tellme:

Cloud Platform

For enterprises, ISVs and service providers who want to build phone-based customer care using standards-based VoiceXML, use Tellme Studio with the Microsoft Tellme IVR Service.
--- End quote ---

Yep. Not for me.

Shades:
WAV to text (trial/USD149.95 single developer license) could be what you are looking for.

It certainly looks easy enough to implement, and it is supported in Visual C++, Visual Basic, Delphi, C++ Builder, .NET languages, Java, and scripting languages such as Perl, PHP, and Python.

You do need a separate SAPI5 speech recognizer though. Office 2003 is the last version of Office that shipped with one (but you can still download it from alternative sources). Windows 7 comes by default with one. And J-Mac is right, that one is quite decent.

joiwind:
No luck. The Cool C ReadWrite wouldn't run without a license, and the company is out of business it looks like. (Web site gone.)

@joiwind - The first link was a 404. Which did you mean there?
-Renegade (August 11, 2011, 05:13 PM)
--- End quote ---

Sorry Ren, try this and look for AARON and AIB. But I'm not at all sure that you can use WAVs, just a microphone.

Renegade:
No luck. The Cool C ReadWrite wouldn't run without a license, and the company is out of business it looks like. (Web site gone.)

@joiwind - The first link was a 404. Which did you mean there?
-Renegade (August 11, 2011, 05:13 PM)
--- End quote ---

Sorry Ren, try this and look for AARON and AIB. But I'm not at all sure that you can use WAVs, just a microphone.
-joiwind (August 14, 2011, 12:25 PM)
--- End quote ---

Downloading now...

On a side note, they use Google Docs for the download:

Sorry, we are unable to scan this file for viruses.

The file exceeds the maximum size that we scan. Download anyway
--- End quote ---

And that seems very odd. It's a glaring security hole. If you want to infect someone, simply upload a large file. Odd...

1NR1:
Well, what do you know, something I know about.

Here's how I do transcriptions from various sources. First off, Dragon is the best program for this, rant or rave, hands down. I've used them all since L&H, ViaVoice, etc. So unless you have a way to unlock the TV COP SHOW prop computer that the pretty girl types into and quickly finds the secrets of the universe and the perp's/unsub's pet's middle initial, you are going to have a learning curve. So I'll skip right to the shortcut.

First, this is for a recorded file from a speaker you have never done any "training" for. (I'll assume you've seen the TV commercial where the professor's lecture appears, miraculously quick and correct, as a text document on a computer 30 feet (10 meters) away, using only the microphone built into the computer's lid. If you haven't, no worries. It's the same computer as the TV COP SHOW prop computer that the pretty girl types into and quickly finds the secrets of the universe... OK, not going to happen.)

Now Curt (famously) is the only poster who got this right, but he was kind of "glossed over", and his suggestion about using Audition seemed to go nowhere.

Back to the shortcut. You need to process the file before you drag it into DragonPad. The quick way of putting it would be to say the best file should sound something like Donald Duck. No, seriously. Think about it. The fewer lows, mids, and harmonics there are, the better the speech engine will do. I am NOT going to get into the audio analysis of all this here.

Second, my experience is you are better off starting with a .wav file.  Basically cut the lows and mids so it sounds somewhat "tinny". Pull out the pops and hiss and other noise. Slow it to about 67%. Then drag that processed sound file into Dragon or whatever.  (You may need to convert to mp3.)
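For anyone who would rather script the pre-processing than do it by hand in an audio editor, here is a minimal sketch of the two main steps (cut the lows, slow to about 67%). Everything specific in it is an assumption for illustration, not something from the post above: the 1000 Hz cutoff, the simple first-order filter, and the function names are all made up. It only handles 16-bit mono WAV files, it does no noise or pop removal, and the naive slowdown also lowers the pitch, which a proper time-stretch tool would avoid.

```python
import math
import struct
import wave


def high_pass(samples, sample_rate, cutoff_hz=1000.0):
    """First-order RC high-pass filter: attenuates content below cutoff_hz.

    The 1000 Hz default is an assumed starting point, not a tested value.
    """
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def slow_down(samples, speed=0.67):
    """Stretch the signal by linear interpolation so it plays at `speed`
    times the original rate. Pitch drops along with the tempo."""
    if len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    out = []
    for j in range(int(len(samples) / speed)):
        pos = j * speed
        i = min(int(pos), last)
        frac = pos - int(pos)
        cur = samples[i]
        nxt = samples[min(i + 1, last)]
        out.append(cur + frac * (nxt - cur))
    return out


def process_wav(in_path, out_path, cutoff_hz=1000.0, speed=0.67):
    """Read a 16-bit mono WAV, filter it, slow it down, and write the result."""
    with wave.open(in_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono WAV only")
        rate = src.getframerate()
        raw = src.readframes(src.getnframes())
    samples = [s / 32768.0 for s in struct.unpack("<%dh" % (len(raw) // 2), raw)]
    processed = slow_down(high_pass(samples, rate, cutoff_hz), speed)
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in processed]
    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

Usage would be something like `process_wav("interview.wav", "interview_processed.wav")` (hypothetical file names), and then you drag the output file into the recognizer. Expect to tune the cutoff by ear until the voice sounds "tinny" but still intelligible.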

Here's some summary points:

The clearer, less noisy, and better articulated the speech on the original, the more accurate the text file will be.

Having a high quality microphone positioned properly is huge. (see above)

You can get around 80% accuracy. Remember most speech recognition programs spell the words correctly 100% of the time.  They just spell the wrong words.

About this 80%: it's not evenly distributed. You may get a complete paragraph with 98% accuracy, and then a sentence that is completely unintelligible.

If you can get the speaker to dictate a few paragraphs from Dragon's training files, with the mic placed at the same pick-up position, it will greatly help accuracy.

Some recorded situations just don't work.
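If you want to put a number on the accuracy points above, one rough way is to hand-correct a short stretch of the transcript and compare it to the raw output one sentence at a time, since a single overall figure hides the uneven spread. The sketch below uses word-level edit distance, which is a common way to score transcripts but is my choice here, not something the recognizer reports. The function names and sample sentences are made up for illustration.

```python
def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance between the corrected text
    (reference) and the recognizer output (hypothesis).

    Returns (number of word errors, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # word dropped
                           cur[j - 1] + 1,           # word inserted
                           prev[j - 1] + (r != h)))  # wrong word
        prev = cur
    return prev[-1], len(ref)


def accuracy(reference, hypothesis):
    """Fraction of reference words recognized correctly, floored at 0."""
    errors, n = word_errors(reference, hypothesis)
    if n == 0:
        return 1.0
    return max(0.0, 1.0 - errors / n)


# Made-up sentence pairs: (hand-corrected text, recognizer output)
pairs = [
    ("the quick brown fox jumps", "the quick brown box jumps"),
    ("recognize speech", "wreck a nice beach"),
]
for ref, hyp in pairs:
    print(round(accuracy(ref, hyp), 2), "|", hyp)
```

Scoring per sentence like this makes the "spells the wrong words correctly" problem visible: every word in the second pair is a real, correctly spelled word, and the sentence still scores zero.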

Hope this helps.  Cheers.

NR
Washington DC
