Microsoft Voice Recognition Demo ....


a_unique_person
31st July 2006, 01:51 AM
http://blogs.theage.com.au/barkersbyte/archives/2006/07/hullo_mum_er_au.html

richardm
31st July 2006, 05:57 AM
That reporter has clearly studied Kent Brockman carefully:


Microsoft should remember that the first part of "Recognition" is "Rec" - as in "Train Wreck"



The voice recognition stuff I've used seemed to work pretty well most of the time, although that pattern of words going wrong then things getting worse and worse is familiar.

LeCynthia
31st July 2006, 06:10 AM
I can't use voice recognition. When I see what appears on the screen as opposed to what I actually said I start laughing so hard I sometimes pass out.

Dark Jaguar
31st July 2006, 03:39 PM
Just recently I saw on "Discoveries this Week" a very interesting demonstration of a new voice recognition system. What I was never clear on though was if the system they were using was actually doing that in real time or if the demo was merely a scripted "conceptual demo".

At any rate, if they can do what they claim, it is huge. For the first time, there's a system that actually takes context into consideration. The demo was for a system that would consider what you were saying in context with not only the sentence you are saying, but also the entire current conversation and every conversation you've had with it in the past. If you said something like "Bach. Could you play something by him?" it would say, after a small delay, "now playing Bach playlist". Or if you said "I'd like you to play the song I just asked for again", it would understand that.

Apparently it involves a lot of special "learning algorithms" and massive data streaming. On top of all that, the voice simulation coming out of the machine actually sounded close to a real human voice. Unfortunately, I didn't catch any important names or links online I could provide you guys with. Also, again, I'm not sure if this was merely a "demo of concept" with scripted conversations or a demo of the real thing in action.

a_unique_person
31st July 2006, 07:28 PM
So it's not from Microsoft?

kevin
31st July 2006, 09:08 PM
That blog post has it wrong. It was a bug in the software that caused the microphone gain to go out of control. The programmer fesses up here:

http://blogs.msdn.com/larryosterman/archive/2006/07/31/684327.aspx

richardm
1st August 2006, 01:44 AM
"For the first time, there's a system that actually takes context into consideration. The demo was for a system that would actually consider what you were saying in context with not only the sentence you are saying, but in context with the entire past conversation as well as every conversation you've had in the past with it."

Actually I don't think that's as new as it seems, although it sounds like it's on a bigger scale than usual. Most voice recognition systems try to figure out what words you've spoken based on the other words they think they've got, which is why when a sentence starts to go wrong it often tends to get worse, as I've found in the past and as the Microsoft guy found in the clip.

NaturallySpeaking would "twig" what you'd said and sometimes go back and correct an entire sentence, which was impressive (when it worked). It was a bit offputting if you watched it carefully, though: you had to keep going and trust that it would get it right in the end.
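To make the two behaviours above concrete, here is a toy sketch (not how NaturallySpeaking or the Microsoft recogniser actually works, as far as this thread can tell). It assumes each spoken word yields a few candidate words with an "acoustic" score, and that a word-pair table scores how well neighbouring words fit together. Every word, score, and function name is invented for illustration. `greedy` commits to one word at a time, so an early mistake drags the next word wrong; `viterbi` scores whole sentences, which is one way a recogniser could "go back" and fix earlier words.

```python
# Toy illustration only: all words and log-scores are made up.
# Higher (less negative) scores are better.
CANDIDATES = [
    {"I": -1.0, "eye": -0.8},     # the wrong word sounds slightly better
    {"saw": -1.0, "sore": -0.9},
    {"the": -0.2},
    {"sea": -1.0, "see": -0.9},
]

# Hypothetical word-pair scores; unseen pairs get UNSEEN.
PAIRS = {
    ("<s>", "I"): -0.5, ("<s>", "eye"): -0.6,
    ("I", "saw"): -0.3, ("eye", "sore"): -0.5,
    ("saw", "the"): -0.3, ("sore", "the"): -2.0,
    ("the", "sea"): -0.5,
}
UNSEEN = -3.0


def pair_score(prev, word):
    return PAIRS.get((prev, word), UNSEEN)


def greedy(candidates):
    """Pick each word using only the previously chosen word; never revise."""
    prev, out = "<s>", []
    for options in candidates:
        word = max(options, key=lambda w: options[w] + pair_score(prev, w))
        out.append(word)
        prev = word
    return out


def viterbi(candidates):
    """Keep the best path ending in each candidate, then pick the best sentence."""
    best = {"<s>": (0.0, [])}  # last word -> (total score, words so far)
    for options in candidates:
        new_best = {}
        for word, acoustic in options.items():
            new_best[word] = max(
                (
                    (score + pair_score(prev, word) + acoustic, path + [word])
                    for prev, (score, path) in best.items()
                ),
                key=lambda item: item[0],
            )
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


print(" ".join(greedy(CANDIDATES)))   # eye sore the sea
print(" ".join(viterbi(CANDIDATES)))  # I saw the sea
```

With these made-up numbers the greedy pass picks "eye" first, which then makes "sore" look like the better second word. The whole-sentence pass ends up preferring "I saw" once the later words are taken into account.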

Meffy
1st August 2006, 08:16 AM
Agreed on the "nothing new." Years ago I tried Dragon for dictation and got superb recognition, even my torture test routine that includes difficult homophones used in correct contexts. Admittedly, I have an "announcer voice," which probably helps.

The software I write professionally employs speech recognition mostly for command and control, which is much easier to get right than free-form dictation.
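A rough sketch of why a small command set is more forgiving, assuming the recogniser has already produced a (possibly wrong) text guess. Real command-and-control systems usually constrain the recogniser itself rather than matching text afterwards, so treat this purely as an analogy. The command list and the `match_command` helper are hypothetical; `difflib.get_close_matches` is from the Python standard library.

```python
import difflib

# Hypothetical command vocabulary for a mail program.
COMMANDS = ["open mail", "close mail", "next message", "delete message"]


def match_command(heard, cutoff=0.6):
    """Snap a noisy transcription to the closest known command, or None."""
    matches = difflib.get_close_matches(heard, COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(match_command("open male"))  # open mail
print(match_command("xyzzy"))      # None
```

With only a handful of valid phrases, a slightly wrong guess still lands on the right command, whereas free-form dictation has no such short list to fall back on.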

Mongrel
1st August 2006, 08:56 AM
Where I used to work, we had one of the IBM guys down to demo their software. His advice was to 'train' the software and use it, then retrain it after a month or two, depending on use, because a human being learnt how to speak for the software better than the software learnt how a human being could speak.