December 2, 2007

So what’s the state of speech recognition and dictation software?

Linda asked me about the state of desktop dictation technology. In particular, she asked me whether there was much difference between the latest version and earlier, cheaper ones. My knowledge of the area is out of date, so I thought I’d throw both the specific question and the broader subject of speech recognition out there for general discussion.

Here’s much of what I know or believe about speech recognition:

Any thoughts? In particular, which version of Dragon NaturallySpeaking, or which competing product, should Linda use, and why?


Comments

12 Responses to “So what’s the state of speech recognition and dictation software?”

  1. Bill Burke on December 3rd, 2007 4:34 am

    Hi Curt,

    Speech recognition in general is gaining ground as a ubiquitous technology almost daily.

    And Windows Vista offers dictation and command-and-control capabilities that were previously unheard of!

    Here’s an article that may help you get a better picture:

    http://wirelessspeech.blogspot.com/2007/12/speech-recognition-top-10-flop-says.html

    Bill Burke
    http://wirelessspeech.blogspot.com

  2. Steve Hochschild on December 3rd, 2007 11:21 am

    If you have Vista you don’t need to get Dragon. Just go to the accessibility menu and turn on the speech recognition that is included in the OS. It is very good, and is both free and immediate, just a few clicks and a training session away…

    Thanks!
    steveh

  3. Curt Monash on December 3rd, 2007 3:53 pm

    Thanks, Steve.

    I’ve chickened out and haven’t run Vista so far, despite Microsoft’s blandishments.

    CAM

  4. Martin Griffies on December 3rd, 2007 5:23 pm

    The MS product is also downloadable for XP / Word 2003, or you may already have it.
    It’s only available for US English, Chinese, and Japanese.

  5. Linda Barlow on December 4th, 2007 1:02 am

    Unfortunately, I don’t have Vista.

    I used the Word program a few years ago, and found it pretty annoying. Despite long hours of “training,” the errors when I dictated were legion. I know writers who use Dragon, and love it, but the version they’re using is several years old. Does anyone know if Dragon has a recent update?

    Thanks.

    –Linda

  6. Curt Monash on December 4th, 2007 1:21 am

    Per Wikipedia, Dragon NaturallySpeaking 9 came out in mid-2006, and doesn’t require training. Does anybody know whether there are other significant enhancements in Version 9? And is the no-training claim really true?

    CAM

  7. Stephen L Reed on December 20th, 2007 2:02 pm

    Hi. I’ve used the open source Java research software Sphinx-4, which performs automatic speech recognition. I get an error rate of about 5% to 10% on my large-vocabulary evaluations. It does not have a facility for training. And it’s not really an end-user product, but it can be incorporated easily into Java applications.

    See: http://cmusphinx.sourceforge.net/html/cmusphinx.php

    -Steve

    Stephen L. Reed

    Artificial Intelligence Researcher
    http://texai.org/blog
    http://texai.org
    3008 Oak Crest Ave.
    Austin, Texas, USA 78704
    512.791.7860

  8. Nicholas Bedworth, CTO, DigitalDirect Development Corporation on May 15th, 2008 10:00 pm

    During the past few days, I’ve been increasing my usage of SR in Vista, and the results are encouraging. A boom microphone is essential (a Bluetooth-connected earset won’t work) and a reasonably quiet environment is needed (loud noises from outside such as bird songs (!) don’t help).

    Anyway, because the Vista SR is part of the OS, it seems to have knowledge of all the special words, names, etc., in one’s documents, contacts, and so forth. This radically reduces training. For ordinary conversation recognition, it does very well, and it sure beats typing. If it misidentifies a word, the correct word is almost always found on the pop-up menu of alternates.

    Not perfect, but impressive. Training is quite short, perhaps 15 minutes. And it also gives good control over the desktop, once again, not perfectly, but it’s a whole lot better than typing.

  9. yasir on October 6th, 2008 7:49 am

    I think Sphinx is the only open source speech recognition program that works with Asterisk. I’ve never had any luck integrating Sphinx 3 or 4 with Asterisk. However, I tried Sphinx2 with Asterisk and it worked well for me on Asterisk 1.2.x. I followed these steps:

    http://www.syednetworks.com/asterisk-integration-with-sphinx-voice-recognition-system

    If version 3 or 4 works for anyone, please share. Thanks.

  10. MEN ARE FROM EARTH, COMPUTERS ARE FROM VULCAN | Text Technologies on May 30th, 2009 2:15 am

    [...] My December, 2007 survey of speech recognition technology [...]

  11. Bottleneck Whack-A-Mole | DBMS2 -- DataBase Management System Services on August 21st, 2009 3:05 am

    [...] it goes even further. For example, I was told by a guy who is now a senior researcher at Attivio: “How do you make a good speech recognition [...]

  12. Lorne Babcock Sr. on March 1st, 2013 9:28 pm

    I have been using Dragon Dictation since its very inception. In fact, I am using Dragon Dictation 11.5 to dictate this note. I have never been happy with this or any similar program.

    We are, at best, but a single step along a very long road toward real speech recognition.

    I am a writer, or at least I lie to myself when I say that, and without Dragon Dictation I would be completely lost.

    The problem is that Dragon makes at least one error in every line of text. If I am dictating a paragraph in a novel and Dragon makes a mistake or indeed several mistakes in that paragraph and I return to make corrections, I frequently am at a loss as to the exact wording of the paragraph. That’s my fault, not the fault of Dragon Dictation.

    Nuance naturally advocates for its software, suggesting that it is 99% accurate, when in fact nothing could be further from the truth. I would suggest, from my personal experience, that the software is at best 70% accurate.

    Windows 7 has speech recognition built-in. If you have any hair and you plan to keep your hair then do not make any effort to use Microsoft’s speech recognition. I have no hair because I tore out what little I had in handfuls trying to get Microsoft’s speech recognition to work in any meaningful way.

    I look forward to the day when we will have real speech recognition but alas, I suspect I will be little but a memory on a tombstone long before that happens.
