Speech recognition

Analysis of technologies that recognize and/or respond directly to voice and speech.

November 25, 2012

The future of search

I believe there are two ways search will improve significantly in the future. First, since talking is easier than typing, speech recognition will allow longer and more accurate input strings. Second, search will be informed by much more persistent user information, with search companies having very detailed understanding of searchers. Based on that, I expect:

My reasoning starts from several observations:

In principle, there are two main ways to make search better:

The latter, I think, is where significant future improvement will be found.

Read more

May 30, 2009

MEN ARE FROM EARTH, COMPUTERS ARE FROM VULCAN

The newsletter/column excerpted below was originally published in 1998. Some of the specific references are obviously very dated. But the general points about the requirements for successful natural language computer interfaces still hold true. Less progress has been made in the intervening decade-plus than I would have hoped, but some recent efforts — especially in the area of search-over-business-intelligence — are at least mildly encouraging. Emphasis added.

Natural language computer interfaces were introduced commercially about 15 years ago*. They failed miserably.

*I.e., the early 1980s

For example, Artificial Intelligence Corporation’s Intellect was a natural language DBMS query/reporting/charting tool. It was actually a pretty good product. But it’s infamous among industry insiders as the product for which IBM, in one of its first software licensing deals, got about 1700 trial installations — and less than a 1% sales close rate. Even its successor, Linguistic Technologies’ English Wizard*, doesn’t seem to be attracting many customers, despite consistently good product reviews.

*These days (i.e., in 2009) it’s owned by Progress and called EasyAsk. It still doesn’t seem to be selling well.
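To make concrete what such tools did, here is a minimal sketch of the kind of English-to-SQL mapping a product like Intellect or English Wizard performed. The schema, patterns, and code are illustrative assumptions of mine, not anything taken from the actual products, which used full lexicons and parsers.

```python
# Hypothetical sketch of natural-language-to-SQL mapping, the core job of
# tools like Intellect or English Wizard. Schema and patterns are invented
# for illustration only.
import re

SCHEMA = {"sales": ["region", "product", "revenue", "year"]}  # assumed schema

def english_to_sql(question: str) -> str:
    """Translate a simple request like 'show revenue by region for 1997'."""
    q = question.lower()
    # Look for a "by <column>" grouping phrase and a four-digit year.
    group = next((col for col in SCHEMA["sales"] if f"by {col}" in q), None)
    year = re.search(r"\b(19|20)\d{2}\b", q)

    select = (f"{group}, " if group else "") + "SUM(revenue)"
    sql = f"SELECT {select} FROM sales"
    if year:
        sql += f" WHERE year = {year.group(0)}"
    if group:
        sql += f" GROUP BY {group}"
    return sql

print(english_to_sql("Show me revenue by region for 1997"))
# -> SELECT region, SUM(revenue) FROM sales WHERE year = 1997 GROUP BY region
```

Even this toy version hints at the failure mode discussed below: once you know the schema well enough to phrase the question, a form or menu gets you to the same query just about as fast.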

Another example was HAL, the natural language command interface to 1-2-3. HAL is the product that first made Bill Gross (subsequently the founder of Knowledge Adventure and idealab!) and his brother Larry famous. However, it achieved no success*, and was quickly dropped from Lotus’ product line.

*I loved the product personally. But I was sadly alone.

In retrospect, it’s obvious why natural language interfaces failed. First of all, they offered little advantage over the forms-and-menus paradigm that dominated enterprise computing in both the online-character-based and client-server-GUI eras. If you couldn’t meet an application need with forms and menus, you couldn’t meet it with natural language either.

Read more

November 11, 2008

Lukewarm review of Yahoo mobile search

Stephen Shankland reviewed Yahoo’s mobile voice search, which works by taking voice input and returning results onscreen (in his case on his BlackBerry Pearl). He found:

No big surprises there. šŸ˜€

July 7, 2008

TechCrunchIT rants against voice recognition

TechCrunchIT ranted yesterday against voice recognition. Parts of the argument have validity, but I think it was overstated overall.

Key points included:

1. Microsoft and Bill Gates have been overoptimistic about voice recognition.

2. Who needs voice when you have keyboards big and small?

3. The office environment is too noisy for voice recognition to work.

Read more

June 19, 2008

3 specialized markets for text analytics

In the previous post, I offered a list of eight linguistics-based market segments, and a slide deck surveying them. And I promised a series of follow-up posts based on the slides.

Read more

June 19, 2008

The Text Analytics Marketplace: Competitive landscape and trends

As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called ā€œinformation retrievalā€:

1. Web search

2. Public-facing site search

3. Enterprise search and knowledge management

4. Custom publishing

5. Text mining and extraction

Three are more standalone:

6. Spam filtering

7. Voice recognition

8. Machine translation

Read more

January 17, 2008

Dr. Doolittle in silicon

The Reg passes along a Reuters story that Hungarian scientists have built a system to automatically understand canine vocalizations. I’d like to say it’s a woof-to-Magyar translator, but apparently all it does is recognize the doggies’ emotional states. The story reports that the system has 43% accuracy, vs. 40% for humans.

I must confess, however, to being somewhat puzzled about how they measure success. Does the pooch fill out a survey form afterwards? Do they conclude that the beast wasn’t angry if the experimenter doesn’t get bitten?

I need to know a bit more about the research protocol before I know what to think about this.

EDIT: The CBC has a little more detail. The underlying research paper is appearing in Animal Cognition.
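For what it’s worth, the general shape of such a system — per-bark acoustic features, a classifier trained against human-assigned emotion labels, accuracy measured on held-out recordings — can be sketched as below. This is a generic illustration with synthetic data, not the Hungarian team’s actual method, and it underlines the measurement question: “accuracy” here just means agreement with whatever labels the experimenters assigned.

```python
# Generic sketch of a bark-emotion classifier of the sort described.
# Synthetic stand-in data; not the published method.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

EMOTIONS = ["aggressive", "fearful", "playful", "despairing", "happy"]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 13))             # stand-in for per-bark acoustic features
y = rng.integers(len(EMOTIONS), size=500)  # stand-in for experimenter-assigned labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# "Accuracy" means agreement with the assigned labels -- exactly the
# protocol question raised above about how success is measured.
print(f"held-out accuracy: {accuracy_score(y_test, clf.predict(X_test)):.0%}")
```

Note that with five emotion categories, guessing at random gets you roughly 20%, which is the baseline against which the reported 43% (and the humans’ 40%) should be read.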

December 2, 2007

So what’s the state of speech recognition and dictation software?

Linda asked me about the state of desktop dictation technology. In particular, she asked me whether there was much difference between the latest version and earlier, cheaper ones. My knowledge of the area is out of date, so I thought I’d throw both the specific question and the broader subject of speech recognition out there for general discussion.

Here’s much of what I know or believe about speech recognition:

November 30, 2007

NEC simplifies the voice translation problem

NEC announced research-level technology that lets a cellphone automatically translate from Japanese into English. The key idea is that they are generating text output, not speech, which lets them sidestep pesky problems about accuracy. I.e. (emphasis mine):

One second after the phone hears speech in Japanese, the cellphone with the new technology shows the text on the screen. One second later, an English version appears. …

“We would need to study how to recognise [sic] voices on the phone precisely. Another problem would be how the person on the other side of the line could know if his or her words are being translated correctly,” he said.
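To make the “sidestep” concrete: the pipeline as described is recognize-then-translate, with text (never synthesized speech) shown at each stage. A minimal sketch follows; recognize_japanese() and translate_ja_to_en() are my hypothetical placeholders, not NEC’s APIs.

```python
# Sketch of the staged pipeline described: recognize speech to Japanese text,
# show it, then translate to English text and show that too. No speech output.
import time

def recognize_japanese(audio: bytes) -> str:
    """Stand-in ASR: audio in, Japanese text out."""
    return "ć“ć‚“ć«ć”ćÆ"

def translate_ja_to_en(text: str) -> str:
    """Stand-in machine translation: Japanese text in, English text out."""
    return "Hello"

def handle_utterance(audio: bytes) -> None:
    ja_text = recognize_japanese(audio)   # NEC: text appears ~1 second after speech
    print(f"[screen] {ja_text}")
    time.sleep(1)                         # NEC: English version ~1 second later
    en_text = translate_ja_to_en(ja_text)
    print(f"[screen] {en_text}")          # the user reads text, hears nothing

handle_utterance(b"")
```

Displaying text rather than speaking it keeps recognition and translation errors visible and correctable, which is presumably why it lets them sidestep the accuracy problem.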

Read more

July 16, 2007

Progress EasyAsk

I dropped by Progress a couple of weeks ago for back-to-back briefings on Apama and EasyAsk. EasyAsk is Larry Harris’ second try at natural language query, after the Intellect product fell by the wayside at Trinzic, the company Artificial Intelligence Corporation grew into.* After a friendly divorce from the company he founded, if my memory is correct, Larry was able to build EasyAsk very directly on top of the Intellect intellectual property.

*Other company or product names in the mix at various times include AI Corp and English Wizard. Not inappropriately, it seems that Larry has quite an affinity for synonyms …

EasyAsk is still a small business. The bulk of it is still in enterprise query, but new activity is concentrated on e-commerce applications. While Larry thinks that they’ve solved most of the other technical problems that have bedeviled him over the past three decades, the system still takes too long to implement.

Read more
