Language recognition

Analysis of technologies that recognize and/or respond directly to spoken or written human languages.

November 25, 2012

The future of search

I believe there are two ways search will improve significantly in the future. First, since talking is easier than typing, speech recognition will allow longer and more accurate input strings. Second, search will be informed by much more persistent user information, with search companies having very detailed understanding of searchers. Based on that, I expect:

My reasoning starts from several observations:

In principle, there are two main ways to make search better:

The latter, I think, is where significant future improvement will be found.


May 30, 2009


The newsletter/column excerpted below was originally published in 1998.  Some of the specific references are obviously very dated.  But the general points about the requirements for successful natural language computer interfaces still hold true.  Less progress has been made in the intervening decade-plus than I would have hoped, but some recent efforts — especially in the area of search-over-business-intelligence — are at least mildly encouraging.  Emphasis added.

Natural language computer interfaces were introduced commercially about 15 years ago*.  They failed miserably.

*I.e., the early 1980s

For example, Artificial Intelligence Corporation’s Intellect was a natural language DBMS query/reporting/charting tool.  It was actually a pretty good product.  But it’s infamous among industry insiders as the product for which IBM, in one of its first software licensing deals, got about 1700 trial installations — and less than a 1% sales close rate.  Even its successor, Linguistic Technologies’ English Wizard*, doesn’t seem to be attracting many customers, despite consistently good product reviews.

*These days (i.e., in 2009) it’s owned by Progress and called EasyAsk. It still doesn’t seem to be selling well.

Another example was HAL, the natural language command interface to 1-2-3.  HAL is the product that first made Bill Gross (subsequently the founder of Knowledge Adventure and idealab!) and his brother Larry famous.  However, it achieved no success*, and was quickly dropped from Lotus’ product line.

*I loved the product personally. But I was sadly alone.

In retrospect, it’s obvious why natural language interfaces failed. First of all, they offered little advantage over the forms-and-menus paradigm that dominated enterprise computing in both the online-character-based and client-server-GUI eras.  If you couldn’t meet an application need with forms and menus, you couldn’t meet it with natural language either.

May 29, 2009

Google Wave — finally a Microsoft killer?

Google held a superbly received preview of a new technology called Google Wave, which promises to “reinvent communication.” In simplest terms, Google Wave is a software platform that:

If this all works out, Google Wave could play merry hell with Microsoft Outlook, Microsoft Word, Microsoft Exchange, Microsoft SharePoint, and more.

I suspect it will.

And by the way, there’s a cool “natural language” angle as well.

November 11, 2008

Lukewarm review of Yahoo mobile search

Stephen Shankland reviewed Yahoo’s mobile voice search, which works by taking voice input and returning results onscreen (in his case on his BlackBerry Pearl). He found:

No big surprises there. 😀

October 10, 2008

More on Languageware

Marie Wallace of IBM wrote back in response to my post on Languageware. In particular, it seems I got the Languageware/UIMA relationship wrong. Marie’s email was long and thoughtful enough that, rather than just pointing her at the comment thread, I asked for permission to repost it. Here goes:

Thanks for your mention of LanguageWare on your blog, albeit a skeptical one :-) I totally understand your skepticism as there is so much talk about text analytics these days and everyone believes they have solved the problem. I guess I can only hope that our approach will indeed prove to be different and offer some new and interesting perspectives.

The key differentiation in our approach is that we have completely decoupled the language model from the code that runs the analysis. This has been generalized to a set of data-driven algorithms that apply across many languages so that you can have an approach that makes the solution hugely and rapidly customizable (without having to change code). It is this flexibility that we believe is core to realizing multi-lingual and multi-domain text analysis applications in a real-world scenario. This customization environment is available for download from Alphaworks, and we would love to get feedback from your community.

On your point about performance, we actually consider UIMA one of our greatest performance optimizations and core to our design. The point about one-pass is that we never go back over the same piece of text twice at the same “level” and take a very careful approach when defining our UIMA Annotators. Certain layers of language processing just don’t make sense to split up due to their interconnectedness and therefore we create our UIMA annotators according to where they sit in the overall processing layers. That’s the key point.

Anyway, those are my thoughts, and thanks again for the mention. It’s really great to see these topics being discussed in an open and challenging forum.

October 7, 2008

Languageware — IBM takes another try at natural language processing

Marie Wallace of IBM wrote in from Ireland to call my attention to Languageware, IBM’s latest try at natural language processing (NLP). Obviously, IBM has been down this road multiple times before, from ViaVoice (dictation software that was beaten out by Dragon NaturallySpeaking) to Penelope (research project that seemingly went on for as long as Odysseus was away from Ithaca; rumor has it that the principals eventually decamped to Microsoft, and continued to not produce commercial technology there).

July 10, 2008

Chatbot game — Digg meets Eliza?

I forget how I got the URL, but something called the Chatbot Game purports to be a combination of Eliza and Digg. That is, it’s a chatbot with a lot of rules; anybody can submit rules; rules are voted up and down.

I don’t think I’ll want to play with it for a while (I’m heading off on vacation), so I thought I’d post it here to see if anybody else has any thoughts about it or familiarity with it.
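
The mechanism described above (a chatbot driven by user-submitted rules that are voted up and down) can be sketched roughly as follows. This is only an illustration of the idea, not the Chatbot Game's actual implementation, which is not described here. The names `Rule`, `VotedChatbot`, `submit`, `vote`, and `respond`, the use of regular expressions as the rule format, and the "highest-voted match wins" policy are all assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """A hypothetical community-submitted rule: a regex and a reply template."""
    pattern: str
    reply: str
    votes: int = 0


class VotedChatbot:
    """Eliza-style pattern matching, with Digg-style voting to rank the rules."""

    def __init__(self):
        self.rules = []

    def submit(self, pattern, reply):
        # Anybody can add a rule; it starts with zero votes.
        rule = Rule(pattern, reply)
        self.rules.append(rule)
        return rule

    def vote(self, rule, delta):
        # delta is +1 for an up-vote, -1 for a down-vote.
        rule.votes += delta

    def respond(self, text):
        # Try rules from highest to lowest vote count; the first match wins.
        for rule in sorted(self.rules, key=lambda r: r.votes, reverse=True):
            match = re.search(rule.pattern, text, re.IGNORECASE)
            if match:
                # Fill back-references such as \1 in the reply template.
                return match.expand(rule.reply)
        return "Tell me more."


if __name__ == "__main__":
    bot = VotedChatbot()
    specific = bot.submit(r"i feel (.+)", r"Why do you feel \1?")
    generic = bot.submit(r"i feel", "Feelings are overrated.")
    bot.vote(specific, +1)
    print(bot.respond("I feel tired"))
```

With the votes as shown, the more specific rule outranks the generic one. If the community later votes the generic rule higher, the same input produces the generic reply, which is the point of ranking by votes.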


July 7, 2008

TechCrunchIT rants against voice recognition

TechCrunchIT ranted yesterday against voice recognition. Parts of the argument have validity, but I think the overall argument was overstated.

Key points included:

1. Microsoft and Bill Gates have been overoptimistic about voice recognition.

2. Who needs voice when you have keyboards big and small?

3. The office environment is too noisy for voice recognition to work.


June 19, 2008

3 specialized markets for text analytics

In the previous post, I offered a list of eight linguistics-based market segments, and a slide deck surveying them. And I promised a series of follow-up posts based on the slides.

June 19, 2008

The Text Analytics Marketplace: Competitive landscape and trends

As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”:

1. Web search

2. Public-facing site search

3. Enterprise search and knowledge management

4. Custom publishing

5. Text mining and extraction

Three are more standalone:

6. Spam filtering

7. Voice recognition

8. Machine translation

