IBM and UIMA – Text Technologies

MEN ARE FROM EARTH, COMPUTERS ARE FROM VULCAN

Curt Monash — Sat, 30 May 2009 06:15:44 +0000

The newsletter/column excerpted below was originally published in 1998. Some of the specific references are obviously very dated. But the general points about the requirements for successful natural language computer interfaces still hold true. Less progress has been made in the intervening decade-plus than I would have hoped, but some recent efforts — especially in the area of search-over-business-intelligence — are at least mildly encouraging. Emphasis added.

Natural language computer interfaces were introduced commercially about 15 years ago*. They failed miserably.

*I.e., the early 1980s

For example, Artificial Intelligence Corporation’s Intellect was a natural language DBMS query/reporting/charting tool. It was actually a pretty good product. But it’s infamous among industry insiders as the product for which IBM, in one of its first software licensing deals, got about 1700 trial installations — and less than a 1% sales close rate. Even its successor, Linguistic Technologies’ English Wizard*, doesn’t seem to be attracting many customers, despite consistently good product reviews.

*These days (i.e., in 2009) it’s owned by Progress and called EasyAsk. It still doesn’t seem to be selling well.

Another example was HAL, the natural language command interface to 1-2-3. HAL is the product that first made Bill Gross (subsequently the founder of Knowledge Adventure and idealab!) and his brother Larry famous. However, it achieved no success*, and was quickly dropped from Lotus’ product line.

*I loved the product personally. But I was sadly alone.

In retrospect, it’s obvious why natural language interfaces failed. First of all, they offered little advantage over the forms-and-menus paradigm that dominated enterprise computing in both the online-character-based and client-server-GUI eras. If you couldn’t meet an application need with forms and menus, you couldn’t meet it with natural language either.

Even worse, NL actually had a couple of clear disadvantages versus traditional interfaces. First of all, it required (ick!) typing, often more typing than the forms and menus did. Second, forms and menus tell the user exactly what he can do. Natural language, however, lets him give orders the computer doesn’t know how to follow. This is inefficient, not to mention frustrating.

However, even in 1983, it was obvious that the typing objection would go away some day, because of speech recognition — once desktop computers reached 100 MIPs or so. (Effective keyboard-replacement speech recognition — as opposed to true natural language understanding — is mainly a matter of processing power.) 15 years later, standard PCs exceed 100 MIPs (assuming that 1 MIPs = a couple of megahertz for these purposes), and speech recognition is indeed getting practical.

In fact, as has become increasingly evident recently, speech recognition is now a hot technology. Bill Gates has been talking it up for a couple of years. Increasingly, the press has swung to believing him … And my parents just bought a PC with two speech recognition products on it.

That said, speech recognition is as misunderstood (no pun intended) as most artificial intelligence technologies. Yes, it beats typing, in a number of circumstances:

On the telephone (duh!)
“Busy hands” and/or “busy eyes” applications and locales (doctors’ offices, trading floors, warehouses, etc. — and, some day in the future, your kitchen and car)
People simply reluctant to type (e.g., anybody with sufficient wrist or back problems, and many males over the age of 45)

But before our computers talk back and forth with us in the voice of Majel Barrett Roddenberry, applications are going to have to add several important elements required for truly functional natural-language interfaces:

Intuitively clear names for everything on (or just behind) the screen
Application-specific disambiguation logic

For most practical purposes, the latter requirement equates to

A new generation of document selection technology

THE RULE OF NAMES

According to legend, knowing something’s name gives you power over it. When that “something” is a button or menu choice on a speech-enabled computer, the legend is literally true. But when a feature doesn’t have an obvious name, you can’t easily invoke it.

When applications consisted mainly of forms and menus, this was rarely a problem. Everything had a clear role and label. But web pages are less organized. Hyperlinks can be scattered all over the place, with little rhyme or reason.

Frankly, I don’t think this is a hard problem to solve. It wouldn’t take a lot of XML to divide the page into clear regions, so that commands like “Show me article #3” (on a search results list) could be interpreted in the obvious way. But it does take at least some discipline; random web pages will not necessarily be easy to “talk” to.
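To make the point about regions concrete, here is a minimal sketch in Python. The `<page>`, `<region>`, and `<item>` element names, the sample links, and the `resolve` function are all invented for illustration; no real markup standard is implied. The idea is only that once a page declares named regions, a command that names a region and a position can be resolved mechanically.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup: element names and hrefs are assumptions for this sketch.
PAGE = """
<page>
  <region name="navigation">
    <item label="home" href="/"/>
    <item label="help" href="/help"/>
  </region>
  <region name="article">
    <item label="Speech recognition on the desktop" href="/a/101"/>
    <item label="Natural language query tools" href="/a/102"/>
    <item label="Text search makes a comeback" href="/a/103"/>
  </region>
</page>
"""

def resolve(command, page_xml=PAGE):
    """Map a command like 'Show me article #3' to a link target, or None."""
    # Look for "<region name> <optional #><number>" anywhere in the command.
    match = re.search(r"(\w+)\s+#?(\d+)", command.lower())
    if match is None:
        return None
    region_name, position = match.group(1), int(match.group(2))
    root = ET.fromstring(page_xml)
    for region in root.findall("region"):
        if region.get("name") == region_name:
            items = region.findall("item")
            if 1 <= position <= len(items):
                return items[position - 1].get("href")
    return None

print(resolve("Show me article #3"))
```

A page without such regions gives the resolver nothing to hang a name on, which is the discipline problem described above.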

CYBERNETIC LISTENING SKILLS

The bigger challenge is to make sure that the application can respond in some useful way, no matter what command it’s given. This is even more difficult than it was 15 years ago, because of the radical increase in “casual” computer usage. In the old days, we could assume the user had some clear business reason for using the application, and if necessary that s/he had time to be trained (even if people rarely sat still for as much training as they really needed). Therefore, we could assume that the users had at least a general idea of what the application did, and hence of which commands the computer could obey. From an NL standpoint, we could assume that what they actually “said” (which in those days meant “typed”) was at least reasonably close to what they were “supposed” to say.

Now, however, some of the most important applications are internet e-commerce and portals, competing and begging for the user’s attention. The user is there strictly on a voluntary basis, and if he doesn’t get immediate gratification, he’s gone, history, hasta la bye-bye. Site-specific training isn’t even a consideration. And even if somebody did actually take a class on “How to use Excite,” the knowledge would be obsolete in six months. So applications, if they are to have natural language interfaces that please and respond to users, have to be able to respond pretty much to any command.

Ideally, voice-enabled systems would be like the computers on Star Trek, which can return information from vast archives, brew a pot of Earl Grey tea, play three parts of a quartet, create self-aware life forms, or answer questions like “Computer, what is the nature of the universe?” More realistically, they should be able, for example, to respond to a command like “Tell me about flights to Miami” by automatically giving the user a travel-reservation application or web page, and entering Miami in the appropriate form field.

If one thinks about the complications in such a system, it becomes clear that there are only two possible ways an application system can be designed to respond meaningfully to an enormous range of reasonable possible requests.

1. It can do the equivalent of saying “I’m sorry, I didn’t understand that,” “I’m sorry, I can’t do that,” and so on.

2. It can interpret many commands as text-search strings, and return appropriate results.

The first strategy — application-specific disambiguation logic, clear responses to “errors,” etc. — is absolutely necessary. No software is perfectly intelligent; the user will have to be asked for disambiguation help from time to time (just as clerks today ask customers to repeat their requests!). I’m not going to go into much detail about how that works because, frankly, it’s a tricky thing to get right. Users hate unnecessary disambiguation steps. They also hate the incorrect responses that result from ambiguity, and do tolerate being asked for help when it’s truly needed. In short, whatever you build the first time around will probably be wrong. So build something fast; then run, don’t walk, to the nearest usability lab, find out how you screwed up, and redo your system until you get it right.

I’m convinced that the second strategy — heavy reliance on text search technology — is a requirement as well. Just try to name a major web site that doesn’t use text search. True, text search has gotten a bad rap recently, mainly because a whole generation of search engines didn’t really work. But it will stage a comeback.
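The two strategies can be combined in one dispatcher. What follows is a rough Python sketch under loud assumptions: the command table, the document list, the stopword set, and the `respond` function are all hypothetical, and the matching is naive word overlap rather than real language understanding. It runs a command when exactly one matches, asks for help when several might, and otherwise treats the utterance as a text-search string.

```python
# Hypothetical command table and document set, invented for illustration.
COMMANDS = {
    "book flight": "open travel-reservation form",
    "book hotel": "open hotel-reservation form",
    "check balance": "open account summary",
}

DOCUMENTS = [
    "Cheap flights to Miami this winter",
    "Miami hotel guide",
    "How to read your account statement",
]

# Words ignored when an utterance is treated as a search string (an assumption).
STOPWORDS = {"a", "about", "me", "tell", "the", "to"}

def respond(utterance):
    words = set(utterance.lower().split())
    # Strategy 1: a command whose words all appear in the utterance.
    exact = [name for name in COMMANDS if set(name.split()) <= words]
    if len(exact) == 1:
        return ("run", COMMANDS[exact[0]])
    # Still strategy 1: several commands partly match, so ask the user.
    partial = sorted(name for name in COMMANDS if set(name.split()) & words)
    if len(partial) > 1:
        return ("ask", "Did you mean " + " or ".join(partial) + "?")
    # Strategy 2: fall back to treating the utterance as a text search.
    terms = words - STOPWORDS
    hits = [doc for doc in DOCUMENTS if terms & set(doc.lower().split())]
    if hits:
        return ("search", hits)
    return ("sorry", "I'm sorry, I didn't understand that.")

print(respond("Tell me about flights to Miami"))
```

Where the thresholds sit (when to ask, when to search) is exactly the part that a usability lab, not a sketch, has to settle.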

More on Languageware

Curt Monash — Fri, 10 Oct 2008 10:38:29 +0000

Marie Wallace of IBM wrote back in response to my post on Languageware. In particular, it seems I got the Languageware/UIMA relationship wrong. Marie’s email was long and thoughtful enough that, rather than just pointing her at the comment thread, I asked for permission to repost it. Here goes:

Thanks for your mention to LanguageWare on your blog, albeit a skeptical one. I totally understand your scepticism as there is so much talk about text analytics these days and everyone believes they have solved the problem. I guess I can only hope that our approach will indeed prove to be different and offers some new and interesting perspectives.

The key differentiation in our approach is that we have completely decoupled the language model from the code that runs the analysis. This has been generalized to a set of data-driven algorithms that apply across many languages so that you can have an approach that makes the solution hugely and rapidly customizable (without having to change code). It is this flexibility that we believe is core to realizing multi-lingual and multi-domain text analysis applications in a real-world scenario. This customization environment is available for download from Alphaworks, http://www.alphaworks.ibm.com/tech/lrw, and we would love to get feedback from your community.

On your point about performance, we actually consider UIMA one of our greatest performance optimizations and core to our design. The point about one-pass is that we never go back over the same piece of text twice at the same “level” and take a very careful approach when defining our UIMA Annotators. Certain layers of language processing just don’t make sense to split up due to their interconnectedness and therefore we create our UIMA annotators according to where they sit in the overall processing layers. That’s the key point.

Anyway those are my thoughts, and thanks again for the mention. It’s really great to see these topics being discussed in an open and challenging forum.
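To picture the layering idea, here is a minimal sketch of my own in Python. It is not IBM's code and does not use the UIMA API; the `Document` class, the two layer functions, and the `passes` log are all assumptions made for illustration. Each layer runs once, and only the first layer reads the raw text, while later layers work from the annotations already recorded.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    annotations: dict = field(default_factory=dict)
    passes: list = field(default_factory=list)  # which layers ran, in order

def tokenize(doc):
    # Lexical layer: the only step that reads the raw text.
    doc.annotations["tokens"] = doc.text.split()
    doc.passes.append("lexical")

def tag_names(doc):
    # Entity layer: works from the tokens and never rescans the raw text.
    doc.annotations["names"] = [
        token.strip(".,")
        for token in doc.annotations["tokens"]
        if token[:1].isupper()
    ]
    doc.passes.append("entity")

# Hypothetical layer order; each layer is applied exactly once per document.
LAYERS = [tokenize, tag_names]

def analyze(text):
    doc = Document(text)
    for layer in LAYERS:
        layer(doc)
    return doc

print(analyze("Marie works at IBM in Dublin.").annotations["names"])
```

Capitalization is a crude stand-in for real entity detection; the point is only the shape of the pipeline.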

Languageware — IBM takes another try at natural language processing

Curt Monash — Tue, 07 Oct 2008 15:51:26 +0000

Marie Wallace of IBM wrote in from Ireland to call my attention to Languageware, IBM’s latest try at natural language processing (NLP). Obviously, IBM has been down this road multiple times before, from ViaVoice (dictation software that got beat out by Dragon NaturallySpeaking) to Penelope (research project that seemingly went on for as long as Odysseus was away from Ithaca — rumor has it that the principals eventually decamped to Microsoft, and continued to not produce commercial technology there).

By the way — I by no means want to single out IBM’s natural language efforts for especial bashing. The AI industry’s unit of bogosity has long been the “microlenat,” and Doug Lenat’s life work is, approximately, solving natural language access. I sat next to Doug at dinner at an IJCAI/AAAI conference in 1985 or so. So far as I can tell, what he told me about then still hasn’t been delivered in real life. I’m not aware of any connection between Lenat and IBM.

What’s different this time, apparently, is a rigorous focus on performance. Marie and her team seem to believe that what has held natural language processing back in the past has been poor performance. That’s not as crazy as it sounds, since natural language may be one of those artificial intelligence problems in which brute force outperforms sophisticated heuristics (Lenatesque or otherwise). Still, I have to wonder if performance has really been the main problem.

One interesting side note is that a reason given for this great performance is that processing is done in one pass rather than several. This seems to directly contradict the philosophy of UIMA, IBM’s proposed general-purpose text analytic industry standard. And it’s tough to see how that architectural choice alone can produce enough of a performance advantage to be a game-changer.

The link I gave above already has quite a bit of material. Marie tells me that more and/or fresher material is coming soon.

Dubious statistic of the decade

Curt Monash — Fri, 29 Aug 2008 10:43:18 +0000

In a 2006 white paper, IBM claimed that “just 4 years from now, the world’s information base will be doubling in size every 11 hours.” This week, that statistic was passed on — utterly deadpan — by the Industry Standard and Stephen Arnold. Arnold’s post actually reads as if he takes the figure seriously.

Now, I’ll confess to not having seen the argument in favor of that statistic. But color me skeptical that, by any measure of “information”, it will grow by a factor of more than 2^730 in a year, or 2^7300 in a decade …
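For anyone who wants to check the arithmetic, a quick Python sketch follows. The only assumptions are a 365-day year and that "doubling every 11 hours" holds all year long. Note that 2^730 corresponds to two doublings a day; an 11-hour period gives slightly more than that.

```python
import math

HOURS_PER_YEAR = 365 * 24       # 8760, ignoring leap years
DOUBLING_PERIOD_HOURS = 11      # the figure from the white paper quote

# Number of doublings in a year and in a decade.
doublings_per_year = HOURS_PER_YEAR / DOUBLING_PERIOD_HOURS
doublings_per_decade = 10 * doublings_per_year

# Express the yearly growth factor 2**doublings as a power of ten.
decimal_exponent = doublings_per_year * math.log10(2)

print(f"doublings per year:   {doublings_per_year:.1f}")
print(f"doublings per decade: {doublings_per_decade:.1f}")
print(f"yearly growth factor is roughly 10^{decimal_exponent:.0f}")
```

So the yearly growth factor would have a couple of hundred decimal digits, which is the reason for the skepticism.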

The phrase “business intelligence” was COINED for text analytics

Curt Monash — Fri, 11 Jul 2008 07:31:00 +0000

Late last year, there was a little flap about who invented the phrase business intelligence. Credit turns out to go to an IBM researcher named H. P. Luhn, as per this 1958 paper. Well, I finally took a look at the paper, after Jeff Jones of IBM sent over another copy. And guess what? It’s all about text analytics. Specifically, it’s about what we might now call a combination of classification and knowledge management.

Half a century later, the industry is finally poised to deliver on that vision.

Clarabridge does SaaS, sees Inxight

Curt Monash — Wed, 14 Nov 2007 18:11:28 +0000

I just had a quick chat with text mining vendor Clarabridge’s CEO Sid Banerjee. Naturally, I asked the standard “So who are you seeing in the marketplace the most?” question. Attensity is unsurprisingly #1. What’s new, however, is that Inxight – heretofore not a text mining presence vs. commercially-focused Clarabridge – has begun to show up a bit this quarter, via the Business Objects sales force. Sid was of course dismissive of their current level of technological readiness and integration – but at least BOBJ/Inxight is showing up now.

The most interesting point was text mining SaaS (Software as a Service). When Clarabridge first put out its “We offer SaaS now!” announcement, I yawned. But Sid tells me that about half of Clarabridge’s deals now are actually SaaS. The way the SaaS technology works is pretty simple. The customer gathers together text into a staging database – typically daily or weekly – and it gets sucked into a Clarabridge-managed Clarabridge installation in some high-end SaaS data center. If there’s a desire to join the results of the text analysis with some tabular data from the client’s data warehouse, the needed columns get sent over as well. And then Clarabridge does its thing.
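The join step is easy to picture in SQL. Here is a small self-contained sketch using Python's built-in `sqlite3` module. The table names, columns, and sample rows are all invented for illustration and say nothing about Clarabridge's actual schema; the sketch only shows extracted text facts being joined to a few columns shipped over from a data warehouse.

```python
import sqlite3

def build_demo():
    """Create an in-memory database with hypothetical staging tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Output of text analysis (assumed shape).
        CREATE TABLE extracted_facts (customer_id INTEGER, topic TEXT, sentiment TEXT);
        -- Columns sent over from the client's warehouse (assumed shape).
        CREATE TABLE warehouse_customers (customer_id INTEGER, segment TEXT);
        INSERT INTO extracted_facts VALUES
            (1, 'billing', 'negative'),
            (2, 'billing', 'negative'),
            (3, 'delivery', 'positive');
        INSERT INTO warehouse_customers VALUES
            (1, 'enterprise'), (2, 'consumer'), (3, 'consumer');
    """)
    return conn

def negative_mentions_by_segment(conn):
    """Count negative mentions per customer segment and topic."""
    return conn.execute("""
        SELECT w.segment, f.topic, COUNT(*)
        FROM extracted_facts AS f
        JOIN warehouse_customers AS w ON w.customer_id = f.customer_id
        WHERE f.sentiment = 'negative'
        GROUP BY w.segment, f.topic
        ORDER BY w.segment, f.topic
    """).fetchall()

print(negative_mentions_by_segment(build_demo()))
```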

It has always been the case that business intelligence was an IT systems software technology that often wound up being sold on an application basis to end-user departments. Clarabridge very much fits that model. And while it used to be the case that BI adoption was pretty simple, that’s increasingly not the case, which is one reason SaaS is appealing. So this all makes a lot of sense.

Even so, I was surprised to hear that SaaS had so quickly become half of Clarabridge’s business. Wow.

Since Clarabridge touts Cognos as an important partner, and Cognos is being bought by IBM, I also asked Sid about UIMA. He basically responded that UIMA was unlikely to become relevant to Clarabridge any time soon, because the way Clarabridge interfaces with other software is SQL. Up to a point, that makes great sense to me. But if we buy into the comprehensive/exhaustive extraction story — as Clarabridge does — then the day should and will come when serious linguistic processing gets done on text after it is extracted into a relational database. And if that happens, then all of a sudden SQL won’t be the only interface integrating text analytics with BI.

Everybody’s talking about structured/unstructured integration

Curt Monash — Mon, 12 Nov 2007 17:04:46 +0000

Today’s big news is IBM’s $5 billion acquisition of Cognos. Part of the analyst conference call was two customer examples of how the companies had worked together in the past — and one of those two had a lot of “integration of structured and unstructured data.” The application sounded more like a 360-degree customer view, retrieving text documents alongside relational records, than it did like hardcore text analytics. Even so, it illustrates a trend that I was seeing even before BOBJ’s buy of Inxight, namely an increasing focus in the business intelligence world on at least the trappings of text analytics.

What TEMIS is seeing in the marketplace

Curt Monash — Thu, 01 Nov 2007 09:49:22 +0000

CEO Eric Bregand of Temis recently checked in by email with an update on text mining market activity. Highlights of Eric’s views include:

Yep, Voice Of The Customer is hot, in “many markets”; Eric specifically mentioned banking, car, energy, food, and retail. He further sees IBM backing VotC as text’s “killer app.” (Note: Temis has a history of partnering with IBM, most notably via its unusually strong commitment to UIMA.)
Specifically, THE hot topics in the European market these days are competitive intelligence and sentiment analysis. (Note: I’ve always thought Temis got serious about competitive analysis a little earlier than most other text mining vendors did.)
Life sciences is an ever growing focus for Temis.
I confused him a bit with how I phrased my question about custom publishing and Temis’ Mark Logic partnership. But he did express favorable views of the market, specifically in the area of integrating text mining and native XML database management, and even volunteered that nStein appears to be doing well.

TEMIS, part 1 – overview

Curt Monash — Wed, 04 Apr 2007 19:18:05 +0000

Due to various transatlantic communication glitches, I’d never had a serious briefing with text mining vendor TEMIS until yesterday, when I finally connected with CEO Eric Bregand. So here’s a quick TEMIS overview; I’ll discuss what they actually do in a separate post.

TEMIS has 50 people; 3 main businesses and a couple of secondary ones; two larger offices in France; and smaller offices in Germany and the US. As would be expected, TEMIS’ customer base is concentrated in Continental Europe. The US exceptions seem concentrated in the life sciences vertical (not coincidentally, the US office is outside Philadelphia).
Like Inxight, TEMIS is at least partly a spin-off from Xerox’s text analytics efforts. Indeed, its Grenoble office was acquired from Xerox. Unlike Inxight, TEMIS doesn’t seriously pursue OEM business, but a couple of exceptions have occurred (Eric mentioned Convera and Documentum).
TEMIS claims to follow a middle course between ClearForest on the one hand and Attensity and Clarabridge on the other, in that it doesn’t offer exhaustive extraction but does offer “iterative extraction.” (More on that below.) Frankly, I’m not yet sure that there’s much of a difference in this regard between TEMIS and ClearForest. Like ClearForest – and I’m not sure Attensity would completely dispute this – TEMIS believes that really sophisticated semantic analysis is hard in an exhaustive-extraction scenario. Eric also raised size/performance issues about exhaustive extraction, but I found those unconvincing in this era of cheap and powerful data warehouse engines.
Unlike most of the rest of the text analytics industry, TEMIS really likes UIMA, having committed to it a year and a half ago. So, apparently, does the customer for at least one large deal jointly won with IBM (Europol). The big benefit of UIMA is openness/connectivity, but load-balancing/failover also got mentioned a few times, and that’s attributed to UIMA as well.

I’ll confess to being a little unclear about “iterative extraction,” and indeed to suspecting that it conflates two different things. One would just be the waterfall-style processing inherent to UIMA and, for that matter, to most other approaches to tokenization. The other is the idea that you can do a decent job of identifying what’s in each document in a large corpus in one pass, then do another pass focusing more intently on the ones that might have exactly what you’re looking for.
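The second of those two ideas can be sketched in a few lines of Python. This is my guess at the general shape, not a description of what TEMIS actually does. The corpus, the keyword filter, and the regular expression are all made up for illustration. A cheap first pass flags candidate documents, and a more expensive second pass runs only on those.

```python
import re

# Hypothetical three-document corpus.
CORPUS = {
    "doc1": "Acme acquired Widgets Inc for $5 million in March.",
    "doc2": "The weather in Grenoble was pleasant.",
    "doc3": "Rumors say Globex acquired Initech last year.",
}

def coarse_pass(corpus, keyword):
    """Pass 1: cheap scan that flags documents worth a closer look."""
    return [doc_id for doc_id, text in corpus.items() if keyword in text.lower()]

# Pass 2 pattern: a capitalized buyer, the verb, and a capitalized target.
ACQUISITION = re.compile(r"([A-Z]\w+) acquired ([A-Z]\w+(?: Inc)?)")

def focused_pass(corpus, doc_ids):
    """Pass 2: costlier extraction, run only on the flagged documents."""
    results = {}
    for doc_id in doc_ids:
        match = ACQUISITION.search(corpus[doc_id])
        if match:
            results[doc_id] = (match.group(1), match.group(2))
    return results

candidates = coarse_pass(CORPUS, "acquired")
print(focused_pass(CORPUS, candidates))
```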

Links:

Attensity FRN (fact-relationship network)
Attensity vs. ClearForest
Clarabridge overview
UIMA
DBMS2 coverage of data warehouse appliances and other data warehouse engines
The French presence in text analytics

Text mining and search, joined at the hip

Curt Monash — Sat, 11 Nov 2006 08:14:25 +0000

Most people in the text analytics market realize that text mining and search are somewhat related. But I don’t think they often stop to contemplate just how close the relationship is, could be, or someday probably will become. Here’s part of what I mean:

Text mining powers search. The biggest text mining outfits in the world, possibly excepting the US intelligence community, are surely Google, Yahoo, and perhaps Microsoft.
Search powers text mining. Restricting the corpus of documents to mine, even via a keyword search, makes tons of sense. That’s one of the good ideas in Attensity 4.
Text mining and search are powered by the same underlying technologies. For starters, there’s all the tokenization, extraction, etc. that vendors in both areas license from Inxight and its competitors. Beyond that, I think there’s a future play in integrated taxonomy management that will rearrange the text analytics market landscape.

So who does “get it” about the search/text mining connection? The UIMA folks at IBM probably do. Inxight surely does. Attensity seemingly does, and so do most large search engine vendors (FAST and the public guys for sure; I’m not so certain about Autonomy and Convera). A small company whose CEO just called me yesterday does. I think I do.

But I’m not sure that the smaller text mining and search outfits – or the small text-oriented parts of large enterprise software vendors — have gotten the message at all yet …