By: Kas Thomas

Kas Thomas — Thu, 23 Oct 2008 13:00:24 +0000

I don’t mean to be flip, but the mere fact that you have to ask the question points to the answer, I think. I don’t know anyone who is seeing significant demand for UIMA. The folks at Nstein might have some insight into this.

By: Curt Monash

Curt Monash — Thu, 23 Oct 2008 00:41:21 +0000

Temis had a deal with, I think, Europol that specified UIMA.

Otherwise, I haven’t heard of a lot of demand.

By: Bob Carpenter

Bob Carpenter — Wed, 22 Oct 2008 21:49:59 +0000

We typically tell our potential customers that not only don’t we have any magic pixie dust, no one does. It’s best to cut out the hype up front!

I didn’t understand Marie’s point about how their key differentiator is that they’ve “completely decoupled the language model from the code that runs the analysis”. Doesn’t everyone do this?

Our product, LingPipe, has general high-level interfaces that are uniform across applications for everything from spelling correction to classification to tokenization, sentence detection, part-of-speech tagging and entity extraction.

Uniformity and portability I understand (up to the problem of adapting tag sets and tokenization standards), but how could UIMA help with speed? Coding reusable components in UIMA requires translating data into (and often out of) the Common Analysis Structure (CAS) that handles “data exchange” among modules. For third parties, this is prohibitive, because I need to translate our tokens into UIMA tokens and back again before I send them to a tagger. Of course, I can just wrap tokenization, tagging, sentence extraction and entity extraction in a single UIMA module, but that defeats plug-and-play portability.
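To make the round trip concrete, here is a rough Python sketch of the extra conversion step. Nothing in it is UIMA's or LingPipe's actual API: `NativeToken`, `SharedAnnotation`, `to_shared`, `from_shared` and the whitespace `tokenize` are hypothetical stand-ins, assumed only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NativeToken:
    """Hypothetical token type owned by a third-party toolkit."""
    text: str
    start: int


@dataclass(frozen=True)
class SharedAnnotation:
    """Hypothetical stand-in for an annotation in a shared exchange structure."""
    type_name: str
    begin: int
    end: int


def tokenize(document):
    """Toy whitespace tokenizer that records character offsets."""
    tokens, pos = [], 0
    for word in document.split():
        start = document.index(word, pos)
        tokens.append(NativeToken(word, start))
        pos = start + len(word)
    return tokens


def to_shared(tokens):
    """Translate native tokens into shared annotations (one copy per token)."""
    return [SharedAnnotation("Token", t.start, t.start + len(t.text)) for t in tokens]


def from_shared(document, annotations):
    """Translate shared annotations back into native tokens (a second copy)."""
    return [
        NativeToken(document[a.begin:a.end], a.begin)
        for a in annotations
        if a.type_name == "Token"
    ]
```

Every module boundary pays for both `to_shared` and `from_shared`, which is the overhead in question. Wrapping the whole pipeline in one module avoids the copies but also gives up the interchangeable parts.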

UIMA isn’t unusual with respect to streaming. We can do named entity annotation on multi-GB XML docs with very low memory overhead using the SAX parser and our generic entity extraction interface, with a single model shared across threads.
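As a rough illustration of the streaming pattern (not LingPipe's code, which is Java), here is a minimal Python sketch using the standard library's `xml.sax`. The `toy_extract` gazetteer lookup, the `EntityHandler` class and the `p` text element are assumptions made up for the example. Only one element's text is buffered at a time, so memory use does not grow with document size, apart from the list of results.

```python
import io
import xml.sax


def toy_extract(text, gazetteer=(("IBM", "ORG"), ("Europol", "ORG"))):
    """Hypothetical extractor: exact-match lookup against a tiny gazetteer."""
    found = []
    for name, etype in gazetteer:
        start = text.find(name)
        while start != -1:
            found.append((name, etype))
            start = text.find(name, start + 1)
    return found


class EntityHandler(xml.sax.ContentHandler):
    """Buffers the text of one element at a time and runs the extractor on it."""

    def __init__(self, extract, text_tag="p"):
        super().__init__()
        self.extract = extract
        self.text_tag = text_tag
        self.buffer = None      # None means we are outside a text element
        self.entities = []

    def startElement(self, name, attrs):
        if name == self.text_tag:
            self.buffer = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate them.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == self.text_tag and self.buffer is not None:
            self.entities.extend(self.extract("".join(self.buffer)))
            self.buffer = None  # release the text before the next element


def annotate(stream, extract=toy_extract, text_tag="p"):
    """Parse an XML byte stream incrementally and return the entities found."""
    handler = EntityHandler(extract, text_tag)
    xml.sax.parse(stream, handler)
    return handler.entities
```

Calling `annotate(open("big.xml", "rb"))` on a hypothetical large file would work the same way as on a small in-memory stream. Because the extractor is a plain function with no mutable state, one instance could be shared by several threads, each with its own handler.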

Is anyone seeing demand for UIMA outside of government-funded research? We’re still debating whether it’s worth our time to write more general and complete UIMA wrappers for LingPipe than have been contributed by third parties.

Comments on: More on Languageware
