October 5, 2007

When to use exhaustive extraction

I’ve been emailing and/or talking with both Clarabridge and Attensity this week. Since they’re the two big proponents of exhaustive extraction, I naturally asked whether there are any cases in which exhaustive extraction should not be used. In Clarabridge’s case, it turns out that exhaustive extraction is the default, and no customer has ever turned that default off. However, their current high end is several million documents* per year. They suspect that in some current projects with much higher volumes, the default may finally be turned off.

*Actually, the word Clarabridge CTO Justin Langseth used was “verbatim.” But that’s essentially a synonym for document, only with the connotation that these documents will probably be people’s statements (think warranty cards, customer surveys, email, call center notes, etc.), with all that implies for their grammar, structure (or lack thereof), and so on.

I didn’t push Attensity for an answer that clear. What they said was simply that all their capabilities are integrated together, so everybody uses exhaustive extraction. I imagine that, if pressed, they’d say something similar to what Clarabridge said, but it seems I should follow up a little further …


One Response to “When to use exhaustive extraction”

  1. DBMS2 — DataBase Management System Services » Blog Archive » The four horsemen of data warehousing on April 25th, 2008 12:08 am

    […] per a series of posts over on Text Technologies. Specifically, I’ve focused on the two with exhaustive extraction strategies, namely Attensity and Clarabridge. (Exhaustive extraction is Attensity’s term for […]
