July 23, 2006

Text mining for compliance and legal discovery

One theme that keeps recurring in my talks with text mining and other text analytics/text technology companies is compliance. Ditto legal discovery, which is closely related. Most of the focus seems to be on three kinds of data:

  1. Vehicle defect evidence. The TREAD Act is of course the big driver here (no pun intended).
  2. Drug side effect evidence. The FDA is pushing that one.
  3. Email/correspondence archives. Text search/filtering/clustering/mining/whatever is now a standard part of legal discovery.

There’s also activity in censoring real-time email, IMs, etc., but that often seems to be done either by specialty products (e.g., Assentor) or as part of a general email/spam/whatever control product. And I have a lot of questions as to how well at least the latter works, especially in enterprises that don’t totally shut down workers’ access to private webmail accounts.

These are active, interesting markets, and I intend to write more about them soon. But for now, are there any big compliance/legal drivers of text technologies that I’m simply overlooking?


2 Responses to “Text mining for compliance and legal discovery”

  1. Text Technologies»Blog Archive » Introduction to ClearForest on July 23rd, 2006 7:12 am

    […] ClearForest is one of the two companies whose name comes up for fact extraction applications, probably even a little ahead of Attensity. Their flagship account is the GM deal they did with IBM, kicking off the whole warranty report mining boom. Procter & Gamble is no slouch of a customer either. They’re involved enough in anti-terrorism that, when I asked Jay if he knew who Cogito was, he said “Of course.” And apparently one of their techie founders is the guy who coined the term “text mining” in the first place. […]

  2. Text Technologies»Blog Archive » Application processes in text mining – finding warning signs on July 27th, 2006 5:36 am

    […] 3. In other cases, one is looking for trouble even before one has found some. Compliance often falls into this category, as does web-crawling reputation management. One process, favored by Autonomy, is simply to monitor document flow for all important themes, and hope that the trouble signs jump out at you. Alternatively, one can monitor documents for known bad event flags – vehicle malfunctions, drug side effects, angry customers, whatever. If there are only a few documents with such flags, one can read them directly. If there are too many for humans to just read and digest in a timely manner – well, then you’ve transitioned into Case 1 or Case 2! […]
