According to Attensity CTO David Bean:
- Voice of the Customer/Market applications require less linguistic sophistication than other text mining applications.
- Hence, Voice of the Customer/Market apps are easier to get running than other text mining applications, which he conjectures is a big part of the reason for their burgeoning sales.
I’m guessing most text mining vendors would agree with those views, although they might not agree with his elaborations, which include:
- Attensity’s knowledge extraction technology is more sophisticated than Clarabridge’s or most other competitors’.
- In particular, Clarabridge’s extraction is little more than bag-of-words.
- There’s a good match between companies he thinks have less-sophisticated extraction (e.g., Clarabridge, SAS, SPSS) and companies whose text mining sales are heavily concentrated in Voice of the Customer/Market applications.
So the question arises: Just how much linguistic sophistication is needed in these market-trend-oriented text mining applications?
I actually got onto this subject not just because of what David said, but also via a conversation an hour earlier with Brooke Aker of Expert System, who cited linguistic sophistication as a key reason for beating the competition (which, however, didn’t include Attensity or Clarabridge) at two accounts. The point Brooke was stressing is that it’s important to be able to extract multiple facts or indicators of sentiment from the same sentence. E.g., “I just had a crummy Chevy, but at least the seats were comfortable” is both a negative indicator about Chevrolet and a positive indicator about Chevrolet’s seats. Attensity captures both of those too, and I think Clarabridge would as well. (If you do comprehensive/exhaustive extraction, then, well, you should extract comprehensively.)
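To make the "multiple indicators from one sentence" point concrete, here is a toy Python sketch. It is emphatically not how Expert System, Attensity, or Clarabridge do it; the word lists, the `TARGETS` mapping, and the `extract_sentiments` function are all made up for illustration. It assumes that splitting on a contrastive word like "but" is enough to separate the clauses, which is only true for easy sentences like this one.

```python
import re

# Hypothetical toy lexicons (assumptions for illustration only);
# a real product would use far richer linguistic resources.
POSITIVE = {"comfortable", "great", "reliable"}
NEGATIVE = {"crummy", "awful", "unreliable"}
TARGETS = {"chevy": "Chevrolet", "seats": "seats"}


def extract_sentiments(sentence):
    """Return (target, polarity) pairs, scoring each clause separately."""
    results = []
    # Split on contrastive connectives so each clause gets its own polarity.
    clauses = re.split(r"\bbut\b|\balthough\b|;", sentence.lower())
    for clause in clauses:
        words = re.findall(r"[a-z']+", clause)
        targets = [TARGETS[w] for w in words if w in TARGETS]
        if any(w in POSITIVE for w in words):
            polarity = "positive"
        elif any(w in NEGATIVE for w in words):
            polarity = "negative"
        else:
            continue  # no sentiment word in this clause
        for target in targets:
            results.append((target, polarity))
    return results


sentence = "I just had a crummy Chevy, but at least the seats were comfortable"
print(extract_sentiments(sentence))
```

Scoring the whole sentence as one bag of words would see one negative and one positive word and have no way to say which goes with which; even this crude clause split keeps the two indicators apart.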
Anyhow, my first-best answer to the question I posed is:
- Sentiment analysis is hard, at least in venues where you have to deal with slang, metaphor, or irony (the real biggie). The more sophisticated, the better.
- Otherwise, the linguistics of customer/marketing applications is pretty straightforward. Just put together the right list of wacky synonyms, and you’re good to go.
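For what the "list of wacky synonyms" approach might look like in its very simplest form, here is a minimal Python sketch. The `SYNONYMS` table, the concept names, and the `count_concepts` function are hypothetical, and the sample comments are invented; the point is only that slang variants get folded into one canonical concept before counting.

```python
import re
from collections import Counter

# Hypothetical synonym list (an assumption for illustration): each slang
# or variant term maps to one canonical concept.
SYNONYMS = {
    "crummy": "poor quality",
    "lousy": "poor quality",
    "junky": "poor quality",
    "lemon": "poor quality",
    "comfy": "comfortable",
    "comfortable": "comfortable",
    "cushy": "comfortable",
}


def count_concepts(comments):
    """Count how many comments mention each canonical concept."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        # Use a set so a concept counts at most once per comment.
        concepts = {SYNONYMS[w] for w in words if w in SYNONYMS}
        counts.update(concepts)
    return counts


comments = [
    "What a lemon, totally junky",
    "Comfy seats",
    "Lousy brakes but cushy ride",
]
print(count_concepts(comments))
```

Nothing here knows any grammar at all, which is rather the point: for trend-spotting, the hard work is in curating the list, not in the linguistics.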
But what do you think?