Sergei Ananyan is president of Megaputer, which is not one of the easier companies to get information about. They’re an essentially Russian firm based in Bloomington, Indiana. Their website is, to put it kindly, not up to date. And I wound up speaking with Sergei while he was at his rural vacation house, located somewhere between the Black and Aral Seas.
However, Sergei followed up by email with his views of the marketplace, and I think they’re interesting enough to share below. I really like his focus on analytic business processes, something that generally doesn’t get enough consideration.
(Emphasis mine. Also, for context, please note that Megaputer started out as a data mining generalist, but has increasingly focused on text mining.)
I believe that the Text Mining market is currently characterized by three main features:
1) This is an emerging and highly fragmented market. So far, only early adopters have incorporated text mining systems as an integral part of their business processes. Most customers are evaluating the effectiveness of a text mining solution by comparing it to the effectiveness of their existing manual data analysis processes, more than to the solutions from other vendors. Different customers are focused on different tasks, and thus have to seek tools from vendors that have good offerings of the respective capabilities. This leads to market fragmentation. The situation will gradually change as best practices are worked out for various standard application domains. But so far, relatively few case studies with proven ROI have been reported; correspondingly, best practices are yet to be formulated.
2) End consumers of results generated in Text Mining are not data analysts or statisticians, as in Data Mining, but rather the upper management of a company. These people need to interact with the results of text analysis in order to make decisions based on text mining efforts and substantiate these decisions. They have neither the time nor the skills to develop analysis scenarios themselves; rather, they need a very simple interface for viewing and manipulating the results of the analysis. They need dashboards featuring results obtained through the execution of nontrivial data analysis scenarios developed by their colleagues, the data analysts.
3) Documents that require analysis are frequently linked with some structured attributes. For example, for drug safety reports, structured fields can include the date and time of the report, the drug name, and the age, gender, type and location of the reporter. Values of these attributes provide vital context for correctly interpreting the related narratives. Customers expect a text mining system to be able to perform joint analysis of information extracted from report narratives and associated structured attributes.
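
To make that third point a bit more concrete, here is a rough sketch of what "joint analysis" could look like at its very simplest. This is my own illustration, not anything from Megaputer: the `Report` record, the `SYMPTOM_TERMS` keyword list, and the sample reports are all invented, and a plain keyword match stands in for a real text mining extraction step.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Report:
    """A hypothetical drug safety report: structured fields plus free text."""
    drug: str            # structured attribute
    reporter_age: int    # structured attribute
    narrative: str       # unstructured text


# Invented keyword list; a real system would use proper entity extraction.
SYMPTOM_TERMS = ("nausea", "rash", "headache")


def extract_symptoms(narrative):
    """Return the symptom terms that appear in the narrative (case-insensitive)."""
    text = narrative.lower()
    return [term for term in SYMPTOM_TERMS if term in text]


def symptoms_by_drug(reports):
    """Count extracted symptoms, grouped by the structured 'drug' field."""
    counts = Counter()
    for report in reports:
        for symptom in extract_symptoms(report.narrative):
            counts[(report.drug, symptom)] += 1
    return counts


# Made-up sample data, for illustration only.
reports = [
    Report("DrugA", 34, "Patient reported nausea and a mild headache."),
    Report("DrugA", 71, "Severe nausea after second dose."),
    Report("DrugB", 45, "Developed a rash on both arms."),
]

counts = symptoms_by_drug(reports)
for (drug, symptom), n in sorted(counts.items()):
    print(drug, symptom, n)
```

The point is only that the terms pulled out of the narrative become useful once they are grouped by a structured field such as the drug name; the same grouping could just as easily use reporter age or location.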