May 8th, 2008 Curt Monash
Ironically coming right after a Google indexing problem, I am putting up my first sponsored blog post ever. It’s in connection with the forthcoming Text Analytics Summit, at which I will be speaking (in Boston) on June 16. The post itself offers a free white paper by the estimable Seth Grimes.
Posted in Text Analytics Summit, Text mining | No Comments »
April 25th, 2008 Curt Monash
As per this job listing, at least one “major NYC investment bank” plans to do text mining on a proprietary trading desk.
The successful candidate will mine text data from numerous news sources and incorporate the information [into] the proprietary trading systems.
Posted in Application areas, Investment research and trading, Text mining | No Comments »
January 31st, 2008 Curt Monash
I caught up with Expert System S.p.A. last week. They turn out to be doing $10 million in text technology annual revenue. That alone is surprising (sadly), but what’s really remarkable is that they did it almost entirely in the Italian market. As you might guess, that figure includes a little bit of everything, from search engines to Italian language filters for Microsoft Office to text mining. But only $3½ million of Expert System’s revenue is from the government (and I think that includes civilian agencies), and under 30% is professional services, so on the whole it seems like a pretty real accomplishment. Oh yes – Expert System says it’s entirely self-funded.
As of last year, Expert System also has English-language products, and a couple of minor OEM sales in the US (for mobile search and semantic web applications). German- and Arabic-language products are in beta test. The company says that its market focus going forward is national security – surely the reason for the Arabic – and competitive intelligence. It envisions selling through partners such as system integrators, although I think that makes more sense for the government market than it does vis-a-vis civilian companies. In February the company is introducing a market intelligence product focused on sentiment analysis.
Expert System is a bit of a throwback, in that it talks lovingly of the semantic network that informs its products.
Posted in Application areas, Enterprise search, Expert System S.p.A., Ontologies and context identification, Search and text storage, Text mining, Voice of the Market/competitive intelligence | No Comments »
December 23rd, 2007 Curt Monash
Text mining is science-project artificial intelligence. Fiction. Text mining is proven in many practical applications.
To implement text mining, you need computational linguists. Fact. Monash’s Second Law of Commercial Semantics states “Where there are ontologies, there is consulting.” And it’s linguists, or reasonable facsimiles of same, who do the consulting.
To use text mining, you need computational linguists. Fiction. When last I counted, the number of known computational linguists working for end-user organizations, worldwide, was precisely 1, at Procter & Gamble. (Intelligence agencies excepted, of course.) I’d guess it’s higher now, but I probably could still count them all without taking my socks off.
CRM applications are driving the growth of text mining. Fact. Most current growth in text mining seems to come from Voice of the Customer and Voice of the Market/competitive intelligence applications. And a couple of years ago, when SAS and SPSS had a joint boom in text mining, a lot of that was coming from CRM.
Text mining products are useful mainly for large enterprises. More fact than fiction. Text mining makes the most sense when you have too much text for humans to read and summarize.
Text mining doesn’t fit well with relational databases. Fiction. The fastest-growing text mining companies seem to be Attensity and Clarabridge, who consistently extract textual information into relational databases.
Text mining imposes structure on unstructured* data. More fact than fiction. Most text mining applications involve examining free-text documents and creating entries in relational or XML databases. Most people would call that a transition from unstructured to structured form.
*I still don’t like the “structured/unstructured” distinction, but with repetition I’m getting somewhat inured to it.
Enterprise search is an alternative to text mining. Fact. You can use a high-end search engine to cluster documents and look for trends and insight. It’s not the real McCoy, but in some cases it gives you 80% of the benefit of the real thing.
Text mining is an ingredient, not a product category. Part fact, part fiction. The biggest text mining efforts in the world are probably at Google, Yahoo, Microsoft search, and Dow Jones/Factiva. Antispam vendors also invest a lot in text mining. Two of the top five independent text mining vendors were acquired this year (ClearForest and Inxight). And of the many dozens of small text mining independents, most are focused on specific niches.
Even so, Attensity, Clarabridge, and Temis show that, at least for now, text mining remains a legitimate product category.
The text mining industry is in trouble. Part fact, part fiction. As I recently ranted, even the leading text mining vendors are letting many opportunities pass them by. And like many software sectors, text mining seems poised to be absorbed via large-company acquisition. SAP has already secured a text mining business via BOBJ/Inxight, but at least one vendor each could easily be bought by Oracle, Microsoft (despite the in-house expertise from its search arm), and IBM (despite or even in connection with UIMA).
But in the meantime, a few small text mining vendors are still showing rapid growth.
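To make the "structure on unstructured data" point above concrete, here is a minimal sketch of free text being turned into relational rows. It is only an illustration under stated assumptions: the comments, the `mention` table, and the tiny keyword lexicon are all hypothetical, and keyword matching is a stand-in for the linguistic extraction real text mining products perform, not a description of how Attensity or Clarabridge work.

```python
import re
import sqlite3

# Hypothetical customer comments; not data from any vendor.
COMMENTS = [
    (1, "The battery died after two days. Terrible support."),
    (2, "Great screen, and the battery life is excellent."),
]

# Toy lexicon standing in for real linguistic extraction (assumption).
TOPICS = ("battery", "screen", "support")
POSITIVE = {"great", "excellent"}
NEGATIVE = {"died", "terrible"}


def extract_rows(doc_id, text):
    """Yield (doc_id, topic, sentiment) for each topic mentioned in a sentence."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for topic in TOPICS:
            if topic in words:
                yield (doc_id, topic, sentiment)


# Load the extracted facts into an ordinary relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mention (doc_id INTEGER, topic TEXT, sentiment TEXT)")
for doc_id, text in COMMENTS:
    conn.executemany("INSERT INTO mention VALUES (?, ?, ?)", extract_rows(doc_id, text))

# Once the text is in rows, plain SQL can summarize it.
summary = conn.execute(
    "SELECT topic, sentiment, COUNT(*) FROM mention "
    "GROUP BY topic, sentiment ORDER BY topic, sentiment"
).fetchall()
print(summary)
```

The design point is simply that after extraction, the downstream work is ordinary SQL, which is why relational databases fit text mining better than the myth suggests.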
Previous “fact and fiction” post: Data warehouse appliances.
Posted in Text mining | 2 Comments »
December 19th, 2007 Curt Monash
Scout Labs sounds like even more of what I was thinking of than Summize. It’s a shame that the “traditional” text mining vendors didn’t get there first.
Posted in Text mining, Voice of the Market/competitive intelligence | 2 Comments »
December 18th, 2007 Curt Monash
I’ve been thinking for a long time that the various text mining companies doing sentiment analysis should try some public-facing (or at least multi-customer) services. Investors might love such a thing. So might marketing managers (actually, Factiva claims to be active there, at least as per their web site). And as a key part of the strategy, text mining companies selling to enterprises might brand such a site and gain massive awareness accordingly. Well, it seems that public-facing sentiment analysis sites are springing up. At least, Summize has. (Hat tip to TechCrunch.) And the text mining vendors are nowhere to be seen.
So what else is new?
Posted in Application areas, Factiva and Dow Jones, Investment research and trading, Text mining | 1 Comment »
December 7th, 2007 Curt Monash
Here are some highlights of the QL2 story, per exec Mike McDermott.
- QL2’s main business is scraping price and other product offering data from the web for high-speed competitive analysis. For example, of their 250ish customers overall, over 90 are airlines. Online retailers are another big chunk of their customer base.
- QL2 also commonly partners with text mining companies in applications such as Voice of the Market or competitive intelligence. E.g., QL2 has been brought into a few deals each by Attensity, Clarabridge, and especially Temis.
- QL2 goes well beyond basic crawling. Notably, the system fills in forms with parameters. And of course it monitors pages for changes.
- QL2’s scripting language is, Mike tells me, very SQL-like. Hence the “QL” in the name.
- QL2 rolls its own filters, rather than using INSO or whoever. (Actually, what are the main file-reading filter choices these days? I’ve lost track.) Indeed, Mike fondly believes QL2 does a better job with PDFs than Adobe does.
- QL2 doesn’t want to be thought of as web-only. Rather, Mike likes my formulation of “text data ETL, web or otherwise.” That said, he freely admits QL2’s strength is in Extract rather than in Transform or Load.
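As a rough illustration of the "monitors pages for changes" idea in the list above, here is a minimal sketch of change detection by content fingerprinting. It is not QL2's method, which the post does not describe; the function names, URLs, and fare strings are hypothetical, and the page text is supplied directly rather than fetched from the web.

```python
import hashlib


def fingerprint(page_text):
    """Hash whitespace-normalized text so layout noise doesn't count as a change."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_pages(previous, current):
    """Return URLs whose content is new or differs from the previous snapshot.

    previous maps URL -> stored fingerprint; current maps URL -> freshly scraped text.
    """
    return sorted(
        url for url, text in current.items()
        if previous.get(url) != fingerprint(text)
    )


# Hypothetical snapshots of scraped fare pages.
yesterday = {"https://example.com/fares": fingerprint("BOS-SFO  $329")}
today = {
    "https://example.com/fares": "BOS-SFO $299",
    "https://example.com/new": "BOS-JFK $99",
}

print(changed_pages(yesterday, today))
```

Storing only fingerprints keeps the previous snapshot small; a real system would presumably also keep the extracted fields so it could report what changed, not just that something did.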
Posted in Application areas, QL2, Text mining, Voice of the Market/competitive intelligence | No Comments »
November 14th, 2007 Curt Monash
I just had a quick chat with text mining vendor Clarabridge’s CEO Sid Banerjee. Naturally, I asked the standard “So who are you seeing in the marketplace the most?” question. Attensity is unsurprisingly #1. What’s new, however, is that Inxight – heretofore not a text mining presence vs. commercially-focused Clarabridge – has begun to show up a bit this quarter, via the Business Objects sales force. Sid was of course dismissive of their current level of technological readiness and integration – but at least BOBJ/Inxight is showing up now.
The most interesting point was text mining SaaS (Software as a Service). When Clarabridge first put out its “We offer SaaS now!” announcement, I yawned. But Sid tells me that about half of Clarabridge’s deals now are actually SaaS. The way the SaaS technology works is pretty simple. The customer gathers together text into a staging database – typically daily or weekly – and it gets sucked into a Clarabridge-managed Clarabridge installation in some high-end SaaS data center. If there’s a desire to join the results of the text analysis with some tabular data from the client’s data warehouse, the needed columns get sent over as well. And then Clarabridge does its thing.
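The flow described above (stage the text, send over the needed warehouse columns, analyze, then join) can be sketched in a few lines. This is an assumption-laden illustration, not Clarabridge's actual design: the table names, sample rows, and the `analyze` placeholder are all hypothetical, and an in-memory SQLite database stands in for both the staging database and the hosted installation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staged_text (customer_id INTEGER, comment TEXT);   -- daily/weekly text batch
    CREATE TABLE warehouse_cols (customer_id INTEGER, segment TEXT); -- columns sent from the warehouse
    CREATE TABLE text_result (customer_id INTEGER, sentiment TEXT);  -- output of the text analysis
""")
conn.executemany(
    "INSERT INTO staged_text VALUES (?, ?)",
    [(1, "Love the new plan"), (2, "Billing was a mess"), (3, "Billing errors again")],
)
conn.executemany(
    "INSERT INTO warehouse_cols VALUES (?, ?)",
    [(1, "consumer"), (2, "business"), (3, "business")],
)


def analyze(comment):
    """Placeholder for the hosted text analysis (hypothetical keyword rule)."""
    return "negative" if "billing" in comment.lower() else "positive"


# Run the analysis over the staged batch and store its results.
rows = conn.execute("SELECT customer_id, comment FROM staged_text").fetchall()
conn.executemany(
    "INSERT INTO text_result VALUES (?, ?)",
    [(customer_id, analyze(comment)) for customer_id, comment in rows],
)

# Join the text-derived results with the tabular warehouse columns.
report = conn.execute("""
    SELECT w.segment, r.sentiment, COUNT(*)
    FROM text_result r JOIN warehouse_cols w USING (customer_id)
    GROUP BY w.segment, r.sentiment
    ORDER BY w.segment, r.sentiment
""").fetchall()
print(report)
```

Only the columns needed for the join travel with the text, which matches the post's description of sending over "the needed columns" rather than the whole warehouse.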
Posted in BI integration, Clarabridge, Comprehensive or exhaustive extraction, IBM and UIMA, Text mining | 1 Comment »
November 1st, 2007 Curt Monash
CEO Eric Bregand of Temis recently checked in by email with an update on text mining market activity. Highlights of Eric’s views include:
- Yep, Voice Of The Customer is hot, in “many markets”; Eric specifically mentioned banking, car, energy, food, and retail. He further sees IBM backing VotC as text’s “killer app.” (Note: Temis has a history of partnering with IBM, most notably via its unusually strong commitment to UIMA.)
- Specifically, THE hot topics in the European market these days are competitive intelligence and sentiment analysis. (Note: I’ve always thought Temis got serious about competitive analysis a little earlier than most other text mining vendors did.)
- Life sciences is an ever growing focus for Temis.
- I confused him a bit with how I phrased my question about custom publishing and Temis’ Mark Logic partnership. But he did express favorable views of the market, specifically in the area of integrating text mining and native XML database management, and even volunteered that nStein appears to be doing well.
Posted in Application areas, IBM and UIMA, Investment research and trading, Mark Logic, TEMIS, Text mining, Voice of the Customer, Voice of the Market/competitive intelligence, nStein | 1 Comment »
October 8th, 2007 Curt Monash
More precisely, SAP is acquiring Business Objects, and of course Business Objects already acquired Inxight.
This could be interesting …
Posted in BI integration, Business Objects and Inxight, SAP AG and TREX, Text mining | No Comments »