<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Text Technologies &#187; Application areas</title>
	<atom:link href="http://www.texttechnologies.com/category/text-analytics-applications/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Sun, 28 Feb 2010 05:30:01 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>More website weirdness</title>
		<link>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/</link>
		<comments>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/#comments</comments>
		<pubDate>Thu, 20 Nov 2008 03:27:27 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=298</guid>
		<description><![CDATA[Here&#8217;s something longer-lasting and weirder than Vertica&#8217;s &#8220;We sell turkeys&#8221; theme: Mark Logic, whose product is used primarily to help enterprises make their content more accessible, doesn&#8217;t have a search engine on its own website.*
*Or if it does, it&#8217;s VERY well-hidden. I looked at the home page and site map alike.
I wanted to refresh my [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s something longer-lasting and weirder than <a href="http://www.dbms2.com/2008/11/18/silly-website-tricks/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Vertica&#8217;s &#8220;We sell turkeys&#8221; theme</a>: Mark Logic, whose product is used primarily to help enterprises make their content more accessible, doesn&#8217;t have a search engine on its own website.*<span id="more-298"></span></p>
<p><em>*Or if it does, it&#8217;s VERY well-hidden. I looked at the home page and site map alike.</em></p>
<p>I wanted to refresh my memory as to Mark Logic&#8217;s history of working with specific text mining vendors, beyond what&#8217;s on the official <a href="http://www.marklogic.com/partners/open-enrichment-framework.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.marklogic.com');">partner page</a>. No luck.  Normally when site search is inadequate, one goes to Google.   But that&#8217;s problematic too.  Marklogic.com pages come up pretty low on Google&#8217;s search results, suggesting that:</p>
<ol>
<li>Mark Logic doesn&#8217;t put a lot of effort into SEO (or else doesn&#8217;t do it very well).</li>
<li>One can&#8217;t be confident all the site&#8217;s significant pages are findable by Google.</li>
</ol>
<p>Looking to other companies&#8217; sites for clues isn&#8217;t conclusive either.  E.g., <a href="http://clearforest.com/Partners/PartnerDetails.asp?id=11" onclick="javascript:pageTracker._trackPageview('/outbound/article/clearforest.com');">Clearforest lists Mark Logic as a partner</a>, but Mark Logic doesn&#8217;t return the compliment.  (If memory serves, Mark Logic and Clearforest have worked together both on national security deals and custom publishing deals &#8212; but don&#8217;t hold me to that.)</p>
<p>When it comes to making its own information conveniently available, Mark Logic is quite the unshod cobbler.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Are denial-of-insight attacks a threat to search logs and/or VOTC/VOTM apps?</title>
		<link>http://www.texttechnologies.com/2008/11/12/denial-of-insight-attacks/</link>
		<comments>http://www.texttechnologies.com/2008/11/12/denial-of-insight-attacks/#comments</comments>
		<pubDate>Wed, 12 Nov 2008 07:45:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Competitive intelligence]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=295</guid>
		<description><![CDATA[TechTaxi points out that it&#8217;s at least theoretically possible to, by polluting the Web, pollute somebody&#8217;s web-wide information gathering.  (Hat tip to Daniel Tunkelang.)  They further assert this is a relatively near-term threat.
The theory can&#8217;t be denied. What&#8217;s more, bad actors have other motives to pollute the Web.  For example, if they [...]]]></description>
			<content:encoded><![CDATA[<p>TechTaxi <a href="http://techtaxi.blogspot.com/2006/04/denial-of-insight-attacks-could.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/techtaxi.blogspot.com');">points out</a> that it&#8217;s at least theoretically possible to, by polluting the Web, pollute somebody&#8217;s web-wide information gathering.  (Hat tip to <a href="http://thenoisychannel.com/2008/11/11/big-google-can-be-benign/" onclick="javascript:pageTracker._trackPageview('/outbound/article/thenoisychannel.com');">Daniel Tunkelang</a>.)  They further assert this is a relatively near-term threat.</p>
<p>The theory can&#8217;t be denied. What&#8217;s more, bad actors have other motives to pollute the Web.  For example, if they plant favorable automated comments about their own products or unfavorable ones about the competition&#8217;s, <a href="http://www.texttechnologies.com/2008/06/17/voice-of-the-customermarket-indeed-where-the-action-is/" >Voice of the Customer/Market</a> applications will naturally be confused.  And if automated reputation-checkers get more prominent, there will be a <em>major</em> incentive to game them, just as there has been for Google&#8217;s PageRank.  So VOTC/VOTM market research tools could be polluted as a side effect.</p>
<p>Similarly, if somebody wants to test your e-commerce site by throwing a ton of searches at it, your search logs will lose value.</p>
<p>But disinformation of competitors for the sake of disinformation? Or, as the article suggests, vandalism/extortion? Off the top of my head, I&#8217;m not thinking of a serious near-term threat scenario.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/11/12/denial-of-insight-attacks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Attensity update</title>
		<link>http://www.texttechnologies.com/2008/10/24/attensity-update-2/</link>
		<comments>http://www.texttechnologies.com/2008/10/24/attensity-update-2/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 04:29:24 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attensity]]></category>
		<category><![CDATA[Clarabridge]]></category>
		<category><![CDATA[Competitive intelligence]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Text mining SaaS]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=288</guid>
		<description><![CDATA[I had a brief chat with the Attensity guys at their Teradata Partners Conference booth – mainly CTO David Bean, although he did buck one question to sales chief Jeff Johnson.  The business trends story remained the same as it was in June:  The sweet spot for new sales remains Voice of the [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I had a brief chat with the Attensity guys at their Teradata Partners Conference booth – mainly CTO David Bean, although he did buck one question to sales chief Jeff Johnson.  The business trends story remained the same as it was in <a href="http://www.texttechnologies.com/2008/06/16/attensity-update-updated/" >June</a>:  The sweet spot for new sales remains Voice of the Customer/Voice of the Market, while on-premise/SaaS new-name accounts are split around 50-50 (by number, not revenue).</p>
<p style="margin-bottom: 0in;">David&#8217;s thoughts as to why the SaaS share isn&#8217;t even higher – as it seems to be for <a href="http://www.texttechnologies.com/2008/06/04/clarabridge-is-now-all-about-text-mining-saas/" >Clarabridge</a>* – centered on the point that some customers want to blend internal and external data, and may not want to ship the internal part out to a SaaS provider.  Besides, if it&#8217;s tabular data, I suspect Attensity isn&#8217;t the right place to ship it anyway.</p>
<p style="margin-bottom: 0in;"><em>*Speaking of Clarabridge, CEO Sid Banerjee recently posted a thoughtful company update in <a href="http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/" >this comment thread.</a></em></p>
<p style="margin-bottom: 0in;">When I challenged him on ease of use, David said that <strong>Attensity is readying a Microstrategy-based offering,</strong> which is obviously meant to compete with Clarabridge and any of its perceived advantages head-on.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/10/24/attensity-update-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Attivio update</title>
		<link>http://www.texttechnologies.com/2008/09/20/attivio-update/</link>
		<comments>http://www.texttechnologies.com/2008/09/20/attivio-update/#comments</comments>
		<pubDate>Sat, 20 Sep 2008 05:00:06 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attivio]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=283</guid>
		<description><![CDATA[I talked w/ Andrew McKay of Attivio for 2 ½ hours Thursday.  I&#8217;ve also been working with some Attivio engineers on a blog search engine.  I think it&#8217;s time to post about Attivio.   
In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked w/ Andrew McKay of Attivio for 2 ½ hours Thursday.  I&#8217;ve also been working with some Attivio engineers on a blog search engine.  I think it&#8217;s time to post about Attivio. <img src='http://www.texttechnologies.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  <span id="more-283"></span></p>
<p style="margin-bottom: 0in;">In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + XML store + cool extra features.  And all with seamless, lightweight, integrated installation and administration.  That&#8217;s the goal, anyway.  At this point, naturally, each individual piece is far from complete. For example:</p>
<ul>
<li>Sufficient SQL support to handle most BI tools is still a matter for future releases &#8212; apparently in 2009, although Attivio is one of those agile companies for which pinning down product releases is somewhat difficult.</li>
<li>The same goes for some basic GUI features (such as most non-programmatic search tuning).</li>
<li>ACID compliance is not a high priority for Attivio. I actually think it should be higher, just because it&#8217;s increasingly become an “OK, we don&#8217;t have to worry about THAT” checkmark item.</li>
</ul>
<p style="margin-bottom: 0in;">Even in its early days, Attivio has had some nice-sounding customer successes.  There are 8 paying Attivio customers, including 2 &gt; $1 million deals, one half-millionish dollar deal, and 1 large OEM.  3 represent actual deployments, with the rest in development.  More sales are on the way, as are permissions to disclose customer names that people will actually recognize.  Customer application stories Andrew told me about include:</p>
<ul>
<li>A web-business parameterized, 	adjustable-weight search that&#8217;s starting with tabular data and only 	getting to free-text later.</li>
<li>An enterprise that&#8217;s using Attivio 	for content management, enterprise search, public-facing search, <em>and</em> data warehousing.</li>
<li>Something 	big/mysterious/classified, with large document volumes.</li>
<li>Something to do with compliance, 	about which Andrew was going to forward a lot more detail that 	evening (Hint, hint).</li>
</ul>
<p style="margin-bottom: 0in;">Since the major RDBMS (Oracle, Microsoft SQL Server, DB2) all have text search and XML subsystems, they can in principle do everything Attivio does on the back end, and with a lot more features and maturity.  The same would go for Marklogic.   Performance and overhead might be different matters, however; Andrew certainly believes so.</p>
<p style="margin-bottom: 0in;">Except that Lucene is included on the search side, I haven&#8217;t actually figured out how Attivio stores data.  The fact that SQL features are being added incrementally suggests Attivio is rolling its own relational database capability, but how it&#8217;s organized I don&#8217;t really know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/20/attivio-update/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Low-latency text mining in the investment market</title>
		<link>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/</link>
		<comments>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/#comments</comments>
		<pubDate>Fri, 19 Sep 2008 09:15:58 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Sentiment analysis]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=282</guid>
		<description><![CDATA[I&#8217;m not at Gartner&#8217;s Event Processing conference, but there seem to be some interesting posts and articles coming out of it.  Seth Grimes has one on Reuters&#8217; integration of text mining and event processing, including sentiment analysis.  Well worth reading.  Lots more detail than I&#8217;ve ever posted on similar applications.
]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m not at Gartner&#8217;s Event Processing conference, but there seem to be some interesting posts and articles coming out of it.  Seth Grimes has one on <a href="http://www.intelligententerprise.com/blog/archives/2008/09/event_processin_1.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.intelligententerprise.com');">Reuters&#8217; integration of text mining and event processing</a>, including sentiment analysis.  Well worth reading.  Lots more detail than I&#8217;ve ever posted on <a href="http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/" >similar</a> <a href="http://www.texttechnologies.com/2007/08/03/more-on-text-processing-in-cep/" >applications</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>One overview of e-discovery</title>
		<link>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/</link>
		<comments>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/#comments</comments>
		<pubDate>Sat, 13 Sep 2008 09:17:21 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[E-discovery]]></category>
		<category><![CDATA[Enterprise search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=281</guid>
		<description><![CDATA[I just found a year-old (almost) blog post from EMC executive Andrew Cohen that succinctly lays out his view (which he believes to mainly be a consensus stance) on e-discovery.  Cohen is evidently both a lawyer and a honcho in document management system vendor EMC&#8217;s Compliance Division, which is probably relevant to interpreting his [...]]]></description>
			<content:encoded><![CDATA[<p>I just found a year-old (almost) <a href="http://andrewsblog.typepad.com/andrew/2007/11/bringing-edisco.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/andrewsblog.typepad.com');">blog post</a> from EMC executive Andrew Cohen that succinctly lays out his view (which he believes to mainly be a consensus stance) on e-discovery.  Cohen is evidently both a lawyer and a honcho in document management system vendor EMC&#8217;s Compliance Division, which is probably relevant to interpreting his outlook, in the spirit of the old Kennedy School dictum that &#8220;Where you stand depends upon where you sit.&#8221;</p>
<p>Highlights included:</p>
<ul>
<li>Information management is central to e-discovery.</li>
<li>In particular, auditability (my word) is central, if you want electronic documents to hold up as evidence in court.</li>
<li>Search is good enough, but it&#8217;s not the biggest issue in e-discovery.</li>
<li>E-mail archiving has reached the tipping point, and is increasingly a must-have, largely for its e-discovery benefits.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The layered messaging marketing model as applied to Attensity</title>
		<link>http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/</link>
		<comments>http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/#comments</comments>
		<pubDate>Mon, 08 Sep 2008 06:52:15 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Attensity]]></category>
		<category><![CDATA[Competitive intelligence]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=279</guid>
		<description><![CDATA[My general layered messaging theory survived its first test against an IT vendor example – Netezza.  Let&#8217;s try another, in this case a company that&#8217;s not a Monash Research client.
Attensity is a text mining vendor with a lot of cool technology.  Like other text mining vendors, it&#8217;s had mixed market success at best. [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">My general <a href="http://www.strategicmessaging.com/enterprise-technology-marketing-layered-messaging-model/2008/09/08/#more-35" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.strategicmessaging.com');"><strong>layered messaging</strong></a> theory survived its first test against an IT vendor example – Netezza.  Let&#8217;s try another, in this case a company that&#8217;s not a <a href="http://www.monash.com/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');"><em>Monash Research</em></a> client.<span id="more-279"></span></p>
<p style="margin-bottom: 0in;">Attensity is a text mining vendor with a lot of cool technology.  Like other text mining vendors, it&#8217;s had mixed market success at best.  However, <a href="../2008/06/10/attensity-update/">sales activity suggests that Attensity recently put together its strongest marketing story ever</a>, specifically in its new <a href="http://www.texttechnologies.com/category/text-analytics-applications/voice-of-the-customer/" >Voice of the Customer</a> / <a href="http://www.texttechnologies.com/category/text-analytics-applications/competitive-intelligence-voice-of-the-market/" >Voice of the Market</a> (VotC/VotM) focus.</p>
<p style="margin-bottom: 0in;"><em><strong>Attensity Voice of the Market messaging stack</strong></em></p>
<ul>
<li>Know what real consumers think 	about your products/services, how they react to your marketing, and 	what stories are being told about you</li>
<li><em>The only way to listen in on actual consumer conversations.  Humans can&#8217;t begin to do this.</em></li>
<li>Mine the Web to find out what&#8217;s 	being said about you; easy SaaS install</li>
<li><em>See – here are real, usable 	results</em></li>
<li>Extraction of the essence from any 	kind of text, as exhibited via proofs-of-concept</li>
</ul>
<p style="margin-bottom: 0in;">That&#8217;s a good story.  The technology works. Prospects can see that it works.  The benefits are self-evident, because the technology gives unique access to highly desirable information. (Obviously, you can&#8217;t have employees sit at their screens and try to read the whole Web on your behalf.)  The cost, time to installation, and so on are attractive.  All is good.</p>
<p style="margin-bottom: 0in;">Let&#8217;s now compare that to what probably was Attensity&#8217;s prior commercial focus, warranty analysis, for products like automobiles, other vehicles, and consumer electronics.  In this market, the story was something like:</p>
<p><em><strong>Attensity warranty messaging stack</strong></em></p>
<ul>
<li>Faster, more 	accurate warning of product problems</li>
<li><em>Human 	reading of the warranty claims is too slow or costly</em></li>
<li>Mine your 	warranty claims to see why your products break</li>
<li><em>See – here are real, usable 	results</em></li>
<li>Extraction of 	the essence from warranty claims, as exhibited via proofs-of-concept</li>
</ul>
<p style="margin-bottom: 0in;">That worked up to a point, which is a big part of why Attensity remained in business.  But in fact, there were relatively few customers for whom the assertion “Human reading of the warranty claims is too slow or costly” was true.  So relatively few sales on that basis were ever made.</p>
<p style="margin-bottom: 0in;">Now, as a market-success-prediction tool, this kind of analysis may seem like overkill.  In essence, all I&#8217;ve done is reiterate:</p>
<ul>
<li>Text mining 	has shown slow growth because too few customers had internal 	corpuses large enough to need it.</li>
<li>If you&#8217;re 	mining the whole Web, however, your corpus is enormous.</li>
</ul>
<p style="margin-bottom: 0in;">But this analysis has another point.  There&#8217;s a text mining industry consensus saying, more or less:</p>
<p style="margin-bottom: 0in;"><em>The text mining industry used to be too focused on the minutiae of technology and especially semantics, but now we&#8217;ve seen the light and are selling straight to business users who don&#8217;t really care about how the stuff works. </em></p>
<p style="margin-bottom: 0in;">As with most views held by a broad consensus of smart people, that one contains a lot of truth. But it&#8217;s missing a next act. Whether or not Attensity, Clarabridge, and TEMIS get acquired soon – as most industry participants seem to expect – it seems inevitable that there will be large, technology-rich contenders in the text mining market.  SAP/Business Objects/Inxight? Oracle/somebody? The enterprise search players? Dow Jones/Factiva?   One way or another, there will eventually be big companies in the text mining market.  Attensity (and the same goes for Clarabridge) isn&#8217;t doing much these days to position itself in advance of such an onslaught.</p>
<p style="margin-bottom: 0in;">Anyhow, whatever you think of my market-evolution views, it sure seems as if the layered-messaging template works in this example as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How good does e-discovery search need to be?</title>
		<link>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/</link>
		<comments>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/#comments</comments>
		<pubDate>Mon, 01 Sep 2008 04:44:58 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Autonomy]]></category>
		<category><![CDATA[E-discovery]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=277</guid>
		<description><![CDATA[Two years ago, CEO Mike Lynch of Autonomy tried to persuade me that Autonomy was and would remain dominant in the e-discovery search market because:

The essence of the buying decision was that enterprises wanted to fulfill obligations to make their information available in a way that would satisfy the courts.
Autonomy had some high-profile traction [...]]]></description>
			<content:encoded><![CDATA[<p>Two years ago, CEO Mike Lynch of Autonomy tried to persuade me that Autonomy was and would remain dominant in the e-discovery search market because:<span id="more-277"></span></p>
<ul>
<li>The essence of the buying decision was that enterprises wanted to fulfill obligations to make their information available in a way that would satisfy the courts.</li>
<li>Autonomy had some high-profile traction (e.g., the Enron case) that made it the default decision, and hence in particular a choice that met the requirement.</li>
</ul>
<p>Recently, I ran that theory by David Ferris, whose firm <a href="http://www.ferris.com" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ferris.com');">Ferris Research</a> has long been a/the leading small analyst firm covering e-mail and related technologies.  He wasn&#8217;t buying.  David believes courts are getting <a href="http://www.ferris.com/2008/07/22/courts-will-tolerate-search-inaccuracies/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ferris.com');">more sophisticated in their understanding of search technology</a>.  Even more to the point, David cited several other buying motivations that would lead enterprises to want best-available rather than just-good-enough e-discovery search technology, such as:</p>
<ul>
<li>Enterprises want to know what information is available to be discovered against them.</li>
<li>Enterprises want to discover the information that will best aid their legal defense.</li>
<li>If they&#8217;re archiving the material for one purpose (e-discovery) anyway, enterprises want to get the most possible value out of it for other purposes while they&#8217;re at it.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Text Analytics Marketplace: Competitive landscape and trends</title>
		<link>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/</link>
		<comments>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 07:35:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Audio and video search]]></category>
		<category><![CDATA[BI integration]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Progress and EasyAsk]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Social software and online media]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Speech recognition]]></category>
		<category><![CDATA[Structured search]]></category>
		<category><![CDATA[Text Analytics Summit]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=249</guid>
		<description><![CDATA[As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”:
1.  Web search
2.  Public-facing site search
3.  Enterprise search and knowledge management
4.  Custom publishing
5.  Text mining and extraction
Three are more standalone:
6.  Spam filtering
7. [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">1.  Web search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">2.  Public-facing site search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">3.  Enterprise search and knowledge management</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">4.  Custom publishing</p>
<p style="padding-left: 30px;">5.  Text mining and extraction</p>
<p style="margin-bottom: 0in; font-style: normal;">Three are more standalone:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">6.  Spam filtering</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">7.  Voice recognition</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">8.  Machine translation</p>
<p><span id="more-249"></span></p>
<p style="margin-bottom: 0in;">This list comes from a talk I gave Monday at the Text Analytics Summit called <em>The Text Analytics Marketplace: Competitive landscape and trends. </em>In half an hour, I covered the first five areas (in Sue Feldman&#8217;s word, at a “gallop”). The slide deck has been uploaded to the link below.  <span style="font-style: normal;"><span>I plan to break out the material from the talk into a series of blog posts over the next few (or perhaps not-so-few) weeks. </span></span></p>
<p style="margin-bottom: 0in;"><em><strong>Slides:</strong></em></p>
<ul>
<li><a href="http://www.monash.com/Text-analytics-markets-June-2008.ppt" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');"><span>The Text Analytics Marketplace: Competitive landscape and trends</span></a></li>
</ul>
<p style="margin-bottom: 0in;"><strong><em>Other posts based on those slides:</em></strong></p>
<ul>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/" >Three specialized markets for text analytics</a> (based on Slide 2)</span></li>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/6-trends-that-could-shake-up-the-text-analytics-market/" >6 trends that could shake up the text analytics market</a> (based on Slide 19)</span></li>
<li><span><a href="(in A World of Bytes)">Why search technologies are going to recombine</a> (in <em>A World of Bytes</em>, based on Slide 19)<br />
</span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>SPSS update</title>
		<link>http://www.texttechnologies.com/2008/06/17/spss-update/</link>
		<comments>http://www.texttechnologies.com/2008/06/17/spss-update/#comments</comments>
		<pubDate>Tue, 17 Jun 2008 06:51:45 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[SPSS]]></category>
		<category><![CDATA[Text Analytics Summit]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Voice of the Customer]]></category>
		<category><![CDATA[Attensity]]></category>
		<category><![CDATA[Clarabridge]]></category>
		<category><![CDATA[data mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=245</guid>
		<description><![CDATA[I emailed a bit with Olivier Jouve last week, and chatted with him at the Text Analytics Summit yesterday.  He cited a figure of 2400 SPSS text mining users (unique user organizations).  The majority of these are for a low-cost, desktop-based surveys product.  But when I pressed him, he eventually gave a [...]]]></description>
			<content:encoded><![CDATA[<p>I emailed a bit with Olivier Jouve last week, and chatted with him at the Text Analytics Summit yesterday.  He cited a figure of 2400 SPSS text mining users (unique user organizations).  The majority of these are for a low-cost, desktop-based surveys product.  But when I pressed him, he eventually gave a 500-1000 figure for actual Text Mining For Clementine users.<span id="more-245"></span></p>
<p>That is, of course, hugely more than any of the independents (e.g. Attensity and Clarabridge) have.  And it&#8217;s focused on marketing-oriented apps &#8212; especially Voice of the Customer &#8212; just as those vendors are.  Even so, they report rarely seeing SPSS, and SPSS agrees with that assessment.</p>
<p>The obvious explanation &#8212; which Olivier does not dispute &#8212; is that Text Mining For Clementine sales are focused on Clementine data mining users.  But that raises an interesting follow-up &#8212; how much data mining are these users really doing on text data?  Attensity and Clarabridge customers do little true data mining, but Olivier asserts that SPSS customers do quite a bit &#8212; predictive modeling, real-time scoring, and the whole enchilada.</p>
<p>By the way, Olivier actually no longer runs SPSS&#8217; text mining business.  He&#8217;s moved to Chicago as VP of Corporate Development, focused on acquisitions.  Coincidentally, he has a glum view of the prospects for independent text analytics companies, and believes the best course for them is to be acquired.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/17/spss-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
