<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Text Technologies &#187; Application areas</title>
	<atom:link href="http://www.texttechnologies.com/category/text-analytics-applications/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Wed, 18 Jan 2012 17:02:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Social technology in the enterprise</title>
		<link>http://www.texttechnologies.com/2011/09/14/social-technology-in-the-enterprise/</link>
		<comments>http://www.texttechnologies.com/2011/09/14/social-technology-in-the-enterprise/#comments</comments>
		<pubDate>Wed, 14 Sep 2011 06:04:36 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[E-discovery]]></category>
		<category><![CDATA[Social software and online media]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=510</guid>
		<description><![CDATA[The recent Dreamforce conference (i.e., salesforce.com&#8217;s extravaganza) focused attention on &#8220;the social enterprise&#8221; or, more generally, enterprises&#8217; uses of social technology. salesforce is evidently serious about this push, with development/acquisition investment (e.g. Chatter, Radian 6), marketing focus (e.g. much of Dreamforce) and sales effort (Marc Benioff says he got thrown out of a CIO&#8217;s office [...]]]></description>
			<content:encoded><![CDATA[<p>The recent Dreamforce conference (i.e., salesforce.com&#8217;s extravaganza) focused attention on &#8220;the social enterprise&#8221; or, more generally, enterprises&#8217; uses of social technology. salesforce is evidently serious about this push, with development/acquisition investment (e.g. Chatter, Radian 6), marketing focus (e.g. much of Dreamforce) and sales effort (Marc Benioff says he got thrown out of a CIO&#8217;s office because he wouldn&#8217;t stop talking about the &#8220;social&#8221; subject) all aligned.</p>
<p><em><a href="http://www.enterpriseirregulars.com/41437/some-economic-consequences-of-dreamforce/?utm_source=feedburner&amp;utm_medium=twitter&amp;utm_campaign=Feed%3A+EIblogs+%28Enterprise+Irregulars%29" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.enterpriseirregulars.com');">Denis Pombriant</a> obviously attended the same Marc Benioff session I did. <a href="http://www.zdnet.com/blog/hinchcliffe/the-promise-and-challenges-of-benioffs-social-enterprise-vision/1722" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.zdnet.com');">Dion Hinchcliffe</a> blogged the whole story in considerable detail.</em></p>
<p>It&#8217;s a cool story, and worthy of attention. But I&#8217;d like to step back and remind us that there are numerous different ways to use social technology in the enterprise, which probably shouldn&#8217;t be confused with each other. And then I&#8217;d like to discuss one area of social technology that&#8217;s relatively new to me: <strong>integration between social and operational applications.</strong></p>
<p><span id="more-510"></span>Suppose we split up social technology use cases by saying it can help you:</p>
<ul>
<li>Communicate and collaborate internally &#8230;</li>
<li>&#8230; and also with small groups of outsiders, such as your supply chain.</li>
<li>Observe, listen to, and interact with consumers (and the world at large).</li>
</ul>
<p>The biggest buzz, of course, is around social technology that reaches out to the buying public or world at large. You can use social technology to:</p>
<ul>
<li>Observe and listen to consumers &#8212; i.e., classic <a href="../../../../../category/text-analytics-applications/voice-of-the-customer/">Voice of the Customer/Voice of the Market</a> text analytics.</li>
<li>Publish to consumers, influencers, etc., via blogging, broadcast-oriented Twitter, and other social media, or go even further and &#8230;</li>
<li>&#8230; communicate with consumers interactively, whether through loosely-structured interaction (e.g. Twitter), or in the more structured ways that <a href="../../../../../2010/12/01/state-of-the-art-text-analytics-mining-applications/">Attensity</a> and others provide.</li>
</ul>
<p>I support all that, and indeed participate ferociously myself. But for now, let&#8217;s move on.</p>
<p>On the internal collaboration/communication side, I&#8217;d say:</p>
<ul>
<li>Any communication tool useful for communicating with the public may be valuable internally as well &#8212; <a href="http://www.monashreport.com/2006/01/20/the-power-of-portals/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monashreport.com');">portals</a>, blogs, Twitter-imitators, and so on.</li>
<li>Pure email &#8220;push&#8221; may not always be the best tool for point-to-point internal communication.</li>
<li>Text analytics on internal communication can have a variety of uses, e.g.:
<ul>
<li>Compliance (yet another privacy intrusion, but sometimes a legitimate one).</li>
<li>Internal expert-finding. (In principle, this is the traditional genuine benefit of elaborate &#8220;knowledge management&#8221; implementations, but without the burdens of traditional knowledge management. In practice, that didn&#8217;t work out so great for <a href="http://en.wikipedia.org/wiki/Tacit_Software" onclick="javascript:pageTracker._trackPageview('/outbound/article/en.wikipedia.org');">Tacit Software</a>.)</li>
<li><a href="../../../../../2006/07/11/google-project-knowledge-management/">Project management</a>.</li>
</ul>
</li>
</ul>
<p>That all gives plenty of scope for useful adoption, on both the email-replacement and text-analytic sides. But again, let&#8217;s keep going.</p>
<p>The part of the social technology story that&#8217;s relatively new to me &#8212; notwithstanding the &#8220;portals&#8221; link above &#8212; is <strong>integration between social and operational applications.</strong> While at Dreamforce, I talked with two manufacturing application SaaS vendors &#8212; Kenandy and Rootstock Software. In both cases I asked &#8220;So what are you doing that&#8217;s an advance over where MRP* was 20 years ago?&#8221; In both cases the main answer was &#8220;Now users can use social technology to track and communicate about particular orders or issues.&#8221;</p>
<p><em>*MRP stood for &#8220;Material Requirements Planning&#8221; and then &#8220;Manufacturing Resources Planning&#8221;, and is essentially the forerunner of ERP. By &#8220;Kenandy&#8221; I specifically mean Kenandy&#8217;s founder &#8212; ASK Computer Systems founder and thus MRP legend Sandy Kurtzig.</em></p>
<p>Good point. Of course, it can be generalized; <strong>one can communicate and collaborate around almost any kind of business process. </strong>I&#8217;ve mentioned this before in analytic contexts; it&#8217;s an important concept on the monitoring-oriented side of <a href="http://www.dbms2.com/2009/05/30/reinventing-business-intelligence/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">business intelligence</a> and &#8212; if <a href="http://www.dbms2.com/2010/10/06/ebay-followup-greenplum-out-teradata-10-petabytes-hadoop-has-some-value-and-more/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Oliver Ratzesberger</a> is to be believed &#8212; in investigative analytics as well. But the operational side may actually be more important.</p>
<p>Some things one does in the business world actually involve using one&#8217;s body, from manufacturing products to repairing power stations to standing in a store and serving customers. Most of the rest fits into one or more of three buckets:</p>
<ul>
<li>Creating (a product, a marketing plan, a marketing document, a compensation plan, a program for internal use, an analytic insight, &#8230;)</li>
<li>Relating (to an employee, a sales prospect, a reporter, &#8230;)</li>
<li>Participating in a fairly routine business process (data entry, accounting, mortgage approval, parts ordering, &#8230;)</li>
</ul>
<p>And why can&#8217;t we just automate those routine business processes away? Because there&#8217;s so often a need for manual intervention. And <strong>when there&#8217;s a need for manual intervention, there&#8217;s usually also an element of communicating with other people.</strong> This is almost always true in cases of trouble-shooting or exception-handling (an order is late, a system is down, the automated result violates common sense). It may be present in other cases as well (the new account calls for a personal thank you note, the food order needs to be annotated with special requests). General email is commonly an awkward medium for these communications; automated messages are worse. Newer social technologies, however, have the potential to do much better.</p>
<p><em>So what do you think? Have I drunk too much Kool-Aid, or is this stuff for real?</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2011/09/14/social-technology-in-the-enterprise/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The state of the art in text analytics applications</title>
		<link>http://www.texttechnologies.com/2010/12/01/state-of-the-art-text-analytics-mining-applications/</link>
		<comments>http://www.texttechnologies.com/2010/12/01/state-of-the-art-text-analytics-mining-applications/#comments</comments>
		<pubDate>Thu, 02 Dec 2010 02:06:54 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Attensity]]></category>
		<category><![CDATA[BI integration]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[SPSS]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=443</guid>
		<description><![CDATA[Text analytics application areas typically fall into one or more of three broad, often overlapping domains: Understanding the opinions of customers, prospects, or other groups. This can be based on any combination of documents the user organization controls (email, surveys, warranty reports, call center logs, etc.) or public-domain documents such [...]]]></description>
			<content:encoded><![CDATA[<p>Text analytics application areas typically fall into one or more of three broad, often overlapping domains:</p>
<ul>
<li><strong>Understanding the opinions of customers, prospects, or other groups.</strong> This can be based on any combination of documents the user organization controls (email, surveys, warranty reports, call center logs, etc.) or public-domain documents such as blogs, forum posts, and tweets. The former is usually called <strong>Voice of the Customer (VotC),</strong> while the latter is <strong>Voice of the Market (VotM).</strong></li>
<li><strong>Detecting and identifying problems.</strong> This can happen across many domains &#8212; VotC, VotM, diagnosing equipment malfunctions, identifying bad guys (from terrorists to fraudsters), or even getting early warnings of infectious disease outbreaks.</li>
<li><strong>Aiding text search, custom publishing, and other electronic document-shuffling use cases,</strong> often via document <a href="http://www.dbms2.com/2010/11/29/data-that-is-derived-augmented-enhanced-adjusted-or-cooked/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">augmentation</a>.</li>
</ul>
<p>For several years, I&#8217;ve been distressed at the lack of progress in text analytics or, as it used to be called, text mining. Yes, the rise of <a href="../../../../../category/text-mining/sentiment-analysis/">sentiment analysis</a> has been impressive, and higher volumes of text data are being processed than were before. But otherwise, there&#8217;s been a lot of the same old, same old. Most actual deployed applications of text analytics or text mining go something like this:</p>
<ul>
<li>A bunch of documents are analyzed to ascertain the ideas expressed in them.</li>
<li>A count is made as to how many times each idea turns up.</li>
<li>The application user notices any surprisingly large numbers and, as a result, pays attention to the corresponding ideas.</li>
</ul>
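<p>As a minimal sketch of that count-and-notice template: the Python below assumes a hypothetical phrase lexicon (<code>IDEA_PHRASES</code>) stands in for real idea extraction, and the <code>ratio</code> and <code>min_count</code> thresholds are illustrative choices, not anything a particular vendor is known to use.</p>
<pre>
```python
from collections import Counter

# Hypothetical idea lexicon: idea label -> trigger phrases. This is an
# assumption for illustration; real products use far richer extraction.
IDEA_PHRASES = {
    "billing complaint": ["overcharged", "billing error"],
    "slow service": ["slow", "long wait"],
}


def extract_ideas(document):
    """Return the set of idea labels whose trigger phrases occur in the text."""
    text = document.lower()
    return {
        idea
        for idea, phrases in IDEA_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    }


def count_ideas(documents):
    """Count how many documents express each idea."""
    counts = Counter()
    for document in documents:
        counts.update(extract_ideas(document))
    return counts


def surprising_ideas(current, baseline, ratio=3.0, min_count=2):
    """List ideas whose current count is at least `ratio` times the baseline."""
    flagged = []
    for idea, count in current.items():
        previous = max(baseline.get(idea, 0), 1)  # avoid dividing by zero
        if count >= min_count and count / previous >= ratio:
            flagged.append(idea)
    return sorted(flagged)


# Made-up sample documents for two reporting periods.
last_month = ["Service was slow today.", "I was overcharged again."]
this_month = [
    "Overcharged twice this month!",
    "Another billing error on my invoice.",
    "They overcharged me.",
    "Slow as usual.",
]

flagged = surprising_ideas(count_ideas(this_month), count_ideas(last_month))
print(flagged)
```
</pre>
<p>On the sample data, only the billing complaint is flagged: its count triples from one period to the next, while the single slow-service mention falls under <code>min_count</code>.</p>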
<p>Often, it seems desirable to integrate text analytics with business intelligence and/or predictive analytics tools that operate on tabular data. Even so, such <strong>integration is most commonly weak or nonexistent.</strong> Apart from the usual reasons for silos of automation, I blame this lack on a mismatch in precision, among <a href="../../../../../2008/10/24/text-mining-data-warehousin/">other reasons</a>. A 500% increase in mentions of a subject could be simple coincidence, or the result of a single identifiable press article. In comparison, a 5% increase in a conventional business metric might be much more important.</p>
<p>But in fairness, <strong>the text analytics innovation picture hasn&#8217;t been quite as bleak as what I&#8217;ve been painting so far. </strong><span id="more-443"></span>While standalone, passively-reported text analytics is indeed the baseline, there are some interesting exceptions. For example:</p>
<ul>
<li>I once confirmed that SPSS customer <a href="http://www.spss.com/press/template_view.cfm?PR_ID=1059" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.spss.com');">Cablecom</a>&#8217;s statistical models for churn and the like absolutely included text data; Cablecom even assigned different weights to the same apparent level of emotion depending on whether the text was in German, French, or Italian. Vertica recently told me of a <a href="http://www.dbms2.com/2010/10/12/vertica-hadoop-connector-integration/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Vertica/Hadoop</a> customer doing something similar, except for the multilingual aspect. And the end of a <a href="http://www2.sas.com/proceedings/forum2008/123-2008.pdf" onclick="javascript:pageTracker._trackPageview('/outbound/article/www2.sas.com');">2008 SAS-based paper</a> makes similar claims.</li>
<li>There long* have been some examples of fact extraction that don&#8217;t really fit into my three buckets above. For example, researchers mine collections of articles to try to determine biochemical or biological pathways that would not be apparent from examining single research studies alone.</li>
<li>It also has long* been the case that some bad-guy-finding applications &#8212; especially in the anti-terrorism area &#8212; used text analytics to populate state-of-the-art <a href="http://www.dbms2.com/2009/08/21/social-network-analysis-aka-relationship-analytics/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">graph-oriented data analysis tools</a>.</li>
</ul>
<p><em>*When it comes to text analytics, &#8220;long&#8221; means &#8220;at least for the past several years.&#8221;</em></p>
<p>In more recent examples:</p>
<ul>
<li><a href="http://www.dbms2.com/category/products-and-vendors/greenplum/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Greenplum</a> built a document recommender for law firms that does hard-core statistical analysis to determine which .1% of a document set lawyers might actually want to see, and which then learns from users&#8217; feedback after they respond to initial result sets.</li>
<li><a href="../../../../../2008/09/19/low-latency-text-mining-in-the-investment-market/">Information extracted from investment news</a> gets included into automated trading algorithms. This was unusual technology a couple of years ago, but is more common today.</li>
<li>After a series of mergers, <a href="../../../../../2009/04/20/the-new-attensity-deal-overview/">Attensity</a> now uses marketing-oriented text analytics in at least three different ways:
<ul>
<li>Attensity text analytics feeds marketing dashboards just as it always did.</li>
<li>Attensity text analytics triggers alerts, as I wish dashboards and business intelligence tools more often did, <a href="http://www.dbms2.com/2010/07/25/alerts-metrics-dashboards/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">the false positives problem</a> notwithstanding.</li>
<li>Attensity text analytics triggers concrete workflows, for example <a href="http://www.attensity.com/2010/10/05/attensity-announces-respond-for-social-media/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.attensity.com');">routing specific social media hits for priority response</a>.</li>
</ul>
</li>
<li>And in one example that did not actually get into production, a very large social networking company correlated word usage (e.g., choice among different synonyms) against user characteristics such as age and gender.</li>
</ul>
<p>Finally, there are some applications that, while fitting the standard template, just strike me as getting to unusually sophisticated levels of analysis. For example, Vertica told me of another Vertica/Hadoop case where VotM document analysis is carried out to the level of observing the order in which brand names appear, and adjusting for whether or not that order was just an alphabetical list.</p>
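<p>The alphabetical-list adjustment might be sketched roughly as follows. This is not the actual method of Vertica or its customer; the brand list, the function names, and the rule that only three or more sorted brands count as an alphabetical list are all assumptions made for illustration.</p>
<pre>
```python
import re

# Hypothetical brand list, purely for illustration.
BRANDS = ["Acme", "Bolt", "Zenith"]


def brand_order(text, brands=BRANDS):
    """Return the brands mentioned in `text`, ordered by first appearance."""
    first_seen = []
    for brand in brands:
        match = re.search(r"\b" + re.escape(brand) + r"\b", text, re.IGNORECASE)
        if match:
            first_seen.append((match.start(), brand))
    return [brand for _, brand in sorted(first_seen)]


def looks_alphabetical(order, min_brands=3):
    """Guess whether an order is just an alphabetical list.

    Assumption: two brands land in alphabetical order by chance about half
    the time, so only runs of `min_brands` or more are treated as lists.
    """
    return len(order) >= min_brands and order == sorted(order, key=str.lower)


def informative_order(text):
    """Return the brand order, or None when it may be an alphabetical list."""
    order = brand_order(text)
    if looks_alphabetical(order):
        return None
    return order


print(informative_order("I switched from Zenith to Acme last year."))
print(informative_order("We stock Acme, Bolt and Zenith."))
```
</pre>
<p>The first call keeps the order Zenith, then Acme; the second returns <code>None</code>, because three brands in sorted order look like a catalog listing and not like a preference.</p>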
<p>I suspect <strong>text analytics is about to become more interesting again.</strong></p>
<p><strong><em>Related links</em></strong></p>
<ul>
<li>The enabling <a href="../../../../../2006/06/24/attensity-extractive-exhaustion-and-the-frn/">technology for text/tabular data integration</a> has existed for years.</li>
<li>In 2006, I listed <a href="http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monashreport.com');">major application areas for data mining/predictive analytics</a>. It overlaps pretty closely with the similar list for text mining/text analytics.</li>
<li>Before being acquired by IBM, <a href="../../../../../2008/06/17/spss-update/">SPSS boasted a rather large text mining user base</a>.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2010/12/01/state-of-the-art-text-analytics-mining-applications/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>More website weirdness</title>
		<link>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/</link>
		<comments>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/#comments</comments>
		<pubDate>Thu, 20 Nov 2008 03:27:27 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=298</guid>
		<description><![CDATA[Here&#8217;s something longer-lasting and weirder than Vertica&#8217;s &#8220;We sell turkeys&#8221; theme: Mark Logic, whose product is used primarily to help enterprises make their content more accessible, doesn&#8217;t have a search engine on its own website.* *Or if it does, it&#8217;s VERY well-hidden. I looked at the home page and site map alike. I wanted to [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s something longer-lasting and weirder than <a href="http://www.dbms2.com/2008/11/18/silly-website-tricks/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Vertica&#8217;s &#8220;We sell turkeys&#8221; theme</a>: Mark Logic, whose product is used primarily to help enterprises make their content more accessible, doesn&#8217;t have a search engine on its own website.*<span id="more-298"></span></p>
<p><em>*Or if it does, it&#8217;s VERY well-hidden. I looked at the home page and site map alike.</em></p>
<p>I wanted to refresh my memory as to Mark Logic&#8217;s history of working with specific text mining vendors, beyond what&#8217;s on the official <a href="http://www.marklogic.com/partners/open-enrichment-framework.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.marklogic.com');">partner page</a>. No luck.  Normally when site search is inadequate, one goes to Google.   But that&#8217;s problematic too.  Marklogic.com pages come up pretty low on Google&#8217;s search results, suggesting that:</p>
<ol>
<li>Mark Logic doesn&#8217;t put a lot of effort into SEO (or else doesn&#8217;t do it very well).</li>
<li>One can&#8217;t be confident all the site&#8217;s significant pages are findable by Google.</li>
</ol>
<p>Looking to other companies&#8217; sites for clues isn&#8217;t conclusive either.  E.g., <a href="http://clearforest.com/Partners/PartnerDetails.asp?id=11" onclick="javascript:pageTracker._trackPageview('/outbound/article/clearforest.com');">Clearforest lists Mark Logic as a partner</a>, but Mark Logic doesn&#8217;t return the compliment.  (If memory serves, Mark Logic and Clearforest have worked together both on national security deals and custom publishing deals &#8212; but don&#8217;t hold me to that.)</p>
<p>When it comes to making its own information conveniently available, Mark Logic is quite the unshod cobbler.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Are denial-of-insight attacks a threat to search logs and/or VOTC/VOTM apps?</title>
		<link>http://www.texttechnologies.com/2008/11/12/denial-of-insight-attacks/</link>
		<comments>http://www.texttechnologies.com/2008/11/12/denial-of-insight-attacks/#comments</comments>
		<pubDate>Wed, 12 Nov 2008 07:45:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Competitive intelligence]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=295</guid>
		<description><![CDATA[TechTaxi points out that it&#8217;s at least theoretically possible to, by polluting the Web, pollute somebody&#8217;s web-wide information gathering. (Hat tip to Daniel Tunkelang.) They further assert this is a relatively near-term threat. The theory can&#8217;t be denied. What&#8217;s more, bad actors have other motives to pollute the Web. For example, if they plant favorable [...]]]></description>
			<content:encoded><![CDATA[<p>TechTaxi <a href="http://techtaxi.blogspot.com/2006/04/denial-of-insight-attacks-could.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/techtaxi.blogspot.com');">points out</a> that it&#8217;s at least theoretically possible to, by polluting the Web, pollute somebody&#8217;s web-wide information gathering.  (Hat tip to <a href="http://thenoisychannel.com/2008/11/11/big-google-can-be-benign/" onclick="javascript:pageTracker._trackPageview('/outbound/article/thenoisychannel.com');">Daniel Tunkelang</a>.)  They further assert this is a relatively near-term threat.</p>
<p>The theory can&#8217;t be denied. What&#8217;s more, bad actors have other motives to pollute the Web.  For example, if they plant favorable automated comments about their own products or unfavorable ones about the competition&#8217;s, <a href="http://www.texttechnologies.com/2008/06/17/voice-of-the-customermarket-indeed-where-the-action-is/" >Voice of the Customer/Market</a> applications will naturally be confused.  And if automated reputation-checkers get more prominent, there will be a <em>major</em> incentive to game them, just as there has been for Google&#8217;s PageRank.  So VOTC/VOTM market research tools could be polluted as a side effect.</p>
<p>Similarly, if somebody wants to test your e-commerce site by throwing a ton of searches at it, your search logs will lose value.</p>
<p>But disinformation of competitors for the sake of disinformation? Or, as the article suggests, vandalism/extortion? Off the top of my head, I&#8217;m not thinking of a serious near-term threat scenario.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/11/12/denial-of-insight-attacks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Attensity update</title>
		<link>http://www.texttechnologies.com/2008/10/24/attensity-update-2/</link>
		<comments>http://www.texttechnologies.com/2008/10/24/attensity-update-2/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 04:29:24 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attensity]]></category>
		<category><![CDATA[Clarabridge]]></category>
		<category><![CDATA[Competitive intelligence]]></category>
		<category><![CDATA[Software as a Service (SaaS)]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Text mining SaaS]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=288</guid>
		<description><![CDATA[I had a brief chat with the Attensity guys at their Teradata Partners Conference booth – mainly CTO David Bean, although he did buck one question to sales chief Jeff Johnson. The business trends story remained the same as it was in June: The sweet spot for new sales remains Voice of the Customer/Voice of [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I had a brief chat with the Attensity guys at their Teradata Partners Conference booth – mainly CTO David Bean, although he did buck one question to sales chief Jeff Johnson.  The business trends story remained the same as it was in <a href="http://www.texttechnologies.com/2008/06/16/attensity-update-updated/" >June</a>:  The sweet spot for new sales remains Voice of the Customer/Voice of the Market, while on-premise/SaaS new-name accounts are split around 50-50 (by number, not revenue).</p>
<p style="margin-bottom: 0in;">David&#8217;s thoughts as to why the SaaS share isn&#8217;t even higher – as it seems to be for <a href="http://www.texttechnologies.com/2008/06/04/clarabridge-is-now-all-about-text-mining-saas/" >Clarabridge</a>* – centered on the point that some customers want to blend internal and external data, and may not want to ship the internal part out to a SaaS provider.  Besides, if it&#8217;s tabular data, I suspect Attensity isn&#8217;t the right place to ship it anyway.</p>
<p style="margin-bottom: 0in;"><em>*Speaking of Clarabridge, CEO Sid Banerjee recently posted a thoughtful company update in <a href="http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/" >this comment thread.</a></em></p>
<p style="margin-bottom: 0in;">When I challenged him on ease of use, David said that <strong>Attensity is readying a Microstrategy-based offering,</strong> which is obviously meant to compete with Clarabridge and any of its perceived advantages head-on.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/10/24/attensity-update-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Attivio update</title>
		<link>http://www.texttechnologies.com/2008/09/20/attivio-update/</link>
		<comments>http://www.texttechnologies.com/2008/09/20/attivio-update/#comments</comments>
		<pubDate>Sat, 20 Sep 2008 05:00:06 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attivio]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=283</guid>
		<description><![CDATA[I talked w/ Andrew McKay of Attivio for 2 ½ hours Thursday. I&#8217;ve also been working with some Attivio engineers on a blog search engine. I think it&#8217;s time to post about Attivio. In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + XML store + cool [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked w/ Andrew McKay of Attivio for 2 ½ hours Thursday.  I&#8217;ve also been working with some Attivio engineers on a blog search engine.  I think it&#8217;s time to post about Attivio. <img src='http://www.texttechnologies.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  <span id="more-283"></span></p>
<p style="margin-bottom: 0in;">In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + XML store + cool extra features.  And all with seamless, lightweight, integrated installation and administration.  That&#8217;s the goal, anyway.  At this point, naturally, each individual piece is far from complete. For example:</p>
<ul>
<li>Sufficient SQL support to handle most BI tools is still a matter for future releases &#8212; apparently in 2009, although Attivio is one of those agile companies for which pinning down product releases is somewhat difficult.</li>
<li>The same goes for some basic GUI features (such as most non-programmatic search tuning).</li>
<li>ACID compliance is not a high priority for Attivio. I actually think it should be higher, just because it&#8217;s increasingly become an “OK, we don&#8217;t have to worry about THAT” checkmark item.</li>
</ul>
<p style="margin-bottom: 0in;">Even in its early days, Attivio has had some nice-sounding customer successes.  There are 8 paying Attivio customers, including 2 &gt; $1 million deals, one half-millionish dollar deal, and 1 large OEM.  3 represent actual deployments, with the rest in development.  More sales are on the way, as are permissions to disclose customer names that people will actually recognize.  Customer application stories Andrew told me about include:</p>
<ul>
<li>A web-business parameterized, adjustable-weight search that&#8217;s starting with tabular data and only getting to free-text later.</li>
<li>An enterprise that&#8217;s using Attivio for content management, enterprise search, public-facing search, <em>and</em> data warehousing.</li>
<li>Something big/mysterious/classified, with large document volumes.</li>
<li>Something to do with compliance, about which Andrew was going to forward a lot more detail that evening (Hint, hint).</li>
</ul>
<p style="margin-bottom: 0in;">Since the major RDBMS (Oracle, Microsoft SQL Server, DB2) all have text search and XML subsystems, they can in principle do everything Attivio does on the back end, and with a lot more features and maturity.  The same would go for Marklogic.   Performance and overhead might be different matters, however; Andrew certainly believes so.</p>
<p style="margin-bottom: 0in;">Except that Lucene is included on the search side, I haven&#8217;t actually figured out how Attivio stores data.  The fact that SQL features are being added incrementally suggests Attivio is rolling its own relational database capability, but how it&#8217;s organized I don&#8217;t really know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/20/attivio-update/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Low-latency text mining in the investment market</title>
		<link>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/</link>
		<comments>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/#comments</comments>
		<pubDate>Fri, 19 Sep 2008 09:15:58 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Sentiment analysis]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=282</guid>
		<description><![CDATA[I&#8217;m not at Gartner&#8217;s Event Processing conference, but there seem to be some interesting posts and articles coming out of it. Seth Grimes has one on Reuters&#8217; integration of text mining and event processing, including sentiment analysis. Well worth reading. Lots more detail than I&#8217;ve ever posted on similar applications.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m not at Gartner&#8217;s Event Processing conference, but there seem to be some interesting posts and articles coming out of it.  Seth Grimes has one on <a href="http://www.intelligententerprise.com/blog/archives/2008/09/event_processin_1.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.intelligententerprise.com');">Reuters&#8217; integration of text mining and event processing</a>, including sentiment analysis.  Well worth reading.  Lots more detail than I&#8217;ve ever posted on <a href="http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/" >similar</a> <a href="http://www.texttechnologies.com/2007/08/03/more-on-text-processing-in-cep/" >applications</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>One overview of e-discovery</title>
		<link>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/</link>
		<comments>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/#comments</comments>
		<pubDate>Sat, 13 Sep 2008 09:17:21 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[E-discovery]]></category>
		<category><![CDATA[Enterprise search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=281</guid>
		<description><![CDATA[I just found a year-old (almost) blog post from EMC executive Andrew Cohen that succinctly lays out his view (which he believes to mainly be a consensus stance) on e-discovery. Cohen is evidently both a lawyer and a honcho in document management system vendor EMC&#8217;s Compliance Division, which is probably relevant to interpreting his outlook, [...]]]></description>
			<content:encoded><![CDATA[<p>I just found a year-old (almost) <a href="http://andrewsblog.typepad.com/andrew/2007/11/bringing-edisco.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/andrewsblog.typepad.com');">blog post</a> from EMC executive Andrew Cohen that succinctly lays out his view (which he believes to mainly be a consensus stance) on e-discovery.  Cohen is evidently both a lawyer and a honcho in document management system vendor EMC&#8217;s Compliance Division, which is probably relevant to interpreting his outlook, in the spirit of the old Kennedy School dictum that &#8220;Where you stand depends upon where you sit.&#8221;</p>
<p>Highlights included:</p>
<ul>
<li>Information management is central to e-discovery.</li>
<li>In particular, auditability (my word) is central, if you want electronic documents to hold up as evidence in court.</li>
<li>Search is good enough, but it&#8217;s not the biggest issue in e-discovery.</li>
<li>E-mail archiving has reached the tipping point, and is increasingly a must-have, largely for its e-discovery benefits.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The layered messaging marketing model as applied to Attensity</title>
		<link>http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/</link>
		<comments>http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/#comments</comments>
		<pubDate>Mon, 08 Sep 2008 06:52:15 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Attensity]]></category>
		<category><![CDATA[Competitive intelligence]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=279</guid>
		<description><![CDATA[My general layered messaging theory survived its first test against an IT vendor example – Netezza. Let&#8217;s try another, in this case a company that&#8217;s not a Monash Research client. Attensity is a text mining vendor with a lot of cool technology. Like other text mining vendors, it&#8217;s had mixed market success at best. However, [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">My general <a href="http://www.strategicmessaging.com/enterprise-technology-marketing-layered-messaging-model/2008/09/08/#more-35" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.strategicmessaging.com');"><strong>layered messaging</strong></a> theory survived its first test against an IT vendor example – Netezza.  Let&#8217;s try another, in this case a company that&#8217;s not a <a href="http://www.monash.com/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');"><em>Monash Research</em></a> client.<span id="more-279"></span></p>
<p style="margin-bottom: 0in;">Attensity is a text mining vendor with a lot of cool technology.  Like other text mining vendors, it&#8217;s had mixed market success at best.  However, <a href="../2008/06/10/attensity-update/">sales activity suggests that Attensity recently put together its strongest marketing story ever</a>, specifically in its new <a href="http://www.texttechnologies.com/category/text-analytics-applications/voice-of-the-customer/" >Voice of the Customer</a> / <a href="http://www.texttechnologies.com/category/text-analytics-applications/competitive-intelligence-voice-of-the-market/" >Voice of the Market</a> (VotC/VotM) focus.</p>
<p style="margin-bottom: 0in;"><em><strong>Attensity Voice of the Market messaging stack</strong></em></p>
<ul>
<li>Know what real consumers think 	about your products/services, how they react to your marketing, and 	what stories are being told about you</li>
<li><em>The only way to listen in on actual consumer conversations.  Humans can&#8217;t begin to do this.</em></li>
<li>Mine the Web to find out what&#8217;s 	being said about you; easy SaaS install</li>
<li><em>See – here are real, usable 	results</em></li>
<li>Extraction of the essence from any 	kind of text, as exhibited via proofs-of-concept</li>
</ul>
<p style="margin-bottom: 0in;">That&#8217;s a good story.  The technology works. Prospects can see that it works.  The benefits are self-evident, because the technology gives unique access to highly desirable information. (Obviously, you can&#8217;t have employees sit at their screens and try to read the whole Web on your behalf.)  The cost, time to installation, and so on are attractive.  All is good.</p>
<p style="margin-bottom: 0in;">Let&#8217;s now compare that to what probably was Attensity&#8217;s prior commercial focus, warranty analysis, for products like automobiles, other vehicles, and consumer electronics.  In this market, the story was something like:</p>
<p><em><strong>Attensity warranty messaging stack</strong></em></p>
<ul>
<li>Faster, more 	accurate warning of product problems</li>
<li><em>Human 	reading of the warranty claims is too slow or costly</em></li>
<li>Mine your 	warranty claims to see why your products break</li>
<li><em>See – here are real, usable 	results</em></li>
<li>Extraction of 	the essence from warranty claims, as exhibited via proofs-of-concept</li>
</ul>
<p style="margin-bottom: 0in;">That worked up to a point, which is a big part of why Attensity remained in business.  But in fact, there were relatively few customers for whom the assertion “Human reading of the warranty claims is too slow or costly” was true.  So relatively few sales on that basis were ever made.</p>
<p style="margin-bottom: 0in;">Now, as a market-success-prediction tool, this kind of analysis may seem like overkill.  In essence, all I&#8217;ve done is reiterate:</p>
<ul>
<li>Text mining 	has shown slow growth because too few customers had internal 	corpuses large enough to need it.</li>
<li>If you&#8217;re 	mining the whole Web, however, your corpus is enormous.</li>
</ul>
<p style="margin-bottom: 0in;">But this analysis has another point.  There&#8217;s a text mining industry consensus saying, more or less:</p>
<p style="margin-bottom: 0in;"><em>The text mining industry used to be too focused on the minutiae of technology and especially semantics, but now we&#8217;ve seen the light and are selling straight to business users who don&#8217;t really care about how the stuff works. </em></p>
<p style="margin-bottom: 0in;">As with most views held by a broad consensus of smart people, that one contains a lot of truth. But it&#8217;s missing a next act. Whether or not Attensity, Clarabridge, and TEMIS get acquired soon – as most industry participants seem to expect – it seems inevitable that there will be large, technology-rich contenders in the text mining market.  SAP/Business Objects/Inxight? Oracle/somebody? The enterprise search players? Dow Jones/Factiva?   One way or another, there will eventually be big companies in the text mining market.  Attensity (and the same goes for Clarabridge) isn&#8217;t doing much these days to position itself in advance of such an onslaught.</p>
<p style="margin-bottom: 0in;">Anyhow, whatever you think of my market-evolution views, it sure seems as if the layered-messaging template works in this example as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/08/attensit-layered-messaging-marketing-model/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How good does e-discovery search need to be?</title>
		<link>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/</link>
		<comments>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/#comments</comments>
		<pubDate>Mon, 01 Sep 2008 04:44:58 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Autonomy]]></category>
		<category><![CDATA[E-discovery]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=277</guid>
		<description><![CDATA[Two years ago, CEO Mike Lynch of Autonomy tried to persuade me that Autonomy was and would remain dominant in the e-discovery search market because: The essence of the buying decision was that enterprises wanted to fulfill obligations to make their information available in a way that would satisfy the courts. Autonomy had some [...]]]></description>
			<content:encoded><![CDATA[<p>Two years ago, CEO Mike Lynch of Autonomy tried to persuade me that Autonomy was and would remain dominant in the e-discovery search market because:<span id="more-277"></span></p>
<ul>
<li>The essence of the buying decision was that enterprises wanted to fulfill obligations to make their information available in a way that would satisfy the courts.</li>
<li>Autonomy had some high-profile traction (e.g., the Enron case) that made it the default decision, and hence in particular a choice that met the requirement.</li>
</ul>
<p>Recently, I ran that theory by David Ferris, whose firm <a href="http://www.ferris.com" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ferris.com');">Ferris Research</a> has long been a/the leading small analyst firm covering e-mail and related technologies.  He wasn&#8217;t buying.  David believes courts are getting <a href="http://www.ferris.com/2008/07/22/courts-will-tolerate-search-inaccuracies/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ferris.com');">more sophisticated in their understanding of search technology</a>.  Even more to the point, David cited several other buying motivations that would lead enterprises to want best-available rather than just-good-enough e-discovery search technology, such as:</p>
<ul>
<li>Enterprises want to know what information is available to be discovered against them.</li>
<li>Enterprises want to discover the information that will best aid their legal defense.</li>
<li>If they&#8217;re archiving the material for one purpose (e-discovery) anyway, enterprises want to get the most possible value out of it for other purposes while they&#8217;re at it.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

