<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Text Technologies &#187; Enterprise search</title>
	<atom:link href="http://www.texttechnologies.com/category/storage-search/enterprise-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Wed, 18 Jan 2012 17:02:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Data marts in the world of text</title>
		<link>http://www.texttechnologies.com/2009/09/20/data-marts-in-the-world-of-text/</link>
		<comments>http://www.texttechnologies.com/2009/09/20/data-marts-in-the-world-of-text/#comments</comments>
		<pubDate>Sun, 20 Sep 2009 09:08:53 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Ontologies]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Specialized search]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=334</guid>
		<description><![CDATA[CMS/search (Content Management System) expert Alan Pelz-Sharpe recently decried &#8220;Shadow IT&#8221;, by which he seems to mean departmental proliferation of data stores outside the control of the IT department. In other words, he&#8217;s talking about data marts, only for documents rather than tabular data. Notwithstanding the manifest virtues of centralization, there are numerous reasons you [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">CMS/search (Content Management System) expert Alan Pelz-Sharpe recently <a href="http://www.intelligententerprise.com/blog/archives/2009/08/shadow_it_and_e.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.intelligententerprise.com');">decried &#8220;Shadow IT&#8221;</a>, by which he seems to mean departmental proliferation of data stores outside the control of the IT department. In other words, he&#8217;s talking about data marts, only for documents rather than tabular data.</p>
<p style="margin-bottom: 0in;">Notwithstanding the manifest virtues of centralization, there are numerous reasons you might want data marts,  in the tabular and document worlds alike.  For example:</p>
<ul>
<li><strong>Price/performance.</strong> Your main/central data manager might be too expensive to support additional large specialized databases. Or different databases and applications might have sufficiently different profiles that they get the best price/performance from different kinds of data managers. This is particularly prevalent in the relational world, where column stores, sequentially-oriented row stores, and random I/O-oriented row stores each have compelling use cases.</li>
<li><strong>Different SLAs</strong> (Service-Level Agreements). Similarly, different applications may 	have very different requirements for uptime, response time, and the 	like.  (In the relational world, think of operational data stores.)</li>
<li><strong>Different security 	requirements.</strong> Different subsets of the data may need different 	levels of security. This is particularly prevalent in the document 	world, where security problems are not as well-solved as in the 	tabular arena, and where it&#8217;s common for a search engine to index 	across different corpuses with radically different levels of 	sensitivity.</li>
<li><strong>Integrated application and user 	interfaces.</strong> In the relational world, there&#8217;s a pretty clean 	separation between data management and interface logic; most serious 	business intelligence tools can talk to most DBMS. The document 	world is quite different. Some search engines bundle, for example, 	various kinds of faceted or parameterized search interfaces. What&#8217;s 	more, in public-facing search, a major differentiator is the 	facilities that the product offers for skewing search results.</li>
<li><strong>Different text applications 	require different thesauruses or taxonomy management systems</strong>. 	Ideally, those should all be integrated &#8212; but <a href="../2005/12/11/the-text-technologies-market-3-heres-whats-missing/">the 	requisite technology still doesn&#8217;t exist</a>.</li>
</ul>
<p style="margin-bottom: 0in;">Bottom line: <strong>Text data marts, much like relational data marts, are almost surely here to stay.</strong></p>
<p style="margin-bottom: 0in;"><em><strong>Related link</strong></em></p>
<ul>
<li>
<p style="margin-bottom: 0in;"><a href="http://www.dbms2.com/2009/06/08/the-future-of-data-marts/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">The 	future of data marts</a></p>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2009/09/20/data-marts-in-the-world-of-text/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Where &#8220;semantic&#8221; technology is or isn&#8217;t important</title>
		<link>http://www.texttechnologies.com/2008/12/29/where-semantic-technology-is-or-isnt-important/</link>
		<comments>http://www.texttechnologies.com/2008/12/29/where-semantic-technology-is-or-isnt-important/#comments</comments>
		<pubDate>Tue, 30 Dec 2008 00:59:55 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Ontologies]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Specialized search]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=301</guid>
		<description><![CDATA[At Lynda Moulton&#8217;s behest, I spoke a couple of times recently on the subject of where &#8220;semantic&#8221; technology is or isn&#8217;t likely to be important.  One was at the Gilbane conference in early December.  The slides were based on my previously posted deck for a June talk I gave on a text analytics market overview. [...]]]></description>
			<content:encoded><![CDATA[<p>At Lynda Moulton&#8217;s behest, I spoke a couple of times recently on the subject of where &#8220;semantic&#8221; technology is or isn&#8217;t likely to be important.  One was at the Gilbane conference in early December.  The slides were based on my previously posted deck for a June talk I gave on a <a href="http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/" >text analytics market overview</a>. The actual Gilbane slides may be found <a href="http://www.monash.com/uploads/Gilbane-December-2008.ppt" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');">here</a>.</p>
<p>My opinions about the applicability of semantic technology include:</p>
<ul>
<li>The big bucks in web search are for &#8220;transactional&#8221; web search, and semantics isn&#8217;t the issue there. <em>(Slides 3-4)</em></li>
<li>When UIs finally go beyond the simple search box &#8212; e.g. to clusters/facets or to voice &#8212; semantics should have a role to play. <em>(Slide 5)</em></li>
<li>Public-facing site search depends &#8212; more than any other area of text analytics &#8212; on hand-tagging. <em>(Slide 7)</em></li>
<li>&#8220;Enterprise&#8221; search that searches specialized external databases could benefit from semantic technologies. <em>(Slide 8)</em></li>
<li>True enterprise search could benefit from semantic technologies in multiple ways, but has other problems as well. <em>(Slides 10-11)</em></li>
<li>Semantics &#8212; specifically extraction &#8212; is central to custom publishing. <em>(Slide 12 &#8212; upon review I regret using the word &#8220;sophisticated&#8221;)</em></li>
<li>Semantics is central to text mining. <em>(Slide 18)</em></li>
<li>Semantics could play a big role in all sorts of exciting future developments. <em>(Slide 19)</em></li>
</ul>
<p>So what would your list be like?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/12/29/where-semantic-technology-is-or-isnt-important/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Lynda Moulton prefers enterprise search products that get up and running quickly</title>
		<link>http://www.texttechnologies.com/2008/10/11/lynda-moulton-on-enterprise-search-2/</link>
		<comments>http://www.texttechnologies.com/2008/10/11/lynda-moulton-on-enterprise-search-2/#comments</comments>
		<pubDate>Sun, 12 Oct 2008 02:46:07 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Coveo]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[FAST]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=287</guid>
		<description><![CDATA[Lynda Moulton, to put it mildly, disagrees with the Gartner Magic Quadrant analysis of enterprise search. Her preferred approach is captured in: Coveo, Exalead, ISYS, Recommind, Vivisimo, and X1 are a few of a select group that are marking a mark in their respective niches, as products ready for action with a short implementation cycle [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://gilbane.com/search_blog/2008/10/what_determines_a_leader_in_th.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/gilbane.com');">Lynda Moulton</a>, to put it mildly, disagrees with the Gartner Magic Quadrant analysis of enterprise search.  Her preferred approach is captured in:</p>
<blockquote><p>Coveo, Exalead, ISYS, Recommind, Vivisimo, and X1 are a few of a select group that are marking a mark in their respective niches, as products ready for action with a short implementation cycle (weeks or months not years).</p></blockquote>
<p>By way of contrast, Lynda opines:</p>
<blockquote><p>Autonomy and Endeca continue to bring value to very large projects in large companies but are not plug-and-play solutions, by any means. Oracle, IBM, and Microsoft offer search solutions of a very different type with a heavy vendor or third-party service requirement. Google Search Appliance has a much larger installed base than any of these but needs serious tuning and customization to make it suitable to enterprise needs.</p></blockquote>
<p>In particular, her views about FAST (now Microsoft) are scathing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/10/11/lynda-moulton-on-enterprise-search-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Attivio update</title>
		<link>http://www.texttechnologies.com/2008/09/20/attivio-update/</link>
		<comments>http://www.texttechnologies.com/2008/09/20/attivio-update/#comments</comments>
		<pubDate>Sat, 20 Sep 2008 05:00:06 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attivio]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=283</guid>
		<description><![CDATA[I talked w/ Andrew McKay of Attivio for 2 ½ hours Thursday. I&#8217;ve also been working with some Attivio engineers on a blog search engine. I think it&#8217;s time to post about Attivio. In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + XML store + cool [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked with Andrew McKay of Attivio for 2½ hours on Thursday.  I&#8217;ve also been working with some Attivio engineers on a blog search engine.  I think it&#8217;s time to post about Attivio. <img src='http://www.texttechnologies.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  <span id="more-283"></span></p>
<p style="margin-bottom: 0in;">In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + XML store + cool extra features.  And all with seamless, lightweight, integrated installation and administration.  That&#8217;s the goal, anyway.  At this point, naturally, each individual piece is far from complete. For example:</p>
<ul>
<li>Sufficient SQL support to handle 	most BI tools is still a matter for future releases &#8212; apparently in 	2009, although Attivio is one of those agile companies for which 	pinning down product releases is somewhat difficult.</li>
<li>The same goes for some basic GUI features (such as most non-programmatic search tuning).</li>
<li>ACID compliance is not a high 	priority for Attivio. I actually think it should be higher, just 	because it&#8217;s increasingly become an “OK, we don&#8217;t have to worry 	about THAT” checkmark item.</li>
</ul>
<p style="margin-bottom: 0in;">Even in its early days, Attivio has had some nice-sounding customer successes.  There are 8 paying Attivio customers, including 2 deals of more than $1 million each, one deal of roughly half a million dollars, and 1 large OEM.  Three of the 8 represent actual deployments, with the rest in development.  More sales are on the way, as are permissions to disclose customer names that people will actually recognize.  Customer application stories Andrew told me about include:</p>
<ul>
<li>A web-business parameterized, 	adjustable-weight search that&#8217;s starting with tabular data and only 	getting to free-text later.</li>
<li>An enterprise that&#8217;s using Attivio 	for content management, enterprise search, public-facing search, <em>and</em> data warehousing.</li>
<li>Something 	big/mysterious/classified, with large document volumes.</li>
<li>Something to do with compliance, 	about which Andrew was going to forward a lot more detail that 	evening (Hint, hint).</li>
</ul>
<p style="margin-bottom: 0in;">Since the major RDBMS (Oracle, Microsoft SQL Server, DB2) all have text search and XML subsystems, they can in principle do everything Attivio does on the back end, and with a lot more features and maturity.  The same would go for Marklogic.   Performance and overhead might be different matters, however; Andrew certainly believes so.</p>
<p style="margin-bottom: 0in;">Except that Lucene is included on the search side, I haven&#8217;t actually figured out how Attivio stores data.  The fact that SQL features are being added incrementally suggests Attivio is rolling its own relational database capability, but how it&#8217;s organized I don&#8217;t really know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/20/attivio-update/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>One overview of e-discovery</title>
		<link>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/</link>
		<comments>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/#comments</comments>
		<pubDate>Sat, 13 Sep 2008 09:17:21 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[E-discovery]]></category>
		<category><![CDATA[Enterprise search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=281</guid>
		<description><![CDATA[I just found a year-old (almost) blog post from EMC executive Andrew Cohen that succinctly lays out his view (which he believes to mainly be a consensus stance) on e-discovery. Cohen is evidently both a lawyer and a honcho in document management system vendor EMC&#8217;s Compliance Division, which is probably relevant to interpreting his outlook, [...]]]></description>
			<content:encoded><![CDATA[<p>I just found a year-old (almost) <a href="http://andrewsblog.typepad.com/andrew/2007/11/bringing-edisco.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/andrewsblog.typepad.com');">blog post</a> from EMC executive Andrew Cohen that succinctly lays out his view (which he believes to mainly be a consensus stance) on e-discovery.  Cohen is evidently both a lawyer and a honcho in document management system vendor EMC&#8217;s Compliance Division, which is probably relevant to interpreting his outlook, in the spirit of the old Kennedy School dictum that &#8220;Where you stand depends upon where you sit.&#8221;</p>
<p>Highlights included:</p>
<ul>
<li>Information management is central to e-discovery.</li>
<li>In particular, auditability (my word) is central, if you want electronic documents to hold up as evidence in court.</li>
<li>Search is good enough, but it&#8217;s not the biggest issue in e-discovery.</li>
<li>E-mail archiving has reached the tipping point, and is increasingly a must-have, largely for its e-discovery benefits.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/13/emc-ediscovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How good does e-discovery search need to be?</title>
		<link>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/</link>
		<comments>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/#comments</comments>
		<pubDate>Mon, 01 Sep 2008 04:44:58 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Autonomy]]></category>
		<category><![CDATA[E-discovery]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=277</guid>
		<description><![CDATA[Two years ago, CEO Mike Lynch of Autonomy tried to persuade me that Autonomy was and would remain dominant in the e-discovery search market because: The essence of the buying decision was that enterprises wanted to fulfill obligations to make their information available in a way that would satisfy the courts. Autonomy had some [...]]]></description>
			<content:encoded><![CDATA[<p>Two years ago, CEO Mike Lynch of Autonomy tried to persuade me that Autonomy was and would remain dominant in the e-discovery search market because:<span id="more-277"></span></p>
<ul>
<li>The essence of the buying decision was that enterprises wanted to fulfill obligations to make their information available in a way that would satisfy the courts.</li>
<li>Autonomy had some high-profile traction (e.g., the Enron case) that made it the default decision, and hence in particular a choice that met the requirement.</li>
</ul>
<p>Recently, I ran that theory by David Ferris, whose firm <a href="http://www.ferris.com" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ferris.com');">Ferris Research</a> has long been a/the leading small analyst firm covering e-mail and related technologies.  He wasn&#8217;t buying.  David believes courts are getting <a href="http://www.ferris.com/2008/07/22/courts-will-tolerate-search-inaccuracies/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.ferris.com');">more sophisticated in their understanding of search technology</a>.  Even more to the point, David cited several other buying motivations that would lead enterprises to want best-available rather than just-good-enough e-discovery search technology, such as:</p>
<ul>
<li>Enterprises want to know what information is available to be discovered against them.</li>
<li>Enterprises want to discover the information that will best aid their legal defense.</li>
<li>If they&#8217;re archiving the material for one purpose (e-discovery) anyway, enterprises want to get the most possible value out of it for other purposes while they&#8217;re at it.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/01/how-good-does-e-discovery-search-need-to-be/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Attivio angle on the FAST story</title>
		<link>http://www.texttechnologies.com/2008/07/08/the-attivio-angle-on-the-fast-story/</link>
		<comments>http://www.texttechnologies.com/2008/07/08/the-attivio-angle-on-the-fast-story/#comments</comments>
		<pubDate>Tue, 08 Jul 2008 19:16:50 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Attivio]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[FAST]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=259</guid>
		<description><![CDATA[Attivio CEO Ali Riaz was previously CFO and COO of FAST. He tried to avoid involvement in the recent expos&#233; of his former employer. For his troubles he got a parking lot ambush, a big photograph, and some unflattering coverage. Adriaan Bloem and Stephen Arnold have been hotly debating Ali&#8217;s culpability. There are two general [...]]]></description>
			<content:encoded><![CDATA[<p>Attivio CEO Ali Riaz was previously CFO and COO of FAST.  He tried to avoid involvement in the recent <a href="http://www.texttechnologies.com/2008/07/08/recent-reporting-on-the-shenanigans-at-fast/" >expos&#233;</a> of his former employer.  For his troubles he got a parking lot ambush, a big photograph, and some unflattering coverage.  <span id="more-259"></span> <a href="http://www.cmswatch.com/Trends/1294-How-Fast-is-Attivio" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.cmswatch.com');">Adriaan Bloem</a> and <a href="http://arnoldit.com/wordpress/2008/07/06/not-so-fast-folks/" onclick="javascript:pageTracker._trackPageview('/outbound/article/arnoldit.com');">Stephen Arnold</a> have been hotly debating Ali&#8217;s culpability.</p>
<p>There are two general issues here, based on the fact that Ali and a couple of other key Attivio executives come from FAST.  First, they were at a corrupt company &#8212; but resigned before the worst (and perhaps all) of the corruption happened.  Second, they were at a company that did very well in some respects, but very badly in others, so it&#8217;s a mixed-quality resume item.</p>
<p>So far, no biggie. Lots of executives exude overoptimism about their companies&#8217; products and business prospects. And I haven&#8217;t identified anything which suggests to me, as a former stock analyst, that the controls Ali put in place as CFO/COO were inadequate.  (If he&#8217;d been long-time CEO, it would have been a different matter, as he would have been more responsible for the general ethical culture of the company &#8212; but he wasn&#8217;t.)</p>
<p>So the main serious charge is that FAST funneled a lot of sales through small reseller companies owned by its executives, including Ali.  Such arrangements could be used either for misappropriation of funds, or to inflate revenue.  In the article, Ali denies involvement in any reseller until after he left FAST&#8217;s employment, but the reporter purports to have discovered proof to the contrary.  I couldn&#8217;t quite get Ali to reiterate his denial to me &#8212; or, indeed, to talk with me directly about the matter &#8212; but did get an emailed statement which reads:</p>
<blockquote><p>Mr. Riaz categorically denies any wrongdoing during his tenure at FAST or in any  relationship with FAST thereafter. He has not been an employee of FAST for  almost two years now, and therefore must defer all further comments to  Microsoft’s official 2006 and 2007 statements on the matter.</p></blockquote>
<p>I&#8217;ve advised my clients at Attivio that they should be clearer and more specific, but so far I&#8217;m not carrying the day.  So for now, we&#8217;ll go with that.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/07/08/the-attivio-angle-on-the-fast-story/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Recent reporting on the shenanigans at FAST</title>
		<link>http://www.texttechnologies.com/2008/07/08/recent-reporting-on-the-shenanigans-at-fast/</link>
		<comments>http://www.texttechnologies.com/2008/07/08/recent-reporting-on-the-shenanigans-at-fast/#comments</comments>
		<pubDate>Tue, 08 Jul 2008 19:16:18 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[FAST]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=258</guid>
		<description><![CDATA[A Norwegian newspaper did an expos&#233; on FAST, dated June 28. Helpful search industry participants quickly distributed English translations to a variety of commentators, including me. TechCrunch posted a scan of part of the article. The gist is that FAST followed a pattern very common in the packaged enterprise software industry: It had trouble meeting [...]]]></description>
			<content:encoded><![CDATA[<p>A Norwegian newspaper did an expos&#233; on FAST, dated June 28.  Helpful search industry participants quickly distributed English translations to a variety of commentators, including me.   <a href="http://www.techcrunch.com/2008/07/03/did-the-enron-of-norway-pull-a-fast-one-on-microsoft-more-details-about-the-mess-at-fast-search-transfer/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.techcrunch.com');">TechCrunch</a> posted a scan of part of the article.</p>
<p>The gist is that FAST followed a pattern very common in the packaged enterprise software industry:<span id="more-258"></span></p>
<ul>
<li>It had trouble meeting its growth targets.</li>
<li>It inflated reported revenue (in the high-margin software industry, inflating license revenue has a huge impact on profits).</li>
<li>One technique whereby it inflated revenue was to count deals that actually closed after quarter end.</li>
<li>Another technique was to count deals as closed in which the customer hadn&#8217;t actually fully committed to buy.</li>
</ul>
<p>There&#8217;s nothing new here.  Back in the 1980s, we used to joke that <a href="http://www.softwarememories.com/2006/02/13/msa-memories-the-basics/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.softwarememories.com');">MSA</a> made 10% of its annual revenue and 100% of its profits between the 32nd and 40th of December.</p>
<p>Often, such problems are associated with difficulties getting product installations to succeed.  Stephen Arnold suggests <a href="http://arnoldit.com/wordpress/2008/07/04/fast-cash-faster-crash/" onclick="javascript:pageTracker._trackPageview('/outbound/article/arnoldit.com');">that&#8217;s exactly what happened in the case of FAST</a>:</p>
<blockquote><p>So, Fast Search’s problems began as soon as the company decided to push into the enterprise search market. The adjustments were, as noted in the documents I cited in my previous Fast Search analyses and in the TechCrunch article, small at the outset. Who knew that a customer would not pay his license fee installment? Then more customers groused about slow installs and the up front payments were not followed by any other payments. One Fast Search licensee told me that his Global 1000 company would not pay until Fast Search produced an engineer who could complete the installation per the task order. Well, Fast Search got an engineer to the client, but it was six months after I heard the complaint. Not surprisingly, this big outfit turned to a smaller vendor who got a different system up and running in three weeks.</p></blockquote>
<p><em><strong>Related links</strong></em></p>
<p><a href="http://www.texttechnologies.com/2008/07/08/the-attivio-angle-on-the-fast-story/" >The Attivio angle on this story</a><br />
Edit:  <a href="http://www.scribd.com/doc/3809691/Fasts-Stock-Market-Bluff" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.scribd.com');">The actual article</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/07/08/recent-reporting-on-the-shenanigans-at-fast/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>6 trends that could shake up the text analytics market</title>
		<link>http://www.texttechnologies.com/2008/06/19/6-trends-that-could-shake-up-the-text-analytics-market/</link>
		<comments>http://www.texttechnologies.com/2008/06/19/6-trends-that-could-shake-up-the-text-analytics-market/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 08:33:31 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[BI integration]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Social software and online media]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Caché]]></category>
		<category><![CDATA[InterSystems]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=251</guid>
		<description><![CDATA[My last two posts were based on the introductory slide to my talk The Text Analytics Marketplace: Competitive landscape and trends. I&#8217;ll now jump straight ahead to the talk&#8217;s conclusion. Text analytics vendors participate in the same trends as other software and technology vendors. For example, relational business intelligence and data warehousing products are increasingly [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in; font-style: normal;"><span>My <a href="http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/" >last</a> <a href="http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/" >two</a> posts were based on the introductory slide to my talk </span><em><span>The Text Analytics Marketplace: Competitive landscape and trends. </span></em><span style="font-style: normal;"><span>I&#8217;ll now jump straight ahead to the talk&#8217;s conclusion.</span></span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span style="font-style: normal;"><span>Text analytics vendors participate in the same trends as other software and technology vendors.  For example, relational business intelligence and data warehousing products are increasingly being sold to departmental buyers.  Those buyers place particularly high value on ease of installation.  And golly gee whiz, both parts of that are also true in text mining. </span></span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span style="font-style: normal;"><span>But beyond such general trends, I&#8217;ve identified six developments that I think could radically transform the text analytics market landscape.  Indeed, they could invalidate the neat little eight-bucket categorization I laid out in the prior post.  Each is highly likely to occur, although in some cases the timing remains greatly in doubt.</span></span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span style="font-style: normal;"><span>These six market-transforming trends are:</span></span></p>
<ol>
<li> Web/enterprise/messaging 	integration</li>
<li> BI 	integration</li>
<li> Universal 	message retention</li>
<li> Portable 	personal profiles</li>
<li> Electronic 	health records</li>
<li> Voice 	command &amp; control</li>
</ol>
<p style="margin-bottom: 0in; font-style: normal;"><span id="more-251"></span><span>I&#8217;ll explain briefly.</span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>1.  Google and Microsoft are two of the three leaders in web search.  Now that Microsoft has bought FAST, they are also two of the leaders in enterprise search.  They are also two of the leaders in hosted email. Ditto instant messaging.  So </span><strong>there&#8217;s a good chance these various disciplines will converge.</strong></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>2.  There are a number of ways text analytics and traditional analytics can be, and are being, integrated:</span></p>
<ul>
<li><span>Enterprise search and business intelligence are akin; both involve digging information out of the data you already have.</span></li>
<li><span>Text mining is naturally integrated with business intelligence and/or data mining.</span></li>
<li><span>There&#8217;s a trend toward using text search to dig up business intelligence documents such as specific reports, spreadsheets, etc.</span></li>
</ul>
<p style="margin-bottom: 0in; font-style: normal;"><span>To date, that last trend is focused on reports that already exist, rather than queries that could be run on the fly, but I hope and trust the technology will be extended over time.  Natural language queries have merit anyway; </span><strong>I&#8217;d like to see the search box extended into a true data-retrieval command line.</strong></p>
<p style="margin-bottom: 0in; font-style: normal;">3.  One of the big purchase drivers of storage, search, and clustering technology is the mandate to preserve information and make it available to auditors, regulators, and/or people who want to sue you.  Email in particular is changing from being ephemeral to becoming part of the permanent record.  Well, if the information is being retained anyway, then maybe it&#8217;s time to see how to get useful insight from it.</p>
<p style="margin-bottom: 0in; font-style: normal;"><strong>Right now, a company&#8217;s overall text archives aren&#8217;t being leveraged in the same way data warehouses are.  That will change.</strong></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>4.  For over a decade, online companies have fought to exploit the fact that users were registered with their sites or services, but not with others.  Huge amounts of investment money were wasted in the dot-com bubble because people thought “registered users” was a significant metric, or that ISP subscribers could be directed to proprietary content.  Enormous valuations are being assigned to Facebook and LinkedIn on similar theories today.</span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>But as site owners and other marketers get ever more aggressive about exploiting user-specific information, users will get ever more sophisticated about controlling it. </span><strong>The obvious solution is for each internet user to control a sophisticated database of their contact information, presence information, actions, preferences, and writings, and to be very selective about which online services are allowed to see which portions of the data. </strong><span>I think that will come about some day, but I don&#8217;t know when.  When it does, text analytics will be affected in a variety of interesting ways.</span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>5.  Electronic health records are almost unique in IT.  What other enterprise app can you think of for which relational DBMSs aren&#8217;t the default underpinning?  (InterSystems&#8217; object-oriented DBMS Caché has a huge share in the clinical records market.)   Normal tabular data, text, images, sensor output streams – health records have it all.  What&#8217;s more, health records are entering some very interesting times in data sharing, at least in the US.</span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>Just as retailing went from being an IT backwater (through the mid-1980s), to a sophisticated user of database technology (1990s), to the leader of the internet revolution (rise of e-commerce), </span><strong>I think health care is due to take a leadership role in IT advances</strong><span>.   And when it does, search, text mining, and voice recognition will all play important roles.</span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>6.  Most people reading this far have probably watched Star Trek.</span><strong> Well, what is keeping us from being able to command computers in a Star Trek fashion?  Not really that much. </strong><span> Sure, there are some big missing pieces.  We need a mapping from commands to the specific applications that would carry them out.  We also need a more structured kind of analytic middle tier so that there&#8217;s something to map questions to.  But those are solvable problems.  And by the way – when everybody wears headphones, voice commands emanating from the next cubicle are no longer the big annoyance they would be today.  Mobile/small devices only add to the business case for voice recognition advances.</span></p>
<p style="margin-bottom: 0in; font-style: normal;"><span>When voice becomes a primary mode of human/device communication, “text” analytics will be affected in any number of ways.</span></p>
<p style="margin-bottom: 0in;"><em><strong>Related links:</strong></em></p>
<ul>
<li><a href="http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/" >The introductory post in this series</a></li>
<li><a href="http://www.texttechnologies.com/2008/02/03/microsoft-yahoo-synergies/" >19 possible Microsoft/Yahoo synergies</a>, many of them related to text technology convergence, e.g. between web search and enterprise search</li>
<li>The compelling case for <a href="http://www.monashreport.com/2008/01/04/early-thoughts-on-outsourcing-to-google-mail/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monashreport.com');">letting Google handle your enterprise email</a></li>
<li>An old post on <a href="http://www.texttechnologies.com/2006/09/01/why-the-bi-vendors-are-integrating-with-google-onebox/" >why BI vendors flocked to integrate with Google OneBox</a></li>
<li>A proposal to <a href="http://www.texttechnologies.com/2007/02/06/what-is-linkedin-needed-for-absolutely-nothing-and-the-same-goes-for-myspace/" >refactor social networks</a></li>
<li>An old post in which I outlined some of the criteria for <a href="http://www.dbms2.com/2005/11/17/native-xml-storage-part-2-apps/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Profiles 2.0</a></li>
<li><a href="http://www.networkworld.com/community/node/29109" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.networkworld.com');">Why text technologies are going to recombine</a> (in <em>A World of Bytes</em>)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/19/6-trends-that-could-shake-up-the-text-analytics-market/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Text Analytics Marketplace: Competitive landscape and trends</title>
		<link>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/</link>
		<comments>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 07:35:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Audio and video search]]></category>
		<category><![CDATA[BI integration]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Progress and EasyAsk]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Social software and online media]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Speech recognition]]></category>
		<category><![CDATA[Structured search]]></category>
		<category><![CDATA[Text Analytics Summit]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=249</guid>
		<description><![CDATA[As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”: 1. Web search 2. Public-facing site search 3. Enterprise search and knowledge management 4. Custom publishing 5. Text mining and extraction Three are more standalone: 6. Spam [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">1.  Web search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">2.  Public-facing site search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">3.  Enterprise search and knowledge management</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">4.  Custom publishing</p>
<p style="padding-left: 30px;">5.  Text mining and extraction</p>
<p style="margin-bottom: 0in; font-style: normal;">Three are more standalone:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">6.  Spam filtering</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">7.  Voice recognition</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">8.  Machine translation</p>
<p><span id="more-249"></span></p>
<p style="margin-bottom: 0in;">This list comes from a talk I gave Monday at the Text Analytics Summit called <em>The Text Analytics Marketplace: Competitive landscape and trends. </em>In half an hour, I covered the first five areas (in Sue Feldman&#8217;s word, at a “gallop”). The slide deck has been uploaded to the link below.  <span style="font-style: normal;"><span>I plan to break out the material from the talk into a series of blog posts over the next few (or perhaps not-so-few) weeks. </span></span></p>
<p style="margin-bottom: 0in;"><em><strong>Slides:</strong></em></p>
<ul>
<li><a href="http://www.monash.com/Text-analytics-markets-June-2008.ppt" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');"><span>The Text Analytics Marketplace: Competitive landscape and trends</span></a></li>
</ul>
<p style="margin-bottom: 0in;"><strong><em>Other posts based on those slides:</em></strong></p>
<ul>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/" >Three specialized markets for text analytics</a> (based on Slide 2)</span></li>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/6-trends-that-could-shake-up-the-text-analytics-market/" >6 trends that could shake up the text analytics market</a> (based on Slide 19)</span></li>
<li><span><a href="(in A World of Bytes)">Why search technologies are going to recombine</a> (in <em>A World of Bytes</em>, based on Slide 19)<br />
</span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

