<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Text Technologies &#187; ClearForest/Reuters</title>
	<atom:link href="http://www.texttechnologies.com/category/vendors/clearforest/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Wed, 18 Jan 2012 17:02:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>More website weirdness</title>
		<link>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/</link>
		<comments>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/#comments</comments>
		<pubDate>Thu, 20 Nov 2008 03:27:27 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[Search engines]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=298</guid>
		<description><![CDATA[Here&#8217;s something longer-lasting and weirder than Vertica&#8217;s &#8220;We sell turkeys&#8221; theme: Mark Logic, whose product is used primarily to help enterprises make their content more accessible, doesn&#8217;t have a search engine on its own website.* *Or if it does, it&#8217;s VERY well-hidden. I looked at the home page and site map alike. I wanted to [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s something longer-lasting and weirder than <a href="http://www.dbms2.com/2008/11/18/silly-website-tricks/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Vertica&#8217;s &#8220;We sell turkeys&#8221; theme</a>: Mark Logic, whose product is used primarily to help enterprises make their content more accessible, doesn&#8217;t have a search engine on its own website.*<span id="more-298"></span></p>
<p><em>*Or if it does, it&#8217;s VERY well-hidden. I looked at the home page and site map alike.</em></p>
<p>I wanted to refresh my memory as to Mark Logic&#8217;s history of working with specific text mining vendors, beyond what&#8217;s on the official <a href="http://www.marklogic.com/partners/open-enrichment-framework.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.marklogic.com');">partner page</a>. No luck.  Normally when site search is inadequate, one goes to Google.   But that&#8217;s problematic too.  Marklogic.com pages come up pretty low on Google&#8217;s search results, suggesting that:</p>
<ol>
<li>Mark Logic doesn&#8217;t put a lot of effort into SEO (or else doesn&#8217;t do it very well).</li>
<li>One can&#8217;t be confident all the site&#8217;s significant pages are findable by Google.</li>
</ol>
<p>Looking to other companies&#8217; sites for clues isn&#8217;t conclusive either.  E.g., <a href="http://clearforest.com/Partners/PartnerDetails.asp?id=11" onclick="javascript:pageTracker._trackPageview('/outbound/article/clearforest.com');">Clearforest lists Mark Logic as a partner</a>, but Mark Logic doesn&#8217;t return the compliment.  (If memory serves, Mark Logic and Clearforest have worked together both on national security deals and custom publishing deals &#8212; but don&#8217;t hold me to that.)</p>
<p>When it comes to making its own information conveniently available, Mark Logic is quite the unshod cobbler.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/11/19/more-website-weirdness/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Low-latency text mining in the investment market</title>
		<link>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/</link>
		<comments>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/#comments</comments>
		<pubDate>Fri, 19 Sep 2008 09:15:58 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Sentiment analysis]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=282</guid>
		<description><![CDATA[I&#8217;m not at Gartner&#8217;s Event Processing conference, but there seem to be some interesting posts and articles coming out of it. Seth Grimes has one on Reuters&#8217; integration of text mining and event processing, including sentiment analysis. Well worth reading. Lots more detail than I&#8217;ve ever posted on similar applications.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m not at Gartner&#8217;s Event Processing conference, but there seem to be some interesting posts and articles coming out of it.  Seth Grimes has one on <a href="http://www.intelligententerprise.com/blog/archives/2008/09/event_processin_1.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.intelligententerprise.com');">Reuters&#8217; integration of text mining and event processing</a>, including sentiment analysis.  Well worth reading.  Lots more detail than I&#8217;ve ever posted on <a href="http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/" >similar</a> <a href="http://www.texttechnologies.com/2007/08/03/more-on-text-processing-in-cep/" >applications</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/19/low-latency-text-mining-in-the-investment-market/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Text mining applications as per Attensity and Clarabridge</title>
		<link>http://www.texttechnologies.com/2007/10/05/text-mining-applications-as-per-attensity-and-clarabridge/</link>
		<comments>http://www.texttechnologies.com/2007/10/05/text-mining-applications-as-per-attensity-and-clarabridge/#comments</comments>
		<pubDate>Sat, 06 Oct 2007 03:37:59 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attensity]]></category>
		<category><![CDATA[Clarabridge]]></category>
		<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Competitive intelligence]]></category>
		<category><![CDATA[Factiva/Dow Jones]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Voice of the Customer]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/10/05/text-mining-applications-as-per-attensity-and-clarabridge/</guid>
		<description><![CDATA[Besides asking them technical questions, I surveyed Attensity and Clarabridge last week about text mining application trends, getting generously detailed answers from Michelle De Haaff of Attensity and Justin Langseth of Clarabridge. Perhaps the most important point to emerge was that it&#8217;s not just about particular apps. Enterprises are doing text mining POCs (Proofs of [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in">Besides asking them technical questions, I surveyed Attensity and Clarabridge last week about text mining application trends, getting generously detailed answers from Michelle De Haaff of Attensity and Justin Langseth of Clarabridge.  Perhaps the most important point to emerge was that it&#8217;s not just about particular apps.  Enterprises are doing text mining POCs (Proofs of Concept) around specific apps, commonly in the CRM area, but immediately structuring the buying process in anticipation of a rollout across multiple departments in the enterprise.</p>
<p style="margin-bottom: 0in">Other highlights of what they said included:<span id="more-131"></span></p>
<ul>
<li><strong>Voice of the Customer</strong> remains hot, hot, hot.</li>
<li>Closely allied with <strong>Voice of the Customer,</strong> and also hot, is <a href="http://www.texttechnologies.com/2007/10/05/nice-new-phrase-voice-of-the-market/" ><strong>Voice of the Market</strong></a> and/or more direct <strong>competitive intelligence</strong>.</li>
<li><strong>Classical warranty analysis</strong> is quiet but not wholly dead.  Attensity, historically strong in that application, sees it as merging into Voice of the Customer.  Clarabridge, previously not so strong there (if I recall correctly), is getting at least a little of the traditional-style warranty business.</li>
<li><strong>Human resources</strong> (especially <strong>Voice of the Employee</strong> – I detect a trend in application-naming here) gets mentioned a fair amount.  It&#8217;s usually not the first text mining application an enterprise deploys, but it&#8217;s a common follow-on.</li>
<li><strong>Antifraud</strong> isn&#8217;t just for insurance companies.  Retailing and money-laundering also got mentioned as areas where text mining helped combat fraud.</li>
<li><strong>Insurance industry</strong> use of text mining for claims analysis, I gather, goes well beyond just fraud detection.</li>
<li><strong>Intelligence </strong>is obviously a huge market for Attensity (not so much for Clarabridge), but I didn&#8217;t focus on the classified stuff.  That said, I was reminded of Attensity&#8217;s awkward phrase<em> link analysis,</em> which has nothing to do with hypertext, but instead is the detection of relationships between entities.  This lies at the heart of a non-empty set of civilian <strong>law enforcement</strong> applications and the like.</li>
<li><strong>Investment research</strong> applications of text mining still seem nascent and experimental, at least if one talks with Clarabridge and Attensity.  That said, Factiva is a large subsidiary of Dow Jones now, and ClearForest a smaller one of Reuters, and they&#8217;re doing something or other.  Apparently, it&#8217;s much more document tagging for the sake of readers or search-style filters than it is for use in any kind of business intelligence/statistical mining kind of application.</li>
</ul>
<p>All this isn&#8217;t too different from <a href="http://www.texttechnologies.com/2007/07/22/text-analytics-marketplace-trends/" >what I posted back in July</a>, but I think text mining application trends is a subject that bears frequent revisiting.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/10/05/text-mining-applications-as-per-attensity-and-clarabridge/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Text analytics marketplace trends</title>
		<link>http://www.texttechnologies.com/2007/07/22/text-analytics-marketplace-trends/</link>
		<comments>http://www.texttechnologies.com/2007/07/22/text-analytics-marketplace-trends/#comments</comments>
		<pubDate>Sun, 22 Jul 2007 09:44:49 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Factiva/Dow Jones]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Text Analytics Summit]]></category>
		<category><![CDATA[Text mining]]></category>
		<category><![CDATA[Voice of the Customer]]></category>
		<category><![CDATA[nStein]]></category>
		<category><![CDATA[Assentor]]></category>
		<category><![CDATA[compliance]]></category>
		<category><![CDATA[Dow Jones]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Factiva]]></category>
		<category><![CDATA[fraud detection]]></category>
		<category><![CDATA[reputation management]]></category>
		<category><![CDATA[Reuters]]></category>
		<category><![CDATA[spam]]></category>
		<category><![CDATA[StreamBase]]></category>
		<category><![CDATA[Text analytics]]></category>
		<category><![CDATA[warranty analysis]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/07/22/text-analytics-marketplace-trends/</guid>
		<description><![CDATA[It was tough to judge user demand at the recent Text Analytics Summit because, well, very few users showed up. And frankly, I wasn&#8217;t as aggressive at pumping vendors for trends as I am some other times. That said, I have talked with most text analytics vendors recently,* and here are my impressions of what&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in">It was tough to judge user demand at the recent Text Analytics Summit because, well, very few users showed up.  And frankly, I wasn&#8217;t as aggressive at pumping vendors for trends as I am some other times.  That said, I <em>have </em>talked with most text analytics vendors recently,* and here are my impressions of what&#8217;s going on.  Any contrary – or confirming! &#8212; opinions would be most welcome.</p>
<p style="margin-bottom: 0in"><em><span>*Factiva is the most significant exception.   Hint, hint.</span></em></p>
<p style="margin-bottom: 0in">If you think about it, text analytics is a <span style="font-style: normal"><strong>“secret ingredient” in search, antispam, and data cleaning,</strong></span>* and this dominates all other uses of the technology.  A significant minority of the research effort at companies that do any kind of text filtering is – duh &#8212; text analytics.  Cold comfort for specialist text analytics vendors, to be sure, but that&#8217;s the way it is.</p>
<p style="margin-bottom: 0in"><em>*I.e., part of the “T” in “ETL” (Extract/Transform/Load).</em></p>
<p style="margin-bottom: 0in">Text-analytics-enhanced <span style="font-style: normal"><strong>custom publishing</strong></span> will surely at some point become a  must-have for business and technical publishers.  However, it appears that we&#8217;re not quite there yet, as large publishers make do with simple-minded search and the like.  In what I suspect is a telling market commentary, there&#8217;s no headlong rush among vendors to dump text mining for custom publishing, notwithstanding the examples of nStein and (sort of) ClearForest.  I don&#8217;t want to be overly negative – either my friends at Mark Logic are doing just fine or else they&#8217;re putting up a mighty brave front – but I don&#8217;t think the nonspecialist publishing market is there yet.<span id="more-119"></span></p>
<p style="margin-bottom: 0in">Two business publishers who have made major investments in owning text analytics technology are Dow Jones (now sole owners of Factiva) and Reuters (recent purchaser of ClearForest).  Beyond that, however, I don&#8217;t yet see a lot of activity in the <strong>investor/trading</strong> market, although ClearForest reported some activity last year and StreamBase reports that one customer is using them for text filtering, presumably alongside the ticker-munching traders usually use StreamBase for.</p>
<p style="margin-bottom: 0in">Obviously, the <span style="font-style: normal"><strong>intelligence</strong></span> market is what fueled the start of the text analytics business, and still provides the majority of revenue at multiple companies.  Certainly it&#8217;s still going strong.  But it&#8217;s tough to gauge the growth potential from here, especially since the details of usage are typically classified.</p>
<p style="margin-bottom: 0in">Similar things could be said about <span style="font-style: normal"><strong>pharmaceutical research.</strong></span><em> </em> Text analytics is totally accepted in that market, but what&#8217;s the growth potential from here?  And “here” isn&#8217;t actually very big (much smaller than intelligence).  The related category of <span style="font-style: normal"><strong>patient records analysis</strong></span> looks very promising, but is basically still at the research-project stage.  (In general, an explosion in biological IT can be expected when research methods are adapted for clinical use.)</p>
<p style="margin-bottom: 0in">The <span style="font-style: normal"><strong>warranty analysis</strong></span> market, so promising early on, is not showing a lot of growth and depth.   The same thing has happened many times before with innovative technologies sold to manufacturing companies&#8217; engineers.  It seems to be happening again now.</p>
<p style="margin-bottom: 0in"><span style="font-style: normal"><strong>Voice of the customer*</strong></span> is pretty much the same thing, but for service industries.   And the text analytics market for VotC is evidently stronger right now than that for warranty analysis.  This makes sense, because the obvious alternative to text analytics – multiple-choice coded forms – is less appealing, due to two application differences:</p>
<ul>
<li>
<p style="margin-bottom: 0in">VotC looks for opinion as well as fact.</p>
</li>
<li>
<p style="margin-bottom: 0in">VotC looks for input from people under no obligation to share it, and who hence can&#8217;t be compelled to play along with a structured form – let alone trained to fill it in accurately.</p>
</li>
</ul>
<p style="margin-bottom: 0in"><em>*Definitional note: </em><span style="font-style: normal">Voice of the customer </span><em>is when customers or prospects communicate with you directly, e.g. via a survey form or an angry email. </em><span style="font-style: normal">Reputation management </span><em>is when you web-scrape and find out what they&#8217;re saying to everybody else.  At least, I think marketers are still using the terms that way pretty consistently.</em></p>
<p style="margin-bottom: 0in"><span style="font-style: normal"><strong>Reputation management</strong></span><em> </em><span style="font-style: normal">is surely</span><em> </em>becoming a standard application for the biggest consumer brands.  How deep that market turns out to be, however, remains to be seen.</p>
<p>Text analytics for <span style="font-style: normal"><strong>fraud discovery</strong></span> seems poised to sweep the insurance industry, and then the rest of financial services.  Current activity, however, while decent, still seems to consist of more poising than sweeping.</p>
<p><span style="font-style: normal"><strong>Compliance</strong></span> is a minimum-acceptable-efforts kind of activity in most markets.  Accordingly, search/clustering seems to be the preferred text-checking approach.  Where that&#8217;s not the case, the market seems to have gone to specialized products like Assentor (stock brokerage).</p>
<p style="margin-bottom: 0in"><strong>Human resources</strong> is a good area to sell follow-on applications, at least to enterprises with so many employees that they want to automate the reading of employee feedback.  I&#8217;m not aware of it being the first-sale app to very many enterprises, however.</p>
<p style="margin-bottom: 0in">SAS used to speak glowingly of text mining used directly for <span style="font-style: normal"><strong>ETL.</strong></span> However, nobody else has talked about this, and even from SAS I get the sense that some of the glow has worn off.  As noted above, text analytics is an important ingredient in the transformation part of ETL, but I think it rarely would be the best option for doing the transformations directly.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/07/22/text-analytics-marketplace-trends/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>(A little) more on Business Objects/Inxight</title>
		<link>http://www.texttechnologies.com/2007/05/23/a-little-more-on-business-objectsinxight/</link>
		<comments>http://www.texttechnologies.com/2007/05/23/a-little-more-on-business-objectsinxight/#comments</comments>
		<pubDate>Wed, 23 May 2007 10:39:23 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Attensity]]></category>
		<category><![CDATA[Business Objects and Inxight]]></category>
		<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/05/23/a-little-more-on-business-objectsinxight/</guid>
		<description><![CDATA[After missing what seems to have been an uninformative press conference anyway, I hooked up later with the Business Objects folks on the phone. I say that it was probably uninformative because in the short call, it was pointed out to me that they really weren&#8217;t at liberty to say much anyway. Here are a [...]]]></description>
			<content:encoded><![CDATA[<p>After <a href="http://www.texttechnologies.com/2007/05/22/business-objects-inxight/" >missing</a> what seems to have been an uninformative press conference anyway, I hooked up later with the Business Objects folks on the phone.  I say that it was probably uninformative because in the short call, it was pointed out to me that they really weren&#8217;t at liberty to say much anyway. Here are a couple of tidbits I picked up even so.</p>
<ul>
<li><em>Business Objects&#8217; text mining partnerships have been more demo/sales-cycle than actual sales up until now. </em>That said, they have a few deals each with Attensity and Inxight (but not with ClearForest, which <a href="http://www.texttechnologies.com/2007/04/30/clearforest-reuters-acquisition/" >pulled in its horns</a> prior to being acquired by Reuters).   I still think they&#8217;re the leading BI vendor in integrating with text mining, SAS perhaps aside (who if nothing else have a lot of fun using text mining for data cleaning).  The working Inxight partnership, by the way, was all about the specific app of email compliance, with the demo being based on the publicly available Enron corpus.</li>
<li><em>Inxight&#8217;s visualization technology is in the form of an SDK anyway.  So integrating it into BOBJ&#8217;s product line should be straightforward. </em>Note: Through the Xcelsius acquisition, BOBJ has been trying to gain competitive advantage in the cool-visualization area.</li>
<li><em>Inxight&#8217;s &#8220;federation&#8221; capability for search is pretty primitive</em> (my term and opinion of course, not theirs).  It takes in search result sets from various sources, then clusters and/or refilters them.  What it does NOT do is the much harder task of taking actual relevancy rankings from various engines and somehow arbitrating between them.  Nor, I&#8217;m guessing, does it even assign higher or lower weights to various corpuses or anything like that.  Thus, it does not sound terribly competitive with the distributed search capabilities built into any state-of-the-art enterprise search engine.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/05/23/a-little-more-on-business-objectsinxight/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>ClearForest, Reuters, Factiva, Dow Jones, and possible futures</title>
		<link>http://www.texttechnologies.com/2007/04/30/clearforest-reuters-acquisition/</link>
		<comments>http://www.texttechnologies.com/2007/04/30/clearforest-reuters-acquisition/#comments</comments>
		<pubDate>Mon, 30 Apr 2007 15:19:55 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Factiva/Dow Jones]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/04/30/clearforest-reuters-acquisition/</guid>
		<description><![CDATA[ClearForest is being acquired by Reuters. That ClearForest is being bought is unsurprising. The company recently pulled in its marketing horns dramatically, a common sign of putting oneself up for sale. The Reuters move, meanwhile, can be seen as a sequel to the divestiture of its half of Factiva to former 50-50 partner Dow Jones. [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">ClearForest is being acquired by Reuters.<span> </span>That ClearForest is being bought is unsurprising.<span> </span>The company recently <a href="http://www.texttechnologies.com/2007/03/19/whats-going-on-at-clearforest/" >pulled in its marketing horns</a> dramatically, a common sign of putting oneself up for sale.<span> </span>The Reuters move, meanwhile, can be seen as a sequel to the divestiture of its half of Factiva to former 50-50 partner Dow Jones.</p>
<p class="MsoNormal">If the two main parts of the text mining market are custom publishing and <a href="http://www.texttechnologies.com/2006/07/27/application-processes-in-text-mining-%e2%80%93-finding-warning-signs/" >finding warning signs</a>, then both could actually be a good fit with Reuters.<span> </span>The custom publishing part is obvious. <span> </span>As for early warning – well, maybe ClearForest will lose its competitive edge in consumer product warranty analysis or something, but a significant fraction of the early warning market is tied to news articles, web postings, and other things that are a good fit for Reuters.</p>
<p class="MsoNormal">But the really interesting (at least to me) possibilities arise in the core Reuters and Dow Jones business of supporting investment decisions. <span id="more-104"></span> Factiva was prevented by partnership agreement from pursuing that market, which made things weird, since the parent firms weren’t doing what they could with text analytics either.<span> </span>Now each has its own text analytics subsidiary, and all possibilities are presumably on the table.<span> </span>And oh by the way – Factiva is arguably the biggest text mining/fact extraction company around, while ClearForest was making <a href="http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/" >progress in the hedge fund market</a> shortly before marketing went radio-silent.</p>
<p class="MsoNormal">Hmm.<span> </span>I want to think about this a bit before posting a list of possibilities.<span> </span>Or maybe even talk with Factiva, something I’ve – amazingly – almost never done.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/04/30/clearforest-reuters-acquisition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What&#8217;s going on at ClearForest?</title>
		<link>http://www.texttechnologies.com/2007/03/19/whats-going-on-at-clearforest/</link>
		<comments>http://www.texttechnologies.com/2007/03/19/whats-going-on-at-clearforest/#comments</comments>
		<pubDate>Tue, 20 Mar 2007 00:33:29 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/03/19/whats-going-on-at-clearforest/</guid>
		<description><![CDATA[I tried to invite Jay Henderson to speak on the Text Analytics Summit marketing panel, but got no answer to my e-mail. The company phone directory didn&#8217;t work so well for him either. I sent e-mail to a general PR company e-mail address, and that didn&#8217;t get returned. And Ravi tells me he has had [...]]]></description>
			<content:encoded><![CDATA[<p>I tried to invite Jay Henderson to speak on the <a href="http://www.texttechnologies.com/2007/03/07/three-crucial-issues-in-text-analytics/" >Text Analytics Summit marketing panel</a>, but got no answer to my e-mail.  The company phone directory didn&#8217;t work so well for him either.  I sent e-mail to a general PR company e-mail address, and that didn&#8217;t get returned.  And Ravi tells me he has had similar difficulties reaching them.<span id="more-92"></span></p>
<p>Does anybody know what&#8217;s going on at ClearForest these days?  A comparison of the <a href="http://clearforest.com/AboutUs/ManagementTeam.asp" onclick="javascript:pageTracker._trackPageview('/outbound/article/clearforest.com');">current management team listing</a>  and <a href="http://web.archive.org/web/20060423122232/http://www.clearforest.com/AboutUs/ManagementTeam.asp" onclick="javascript:pageTracker._trackPageview('/outbound/article/web.archive.org');">one from last April</a> does suggest there&#8217;s been substantial turnover in sales, marketing, and development management.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/03/19/whats-going-on-at-clearforest/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Text analytics is finally being used for investment analysis</title>
		<link>http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/</link>
		<comments>http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/#comments</comments>
		<pubDate>Wed, 27 Dec 2006 19:06:24 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Investment research and trading]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/</guid>
		<description><![CDATA[Jay Henderson of ClearForest tells me that hedge funds are one of their more interesting growth areas. It&#8217;s about time. I think a lot of the reason for investment firms not making more use of text analytics has been structural &#8212; Factiva, the (relatively speaking) mammoth joint venture of Reuters and Dow Jones, is forbidden [...]]]></description>
			<content:encoded><![CDATA[<p>Jay Henderson of ClearForest tells me that hedge funds are one of their more interesting growth areas.  It&#8217;s about time.</p>
<p>I think a lot of the reason for investment firms not making more use of text analytics has been structural &#8212; Factiva, the (relatively speaking) mammoth joint venture of Reuters and Dow Jones, is forbidden by its parent companies from meeting investment firms&#8217; needs.  And that&#8217;s kind of a pity, as it&#8217;s probably the best-positioned firm to do so.  It&#8217;s good to hear that the little guys are finally filling the gap.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2006/12/27/text-analytics-is-finally-being-used-for-investment-analysis/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Telling Attensity and ClearForest apart</title>
		<link>http://www.texttechnologies.com/2006/12/27/telling-attensity-and-clearforest-apart/</link>
		<comments>http://www.texttechnologies.com/2006/12/27/telling-attensity-and-clearforest-apart/#comments</comments>
		<pubDate>Wed, 27 Dec 2006 19:02:26 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attensity]]></category>
		<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[TEMIS]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2006/12/27/telling-attensity-and-clearforest-apart/</guid>
		<description><![CDATA[So far as I can tell, Attensity’s strategy when the company was originally founded was rather like ClearForest’s strategy today – and vice-versa. That said, here’s where they seem to stand at this time: Attensity wants to make text analytics very easy to integrate into business intelligence and data mining – at the moment, they’re [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">So far as I can tell, Attensity’s strategy when the company was originally founded was rather like ClearForest’s strategy today – and vice-versa.  That said, here’s where they seem to stand at this time:</p>
<ul>
<li class="MsoNormal">Attensity wants to make text analytics very easy to integrate into business intelligence and data mining – at the moment, they’re not too focused on the differences between those two disciplines – and is trying to deliver the best possible fact extraction consistent with that charter.</li>
<li class="MsoNormal">ClearForest wants to provide really great information extraction – to the limits of what can be done without excessive knowledge engineering – and is trying to integrate as well as possible with other technologies, the better to serve the customers who need what they offer.</li>
</ul>
<p><span id="more-62"></span>The guy I usually talk with at ClearForest, Jay Henderson, believes that text analytics is a collection of dozens of niche markets.  Not coincidentally, a lot of ClearForest’s customers are in the publishing sector (I’ve remarked on ClearForest’s synergy with Mark Logic before).  Attensity obviously is trying a broader play.  In Jay’s view, Inxight and TEMIS are more analogous to ClearForest than Attensity is, except that Inxight is focused on different markets (e.g., OEM and/or search), and he thinks ClearForest is just better than TEMIS except in a couple of specific kinds of understanding (e.g., life sciences, sentiment).</p>
<p class="MsoNormal">That said, both Attensity and ClearForest credibly claim to do large fractions of what the other one does.  ClearForest, as the currently nichier player, takes the traditional stance “We do everything they do, and more.  Most of our customers are ones who really appreciate the difference.”  Attensity conveys the equally traditional attitude “We do most of what they do, and a bunch of other stuff besides.  And it’s better-packaged too.  As for what they do that we don’t – not a lot of customers really have a need for it.”</p>
<p class="MsoNormal">Frankly, most enterprises that have a need for this technology should put both <a href="http://www.texttechnologies.com/2006/07/27/more-on-attensity/" >Attensity</a> and <a href="http://www.texttechnologies.com/2006/07/23/introduction-to-clearforest/" >ClearForest</a> on their short lists.  But here’s one technical note that may help predict who you’ll wind up actually selecting:  Attensity’s lead strategy for integration is to dump everything into relational tables, for conventional analytics-stack products like Business Objects’ and Teradata’s to manipulate.  ClearForest’s lead strategy for integration has more of an SOA/XML flavor, grown out of conventional OO.  If one of those sounds like an obviously better fit to your situation than the other, then that’s the vendor you absolutely, positively should not leave out of your evaluation process.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2006/12/27/telling-attensity-and-clearforest-apart/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mark Logic and the custom publishing business</title>
		<link>http://www.texttechnologies.com/2006/08/26/mark-logic-and-the-custom-publishing-business/</link>
		<comments>http://www.texttechnologies.com/2006/08/26/mark-logic-and-the-custom-publishing-business/#comments</comments>
		<pubDate>Sat, 26 Aug 2006 09:48:14 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[ClearForest/Reuters]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Mark Logic]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Specialized search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2006/08/26/mark-logic-and-the-custom-publishing-business/</guid>
		<description><![CDATA[I talked again with Mark Logic, makers of MarkLogic Server, and they continue to have an interesting story. Basically, their technology is better search/retrieval through XML. The retrieval part is where their major differentiation lies. Accordingly, their initial market focus (they’re up to 46 customers now, including lots of big names) is on custom publishing. [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">I talked again with Mark Logic, makers of MarkLogic Server, and they continue to have an interesting story.  Basically, their technology is better search/retrieval through XML.  The retrieval part is where their major differentiation lies.  Accordingly, their initial market focus (they’re up to 46 customers now, including lots of big names) is on custom publishing.  And by the way, they’re a good partner for fact-extraction companies, at least in the case of <a href="http://www.texttechnologies.com/2006/07/23/introduction-to-clearforest/" >ClearForest</a>.</p>
<p class="MsoNormal">Here, as best I understand, is the story of the custom publishing business. <span id="more-50"></span> Its core market is publishers of high-cost material sold to people with high-priced time – i.e., scientific/engineering/medical/legal/business/etc. Other markets are general publishing, internal document preparation (e.g., intelligence community), and of course maintenance manuals (maintenance/repair has been a flagship market for just about everything, from expert systems to generic text search to, of course, text mining now as well).</p>
<p class="MsoNormal">The phrase “custom publishing,” however, obscures the distinction between two different paradigms.  One of these is what we might call <em>true custom publishing</em> – assembling paragraphs, articles, chapters, or whatever from various sources into a document customized for specific reader needs, roles, or preferences.  On the revenue side, that’s a fascinating subject.  But technically, I’m more interested in the other view:  <em>search results plus.</em></p>
<p class="MsoNormal">We all know lots of problems with search engines.  One of the many is this:  Except on rare occasions, getting the benefit of a successful search involves a whole lot of link-clicking and scrolling.  But what if the relevant passages were all assembled together for you?  Link-clicking would be eliminated, and scrolling might be minimized as well.  The potential is huge.  But I don’t know what level of precision is needed before the theoretical benefits become real.</p>
<p class="MsoNormal">The two paradigms can be blended, of course.  A publishing product or dashboard or personal web page with a topic filter might deliver the results in custom-document rather than list-of-links form.  Once again, the problem is conciseness.  The more concise the “complete” results can be, the more useful this kind of technology will ultimately prove.</p>
<p class="MsoNormal">For more on Mark Logic, and more insight about the industry in general, see CEO <a href="http://marklogic.blogspot.com/" onclick="javascript:pageTracker._trackPageview('/outbound/article/marklogic.blogspot.com');">Dave Kellogg’s blog.</a> For a technical discussion of MarkLogic, see my <a href="http://www.dbms2.com/2006/08/26/mark-logic-and-the-marklogic-server/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">DBMS2.com write-up</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2006/08/26/mark-logic-and-the-custom-publishing-business/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

