<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Text Technologies &#187; Structured search</title>
	<atom:link href="http://www.texttechnologies.com/category/storage-search/structured-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Wed, 18 Jan 2012 17:02:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Data marts in the world of text</title>
		<link>http://www.texttechnologies.com/2009/09/20/data-marts-in-the-world-of-text/</link>
		<comments>http://www.texttechnologies.com/2009/09/20/data-marts-in-the-world-of-text/#comments</comments>
		<pubDate>Sun, 20 Sep 2009 09:08:53 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Ontologies]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Specialized search]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=334</guid>
		<description><![CDATA[CMS/search (Content Management System) expert Alan Pelz-Sharpe recently decried &#8220;Shadow IT&#8221;, by which he seems to mean departmental proliferation of data stores outside the control of the IT department. In other words, he&#8217;s talking about data marts, only for documents rather than tabular data. Notwithstanding the manifest virtues of centralization, there are numerous reasons you [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">CMS/search (Content Management System) expert Alan Pelz-Sharpe recently <a href="http://www.intelligententerprise.com/blog/archives/2009/08/shadow_it_and_e.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.intelligententerprise.com');">decried &#8220;Shadow IT&#8221;</a>, by which he seems to mean departmental proliferation of data stores outside the control of the IT department. In other words, he&#8217;s talking about data marts, only for documents rather than tabular data.</p>
<p style="margin-bottom: 0in;">Notwithstanding the manifest virtues of centralization, there are numerous reasons you might want data marts,  in the tabular and document worlds alike.  For example:</p>
<ul>
<li><strong>Price/performance.</strong> Your 	main/central data manager might be too expensive to support 	additional large specialized databases. Or different databases and 	applications might have sufficiently different profiles so as to get 	great price/performance from different kinds of data managers. This 	is particularly prevalent in the relational world, where each of 	column stores, sequentially-oriented row stores, and random 	I/O-oriented row stores have compelling use cases.</li>
<li><strong>Different SLAs</strong> (Service-Level Agreements). Similarly, different applications may 	have very different requirements for uptime, response time, and the 	like.  (In the relational world, think of operational data stores.)</li>
<li><strong>Different security 	requirements.</strong> Different subsets of the data may need different 	levels of security. This is particularly prevalent in the document 	world, where security problems are not as well-solved as in the 	tabular arena, and where it&#8217;s common for a search engine to index 	across different corpuses with radically different levels of 	sensitivity.</li>
<li><strong>Integrated application and user 	interfaces.</strong> In the relational world, there&#8217;s a pretty clean 	separation between data management and interface logic; most serious 	business intelligence tools can talk to most DBMS. The document 	world is quite different. Some search engines bundle, for example, 	various kinds of faceted or parameterized search interfaces. What&#8217;s 	more, in public-facing search, a major differentiator is the 	facilities that the product offers for skewing search results.</li>
<li><strong>Different text applications 	require different thesauruses or taxonomy management systems</strong>. 	Ideally, those should all be integrated &#8212; but <a href="../2005/12/11/the-text-technologies-market-3-heres-whats-missing/">the 	requisite technology still doesn&#8217;t exist</a>.</li>
</ul>
<p style="margin-bottom: 0in;">Bottom line: <strong>Text data marts, much like relational data marts, are almost surely here to stay.</strong></p>
<p style="margin-bottom: 0in;"><em><strong>Related link</strong></em></p>
<ul>
<li>
<p style="margin-bottom: 0in;"><a href="http://www.dbms2.com/2009/06/08/the-future-of-data-marts/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">The 	future of data marts</a></p>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2009/09/20/data-marts-in-the-world-of-text/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Where &#8220;semantic&#8221; technology is or isn&#8217;t important</title>
		<link>http://www.texttechnologies.com/2008/12/29/where-semantic-technology-is-or-isnt-important/</link>
		<comments>http://www.texttechnologies.com/2008/12/29/where-semantic-technology-is-or-isnt-important/#comments</comments>
		<pubDate>Tue, 30 Dec 2008 00:59:55 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Ontologies]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Specialized search]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=301</guid>
		<description><![CDATA[At Lynda Moulton&#8217;s behest, I spoke a couple of times recently on the subject of where &#8220;semantic&#8221; technology is or isn&#8217;t likely to be important.  One was at the Gilbane conference in early December.  The slides were based on my previously posted deck for a June talk I gave on a text analytics market overview. [...]]]></description>
			<content:encoded><![CDATA[<p>At Lynda Moulton&#8217;s behest, I spoke a couple of times recently on the subject of where &#8220;semantic&#8221; technology is or isn&#8217;t likely to be important.  One was at the Gilbane conference in early December.  The slides were based on my previously posted deck for a June talk I gave on a <a href="http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/" >text analytics market overview</a>. The actual Gilbane slides may be found <a href="http://www.monash.com/uploads/Gilbane-December-2008.ppt" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');">here</a>.</p>
<p>My opinions about the applicability of semantic technology include:</p>
<ul>
<li>The big bucks in web search are for &#8220;transactional&#8221; web search, and semantics isn&#8217;t the issue there. <em>(Slides 3-4)</em></li>
<li>When UIs finally go beyond the simple search box &#8212; e.g. to clusters/facets or to voice &#8212; semantics should have a role to play. <em>(Slide 5)</em></li>
<li>Public-facing site search depends &#8212; more than any other area of text analytics &#8212; on hand-tagging. <em>(Slide 7)</em></li>
<li>&#8220;Enterprise&#8221; search that searches specialized external databases could benefit from semantic technologies. <em>(Slide 8)</em></li>
<li>True enterprise search could benefit from semantic technologies in multiple ways, but has other problems as well. <em>(Slides 10-11)</em></li>
<li>Semantics &#8212; specifically extraction &#8212; is central to custom publishing. <em>(Slide 12 &#8212; upon review I regret using the word &#8220;sophisticated&#8221;)</em></li>
<li>Semantics is central to text mining. <em>(Slide 18)</em></li>
<li>Semantics could play a big role in all sorts of exciting future developments. <em>(Slide 19)</em></li>
</ul>
<p>So what would your list be like?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/12/29/where-semantic-technology-is-or-isnt-important/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Worst search UI ever</title>
		<link>http://www.texttechnologies.com/2008/10/05/worst-search-ui-ever/</link>
		<comments>http://www.texttechnologies.com/2008/10/05/worst-search-ui-ever/#comments</comments>
		<pubDate>Mon, 06 Oct 2008 01:48:34 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=284</guid>
		<description><![CDATA[On the whole, the Barack Obama campaign has been very internet-savvy. Maybe their web site JohnMcCainRecord.com is yet another example of same. But to my eyes, it has such an appallingly bad search interface that people going to the site are apt to be annoyed. To wit: There is a huge search box in the center [...]]]></description>
			<content:encoded><![CDATA[<p>On the whole, the Barack Obama campaign has been very internet-savvy.  Maybe their web site <a href="http://www.johnmccainrecord.com/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.johnmccainrecord.com');">JohnMcCainRecord.com</a> is yet another example of same.  But to my eyes, it has such an appallingly bad search interface that people going to the site are apt to be annoyed.  To wit:</p>
<ul>
<li>There is a huge search box in the center of the screen.</li>
<li>All the search box ever does is take you to one of the 13 categories listed right below it.</li>
<li>Usually, it doesn&#8217;t even do that.  Instead, it just fails.  For example, I entered <em>terrorism</em> and hit &#8220;Go&#8221;, and got no response.  Ditto <em>nuclear energy.</em></li>
<li>When it does give you an answer, it&#8217;s apt not to be what you were looking for. For example, entering <em>Iran</em> takes you to the <em>Foreign Policy</em> page, which contains nothing about Iran.</li>
</ul>
<p>And then, of course, there&#8217;s the funny stuff.  For example, if you search on <em>foo,</em> you are taken to <em>Rural Issues.</em></p>
<p>In general terms, I like the idea of the site.  But absent some serious changes, JohnMcCainRecord.com should <em>not</em> have a search interface.</p>
<p><em>Edit:  More here in my post on <a href="http://www.networkworld.com/community/node/33622" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.networkworld.com');">The Obama campaign&#8217;s Search Engine to Nowhere</a></em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/10/05/worst-search-ui-ever/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Attivio update</title>
		<link>http://www.texttechnologies.com/2008/09/20/attivio-update/</link>
		<comments>http://www.texttechnologies.com/2008/09/20/attivio-update/#comments</comments>
		<pubDate>Sat, 20 Sep 2008 05:00:06 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Application areas]]></category>
		<category><![CDATA[Attivio]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=283</guid>
		<description><![CDATA[I talked w/ Andrew McKay of Attivio for 2 ½ hours Thursday. I&#8217;ve also been working with some Attivio engineers on a blog search engine. I think it&#8217;s time to post about Attivio. In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + XML store + cool [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">I talked w/ Andrew McKay of Attivio for 2 ½ hours Thursday.  I&#8217;ve also been working with some Attivio engineers on a blog search engine.  I think it&#8217;s time to post about Attivio. <img src='http://www.texttechnologies.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  <span id="more-283"></span></p>
<p style="margin-bottom: 0in;">In its full conception, the Attivio Intelligence Engine is something like Endeca + RDBMS + search engine + XML store + cool extra features.  And all with seamless, lightweight, integrated installation and administration.  That&#8217;s the goal, anyway.  At this point, naturally, each individual piece is far from complete. For example:</p>
<ul>
<li>Sufficient SQL support to handle 	most BI tools is still a matter for future releases &#8212; apparently in 	2009, although Attivio is one of those agile companies for which 	pinning down product releases is somewhat difficult.</li>
<li>The same goes for some basic GUI features (such as most non-programmatic search tuning).</li>
<li>ACID compliance is not a high 	priority for Attivio. I actually think it should be higher, just 	because it&#8217;s increasingly become an “OK, we don&#8217;t have to worry 	about THAT” checkmark item.</li>
</ul>
<p style="margin-bottom: 0in;">Even in its early days, Attivio has had some nice-sounding customer successes.  There are 8 paying Attivio customers, including 2 &gt; $1 million deals, one half-millionish dollar deal, and 1 large OEM.  3 represent actual deployments, with the rest in development.  More sales are on the way, as are permissions to disclose customer names that people will actually recognize.  Customer application stories Andrew told me about include:</p>
<ul>
<li>A web-business parameterized, 	adjustable-weight search that&#8217;s starting with tabular data and only 	getting to free-text later.</li>
<li>An enterprise that&#8217;s using Attivio 	for content management, enterprise search, public-facing search, <em>and</em> data warehousing.</li>
<li>Something 	big/mysterious/classified, with large document volumes.</li>
<li>Something to do with compliance, 	about which Andrew was going to forward a lot more detail that 	evening (Hint, hint).</li>
</ul>
<p style="margin-bottom: 0in;">Since the major RDBMS (Oracle, Microsoft SQL Server, DB2) all have text search and XML subsystems, they can in principle do everything Attivio does on the back end, and with a lot more features and maturity.  The same would go for Marklogic.   Performance and overhead might be different matters, however; Andrew certainly believes so.</p>
<p style="margin-bottom: 0in;">Except that Lucene is included on the search side, I haven&#8217;t actually figured out how Attivio stores data.  The fact that SQL features are being added incrementally suggests Attivio is rolling its own relational database capability, but how it&#8217;s organized I don&#8217;t really know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/09/20/attivio-update/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>The Text Analytics Marketplace: Competitive landscape and trends</title>
		<link>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/</link>
		<comments>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 07:35:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Audio and video search]]></category>
		<category><![CDATA[BI integration]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Progress and EasyAsk]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Social software and online media]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Speech recognition]]></category>
		<category><![CDATA[Structured search]]></category>
		<category><![CDATA[Text Analytics Summit]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=249</guid>
		<description><![CDATA[As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”: 1. Web search 2. Public-facing site search 3. Enterprise search and knowledge management 4. Custom publishing 5. Text mining and extraction Three are more standalone: 6. Spam [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">1.  Web search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">2.  Public-facing site search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">3.  Enterprise search and knowledge management</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">4.  Custom publishing</p>
<p style="padding-left: 30px;">5.  Text mining and extraction</p>
<p style="margin-bottom: 0in; font-style: normal;">Three are more standalone:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">6.  Spam filtering</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">7.  Voice recognition</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">8.  Machine translation</p>
<p><span id="more-249"></span></p>
<p style="margin-bottom: 0in;">This list comes from a talk I gave Monday at the Text Analytics Summit called <em>The Text Analytics Marketplace: Competitive landscape and trends. </em>In half an hour, I covered the first five areas (in Sue Feldman&#8217;s word, at a “gallop”). The slide deck has been uploaded to the link below.  <span style="font-style: normal;"><span>I plan to break out the material from the talk into a series of blog posts over the next few (or perhaps not-so-few) weeks. </span></span></p>
<p style="margin-bottom: 0in;"><em><strong>Slides:</strong></em></p>
<ul>
<li><a href="http://www.monash.com/Text-analytics-markets-June-2008.ppt" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');"><span>The Text Analytics Marketplace: Competitive landscape and trends</span></a></li>
</ul>
<p style="margin-bottom: 0in;"><strong><em>Other posts based on those slides:</em></strong></p>
<ul>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/" >Three specialized markets for text analytics</a> (based on Slide 2)</span></li>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/6-trends-that-could-shake-up-the-text-analytics-market/" >6 trends that could shake up the text analytics market</a> (based on Slide 19)</span></li>
<li><span><a href="(in A World of Bytes)">Why search technologies are going to recombine</a> (in <em>A World of Bytes</em>, based on Slide 19)<br />
</span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How text search has evolved over the past 15 years</title>
		<link>http://www.texttechnologies.com/2008/06/15/how-text-search-has-evolved-over-the-past-15-years/</link>
		<comments>http://www.texttechnologies.com/2008/06/15/how-text-search-has-evolved-over-the-past-15-years/#comments</comments>
		<pubDate>Sun, 15 Jun 2008 07:26:50 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Ontologies]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=239</guid>
		<description><![CDATA[I just stumbled across a brilliant summary of evolution in text search technology, written four years ago. It&#8217;s equally valid today (which in itself says something). I found it on the Prism Legal blog, but the actual author is Sharon Flank. My own comments are interspersed in bold. “There are several underlying important developments over [...]]]></description>
			<content:encoded><![CDATA[<p>I just stumbled across a brilliant summary of evolution in text search technology, written four years ago.  It&#8217;s equally valid today (which in itself says something).  I found it on the <a href="http://www.prismlegal.com/wordpress/index.php?m=200407#post-190" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.prismlegal.com');">Prism Legal</a> blog, but the actual author is Sharon Flank.  My own comments are interspersed in bold.<span id="more-239"></span></p>
<blockquote><p>“There are several underlying important developments over the last decade or so:</p>
<ul>
<li>Incorporating user feedback to refine search results, usually indirectly rather than explicitly, making results better through machine learning. [Amazon.com is the most-often cited example of this with it’s “if you like A, you’ll also like B.”]  <strong>[CAM] Technically, that&#8217;s not a search example, but the general point is correct even so.</strong></li>
<li>Assessments based on usage or referral. This is what makes Google so useful and popular. This approach gives higher rankings if other web sites point to a target or if that target gets a lot of hits.</li>
<li>Various approaches to using taxonomies. The better applications use taxonomies as a navigation guide but don’t force it or require administrators to implement it. Vivisimo.com is an example of interesting, automated clustering approach. <strong>[CAM] &#8220;Faceted search&#8221; seems to be the buzzword here. It&#8217;s a big part of what I call &#8220;structured search.&#8221; But taxonomy use is probably more trivial in search than it is in, say, text mining.</strong></li>
<li>Better handling of phrases. Google automatically parses phrases and deals with search terms as phrases. This now seems natural but in the AltaVista days, you couldn’t tell a Venetian blind from a blind Venetian [example courtesy of Prof. George Miller, Princeton Univ. - too good not to cite].</li>
<li>Context-sensitive search is now an emerging trend. Systems track what users have previously searched for and infer interest in the same domain to refine search result. So if you look for “line” and a system knows you’ve just looked for “tacklebox,” then it infers you mean “fishing line.” Or if you search for bagels and the system knows you are in 20009, it tells you that you can buy them at Comet Liquors (which happens to sell bagels).  <strong>[CAM] That happens a lot with ad serving.  But I&#8217;m not convinced it hit actual search until Google&#8217;s personal search kicked off, and that was quite recent.</strong></li>
</ul>
<p>“More generally in natural language processing, the statistical and linguistic approaches are converging in a new way: use massive amounts of data (i.e. the Web) to get statistical answers to deep linguistic questions, like “How do we figure out what the most likely referent is for the pronoun ‘they’?” Or “How do we determine the correct sense for ambiguous words?” These things aren’t in search engines yet, but you can expect to see more “intelligent” features coming out of this approach.</p>
<p>“Looking at this list, you can see that the conceptual changes (breakthroughs?), with the exception of better phrase handling, are primarily focused around Web searches. When dealing with one-of-a-kind document collections behind the corporate firewall, many of these developments turn out not to add much to older approaches. So, at least for enterprise search, I too remain partial to some of the older products you mention, though I am disappointed that most of the old-time vendors have not updated their approaches beyond adding taxonomy support.” <strong>[CAM] Yep, web search and enterprise search are <a href="http://www.texttechnologies.com/2008/01/14/enterprise-search-versus-web-search/" >very different things</a>.</strong></p>
</blockquote>
<p>The original blog post did have one error &#8212; Sharon&#8217;s PhD isn&#8217;t in Computational Linguistics, but rather Slavic Linguistics, as I recently noted in my post about <a href="http://www.texttechnologies.com/2008/06/10/text-analytics-technology-jobs-humanities-majors/" >text analytics careers for humanities majors</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/15/how-text-search-has-evolved-over-the-past-15-years/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Powerset is mildly interesting</title>
		<link>http://www.texttechnologies.com/2008/05/12/powerset-is-mildly-interesting/</link>
		<comments>http://www.texttechnologies.com/2008/05/12/powerset-is-mildly-interesting/#comments</comments>
		<pubDate>Mon, 12 May 2008 14:17:22 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Powerset]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=220</guid>
		<description><![CDATA[Powerset has done a great job of generating buzz for its version of smart search. That said, its current demo is mediocre &#8212; and that&#8217;s being polite. Powerset currently indexes little more than just Wikipedia, and the quality of its search results is about comparable to that of Wikipedia&#8217;s justly reviled internal search engine. To [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.powerset.com" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.powerset.com');">Powerset</a> has done a great job of generating buzz for its version of smart search.  That said, its current demo is mediocre &#8212; and that&#8217;s being polite. Powerset currently indexes little more than just Wikipedia, and the quality of its search results is about comparable to that of Wikipedia&#8217;s justly reviled internal search engine.  To determine this, I did searches on both sites on five strings.  Wikipedia typically had more total junk ranking higher, but it also put the very best hits of all higher than Powerset did.  The strings were:</p>
<ul>
<li>Drosophila research</li>
<li>Bill Clinton foreign policy</li>
<li>Home run hitters</li>
<li>Innocents on death row</li>
<li>Text data mining</li>
</ul>
<p><span id="more-220"></span>Powerset does have a nice set of UI features in terms of automatic faceted search and so on, but these days who doesn&#8217;t?</p>
<p><em><strong>Some discussion of Powerset:</strong></em></p>
<ul>
<li>Michael Arrington seems <a href="http://www.techcrunch.com/2008/05/11/powerset-launches-showcase-for-user-search-experience/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.techcrunch.com');">impressed with Powerset</a></li>
<li>Dan Farber thinks <a href="http://www.news.com/8301-13953_3-9940887-80.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.news.com');">Microsoft may be impressed</a></li>
<li>Vanessa Fox definitely <a href="http://www.vanessafoxnude.com/2008/05/11/powersets-new-factz-from-wikipedia/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.vanessafoxnude.com');">isn&#8217;t</a></li>
<li>VentureBeat is taking a <a href="http://venturebeat.com/2008/05/12/powerset-opens-to-everyone-now-whats-next/" onclick="javascript:pageTracker._trackPageview('/outbound/article/venturebeat.com');">wait and see</a> attitude</li>
<li>So is Om Malik, who notes that <a href="http://gigaom.com/2008/05/11/powerset-is-live/" onclick="javascript:pageTracker._trackPageview('/outbound/article/gigaom.com');">Powerset performance is a bear</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/05/12/powerset-is-mildly-interesting/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Implications of Microsoft&#8217;s bid for Yahoo</title>
		<link>http://www.texttechnologies.com/2008/02/01/microsoft-yahoo-takeover/</link>
		<comments>http://www.texttechnologies.com/2008/02/01/microsoft-yahoo-takeover/#comments</comments>
		<pubDate>Fri, 01 Feb 2008 13:32:22 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[FAST]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Structured search]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2008/02/01/microsoft-yahoo-takeover/</guid>
		<description><![CDATA[As I write this, Microsoft has just announced an offer to acquire Yahoo. Early responses from the likes of Danny Sullivan, Henry Blodget, the Download Squad, TechCrunch, Raven SEO, Mashable, and others seem to boil down to: Wow. Both sides needed it. Yahoo wasn&#8217;t going anywhere fast on its own. Microsoft wasn&#8217;t going anywhere fast [...]]]></description>
			<content:encoded><![CDATA[<p>As I write this, Microsoft has just announced an offer to acquire Yahoo.  Early responses from the likes of <a href="http://searchengineland.com/080201-064343.php" onclick="javascript:pageTracker._trackPageview('/outbound/article/searchengineland.com');">Danny Sullivan</a>, <a href="http://www.alleyinsider.com/2008/02/microsoft-bids-31-a-share-for-yahoo-msftyhoo.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.alleyinsider.com');">Henry Blodget</a>, the <a href="http://www.downloadsquad.com/2008/02/01/breaking-news-microsoft-seeking-to-acquire-yahoo/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.downloadsquad.com');">Download Squad</a>, <a href="http://www.techcrunch.com/2008/02/01/wow-microsoft-offers-446-billion-to-acquire-yahoo/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.techcrunch.com');">TechCrunch</a>, <a href="http://raven-seo-tools.com/blog/105/microsoft-looks-to-upset-the-search-engine-balance-with-offer-to-buy-yahoo" onclick="javascript:pageTracker._trackPageview('/outbound/article/raven-seo-tools.com');">Raven SEO</a>, <a href="http://mashable.com/2008/02/01/microsoft-wants-to-acquire-yahoo-for-446-billion/" onclick="javascript:pageTracker._trackPageview('/outbound/article/mashable.com');">Mashable</a>, and others seem to boil down to:</p>
<ul>
<li>Wow.</li>
<li>Both sides needed it.</li>
<li>Yahoo wasn&#8217;t going anywhere fast on its own.</li>
<li>Microsoft wasn&#8217;t going anywhere fast in search on its own.</li>
<li>This may be enough critical mass to matter.</li>
<li>Conference call at 8:30 am</li>
</ul>
<p>I&#8217;ll try to be a bit more analytical than that, but this is still going to be quick.  Assuming the deal goes through:</p>
<ol>
<li>Microsoft will recombine both parts of the old <a href="http://www.texttechnologies.com/2008/01/08/microsoft-fast-prohibition/" >FAST/alltheweb.com</a>. Therefore, Microsoft will be able to use the same technology for web and enterprise search, <a href="http://www.texttechnologies.com/2008/01/14/enterprise-search-versus-web-search/" >to the extent that such commonality makes sense</a>.</li>
<li>I&#8217;d expect Microsoft to try to differentiate its technology via faceted/structured search.  That&#8217;s a FAST strength.</li>
<li>The old FAST <a href="http://www.texttechnologies.com/2007/02/01/what%e2%80%99s-interesting-about-the-fast-venture-in-bi/" >search-as-BI</a> dream might become pretty appealing to Microsoft/Yahoo.</li>
<li>On a non-search note, Microsoft is strong in games and Yahoo is strong in fantasy sports.  Look for some synergies.</li>
<li>There sure would be a whole lot of non-Windows technology inside Microsoft. <img src='http://www.texttechnologies.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </li>
</ol>
<p>Basically, Microsoft is a company that&#8217;s a lot more sophisticated in its thinking about user interfaces and experiences than Yahoo is.  That&#8217;s where the really interesting competitive innovation would be most likely to occur.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/02/01/microsoft-yahoo-takeover/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>More on Microsoft in enterprise search</title>
		<link>http://www.texttechnologies.com/2008/01/08/more-on-microsoft-in-enterprise-search/</link>
		<comments>http://www.texttechnologies.com/2008/01/08/more-on-microsoft-in-enterprise-search/#comments</comments>
		<pubDate>Tue, 08 Jan 2008 19:24:50 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[FAST]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Structured search]]></category>
		<category><![CDATA[SharePoint]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2008/01/08/more-on-microsoft-in-enterprise-search/</guid>
		<description><![CDATA[Following up on my prior posts about Microsoft&#8217;s impending acquisition of FAST, they&#8217;ve now had the conference call. By custom and indeed antitrust law, such calls are very light on content. But here are a few tidbits and takeaways, all from Jeff Raikes of Microsoft: Jeff talked solely about FAST as adding to enterprise search, [...]]]></description>
			<content:encoded><![CDATA[<p>Following up on my <a href="http://www.texttechnologies.com/2008/01/08/microsoft-fast-prohibition/" >prior</a> <a href="http://www.texttechnologies.com/2008/01/08/microsoft-in-enterprise-search/" >posts</a> about Microsoft&#8217;s impending acquisition of FAST, they&#8217;ve now had the conference call.  By custom and indeed antitrust law, such calls are very light on content.   But here are a few tidbits and takeaways, all from Jeff Raikes of Microsoft:</p>
<ol>
<li>Jeff talked solely about FAST as adding to enterprise search, and rightly contrasted that with web search.</li>
<li>However, he deflected questions about web search with &#8220;We aren&#8217;t talking about that much detail right now&#8221; rather than with a firm &#8220;Well, we aren&#8217;t allowed to use FAST that way.&#8221;</li>
<li>Specifically, enterprise search is all about integration with SharePoint (portal).</li>
<li>Jeff said Microsoft&#8217;s current search could handle millions or maybe tens of millions of documents, but thought there was demand for FAST&#8217;s ability to handle billions.</li>
<li>He positioned FAST as an application development platform, giving an example of structured search (the actual word was &#8220;pivot&#8221;) in consumer electronics.  &#8230; Well, at least he&#8217;s looking in the right direction.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/01/08/more-on-microsoft-in-enterprise-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Danny Sullivan thinks blended vertical search is a game-changer</title>
		<link>http://www.texttechnologies.com/2007/12/02/danny-sullivan-thinks-blended-vertical-search-is-a-game-changer/</link>
		<comments>http://www.texttechnologies.com/2007/12/02/danny-sullivan-thinks-blended-vertical-search-is-a-game-changer/#comments</comments>
		<pubDate>Mon, 03 Dec 2007 00:33:37 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Search engine optimization (SEO)]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Specialized search]]></category>
		<category><![CDATA[Structured search]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/12/02/danny-sullivan-thinks-blended-vertical-search-is-a-game-changer/</guid>
		<description><![CDATA[Danny Sullivan thinks blended vertical search &#8212; which he&#8217;s calling Search 3.0 &#8212; is a game changer. (In this context, &#8220;vertical&#8221; search denotes alternate result types such as video, image, map coordinates, or product listings.) In saying that, he&#8217;s focused on search marketers, who now have a lot more ways to try to get their [...]]]></description>
			<content:encoded><![CDATA[<p>Danny Sullivan thinks <a href="http://searchengineland.com/071127-091128.php" onclick="javascript:pageTracker._trackPageview('/outbound/article/searchengineland.com');">blended vertical search &#8212; which he&#8217;s calling Search 3.0 &#8212; is a game changer</a>.  (In this context, &#8220;vertical&#8221; search denotes alternate result types such as video, image, map coordinates, or product listings.)   In saying that, he&#8217;s focused on search marketers, who now have a lot more ways to try to get their messages onto Google searchers&#8217; top result pages.   But I presume what he&#8217;s really saying is that there will be a feedback effect &#8212; if Google tells all web searchers about videos and product listings, then internet marketers will be more motivated to post videos and product listings, and hence there will be more interesting choices of videos and product listings &#8212; which Google will naturally wind up featuring more prominently in its search results.  And so on.</p>
<p>Given the YouTube explosion, I find it hard to argue with his claim.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/12/02/danny-sullivan-thinks-blended-vertical-search-is-a-game-changer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

