Search engines

Analysis of search technology, products, services, and vendors.

November 25, 2012

The future of search

I believe there are two ways search will improve significantly in the future. First, since talking is easier than typing, speech recognition will allow longer and more accurate input strings. Second, search will be informed by much more persistent user information, with search companies having very detailed understanding of searchers. Based on that, I expect:

My reasoning starts from several observations:

In principle, there are two main ways to make search better:

The latter, I think, is where significant future improvement will be found.

Read more

March 29, 2010

Google’s version of an old joke

Search Google for “recursion” and it helpfully offers a link to let you search on — you guessed it — “recursion.”  The joke has been implemented in German as well.

This idea is not, to put it mildly, new. I first saw the definition

Recursion: See recursion

in the glossary to Intellicorp’s KEE documentation, in 1984 or so. And I’d guess the joke is actually a lot older than that.

For another variation of the same idea, see this link.

September 20, 2009

Data marts in the world of text

CMS (content management system) and search expert Alan Pelz-Sharpe recently decried “Shadow IT”, by which he seems to mean departmental proliferation of data stores outside the control of the IT department. In other words, he’s talking about data marts, only for documents rather than tabular data.

Notwithstanding the manifest virtues of centralization, there are numerous reasons you might want data marts,  in the tabular and document worlds alike.  For example:

Bottom line: Text data marts, much like relational data marts, are almost surely here to stay.


May 30, 2009

MEN ARE FROM EARTH, COMPUTERS ARE FROM VULCAN

The newsletter/column excerpted below was originally published in 1998.  Some of the specific references are obviously very dated.  But the general points about the requirements for successful natural language computer interfaces still hold true.  Less progress has been made in the intervening decade-plus than I would have hoped, but some recent efforts — especially in the area of search-over-business-intelligence — are at least mildly encouraging.  Emphasis added.

Natural language computer interfaces were introduced commercially about 15 years ago*.  They failed miserably.

*I.e., the early 1980s

For example, Artificial Intelligence Corporation’s Intellect was a natural language DBMS query/reporting/charting tool.  It was actually a pretty good product.  But it’s infamous among industry insiders as the product for which IBM, in one of its first software licensing deals, got about 1700 trial installations — and less than a 1% sales close rate.  Even its successor, Linguistic Technologies’ English Wizard*, doesn’t seem to be attracting many customers, despite consistently good product reviews.

*These days (i.e., in 2009) it’s owned by Progress and called EasyAsk. It still doesn’t seem to be selling well.

Another example was HAL, the natural language command interface to 1-2-3.  HAL is the product that first made Bill Gross (subsequently the founder of Knowledge Adventure and idealab!) and his brother Larry famous.  However, it achieved no success*, and was quickly dropped from Lotus’ product line.

*I loved the product personally. But I was sadly alone.

In retrospect, it’s obvious why natural language interfaces failed. First of all, they offered little advantage over the forms-and-menus paradigm that dominated enterprise computing in both the online-character-based and client-server-GUI eras.  If you couldn’t meet an application need with forms and menus, you couldn’t meet it with natural language either. Read more

May 29, 2009

Google Wave — finally a Microsoft killer?

Google held a superbly received preview of a new technology called Google Wave, which promises to “reinvent communication.” In simplest terms, Google Wave is a software platform that:

If this all works out, Google Wave could play merry hell with Microsoft Outlook, Microsoft Word, Microsoft Exchange, Microsoft SharePoint, and more.

I suspect it will.

And by the way, there’s a cool “natural language” angle as well. Read more

April 3, 2009

Google has a lot more features than I realized

A features and syntax page reveals that the basic Google search box now gives you flight times, weather, stock quotes, sports scores, currency conversion, calculator results, and a lot more. Wow. I did not know.

Since the early 1980s, I’ve thought that natural language interfaces, spoken or otherwise, would someday win.  While this versatility isn’t natural language per se, it is still, in my opinion, evidence in favor of that belief.

April 3, 2009

Thoughts on the rumored Google/Twitter deal

Michael Arrington reports that Google and Twitter are contemplating both:

I have three initial thoughts on this:

1. Clearly, in Google’s mission to “organize all the world’s information,” there are several web areas it isn’t yet doing well in, and one of those is microblogs. What’s more, much as in the case of YouTube, it’s hard to see how Google would do that organizing any time soon unless it owned or otherwise was in bed with the leading platform for that kind of content — i.e., Twitter.

2. The YouTube example is apt in another way as well — it’s not clear where the monetization would come from. Google famously doesn’t make much advertising revenue from YouTube. And Twitter is even worse as an advertising platform; sticking ads into the tweetstream would quickly drive users elsewhere, and any other advertising scheme would likely fail because of the broad variety of interfaces — such as various mobile phones — Twitterers use to get at the service.

3. I’ve been suggesting all along that Twitter needs radical user experience enhancements. But when has Google ever made user experience enhancements to a service? Its core search engine always looks pretty much the same. Ditto GMail. Ditto Blogger. Ditto YouTube.

March 31, 2009

Twitter shows some directions for growth

TechCrunch pointed out a Twitter jobs page. The specific job TechCrunch mentioned* isn’t up there any more, but at the moment I write this, 18 others are (copied below). That’s considerable growth, given that the same page says Twitter has fewer than 30 current employees. Note the emphasis on search and the mention of Japan.

*Care and feeding of celebrity tweeters. Celebrity tweeting is actually a subject I’ve written and even been interviewed about several times.

As of this writing, the full list is: Read more

March 7, 2009

Yet more NoFollow whining

Andy Beal has a blog post up to the effect that NoFollow is a bad thing. (Edit: Andy points out in the comment thread that his opposition to NoFollow isn’t as absolute as I was suggesting.) Other SEO types are promoting this as if it were some kind of important cause. I think that’s nuts; NoFollow is a huge spam-reducer.

The weakness of Andy’s argument is illustrated by the one and only scenario he posits in support of his crusade:

The result is that a blog post added to a brand new site may well have just broken the story about the capture of Bin Laden (we wish!)–and a link to said post may have been Tweeted and re-tweeted–but Google won’t discover or index that post until it finds a “followed” link. Likely from a trusted site in Google’s index and likely hours, if not days, after it was first shared on Twitter.

Helloooo! If I post something here, it is indexed at least in Google blog search immediately. (As in, within a minute or so.) Ping, crawl, pop, and there it is. The only remotely valid version of Andy’s complaint is that it might take some hours for Google’s main index to update, but even there, there’s a News listing at the top. This simply is not a problem.

Now, I think it would be personally great for me if all the links to my sites from Wikipedia and Twitter and the comment threads of major blogs pointed back with “link juice.” On the other hand, even with NoFollow out there, my sites come up high in Google’s rankings for all sorts of keywords, driving a lot of their readership. I imagine the same is true for most other sites containing fairly unique content that people find interesting enough to link to.

So other than making it harder to engage in deceptive SEO, I fail to see what problems NoFollow is causing.
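For readers who haven't looked at the mechanics: NoFollow is just a rel="nofollow" token on a link, and a crawler that honors it leaves such links out of the set it uses for discovery and ranking credit. Below is a minimal sketch of that split using only Python's standard library. The class name, function name, and sample URLs are made up for illustration; this is not a description of how Google's crawler actually works.

```python
from html.parser import HTMLParser


class LinkSorter(HTMLParser):
    """Split a page's <a href> links into followed and nofollow lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel holds space-separated tokens, e.g. rel="external nofollow";
        # a valueless rel attribute arrives as None, hence the "or".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def sort_links(html):
    """Return (followed, nofollow) link lists for an HTML fragment."""
    parser = LinkSorter()
    parser.feed(html)
    return parser.followed, parser.nofollow


# Hypothetical page: one ordinary link, one comment-thread link.
sample = (
    '<p><a href="http://example.com/post">story</a> '
    '<a rel="nofollow" href="http://example.com/comment-link">comment</a></p>'
)
followed, nofollow = sort_links(sample)
print(followed)
print(nofollow)
```

In this toy version, only the first list would pass along any "link juice"; the comment-thread link is still visible to human readers, which is the whole point of the spam-reduction argument.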

December 29, 2008

Where “semantic” technology is or isn’t important

At Lynda Moulton’s behest, I spoke a couple of times recently on the subject of where “semantic” technology is or isn’t likely to be important.  One was at the Gilbane conference in early December.  The slides were based on my previously posted deck for a June talk I gave on a text analytics market overview. The actual Gilbane slides may be found here.

My opinions about the applicability of semantic technology include:

So what would your list be like?
