May 8th, 2008 Curt Monash
As previously noted, we were de-indexed by Google, due to the injection of a whole lot of spammy hidden links. We’re back now, after about two weeks, even on the blog (this one) where there was no official de-indexing notice and hence no way to apply for re-consideration. And thus we once again have high rankings for search terms such as Netezza, DATAllegro, Clarabridge, and Attivio.
We’re designing a new blog theme — the current one is just an emergency stopgap — that will (among myriad more important virtues) be more SEO-friendly. I’ll be curious to see whether that makes much actual difference from a search ranking standpoint.
Posted in Google, Search engine optimization (SEO), Spam and antispam | 1 Comment »
April 25th, 2008 Curt Monash
As previously noted, we got hit with some hidden text, probably by SQL injection, and that led to a Google de-listing. Of the three blogs affected by the attack, I got a de-indexing notice for one (DBMS2); another was de-listed without a notice (Text Technologies); and a third seems to have waltzed through still indexed (Software Memories). I also received a de-indexing notice for another site I have nothing to do with and indeed had never heard of before. Go figure …
We’ve now upgraded to WordPress 2.5, which should close the vulnerability. (Thank you Melissa Bradshaw!) Fearing our old, buggy theme would degrade further, we upgraded to a new one, Biru, designed by Bob. There are some teething-pain stability issues, but if they don’t cause a reversion in the next day, I’ll apply to Google for re-inclusion. (Uh, does anybody have a sense of how long that’s likely to take?)
All these hours of aggravation because some criminal wanted a bit of SEO advantage …
Posted in Google, Search engine optimization (SEO), Spam and antispam | 1 Comment »
March 5th, 2008 Curt Monash
Google has begun to introduce a feature whereby, if your search obviously leads you to a single site (e.g., you searched on a company name), you get a second search box to search only within that site. More details at Google and Search Engine Land. Basically, this is Google Site Search made a lot easier to use.
I think this could be a really big deal. Read the rest of this entry »
Posted in Enterprise search, Google, Search and text storage, Specialized search engines | 4 Comments »
February 3rd, 2008 Curt Monash
Many, perhaps most, commentators on Microsoft’s bid for Yahoo are thoroughly missing the point. The most interesting part of Microsoft’s bid for Yahoo isn’t the horse-race retrospective “How did they screw up so much as to need each other?” It’s not the incipient bidding war for Yahoo. And it’s certainly not the antitrust implications.
The Microsoft/Yahoo combination could revolutionize the Internet. I’m serious. The opportunities for huge synergies might just be enough to blast the merged companies out of their current uncreative, Innovator’s Dilemma funks. Search is open for radical transformation in user interface, universal search relevancy, Web/enterprise integration, and just about everything to do with advertising and monetization. Email stands to be utterly reinvented. Portals and business intelligence have only scratched the surface of their potential. And social networking is of course in its infancy.
Here’s an overview of where some synergies and opportunities for a combined Microsoft/Yahoo lie.
Read the rest of this entry »
Posted in Enterprise search, Google, Microsoft and Windows Live Search, Search and text storage, Social networking, Social software and media, Spam and antispam, Web site filtering, Yahoo | 14 Comments »
January 18th, 2008 Curt Monash
I don’t know how pronounced this trend is, but Google web search seems to be putting more emphasis on phrases than it used to.
For starters, Google doesn’t always ignore stopwords. The Fly and Fly produce different search results. Beyond that, “or” is sometimes assumed to be a word you’re searching on, not an operator; for example, try live free or die and see the line of text that comes back under the search box. (I’m not sure whether this ever works for “and” as well; even Sanford and Son returns the usual harangue that “the AND operator is unnecessary”.) This is all a pretty clear indicator that Google is looking at phrases. Bill Slawski’s patent-analysis-heavy SEO blog has a lot more to say on that subject, specifically on an indexing scheme that addresses the problems that indexing stopwords might otherwise cause.
Also, there’s a direct series of patents on “Phrase-Based Indexing.”
Finally, although I don’t recall a link, there seems to be a belief that:
- Google is using or moving to Latent Semantic Indexing (LSI)
- Word-based LSI is patented by somebody else.
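To make the stopword point concrete, here is a minimal sketch of a positional index built with and without stopwords. It is purely illustrative and is not a description of how Google indexes anything; the stopword list, the sample documents, and the function names are all made up for the example.

```python
# Hypothetical stopword list and helper names, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "or", "and"}

def tokenize(text, keep_stopwords):
    """Lowercase and split on whitespace, optionally dropping stopwords."""
    words = text.lower().split()
    if keep_stopwords:
        return words
    return [w for w in words if w not in STOPWORDS]

def build_index(docs, keep_stopwords):
    """Map each term to a list of (doc_id, position) pairs."""
    index = {}
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text, keep_stopwords)):
            index.setdefault(word, []).append((doc_id, pos))
    return index

def phrase_search(index, query, keep_stopwords):
    """Return sorted ids of docs where the query terms appear consecutively."""
    terms = tokenize(query, keep_stopwords)
    if not terms:
        return []
    hits = set()
    for doc_id, pos in index.get(terms[0], []):
        if all((doc_id, pos + i) in index.get(t, []) for i, t in enumerate(terms)):
            hits.add(doc_id)
    return sorted(hits)

# Made-up sample documents.
docs = {
    "movie": "the fly is a 1986 film",
    "insects": "a fly landed on the table",
    "motto": "live free or die",
}

with_stops = build_index(docs, keep_stopwords=True)
without_stops = build_index(docs, keep_stopwords=False)

# Keeping stopwords lets "the fly" and "fly" return different results.
print(phrase_search(with_stops, "the fly", True))
print(phrase_search(with_stops, "fly", True))
# Dropping stopwords collapses both queries into the same one.
print(phrase_search(without_stops, "the fly", False))
```

With stopwords kept, "the fly" only matches the document where those two words are adjacent, and "or" in "live free or die" is treated as an ordinary term. The cost is a much larger index, which is presumably the kind of problem the indexing schemes mentioned above try to address.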
Posted in Google, Search and text storage | 3 Comments »
January 14th, 2008 Curt Monash
Stephen Spencer has a great interview with Matt Cutts of Google, from last month’s Pubcon. Almost all of it is SEO-related. But it also contains a few tidbits that may be interesting even if one doesn’t care about SEO, such as:
- Google now indexes up to 1/2 a megabyte per page, up from the old 101K limit.
- Google needs to do a fair amount of image recognition, but they’re going fairly plain-vanilla. For Flash they use an Adobe-supplied SDK. For detecting hidden text (e.g., white-on-white) they use what Matt characterizes as pretty simple heuristics.
- As I noted recently, Google seems to have a lot of heuristics for identifying particular types of pages. In this interview, the example was that a page that would otherwise seem spammy because it consisted only of links would be fine if it were serving as a true site map or archive.
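The interview doesn't say what the "pretty simple heuristics" for hidden text actually are, so the following is only a guess at the flavor of such a check, not Google's method. It is a small sketch using Python's standard `html.parser` that flags links whose inline style (or an ancestor's) hides them, or whose text color equals their background color. The class and function names and the sample markup are invented for illustration, and real pages would also need stylesheet and script handling that this ignores.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked as open.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

def parse_style(style):
    """Turn an inline style string into a dict of lowercase property -> value."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props

def looks_hidden(props):
    """Crude heuristic: explicitly hidden, or text color equal to background."""
    if props.get("display") == "none" or props.get("visibility") == "hidden":
        return True
    color = props.get("color")
    return color is not None and color == props.get("background-color")

class HiddenLinkFinder(HTMLParser):
    """Collects hrefs of links that are hidden by inline styles (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.open_tags = []      # (tag, hidden) for each currently open element
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        parent_hidden = bool(self.open_tags) and self.open_tags[-1][1]
        hidden = parent_hidden or looks_hidden(parse_style(attrs.get("style") or ""))
        if tag == "a" and hidden and attrs.get("href"):
            self.hidden_links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i:]
                break

def find_hidden_links(html):
    finder = HiddenLinkFinder()
    finder.feed(html)
    return finder.hidden_links

# Made-up sample markup: one normal link and two hidden ones.
sample = (
    '<p>Real post text <a href="http://example.com/ok">ok</a></p>'
    '<div style="display:none"><a href="http://spam.example/1">x</a></div>'
    '<a style="color:#fff; background-color:#fff" href="http://spam.example/2">y</a>'
)
print(find_hidden_links(sample))
```

A site owner could run something like this over their own rendered pages to look for injected links of the kind described in the de-indexing posts above.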
SEO highlights included: Read the rest of this entry »
Posted in Google, Search and text storage, Search engine optimization (SEO) | No Comments »
January 14th, 2008 Curt Monash
Eric Lai wrote in this week’s Computerworld about “Why is enterprise search harder than Google Web search?” Highlights included:
- He described enterprise search as consisting mainly of a search box plus faceted searching, with maybe some automated tagging as well.
- He observed that off-page factors such as PageRank don’t work nearly as well in an enterprise as they do on the Web, and that manual tagging by enterprise users falls far short of closing the gap.
- He stumbled a bit when comparing and contrasting search engines and “structured” DBMS.
- He basically endorsed the worldview of Ali Riaz, late of FAST, now of Attivio.
On the whole, that’s not bad. If this were an easy subject to write about, I’d have explained it a lot more clearly in the past myself. OK. Let me get off my duff and give it a whirl now. Read the rest of this entry »
Posted in Attivio, Enterprise search, FAST, Google, Search and text storage | 12 Comments »
January 12th, 2008 Curt Monash
In a blog post focusing on SEOing for local search, some interesting claims are argued, including:
- Google knows what a review is. (This seems to be “everybody knows it” conventional wisdom.)
- Google knows how many stars a review got. (Ditto.)
- Google tracks who the reviewer is and how many other reviews s/he wrote (that’s the big insight of the post and related conversation).
Pretty interesting. Text mining companies are paying a lot of attention to Voice-of-the-Market these days; even so, I question whether they can do the same things out of the box.
Posted in Google, Search and text storage, Voice of the Market/competitive intelligence | 1 Comment »
December 2nd, 2007 Curt Monash
Danny Sullivan thinks blended vertical search — which he’s calling Search 3.0 — is a game changer. (In this context, “vertical” search denotes alternate result types such as video, image, map coordinates, or product listings.) In saying that, he’s focused on search marketers, who now have a lot more ways to try to get their messages onto Google searchers’ top result pages. But I presume what he’s really saying is that there will be a feedback effect — if Google tells all web searchers about videos and product listings, then internet marketers will be more motivated to post videos and product listings, and hence there will be more interesting choices of videos and product listings — which Google will naturally wind up featuring more prominently in its search results. And so on.
Given the YouTube explosion, I find it hard to argue with his claim.
Posted in Google, Search and text storage, Search engine optimization (SEO), Specialized search engines, Structured search | No Comments »