March 4, 2008

Over 80 percent of blog posts are probably spam

Doug Caverly highlights a Matt Mullenweg quote indicating that about 1/4 of all the blogs ever on were spam (aka splogs). Now, that’s probably a higher fraction than for the blogoverse overall, because:

But there’s one more factor. Splogs post much more frequently than real blogs do. 10-20+ posts per day is not uncommon, and 50-100+ is not unheard of. That’s 5-10X the post frequency of even the more active human-written blogs. So let’s assume:

In that case, over 80% (and indeed probably over 90%) of all blog posts are made by machines rather than by human beings.
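
A minimal sketch of that kind of arithmetic, in Python. The function names are hypothetical, and the inputs (a 1/4 splog fraction and a range of posting-frequency ratios) are illustrative assumptions rather than measured figures. The ratio here is posts per splog divided by posts per average human-written blog, which may well be higher than the 5-10X figure quoted for the more active human bloggers.

```python
def spam_post_share(splog_fraction, posting_ratio):
    """Fraction of all blog posts that come from splogs.

    splog_fraction: share of blogs that are splogs (between 0 and 1).
    posting_ratio: posts per splog divided by posts per human-written blog
                   (an assumed average, not a measured value).
    """
    spam_posts = splog_fraction * posting_ratio
    human_posts = 1.0 - splog_fraction  # human blogs post at rate 1 by definition
    return spam_posts / (spam_posts + human_posts)


def ratio_needed(splog_fraction, target_share):
    """Posting ratio at which splogs account for target_share of all posts."""
    return target_share * (1.0 - splog_fraction) / (
        (1.0 - target_share) * splog_fraction
    )


if __name__ == "__main__":
    # Assumed: 1/4 of blogs are splogs; try several posting ratios.
    for ratio in (5, 10, 20, 50):
        share = spam_post_share(0.25, ratio)
        print(f"ratio {ratio:>2}x -> {share:.1%} of posts are spam")
```

Under these assumptions, with 1/4 of blogs being splogs, a posting ratio of 12 puts spam at 80% of all posts, and a ratio of 27 puts it at 90%.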

January 26, 2008

Anatomy of spam blogs

A post that gives you a clear sense of how gobbledygook is automatically generated (from another knowledgeable black-hat SEO who can’t be bothered to make his permalink structure sensible ;) )

January 16, 2008

Automation secrets of black hat SEO

XMCP writes one of the better black hat SEO blogs. In a post last November, he laid out a ton of advice about automating black hat SEO. Personally, I don’t approve of doing black hat SEO. Still, it’s an intellectually interesting subject. What’s more, black hat SEOs create a large fraction of all websites, and certainly of all blog comments, links, and so on. So it’s interesting to track them.

Most interesting to me and probably to most readers here is the part that shows where black hat SEOs get their content: Read more

January 14, 2008

An interesting Matt Cutts interview from December

Stephen Spencer has a great interview with Matt Cutts of Google, from last month’s Pubcon. Almost all of it is SEO-related. But it also contains a few tidbits that may be interesting even if one doesn’t care about SEO, such as:

SEO highlights included: Read more

January 8, 2008

A very fast splogger

The first post ever on Strategic Messaging went up at 2:49 am. Within four hours, I had my first splog trackbacks, all from the same site. The domain itself had just repropagated through DNS hours earlier, and had no incoming links other than Whois and the like.

Pretty impressive spamming. Not that it did him any good, of course, except insofar as he was stealing a bit of my content …

December 31, 2007

I’m getting mailbombed again

Shortly after my first reference to Shoemoney’s DMOZ issues — who did you think I meant with “shoe in his mouth”? — I got mailbombed big time. Things calmed down after a month or so, although I did change web hosting companies in the fallout.

Starting Christmas Eve — which coincidentally was shortly after a forum mention of various Shoemoney flaps, and of the first attack — I got hit again. And there was another wave right after Christmas. A fair amount of email was lost forever, possibly both professional and personal. My blogs also were down for a while, as were other sites on the same server. (And if you sent me any email over that time period, please resend it.)

It seems that I should move my email/MX record to a different service than the one that hosts my websites, perhaps one that has invested in technology to efficiently deflect DDoS attacks. (Or perhaps I should move one domain with it, if a traditional hosting deal seems best.) Does anybody have any recommendations of such services? Read more

December 8, 2007

Windows Live search is rather different from MSN

Until the middle of this year, I got negligible search engine traffic from either MSN or Yahoo, or indeed any other search engine except Google. We’re literally talking a 90-95% share for Google, on each of my three main blogs, most months.

But in November, the Windows Live share was 19% on DBMS2, 29% on Text Technologies, and 41% on the Monash Report. And those aren’t blips; in each case there was steady August-November monthly growth. But on the other hand, early December month-to-date figures are all back down. Weird. Read more

December 2, 2007

Danny Sullivan thinks blended vertical search is a game-changer

Danny Sullivan thinks blended vertical search — which he’s calling Search 3.0 — is a game changer. (In this context, “vertical” search denotes alternate result types such as video, image, map coordinates, or product listings.) In saying that, he’s focused on search marketers, who now have a lot more ways to try to get their messages onto Google searchers’ top result pages. But I presume what he’s really saying is that there will be a feedback effect — if Google tells all web searchers about videos and product listings, then internet marketers will be more motivated to post videos and product listings, and hence there will be more interesting choices of videos and product listings — which Google will naturally wind up featuring more prominently in its search results. And so on.

Given the YouTube explosion, I find it hard to argue with his claim.

November 29, 2007

An Occurrence at Owl Creek Bridge and other SEO spam explained

I average upwards of 100 spam comments per day per blog, very little of which actually gets through (although that very little is obviously enough to be quite annoying!). Recent research from Sunbelt explains part of what’s going on. (More here in Computerworld.) What’s going on is this:

1. Aggressive black-hat SEO is being done for all kinds of long-tail terms and phrases, by posting comment spam filled with little except links on those phrases. For example, one of the first spams I checked for this post consists simply of 10 links to the same .cn domain, with anchor text and subdomain name being the same keyphrase. Keyphrases included “an occurrence at owl creek bridge”, “allegheny assessment county tax”, and “am been hate i ive who who.” As this kind of spam came by, I’d been wondering why people bothered, since it didn’t seem terribly easy to monetize. Read more

September 30, 2007

A tip for submitting to DMOZ — make your site description clear

I just picked out a few of the many unreviewed sites in my DMOZ categories to evaluate, and listed most of those I reviewed.

How did I choose which ones to screen? Mainly, I picked out ones with focused descriptions, titles, and so on, that just seemed likely to be listable based on that info (which is the essence of what I see on the page where all the various submitted sites are linked). I correctly guessed that I’d be able to quickly understand what I was seeing and judge whether to list the site or not, quickly write the official site description, and so on. Read more

