February 28, 2007

SAP’s “search” strategy isn’t about search

I caught up with Dennis Moore today to talk about SAP’s search strategy. And the biggest thing I learned was – it’s not about the search. Rather, it’s about a general interface, of which search and natural language just happen to be major parts.

Dennis didn’t actually give me a lot of details, at least not ones he’s eager to see published at this time. That said, SAP has long had a bare-bones search engine, TREX. (TREX was also adapted to create the columnar relational data manager BI Accelerator.) But we didn’t talk about TREX enhancements at all, and I’m guessing there haven’t really been many. Rather, SAP’s focus seems to be on:

A. Finding business objects.

B. Helping users do things with them.

Categories: BI integration, Enterprise search, Language recognition, Natural language processing (NLP), SAP, Search engines

2 Comments

February 23, 2007

Has Google hit 10 petabytes yet?

I’ve been musing about how big Google’s core database might be. Figuring that out is not a trivial problem, unless they’ve published the answer somewhere that I’m not aware of. But here’s a big clue, from an announcement about their n-gram data:

We processed 1,024,908,267,229 words of running text

Read more

Categories: Google, Search engines

InQuira’s and Mercado’s approaches to structured search

InQuira and Mercado both have broadened their marketing pitches beyond their traditional specialties of structured search for e-commerce. Even so, it’s well worth talking about those search technologies, which offer features and precision that you just don’t get from generic search engines. There’s a lot going on in these rather cool products.

In broad outline, Mercado and InQuira each combine three basic search approaches:

Generic text indexing.
Augmentation via an ontology.
A rules engine that helps the site owner determine which results and responses are shown under various circumstances.
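
To make the three-layer idea concrete, here is a toy sketch in Python. It is not how either vendor actually implements anything; the catalog, the ontology entries, the rule format, and the 1.0/0.5 weights are all invented for illustration. It only shows how a plain text index, ontology-based query expansion, and site-owner rules might be stacked.

```python
from collections import defaultdict

# Hypothetical catalog; ids and descriptions are invented for illustration.
DOCS = {
    "p1": "red running shoes for men",
    "p2": "blue trail sneakers lightweight",
    "p3": "leather dress shoes black",
}

# Hypothetical ontology: each term maps to related terms.
ONTOLOGY = {"shoes": {"sneakers"}, "sneakers": {"shoes"}}

# Hypothetical site-owner rules: if the query contains the trigger
# term, the named document is pinned to the top of the results.
RULES = [{"trigger": "trail", "pin": "p2"}]


def build_index(docs):
    """Layer 1: generic text indexing (term -> set of document ids)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def expand(terms, ontology):
    """Layer 2: add ontology-related terms to the query terms."""
    expanded = set(terms)
    for term in terms:
        expanded |= ontology.get(term, set())
    return expanded


def search(query, index, ontology, rules):
    terms = query.lower().split()
    scores = defaultdict(float)
    for term in expand(terms, ontology):
        # Assumed weighting: literal matches count more than
        # matches that only arrive via the ontology.
        weight = 1.0 if term in terms else 0.5
        for doc_id in index.get(term, ()):
            scores[doc_id] += weight
    # Highest score first; ties broken by document id.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    # Layer 3: rules override the computed ranking.
    pinned = [r["pin"] for r in rules if r["trigger"] in terms]
    return pinned + [d for d in ranked if d not in pinned]


index = build_index(DOCS)
print(search("shoes", index, ONTOLOGY, RULES))
print(search("trail men", index, ONTOLOGY, RULES))
```

In this sketch, a query for "shoes" also pulls in the sneakers listing through the ontology, just ranked lower, while a query containing "trail" puts the pinned product first regardless of its score.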

Of the two, InQuira seems to have the more sophisticated ontology. Indeed, the not-wholly-absurd claim is that InQuira does natural-language processing (NLP). Both vendors incorporate user information in deciding which search results to show, in ways that may be harbingers of what generic search engines like Google and Yahoo will do down the road. Read more

Categories: InQuira, Mercado, Natural language processing (NLP), Ontologies, Search engines, Structured search

2 Comments

February 7, 2007

Is DMOZ the cure to Wikipedia’s spam problem?

Joost de Valk makes an interesting suggestion, namely that Wikipedia should drop all external links other than to DMOZ, and rely on DMOZ as the outside link directory. As division of labor, it makes perfect sense. However, it’s a total non-starter until at least two problems are solved. Read more

Categories: Categorization and filtering, Directories, ODP and DMOZ, Ontologies, Spam and antispam

5 Comments

February 7, 2007

Does anybody actually use Technorati?

I just did some Technorati searches, and my blog posts come up near the top of the search results for a bunch of small companies’ names and similar words — Attensity, ClearForest, Netezza, DATAllegro, Crossbeam, DMOZ, ODP, and surely many others.

But judging by my referrer logs, nobody cares. I get lots of visitors via classic search engines (largely Google, but also the others) but bupkis from Technorati.

Technorati Tags: Technorati

Categories: Search engines, Specialized search

4 Comments

February 6, 2007

Social networking architecture of the future continued

Responding to a question by Jon Udell a few hours ago, I argued that private social networking “walled gardens” aren’t needed. The whole thing can be done publicly as well, assuming there’s a central database to help with things like access control, as in the hypothetical service I named “Linkerati.”

Some other comments on his post raise issues like “Yes, but what if a walled garden is the easiest way to get people to post the needed information?” I have a quick reply: Just let all the needed information be entered in the central database, and you’re clearly better off than in a walled garden. Read more

Categories: Social software and online media

Fact and Fiction: DMOZ and the ODP

DMOZ is dead. Fiction!
New site submissions are being processed. Partial fact.
Pending site submissions were lost in the outage. Partial fact.
Other non-public ODP data was lost in the outage too. Partial fact.
New editor applications aren’t being processed yet. Fact.
ODP editors are corrupt. Fiction!
The ODP is secretive and deceptive. Largely fiction.
If a DMOZ category doesn’t have a listed editor, it’s unlikely to get much attention. Part fact, part fiction.
ODP editors hate search engine optimization. Partial fact.
ODP editors hate SEOs. Partial fact.

I shall explain. Read more

Categories: Categorization and filtering, Directories, ODP and DMOZ, Search engine optimization (SEO)

7 Comments

February 6, 2007

A hobbit writes from the ODP Entmoot

Before saying anything about the Open Directory Project or the DMOZ directory it produces, I should offer several disclaimers.

1. No editor speaks for the ODP, let alone for Time Warner/AOL/Netscape.
2. No single editor’s opinions or choices control any edits in DMOZ, even if s/he is the sole listed editor of a category. Any of us can be overruled on any editing decision at any time.
3. I’m effectively as new as they come, or at least was at the time DMOZ editing came back online (late December). There have been no new editors since the well-publicized outage, and I had next to no involvement with the project prior to the outage.
4. Notwithstanding point #2, I’m quite opinionated, which I’m sure surprises approximately nobody. And my opinions quite often are different from those of the ODP mainstream.

Categories: Categorization and filtering, ODP and DMOZ

1 Comment

February 6, 2007

What is LinkedIn needed for? Absolutely nothing. And the same goes for MySpace.

Jon Udell asks whether private social networks such as LinkedIn are needed, or whether they can be completely refactored across the public internet. I say the latter. In social networking as in almost everything else, there’s no long-term need for an internet walled garden.

Categories: Social software and online media

2 Comments

February 3, 2007

Can Hakia hack it?

Hakia purports to be a new search engine that indexes “semantically,” which I presume means on phrases or concepts or something. But I’ve run a few queries side by side on Hakia and Google, and Hakia isn’t doing well. I think it isn’t making sufficiently good use of page reputation. Try “web hosting forum” for an example of this, looking at the top two hits in both cases.

When I queried on “Viagra,” Hakia did — as it were — outperform Google. But that’s the only case I, uh, came up with. On less snigger-worthy searches, Google seemed to do as well as or better than Hakia.

Categories: Google, Search engines

Comments Off

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.

