Ontologies

Analysis of ontologies, of the role they play in text analytics, and of technology and techniques to build and manage them.

June 15, 2008

How text search has evolved over the past 15 years

I just stumbled across a brilliant summary of the evolution of text search technology, written four years ago. It’s equally valid today (which in itself says something). I found it on the Prism Legal blog, but the actual author is Sharon Flank. My own comments are interspersed in bold. Read more

June 11, 2008

Expert System S.p.A. update

I chatted with Brooke Aker, the new CEO of Expert System’s US subsidiary, for quite a while last week. Unfortunately, we had some cell phone problems, and email followup hasn’t gone well, so I’m hazy on a few details. But here are some highlights, as best I understood them.

Read more

January 31, 2008

The biggest text analytics company you probably never heard of

I caught up with Expert System S.p.A. last week. They turn out to be doing $10 million in text technology annual revenue. That alone is surprising (sadly), but what’s really remarkable is that they did it almost entirely in the Italian market. As you might guess, that figure includes a little bit of everything, from search engines to Italian language filters for Microsoft Office to text mining. But only $3½ million of Expert System’s revenue is from the government (and I think that includes civilian agencies), and under 30% is professional services, so on the whole it seems like a pretty real accomplishment. Oh yes – Expert System says it’s entirely self-funded.

As of last year, Expert System also has English-language products, and a couple of minor OEM sales in the US (for mobile search and semantic web applications). German- and Arabic-language products are in beta test. The company says that its market focus going forward is national security – surely the reason for the Arabic – and competitive intelligence. It envisions selling through partners such as system integrators, although I think that makes more sense for the government market than it does vis-a-vis civilian companies. In February the company is introducing a market intelligence product focused on sentiment analysis.

Expert System is a bit of a throwback, in that it talks lovingly of the semantic network that informs its products.

Read more

October 6, 2007

The Clarabridge approach to text mining

And for my sixth text mining post this weekend, here are some highlights of the Clarabridge technology story. (Sorry if it sounds clipped, but I’m a bit burned out …)

Read more

April 30, 2007

Wise Crowds of Long-Tailed Ants, or something like that

Baynote sells a recommendation engine whose motto appears to be “popularity implies accuracy.” While that leads to some interesting technological ideas (below), Baynote carries that principle to an unfortunate extreme in its marketing, which is jam-packed with inaccurate buzzspeak. While most of that is focused on a few trendy meme-oriented books, the low point of my briefing today was probably the insistence, against pushback, that “95%” of Google’s results depend on “PageRank.” (I think what Baynote really meant is “all off-page factors combined,” but anyhow I sure didn’t get the sense that accuracy was an important metric for them in setting their briefing strategy. And by the way, one reason I repeat the company’s name rather than referring to Baynote by a pronoun is that on-page factors DO matter in search engine rankings.)

That said, here’s the essence of Baynote’s story, as best I could figure it out.

Read more

March 26, 2007

So THAT’S why Andrew Orlowski still has a job (Part 2)

Andrew Orlowski is an over-the-top jerk, and a pretty sloppy reporter and analyst to boot. But he occasionally makes a good point even so. In the most recent instance, he confronted Tim Berners-Lee. As the article makes clear, Berners-Lee reacted badly to Orlowski, reflecting an attitude that is probably shared by 99% of the people who encounter the guy, and in the future will probably be adopted by sentient computers as well. Even so, Orlowski’s underlying point is valid: If the Semantic Web is going to be any more spam-free than the current Web, nobody has adequately explained why.

February 15, 2007

InQuira’s and Mercado’s approaches to structured search

InQuira and Mercado both have broadened their marketing pitches beyond their traditional specialties of structured search for e-commerce. Even so, it’s well worth talking about those search technologies, which offer features and precision that you just don’t get from generic search engines. There’s a lot going on in these rather cool products.

In broad outline, Mercado and InQuira each combine three basic search approaches:

Of the two, InQuira seems to have the more sophisticated ontology. Indeed, the not-wholly-absurd claim is that InQuira does natural-language processing (NLP). Both vendors incorporate user information in deciding which search results to show, in ways that may be harbingers of what generic search engines like Google and Yahoo will do down the road.

Read more

February 7, 2007

Is DMOZ the cure to Wikipedia’s spam problem?

Joost de Valk makes an interesting suggestion, namely that Wikipedia should drop all external links other than to DMOZ, and rely on DMOZ as the outside link directory. As division of labor, it makes perfect sense. However, it’s a total non-starter until at least two problems are solved. Read more

November 11, 2006

Text mining and search, joined at the hip

Most people in the text analytics market realize that text mining and search are somewhat related. But I don’t think they often stop to contemplate just how close the relationship is, could be, or someday probably will become. Here’s part of what I mean:

  1. Text mining powers search. The biggest text mining outfits in the world, possibly excepting the US intelligence community, are surely Google, Yahoo, and perhaps Microsoft.
  2. Search powers text mining. Restricting the corpus of documents to mine, even via a keyword search, makes tons of sense. That’s one of the good ideas in Attensity 4.
  3. Text mining and search are powered by the same underlying technologies. For starters, there’s all the tokenization, extraction, etc. that vendors in both areas license from Inxight and its competitors. Beyond that, I think there’s a future play in integrated taxonomy management that will rearrange the text analytics market landscape.

Read more

August 3, 2006

Principles of enterprise text technology architecture

My August Computerworld column starts where July’s left off, and suggests principles for enterprise text technology architecture. This will not run Monday, August 7, as I was originally led to believe, but rather in my usual second-Monday slot, namely August 14. Thus, I finished it a week earlier than necessary, and I apologize to those of you I inconvenienced with the unnecessary rush to meet that deadline.

The principles I came up with are:

I’ll provide a link when the column is actually posted.
