December 11, 2005

The text technologies market 4: Requirements for an industry-altering ontology management system

In previous posts I argued that what’s holding the text technology industry back is the lack of a viable ontology management system. The obvious objection to such a suggestion is: Who would use it? There is no business process for ontology management, even less than there is for “knowledge management,” and for that matter less than there was for “knowledge engineering” during the expert systems bubble of the 1980s. Enterprises do not have anything like a “chief ontologist.” Indeed, that job title sounds like a joke — a touchy-feely liberal-artsy nonstarter.

The only way a successful product category of ontology management systems can emerge is if the products are usable by ordinary IT personnel. Vendor-supplied product training can be required, of course. Some day there can be certifications, and maybe a single class in a computer science curriculum. But almost nobody is going to buy a product whose use requires a master’s degree in library science or “ontology management.”

So here are some very high-level requirements I think an ontology management system needs to meet.

1. Basic knowledge representation has to be flexible. It has to accommodate semantic net kinds of relationships (is_an_instance_of, is_a_subcategory_of). It also has to accommodate machine learning/statistical kinds of evidence (both positive and negative evidence).

2. There has to be strong layering/versioning. Pieces of the ontology will come from the vendor. Pieces will come from frequently updated machine-learning exercises against an enterprise’s own corpus or corpora. Pieces will be added by hand, through a collaboration between IT and (at first) power users. It will have to be possible to reverse any of those pieces out, to apply different pieces for different specific applications, and so on.

3. There need to be standard, open ways for different kinds of applications to use the ontologies. UIMA could be a starting point.

4. The product needs to be industrial-strength – reliable, scalable, secure, sufficiently easy to administer, available on a sufficient range of platforms, and compliant with general standards (not just the text-specific ones).
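
To make requirements 1 and 2 a little more concrete, here is a minimal Python sketch of one way the data could be organized. Everything in it is hypothetical and for illustration only: the names Relation, Evidence, Layer and Ontology, the layer names, and the example terms and weights are my assumptions, not a description of any existing product. It shows semantic-net relationships and signed statistical evidence living side by side, grouped into layers that can be reversed out or applied selectively.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A semantic-net style link, e.g. is_a_subcategory_of."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Evidence:
    """Statistical evidence that a term signals a concept.

    A positive weight supports the association; a negative weight
    counts against it.
    """
    term: str
    concept: str
    weight: float


@dataclass
class Layer:
    """One separately versioned piece of the ontology (hypothetical)."""
    name: str
    version: int
    relations: set = field(default_factory=set)
    evidence: list = field(default_factory=list)


class Ontology:
    def __init__(self):
        self.layers = []

    def add_layer(self, layer):
        self.layers.append(layer)

    def remove_layer(self, name):
        # "Reverse out" a piece without touching the others.
        self.layers = [l for l in self.layers if l.name != name]

    def view(self, names=None):
        """Combine the chosen layers (all of them if names is None).

        Returns the union of relations and the net evidence weight
        for each (term, concept) pair.
        """
        chosen = [l for l in self.layers if names is None or l.name in names]
        relations = set().union(*(l.relations for l in chosen))
        scores = {}
        for layer in chosen:
            for e in layer.evidence:
                key = (e.term, e.concept)
                scores[key] = scores.get(key, 0.0) + e.weight
        return relations, scores


# Illustrative data only: three layers from three different sources.
onto = Ontology()
onto.add_layer(Layer("vendor", 1, relations={
    Relation("sedan", "is_a_subcategory_of", "car"),
}))
onto.add_layer(Layer("machine_learned", 7, evidence=[
    Evidence("saloon", "sedan", 0.75),   # learned from the corpus
]))
onto.add_layer(Layer("hand_edited", 2, evidence=[
    Evidence("saloon", "sedan", -0.25),  # power user: "saloon" is often a bar
]))
```

An application that only trusts the vendor's taxonomy could call onto.view(names={"vendor"}), while one that wants everything calls onto.view(). Calling onto.remove_layer("hand_edited") backs out the manual edits and leaves the other two layers alone. A real system would also need persistence, access control, and proper version history, which this sketch leaves out.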

Obviously, these requirements are nontrivial to achieve. But if some vendor does do a good job on them, the payoff could be huge. Dominance of the enterprise text technologies market – which would be a greatly expanded market – is at stake.

I think it will happen.

Comments

4 Responses to “The text technologies market 4: Requirements for an industry-altering ontology management system”

  1. Text Technologies»Blog Archive » Notes from the Second Annual Text Analytics (formerly Text Mining) Summit on June 23rd, 2006 4:59 am

    […] Nobody is doing anything about the platform advances I think are necessary. However, when prodded, they admit that something like that is needed, and the technology really isn’t finished or a commodity after all. But some other company should do it, because they aren’t going to. Arggh. […]

  2. Text Technologies»Blog Archive » Procter & Gamble on text mining projects on June 24th, 2006 8:52 pm

    […] Terry McFadden of Procter & Gamble made a number of interesting points in his Text Analytics Summit talk, in the area of how to build and “amass” (his word) lexicons. Above all, I’m thrilled that he recognized the necessity of amassing lexicography that can be reused from one app to the next. Beyond that, specific comments and tips included: […]

  3. Text Technologies»Blog Archive » Should ontology management be open sourced? on July 19th, 2006 11:51 am

    […] I’ve argued previously that enterprises need serious ontologies, and that this lack is holding back growth in multiple areas of text technology – search, text mining and knowledge extraction, various forms of speech recognition, and so on. The core point was: The ideal ontology would consist mainly of four aspects: […]

  4. DBMS2 — DataBase Management System Services»Blog Archive » Informatica’s general story on July 26th, 2006 3:14 am

    […] Data cleaning/quality versatility. Informatica acquired the Similarity product some months ago, which they assert is more modern than some competitors, and hence better suited to handle data beyond names/addresses. A key example would be product hierarchies/ taxonomies. I suggested they explore whether this could be leveraged for enterprises’ text technology architectures, specifically in the area of ontology management. […]
