I talked again with Mark Logic, makers of MarkLogic Server, and they continue to have an interesting story. Basically, their technology is better search/retrieval through XML. The retrieval part is where their major differentiation lies. Accordingly, their initial market focus (they’re up to 46 customers now, including lots of big names) is on custom publishing. And by the way, they’re a good partner for fact-extraction companies, at least in the case of ClearForest.
Here, as best I understand, is the story of the custom publishing business. Its core market is publishers of high-cost material sold to people with high-priced time – i.e., scientific/engineering/medical/legal/business/etc. Other markets are general publishing, internal document preparation (e.g., intelligence community), and of course maintenance manuals (maintenance/repair has been a flagship market for just about everything, from expert systems to generic text search to, of course, text mining now as well).
The phrase “custom publishing,” however, obscures the distinction between two different paradigms. One of these is what we might call true custom publishing – assembling paragraphs, articles, chapters, or whatever from various sources into a document customized for specific reader needs, roles, or preferences. On the revenue side, that’s a fascinating subject. But technically, I’m more interested in the other view: search results plus.
We all know lots of problems with search engines. One of the many is this: Except on rare occasions, getting the benefit of a successful search involves a whole lot of link-clicking and scrolling. But what if the relevant passages were all assembled together for you? Link-clicking would be eliminated, and scrolling might be minimized as well. The potential is huge. But I don’t know what level of precision is needed before the theoretical benefits become real.
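To make the "search results plus" idea concrete, here is a toy sketch in Python. It is purely illustrative and is not how MarkLogic Server works: the function name assemble_passages, the term-overlap scoring, and the sample manuals are all my own assumptions. It just shows the shape of the idea, which is to pull the matching paragraphs out of several documents and stitch them into one text instead of returning links.

```python
import re


def assemble_passages(documents, query, max_passages=5):
    """Collect the paragraphs that best match the query into one text.

    `documents` maps a title to its full text, with paragraphs separated
    by blank lines. Scoring is a naive count of shared terms (an
    assumption made for illustration, not a real relevance model).
    """
    terms = set(re.findall(r"\w+", query.lower()))
    scored = []
    for title, text in documents.items():
        for paragraph in text.split("\n\n"):
            words = set(re.findall(r"\w+", paragraph.lower()))
            score = len(terms & words)
            if score > 0:
                scored.append((score, title, paragraph.strip()))
    # Highest-scoring passages first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return "\n\n".join(
        f"[{title}] {paragraph}" for _, title, paragraph in scored[:max_passages]
    )


# Hypothetical sample data: two maintenance manuals.
manuals = {
    "Manual A": "Replace the fuel pump every 500 hours.\n\nCheck tire pressure daily.",
    "Manual B": "The fuel pump filter must be cleaned monthly.",
}
print(assemble_passages(manuals, "fuel pump"))
```

Even in this toy version the precision question shows up right away: the max_passages cutoff and the crude scoring decide whether the assembled document is a time-saver or just a longer scroll.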
The two paradigms can be blended, of course. A publishing product or dashboard or personal web page with a topic filter might get the results in custom-document rather than list-of-links form. Once again, the problem is conciseness. The more concise the “complete” results can be, the more useful this kind of technology will ultimately prove.