September 20, 2009

Data marts in the world of text

CMS (Content Management System) and search expert Alan Pelz-Sharpe recently decried “Shadow IT”, by which he seems to mean the departmental proliferation of data stores outside the control of the IT department. In other words, he’s talking about data marts, only for documents rather than tabular data.

Notwithstanding the manifest virtues of centralization, there are numerous reasons you might want data marts, in the tabular and document worlds alike. For example:

Price/performance. Your main/central data manager might be too expensive to support additional large specialized databases. Or different databases and applications might have profiles different enough that each gets its best price/performance from a different kind of data manager. This is particularly prevalent in the relational world, where column stores, sequentially-oriented row stores, and random I/O-oriented row stores each have compelling use cases.
Different SLAs (Service-Level Agreements). Similarly, different applications may have very different requirements for uptime, response time, and the like. (In the relational world, think of operational data stores.)
Different security requirements. Different subsets of the data may need different levels of security. This is particularly prevalent in the document world, where security problems are not as well-solved as in the tabular arena, and where it’s common for a search engine to index across different corpuses with radically different levels of sensitivity.
Integrated application and user interfaces. In the relational world, there’s a pretty clean separation between data management and interface logic; most serious business intelligence tools can talk to most DBMS. The document world is quite different. Some search engines bundle, for example, various kinds of faceted or parameterized search interfaces. What’s more, in public-facing search, a major differentiator is the facilities that the product offers for skewing search results.
Different vocabularies. Different text applications require different thesauruses or taxonomy management systems. Ideally, those should all be integrated, but the requisite technology still doesn’t exist.
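
To make the security point above a bit more concrete, here is a minimal Python sketch of what it means for one search index to span corpuses of different sensitivity. Everything in it is an assumption for illustration: the `TinyIndex` class, the three sensitivity levels, and the sample documents are hypothetical, and no real search engine's API is being shown. The idea is simply that each document carries a sensitivity level and every query is filtered by the searcher's clearance.

```python
from collections import defaultdict

# Hypothetical sensitivity levels; a higher number means more restricted.
PUBLIC, INTERNAL, CONFIDENTIAL = 0, 1, 2


class TinyIndex:
    """Toy inverted index that records a sensitivity level per document."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.sensitivity = {}             # document id -> sensitivity level

    def add(self, doc_id, text, level):
        """Index a document's terms and remember how sensitive it is."""
        self.sensitivity[doc_id] = level
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term, clearance):
        """Return sorted ids of matching documents the user may see."""
        hits = self.postings.get(term.lower(), set())
        return sorted(d for d in hits if self.sensitivity[d] <= clearance)


index = TinyIndex()
index.add("press-release", "merger announced today", PUBLIC)
index.add("hr-memo", "merger layoffs planned", CONFIDENTIAL)

print(index.search("merger", PUBLIC))        # ['press-release']
print(index.search("merger", CONFIDENTIAL))  # ['hr-memo', 'press-release']
```

In a shared index like this, a single mistake in the clearance check exposes the sensitive corpus to everyone. Keeping the sensitive documents in their own text data mart is one way to avoid depending on that filter being right everywhere.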

Bottom line: Text data marts, much like relational data marts, are almost surely here to stay.

Related link

The future of data marts

Categories: Enterprise search, Ontologies, Search engines, Specialized search, Structured search


Comments

2 Responses to “Data marts in the world of text”

Nat Benson on March 10th, 2010 1:00 pm

If you think of the smallest form of a datamart, it is oftentimes the survey response data. But that in many ways is the atomic form of disconnected data in the enterprise, and it is left unprocessed and in its own silo for days.

We ran into this tool (http://insight-magnet.com) when we wanted to be able to analyze our survey datamart quickly and economically. You load the file and the tool lets you dice and slice the data any which way you want. The best part is that it actually reads through your open-ended responses and tells you the categories of feedback you received.

They have a short tour on their website as well – http://insight-magnet.com/tour
Curt Monash on March 12th, 2010 7:12 am

@Nat,

Who is the “we” in your comment? I see that your URL points at the site you’re recommending.
