November 30, 2006

Does web text mining need to be cloaked?

One semi-flagship use for text mining is to track sentiment across news articles, websites, etc. Should this be done openly, or is there a danger of being spoofed by sites that know they are being mined? (I doubt the danger is large; probably no more than a few of the sites would ever be motivated to try.) But what if you’re making many hits against the same site, to the point that your traffic is unwelcome? Or maybe the site is a direct competitor. In such cases, hiding your tracks may be more relevant.

If any of this is an issue for you, you should take a look at Anonymizer’s growing enterprise offering. Apparently, there are commercial enterprises each using thousands of seats of Anonymizer’s cloaking service.

Categories: Text mining

Site and feed changes coming soon

We’re going to upgrade access to our research in various cool ways in the near future.

Right now, please bear with me in what is essentially a test post. ~~In theory, I’ve switched the feeds here over to Feedburner. Now I’m going to test if that really has happened.~~

EDIT: That didn’t work. I’m going to put things back the way they were.

Categories: About this blog

November 11, 2006

Text mining and search, joined at the hip

Most people in the text analytics market realize that text mining and search are somewhat related. But I don’t think they often stop to contemplate just how close the relationship is, could be, or someday probably will become. Here’s part of what I mean:

- Text mining powers search. The biggest text mining outfits in the world, possibly excepting the US intelligence community, are surely Google, Yahoo, and perhaps Microsoft.
- Search powers text mining. Restricting the corpus of documents to mine, even via a keyword search, makes tons of sense. That’s one of the good ideas in Attensity 4.
- Text mining and search are powered by the same underlying technologies. For starters, there’s all the tokenization, extraction, etc. that vendors in both areas license from Inxight and its competitors. Beyond that, I think there’s a future play in integrated taxonomy management that will rearrange the text analytics market landscape.
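
To make the "search powers text mining" point concrete, here is a minimal sketch in Python. It is not how Attensity 4 or any other product works; the function names (`tokenize`, `keyword_search`, `term_counts`), the toy corpus, and the stopword list are all made up for illustration. The idea is simply that a cheap keyword search narrows the corpus first, and the (here, trivially simple) mining step then runs only on the matching documents. Note that both steps share the same tokenizer, which is the third point in miniature.

```python
import re
from collections import Counter

# Hypothetical toy corpus; real input would be crawled or indexed documents.
CORPUS = [
    "The new phone has a great battery and a great screen.",
    "Quarterly earnings beat expectations.",
    "Battery life on this phone is terrible.",
]

# Illustrative stopword list, not a standard one.
STOPWORDS = {"the", "a", "and", "on", "this", "is", "has", "new"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_search(corpus, keyword):
    """Search step: keep only documents that contain the keyword as a token."""
    kw = keyword.lower()
    return [doc for doc in corpus if kw in tokenize(doc)]


def term_counts(docs, stopwords=STOPWORDS):
    """Mining step (a stand-in): count non-stopword terms across the documents."""
    return Counter(
        tok for doc in docs for tok in tokenize(doc) if tok not in stopwords
    )


# Restrict the corpus with a keyword search, then mine only the subset.
subset = keyword_search(CORPUS, "battery")
counts = term_counts(subset)
print(len(subset), "documents matched;", counts["battery"], "mentions of 'battery'")
```

A real system would swap the term counter for entity or sentiment extraction, but the shape is the same: the search narrows what the expensive extraction has to read.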

Categories: Attensity, Business Objects and Inxight, Enterprise search, FAST, Google, IBM and UIMA, Ontologies, Open source text analytics, Search engines, Text mining

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.
