November 25, 2012

The future of search

I believe there are two ways search will improve significantly in the future. First, since talking is easier than typing, speech recognition will allow longer and more accurate input strings. Second, search will be informed by much more persistent user information, with search companies having very detailed understanding of searchers. Based on that, I expect:

My reasoning starts from several observations:

In principle, there are two main ways to make search better:

The latter, I think, is where significant future improvement will be found.

Read more

January 18, 2012

SOPA’s potentially chilling effect on public debate

SOPA (Stop Online Piracy Act) is getting blasted all over the Internet. Even so, one of its major dangers has not yet been widely discussed. People seem to realize that SOPA can create censorship by governments, or businesses, or as collateral damage when governments and businesses pursue other interests. But they may not yet grasp that SOPA can allow individuals to stifle free speech as well.

To quote the owner of a popular sports fan discussion forum (emphasis mine):

The problem is several of the provisions in SOPA will force ISPs hosting websites (ie: the company that hosts our servers) to potentially disconnect us from the Internet if there’s a claim – unsubstantiated or not – that we’re infringing against copyright, regardless of if it has not been fully proved in court. The argument is that this would make it easy for someone to make false or weak claims against the site to take us offline until we went to court.

That’s a headache I’m not prepared to deal with. The number of threats I get each year via e-mail from angry members from other teams we remove are pretty unreal and obviously you guys don’t see them, so giving any additional ammunition backed up by a law like this would be a potentially huge issue. I’ve been talking with other sites and it’s a very real concern that we’re all potentially going to be faced with if this goes through, unless it’s rewritten to better target the sites that are really the ones they’re looking to address.

And that’s just from the passions of sports fandom. The passions of politics, or the commercial interests of those being criticized, are of even greater concern.

Indeed, SOPA-like legislation creates an easy way to take down any forum, blog, or other site that allows user-generated content: flood it with copyrighted content, then run to the regulators. We must never, ever, ever accept a legal regime in which publishers may be censored before they are PROVED to be guilty of wrongdoing.

January 17, 2012

Freemium journalism business models, or the Launch of the Spawn of TechCrunch

In case you missed it, Sarah Lacy has launched Pando Daily, aka “Spawn of TechCrunch”. It has a clear mission statement, which she phrased as

the site-of-record for that startup root-system and everything that springs up from it, cycle-after-cycle

and mentor/investor/board member Mike Arrington simply called

to be the paper of record for Silicon Valley

That, I believe, is the form a journalistic mission statement should take:

But there’s a problem with that template. One would ideally wish a mission statement of the form “We do the best A” to be followed up by “and, obviously, people will pay lots of money for A”. Journalistic mission statements don’t have that nice property.

Fortunately, at least in the case of tech blogging, they do tend to have a nice substitute. Let me explain.

Read more

September 14, 2011

Social technology in the enterprise

The recent Dreamforce conference (i.e., salesforce.com’s extravaganza) focused attention on “the social enterprise” or, more generally, enterprises’ uses of social technology. salesforce is evidently serious about this push, with development/acquisition investment (e.g. Chatter, Radian6), marketing focus (e.g. much of Dreamforce) and sales effort (Marc Benioff says he got thrown out of a CIO’s office because he wouldn’t stop talking about the “social” subject) all aligned.

Denis Pombriant obviously attended the same Marc Benioff session I did. Dion Hinchcliffe blogged the whole story in considerable detail.

It’s a cool story, and worthy of attention. But I’d like to step back and remind us that there are numerous different ways to use social technology in the enterprise, which probably shouldn’t be confused with each other. And then I’d like to discuss one area of social technology that’s relatively new to me: integration between social and operational applications.

Read more

May 12, 2011

The Text Analytics Summit needs to be replaced

I wasn’t asked to moderate a panel at the Text Analytics Summit because the guy running it (NOT Seth Grimes) didn’t feel “comfortable” with me doing so. (I wanted real discussion; Ezra evidently just wanted to buy off sponsors and partners with marketing-opportunity slots.) I also wasn’t given a press pass.* (Although uninterested in the sessions, I was interested in stopping by and meeting some newer vendors.)

*This is even though I’ve spoken at four prior versions of the event, and responded to their request for free consulting as recently as this year.

OK, that might have been personal in some way — but Nick Patience tweets a very similar story. Even Seth himself tweets that

They have a business model that does not apply well to the IT conference space.

The Text Analytics Summit has been troubled for years, but evidently things have gotten worse.

This is more than an incidental problem. Interest in text data is exploding, and marketplace confusion about text analytics technology abounds. More clarity is needed, but too few folks have found an economic model for providing it. (The industry shares some of the blame for that.) I’m glad Seth is doing other conference work, notably on sentiment analysis, but yet more is needed.

If I get into the conference business — and it seems natural that I would — I’ll try to help fill the gap. But if somebody else beats me to the punch, more power to you, and please let me know how I can help.

December 1, 2010

The state of the art in text analytics applications

Text analytics application areas typically fall into one or more of three broad, often overlapping domains:

For several years, I’ve been distressed at the lack of progress in text analytics or, as it used to be called, text mining. Yes, the rise of sentiment analysis has been impressive, and higher volumes of text data are being processed than were before. But otherwise, there’s been a lot of the same old, same old. Most actual deployed applications of text analytics or text mining go something like this:

Often, it seems desirable to integrate text analytics with business intelligence and/or predictive analytics tools that operate on tabular data. Even so, such integration is most commonly weak or nonexistent. Apart from the usual reasons for silos of automation, I blame this lack on a mismatch in precision, among other reasons. A 500% increase in mentions of a subject could be simple coincidence, or the result of a single identifiable press article. In comparison, a 5% increase in a conventional business metric might be much more important.
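
To make the precision mismatch concrete, here is a small Python sketch. The figures are hypothetical, invented only for illustration and not taken from any real deployment, and `percent_change` is just a helper name made up for this example. It shows how a 500% jump can rest on a handful of extra mentions (which a single press article could easily generate), while a 5% move in a conventional metric can represent thousands of events.

```python
def percent_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) * 100.0 / before

# Hypothetical figures, invented purely for illustration.
mentions_before, mentions_after = 2, 12           # weekly mentions of a subject
orders_before, orders_after = 100_000, 105_000    # weekly orders

mentions_pct = percent_change(mentions_before, mentions_after)
orders_pct = percent_change(orders_before, orders_after)

# Report the relative change next to the absolute change behind it.
print(f"mentions: {mentions_pct:.0f}% change, {mentions_after - mentions_before} extra items")
print(f"orders: {orders_pct:.0f}% change, {orders_after - orders_before} extra items")
```

Putting the absolute change next to the percentage is one simple way to keep a text-derived signal from being read with more precision than it deserves when it sits beside tabular metrics.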

But in fairness, the text analytics innovation picture hasn’t been quite as bleak as what I’ve been painting so far. Read more

October 24, 2010

Notes, links, and comments, October 24, 2010

Time for a notes/links/comments post just for Text Technologies: Read more

September 28, 2010

A framework for thinking about New Media journalism

Jonathan Stray reminds us of an excellent point:

New Media journalism should be thought of as a product that people use, not as a collection of stories or other pieces.

In particular, he argues:

I am in vehement agreement with much of what Stray has to say, although I think he understates the importance of general knowledge and the often serendipitous benefits of pursuing same. Read more

September 26, 2010

How to preserve investigative reporting in the New Media Era

It is common to say that “On the whole, journalism will be fine even as the media industry is disrupted – but the investigative part of journalism may not fare so well.” Indeed, I took something like that stance in my May 2009 post on where the information ecosystem is headed and even more directly in an earlier piece that month. However, I’ve changed my mind in an optimistic direction, and now believe:

There are still some things we need to do to preserve and extend the societal benefits of investigative reporting. But they are straightforward and very likely to happen.

Specifically, I recommend: Read more

April 4, 2010

Ike Pigott on the future of reporting

Ike Pigott argues that, as the number of conventional journalists plummets, corporations will have to hire their own “embedded” journalists to fill the void. Read more
