July 29, 2006

Web search and enterprise search are coming together

Web search and enterprise search are in many ways fundamentally different problems. The biggest problem in web search is screening out pages that deliberately pretend to be relevant to a search. The second biggest problem is picking out the crème de la crème from a long list of essentially good hits. In enterprise search, on the other hand, the biggest problem is finding a single document, or single fact, that is lonely at best, and if you’re unlucky doesn’t exist in the corpus at all. Document structures are also completely different, as are linking structures and almost every other input to the ranking algorithms except the raw words themselves.

Even so, the businesses and technologies of web and enterprise search are beginning to combine. Google’s attack on the low end of enterprise search is well-known, of course, as are Microsoft’s increasingly well-publicized ambitions. But enterprise search companies are also reaching out to the Web. Convera has gotten the most press for this strategy, offering focused web search to the same customers (mainly intelligence and law enforcement agencies) that bought its enterprise product RetrievalWare. This is a great fit for Convera, both in customers (a lot of what those agencies have done all along is filter news information) and technology (their key differentiator is their detailed taxonomies, and those can help in any kind of search).

But it’s not just Convera. FAST of course sold alltheweb.com, which is now owned by Yahoo, and FAST is barred by a non-compete agreement from getting back into the web search business. Even so, it is spidering and analyzing and perhaps filtering billions of web pages, and offering the results as a service to its enterprise customers. These customers then have a huge leg up in deciding which pages to spider themselves and index with FAST’s enterprise technology, and they have access to FAST’s metadata banks to help with the ranking of those pages once spidered. Clever!
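To make the idea concrete, here is a minimal sketch of how a customer might use that kind of externally supplied page metadata to decide which pages to spider. Everything in it is an assumption for illustration: the `PageMetadata` record, its `topics` and `quality` fields, and the `select_pages_to_spider` function are hypothetical names, not FAST's actual service or data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageMetadata:
    """Hypothetical per-page record supplied by an outside web-analysis service."""
    url: str
    topics: frozenset[str]  # assumed topic labels assigned to the page
    quality: float          # assumed quality score between 0.0 and 1.0


def select_pages_to_spider(records, topics_of_interest, min_quality=0.5, limit=None):
    """Return URLs worth spidering, best candidates first.

    A page qualifies if it meets the quality floor and shares at least one
    topic with the enterprise's topics of interest. Candidates are ordered
    by topic overlap, then by quality score.
    """
    wanted = set(topics_of_interest)
    candidates = [
        r for r in records
        if r.quality >= min_quality and r.topics & wanted
    ]
    candidates.sort(key=lambda r: (len(r.topics & wanted), r.quality), reverse=True)
    urls = [r.url for r in candidates]
    return urls if limit is None else urls[:limit]
```

The same metadata could later feed ranking once the pages are indexed; this sketch only covers the selection step.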

I think that Autonomy is doing something along these lines too, but I’m devoid of any details.

EDIT: Actually, Convera later sold its search technology to FAST, and started OEMing FAST’s technology instead. Microsoft now inherits that relationship with its acquisition of FAST.

 

Comments

4 Responses to “Web search and enterprise search are coming together”

  1. Text Technologies»Blog Archive » Convera aka Excalibur aka ConQuest on July 29th, 2006 8:16 am

    [...] Now the company offers RetrievalWare, augmented by some pattern-matching technology – e.g., what they think is a better form of fuzzy word tokenization, and some color/shape/texture image matching as well. They also have introduced a web search product. (This is confusingly called Excalibur, but they told me last week that a much-needed rebranding is underway.) Maybe this strategy will be the one that finally works out for them. • • • [...]

  2. Text Technologies»Blog Archive » Enterprise-specific web search: High-end web search/mining appliances? on October 22nd, 2006 8:28 pm

    [...] FAST, Convera, Google, and Microsoft all have the potential to introduce such a product package, although I’m not aware of any specific initiatives that exactly match what I have in mind. The closest may be Convera, which is providing a standard vertical-market-specific sub-Web designed for its government intelligence/law-enforcement customers. (I’ve forgotten whether this is on-site or on a SaaS basis.) [...]

  3. Text Technologies»Blog Archive » 41 differences between web and enterprise search on January 31st, 2007 1:26 pm

    [...] Edit: But the separation isn’t absolute. • • • [...]

  4. Andy Black on March 20th, 2007 6:19 am

    Will social networks and vertical search combine to challenge Google?

    Publishers and advertising agencies have a very difficult challenge ahead as traditional “horizontal” media like newspapers, TV channels and magazines see their traditional demographics and advertising revenue streams fragmented by the increasing preference of consumers for online access and the huge presence of Google eroding their audiences and potential future revenues.

    Perhaps they should remember the words of Sun Tzu, who once said “When the enemy is too strong to attack directly, then attack something he holds dear. Know that in all things he cannot be superior. Somewhere there is a gap in the armour, a weakness that can be attacked instead.” Google’s major strength – the clean search box, the ease of use and commoditised ad revenues – perhaps masks its principal weakness. As media content and advertising revenues fragment to serve thousands and thousands of “vertical” online communities based on lifestyle or profession, Google may suddenly seem standardised, commoditised and lacking a sense of unique community. Is Google becoming Wal-Mart, while vertical communities may prefer Harrods?

    Whilst “horizontal” media companies are similar to supermarkets, specialist professional “vertical” publishers are very specific in serving niche communities with totally relevant content and requirements. However, the publisher’s principal operating difficulty in becoming adaptive to this asymmetric Web 2.0 opportunity is that most tend to run each of their print, exhibition and online titles/businesses as separate profit and loss items on their balance sheet. As a by-product the vast majority tend not to have a centralised IT infrastructure or the human IT skill sets to manage a large scale data centre or web spidering facility – the prerequisites needed to datamine and aggregate open source, user generated and blog content to create vertical slices of the Web that are relevant for their audiences. Publishers will also need to integrate this content into the online extensions of their print brands, thereby allowing advertisers the opportunity to target high value communities. In addition, the datamining, crawling and hosting to identify relevant open source content will also need to be a continual process due to the continual growth of user generated and open source content.

    Convera have two very large data centres, an extensive web spidering capability and a web index. Convera are now partnering with a significant number of specialist B2B publishers to create a range of vertical websites for specific professional communities. The first example of this is Searchmedica.com with UBM.

    In building the deep vertical search portals, the key is to reach into the specific professional community in a number of ways. First, you can combine the trade publisher’s knowledge and contacts in the profession with community appeals that engage the specific audience in a way that general search cannot. Second, you can take special care to use the taxonomies common to the targeted profession in organizing search results, so that the user feels more at home and among peers. Building a good vertical engine can be costly and time consuming, and getting a critical mass of users to de-Google their search habits into more specialized engines is potentially a tough sell. However, in focus-group tests comparing these vertical search properties against Google across different professional communities, the results are hugely encouraging.

    In building the beta test sites, the specialist publishers are providing Convera with “white lists” of data sources online and websites that would be most relevant to their readers so that the searches are restricted to reliable and trusted information. Publishers are also securing agreements with owners of key proprietary content not normally crawled by Google by leveraging some of their contacts and resources so that Convera can crawl and deliver some of that proprietary content. Another key consideration is getting the user community engaged in the process as co-developers. No matter how bad the results at Google or Yahoo may be for a given professional segment, the interface is familiar and the destination is always at hand. Getting users to think of a specialized brand as the go-to place for business information is the challenge.

    A number of publishers are actively assessing the potential of adding social networking to the mix in order to get professionals interacting with each other and adding weekly podcasts by industry experts on issues affecting the community – these additional services will create more community loyalty and also additional advertising and sponsorship opportunities.

    The publishers can also use their print titles to drive the audience to the new online areas and this will also assist the transition of their high value print ad revenues to online. Publishers also have exhibitions, seminars, events and email newsletters to assist this transition – and recent research suggests that professional communities will actively attend seminars and events to meet peers and other members of their community. The theory goes that once you get some professionals involved then the viral mechanism or behavioural “Hive Mind” also kicks in and professional workers start referring to the vertical portal as a community source. It also allows advertisers and public relations organisations access to a clearly defined, affluent, influential and stable audience.

    Google does not allow you to have a beer with a potential business partner – it doesn’t have that sense of community. But Google is fighting back – the recent launch of Google Custom Search and acquisition of teenage social network sites indicate they are aware of their weakness – but specialist publishers see this as a Trojan Horse. Social networks for teenagers are highly transient and target a demographic that is volatile, unpredictable and has a low level of disposable income – whereas a social network alongside a vertical search service for 22,000 bio-chemists, 55,000 UK GPs, 55,000 insurance risk assessors or 120,000 US psychiatrists is stable, affluent and attractive for advertisers.
