QL2 – web text extraction and more

Curt Monash — Fri, 07 Dec 2007 21:18:01 +0000

Here are some highlights of the QL2 story, per exec Mike McDermott.

QL2’s main business is scraping price and other product offering data from the web for high-speed competitive analysis. For example, of its 250ish customers overall, over 90 are airlines. Online retailers are another big chunk of the customer base.
QL2 also commonly partners with text mining companies in applications such as Voice of the Market or competitive intelligence. E.g., QL2 has been brought into a few deals each by Attensity, Clarabridge, and especially Temis.
QL2 goes well beyond basic crawling. Notably, the system fills in forms with parameters. And of course it monitors pages for changes.
QL2’s scripting language is, Mike tells me, very SQL-like. Hence the “QL” in the name.
QL2 rolls its own filters, rather than using INSO or whoever. (Actually, what are the main file-reading filter choices these days? I’ve lost track.) Indeed, Mike fondly believes QL2 does a better job with PDFs than Adobe does.
QL2 doesn’t want to be thought of as web-only. Rather, Mike likes my formulation of “text data ETL, web or otherwise.” That said, he freely admits QL2’s strength is in Extract rather than in Transform or Load.

This all sounds very much in line with a post I made about the smart scraping market 2½ years ago.
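
For readers who want a concrete picture of two of the points above (filling in forms with parameters, and monitoring pages for changes), here is a minimal Python sketch. It is not QL2's implementation, and it does not use QL2's SQL-like scripting language, which I haven't seen. The function names, the example URL, and the choice of a SHA-256 hash over whitespace-normalized text are all my own assumptions for illustration.

```python
import hashlib
from urllib.parse import urlencode


def build_query_url(base_url, params):
    """Fill in a GET-style form by encoding parameters into the query string.

    Parameters are sorted by name so the same inputs always yield the same URL.
    """
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def fingerprint(page_text):
    """Hash the page text after collapsing whitespace.

    Collapsing whitespace means trivial reformatting is not reported as a change.
    """
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, page_text):
    """Return True if the page text no longer matches the stored fingerprint."""
    return fingerprint(page_text) != previous_fingerprint
```

In use, a scraper would call something like `build_query_url("https://example.com/fares", {"from": "BOS", "to": "LHR"})` for each parameter combination it cares about, fetch the result, store `fingerprint(text)`, and on the next visit re-extract only where `has_changed` returns True. Real pages also need handling for POST forms, sessions, and dynamic noise such as timestamps or ads, all of which this sketch ignores.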
