<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Text Technologies &#187; Speech recognition</title>
	<atom:link href="http://www.texttechnologies.com/category/natural-language-speech-recognition/speech-recognition/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Wed, 18 Jan 2012 17:02:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>MEN ARE FROM EARTH, COMPUTERS ARE FROM VULCAN</title>
		<link>http://www.texttechnologies.com/2009/05/30/men-are-from-earth-computers-are-from-vulcan/</link>
		<comments>http://www.texttechnologies.com/2009/05/30/men-are-from-earth-computers-are-from-vulcan/#comments</comments>
		<pubDate>Sat, 30 May 2009 06:15:44 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[BI integration]]></category>
		<category><![CDATA[IBM and UIMA]]></category>
		<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Progress and EasyAsk]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Speech recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=331</guid>
		<description><![CDATA[The newsletter/column excerpted below was originally published in 1998.  Some of the specific references are obviously very dated.  But the general points about the requirements for successful natural language computer interfaces still hold true.  Less progress has been made in the intervening decade-plus than I would have hoped, but some recent efforts &#8212; especially in [...]]]></description>
			<content:encoded><![CDATA[<p><em>The newsletter/column excerpted below was originally published in 1998.  Some of the specific references are obviously very dated.  But the general points about the requirements for successful natural language computer interfaces still hold true.  Less progress has been made in the intervening decade-plus than I would have hoped, but some recent efforts &#8212; especially in the area of search-over-business-intelligence &#8212; are at least mildly encouraging.  Emphasis added.<br />
</em></p>
<p>Natural language computer interfaces were introduced commercially about 15 years ago*.  They failed miserably.</p>
<p><em>*I.e., the early 1980s</em></p>
<p style="margin-bottom: 0in;">For example, Artificial Intelligence Corporation&#8217;s Intellect was a natural language DBMS query/reporting/charting tool.  It was actually a pretty good product.  But it&#8217;s infamous among industry insiders as the product for which IBM, in one of its first software licensing deals, got about 1700 trial installations &#8212; and less than a 1% sales close rate.  Even its successor, Linguistic Technologies&#8217; English Wizard*, doesn&#8217;t seem to be attracting many customers, despite consistently good product reviews.</p>
<p style="margin-bottom: 0in;"><em>*These days (i.e., in 2009) it&#8217;s owned by Progress and called EasyAsk. It still doesn&#8217;t seem to be selling well.</em></p>
<p style="margin-bottom: 0in;">Another example was HAL, the natural language command interface to 1-2-3.  HAL is the product that first made Bill Gross (subsequently the founder of Knowledge Adventure and idealab!) and his brother Larry famous.  However, it achieved no success*, and was quickly dropped from Lotus&#8217; product line.</p>
<p style="margin-bottom: 0in;"><em>*I loved the product personally. But I was sadly alone.</em></p>
<p style="margin-bottom: 0in;"><strong>In retrospect, it&#8217;s obvious why natural language interfaces failed.</strong> First of all, <strong>they offered little advantage over the  forms-and-menus paradigm</strong> that dominated enterprise computing in  both the online-character-based and client-server-GUI eras.  If you  couldn&#8217;t meet an application need with forms and menus, you couldn&#8217;t meet it with natural language either.<span id="more-331"></span></p>
<p style="margin-bottom: 0in;">Even worse, NL actually had a couple of clear disadvantages versus traditional interfaces.  First of all,<strong> it required (ick!) typing,</strong> often more typing than the forms and menus did.  Second, <strong>forms and menus tell the user exactly what he can do.</strong> Natural language, however, lets him give orders the computer doesn&#8217;t know how to follow.  This is inefficient, not to mention frustrating.</p>
<p style="margin-bottom: 0in;">However, even in 1983, it was obvious that the typing objection would go away some day, because of speech recognition &#8212; once desktop computers reached 100 MIPS or so.  (Effective keyboard-replacement speech recognition &#8211; as opposed to true natural language understanding &#8212; is mainly a matter of processing power.)  15 years later, standard PCs exceed 100 MIPS (assuming that 1 MIPS = a couple of megahertz for these purposes), and speech recognition is indeed getting practical.</p>
<p style="margin-bottom: 0in;">In fact, as has become increasingly evident recently, speech recognition is now a hot technology.  Bill Gates has been talking it up for a couple of years.  Increasingly, the press has swung to believing him &#8230; And my parents just bought a PC with two speech recognition products on it.</p>
<p style="margin-bottom: 0in;">That said, speech recognition is as misunderstood (no pun intended) as most artificial intelligence technologies.  Yes, it beats typing, in a number of circumstances:</p>
<ul>
<li>On the telephone (duh!)</li>
<li>&#8220;Busy hands&#8221; and/or &#8220;busy eyes&#8221; applications and locales (doctors&#8217; offices, trading floors, warehouses, etc. &#8211; and, some day in the future, your kitchen and car)</li>
<li>People simply reluctant to type (e.g., anybody with sufficient wrist or back problems, and many males over the age of 45)</li>
</ul>
<p>But before our computers talk back and forth with us in the voice of Majel Barrett Roddenberry, applications are going to have to add several important elements required for truly functional natural-language  interfaces:</p>
<ul>
<li><strong>Intuitively clear names for everything on (or just behind) the screen</strong></li>
<li><strong>Application-specific disambiguation logic</strong></li>
</ul>
<p style="margin-bottom: 0in;">For most practical purposes, the latter requirement equates to</p>
<ul>
<li>
<p style="margin-bottom: 0in;">A new generation of document selection technology</p>
</li>
</ul>
<p style="margin-bottom: 0in;">THE RULE OF NAMES</p>
<p>According to legend, knowing something&#8217;s name gives you power over it.  When that &#8220;something&#8221; is a button or menu choice on a speech-enabled computer, the legend is literally true.  But when a feature doesn&#8217;t have an obvious name, you can&#8217;t easily invoke it.</p>
<p>When applications consisted mainly of forms and menus, this was rarely a problem.  Everything had a clear role and label.  But web pages are less organized.  Hyperlinks can be scattered all over the place, with little rhyme or reason.</p>
<p>Frankly, I don&#8217;t think this is a hard problem to solve.  It wouldn&#8217;t take a lot of XML to divide the page into clear regions, so that commands like &#8220;Show me article #3&#8221; (on a search results list) could be interpreted in the obvious way.  But it does take at least some discipline; random web pages will not necessarily be easy to &#8220;talk&#8221; to.</p>
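<p>As a minimal sketch of that idea, the Python below resolves a spoken command against named page regions. The region names, the item titles, and the <code>resolve</code> function are all hypothetical, invented here for illustration; a real page would expose its regions through its own markup.</p>

```python
import re
from typing import Dict, List, Optional

# Hypothetical page model: region name -> ordered item titles (assumed for illustration).
PAGE_REGIONS = {
    "article": ["Speech recognition survey", "Mobile search review", "Text analytics markets"],
    "ad": ["Sponsor link"],
}


def resolve(command: str, regions: Dict[str, List[str]]) -> Optional[str]:
    """Map a command such as 'Show me article #3' to the named region's third item."""
    match = re.search(r"(\w+)\s+#?(\d+)", command.lower())
    if match is None:
        return None
    name, number = match.group(1), int(match.group(2))
    items = regions.get(name, [])
    # Spoken numbering starts at 1; anything out of range is left unresolved.
    if number in range(1, len(items) + 1):
        return items[number - 1]
    return None
```

<p>With regions named this way, <code>resolve("Show me article #3", PAGE_REGIONS)</code> picks the third entry of the article list, while a request for an item that is not on the page returns <code>None</code> and can be handed to whatever error handling the application has.</p>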
<p>CYBERNETIC LISTENING SKILLS</p>
<p><strong>The bigger challenge is to make sure that the application can respond in some useful way, no matter what command it&#8217;s given. </strong> This is even more difficult than it was 15 years ago, because of the radical increase in &#8220;casual&#8221; computer usage.  In the old days, we could assume the user had some clear business reason for using the application, and if necessary that s/he had time to be trained (even if people rarely sat still for as much training as they really needed).  Therefore, we could assume that the users had at least a general idea of what the application did, and hence of which commands the computer could obey.  From an NL standpoint, we could assume that what they actually &#8220;said&#8221; (which in those days meant &#8220;typed&#8221;) was at least reasonably close to what they were &#8220;supposed&#8221; to say.</p>
<p>Now, however, some of the most important applications are internet e-commerce and portals, competing and begging for the user&#8217;s attention.  The user is there strictly on a voluntary basis, and if he doesn&#8217;t get immediate gratification, he&#8217;s gone, history, hasta la bye-bye.  Site-specific training isn&#8217;t even a consideration. And even if somebody did actually take a class on &#8220;How to use Excite,&#8221; the knowledge would be obsolete in six months.  So <strong>applications, if they are to have natural language interfaces that please and respond to users, have to be able to respond pretty much to any command.</strong></p>
<p>Ideally, voice-enabled systems would be like the computers on Star Trek, which can return information from vast archives, brew a pot of Earl Grey tea, play three parts of a quartet, create self-aware life forms, or answer questions like &#8220;Computer, what is the nature of the universe?&#8221;  More realistically, they should be able, for example, to respond to a command like &#8220;Tell me about flights to Miami&#8221; by automatically giving the user a travel-reservation application or web page, and entering Miami in the appropriate form field.</p>
<p>If one thinks about the complications in such a system, it becomes clear that there are only two possible ways an application system can be designed to respond meaningfully to an enormous range of reasonable possible requests.</p>
<p>1. It can do the equivalent of saying &#8220;I&#8217;m sorry, I didn&#8217;t understand that,&#8221; &#8220;I&#8217;m sorry, I can&#8217;t do that,&#8221; and so on.</p>
<p>2. It can interpret many commands as text-search strings, and return appropriate results.</p>
<p>The first strategy &#8211; application-specific disambiguation logic, clear responses to &#8220;errors,&#8221; etc. &#8212; is absolutely necessary.  No software is perfectly intelligent; <strong>the user will have to be asked for disambiguation help from time to time</strong> (just as clerks today ask customers to repeat their requests!). I&#8217;m not going to go into much detail about how that works because, frankly, it&#8217;s a tricky thing to get right.  Users hate unnecessary disambiguation steps. They also hate the incorrect responses that result from ambiguity, but do tolerate being asked for help when it&#8217;s truly needed.  In short, whatever you build the first time around will probably be wrong.  So build something fast; then run, don&#8217;t walk, to the nearest usability lab, find out how you screwed up, and redo your system until you get it right.</p>
<p>I&#8217;m convinced that the second strategy &#8212; <strong>heavy reliance on text search technology &#8212; is a requirement as well. </strong> Just try to name a major web site that doesn&#8217;t use text search.  True, text search has gotten a bad rap recently, mainly because a whole generation of search engines didn&#8217;t really work.  But it will stage a comeback.</p>
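<p>The two strategies can be combined in a single dispatcher. The following Python is only a sketch under stated assumptions: the <code>COMMANDS</code> table, the application names, and the word-overlap search are hypothetical stand-ins, not a description of any real product.</p>

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Response:
    kind: str    # "action", "clarify", or "search"
    detail: str


# Hypothetical table: trigger phrase -> application name (assumed for illustration).
COMMANDS = {
    "flights to": "travel_reservation",
    "weather in": "weather_report",
}


def search_documents(query: str, documents: List[str]) -> List[str]:
    """Naive text search: keep documents sharing at least one word with the query."""
    words = set(query.lower().split())
    return [doc for doc in documents if words.intersection(doc.lower().split())]


def respond(utterance: str, documents: List[str]) -> Response:
    """Route an utterance to an application, a clarifying question, or text search."""
    text = utterance.lower()
    matches = [(phrase, app) for phrase, app in COMMANDS.items() if phrase in text]
    if len(matches) == 1:
        # Exactly one known command: launch it and pass along the rest as its argument.
        phrase, app = matches[0]
        start = text.index(phrase) + len(phrase)
        argument = utterance[start:].strip(" .?!")
        return Response("action", f"{app}:{argument}")
    if len(matches) > 1:
        # Strategy 1: ask for disambiguation help only when it is truly needed.
        options = " or ".join(app for _, app in matches)
        return Response("clarify", f"Did you mean {options}?")
    # Strategy 2: treat anything unrecognized as a text-search string.
    hits = search_documents(utterance, documents)
    if hits:
        return Response("search", "; ".join(hits))
    return Response("clarify", "I'm sorry, I didn't understand that.")
```

<p>Here &#8220;Tell me about flights to Miami&#8221; is routed to the travel application with Miami as its argument, an utterance matching two commands draws a clarifying question, and everything else falls through to search before the system admits defeat.</p>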
<p><em><strong>Related links</strong></em></p>
<ul>
<li>My <a href="http://www.texttechnologies.com/2007/12/02/voice-dictation-nuance-dragon-naturallyspeaking/" >December, 2007 survey of speech recognition technology</a></li>
<li><a href="http://www.monashreport.com/2009/05/12/star-trek-companions/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monashreport.com');">Star Trek fun</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2009/05/30/men-are-from-earth-computers-are-from-vulcan/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Lukewarm review of Yahoo mobile search</title>
		<link>http://www.texttechnologies.com/2008/11/11/review-yahoo-mobile-search/</link>
		<comments>http://www.texttechnologies.com/2008/11/11/review-yahoo-mobile-search/#comments</comments>
		<pubDate>Tue, 11 Nov 2008 23:01:36 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Specialized search]]></category>
		<category><![CDATA[Speech recognition]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=293</guid>
		<description><![CDATA[Stephen Shankland reviewed Yahoo&#8217;s mobile voice search, which works by taking voice input and returning results onscreen (in his case on his Blackberry Pearl). He found: There are plenty of times when voice is a more convenient form of input than typing. Voice recognition was good but far from perfect. Editing search strings was annoyingly [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://news.cnet.com/8301-1023_3-10092659-93.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/news.cnet.com');">Stephen Shankland</a> reviewed Yahoo&#8217;s mobile voice search, which works by taking voice input and returning results onscreen (in his case on his Blackberry Pearl).  He found:</p>
<ul>
<li>There are plenty of times when voice is a more convenient form of input than typing.</li>
<li>Voice recognition was good but far from perfect.</li>
<li>Editing search strings was annoyingly difficult.</li>
<li>Search results themselves aren&#8217;t 100% perfect.</li>
</ul>
<p>No big surprises there. <img src='http://www.texttechnologies.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/11/11/review-yahoo-mobile-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TechCrunchIT rants against voice recognition</title>
		<link>http://www.texttechnologies.com/2008/07/07/techcrunchit-rants-against-voice-recognition/</link>
		<comments>http://www.texttechnologies.com/2008/07/07/techcrunchit-rants-against-voice-recognition/#comments</comments>
		<pubDate>Mon, 07 Jul 2008 08:17:08 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Speech recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=257</guid>
		<description><![CDATA[TechCrunchIT ranted yesterday against voice recognition. Parts of the argument have validity, but I think the overall argument was overstated. Key points included: 1. Microsoft and Bill Gates have been overoptimistic about voice recognition. 2. Who needs voice when you have keyboards big and small? 3. The office environment is too noisy for voice recognition [...]]]></description>
			<content:encoded><![CDATA[<p>TechCrunchIT ranted yesterday <a href="http://www.techcrunchit.com/2008/07/06/will-we-ever-bury-voice-recognition/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.techcrunchit.com');">against voice recognition</a>.  Parts of the argument have validity, but I think the overall argument was overstated.</p>
<p>Key points included:</p>
<p>1.  Microsoft and Bill Gates have been overoptimistic about voice recognition.</p>
<p>2.  Who needs voice when you have keyboards big and small?</p>
<p>3.  The office environment is too noisy for voice recognition to work.</p>
<p><span id="more-257"></span>In particular, TechCrunchIT wrote:</p>
<blockquote><p>In a real-world enterprise environment, it is impossible to imagine a room full of people all using voice dictation at their computers. The background noise is difficult to filter out, and the modern office environment is full of interruptions with phones ringing, instant messages, new emails and more.</p></blockquote>
<p>That part of the argument can be refuted in one word &#8212; <em>headphones &#8212; </em>but other parts carry a bit more weight.  For example, so long as it is true that:</p>
<blockquote><p>When typing at a keyboard, you can easily multi-task and stop/start easily while switching between programs. With voice recognition, you need to pause or stop recording and specifically tell the application when you are actually speaking to it by pressing a button.</p></blockquote>
<p>voice recognition won&#8217;t grow beyond niche status.  But it will remain true until computers have effective command-line interfaces that work seamlessly among multiple applications.  And I&#8217;m not aware that such interfaces have shown much progress to date.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/07/07/techcrunchit-rants-against-voice-recognition/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>3 specialized markets for text analytics</title>
		<link>http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/</link>
		<comments>http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 07:44:09 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Speech recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=250</guid>
		<description><![CDATA[In the previous post, I offered a list of eight linguistics-based market segments, and a slide deck surveying them. And I promised a series of follow-up posts based on the slides. Let me begin by explaining what I mean by some of that list (taken from Slide 2), starting from the bottom. Machine translation is [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;"><span style="font-style: normal;"><span>In the <a href="http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/#more-249" >previous post</a>, I offered a list of eight linguistics-based market segments, and a <a href="http://www.monash.com/Text-analytics-markets-June-2008.ppt" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');">slide deck</a> surveying them.  And I promised a series of follow-up posts based on the slides.</span></span><span id="more-250"></span></p>
<p style="margin-bottom: 0in;"><span style="font-style: normal;"><span>Let me begin by explaining what I mean by some of that list (taken from Slide 2), starting from the bottom.</span></span></p>
<ul>
<li><span style="font-style: normal;"><span><strong>Machine translation</strong> is a small business, with small specialized vendors. Lernout &amp; Hauspie attempted to combine it with voice recognition in a complex financial play, but that collapsed in a miasma of stock fraud. The remnants turned into what became Nuance Communications.</span></span></li>
<li><span style="font-style: normal;"><span>Nuance is a roll-up of most of the important independent <strong>voice recognition </strong>vendors. So far voice recognition has worked best in two areas: “Hands-free” computer use/dictation, and IVR (interactive voice response). While both are important, neither is exactly a mainstream enterprise computer software business. So voice recognition is not closely integrated with the other market segments.</span></span></li>
<li><span style="font-style: normal;"><span><strong>“Natural language processing”</strong> other than voice recognition isn&#8217;t much of a business at this time (with apologies to Progress EasyAsk). It doesn&#8217;t make the list at all.</span></span></li>
<li><span style="font-style: normal;"><span><strong>Spam filtering</strong> is obviously a major business, whether or not it is getting combined into more general security and/or messaging product suites. Antispam vendors actually perform a lot of machine learning, much like text miners do. But the types of rules they wind up with are quite different. And their hardest problems aren&#8217;t linguistic ones, usually, as the spammers have gone beyond text to, e.g., words depicted in graphical images. Besides, even where linguistics are involved, it&#8217;s a very different problem to identify words used by bad guys trying to spoof you (and the rest of the world) than it is to understand your particular users.</span></span></li>
</ul>
<p style="margin-bottom: 0in;"><span style="font-style: normal;"><span>Why and to what extent I see the other five as separate markets was explained in connection with the subsequent 17 slides.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The Text Analytics Marketplace: Competitive landscape and trends</title>
		<link>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/</link>
		<comments>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/#comments</comments>
		<pubDate>Thu, 19 Jun 2008 07:35:39 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Audio and video search]]></category>
		<category><![CDATA[BI integration]]></category>
		<category><![CDATA[Custom publishing]]></category>
		<category><![CDATA[Enterprise search]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Progress and EasyAsk]]></category>
		<category><![CDATA[Search engines]]></category>
		<category><![CDATA[Social software and online media]]></category>
		<category><![CDATA[Spam and antispam]]></category>
		<category><![CDATA[Speech recognition]]></category>
		<category><![CDATA[Structured search]]></category>
		<category><![CDATA[Text Analytics Summit]]></category>
		<category><![CDATA[Text mining]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/?p=249</guid>
		<description><![CDATA[As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”: 1. Web search 2. Public-facing site search 3. Enterprise search and knowledge management 4. Custom publishing 5. Text mining and extraction Three are more standalone: 6. Spam [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in;">As I see it, there are eight distinct market areas that each depend heavily on linguistic technology. Five are off-shoots of what used to be called “information retrieval”:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">1.  Web search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">2.  Public-facing site search</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">3.  Enterprise search and knowledge management</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">4.  Custom publishing</p>
<p style="padding-left: 30px;">5.  Text mining and extraction</p>
<p style="margin-bottom: 0in; font-style: normal;">Three are more standalone:</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">6.  Spam filtering</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">7.  Voice recognition</p>
<p style="margin-bottom: 0in; font-style: normal; padding-left: 30px;">8.  Machine translation</p>
<p><span id="more-249"></span></p>
<p style="margin-bottom: 0in;">This list comes from a talk I gave Monday at the Text Analytics Summit called <em>The Text Analytics Marketplace: Competitive landscape and trends. </em>In half an hour, I covered the first five areas (in Sue Feldman&#8217;s word, at a “gallop”). The slide deck has been uploaded to the link below.  <span style="font-style: normal;"><span>I plan to break out the material from the talk into a series of blog posts over the next few (or perhaps not-so-few) weeks. </span></span></p>
<p style="margin-bottom: 0in;"><em><strong>Slides:</strong></em></p>
<ul>
<li><a href="http://www.monash.com/Text-analytics-markets-June-2008.ppt" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');"><span>The Text Analytics Marketplace: Competitive landscape and trends</span></a></li>
</ul>
<p style="margin-bottom: 0in;"><strong><em>Other posts based on those slides:</em></strong></p>
<ul>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/3-specialized-markets-for-text-analytics/" >Three specialized markets for text analytics</a> (based on Slide 2)</span></li>
<li><span><a href="http://www.texttechnologies.com/2008/06/19/6-trends-that-could-shake-up-the-text-analytics-market/" >6 trends that could shake up the text analytics market</a> (based on Slide 19)</span></li>
<li><span><a href="(in A World of Bytes)">Why search technologies are going to recombine</a> (in <em>A World of Bytes</em>, based on Slide 19)<br />
</span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/06/19/text-analytics-marketplace-competitive-landscape-trends/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Dr. Dolittle in silicon</title>
		<link>http://www.texttechnologies.com/2008/01/17/dr-doolittle/</link>
		<comments>http://www.texttechnologies.com/2008/01/17/dr-doolittle/#comments</comments>
		<pubDate>Thu, 17 Jan 2008 05:49:40 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Speech recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2008/01/17/dr-doolittle/</guid>
		<description><![CDATA[The Reg passes along a Reuters story that Hungarian scientists have built a system to automatically understand canine vocalizations. I&#8217;d like to say it&#8217;s a woof-to-Magyar translator, but apparently all it does is recognize the doggies&#8217; emotional states. The story reports that the system has 43% accuracy, vs. 40% for humans. I must confess, however, [...]]]></description>
			<content:encoded><![CDATA[<p>The <em>Reg</em> passes along a Reuters story that <a href="http://www.theregister.co.uk/2008/01/16/dog_translator/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.theregister.co.uk');">Hungarian scientists have built a system to automatically understand canine vocalizations</a>.   I&#8217;d like to say it&#8217;s a woof-to-Magyar translator, but apparently all it does is recognize the doggies&#8217; emotional states.  The story reports that the system has 43% accuracy, vs. 40% for humans.</p>
<p>I must confess, however, to being somewhat puzzled about how they measure success.  Does the pooch fill out a survey form afterwards?  Do they conclude that the beast wasn&#8217;t angry if the experimenter doesn&#8217;t get bitten?</p>
<p>I need to know a bit more about the research protocol before I know what to think about this.</p>
<p>EDIT:  The CBC has <a href="http://www.cbc.ca/technology/story/2008/01/15/science-dogs.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.cbc.ca');">a little more detail</a>.   The underlying  research paper is appearing in <em>Animal Cognition.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2008/01/17/dr-doolittle/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>So what&#8217;s the state of speech recognition and dictation software?</title>
		<link>http://www.texttechnologies.com/2007/12/02/voice-dictation-nuance-dragon-naturallyspeaking/</link>
		<comments>http://www.texttechnologies.com/2007/12/02/voice-dictation-nuance-dragon-naturallyspeaking/#comments</comments>
		<pubDate>Sun, 02 Dec 2007 23:04:15 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Speech recognition]]></category>
		<category><![CDATA[Sybase]]></category>
		<category><![CDATA[AnswersAnywhere]]></category>
		<category><![CDATA[NaturallySpeaking]]></category>
		<category><![CDATA[voice recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/12/02/voice-dictation-nuance-dragon-naturallyspeaking/</guid>
		<description><![CDATA[Linda asked me about the state of desktop dictation technology. In particular, she asked me whether there was much difference between the latest version and earlier, cheaper ones. My knowledge of the area is out of date, so I thought I&#8217;d throw both the specific question and the broader subject of speech recognition out there [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.monash.com/barlow.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');">Linda</a> asked me about the state of desktop dictation technology.   In particular, she asked me whether there was much difference between the latest version and earlier, cheaper ones.  My knowledge of the area is out of date, so I thought I&#8217;d throw both the specific question and the broader subject of speech recognition out there for general discussion.</p>
<p>Here&#8217;s much of what I know or believe about speech recognition:</p>
<ul>
<li>Most major independent commercial speech recognition efforts have wound up being merged into Nuance Communications.  That goes for both desktop and server-side stuff.    None was doing particularly well before its respective merger.<span id="more-145"></span></li>
<li>A folk dance buddy (Jonathan Young, once of Dragon Systems) taught me the essential principle of developing speech recognition systems, which probably applies more broadly to other language-understanding technologies as well:  &#8220;How do you make a good speech recognition product?  You start with a bad one and keep incrementally improving it.&#8221;</li>
<li>Linda tells me that a lot of novelists use dictation software, to reduce repetitive strain from typing.  However, this often leads to repetitive use strains on their throats.  I don&#8217;t know whether it makes a difference if one uses better microphones, talks more softly, and/or has access to software that is less demanding of carefully enunciated gaps between each word.</li>
<li>Perhaps due to accuracy concerns, and perhaps also due to concern about noise pollution in the workplace, ordinary computer control via voice is rare.  Most applications focus on specialized-circumstance dictation (hands-free, disabled users, users who are being harmed by typing, etc.) or telephone interaction.</li>
<li>Rich semantic technology isn&#8217;t yet used in speech recognition to nearly the extent it is in text search/mining/analytics.  The grammar in speech recognition systems is primitive at best.  And while there may be some hand-built semantic networks with small numbers of nodes, à la Sybase AnswersAnywhere, nobody&#8217;s ever hooked up (say) a WordNet equivalent or a good entity-extraction engine as part of a mainstream commercial speech recognition product.  (Please correct me if I&#8217;m wrong about this part!)</li>
<li>There are real challenges in voice recognition via remote microphones in small enclosed places (e.g., automobiles), especially when noisy.  But wearing headsets while driving is generally frowned on by the traffic police.  EDIT:  <a href="http://www.theregister.co.uk/2007/11/30/ford_chatface_kit_europe_hasselhoff_stylee/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.theregister.co.uk');">It seems that those challenges are being overcome</a>.</li>
<li>Overall, I can&#8217;t think of anything wrong in <a href="http://en.wikipedia.org/wiki/Dragon_NaturallySpeaking" onclick="javascript:pageTracker._trackPageview('/outbound/article/en.wikipedia.org');">this Wikipedia article</a> on Dragon NaturallySpeaking.  That said, the article is a bit sloppy, so I&#8217;d encourage people to edit it and spruce it up.</li>
</ul>
<p>Any thoughts?  In particular, what version of Dragon NaturallySpeaking or a competitive product should Linda use, and why?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/12/02/voice-dictation-nuance-dragon-naturallyspeaking/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>NEC simplifies the voice translation problem</title>
		<link>http://www.texttechnologies.com/2007/11/30/nec-simplifies-the-voice-translation-problem/</link>
		<comments>http://www.texttechnologies.com/2007/11/30/nec-simplifies-the-voice-translation-problem/#comments</comments>
		<pubDate>Fri, 30 Nov 2007 18:48:59 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Speech recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/11/30/nec-simplifies-the-voice-translation-problem/</guid>
		<description><![CDATA[NEC announced research-level technology that lets a cellphone automatically translate from Japanese into English. The key idea is that they are generating text output, not speech, which lets them sidestep pesky problems about accuracy. I.e. (emphasis mine): One second after the phone hears speech in Japanese, the cellphone with the new technology shows the text [...]]]></description>
			<content:encoded><![CDATA[<p>NEC announced <a href="http://afp.google.com/article/ALeqM5iDHBMepqAf5xgLucDaR9yNDrJVBw" onclick="javascript:pageTracker._trackPageview('/outbound/article/afp.google.com');">research-level technology that lets a cellphone automatically translate from Japanese into English</a>.  The key idea is that they are generating text output, not speech, which lets them sidestep pesky problems about accuracy.  I.e. (emphasis mine):</p>
<blockquote><p>One second after the phone hears speech in Japanese, the cellphone with the new technology shows the text on the screen. One second later, an English version appears.  &#8230;</p>
<p>&#8220;We would need to study how to recognise [sic] voices on the phone precisely. <strong>Another problem would be how the person on the other side of the line could know if his or her words are being translated correctly</strong>,&#8221; he said.</p></blockquote>
<p><a href="http://www.monash.com/signup.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');"><span id="more-144"></span></a><em><a href="http://www.monash.com/signup.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monash.com');">Stay informed!</a>  No hassle, no spam &#8212; all it takes is an email address or an RSS subscription!  Get all our research, or just the text analytics part, or just get a very few notifications of our most important news. </em></p>
<p><em><p>Technorati Tags: <a href="http://technorati.com/tag/NEC" onclick="javascript:pageTracker._trackPageview('/outbound/article/technorati.com');" rel="tag">NEC</a>, <a href="http://technorati.com/tag/machine+translation" onclick="javascript:pageTracker._trackPageview('/outbound/article/technorati.com');" rel="tag"> machine translation</a>, <a href="http://technorati.com/tag/speech+recognition" onclick="javascript:pageTracker._trackPageview('/outbound/article/technorati.com');" rel="tag"> speech recognition</a>, <a href="http://technorati.com/tag/NIPNY" onclick="javascript:pageTracker._trackPageview('/outbound/article/technorati.com');" rel="tag"> NIPNY</a></p></em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/11/30/nec-simplifies-the-voice-translation-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Progress EasyAsk</title>
		<link>http://www.texttechnologies.com/2007/07/16/progress-easyask/</link>
		<comments>http://www.texttechnologies.com/2007/07/16/progress-easyask/#comments</comments>
		<pubDate>Mon, 16 Jul 2007 14:51:06 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[BI integration]]></category>
		<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Mercado]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Progress and EasyAsk]]></category>
		<category><![CDATA[Speech recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2007/07/16/progress-easyask/</guid>
		<description><![CDATA[I dropped by Progress a couple of weeks ago for back-to-back briefings on Apama and EasyAsk. EasyAsk is Larry Harris&#8217; second try at natural language query, after the Intellect product fell by the wayside at Trinzic, the company Artificial Intelligence Corporation grew into.* After a friendly divorce from the company he founded, if my memory [...]]]></description>
			<content:encoded><![CDATA[<p style="margin-bottom: 0in">I dropped by Progress a couple of weeks ago for back-to-back briefings on <a href="http://www.dbms2.com/2007/07/16/progress-apama/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.dbms2.com');">Apama</a> and EasyAsk.  EasyAsk is Larry Harris&#8217; second try at natural language query, after the Intellect product fell by the wayside at Trinzic, the company Artificial Intelligence Corporation grew into.*  After a friendly divorce from the company he founded, if my memory is correct, Larry was able to build EasyAsk very directly on top of the Intellect intellectual property.</p>
<p style="margin-bottom: 0in"><em>*Other company or product names in the mix at various times include AI Corp and English Wizard.  Not inappropriately, it seems that Larry has quite an affinity for synonyms &#8230;</em></p>
<p style="margin-bottom: 0in">EasyAsk is still a small business.  The bulk is still in enterprise query, but new activity is concentrated on e-commerce applications.  While Larry thinks that they&#8217;ve solved most of the other technical problems that have bedeviled him over the past three decades, the system still takes too long to implement. <span id="more-117"></span>  His rough rule of thumb is that implementation &#8212;  i.e., building the thesaurus – takes 10% as much effort as overall database design did in the first place.  That comment leads to what seems to me to be a pretty obvious suggestion:  Focus on selling to sites that have already installed a “semantic layer” (Business Objects&#8217; term) or the equivalent while setting up their BI system – <em>whether or not EasyAsk can get actual partnerships with BOBJ, Cognos, et al.  </em><span style="font-style: normal">And I&#8217;ll stop right there, because I&#8217;m not sure whether Larry&#8217;s comments on what they have or haven&#8217;t done in that regard were meant as general briefing material, or were under NDA in our client relationship. </span></p>
<p style="margin-bottom: 0in"><span style="font-style: normal">In the e-commerce area, EasyAsk is a direct competitor to <a href="http://www.texttechnologies.com/2007/02/15/inquira-mercado-structured-search/" >Mercado</a>, with a lot of analogous functionality.  As previously noted, they claim their users enjoy <a href="http://www.texttechnologies.com/2007/05/01/huge-e-commerce-gains-claimed-by-everybody/" >particularly strong revenue benefits</a>.  After talking about it with Larry, I now feel it&#8217;s a sincere and credible claim.  Neither he nor I would claim the case has been proven with shining statistical rigor (and my doubts as to its fundamental merits remain greater than his).  But Larry is a smart and honest man, and when we discussed it I didn&#8217;t happen to catch him in any obvious and uncharacteristic thinking errors.</span></p>
<p>Takeaways from the conversation included:</p>
<ul>
<li>
<p style="margin-bottom: 0in">Customers are often software OEMs, especially in the health care and human resource areas.</p>
</li>
<li>
<p style="margin-bottom: 0in">SQL generation times are down to a millisecond or so.</p>
</li>
<li>
<p style="margin-bottom: 0in">MDX is a future direction for them, as an alternative to SQL.  To the extent MDX syntax is ugly, that&#8217;s a plus for them, not a minus!</p>
</li>
<li>
<p style="margin-bottom: 0in">Larry feels they&#8217;ve gotten a lot better at navigating people to canned reports and the like, rather than just re-executing queries for them.  On the other hand, when I told him a vision I first pitched on his behalf (and to him) in 1984 about application command-and-control, I didn&#8217;t get a lot of recognition.  Too bad.  That kind of thing is central to <a href="http://www.monashreport.com/2007/07/09/revolutionary-trends-in-the-analytics-market/" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.monashreport.com');">natural language&#8217;s potential ability to solve some of the inherent problems of dashboards</a>.</p>
</li>
<li>
<p style="margin-bottom: 0in">The obvious voice recognition/natural language pairing now works pretty well for known users, but isn&#8217;t good yet for web applications.</p>
</li>
<li>
<p style="margin-bottom: 0in">Since being acquired by Progress they&#8217;ve become much more multilingual.  (I guess the name “English Wizard” gives a clue as to how multilingual they used to be.)</p>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2007/07/16/progress-easyask/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>That great linguist, Groucho Marx, and other stories</title>
		<link>http://www.texttechnologies.com/2006/06/09/that-great-linguist-groucho-marx-and-other-stories/</link>
		<comments>http://www.texttechnologies.com/2006/06/09/that-great-linguist-groucho-marx-and-other-stories/#comments</comments>
		<pubDate>Sat, 10 Jun 2006 02:25:56 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Humor]]></category>
		<category><![CDATA[Language recognition]]></category>
		<category><![CDATA[Natural language processing (NLP)]]></category>
		<category><![CDATA[Speech recognition]]></category>

		<guid isPermaLink="false">http://www.texttechnologies.com/2006/06/10/that-great-linguist-groucho-marx-and-other-stories/</guid>
		<description><![CDATA[If you&#8217;re reading this blog, you&#8217;re probably familiar with a saying that illustrates some of the basic challenges of disambiguation: Time flies like an arrow. Fruit flies like a banana. But did you know who said it first? I didn&#8217;t until recently. Turns out it was Groucho Marx. Incidentally, Roger Schank&#8217;s lesser-known next-generation follow-up was: [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;re reading this blog, you&#8217;re probably familiar with a saying that illustrates some of the basic challenges of disambiguation:</p>
<p><em>Time flies like an arrow.  Fruit flies like a banana.</em></p>
<p>But did you know who said it first? I didn&#8217;t until recently.</p>
<p><span id="more-13"></span></p>
<p>Turns out it was Groucho Marx.</p>
<p>Incidentally, Roger Schank&#8217;s lesser-known next-generation follow-up was:</p>
<p><em>John saw Mary with another teacher.  Mary saw John with another woman.</em></p>
<p>Think of the information encapsulated in that!</p>
<p>And surely you know of the early machine translation system that took a phrase from English to Russian and back and wound up transforming:</p>
<p><em>The spirit was willing but the flesh was weak.</em></p>
<p>to</p>
<p><em>The vodka was good but the meat was rotten.</em></p>
<p>I think that story was true.  But I&#8217;ll close with one that&#8217;s wholly apocryphal, and not as well known in the linguistics community as the others are.</p>
<p><em>A research project produced a prototype of a speech-operated tactical advisor.  Demo day came, and a General (with entourage) was ushered into a room with a tactical simulation.  The engineers did a good job of rattling off situation reports, as officers might at an actual staff meeting.  The project lead then gestured to the General that he should proceed.  He cleared his throat:</em></p>
<p><em>&#8220;So, computer, which course of action do you recommend?  Shall we maintain a defensive stance, undertake a frontal assault, or try a flanking maneuver?&#8221;</em></p>
<p><em>There was a pause, as tapes spun and things clacked and clattered.  (This is an OLD apocryphal story.)  After a while, a mechanical voice replied:</em></p>
<p><em>&#8220;Yes.&#8221;</em></p>
<p><em>Aggravated by this logical but useless literalism, the General shouted &#8220;Yes, what??&#8221;</em></p>
<p><em>Immediately the computer replied:</em></p>
<p><em>&#8220;Yes, SIR!!&#8221;</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.texttechnologies.com/2006/06/09/that-great-linguist-groucho-marx-and-other-stories/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

