<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: More on Languageware</title>
	<atom:link href="http://www.texttechnologies.com/2008/10/10/more-on-languageware/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com/2008/10/10/more-on-languageware/</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Thu, 19 Jan 2012 17:11:01 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: Kas Thomas</title>
		<link>http://www.texttechnologies.com/2008/10/10/more-on-languageware/#comment-52464</link>
		<dc:creator>Kas Thomas</dc:creator>
		<pubDate>Thu, 23 Oct 2008 13:00:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.texttechnologies.com/?p=286#comment-52464</guid>
		<description>I don&#039;t mean to be flip, but the mere fact that you have to ask the question points to the answer, I think. I don&#039;t know anyone who is seeing significant demand for UIMA. The folks at Nstein might have some insight into this.</description>
		<content:encoded><![CDATA[<p>I don&#8217;t mean to be flip, but the mere fact that you have to ask the question points to the answer, I think. I don&#8217;t know anyone who is seeing significant demand for UIMA. The folks at Nstein might have some insight into this.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curt Monash</title>
		<link>http://www.texttechnologies.com/2008/10/10/more-on-languageware/#comment-52440</link>
		<dc:creator>Curt Monash</dc:creator>
		<pubDate>Thu, 23 Oct 2008 00:41:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.texttechnologies.com/?p=286#comment-52440</guid>
		<description>Temis had a deal with, I think, Europol that specified UIMA.

Otherwise, I haven&#039;t heard of a lot of demand.</description>
		<content:encoded><![CDATA[<p>Temis had a deal with, I think, Europol that specified UIMA.</p>
<p>Otherwise, I haven&#8217;t heard of a lot of demand.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Carpenter</title>
		<link>http://www.texttechnologies.com/2008/10/10/more-on-languageware/#comment-52435</link>
		<dc:creator>Bob Carpenter</dc:creator>
		<pubDate>Wed, 22 Oct 2008 21:49:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.texttechnologies.com/?p=286#comment-52435</guid>
		<description>We typically tell our potential customers that not only don&#039;t we have any magic pixie dust, no one does.  It&#039;s best to cut out the hype up front!

I didn&#039;t understand Marie&#039;s point about how their key differentiator is that they&#039;ve &quot;completely decoupled the language model from the code that runs the analysis&quot;.  Doesn&#039;t everyone do this?  

Our product, LingPipe, has general high-level interfaces that are uniform across applications for everything from spelling correction to classification to tokenization, sentence detection, part-of-speech tagging and entity extraction.  

Uniformity and portability I understand (up to the problem of adapting tag sets and tokenization standards), but how could UIMA help with speed?  To code reusable components in UIMA requires translation into (and often out of) the Common Analysis Structure (CAS) that handles &quot;data exchange&quot; among modules.  For third parties, this is prohibitive, because I need to translate our tokens into UIMA tokens and back again before I send them to a tagger.  Of course, I can just wrap tokenization, tagging, sentence extraction and entity extraction in a single UIMA module, but that defeats plug-and-play portability.

UIMA isn&#039;t unusual with respect to streaming.  We can do named entity annotation on multi-GB XML docs with very low memory overhead using the SAX parser and our generic entity extraction interface, with a single model shared across threads.  

Is anyone seeing demand for UIMA outside of government-funded research?  We&#039;re still debating whether it&#039;s worth our time to write more general and complete UIMA wrappers for LingPipe than have been contributed by third parties.</description>
		<content:encoded><![CDATA[<p>We typically tell our potential customers that not only don&#8217;t we have any magic pixie dust, no one does.  It&#8217;s best to cut out the hype up front!</p>
<p>I didn&#8217;t understand Marie&#8217;s point about how their key differentiator is that they&#8217;ve &#8220;completely decoupled the language model from the code that runs the analysis&#8221;.  Doesn&#8217;t everyone do this?  </p>
<p>Our product, LingPipe, has general high-level interfaces that are uniform across applications for everything from spelling correction to classification to tokenization, sentence detection, part-of-speech tagging and entity extraction.  </p>
<p>Uniformity and portability I understand (up to the problem of adapting tag sets and tokenization standards), but how could UIMA help with speed?  To code reusable components in UIMA requires translation into (and often out of) the Common Analysis Structure (CAS) that handles &#8220;data exchange&#8221; among modules.  For third parties, this is prohibitive, because I need to translate our tokens into UIMA tokens and back again before I send them to a tagger.  Of course, I can just wrap tokenization, tagging, sentence extraction and entity extraction in a single UIMA module, but that defeats plug-and-play portability.</p>
<p>UIMA isn&#8217;t unusual with respect to streaming.  We can do named entity annotation on multi-GB XML docs with very low memory overhead using the SAX parser and our generic entity extraction interface, with a single model shared across threads.  </p>
<p>Is anyone seeing demand for UIMA outside of government-funded research?  We&#8217;re still debating whether it&#8217;s worth our time to write more general and complete UIMA wrappers for LingPipe than have been contributed by third parties.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

