<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Site and feed changes coming soon</title>
	<atom:link href="http://www.texttechnologies.com/2006/11/17/site-and-feed-changes-coming-soon/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.texttechnologies.com/2006/11/17/site-and-feed-changes-coming-soon/</link>
	<description>Understanding technology ... in both senses of the phrase</description>
	<lastBuildDate>Thu, 19 Jan 2012 17:11:01 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
	<item>
		<title>By: Francesco Sclano</title>
		<link>http://www.texttechnologies.com/2006/11/17/site-and-feed-changes-coming-soon/#comment-2282</link>
		<dc:creator>Francesco Sclano</dc:creator>
		<pubDate>Mon, 20 Nov 2006 22:11:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.texttechnologies.com/2006/11/17/site-and-feed-changes-coming-soon/#comment-2282</guid>
		<description>Hi everybody!
TermExtractor, my master's thesis, is online at the
address http://lcl2.di.uniroma1.it.

TermExtractor is a FREE and high-performing software package for Terminology
Extraction. The software helps a web community
extract and validate the relevant terms of its
domain of interest from a submitted archive of
domain-related documents in any of these formats
(txt, pdf, ps, dvi, tex, doc, rtf, ppt, xls, xml,
html/htm, chm, wpd, and also zip archives).

TermExtractor extracts the terminology consensually
referred to in a specific application domain. The
software takes as input a corpus of domain documents,
parses the documents, and extracts a list of
&quot;syntactically plausible&quot; terms (e.g. compounds,
adjective-nouns, etc.).
Document parsing assigns greater importance
to terms with emphasized text layout (title, bold, italic,
underlined, etc.). Two entropy-based measures, called
Domain Relevance and Domain Consensus, are then used.
Domain Consensus is used to select only the terms
which are consensually referred to throughout the corpus
documents. Domain Relevance is used to select only the terms
which are relevant to the domain of interest. Domain
Relevance is computed with reference to a set of
contrastive terminologies from different domains.
Finally, extracted terms are further filtered using
Lexical Cohesion, which measures the degree of
association of all the words in a terminological
string. 
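
As a rough illustration only (this is not TermExtractor's actual code),
an entropy-based consensus score could be sketched as below. The function
name domain_consensus, the toy corpus, and the exact normalization are
assumptions made for the example: a term spread evenly over many
documents gets a high score, a term confined to one document gets zero.

```python
import math
from collections import Counter


def domain_consensus(term, documents):
    """Entropy of a term's frequency distribution across documents.

    Assumption: each document is a list of candidate terms, and the
    term's per-document counts are normalized to sum to 1.
    """
    counts = [Counter(doc)[term] for doc in documents]
    total = sum(counts)
    if total == 0:
        return 0.0
    # Skip documents where the term is absent (0 * log 0 is taken as 0).
    probs = [c / total for c in counts if c]
    return sum(-p * math.log2(p) for p in probs)


# Hypothetical toy corpus: three documents, already split into terms.
corpus = [
    ["neural network", "training set", "neural network"],
    ["neural network", "loss function"],
    ["neural network", "training set", "training set", "training set"],
]

for term in ("neural network", "training set", "loss function"):
    print(term, round(domain_consensus(term, corpus), 3))
```

In this sketch a term would be kept only if its score exceeds some
threshold, which would have to be tuned for the corpus at hand.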

--
Francesco Sclano
home page: http://lcl2.di.uniroma1.it/~sclano
msn:       francesco_sclano@yahoo.it
skype:     francesco978</description>
		<content:encoded><![CDATA[<p>Hi everybody!<br />
TermExtractor, my master's thesis, is online at the<br />
address <a href="http://lcl2.di.uniroma1.it" onclick="javascript:pageTracker._trackPageview('/outbound/comment/lcl2.di.uniroma1.it');" rel="nofollow">http://lcl2.di.uniroma1.it</a>.</p>
<p>TermExtractor is a FREE and high-performing software package for Terminology<br />
Extraction. The software helps a web community<br />
extract and validate the relevant terms of its<br />
domain of interest from a submitted archive of<br />
domain-related documents in any of these formats<br />
(txt, pdf, ps, dvi, tex, doc, rtf, ppt, xls, xml,<br />
html/htm, chm, wpd, and also zip archives).</p>
<p>TermExtractor extracts the terminology consensually<br />
referred to in a specific application domain. The<br />
software takes as input a corpus of domain documents,<br />
parses the documents, and extracts a list of<br />
&#8220;syntactically plausible&#8221; terms (e.g. compounds,<br />
adjective-nouns, etc.).<br />
Document parsing assigns greater importance<br />
to terms with emphasized text layout (title, bold, italic,<br />
underlined, etc.). Two entropy-based measures, called<br />
Domain Relevance and Domain Consensus, are then used.<br />
Domain Consensus is used to select only the terms<br />
which are consensually referred to throughout the corpus<br />
documents. Domain Relevance is used to select only the terms<br />
which are relevant to the domain of interest. Domain<br />
Relevance is computed with reference to a set of<br />
contrastive terminologies from different domains.<br />
Finally, extracted terms are further filtered using<br />
Lexical Cohesion, which measures the degree of<br />
association of all the words in a terminological<br />
string. </p>
<p>&#8211;<br />
Francesco Sclano<br />
home page: <a href="http://lcl2.di.uniroma1.it/~sclano" onclick="javascript:pageTracker._trackPageview('/outbound/comment/lcl2.di.uniroma1.it');" rel="nofollow">http://lcl2.di.uniroma1.it/~sclano</a><br />
msn:       <a href="mailto:francesco_sclano@yahoo.it">francesco_sclano@yahoo.it</a><br />
skype:     francesco978</p>
]]></content:encoded>
	</item>
</channel>
</rss>

