<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Google miner</title>
	<atom:link href="http://hjalli.com/2003/10/14/google-miner/feed/" rel="self" type="application/rss+xml" />
	<link>http://hjalli.com/2003/10/14/google-miner/</link>
	<description>Technology and other wonders / Tækni og fleiri undur veraldar</description>
	<lastBuildDate>Mon, 06 Feb 2012 11:39:11 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: °°gummi°°</title>
		<link>http://hjalli.com/2003/10/14/google-miner/#comment-15</link>
		<dc:creator><![CDATA[°°gummi°°]]></dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://hjalli.com/?p=117#comment-15</guid>
		<description><![CDATA[Yes, it would be great to have a proper hands-on tool to dig into the internet for data. But as for this Google functionality, I would think it would be hard to get anything more than the most obvious data from it (for historical events, company info and most encyclopedic data it would be very difficult to surpass Atomica/Gurunet). But individual data, such as management and employees of companies (and dare I say email addresses), as you point out, might be extracted with more ease than with regular crawlers.

What I would be most interested in would be a data collector/crawler with a proper query interface that utilizes not only the data on web pages but also the info they provide access to, such as phone directories. This would of course drive everybody up the wall, but it would be very cool. (Apparently Rumsfeld also thought this would be cool, but it looks like &lt;a href=&quot;http://www.eff.org/Privacy/TIA/overview.php&quot;&gt;TIA&lt;/a&gt; has been stopped/postponed.)]]></description>
		<content:encoded><![CDATA[<p>Yes, it would be great to have a proper hands-on tool to dig into the internet for data. But as for this Google functionality, I would think it would be hard to get anything more than the most obvious data from it (for historical events, company info and most encyclopedic data it would be very difficult to surpass Atomica/Gurunet). But individual data, such as management and employees of companies (and dare I say email addresses), as you point out, might be extracted with more ease than with regular crawlers.</p>
<p>What I would be most interested in would be a data collector/crawler with a proper query interface that utilizes not only the data on web pages but also the info they provide access to, such as phone directories. This would of course drive everybody up the wall, but it would be very cool. (Apparently Rumsfeld also thought this would be cool, but it looks like <a href="http://www.eff.org/Privacy/TIA/overview.php">TIA</a> has been stopped/postponed.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hjalli</title>
		<link>http://hjalli.com/2003/10/14/google-miner/#comment-16</link>
		<dc:creator><![CDATA[Hjalli]]></dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://hjalli.com/?p=117#comment-16</guid>
		<description><![CDATA[I don&#039;t think this tool would surpass specialized databases such as those found, for example, in &lt;a href=&quot;http://www.gurunet.com/&quot;&gt;Atomica/Gurunet&lt;/a&gt;, &lt;a href=&quot;http://www.cia.gov/cia/publications/factbook/&quot;&gt;CIA&#039;s World Factbook&lt;/a&gt; for country info, &lt;a href=&quot;http://www.imdb.com/&quot;&gt;IMDB&lt;/a&gt; for movie facts or &lt;a href=&quot;http://www.amazon.com/&quot;&gt;Amazon&lt;/a&gt; for book info. What the Google Miner could do is help us build new and even more specialized databases of this kind, and take out a lot of the manual labor in doing so.

I totally agree on the value in linking to other sources of information as well (e.g. phone directories as you mention). This was partially what I meant by &quot;sites you believe are likely and reliable sources of the wanted information&quot;.

So I believe we have a new feature request for our application: One of the &quot;tricks&quot; should be allowing the user to define queries to available online (and probably also local) resources.]]></description>
		<content:encoded><![CDATA[<p>I don&#8217;t think this tool would surpass specialized databases such as those found, for example, in <a href="http://www.gurunet.com/">Atomica/Gurunet</a>, <a href="http://www.cia.gov/cia/publications/factbook/">CIA&#8217;s World Factbook</a> for country info, <a href="http://www.imdb.com/">IMDB</a> for movie facts or <a href="http://www.amazon.com/">Amazon</a> for book info. What the Google Miner could do is help us build new and even more specialized databases of this kind, and take out a lot of the manual labor in doing so.</p>
<p>I totally agree on the value in linking to other sources of information as well (e.g. phone directories as you mention). This was partially what I meant by &#8220;sites you believe are likely and reliable sources of the wanted information&#8221;.</p>
<p>So I believe we have a new feature request for our application: One of the &#8220;tricks&#8221; should be allowing the user to define queries to available online (and probably also local) resources.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

