Google Needs to Open Google News to Crawlers

Google News has long had a robots.txt file that prohibits others from crawling its content. That is a bit hypocritical (and possibly evil), since Google requires others to remove their robot blocks in order to build its service in the first place.

Google crawls news sites and grabs their content for republishing on Google News. It relies on those news sites being willing to allow this in order to get distribution on Google. But Google restricts others from crawling Google News itself via its robots.txt file and terms of use, which state that “you may not…use any robot, spider, other device or manual process to monitor or copy any content from the [Google News] Service.”
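For readers unfamiliar with how a robots.txt block works in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text, the `ExampleBot` user agent, the `news.example.com` URL, and the `may_crawl` helper are all made up for illustration; this is not the actual Google News file, just the general shape of a blanket block.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from every path.
# This is an illustrative assumption, NOT the real Google News file.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are placeholders, not real services.
    allowed = may_crawl(
        HYPOTHETICAL_ROBOTS_TXT, "ExampleBot", "https://news.example.com/stories"
    )
    print("allowed" if allowed else "blocked")
```

A well-behaved crawler runs a check like this before fetching a page, which is why a single `Disallow: /` line is enough to shut polite robots out of a whole site.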


