Facebook can crawl you, but you can’t crawl Facebook.
Facebook crawls your Like-button-enabled web pages, which can include FBML.
Which is super cool of course.
When does Facebook scrape my Page? “Even if you specify a longer time, Facebook will scrape your page every 24 hours.”
The user agent of the scraper is: “facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)”
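If you want to spot these visits in your own logs or request handlers, a minimal sketch follows. It assumes that matching the `facebookexternalhit` token in the User-Agent header is good enough; the helper name `is_facebook_scraper` is made up for illustration, and a User-Agent can be spoofed, so treat the result as a hint rather than proof.

```python
# Token taken from the scraper's user agent string quoted above.
FACEBOOK_SCRAPER_TOKEN = "facebookexternalhit"


def is_facebook_scraper(user_agent):
    """Return True if a User-Agent header looks like Facebook's scraper.

    Hypothetical helper: it only does a case-insensitive substring match,
    and treats a missing header (None) as "not the scraper".
    """
    return FACEBOOK_SCRAPER_TOKEN in (user_agent or "").lower()


if __name__ == "__main__":
    ua = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
    print(is_facebook_scraper(ua))
    print(is_facebook_scraper("Mozilla/5.0"))
```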
However, you can’t crawl Facebook. They still block everyone by default via robots.txt.
Unless you’re one of the big boys, you can’t play with Facebook without being explicitly whitelisted.
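To make the whitelist model concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content below is invented for illustration and is not Facebook's actual file; `may_crawl` and the `ExampleCrawler` agent name are likewise hypothetical. It shows the general pattern: a named crawler gets rules of its own, and everyone else falls through to a blanket `Disallow: /`.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt in the "whitelist a few, block the rest" style.
# This is NOT a copy of any real site's file.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def may_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page"
    # The whitelisted crawler versus an unknown one.
    print(may_crawl(HYPOTHETICAL_ROBOTS_TXT, "Googlebot", page))
    print(may_crawl(HYPOTHETICAL_ROBOTS_TXT, "ExampleCrawler", page))
```

A crawler that isn't on the list has no way in short of asking the site owner to add it, which is exactly the scaling problem.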
It’s not really the end of the world until you realize that the ENTIRE web would melt if everyone required explicit permission from millions of websites just to crawl the web.
It’s an ask-for-permission model, not an ask-for-forgiveness one, and that doesn’t scale.