TypePad and WordPress should ship a blog directory.

Here’s an interesting problem: how do you get a handle on all WordPress or TypePad blogs? Right now you can’t. You could accept pings, but Six Apart doesn’t send pings anymore. Most ping traffic is spam anyway, so you’d end up wasting a ton of CPU time filtering it. TypePad also supports domain masking, where the blog URL is feedblog.org rather than feedblog.typepad.com, which means a lot more work is required to verify that a ping actually came from TypePad.

If these guys were to simply publish a static XML dump of all their blog URLs, this would be 95% of the way there. It would make writing tools much easier for developers and, I think, open up space for innovation. For example, you could write a tool that shows hot new WordPress blogs. Or you could write a Six Degrees of Six Apart tool, similar to the Six Degrees of Wikipedia hack.

The XML format doesn’t matter. It could be as simple as:

echo "SELECT URL FROM WEBLOGS" | mysql --xml

That would get us 95% of the way there…
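To make the consumer side concrete, here is a minimal sketch of a tool that reads such a dump and reports blogs that weren't in the previous one, which is the first step toward a "hot new blogs" list. It assumes the dump looks like the output of the mysql client's XML mode (a resultset of row elements, each with a field named URL). The sample data, the example.typepad.com URL, and the function names blog_urls and new_blogs are made up for illustration; neither TypePad nor WordPress publishes anything like this today.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape the mysql client emits in XML mode.
# A real dump (if one existed) would be fetched from the vendor instead.
SAMPLE_DUMP = """<?xml version="1.0"?>
<resultset statement="SELECT URL FROM WEBLOGS"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
    <field name="URL">http://feedblog.org/</field>
  </row>
  <row>
    <field name="URL">http://example.typepad.com/</field>
  </row>
</resultset>
"""


def blog_urls(xml_text):
    """Return the set of blog URLs found in the dump."""
    root = ET.fromstring(xml_text)
    urls = set()
    for field in root.iter("field"):
        # Only keep the URL column; skip empty (NULL) values.
        if field.get("name") == "URL" and field.text:
            urls.add(field.text.strip())
    return urls


def new_blogs(previous, current):
    """Return URLs present in the current dump but not the previous one."""
    return sorted(current - previous)


if __name__ == "__main__":
    yesterday = {"http://feedblog.org/"}  # pretend this came from an older dump
    today = blog_urls(SAMPLE_DUMP)
    for url in new_blogs(yesterday, today):
        print(url)
```

Because the dump is just a flat list of URLs, diffing two snapshots is a set difference. Domain-masked blogs like feedblog.org show up in the list directly, so there's nothing to verify.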

  1. You raise a good point… we’ve put work into making the Six Apart Update Stream as comprehensive as possible for just this reason. The goal is to not have spam in the stream in order to simplify/reduce your indexing/filtering burden as well as to improve the posting experience for bloggers so that ping failures won’t affect them.

    Overall, it’s pretty interesting to have a record of all the blogs, but probably more useful to know when they update and what the content is. So that’s where we’re focusing for now, but we’ll definitely make note that there are requests for the comprehensive list as well.

