Technorati’s Poor Permalink Isolation

Why is it that Technorati has bugs which have been known for years yet they never seem to get around to fixing them?

It’s the only search engine I know that’s nondeterministic. How’d they pull that off? Search one minute and you’ll get “no results” but search 30 seconds later and you get 150 results. Fun.

I just did a link cosmos search for my blog and I get this (see below), which is a clear example of the problems they have. Two posts show twice (for a total of four items). Duplicate content penalty, guys!

Update:

Both of these blogs use the rel-bookmark microformat. Ouch.
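For the curious, here is a rough sketch of what isolating permalinks with rel-bookmark could look like. I have no idea how Technorati's crawler actually works, so everything here is an assumption on my part: the class name `BookmarkLinkParser`, the helper `unique_permalinks`, and the sample HTML with its example.com URLs are all made up for illustration. The idea is just to collect every `<a>` whose `rel` includes `bookmark` and keep each permalink once, so a post that appears twice on a page is only counted as one item.

```python
from html.parser import HTMLParser


class BookmarkLinkParser(HTMLParser):
    """Collect href values from <a> tags whose rel attribute includes 'bookmark'."""

    def __init__(self):
        super().__init__()
        self.permalinks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel is a space-separated list of tokens, e.g. rel="nofollow bookmark"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "bookmark" in rel_tokens and href:
            self.permalinks.append(href)


def unique_permalinks(html):
    """Return rel-bookmark permalinks in first-seen order, without duplicates."""
    parser = BookmarkLinkParser()
    parser.feed(html)
    parser.close()
    seen = set()
    unique = []
    for link in parser.permalinks:
        if link not in seen:
            seen.add(link)
            unique.append(link)
    return unique


# Hypothetical page: post A is linked twice (post body and sidebar).
SAMPLE_HTML = """
<div class="post"><a rel="bookmark" href="http://example.com/2006/12/post-a">Post A</a></div>
<div class="post"><a rel="bookmark" href="http://example.com/2006/12/post-b">Post B</a></div>
<div class="sidebar"><a rel="bookmark" href="http://example.com/2006/12/post-a">Post A</a></div>
<a href="http://example.com/about">About</a>
"""

print(unique_permalinks(SAMPLE_HTML))
```

Keying on the permalink rather than on the page where the link was found is the whole point: the same post can show up on the home page, an archive page, and the post page itself, but it should still count once.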

200612242038


  1. Totally! I seem to bitch about this to some extent or another at least once a month. They totally stopped indexing one blog of mine for 25ish days, and then it finally caught up :) (yet link cosmos for newer posts still worked fine) The way you jump from 0 results to 1000 results is pretty crazy too. I can only imagine they have a database system that’s spread across a flaky network or has trouble handling load.

  2. I’ve never really understood what Technorati is crawling and indexing. I regularly see, for my blog, the same links from blog home pages (not posts) given time stamps of what I guess is the date of crawling. They should look into separating blog *posts* from blogs (if they are intentionally indexing both), or fix their system so they just crawl one (posts). Recently, as well as returning no results, I’ve seen some very odd problems with their ability to order posts by time (page through results ordered by ‘freshness’ – though perhaps that again means ingest date, not the date of the post).

  3. Hey, let me know if there is something about my blog that is causing the problem. I was not aware that I was using the rel-bookmark or any other microformat.

  4. Adam, the microformat was just part of your default template.

    I think this is more of a Technorati bug than a bug with your site.

    I wonder if Technorati does the same thing with my blog.





