Archive for the ‘blogs’ Category

It might be true that the English blogosphere has peaked.

Check out the breakdown of the top WordPress blogs. Lots of non-English blogs there. Does Scoble’s site count as English or gibberish? (Just joking, Robert.)

Is it possible that the portalsphere could flatten by 2010?

… we believe that the Internet is moving away from big centralized portals, which have gathered the lion’s share of Internet traffic, towards a pattern where traffic is generally much flatter. The mountains, if you will, continue to exist. But the foothills advance and take up more of the overall pie. Fred Wilson had a post earlier this week about the de-portalization of the Internet which is essentially making the same point when seen from the point of view of Yahoo.

I certainly believe the space will flatten out a great deal. I think the big opportunity is for smaller publishers to flatten the space. I think this is what Keith is probably focusing on. His quote of $180k per month seems to come from the publisher angle.

I’m not sure I agree that the portals will vanish. We’ve always seen a power law curve and I believe this is a natural state of the Internet. It may become less severe over time (and certainly this is a good thing) but even in the blogosphere (which is much more open) we’re seeing a power law distribution.

It looks like Yahoo might have killed blog search.

Yet if you do a search it returns results from blogs:

[Screenshot of the search results]

Isn’t Lifehacker a blog? I think it’s even a blog by Scoble’s definition! Scoble blog == sclob?

This is the elephant in the blog search corner that Technorati doesn’t want to talk about. Most consumers just want to search. They don’t think about blogs. If you’re searching Google for ‘Firefox’ and the top result is from a blog and the second result is mozilla.org why would you not show the first result?

The trick is to ship a decent algorithm that can build a composite index. Google does a very good job using a sorting function that factors in both time and relevance. Rojo’s blog search back in the day (the version I implemented) did a pretty good job at this as well.
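A sorting function that factors in both time and relevance could look roughly like the sketch below. This is only an illustration, not what Google or Rojo actually shipped: the `Doc` fields, the exponential half-life decay, and the `time_weight` blend are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    url: str
    relevance: float  # hypothetical text-match score in the range [0, 1]
    age_hours: float  # hours since the document was published


def composite_score(doc, half_life_hours=24.0, time_weight=0.5):
    """Blend relevance with freshness (assumed weighting, not a known formula)."""
    # Freshness is 1.0 for a brand new document and halves every half-life.
    freshness = 0.5 ** (doc.age_hours / half_life_hours)
    return (1.0 - time_weight) * doc.relevance + time_weight * freshness


def rank(docs, half_life_hours=24.0, time_weight=0.5):
    """Return documents sorted from highest to lowest composite score."""
    return sorted(
        docs,
        key=lambda d: composite_score(d, half_life_hours, time_weight),
        reverse=True,
    )
```

With `time_weight=0` this degrades to a plain relevance sort, and raising it lets a fresh blog post outrank an older, slightly more relevant page, which is the behaviour a single composite index needs.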

Scoble puts forth his definition of blogs which I find far too narrow.

First. If you’re a blog you apparently have to send pings:

I would go as far as saying that a site that does not ping a pingserver, like weblogs.com, is NOT a blog (private Web sites don’t ping weblogs.com and are NOT discoverable by search engines).

That’s not a very healthy requirement. Were people blogging before ping servers? Yes. If I disable pinging am I still a blogger?

The key issue for me is the fact that 80% of pings are spam and garbage. For the most part they’re useless. Someone really needs to spend some time here and clean this up a bit.

If you’re thinking of writing a new blog search engine or RSS aggregator I’d recommend totally and completely ignoring pings. Tailrank only really uses pings to re-prioritize and we index often enough that it really doesn’t matter that much.
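Using pings only to re-prioritize might look something like this sketch. It is not Tailrank’s code: the `CrawlQueue` class, its method names, and the choice to ignore pings for feeds the crawler does not already know about are assumptions for illustration.

```python
import heapq
import itertools


class CrawlQueue:
    """Every known feed is crawled on a schedule; a ping can only move it earlier."""

    def __init__(self):
        self._heap = []                    # (due_at, tie_breaker, feed_url)
        self._due = {}                     # feed_url -> currently valid due time
        self._counter = itertools.count()  # keeps heap ordering stable on ties

    def schedule(self, feed_url, due_at):
        """Set (or replace) the time at which a feed should next be crawled."""
        self._due[feed_url] = due_at
        heapq.heappush(self._heap, (due_at, next(self._counter), feed_url))

    def ping(self, feed_url, now):
        """Treat a ping as a hint: bring a known feed forward, ignore the rest."""
        if feed_url in self._due and now < self._due[feed_url]:
            self.schedule(feed_url, now)

    def pop_next(self):
        """Return the feed that is due soonest, skipping superseded entries."""
        while self._heap:
            due_at, _, feed_url = heapq.heappop(self._heap)
            if self._due.get(feed_url) == due_at:
                del self._due[feed_url]
                return feed_url
        return None
```

Because every feed is on the schedule regardless, a dropped or spammy ping costs nothing: the worst case is that a post is picked up at its regular crawl time.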

Syndicatable. I can use a news aggregator to read your content, which lets me read a lot more blogs. (I can’t do that with private spaces).

This is a requirement? Technorati doesn’t care if you have an RSS feed. Tailrank doesn’t. Google doesn’t. Having a feed is a great idea but I don’t see this as being a requirement.

Blogging is a social phenomenon not a technical one. Robert should know this. What was new and innovative about blogs was the permalink. The fact that people could link to their ideas and easily post a response.

Whether you send pings or support RSS is totally optional. Ask any 16-year-old MySpace user if they blog and they’ll respond with a resounding yes. Ask them if they know anything about RSS or pings and they’ll just stare at you…

Dave’s a smart guy so I’m sure I’m not following here:

Last night’s conversations were incredibly interesting, the next day I’d like nothing better than to continue them. One thing I wish I had said to Om, so we could have developed the idea (or perhaps he might have disagreed) is my belief that RSS did not come from the tech industry as so many assume — it came from the publishing industry. Why? Well, the ideas in RSS are hardly technologically revolutionary. As many have pointed out, ad nauseum, CDF had some of them, and as you can see in this post from Mary Hodder, there’s no doubt something like it would have come along eventually even if we hadn’t promoted it so aggressively in the late 90s and early 00s.

It wouldn’t have eventually come unless it was pushed.

If the MSM were left to their own devices there would never be feeds. There would be forced registration, a robots.txt which blocks everything, horrible invalid HTML, and content without any links.

It’s their version of DRM…

Even today most MSM lacks any sort of “quality” feeds. For the MSM sites that do have feeds they’re generally of horrible quality. They’re not full-content. You’d be lucky to get a summary let alone any rich HTML.

We’re seeing now with MSM what’s being mirrored in the Entertainment industry. They’re being dragged onto the Internet kicking and screaming and they don’t like it. Things are going to have to get worse before they get better.

Update: Dave sent me an email about this pointing out that we’re now part of the publishing industry. Sort of like Apple is part of the music industry. Very Zen…

Update 2: The major question is whether TailRank is a publishing-savvy technology company or a technology-savvy publishing company :)

Go TypePad Go!

TypePad has been getting into some trouble lately about their service. They’ve had a few service outages which made a few bloggers upset.

They did something right today though and I have to say I was pretty impressed.

When I set up my blog I pointed a DNS A record at the IP address of peerfear.typepad.com because I couldn’t figure out how to set up a CNAME with ZoneEdit. This means that if they ever were to change ISPs my website wouldn’t work. Since this doesn’t happen very often and I just wanted to get my blog working I didn’t really care.

Today they sent me an email warning me that they were about to change and that I should replace my A record with a CNAME.

Very cool. Good catch guys. I honestly didn’t expect you to do this since it was my decision to add an A record (when you suggested a CNAME). Thanks for the heads up!
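To make the difference concrete, here is a toy resolver. It is a sketch only: real DNS involves caching, TTLs and recursive servers, and the `example.org` names and the addresses (taken from the reserved documentation ranges) are made up for illustration.

```python
def resolve(name, records, max_hops=8):
    """Follow CNAME aliases until an A record (an IP address) is reached."""
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: keep resolving the name it points to
    raise RuntimeError("too many CNAME hops")


# Hypothetical zone data: one site pins the host's IP, the other aliases its name.
records = {
    "peerfear.typepad.com": ("A", "192.0.2.10"),
    "pinned.example.org": ("A", "192.0.2.10"),
    "aliased.example.org": ("CNAME", "peerfear.typepad.com"),
}
```

If the host later changes `peerfear.typepad.com` to a new address, `aliased.example.org` follows automatically, while `pinned.example.org` keeps pointing at the old address until someone edits it by hand.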

Here’s an idea. Right now all/most of the existing mashups run on external sites. For example the Craigslist/Google Maps mashup runs on housingmaps.com.

This seems like a major problem since the majority of Craigslist users will never be able to use this functionality. Most normal website users aren’t like us and don’t hunt down mashups.

What if the site could promote the mashup and even have it work inside the site? What if your Web 2.0 app could have skins, plugins, etc.? Officially supported Greasemonkey hacking.

Well I think I’m going to add this to TailRank. I’m just going to provide a mechanism to associate JavaScript and CSS with your account and have a simple plugin repository so that users can select the plugins they want and quickly enable them.

Want a different color? No problem. Just use the CSS colors plugin. Want TailRank to provide inline posting to your blog? Just enable the plugin. Want TailRank to open links in new windows? Just write a plugin.

Of course none of these are written yet but you get the idea.

There are of course cross site scripting vulnerabilities here. Before you could make a plugin available to the public you’d want to verify that the author is reputable. This might be the biggest problem. A company like Amazon would have a harder time than someone like TailRank.
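A rough sketch of the idea follows. None of this exists in TailRank: the `Plugin`, `PluginRepository` and `Account` names are hypothetical, and a hand-maintained list of trusted authors stands in for whatever vetting process would really be needed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Plugin:
    name: str
    author: str
    javascript: str = ""
    css: str = ""


class PluginRepository:
    """Holds public plugins; only vetted authors may publish (assumed policy)."""

    def __init__(self, trusted_authors):
        self._trusted = set(trusted_authors)
        self._plugins = {}

    def publish(self, plugin):
        if plugin.author not in self._trusted:
            raise PermissionError(f"author {plugin.author!r} has not been vetted")
        self._plugins[plugin.name] = plugin

    def get(self, name):
        return self._plugins[name]


@dataclass
class Account:
    username: str
    enabled: list = field(default_factory=list)

    def enable(self, repo, name):
        """Attach a published plugin to this account (no duplicates)."""
        plugin = repo.get(name)
        if plugin not in self.enabled:
            self.enabled.append(plugin)

    def page_assets(self):
        """Return the combined (css, javascript) to serve with this user's pages."""
        css = "\n".join(p.css for p in self.enabled if p.css)
        js = "\n".join(p.javascript for p in self.enabled if p.javascript)
        return css, js
```

Vetting the author at publish time does nothing about a trusted author shipping bad script later, so this only narrows the cross-site scripting problem rather than solving it.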

I had a look at Yahoo Blog Search today. I think Yahoo accidentally lifted the kimono without realizing it.

I was thinking about blogging it but since they were nice enough to give me a Yahoo Mail beta I figured it would be bad karma.

This just dawned on me today. Yahoo should buy FeedBurner.

1. Yahoo is obviously trying to get more into the open content space. Their purchase of blo.gs and the RSS integration into My Yahoo are both solid examples.

2. FeedBurner is a good acquisition target as they’re growing fast and aren’t too big yet.

3. It would 1up Google in that they’re pathetic when it comes to the RSS market (wake up guys!).

4. Yahoo is trying to grow fast. I think Zawodny commented at the SDForum SIG that they have 100 open positions for which they’re actively hiring.

5. They’re obviously going to get into the blog search market. Competition from Google and rumors aside it’s just a good idea. While FeedBurner wouldn’t necessarily give them an advantage here they might be able to use their archive and ping data to help their indexer.

6. Yahoo has free coffee! More caffeine means better feed burning! :)

I haven’t really heard anyone mention this before and it just seems obvious now.

According to Niall, Google is going to release feed search tonight.

Niall speculates:

Feeds come prepackaged as individual items or entries allowing for easy digestion by parsers and indexers. Google would need to overhaul its indexer or design a new and separate indexer specific to blog posts if it would like to include more post content than it is currently pulling down from a page’s link alternate declared feed. Technorati indexes a blog’s HTML assisted by the declared RSS and Atom feed, so I am admittedly a bit biased.

I can’t believe this. Google has engineers who can overcome these problems. For the most part I overcame them while I was at Rojo, and it was mostly just me working on this component. Of course Google only has 70 PhDs and I’m only equivalent to about five… ;)

Niall’s right though. Technorati does more. But this is certainly up their alley and a bit more competition.
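Niall’s point that feeds come prepackaged as individual items is easy to see in code. The sketch below uses Python’s standard XML parser on a made-up RSS 2.0 document; `SAMPLE_RSS` and `parse_items` are names invented for the example, and a real indexer would also have to cope with Atom, namespaces and malformed feeds.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Hello world</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>More text</description>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Split a feed into one record per post, ready to hand to an indexer."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items
```

Getting the same per-post records out of a blog’s HTML page means guessing where each post starts and ends, which is the harder problem Niall is describing.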

The WSJ has a nice fluff (in the best possible sense of the word) piece on blog search engines:

Web logs, online diaries written and published by everyone from college students to big media companies, are being created and updated at an astonishing rate — and established search companies such as Google Inc. and Yahoo Inc. don’t always catch them fast enough. Now, a handful of closely held upstarts such as Technorati Inc., Feedster Inc. and IceRocket.com LLC see an opportunity: Build a search engine that can track the information zipping through blogs, nearly in real time.

We have two main spaces now. RSS search engines and RSS aggregators. There’s a slight amount of overlap but the players that focus seem to be doing well (though there’s room for improvement).

I’d like to see more partnerships develop here. Bloglines using Technorati search or Rojo using Feedster search (and vice-versa). I would have thought we’d have seen this already but I guess not.

FeedBurner seems to be the only one playing across the board. This is of course because users are driving their adoption which is a lot more decentralized if you think about it.

Looks like Ben should win a bloggy for this one:

Also waiting in the wings was the ex-Rojo-RokR Kay@Burton who deep in the corner of the club had a “hunch.. that they needed to ship this API as a way to integrate with a 3rd party and it leaked out.” It also wasn’t surprising that DJ Think Secret remixed The Mini…sampling – “The music player will ditch its hard drive and move entirely to solid state, flash media, a move that sources familiar with the new design say will shave 20 to 25 percent off the size of the unit.”

Wow. This is the funniest post about RSS news I’ve ever read. Except for the Winer Number post.

Vegetarian Gyoza. lol.

Russell notes that his beloved Bloglines is having scaling problems:

Lots of Bloglines folk have pinged me about the fact that my site isn’t updating. Not much I can do. I’ve emailed them several times to fix their “jsession” bug (where they include Java Server Sessions as part of the URL) or to just delete every feed of mine except for the index.rss main feed. But that doesn’t seem to happen. I’ve done redirects on old feed URLs so it should work. I think Bloglines is starting to get crufty – there’s lots and lots of things that aren’t working, and the site is starting to bog down like crazy. I was really hoping that the move to Ask Jeeves would accelerate updates and improvements, not stall them.

I cite this not to pick on Bloglines but to point out that a lot of players in the space are having scaling problems. I wish Bloglines the best of luck. I’m starting to sense a trend.

Launch a site on commodity hardware and software. Commodity hardware holds up but the software doesn’t scale.

It’s time for Internet-scale databases. Enough is enough. Give me clustering or give me death.

I noticed this yesterday. It looks like Yahoo 360 requires authentication.

For example here’s a link to Don Loeb’s Yahoo 360 blog.

Some Yahoo! services, such as Yahoo! Mail and Yahoo! Address Book, require you to periodically enter your password even though you are already signed in. We do this to protect your personal information.

Which isn’t exactly compelling. It isn’t that I have a problem with Yahoo authentication. I certainly understand that they want to find out who’s reading their blogs but it’s really unnecessary from a user perspective.

Every time you put up a barrier for users it just increases transaction costs which means that fewer users will use your service. Speed is the same issue. If your pages are slow people won’t use them.

I can’t think of any other blogging services which require authentication.

Update:

I forgot. The entire reason for this blog post in the first place was to note that with required authentication Yahoo 360 won’t be indexed by anyone that requires HTML (Technorati and Google come to mind). This essentially makes Y360 Blogosphere dark matter since their links never count in many ranking systems.

Nice to see Dave addressing Technorati’s performance and scalability issues directly:

We just weren’t expecting that kind of sudden growth, both on the posting side and also on the search side, and frankly we didn’t plan well enough to handle the load. We’ve been adding new machines to our datacenter – over 400 now – and more coming each week, and we’ve been fixing bugs and making performance enhancements on the web site as well.

We also made some pretty significant performance improvements to keyword search – most now returning in 1 to 2 seconds; you can see some details on those statistics and also a month view.

While improvements are nice, 1-2 seconds just isn’t fast enough. You guys should be aiming for 200-300ms. I’d be the first to admit that this is easier said than done though.

Dave also links to webperformance.org which I haven’t seen before:

A repository of Web performance data on compression, caching, and measurement approaches.


However, Cosmos search (or URL search) is still being worked on, and is often timing out under the increased load. Unfortunately this is also one of the searches that bloggers find most compelling, as it helps you to all know who is linking to your blog, and it is the very first type of search that Technorati made available, so it is near and dear to our hearts. Everyone here also uses it every day, so it really sucks when it isn’t working right

I’ve been playing a lot with Feedster and IceRocket. IceRocket is pretty fast (probably because they have fewer users) but they seem to have a problem with continually finding the same URLs over and over. They also include my own blog in the search results when I link to myself (not helpful).

Either way I still think this space has a lot of growing up to do. A rusting nail in the Trackback coffin I think. This is the problem with a centralized infrastructure. The second your favorite service has problems you’re out of luck.

Nice. PubSub increased the space race here and released their PubSub LinkRanks 1000:

The PubSub LinkRanks 1000 is a list of the most consistently influential sites that publish feeds, based on their average LinkRank scores from August 1st to August 30th, 2005. To learn more about LinkRanks, click here.

To create this list, we’ve averaged the daily LinkRanks of more than 15 million sources. We’ve also included a 15-day average as well as daily LinkRank data for August 30th as additional points of comparison.

This is twice as good as the Feedster 500 and 10x as good as the Technorati Top 100! :). That said it’s still not long tail enough. There are still 14,999,000 blogs that aren’t ranked here (in PubSub’s index). Still pretty damn cool.

I’m nowhere to be seen on the list (feedblog.org). Hopefully it’s not personal and just because my blog is so new.

This is interesting. More meta-ping servers are coming at a fast pace.

Looks like we now have Feed Shark, King Ping, and Blogomatic.

Google Blogoscoped did a random sampling of Blogger and found that a significant percentage (60%) were spam blogs:

Marty Kay made an interesting comment in regard to Splots (spam blogs) on Google’s Blogspot.com:
“Funniest thing I saw was a bunch of comments on one spam/link site, that was totally irrelevant but pointed to ANOTHER spam site. The spammers are spamming each other.”

Read the post though. Very shocking to see the numbers.

Also. Sorry about the lack of posts recently. I’ve been super busy working on related projects and personal time.

Nice! Looks like Rojo has launched their scriptlets support.

So you want to add some Rojo features to your blog or web site? Well, you have come to the right place. Here you will learn how to add some simple scripts to your blog template or web page page that will make some of your Rojo experience available to readers on your site!

The official Rojo blog has more:

We are very happy to announce Rojo Scriptlets! Rojo scriptlets are one line scripts that bloggers and other publishers can use to re-post content from Rojo onto their blog. Rojo users can choose to show the most recent headlines from the feeds they subscribe to in Rojo.

I’m personally excited to see this released because I wrote a prototype of this functionality a while back. Of course 90% of the work is never the initial feature but supporting it and making sure it works in production so hats off to the Rojo gang for making this happen!

There’s more functionality here that Rojo has yet to release so I don’t want to let the cat out of the bag. Needless to say it’s pretty cool.

If you’re reading from RSS make sure to load this blog post in your browser and you’ll see that my blog is now hosting Rojo tags in the right sidebar.

Looks like Kottke is upset about a recent CNET story:

News.com ruminates about Google building a collection of tools that serve as a replacement OS. Where have we heard that recently? You’re welcome for the story idea and thanks for the non-link, guys…tech journalism at its finest. I hereby institute a policy of not linking to you for a year.

This is certainly pushing the envelope a little bit. I can’t claim to know where the story came from or where CNET found its motivation, but I’m willing to bet Kottke’s original post had something to do with it.

Even if it didn’t, had CNET done even a small amount of research they would have realized that he blogged about it 24 hours earlier, and they should have at least linked to him.

Calacanis has more on this meme:

I’ve been giving CNET a hard time for stealing our stories without credit for a while now. It’s not just WIN either, bloggers like John Battelle, Rafat Ali, and Om Malik have caught CNET poaching red handed.

Luckily my blog isn’t good enough to be worth poaching! :)

This is a bigger problem than just CNET. They obviously have a policy of never providing links which is just low. A lot of other MSM publishers seem to have the same policy including CNN, The New York Times, LA Times, etc.

What gives? It would be nice to have a way to protest. Maybe everyone could start using nofollow links to anyone who carries Blogosphere memes without contributing back.

Actually I kind of like that idea!