Google Duplicate Content Penalty Killing my Blog SEO?

The other day I did a blog post on ethernet latency.

The next morning I was number two in Google for this search term. I was actually planning a blog post about how quickly I owned the keywords and how blogs are doing a great job at SEO.

Five minutes ago I did a search and I’m not even on the first page!

What is on the first page? Well, my post as it was re-syndicated on Planet MySQL (which isn’t even a permalink to the real content, btw).

I’m not on the second page either. Rojo pushed me down with their RojoLink of my content.

I am on the third page, mind you, but it’s a link to my blog rather than a link to the post.

What’s going on here? This looks like a duplicate content penalty. The odd thing is that they seem to have gotten confused and pushed my story (which they had obviously indexed first) down in the index.

This is a bit frightening. Could someone use this to censor Google? All it would take is a few bloggers to coordinate an attack and republish the same content. All the posts in the thread would get a duplicate content penalty and all the stories would be shifted down in the index.

One note regarding Planet MySQL. I think they/we should set up a robots.txt file blocking the site from Google if this is going to be a problem going forward.
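
As a rough sketch of what such a robots.txt could look like, the snippet below parses a hypothetical file with Python's standard `urllib.robotparser` and checks who would be allowed to crawl. The rules and the `planet.example.com` host are illustrative assumptions, not Planet MySQL's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an aggregator: keep Google's crawler out of
# the whole site, leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def can_fetch(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Placeholder URL for a syndicated copy of a post.
    url = "http://planet.example.com/some-syndicated-post"
    print("Googlebot allowed:", can_fetch("Googlebot", url))
    print("OtherBot allowed:", can_fetch("OtherBot", url))
```

Blocking the whole aggregator is the blunt version of the idea; a narrower `Disallow` path would be needed to keep only the syndicated copies out of the index.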


  1. Planet MySQL has about 125,000 inbound links, including MySQL corporate, the Wikipedia page for MySQL, and more. How did you arrive at the dupe content conclusion? Perhaps you were just pushed around by a few ranked heavyweights.

  2. I’ve never been one to run from competition but certainly not from my own content.

    Google should have NO problem figuring out that I’m the authoritative source for this since I was #1 in their index before.

    One explanation COULD be that they checked my site first, found the content, but checked Planet MySQL later and then gave them the benefit of the doubt.

    Kevin

  3. In my experience, Google results can vary widely from day to day. Are you sure you haven’t just hit a Google dance during the two-step?

    I think adding a robots.txt file might be overkill for this situation. Let’s give it at least a week and revisit, ok?

  4. Kevin,

    A few items. First, blog ranking in Google seems to be time based. Blogs often have discussions of current events, so entries are considered more relevant when they are recent. Not always true for technical blogs but we’re a minority :)

    Regarding hurting competitors on Google: there are a number of known ways, e.g. Google bombing. The claim that no one can affect your ranking but you is marketing. Of course Google can’t come out and say, yes, we rank sites in such a way that you can be pushed down by your competitors.

    Regarding Rojo and such: I’m not sure if you have given explicit permission to republish the full content, but there are many sites out there which abuse blogs by republishing content for SEO reasons, masking it as providing a “valuable service”.

    I was all for having my content fully syndicated on PlanetMySQL for a while because that was a free service. Now I see Google AdSense has appeared and I can see MySQL is making money by republishing my posts, which makes me less comfortable.

  5. It’s not Google, it’s just what you get when your content is aggregated everywhere.

  6. Well, it’s pretty obvious that the more sites you’re “synced” on, or reblogged on, the more your content sort of becomes “theirs” in the eyes of search engines. Especially when they’re ranked a lot higher than your blog.

    In actual fact, fewer people then tend to feed from your RSS as well, which might play into negating your importance.

    I agree with Jay that adding a robots.txt might be a little overkill, but I think this is the “price we pay” for being synced on other planets and what not.

    FWIW, I just searched for “ethernet latency” and actually didn’t find either planetmysql or your blog within the first 20 pages. It’s around number 26 or something (and still no planetmysql :P)

    Looks like Google has “fixed” itself over time.

  7. This is a good example of why I think mainstream web search needs to think differently about social media and non social media. The ecology (in terms of both linking semantics and content) is fundamentally different and yet they are mixing the two spaces up in their search results.

    See more here:

    http://datamining.typepad.com/data_mining/2006/11/the_big_web_spl.html

  8. Hi Kevin,

    Eventually, other blogs that are syndicated on PlanetMySQL are also going to experience the duplicate penalty issue. This problem gets bigger since now PlanetMySQL uses AdSense which can cause an even stiffer check of duplicate content.

    Robots.txt implementation on PlanetMySQL will be a sure shot save for blogs that are synched. Otherwise, at the very least, the Adsense ads from PlanetMySQL should go away especially since some bloggers, like me, also use Adsense on their own blogs.

    And yes, the Google dupe penalty is for real.

    Frank

  9. One more thing. The site that has more inbound links (PlanetMySQL) will be considered more authoritative than an individual blog by Google.

    Frank

  10. Peter,

    The AdSense on PlanetMySQL is, for now, an experiment to try to raise money to support ongoing community efforts. We’re doing it for a short time to see if there is concrete benefit, so for the time being, bear with us. Thanks!

    Jay

  11. Hey Jay.

    Free BEER would be a GREAT community benefit ;)

    Would be an awesome use of adsense ;)

  12. I have to agree with Jay Pipes. Google results can vary from day to day; some people call this the “everflux”. Sometimes Google results are more unsettled than at other times.

    There is a ‘timeliness’ factor in the ranking algorithms though this is usually given greater weight in the news or blog search portion of Google.

    The various international Googles can have different rankings for the same term. There is a home country advantage factor in the algorithm as well, though, as with everything to do with search engine rankings, there is a great deal of speculation. This international variance may account for the different results observed by various blog readers.

  13. you’re up at #2 now.

    i guess you need to ask yourself, is it important that the people read your message, or read your blog.

    if you want your message/idea to get distributed, then aggregators are for you.

    you can also compare how many visits you get from places like planetmysql vs. how many you get from a google search on ‘ethernet latency’.

    my blog (which isn’t as popular as yours) gets more traffic from direct links, not google. and that happens because I’m syndicated in various aggregators and others read it there.

  14. Here’s a quick solution. When syndicating your content, include all your information (author info, website, etc.) at the bottom of each piece of content. So that your site visitors don’t have to see this at the bottom of every piece of content, hide it using CSS positioning. When the content is syndicated, your info will appear at the bottom because there is no link to the style sheet in the content. This is useful for print materials as well. Alter your print style sheet to include the information and you can ensure that whether someone copies and pastes the content or prints it, you are credited for your work.
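
The idea in the last comment can be sketched roughly as follows. The helper name `with_credit`, the class name `syndication-credit`, and the CSS rule in the comment are all hypothetical; the sketch only shows a credit line being appended to the post body that gets syndicated.

```python
from html import escape

# Hypothetical class name. The blog's own stylesheet would hide it, e.g.:
#   .syndication-credit { position: absolute; left: -9999px; }
# Aggregators that drop the stylesheet would show the credit as plain text.
CREDIT_CLASS = "syndication-credit"


def with_credit(body_html, author, site_url):
    """Append an author credit paragraph to the HTML body of a post."""
    credit = (
        '<p class="{cls}">Originally published by {author} '
        'at <a href="{url}">{url}</a>.</p>'
    ).format(
        cls=CREDIT_CLASS,
        author=escape(author),
        url=escape(site_url, quote=True),
    )
    return body_html + "\n" + credit


if __name__ == "__main__":
    # Placeholder author and URL for illustration only.
    print(with_credit("<p>Post body</p>", "Kevin", "http://example.com/blog"))
```

Whether an aggregator keeps the `class` attribute or strips markup entirely varies, so the credit text is written to read sensibly either way.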





