Posted on Dec 23, 2008

See Protected Content – Be the Googlebot

I’m sure most of us have had this problem: you search for something with Google, see a result that is exactly what you’re looking for, and then click through only to be told that you must log in to see the content or (gasp!) pay to see it in full. Well, there is a simple way around this: pretend to be Google!

Sites like these detect when Google is crawling their site and show the full content, but don’t give regular people the same luxury. Sites generally detect this by looking at the user agent, which is a string that identifies what kind of browser/device is requesting the page. Therefore, if we change our user agent string to match that of the Googlebot (the system Google uses to find content on the web), we can see the content that was indexed to make the search result we found.
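
To make the idea concrete, here is a rough sketch of the kind of check such a site might run on its server. This is only my guess at the logic, not code from any real site: the function names, the teaser length and the "log in" message are all made up for illustration.

```python
def looks_like_googlebot(user_agent: str) -> bool:
    """Naive check: does the User-Agent header mention Googlebot at all?"""
    return "googlebot" in user_agent.lower()


def render_article(full_text: str, user_agent: str, teaser_chars: int = 80) -> str:
    """Return the full article to the crawler and a short teaser to everyone else.

    Hypothetical helper: teaser_chars and the trailing message are assumptions.
    """
    if looks_like_googlebot(user_agent):
        return full_text
    return full_text[:teaser_chars] + "... [log in to read more]"
```

Because the only thing being inspected is a string that the visitor supplies, anyone who sends the right string gets the crawler's view of the page.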

There are various ways to change the user agent that your browser is reporting, including the great User Agent Switcher extension for Firefox, or the IE7Pro plugin for Internet Explorer 7.

Whichever way you go about it, you want to change your user agent string to:

Googlebot/1.0 (googlebot@googlebot.com http://googlebot.com/)

You may have to restart your browser for the change to take effect, but after that you should be able to go back to the site you were trying to access and see exactly what Google does – the content!
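
If you would rather script this than change your browser settings, the same trick can be sketched in a few lines of Python using the standard library's urllib. Treat it as a minimal example under a few assumptions: the helper names are mine, the user agent is the string quoted above, and the URL in the usage note is a placeholder. It will only help on sites that really do decide based on the user agent alone.

```python
import urllib.request

# The user agent string quoted in the post above.
GOOGLEBOT_UA = "Googlebot/1.0 (googlebot@googlebot.com http://googlebot.com/)"


def build_request(url: str, user_agent: str = GOOGLEBOT_UA) -> urllib.request.Request:
    """Create a request object that reports the given user agent string."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str = GOOGLEBOT_UA, timeout: float = 10.0) -> str:
    """Download the page while reporting the given user agent and return its text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling something like `fetch("http://example.com/some-article")` (a placeholder address) should then return whatever HTML the site serves to a visitor claiming to be the Googlebot.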

  • I love this. Bravo
  • Absolutely Dave, that's a great tip. The cache doesn't always work so well, but it's an easy thing to try.
  • Dave
    Also, you may need to disable your CSS rendering to see the cache. This can be done with one click if you use the Web Developer Firefox extension - https://addons.mozilla.org/firefox/addon/60
  • Dave
    Another way to avoid their gotcha is to read the cached version instead. This is easy when coming from search results because you can just whack the 'Back' button and hit the Google cache link (if they haven't disabled it). I use the cache by reflex for sites like expertsexchange.com.

    As for the ethics of depriving them of a signup etc, it's great as far as I'm concerned because they're rorting their search ranking - never share with Google what you're not prepared to share with Google's users.