
Google Blocked From My Site

For my next online “challenge” I received a very worrying message from the Google webmaster tools last week: Google had “noticed” that one of my health sites had blocked access to Googlebot over 13,000 times!

This is what the message said:  “Googlebot couldn’t crawl your URL because your server either requires authentication to access the page, or it is blocking Googlebot from accessing your site.”  They concluded by saying that if that was intentional, fine – otherwise I might like to fix it!

When we all spend so much time and effort trying to make our sites attractive to Google, it’s depressing to say the least, to find that something is blocking access. It’s even more depressing when it turns out you’re blocking Google yourself – which is what emerged.

That particular site is hosted with Hostgator, so I asked for their help and they replied to the effect that my .htaccess file was blocking Google and other search engines. It actually included some code to specifically do this!

Hostgator responded very quickly and were very helpful. They edited the file to remove the offending code, and I followed the Google webmaster instructions to let Google know that the block should now be lifted.
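
For anyone who wants to look through their own .htaccess before calling for help, here is a minimal sketch in Python. It only lists the non-comment lines that mention a crawler by name. The function name, the list of crawler names and the sample file contents are all assumptions made up for illustration; the code that was actually in my file is not shown here.

```python
import re

# Assumption: a short, illustrative list of crawler names to look for.
CRAWLER_NAMES = ("googlebot", "bingbot", "slurp")


def find_crawler_rules(htaccess_text, names=CRAWLER_NAMES):
    """Return (line_number, line) pairs for non-comment lines naming a crawler."""
    pattern = re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
    hits = []
    for number, line in enumerate(htaccess_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and Apache comments, which start with '#'.
        if not stripped or stripped.startswith("#"):
            continue
        if pattern.search(stripped):
            hits.append((number, stripped))
    return hits


# Hypothetical file contents, only to show the kind of rule that can block a crawler.
SAMPLE_HTACCESS = """\
# hypothetical example
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule .* - [F,L]
"""

if __name__ == "__main__":
    for number, line in find_crawler_rules(SAMPLE_HTACCESS):
        print(f"line {number}: {line}")
```

A hit is not proof of a block, since a rule can mention Googlebot in order to allow it, but it shows which lines to ask your host about.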

Of course that didn’t solve the mystery of how it had happened in the first place!

My biggest worry was that there had been a breach of security at Hostgator and someone had been able to edit my .htaccess to insert that blocking code. It certainly wasn’t the type of thing I would have known how to insert even if I wanted to. My passwords and username are not easily guessable, although I suppose there are some clever pieces of hacking software out there.

I quizzed Hostgator further. They checked their logs for me and said there had been no breach at their end, which I’m happy to believe as only one blog was affected – the blog where I had added a new plugin. Hostgator said it looked as if the security plugin I had recently installed had done the blocking, and advised me to change the settings. I’m not going to name the plugin because when I looked at the settings I couldn’t see any that would have had that effect. (I contacted the plugin support team for advice but had no reply – fair enough, it IS a free plugin.)

However, I would be interested to hear from the more experienced bloggers in the community which security plugin they recommend. It needs to be light-weight in both impact on this blog, which already runs like treacle, and on my brain – ditto!

Update: Someone on a help site I use has recommended Better WP Security, so I added that instead of the earlier one and am keeping my fingers crossed!

Which security plugin do YOU recommend?

Update April 2017: I never did solve the problem of security on my site until I moved away from shared hosting. Read the story of my move to managed WordPress hosting.

Joy
 

I left it too late to plan for a financially secure retirement. Don’t make my mistake. Start building an extra income with a part-time (or full-time) business online.

Think you don’t have time? Can’t afford the start-up cost? Can’t meet sales targets? The businesses I promote overcome all the problems you may have had with Internet Marketing before. Contact me for free advice (no obligation) on the best fit for your circumstances.

14 comments
John Collins - March 26, 2014

Hi Joy,

Blocking Googlebot –

It might be in your robots.txt file, which sits in the same top-level folder as the .htaccess file, usually the public_html folder.

Inside the robots.txt file you will see “User-agent:” followed by either an * (asterisk) or a robot/crawler name.

An * (asterisk) is a wildcard. “User-agent: *” means everything below it applies to all robots and crawlers unless additional rules are declared via another User-agent: entry. There could be a Disallow in there blocking something you don’t want blocked from all robots and crawlers. Here’s an example of what might be in a robots.txt file. With WordPress those are the areas I would disallow to all robots. Notice the asterisk after User-agent, meaning this applies to all robots.

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Disallow: /index.php
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Etc…

You can also make rules for just one robot or crawler, in addition to a main entry like the one above. So you might find something like the examples below.

To stop Googlebot from accessing/indexing everything –

User-agent: Googlebot
Disallow: /

Stop Googlebot from accessing/indexing a specific folder –

User-agent: Googlebot
Disallow: /anyfolder/

Stop Googlebot from accessing/indexing a specific folder except for one file in the same folder –

User-agent: Googlebot
Disallow: /anyfolder/
Allow: /anyfolder/salespage.html

So you might want to check and see if the robots.txt file is the culprit.
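
One way to test rules like these before uploading them is a short Python script. This is a minimal sketch, assuming Python's standard urllib.robotparser module and a placeholder site at example.com; the helper name can_fetch and the two rule sets are made up for illustration. Be aware that this parser does simple prefix matching and applies the first rule that matches, so wildcard patterns such as /*.js$ and an Allow placed after a Disallow may be evaluated differently from the way Googlebot treats them.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Rule set that asks Googlebot to stay away from the whole site.
BLOCK_GOOGLEBOT = """\
User-agent: Googlebot
Disallow: /
"""

# Rule set that asks Googlebot to stay out of one folder only.
BLOCK_ONE_FOLDER = """\
User-agent: Googlebot
Disallow: /anyfolder/
"""

if __name__ == "__main__":
    site = "https://example.com"  # placeholder domain
    print(can_fetch(BLOCK_GOOGLEBOT, "Googlebot", site + "/"))
    print(can_fetch(BLOCK_ONE_FOLDER, "Googlebot", site + "/anyfolder/page.html"))
    print(can_fetch(BLOCK_ONE_FOLDER, "Googlebot", site + "/blog/post.html"))
```

Pasting the contents of your own robots.txt into a string and calling can_fetch with "Googlebot" and a page address gives a quick first check; Google's own tester remains the final word on how Googlebot reads the file.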

    Joy Healey - March 28, 2014

    Hi John,

    Thanks for such a detailed explanation. I hadn’t realized those were possibilities. I’ll check out that file too. The change of plug-in seems to have done the trick so far – but I certainly don’t want that happening again.

    Joy

      John Collins - March 31, 2014

      Hi Joy, the 3 places you have access to that you could check yourself are the htaccess file, the robots.txt file and the meta tags. All of these could be modified by a plugin or a person. I use meta tags on my download pages to help hide them.

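      To stop all robots and crawlers: <meta name="robots" content="noindex, nofollow">

      To stop just Googlebot: <meta name="googlebot" content="noindex, nofollow">

      noindex keeps the page out of search and prevents a cached link
      nofollow tells the robot not to follow any of the links on the page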

      The thing is, these 3 methods don’t really stop a robot or crawler unless it wants to stop. They are more like a request. So the legitimate ones like Googlebot will respect the request, but there may be other bots that will trespass regardless of your wishes.
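
To see which of these meta tag requests a page is actually making, the page source can be scanned with a few lines of Python. This is a minimal sketch using the standard html.parser module; the class name, the function name and the sample page are assumptions for illustration, and it only looks at meta tags named "robots" or "googlebot".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in robots/googlebot meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            # Split "noindex, nofollow" into individual lowercase directives.
            found = {part.strip().lower() for part in content.split(",") if part.strip()}
            self.directives.setdefault(name, set()).update(found)


def robots_directives(html):
    """Return a dict such as {"robots": {"noindex", "nofollow"}} for a page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source, for illustration only.
SAMPLE_PAGE = """\
<html><head>
<title>Download page</title>
<meta name="robots" content="noindex, nofollow">
</head><body>Thanks for your order.</body></html>
"""

if __name__ == "__main__":
    print(robots_directives(SAMPLE_PAGE))
```

An empty result means the page carries no such tag, so any blocking would have to come from the .htaccess or robots.txt file instead.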

Sue Worthington - March 26, 2014

Sorry Joy can’t help with this – sounds terrible but I wouldn’t have a clue how to fix it!

    Joy Healey - March 28, 2014

    Hi Sue, I do get ’em don’t I LOL!! Anyway, Bonnie, John and a friend from another community came up with some help, so hopefully my site will live to fight another day 🙂

Bonnie Gean - March 27, 2014

I believe Better WP Security should do just fine for you. You should be all set, as this is the plugin that was recommended to me by a WordPress know-it-all sometime ago. 🙂

Good luck with Google!

Sky Nealon - March 29, 2014

Hi Joy,

It all sounds rather complicated, and sorry I can’t recommend anything as I’m still quite new to blogging. From the comments, though, it seems like you have got the situation under control now thanks to the wonderful fellow bloggers who have helped out. It’s a great community to be with. Please do keep us informed about the progress of the new changes.

Kind regards
Sky

    Joy Healey - March 31, 2014

    Hello Sky,

    I’m famed for hitting problems that most people never even come up against! Yes, I’ve had lots of help from the people here. Better WP Security seems to be doing fine for me at the moment.

    Joy

Jan Kearney - March 29, 2014

Joy – you don’t half pick them! Glad you got the bot blocking sorted.

As for plugins – Better WP Security or WordFence (not both!)

    Joy Healey - March 31, 2014

    Thanks Jan 🙂 I know…. I just go from one crisis to the next!

    Anyway, as Better WP Security is “IN” I’ll stick with that one and keep my fingers crossed. But it’s nice to know WordFence is another recommended one.

    Joy

Caroline - April 28, 2014

I have just come across your blogs Joy and they make interesting reading – thank you for bringing this particular issue to light!
