The Web Design Group

... Making the Web accessible to all.

> Site under bot attack
Resident Evil
post Mar 30 2012, 08:24 PM
Post #1





Group: Members
Posts: 4
Joined: 30-March 12
Member No.: 16,823



I know that this is an HTML forum, but I thought I would ask anyway since you guys seem very knowledgeable and maybe you can offer some advice.

Anyway, for the past few weeks my site has been getting a large amount of bot traffic. These are not spider bots; they register as real visitors. They come from many different IP addresses on legitimate ISPs like Comcast, and when I trace the addresses nothing bad turns up about them. Because the addresses vary so much, banning the ones I could identify as bots would not stop the rest from showing up.

What the bots do is go to my home page, stay on it for about 5 seconds, then leave. This is ruining my bounce rate. There are about 6,000 to 8,000 hits a day from these bots, but I am afraid it might go up, and if it keeps getting higher it will eventually cause server issues. I am on my own server and can handle the traffic, but I would like to stop this in case there is a major increase in bot attacks.

I have no idea where the bots came from. They are showing as direct traffic, no backlinks, search engine, etc. They are coming straight to the site. I did notice that these bots do not use www when they visit my site. So instead of going to www.htmlhelp.com, they go to htmlhelp.com, and since they visit my main home page, I cannot simply delete the page.

What I have seen so far is that these bots are showing up as Windows 7 and IE 9 users. I don't want to block IE9 users from my site since I am sure there are many valid users on that browser. So any other ideas? Is there a code I can put somewhere to help determine if it is a bot, and then automatically block them from the site?

Thanks
Christian J
post Mar 31 2012, 06:15 PM
Post #2



Group: WDG Moderators
Posts: 6,287
Joined: 10-August 06
Member No.: 7



QUOTE(Resident Evil @ Mar 31 2012, 03:24 AM) *

There are about 6,000 to 8,000 hits a day from these bots

Is that a high figure compared with your normal volume? Sounds low for being a ddos attack. Is your site controversial in any way that could explain a ddos attempt?

QUOTE
They are showing as direct traffic, no backlinks, search engine, etc.

Could it be some browser privacy filter that removes the referrer header? Maybe such a filter spoofs the UA string as well. Do all the suspected bots send the exact same User Agent string (or a few identical ones), and if so could you post it here?

IE9 itself features user tracking protection, but I don't know if it strips the referrer header.

Of course none of this explains why only the URL without the www prefix is requested.

QUOTE
I don't want to block IE9 users from my site since I am sure there are many valid users on that browser. So any other ideas? Is there a code I can put somewhere to help determine if it is a bot, and then automatically block them from the site?

You might redirect requests without a referrer header (and with the IE9 UA string?) to another web page (of minimal size to save bandwidth), and from there provide a link back to the home page (for valid users that don't send referrer headers for various reasons; I don't know how many there are, but they'll likely be annoyed). This might be done in a .htaccess directive if you use Apache, or with a server-side script like PHP.

But if someone tries to ddos you for real this doesn't sound like an effective counter-measure.
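The redirect idea above could be sketched roughly as follows. This is only an illustration in Python rather than .htaccess or PHP; the function name `redirect_target`, the path `/continue.html`, and the assumption that headers arrive as a plain mapping with `Referer` and `User-Agent` keys are all made up for the example. It relies on IE9 identifying itself with the `MSIE 9.0` token.

```python
from typing import Mapping, Optional

# Hypothetical path of the small interstitial page that links back to the home page.
INTERSTITIAL_PATH = "/continue.html"


def redirect_target(path: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return a redirect path for suspicious home-page requests, else None."""
    # Only the home page is being hit, so every other page is left alone.
    if path != "/":
        return None
    referer = headers.get("Referer", "").strip()
    user_agent = headers.get("User-Agent", "")
    # Redirect only when BOTH signals match: no referrer and an IE9 UA string.
    if not referer and "MSIE 9.0" in user_agent:
        return INTERSTITIAL_PATH
    return None
```

A real visitor who lands on the interstitial page and follows its link back would then send a referrer header and pass through normally.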


Resident Evil
post Mar 31 2012, 08:50 PM
Post #3





Group: Members
Posts: 4
Joined: 30-March 12
Member No.: 16,823



QUOTE(Christian J @ Mar 31 2012, 06:15 PM) *

Is that a high figure compared with your normal volume? Sounds low for being a ddos attack. Is your site controversial in any way that could explain a ddos attempt?


The particular page that these bots are going to would normally get about 2,000 to 3,000 hits a day, and now it is getting 8,000 to 10,000 hits. Overall the site is a higher traffic one, but a lot of people go to other pages because they find them through search engines, so they are not always hitting my home page. But these bots are going to the home page.

The site is not totally controversial, but I do not think this attack is specifically generated just for me. I have found posts on other sites, including Google's AdSense forum, from people complaining of the same traffic spike, with the same details I described: bots hitting one page, no www in the URL, etc. It all began in late February for everyone. Some of them are getting up to 40,000 extra hits a day; I am lucky enough to be lower, but I would not want this to increase. Here is one post about it:

http://www.webmasterworld.com/analytics/4420174.htm

QUOTE
Could it be some browser privacy filter that removes the referrer header? Maybe such a filter spoofs the UA string as well. Do all the suspected bots send the exact same User Agent string (or a few identical ones), and if so could you post it here?


When you say UA string, do you mean something like this? Mozilla/5.0 (compatible;)

I would have to look closer at my site logs, but it looks like the only User Agent the bots show up as is Internet Explorer 9.x

QUOTE
You might redirect requests without a referrer header (and with the IE9 UA string?) to another web page (of minimal size to save bandwidth), and from there provide a link back to the home page (for valid users that don't send referrer headers for various reasons; I don't know how many there are, but they'll likely be annoyed). This might be done in a .htaccess directive if you use Apache, or with a server-side script like PHP.


That is a good idea, and something I have considered. Yes it would probably annoy regular visitors who use IE9, and I really don't know how many actual human visitors use IE9 for my site. I believe more were using Firefox and older versions of IE. I wish the bots were not going to my main home page, otherwise this would be much easier.

I was wondering if there was some sort of script to auto-block these things without blocking real traffic, or even something that can tell whether a visitor is a human or a bot. Since the bots show up under so many different IP addresses, it is hard to block them that way. I was also wondering whether the fact that the bots are not using the "www" could help in any way in blocking them.

Thanks for your help so far. This has been going on for a month now for a lot of people. No one knows who started it or where it came from. I would like to stop it just in case it gets bigger. Also, there appears to be no common link between any of the sites that are being attacked.
Christian J
post Apr 1 2012, 06:40 AM
Post #4



Group: WDG Moderators
Posts: 6,287
Joined: 10-August 06
Member No.: 7



QUOTE(Resident Evil @ Apr 1 2012, 03:50 AM) *

The site is not totally controversial, but I do not think this attack is specifically generated just for me.

It would be interesting to know if the content of all affected sites had something controversial in common. But even if they did, trying to ddos lots of sites at once seems unlikely, and if such a ddos attack took place it would probably be known quickly.

QUOTE
When you say UA string, do you mean something like this? Mozilla/5.0 (compatible;)

Yes, but they are usually more verbose. I was hoping you might be able to identify the bots by their UA string, assuming that all of the bots used the same one.

QUOTE
I would have to look closer at my site logs, but it looks like the only User Agent the bots show up as is Internet Explorer 9.x

The Webmasterworld thread mentions other IE versions too. I was hoping it was possible to narrow down the suspects by combining various identifiers (like identical UA string, no referrer header, no www prefix, etc).
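To illustrate what combining identifiers might look like, here is a rough Python sketch. The `Hit` record, the signal names, and the threshold of four matches are assumptions made up for the example; the signals themselves are the ones mentioned in this thread (no www prefix, no referrer header, an IE User Agent, and a request for the home page).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One logged request; field names are illustrative, not from any log format."""
    host: str        # value of the Host header, e.g. "example.com"
    path: str        # requested path, e.g. "/"
    referer: str     # "" or "-" when no referrer header was sent
    user_agent: str


def matched_signals(hit: Hit) -> List[str]:
    """List which of the bot characteristics described in the thread apply."""
    signals = []
    if not hit.host.lower().startswith("www."):
        signals.append("no-www")
    if not hit.referer or hit.referer == "-":
        signals.append("no-referer")
    if "MSIE" in hit.user_agent:
        signals.append("ie-user-agent")
    if hit.path == "/":
        signals.append("home-page")
    return signals


def is_suspect(hit: Hit, threshold: int = 4) -> bool:
    """Flag a hit only when at least `threshold` signals match together."""
    return len(matched_signals(hit)) >= threshold
```

No single signal is reliable on its own, which is why the sketch flags a hit only when several of them coincide.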

QUOTE
I was wondering if maybe since the bots are not using the "www" if that could help in any way in blocking them.

Not much for blocking (without annoying valid visitors), unless most of your valid visitors are supposed to use the www version anyway. E.g., if other sites and search engines always link to the www version and you use it consistently (on business cards etc), only very few valid visitors should request the non-www version.

But you might be able to compare webstats from the www and non-www versions, and filter out the bots that way. For example, you might let a server-side script (say PHP) print the stat HTML code only on the www version of the page.
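As a rough sketch of that last suggestion, written in Python instead of PHP: the function below returns the stats snippet only when the request was made to the www hostname. `STATS_SNIPPET` is a placeholder for whatever tracking code your stats package provides, and `stats_html` is a made-up name.

```python
# Placeholder for the tracking code supplied by the stats package in use.
STATS_SNIPPET = "<!-- stats tracking code goes here -->"


def stats_html(host_header: str) -> str:
    """Return the tracking snippet only for requests to the www hostname."""
    # Drop an optional port (":80") and normalise case before comparing.
    hostname = host_header.split(":", 1)[0].strip().lower()
    if hostname.startswith("www."):
        return STATS_SNIPPET
    return ""
```

With this in place, hits on the non-www hostname would still appear in the server logs but not in the stats package, so the difference between the two gives an estimate of the bot volume.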


Resident Evil
post Apr 3 2012, 05:05 PM
Post #5





Group: Members
Posts: 4
Joined: 30-March 12
Member No.: 16,823



QUOTE(Christian J @ Apr 1 2012, 06:40 AM) *

It would be interesting to know if the content of all affected sites had something controversial in common. But even if they did, trying to ddos lots of sites at once seems unlikely, and if such a ddos attack took place it would probably be known quickly.


I doubt it. Some of the people have linked to their sites that are being attacked, and they were sites about fitness, health, hobbies, etc.


QUOTE

The Webmasterworld thread mentions other IE versions too. I was hoping it was possible to narrow down the suspects by combining various identifiers (like identical UA string, no referrer header, no www prefix, etc).


No one has been able to narrow it down yet. It is completely random and shows up under a variety of IP addresses and ISPs.

QUOTE

Not much for blocking (without annoying valid visitors), unless most of your valid visitors are supposed to use the www version anyway. E.g., if other sites and search engines always link to the www version and you use it consistently (on business cards etc), only very few valid visitors should request the non-www version.

But you might be able to compare webstats from the www and non-www versions, and filter out the bots that way. For example, you might let a server-side script (say PHP) print the stat HTML code only on the www version of the page.


That is a good idea. I should be able to configure my stats to check the non-www version. Actually, I think I may have already done that in late February when this all started, which is how I knew the bots were under so many IP addresses. However, the bot attacks are a little less frequent now than they were in February.

One good thing is that the bots are coming at a steady pace instead of all at once. They seem to arrive at about the same rate each hour, so they are not overloading the server. Someone mentioned that Google finally acknowledged that they are aware of these attacks, and requested info from a site that was under attack. We may have to wait for Google to solve the problem.
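For anyone who wants to check the steady hourly rate from their own logs, a minimal Python sketch could look like this. It assumes the request timestamps have already been parsed into `datetime` objects; `hits_per_hour` is a made-up helper name.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def hits_per_hour(timestamps: Iterable[datetime]) -> Dict[str, int]:
    """Count requests per clock hour, keyed like '2012-04-03 17:00'."""
    counts = Counter(ts.strftime("%Y-%m-%d %H:00") for ts in timestamps)
    # Sort by key so the hours come out in chronological order.
    return dict(sorted(counts.items()))
```

Roughly equal counts across the hours of the day, including the middle of the night, would fit the pattern described above better than normal human traffic would.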
jimlongo
post Apr 4 2012, 01:18 PM
Post #6


This is My Life
*******

Group: Members
Posts: 1,120
Joined: 24-August 06
From: t-dot
Member No.: 16



Do a whois on the IP address, I'll bet it's a search engine.
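A related check that can be scripted is a forward-confirmed reverse DNS lookup. Note that this is a different technique from whois: it asks DNS for the hostname of the IP, checks that the hostname belongs to a known crawler domain, and then confirms that the hostname resolves back to the same IP. The sketch below is in Python; the suffix list is an assumption for illustration and should be checked against each search engine's own documentation, and the `reverse` and `forward` parameters exist only so the lookups can be swapped out.

```python
import socket
from typing import Callable, List, Tuple

# Crawler hostname suffixes assumed for this sketch; verify before relying on them.
CRAWLER_SUFFIXES: Tuple[str, ...] = (".googlebot.com", ".google.com", ".search.msn.com")


def _reverse(ip: str) -> str:
    """Reverse DNS: IP address to primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname: str) -> List[str]:
    """Forward DNS: hostname to its list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_search_engine(
    ip: str,
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """True only if the IP's hostname is a crawler domain AND resolves back to the IP."""
    try:
        hostname = reverse(ip).lower().rstrip(".")
    except OSError:
        return False  # no reverse DNS record
    if not hostname.endswith(CRAWLER_SUFFIXES):
        return False
    try:
        # The forward lookup guards against faked reverse DNS records.
        return ip in forward(hostname)
    except OSError:
        return False
```

An address on a residential ISP would fail the suffix check, so this only helps to rule search engine crawlers in or out.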




