Introduction
If you are amongst those Website owners who never look into access_log still notice that your Website is slow then there is serious rethinking to do.
Internet is flooded with variety of bad bots
who typically steal your bandwidth
and make huge number of requests to your server
.
Your server starts serving the requests from bad bot
and in return becomes very slow to genuine visitors/customers.
You might say that i have perfect robots.txt
file in place which is allowing only legitimate Website directories to be crawled. You might have setup Crawl-delay
properly so you are safe. Well, like is said you have some rethinking to do.
What are Bad Bots (SpamBots)?
Bad bots
or spambots
are computer generated programs who attacks your site to steal all your data.
They can also do some specific tasks like email harvesting, content scraping or simply trying to index your pages in their less known search engines.
Whatever the reason of their attack they are simply putting immense load on your server and consuming precious bandwidth
. If you are on a shared hosting your mighty suspension due to overuse although the actual traffic (customers/visitors) is pretty less.
How To Block Bad Bots (SpamBots)
The traditional way is the to setup rules in your robots.txt
file to allow/disallow
certain bots from crawling your site.
Most of the decent bots (Google, Yahoo, MSN etc.) obey robots.txt
directives when it comes to crawling.
But, don’t expect the same from Bad Bots
, they are thieves and they tend to ignore robots.txt
so you have to device a different technique for them which i am listing below.
1. Set up crawl delay if you are experiencing load even from good bots. to do this you have to write the following lines in your robots.txt file situated at your root folder where Magento is installed.
# Crawlers Setup User-agent: * Crawl-delay: 10 # Disallow Internet Archiver Wayback Machine User-agent: ia_archiver Disallow: /
Note: Google don’t read crawl-delay so you have to explicitly set crawl rate in your Google Webmaster control panel.
2. Block bad bots
(SpamBots) via .htaccess
file: Whether bots like it or not but they have to obey .htaccess
file for sure.
We can write a huge list of known bad bots and keep on adding to the list by looking at your access_log. One’s which are attacking most should be checked via reverse proxy and if they are illegal bots just block them.
add the following lines of code in your .htaccess
file
# Start Bad Bot Prvention RewriteEngine On RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Custo [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^DISCo [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^eCatch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GrabNet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Grafula [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^HMView [NC,OR] RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [NC,OR] RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^InterGET [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} libwww-perl.* [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Navroad [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NearSite [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetAnts [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetSpider [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetZIP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Octopus [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^pavuk [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^RealDownload [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ReGet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SuperBot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Surfbot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebAuto [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebFetch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Bolt\ 0 [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot\@yahoo\.com [NC,OR] RewriteCond %{HTTP_USER_AGENT} CazoodleBot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Custo [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Default\ Browser\ 0 [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^DIIbot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^DISCo [NC,OR] RewriteCond %{HTTP_USER_AGENT} discobot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^eCatch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ecxi [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GrabNet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Grafula [NC,OR] RewriteCond %{HTTP_USER_AGENT} GT::WWW [NC,OR] RewriteCond %{HTTP_USER_AGENT} heritrix [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^HMView [NC,OR] RewriteCond %{HTTP_USER_AGENT} HTTP::Lite [NC,OR] RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR] RewriteCond %{HTTP_USER_AGENT} ia_archiver [NC,OR] RewriteCond %{HTTP_USER_AGENT} IDBot [NC,OR] RewriteCond %{HTTP_USER_AGENT} id-search [NC,OR] RewriteCond %{HTTP_USER_AGENT} id-search\.org [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [NC,OR] RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^InterGET [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^InternetSeer\.com [NC,OR] RewriteCond %{HTTP_USER_AGENT} IRLbot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ISC\ Systems\ iRc\ Search\ 2\.1 [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Java [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} libwww [NC,OR] RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Link [NC,OR] RewriteCond %{HTTP_USER_AGENT} LinksManager.com_bot [NC,OR] RewriteCond %{HTTP_USER_AGENT} linkwalker [NC,OR] RewriteCond %{HTTP_USER_AGENT} lwp-trivial [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Maxthon$ [NC,OR] RewriteCond %{HTTP_USER_AGENT} MFC_Tear_Sample [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^microsoft\.url [NC,OR] RewriteCond %{HTTP_USER_AGENT} Microsoft\ URL\ Control [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [NC,OR] RewriteCond %{HTTP_USER_AGENT} Missigua\ Locator [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*Indy [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mozilla\.*NEWT [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Navroad [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NearSite [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetAnts [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetSpider [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetZIP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Nutch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Octopus [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [NC,OR] RewriteCond %{HTTP_USER_AGENT} panscient.com [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^pavuk [NC,OR] RewriteCond %{HTTP_USER_AGENT} PECL::HTTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^PeoplePal [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Ping [NC,OR] RewriteCond %{HTTP_USER_AGENT} PHPCrawl [NC,OR] RewriteCond %{HTTP_USER_AGENT} PleaseCrawl [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^RealDownload [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ReGet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Rippers\ 0 [NC,OR] RewriteCond %{HTTP_USER_AGENT} SBIder [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SeaMonkey$ [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^sitecheck\.internetseer\.com [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [NC,OR] RewriteCond %{HTTP_USER_AGENT} Snoopy [NC,OR] RewriteCond %{HTTP_USER_AGENT} Steeler [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SuperBot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Surfbot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Toata\ dragostea\ mea\ pentru\ diavola [NC,OR] RewriteCond %{HTTP_USER_AGENT} URI::Fetch [NC,OR] RewriteCond %{HTTP_USER_AGENT} urllib [NC,OR] RewriteCond %{HTTP_USER_AGENT} User-Agent [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [NC,OR] RewriteCond %{HTTP_USER_AGENT} Web\ Sucker [NC,OR] RewriteCond %{HTTP_USER_AGENT} webalta [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebAuto [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [NC,OR] RewriteCond %{HTTP_USER_AGENT} WebCollage [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebFetch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR] RewriteCond %{HTTP_USER_AGENT} Wells\ Search\ II [NC,OR] RewriteCond %{HTTP_USER_AGENT} WEP\ Search [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Wget [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WWW-Mechanize [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [NC,OR] RewriteCond %{HTTP_USER_AGENT} zermelo [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Zeus\.*Webster [NC,OR] RewriteCond %{HTTP_USER_AGENT} ZyBorg [NC, OR] RewriteCond %{HTTP_USER_AGENT} ^attach [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^BackWeb [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Bandit [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Buddy [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Collector [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Copier [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^DA [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Downloader [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Drip [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^eCatch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^FileHound [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GetSmart [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^gotit [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Grabber [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^GrabNet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Grafula [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^HMView [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^InterGET [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Iria [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^JOC [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^JustView [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^lftp [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^likse [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Magnet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Memo [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mirror [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Navroad [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NearSite [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetAnts [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetSpider [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^NetZip [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Ninja [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Octopus [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Pockey [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Pump [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^RealDownload [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Reaper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Recorder [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^ReGet [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Siphon [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Snake [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Stripper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Sucker [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SuperBot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Surfbot [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Vacuum [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebAuto [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebFetch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Website [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Webster [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Wget [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Whacker [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Sogou [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Speedy [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Baiduspider [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Nutch [NC,OR] RewriteCond %{HTTP_USER_AGENT} ^Widow [NC,OR] RewriteCond %{HTTP:Acunetix-Product} ^WVS [NC] RewriteRule ^.* - [F,L] SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot SetEnvIfNoCase User-Agent "^Teleport" bad_bot SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot SetEnvIfNoCase User-Agent "^Zeus" bad_bot SetEnvIfNoCase User-Agent "^botpaidtoclick" bad_bot SetEnvIfNoCase User-Agent "^Click Bot" bad_bot SetEnvIfNoCase User-Agent "^WebRipper" bad_bot SetEnvIfNoCase User-Agent "^Wget" bad_bot SetEnvIfNoCase User-Agent "^Snoopy" bad_bot SetEnvIfNoCase User-Agent "^Security Kol" bad_bot SetEnvIfNoCase User-Agent "^libwww-perl" bad_bot SetEnvIfNoCase User-Agent "^Java" bad_bot SetEnvIfNoCase User-Agent "^DataCha0s" bad_bot SetEnvIfNoCase User-Agent "^Grazer" bad_bot SetEnvIfNoCase User-Agent "^lwp-request" bad_bot SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot SetEnvIfNoCase User-Agent "^Morpheus" bad_bot SetEnvIfNoCase User-Agent "^Site Sniper" bad_bot SetEnvIfNoCase User-Agent "^Winnie Poh" bad_bot SetEnvIfNoCase User-Agent "^curl" bad_bot SetEnvIfNoCase User-Agent "^Akregator" bad_bot SetEnvIfNoCase User-Agent "^ac-baidu" bad_bot SetEnvIfNoCase User-Agent "(Ubuntu-feisty)$" bad_bot SetEnvIfNoCase User-Agent "^Yandex*" bad_bot SetEnvIfNoCase User-Agent "^Baidu*" bad_bot Order Deny,Allow Deny from env=bad_bot # End Bad Bot Prevention
Author: Magik Magento Blog
Hope That helps.