htaccess reviewed

Posted on November 17th, 2008 by Donace in Webmaster

Since writing “A few tricks up my sleeves – htaccess style” I have been tweaking away at my .htaccess file to fight the bane of the internet: comment spammers!

These are just a few changes I have made to increase security:

No link for you!

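A lot of ‘trackback spiders’ and fake referrers hit my site a while ago; this rule helped weed out a number of the automation tools that create fake ‘trackbacks’. It denies any POST to a trackback URL that arrives with a browser-style user agent, since genuine trackbacks are sent by blog software rather than by browsers.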

#Trackback Spam
#Denies obvious trackback spam.
 
RewriteCond %{REQUEST_METHOD} =POST
RewriteCond %{HTTP_USER_AGENT} ^.*(opera|mozilla|firefox|msie|safari).*$ [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.+/trackback/?\ HTTP/ [NC]
RewriteRule .* - [F,NS,L]
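
Before dropping a rule like this into a live .htaccess, it can help to see which requests it would catch. The following is only a rough Python sketch of the three conditions above, not part of the rule itself; the function name and the sample requests are made up for illustration.

```python
import re

# Patterns copied from the RewriteCond lines above; the names are illustrative.
BROWSER_UA = re.compile(r"(opera|mozilla|firefox|msie|safari)", re.IGNORECASE)
TRACKBACK_REQUEST = re.compile(r"^[A-Z]{3,9} /.+/trackback/? HTTP/", re.IGNORECASE)


def is_trackback_spam(method: str, user_agent: str, request_line: str) -> bool:
    """Return True when all three conditions of the rule would match."""
    return (
        method == "POST"  # "=POST" is an exact string comparison
        and BROWSER_UA.search(user_agent) is not None
        and TRACKBACK_REQUEST.search(request_line) is not None
    )


# A "browser" posting to a trackback URL is denied ...
print(is_trackback_spam("POST", "Mozilla/5.0 (Windows)", "POST /my-post/trackback/ HTTP/1.1"))  # True
# ... while blog software sending the same request is let through.
print(is_trackback_spam("POST", "WordPress/2.6", "POST /my-post/trackback/ HTTP/1.1"))  # False
```

Note that the request pattern needs at least one path segment before /trackback, so a POST to a bare /trackback/ at the site root would not be caught.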

Who are you?

Even though I love automation and bots, I hate spam. One thing many bots have in common is that they are not coded to send a ‘user agent’. This little snippet denies POST requests that arrive with a blank user agent, so those bots cannot comment or fill out forms.

#No UserAgent, No Post
#Denies POST requests by blank user-agents. May prevent a small number of visitors from POSTING.
 
RewriteCond %{REQUEST_METHOD} =POST
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteCond %{REQUEST_URI} !^/(wp-login.php|wp-admin/|wp-content/plugins/|wp-includes/).* [NC]
RewriteRule .* - [F,NS,L]
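
As a quick sanity check, the same logic can be mimicked in Python. This is a minimal sketch under my own assumptions: the function name is hypothetical, and a missing header is treated as an empty string, which is how the rule's ^-?$ pattern sees it.

```python
import re
from typing import Optional

# Paths the rule leaves alone, copied from the RewriteCond above.
EXEMPT_PATHS = re.compile(
    r"^/(wp-login.php|wp-admin/|wp-content/plugins/|wp-includes/).*",
    re.IGNORECASE,
)


def denies_blank_agent_post(method: str, user_agent: Optional[str], uri: str) -> bool:
    """Return True when the rule would answer the request with 403 Forbidden."""
    agent = user_agent or ""  # assumption: an absent header behaves like ""
    return (
        method == "POST"
        and re.fullmatch(r"-?", agent) is not None  # blank, or a lone "-"
        and EXEMPT_PATHS.search(uri) is None
    )


print(denies_blank_agent_post("POST", "", "/wp-comments-post.php"))           # True
print(denies_blank_agent_post("POST", "curl/7.19", "/wp-comments-post.php"))  # False
print(denies_blank_agent_post("POST", None, "/wp-admin/admin-ajax.php"))      # False
```

The last call shows the exemption at work: a POST with no user agent still gets through to the admin area.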

Sorry what language is that?

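A lot of bots and other bad guys don’t speak proper HTTP; they are scripted with tools like curl and sometimes leave out the Host header. This rule denies any request that arrives without a Host header, except for requests to the WordPress login, admin, plugin and include paths.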

#NO HOST:
#Denies requests that dont contain a HTTP HOST Header
 
RewriteCond %{REQUEST_URI} !^/(wp-login.php|wp-admin/|wp-content/plugins/|wp-includes/).* [NC]
RewriteCond %{HTTP_HOST} ^$
RewriteRule .* - [F,NS,L]
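
To make ‘no Host header’ concrete, here is a small Python sketch that takes the raw text of a request and works out whether the rule above would deny it. The parsing is deliberately naive and the function name is made up; it is only meant to show what such a request looks like.

```python
import re

# Paths the rule leaves alone, copied from the RewriteCond above.
EXEMPT_PATHS = re.compile(
    r"^/(wp-login.php|wp-admin/|wp-content/plugins/|wp-includes/).*",
    re.IGNORECASE,
)


def denies_missing_host(raw_request: str) -> bool:
    """Return True when the request has no (or an empty) Host header
    and does not target one of the exempt paths."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    uri = request_line.split(" ")[1]

    host = ""
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()

    return host == "" and EXEMPT_PATHS.search(uri) is None


bare = "GET /some-post/ HTTP/1.0\r\nUser-Agent: demo\r\n\r\n"
polite = "GET /some-post/ HTTP/1.0\r\nHost: example.com\r\nUser-Agent: demo\r\n\r\n"
print(denies_missing_host(bare))    # True
print(denies_missing_host(polite))  # False
```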

Look but don’t touch

A lot of spammers and bots make use of proxies to hide their tracks, so in my previous post I just blocked proxies as a whole. That, though, blocked a lot of legitimate users, so what this chunk of code does is allow proxy users to view the site but not comment. A trade-off I think is worth it.

#Forbid Proxies
#Denies any POST Request using a Proxy Server. Can still access site, but not comment
#Note: %{HTTP:...} looks up a header by the name sent on the wire, so hyphens (not underscores) are used
 
RewriteCond %{REQUEST_METHOD} =POST
RewriteCond %{HTTP:Via}%{HTTP:Forwarded}%{HTTP:Useragent-Via}%{HTTP:X-Forwarded-For}%{HTTP:Proxy-Connection} !^$ [OR]
RewriteCond %{HTTP:Xproxy-Connection}%{HTTP:PC-Remote-Addr}%{HTTP:Client-IP} !^$
RewriteCond %{REQUEST_URI} !^/(wp-login.php|wp-admin/|wp-content/plugins/|wp-includes/).* [NC]
RewriteRule .* - [F,NS,L]
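
The trade-off is easier to see with a few sample requests. The Python sketch below mirrors the rule's logic as I read it: a POST is denied if any proxy-style header is present and the path is not exempt. The header list assumes the usual hyphenated spellings, and the function name and sample values are hypothetical.

```python
import re
from typing import Mapping

# Headers that proxies commonly add (assumed spellings, lower-cased for lookup).
PROXY_HEADERS = (
    "via", "forwarded", "useragent-via", "x-forwarded-for",
    "proxy-connection", "xproxy-connection", "pc-remote-addr", "client-ip",
)

# Paths the rule leaves alone, copied from the RewriteCond above.
EXEMPT_PATHS = re.compile(
    r"^/(wp-login.php|wp-admin/|wp-content/plugins/|wp-includes/).*",
    re.IGNORECASE,
)


def denies_proxied_post(method: str, headers: Mapping[str, str], uri: str) -> bool:
    """Return True when a POST carries any non-empty proxy header."""
    sent = {name.lower(): value for name, value in headers.items()}
    via_proxy = any(sent.get(name, "") != "" for name in PROXY_HEADERS)
    return method == "POST" and via_proxy and EXEMPT_PATHS.search(uri) is None


proxied = {"Via": "1.1 proxy.example", "User-Agent": "demo"}
print(denies_proxied_post("GET", proxied, "/some-post/"))             # False: reading is fine
print(denies_proxied_post("POST", proxied, "/wp-comments-post.php"))  # True: commenting is not
```

Keep in mind that this only catches proxies that announce themselves; an anonymous proxy that adds none of these headers sails straight through.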

There is nothing to talk about!

Again, in my experience ‘comment bots’ work in two main ways:

1) Go through Google and pick URLs
2) Pick a random site and spam every post!

One way they achieve the second is by editing the URL string for wp-comments-post.php. This rule denies any request for wp-comments-post.php that sits below some sub-path rather than at its real location in the site root, so they cannot comment on posts that don’t exist!

#Real wp-comments-post.php
#Denies any POST attempt made to a non-existing wp-comments-post.php 
 
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*/wp-comments-post\.php.*\ HTTP/ [NC]
RewriteRule .* - [F,NS,L]
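
To check which request lines the pattern above actually catches, it can be run through Python's re module. This is just a sketch for testing, with a made-up function name and sample URLs; it also assumes WordPress lives in the site root, as it does here.

```python
import re

# Pattern copied from the RewriteCond above (the "\ " escapes become plain spaces).
FAKE_COMMENT_ENDPOINT = re.compile(
    r"^[A-Z]{3,9} /.*/wp-comments-post\.php.* HTTP/", re.IGNORECASE
)


def denies_comment_request(request_line: str) -> bool:
    """Return True when the request line would be answered with 403 Forbidden."""
    return FAKE_COMMENT_ENDPOINT.search(request_line) is not None


print(denies_comment_request("POST /wp-comments-post.php HTTP/1.1"))                 # False
print(denies_comment_request("POST /2008/11/some-post/wp-comments-post.php HTTP/1.1"))  # True
```

The pattern does not look at the request method, so a GET to a nested wp-comments-post.php is denied as well.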

That’s a wrap!

Now, there are a lot more small tweaks, but they are specific to this site. If you are scared, don’t want to edit the file yourself, or would just love an easier way to tweak and protect WordPress using htaccess, go and grab a plugin from here. This is where I ‘borrowed’ a lot of the amendments mentioned above, and as you can see he has an update due soon!

The plugin is brilliant for that extra layer of defence.

10 Comments

  • At 2008.11.25 21:05, Dave said:

    Most spam bots I see (and occasionally some other kinds of bots) on my site use legitimate user-agents harvested from access logs on another website somewhere. This means I see weird things like Firefox nightly builds from 10 months ago and Mozilla on AIX accessing my site. I also get a lot of normal user-agents this way that are still spam bots.

    curl is perfectly capable of sending HOST headers and for many sites HOST headers are necessary to even see the site. This may stop spam on your site but if it were adopted universally then the spammers would have no trouble adapting to it.

    The most effective technique I have used to stop blog comment spam is testing to see if the client is running . So far it has been 100% effective. I have some ideas about how to improve this method if the spammers started trying to work around it. I’m planning on writing a whole article about it one day.

    • At 2008.11.26 06:29, Donace said:

      Hey Dave, cheers for stopping by. How do you check if the user agent is still active? A predetermined list? Is this auto-updated? I’d be interested to see any code if you’re using it for this.

      • At 2009.04.21 03:33, Dave said:

        I don’t check user-agents in the website, I just do that manually when I’m looking at my stats. I also don’t use this information in any way to prevent spam but I probably could. If I were going to do that in code I would keep a list of the user-agents in my database and use PHP to do the lookup whenever someone posted something. I would populate the user-agent list automatically from user-agents I saw on the site rather than trawling the web looking for every possible user-agent, many of which will never visit my site. I would score them based on whether the posts were marked as spam by some other means and then use this score to help determine whether the post was spam or not.

        Also, your anti hacking stuff stripped the word “java script” from my post above which is a pretty fundamental part of the recommendation. Spam bots don’t run java script so if a client is running java script, it’s not a spam bot.

    • At 2008.12.09 08:15, narendra.s.v said:

      my htaccess is a real mess with code to do these, but the trackback and don’t touch ones will really help me :D

      • At 2009.01.13 22:06, Nihar said:

        As mentioned in the post, i will check that plugin by askapache. Does that plugin cover all your tricks in this post and previous post?

        • At 2009.01.14 01:20, Donace said:

          It covers the majority of the tweaks yes; it has ALL the ones in this post though.

          • At 2009.04.29 08:09, Hacks to boost your WordPress 2.7 blog said:

            [...] tricks with .htaccess for Wordpress can be found here and [...]

            • At 2009.08.06 11:24, Spunky Jones said:

              I found some of these hacks to be quite helpful. However, when I applied a couple of them I received some emails from users who said that they were denied access from Google Reader. The thing I noticed is that they were being logged with the IP address 127.255.255.255 by my system. They said that they aren’t behind a proxy, but it sure looks like it to me. Looks like they got caught with their hands in the cookie jar!

              • At 2009.08.06 13:07, Donace said:

                The IP is actually a loopback address … so it might actually be a hosting side issue.

                Saying that, there are alternative ways to get rid of unwanted people, and some legit people use proxies (as do some services), so if you’re happy with your defences I would suggest opening up the proxies (if not, check out http://thenexus.tk/how-to-stop-comment-spam/ ).

                Btw loved your theme design ;)

                • At 2009.08.06 14:34, Spunky Jones said:

                  So far, I am happy with the changes. I would prefer to block proxies, hoping it will cut down on spam and sploggers. The biggest issue that I have is sploggers who rip off my content.

                  Glad you like the theme design. It is simple and search engines seem to love it.
