How to block by Referer, User Agent, URL, query string, IP and their combinations in mod_rewrite

As part of the fight against an influx of bots on my site (see the screenshot above), I had to refresh my knowledge of mod_rewrite. Below are examples of mod_rewrite rules that perform an action (such as blocking) for requests that meet one or several criteria at once. The last example shows how flexible and powerful mod_rewrite is.

See also: How to protect my website from bots

Denying access with an empty referrer (Referer)

The following rule will deny access to all requests in which the HTTP Referer header is not set (in Apache logs, "-" is written instead of the Referer line):

RewriteEngine	on
RewriteCond	%{HTTP_REFERER}	^$
RewriteRule	^.*	-	[F,L]
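Direct visits (a typed address or a bookmark) also arrive without a Referer header, so a blanket rule can block real users. As a minimal sketch, the rule can be narrowed with a negated condition; the choice of the home page and robots.txt as exceptions is only an assumption for illustration:

```apache
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^$
# "!" negates the pattern: requests for "/" and "/robots.txt" are not blocked.
RewriteCond %{REQUEST_URI} !^/(robots\.txt)?$
RewriteRule ^.* - [F,L]
```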

Blocking access by part of the User Agent

When blocking bots by User Agent, it is not necessary to specify the full name: a part of the User Agent string is enough for a match. The value is a regular expression, so special characters must be escaped with a backslash, and a space must either be escaped or the whole pattern enclosed in quotes.

For example, the following rule will block access for all users whose User Agent string contains “Android 10”:

RewriteEngine	on
RewriteCond	%{HTTP_USER_AGENT}	"Android\ 10"
RewriteRule	^.*	-	[F,L]

Examples of User Agents blocked by this rule:

  • Mozilla/5.0 (Linux; Android 10; SM-G970F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Mobile Safari/537.36
  • Mozilla/5.0 (Linux; Android 10; Redmi Note 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Mobile Safari/537.36
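Several fragments can be checked in one condition, and the [NC] flag makes the comparison case-insensitive. The sketch below assumes a second fragment, "ExampleBot", which is a placeholder and not a real crawler name:

```apache
RewriteEngine on
# "|" separates alternative fragments; [NC] ignores letter case.
# "ExampleBot" is a hypothetical name used only for illustration.
RewriteCond %{HTTP_USER_AGENT} "Android\ 10|ExampleBot" [NC]
RewriteRule ^.* - [F,L]
```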

How to block access by exact match User Agent

If you need to block access to the site for a certain User Agent with an exact match of the whole string, use the <If> directive (it is not part of mod_rewrite, but it is worth remembering):

<If "%{HTTP_USER_AGENT} == 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)'">
	Require all denied
</If>

<If "%{HTTP_USER_AGENT} == 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0'">
	Require all denied
</If>

<If "%{HTTP_USER_AGENT} == 'Mozilla/5.0 (Windows NT 6.1; rv:45.0) Gecko/20100101 Firefox/45.9.0'">
	Require all denied
</If>

The <If> directive is available since Apache 2.4.
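An exact match is also possible inside mod_rewrite, which is convenient when it has to be combined with other RewriteCond lines. A minimal sketch, reusing the first User Agent from the examples above: a CondPattern that starts with "=" is compared as a plain string rather than a regular expression.

```apache
RewriteEngine on
# The leading "=" requests a plain string comparison, so parentheses,
# dots and semicolons in the User Agent need no escaping.
RewriteCond %{HTTP_USER_AGENT} "=Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
RewriteRule ^.* - [F,L]
```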

Denying access to certain pages

The %{REQUEST_URI} variable contains everything that follows the hostname in the request, up to but not including the question mark. With it you can filter requests by URL path, file names or parts of them. For example:

RewriteEngine	on
RewriteCond	%{REQUEST_URI}	"query-string"
RewriteRule	^.*	-	[F,L]

Although the Apache logs show some characters, including Cyrillic, in URL encoding, you can write Cyrillic or other national alphabet letters directly in these rules, because %{REQUEST_URI} is compared in its decoded form. For example, the following rule will block access to an article with the URL https://site.ru/how-to-find-which-file-from/:

RewriteEngine	on
RewriteCond	%{REQUEST_URI}	"how-to-find-which-file-from"
RewriteRule	^.*	-	[F,L]

If you wish, you can specify several URLs (or their parts) at once by separating them with | (pipe), which means "or" in a regular expression. Here each search string is also enclosed in parentheses to keep the alternatives visually separate:

RewriteEngine	on
RewriteCond	%{REQUEST_URI}	"(windows-player)|(how-to-find-which-file-from)|(how much-RAM)|(how-to-open-folder-with)|(7-applications-for)"
RewriteRule	^.*	-	[F,L]
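The patterns above match a substring anywhere in the path, so they also block any other page whose URL happens to contain the same text. If only one specific page should be blocked, the pattern can be anchored. A minimal sketch, assuming the article lives directly under the site root:

```apache
RewriteEngine on
# "^/" and "/?$" tie the pattern to the whole path,
# with or without a trailing slash.
RewriteCond %{REQUEST_URI} "^/how-to-find-which-file-from/?$"
RewriteRule ^.* - [F,L]
```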

Since %{REQUEST_URI} does not include what comes after the question mark in the URL, use %{QUERY_STRING} to filter by the query string that follows the question mark.

How to filter by the query string following the question mark

The %{QUERY_STRING} variable contains the query string that follows the ? (question mark) of the current request to the server.

Note that, unlike the URL path, the query string is compared exactly as it was sent, so non-ASCII characters in the pattern must be URL-encoded. For example, the following rule:

RewriteCond %{QUERY_STRING} "p=5373&%D0%B7%D0%B0%D0%B1%D0%BB%D0%BE%D0%BA%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D1%82%D1%8C"
RewriteRule ^.* - [F,L]

blocks access to the page https://suay.ru/?p=5373&заблокировать, but will not deny access to the page https://suay.ru/?p=5373.
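A query string pattern is also a substring match, so "p=5373" alone would match "p=53730" as well. The following sketch shows one way to match a single parameter as a whole, wherever it appears in the query string. Note that, unlike the rule above, it does block https://suay.ru/?p=5373:

```apache
RewriteEngine on
# (^|&) and (&|$) delimit the parameter: "p=5373" matches as a complete
# name=value pair, while "p=53730" or "xp=5373" does not.
RewriteCond %{QUERY_STRING} "(^|&)p=5373(&|$)"
RewriteRule ^.* - [F,L]
```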

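Denying access by IP address and IP range

With mod_rewrite, you can block individual IPs from accessing the site. The address is a regular expression, so the dots are escaped and the pattern is anchored with ^ and $; without this, 84.53.229.255 would also match, for example, 184.53.229.255:

RewriteEngine	on
RewriteCond	"%{REMOTE_ADDR}"	"^84\.53\.229\.255$"
RewriteRule	^.*	-	[F,L]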

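You can specify multiple IP addresses to block by joining the conditions with the [OR] flag (the dots are escaped and each address is anchored so that only the exact address matches):

RewriteEngine	on
RewriteCond	"%{REMOTE_ADDR}"	"^84\.53\.229\.255$" [OR]
RewriteCond	"%{REMOTE_ADDR}"	"^123\.45\.67\.89$" [OR]
RewriteCond	"%{REMOTE_ADDR}"	"^122\.33\.44\.55$"
RewriteRule	^.*	-	[F,L]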
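The same list can be written as a single condition with alternation, which avoids the [OR] flag altogether. A minimal sketch with the same three addresses:

```apache
RewriteEngine on
# One anchored pattern with three alternatives replaces three [OR] conditions.
RewriteCond %{REMOTE_ADDR} "^(84\.53\.229\.255|123\.45\.67\.89|122\.33\.44\.55)$"
RewriteRule ^.* - [F,L]
```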

You can also use ranges, but remember that in this case, strings are treated as regular expressions, so the CIDR notation (for example, 94.25.168.0/21) is not supported.

Ranges must be specified as regular expressions – this can be done using character sets. For example, to block the following ranges

  • 94.25.168.0/21 (range 94.25.168.0 - 94.25.175.255)
  • 83.220.236.0/22 (range 83.220.236.0 - 83.220.239.255)
  • 31.173.80.0/21 (range 31.173.80.0 - 31.173.87.255)
  • 213.87.160.0/22 (range 213.87.160.0 - 213.87.163.255)
  • 178.176.72.0/22 (range 178.176.72.0 - 178.176.75.255)

the following rule can be used:

RewriteEngine	on
RewriteCond	"%{REMOTE_ADDR}"	"^((94\.25\.1[6-7])|(83\.220\.23[6-9])|(31\.173\.8[0-7])|(213\.87\.16[0-3])|(178\.176\.7[2-5]))"
RewriteRule	^.*	-	[F,L]

Note that the range 94.25.168.0 - 94.25.175.255 cannot be written as 94.25.1[68-75]: this is interpreted as the string “94.25.1” followed by a character set containing the character 6, the range 8-7 and the character 5. Because the range 8-7 is out of order, the expression is invalid and causes an error on the server.

Therefore, “94\.25\.1[6-7]” is used for 94.25.168.0 - 94.25.175.255. This pattern is broader than the original range: it matches any third octet that starts with 16 or 17 (that is, 16, 17 and 160-179). The precision can be increased with a more complex regular expression, but in my case this is a temporary hotfix, so it is good enough.
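If the extra addresses matter, the third octet can be spelled out exactly. A minimal sketch for the range 94.25.168.0 - 94.25.175.255 only:

```apache
RewriteEngine on
# 6[89] covers 168-169 and 7[0-5] covers 170-175, so the third octet
# is limited to 168-175. The trailing \. marks the end of that octet.
RewriteCond %{REMOTE_ADDR} "^94\.25\.1(6[89]|7[0-5])\."
RewriteRule ^.* - [F,L]
```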

Also note that the last octet 0-255 can be skipped, since part of the IP address is enough to match the regular expression.
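CIDR notation does not work in a regular expression pattern, but on Apache 2.4 a RewriteCond can evaluate an Apache expression instead, and expressions have an -ipmatch operator that accepts subnets. A sketch under the assumption that the server runs Apache 2.4 or later, using three of the ranges listed above:

```apache
RewriteEngine on
# With "expr" as the test string, the second argument is evaluated as an
# Apache expression (ap_expr) rather than as a regular expression.
RewriteCond expr "%{REMOTE_ADDR} -ipmatch '94.25.168.0/21' || %{REMOTE_ADDR} -ipmatch '83.220.236.0/22' || %{REMOTE_ADDR} -ipmatch '31.173.80.0/21'"
RewriteRule ^.* - [F,L]
```

This form matches the ranges exactly, with no need to approximate them by character sets.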

Combining access control rules

Task: block users who meet ALL of the following criteria at once:

1. Empty referrer

2. The user agent contains the string “Android 10”

3. Access was made to a page whose URL contains any of the strings

  • windows-player
  • how-to-find-which-file-from
  • how much-RAM
  • how-to-open-folder-with
  • 7-applications-for

4. The user has an IP address belonging to any of the ranges:

  • 94.25.168.0/21 (range 94.25.168.0 - 94.25.175.255)
  • 83.220.236.0/22 (range 83.220.236.0 - 83.220.239.255)
  • 31.173.80.0/21 (range 31.173.80.0 - 31.173.87.255)
  • 213.87.160.0/22 (range 213.87.160.0 - 213.87.163.255)
  • 178.176.72.0/22 (range 178.176.72.0 - 178.176.75.255)

The following rule set accomplishes this task:

RewriteEngine	on
RewriteCond	"%{REMOTE_ADDR}"	"^((94\.25\.1[6-7])|(83\.220\.23[6-9])|(31\.173\.8[0-7])|(213\.87\.16[0-3])|(178\.176\.7[2-5]))"
RewriteCond	%{HTTP_REFERER}	^$
RewriteCond	%{HTTP_USER_AGENT}	"Android\ 10"
RewriteCond	%{REQUEST_URI}	"(windows-player)|(how-to-find-which-file-from)|(how much-RAM)|(how-to-open-folder-with)|(7-applications-for)"
RewriteRule	^.*	-	[F,L]

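Please note that consecutive RewriteCond lines are combined with a logical AND by default, so the alternatives within one criterion (several IP ranges, several URLs) are collected into a single regular expression with |. Do not add the [OR] flag to any of these conditions: it would turn the AND between that condition and the next one into an OR and break the logic of the entire rule set.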
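Before enabling a rule set like this, it can be useful to see which requests it would catch. The sketch below is one possible way to do that, not part of the original setup: it assumes access to the virtual host configuration, a "combined" log format already defined by the server, and a hypothetical variable name suspect_bot. Only two of the conditions are shown for brevity.

```apache
# Virtual host configuration (CustomLog is not allowed in .htaccess).
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} "Android\ 10"
# Instead of [F], only mark matching requests with an environment variable.
RewriteRule ^.* - [E=suspect_bot:1]
# Log the marked requests to a separate file for review.
CustomLog "logs/suspect_bot.log" combined env=suspect_bot
```

Once the log shows only unwanted traffic, the [E=...] flag can be replaced with [F,L].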

By the way, I overcame the bots.
