Table of contents
1. How to change Varnish options
2. How to change cache retention time in Varnish
3. How to prevent Varnish from creating new cache for different browsers
4. A 403 page got into the Varnish cache and now it is shown to all users
5. How to make Apache logs show real IP address instead of 127.0.0.1 when used with Varnish
6. How to delete cookies that prevent caching
7. How to exclude certain pages from caching
8. How to exclude the home page from caching
9. How to increase the size of the Varnish cache
10. How to increase connection timeout in Varnish
11. How to exclude a specific site from Varnish caching
12. How to cache only a specific domain in Varnish
13. How to redirect HTTP to HTTPS in Varnish
14. How to delete cookies from all hosts except pages of a specific host
15. How to remove cookies from all pages except specific URLs
16. Adding and Removing HTTP Headers
17. Additional documentation
Once you've installed Varnish and configured it to work with your web server, it may seem that everything is working fine and you can move on to other things. But the truth is that with the default settings, Varnish is (almost) completely useless. The point is that by default:
- the cache storage time is only 2 minutes;
- a NEW cache entry is created for EACH User Agent when the web server sends the “Vary: User-Agent” header. That is, if a page was requested by a user with one version of Chrome, and then a user came with a different version of Chrome or with a different browser, the page is fetched from the web server again instead of being served from the cache.
That is, the cache stores a page for 2 minutes, during which it will most likely not be shown to anyone, and then the data is discarded.
Moreover, with the default settings, Varnish can even be harmful: a page with a response code of 503 (error on the server) or 403 (access denied) may be cached, and it will then be shown to everyone even after the problem is fixed.
In general, even though many tutorials on the Internet end right after installing Varnish and setting up the web server, you need to continue and do everything right. This is what this article is about. Here we will cover the minimum configuration required for Varnish to be useful rather than harmful, and will also list examples of Varnish configuration that you can use in various situations.
How to change Varnish options
You can configure Varnish in the /etc/varnish/default.vcl configuration file, as well as by editing the command line launch options:
systemctl edit --full varnish
We will use both methods.
For the changes made in the default.vcl file to take effect, you must run the command:
systemctl reload varnish
This command will reload the configuration but keep the cache.
To make the changes made to the start command take effect, you must run:
systemctl restart varnish
This command will not save the cache - it will be cleared.
How to change cache retention time in Varnish
By default, Varnish keeps the cache for only 2 minutes - it is not enough for many situations.
Cache storage in Varnish is a fairly complex topic that deserves a separate article. Here's a look at the basics.
First, there are three periods during which Varnish keeps data in its cache:
- TTL - Time To Live. This is the period during which the data is stored in the cache and is considered fresh. “Fresh” means that when a request for the page is received, the contents of the cache are returned.
- Grace. Grace period. This period comes after TTL. The data is still stored in the cache, and when a request is received for it, the data from the cache is returned while an update is started in the background with a request to the web server.
- Keep. Storage period. This is the period after TTL and Grace. The data is still stored in the cache, but it is no longer served as is: Varnish uses it to send a conditional request (If-Modified-Since or If-None-Match) to the web server and serves the stored copy only if the server answers that it has not changed.
All these periods can be customized:
- in the config file
- on the command line
- via HTTP headers
We will use the configuration file /etc/varnish/default.vcl. Open it:
vim /etc/varnish/default.vcl
The settings need to be made in the vcl_backend_response subroutine. There are three variables available, one for each period:
- beresp.ttl
- beresp.grace
- beresp.keep
Example:
sub vcl_backend_response {
# First we set the TTL value for most of the content that needs to be cached
set beresp.ttl = 10m;
set beresp.grace = 2h;
# Now we can set specific TTLs based on the content to be cached
# For VoD, we set a medium-long TTL and a long grace period, since VoD
# content is not prone to change. This allows us to use this cache
# for the most requested content
if (bereq.url ~ "/vod") {
set beresp.ttl = 30m;
set beresp.grace = 24h;
}
# For live content we use a very low TTL and an even smaller grace period
# since the live content is no longer *live* once it has been consumed
if (bereq.url ~ "/live") {
set beresp.ttl = 10s;
set beresp.grace = 2s;
}
# We are increasing the *keep* duration for IMS
if (bereq.http.If-Modified-Since) {
set beresp.keep = 10m;
}
}
As you can see from the example, suffixes are used:
- ms - milliseconds
- s - seconds
- m - minutes
- h - hours
- d - days
- w - weeks
- y - years
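The periods can also come from HTTP headers, as mentioned in the list of configuration methods. The following is a minimal sketch, assuming you want the web server's own Cache-Control or Expires headers to take priority and the fixed TTL to act only as a fallback. The 10m and 2h values are carried over from the example above and are not recommendations.

```vcl
sub vcl_backend_response {
    # Varnish has already derived beresp.ttl from Cache-Control
    # (s-maxage, max-age) or Expires if the backend sent them.
    # Override the TTL only when neither header is present.
    if (!beresp.http.Cache-Control && !beresp.http.Expires) {
        set beresp.ttl = 10m;
    }
    # In this sketch the grace period is the same for every object.
    set beresp.grace = 2h;
}
```

With this variant, an application that sends “Cache-Control: max-age=60” keeps its own 60 second lifetime, while pages without caching headers get the fallback value.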
How to prevent Varnish from creating new cache for different browsers
If the web server sends the HTTP header
Vary: User-Agent
Varnish then stores a separate copy of the page for every distinct User-Agent string. That is, different cache entries will be created for Chrome 85 and Chrome 86! Many Apache setups send this HTTP header.
Let's demonstrate this.
First, let's make a request to the website using port 8080 (that is, bypassing the cache, we connect directly to Apache):
curl -I 'http://w-e-b.site:8080/?act=all-country-ip&city=Pattaya'
Pay attention to the line:
Vary: User-Agent,Accept-Encoding
There may be other options, for example:
Vary: User-Agent
Now run the following command twice:
time curl -s 'https://w-e-b.site/?act=all-country-ip&city=Pattaya' -A 'Chrome' > /dev/null
The time command measures how long each request takes. The first request fills the cache, so the second one is served from it and completes much faster. Note that we specified Chrome as the User Agent.
But if you change the User Agent value, the request is slow again, which shows that a new cache entry is being created:
time curl -s 'https://w-e-b.site/?act=all-country-ip&city=Pattaya' -A 'Firefox' > /dev/null
This behavior of the caching system usually doesn't make sense. Therefore, you need to make it so that Varnish uses the same cache for all User Agents.
There are several ways to do this. The easiest is to change the web server settings so that it does not send the “Vary: User-Agent” HTTP header.
The following shows how to change Apache settings so that it doesn't send “Vary: User-Agent”.
On Debian and derivatives:
a2enmod headers
Then add the line to the configuration file /etc/apache2/apache2.conf:
Header set Vary "Accept-Encoding"
Restart the web server:
systemctl restart apache2
On Arch Linux and derivatives:
Make sure to uncomment the following line in /etc/httpd/conf/httpd.conf:
LoadModule headers_module modules/mod_headers.so
Then add the line to the same file (/etc/httpd/conf/httpd.conf):
Header set Vary "Accept-Encoding"
Restart the web server for the changes to take effect:
systemctl restart httpd.service
We have rewritten the Vary HTTP header to only have Accept-Encoding.
This setup is definitely worth the effort! Varnish will now have one cache for all web browsers.
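Another of the several ways mentioned above is to leave Apache untouched and remove User-Agent from the Vary header inside Varnish. This is only a sketch under the assumption that none of your pages really differ by browser; the regular expressions are illustrative and have not been tuned for unusual header values.

```vcl
sub vcl_backend_response {
    if (beresp.http.Vary ~ "(?i)User-Agent") {
        # Drop the User-Agent token together with a preceding comma, if any
        set beresp.http.Vary = regsuball(beresp.http.Vary, "(?i),?\s*User-Agent\s*", "");
        # Drop a comma left at the start when User-Agent was the first token
        set beresp.http.Vary = regsub(beresp.http.Vary, "^,\s*", "");
        # Nothing left to vary on: remove the header completely
        if (beresp.http.Vary == "") {
            unset beresp.http.Vary;
        }
    }
}
```

For example, “User-Agent,Accept-Encoding” becomes “Accept-Encoding”, and a header that contained only “User-Agent” is removed.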
A 403 page got into the Varnish cache and now it is shown to all users
It may happen that a user who is prohibited from viewing the site (banned by IP or by User Agent) opened a page and it got into the cache. As a result, now all users will be shown a message that access is denied instead of the normal page.
To fix this, add the following setting, which limits the cache lifetime of error responses (403, 404 and 5xx) to 3 seconds:
sub vcl_backend_response {
if (beresp.status == 403 || beresp.status == 404 || beresp.status >= 500)
{
set beresp.ttl = 3s;
}
}
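If even 3 seconds of a cached error page is too much, the response can be marked as uncacheable instead. A minimal sketch, assuming that 403 and 5xx responses should never be stored; the 30s value is an arbitrary choice that only controls how long Varnish remembers that the URL is currently uncacheable.

```vcl
sub vcl_backend_response {
    if (beresp.status == 403 || beresp.status >= 500) {
        # Do not store the error page itself. For the next 30 seconds
        # every request for this URL goes straight to the web server.
        set beresp.uncacheable = true;
        set beresp.ttl = 30s;
        return (deliver);
    }
}
```

Use this instead of the 3 second variant, not together with it, so that it is clear which rule applies to a given status code.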
How to make Apache logs show real IP address instead of 127.0.0.1 when used with Varnish
Since Apache now receives requests from Varnish, the web server logs always show 127.0.0.1 as the client IP address.
This can be fixed as follows:
- We tell Varnish to add the X-Forwarded-For HTTP header containing the user's real IP address when sending requests to the Apache server.
- In Apache, we enable the mod_remoteip module and configure it to use the IP address from the X-Forwarded-For HTTP header as the user's real IP address.
That is, you need to configure both Varnish and Apache. But the setup is very useful and well worth it!
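Let's start by configuring Varnish. Note that Varnish 4.0 and later already add the client address to X-Forwarded-For before vcl_recv runs, so the block below is needed only on older versions; on newer ones it would append the same address a second time. Add to the config file: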
sub vcl_recv {
if (req.restarts == 0) {
if (req.http.X-Forwarded-For) {
set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
} else {
set req.http.X-Forwarded-For = client.ip;
}
}
}
The web server settings differ slightly depending on the distribution.
On Debian and derivatives:
a2enmod remoteip
Then add the line to the configuration file /etc/apache2/apache2.conf:
RemoteIPHeader x-forwarded-for
Restart the web server:
systemctl restart apache2
On Arch Linux and derivatives:
Make sure to uncomment the following line in /etc/httpd/conf/httpd.conf:
LoadModule remoteip_module modules/mod_remoteip.so
Then add the line to the same file:
RemoteIPHeader x-forwarded-for
Restart the web server for the changes to take effect:
systemctl restart httpd.service
Additional Information:
Some tutorials advise you to modify the Apache log format - this will also work if done correctly, but it will take longer.
How to delete cookies that prevent caching
By default, requests that carry cookies are not cached. If your site has ad units or analytics counters, virtually every request carries cookies, so in effect Varnish caches nothing. Moreover, cookies set by ad networks and counters are of no use to the server.
To remove all cookies and allow data to be cached add:
sub vcl_recv {
unset req.http.Cookie;
}
This setting will disrupt sites that rely on cookies. For example, if you have a WordPress site with registration for users, then cookies are required to identify users. In such cases, it is better to use another solution, such as Memcached.
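A middle ground is to strip only the counter cookies and keep everything else. The sketch below assumes the unwanted cookies are the Google Analytics ones (names starting with _ga, plus _gid); the names are an example and must be adapted to the counters and ad networks your site actually uses.

```vcl
sub vcl_recv {
    if (req.http.Cookie) {
        # Remove the analytics cookies together with their leading separator
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_ga[^=]*|_gid)=[^;]*", "");
        # Remove a separator left at the start of the header
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
        # If nothing remains, drop the header so the request can be cached
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```

A request that carries only analytics cookies becomes cacheable, while a request with a session cookie keeps it and is still passed to the web server.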
How to exclude certain pages from caching
To exclude from caching all paths that begin with “/path/for/exclude/”:
sub vcl_recv {
if (req.url ~ "^/path/for/exclude/") {
return (pass);
}
}
Note that the ~ (tilde) character means that the search is performed using a regular expression. The ^ (caret) character matches the beginning of the string, that is, of the URL.
If you want to exclude URLs in which “/path/for/exclude/” appears in any part of the URL, then remove the ^ character, for example:
sub vcl_recv {
if (req.url ~ "/path/for/exclude/") {
return (pass);
}
}
If you want the search to be performed not by a regular expression, but by an exact match, then replace ~ with ==, for example:
sub vcl_recv {
if (req.url == "/path/for/exclude/") {
return (pass);
}
}
You can use the if () {} construct several times or combine several conditions into one. The || symbol means logical “OR”:
sub vcl_recv {
if (req.url ~ "myip" || req.url ~ "proxy-checker" || req.url ~ "my-user-agent") {
return (pass);
}
}
That is, the previous setting will exclude from caching all pages with the string “myip” or the string “proxy-checker” or the string “my-user-agent” anywhere in the URL.
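Because ~ performs a regular expression search, the same three conditions can also be written as a single alternation. This is an equivalent sketch of the rule above, not a different behavior.

```vcl
sub vcl_recv {
    # One pattern with | replaces the chain of || comparisons
    if (req.url ~ "(myip|proxy-checker|my-user-agent)") {
        return (pass);
    }
}
```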
An example in which, for the host suip.biz, the page whose URL is exactly “/ru” is excluded from the cache (to exclude every URL containing “/ru”, replace == with ~):
sub vcl_recv {
if (req.http.host == "suip.biz" && req.url == "/ru") {
return (pass);
}
}
How to exclude the home page from caching
The following example excludes the home page from the cache for the host suip.biz:
sub vcl_recv {
if (req.http.host == "suip.biz" && req.url == "/") {
return (pass);
}
}
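Keep in mind that req.url includes the query string, so the exact comparison with “/” does not match a request such as “/?utm_source=newsletter”. A sketch that also covers the home page with any query string, assuming that is what you want to exclude:

```vcl
sub vcl_recv {
    # Matches "/" alone and "/" followed by a query string
    if (req.http.host == "suip.biz" && req.url ~ "^/(\?.*)?$") {
        return (pass);
    }
}
```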
How to increase the size of the Varnish cache
The cache size is set in the launch command line of the service. To edit it, run:
systemctl edit --full varnish
To resize the in-memory cache, edit the -s option. By the way, malloc means that the cache is stored in RAM. The cache can also be stored in a file; if you need that, refer to the documentation.
Specify the desired cache size, for example -s malloc,1400m.
Restart the varnish service for the changes to take effect:
systemctl restart varnish
How to increase connection timeout in Varnish
If the sending of data from Varnish to the client is not complete within 60 seconds, the connection is dropped. For regular sites this is fine, but for video portals or long-term services it may not be enough.
Varnish has quite a few timeout parameters. You can view their values with the command:
varnishadm param.show | grep timeout
Output example:
backend_idle_timeout 60.000 [seconds] (default)
between_bytes_timeout 60.000 [seconds] (default)
cli_timeout 60.000 [seconds] (default)
connect_timeout 3.500 [seconds] (default)
first_byte_timeout 60.000 [seconds] (default)
idle_send_timeout 60.000 [seconds] (default)
pipe_timeout 60.000 [seconds] (default)
send_timeout 60.000 [seconds] (default)
thread_pool_timeout 300.000 [seconds] (default)
timeout_idle 5.000 [seconds] (default)
timeout_linger 0.050 [seconds] (default)
The parameter we need, which is responsible for the maximum time for sending data during one connection, is called send_timeout.
To change it, you need to edit the command line:
systemctl edit --full varnish
Add an option to the command like this:
-p send_timeout=SECONDS
For example, to set the send timeout to 10 minutes:
-p send_timeout=600
Please note that you cannot add the “s” suffix or any other suffix, otherwise the program will not start. Specify only a number, which is the timeout in seconds.
With this setting, we changed the maximum time for sending data to the client. There are also timeouts that limit how long Varnish waits for the web server (the backend). To change them, open the /etc/varnish/default.vcl file and add the following attributes to the existing backend default block, next to its .host and .port lines:
backend default {
# keep the existing .host and .port lines here
.connect_timeout = 600s;
.first_byte_timeout = 600s;
.between_bytes_timeout = 600s;
}
The send_timeout option cannot be added to this block: it is a parameter of the varnishd process rather than a backend attribute, so putting it in the config file makes the service fail to start. Besides the -p command line option, it can be changed on a running instance with “varnishadm param.set send_timeout 600”, but a value set this way is lost when the service restarts.
Restart the varnish service for the changes to take effect:
systemctl restart varnish
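The backend timeouts can also be raised for selected requests only, so that ordinary pages keep the short defaults. A minimal sketch, in which the “/long-running/” path is a hypothetical placeholder for the part of your site that needs more time:

```vcl
sub vcl_backend_fetch {
    # Hypothetical path: replace with the URLs of your slow pages
    if (bereq.url ~ "^/long-running/") {
        # Wait up to 10 minutes for the first byte and between bytes
        set bereq.first_byte_timeout = 600s;
        set bereq.between_bytes_timeout = 600s;
    }
}
```

This affects only the connection between Varnish and the web server; the time allowed for sending data to the client is still governed by send_timeout.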
How to exclude a specific site from Varnish caching
Excluding softocracy.ru from caching:
sub vcl_recv {
if (req.http.host ~ "(www\.)?softocracy\.ru") {
return(pass);
}
}
If you do not want a particular site to be cached, exclude it from caching entirely. In the example above, replace softocracy.ru with the domain of the site that should not be cached.
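The pattern in the example is not anchored, so it also matches any host name that merely contains “softocracy.ru”. A stricter sketch, assuming you want to match only the domain itself, with or without “www.” and an optional port number:

```vcl
sub vcl_recv {
    # (?i) ignores case; ^ and $ require the whole Host header to match
    if (req.http.host ~ "(?i)^(www\.)?softocracy\.ru(:[0-9]+)?$") {
        return (pass);
    }
}
```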
How to cache only a specific domain in Varnish
For all other domains, pass must be returned:
sub vcl_recv {
# if it's any domain other than example.com, then skip caching
if (! req.http.host ~ "(www\.)?example\.com") {
return (pass);
}
# otherwise switch to default behavior
}
pass tells Varnish not to look in its cache: the content is always fetched from the backend web server.
How to redirect HTTP to HTTPS in Varnish
If you use Varnish, it is no longer possible to redirect to HTTPS using the web server methods, since Apache no longer works with HTTPS connections and does not even listen on port 443.
The following example will show you how to configure Varnish to redirect all requests to suip.biz from HTTP to HTTPS.
import std;
sub vcl_recv {
# We ask Varnish to give 750 status for HTTP requests from external IP to port 80,
# but not with SSL Termination Proxy (Hitch).
if ((client.ip != "127.0.0.1" && std.port(server.ip) == 80) && (req.http.host ~ "^(?i)(www\.)?suip\.biz")) {
set req.http.x-redir = "https://" + req.http.host + req.url;
return (synth(750, ""));
}
}
sub vcl_synth {
# Listen to 750 status from vcl_recv.
if (resp.status == 750) {
// Redirect to HTTPS with a 301 status.
set resp.status = 301;
set resp.http.Location = req.http.x-redir;
return(deliver);
}
}
In the previous configuration example, replace the “suip.biz” domain with your own.
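A common variation passes the target address through the reason text of the synthetic response instead of a temporary request header. This is a sketch of an alternative to the configuration above, not an addition to it; it assumes the same setup (Hitch connecting from 127.0.0.1) and the same internal status 750.

```vcl
# Requires "import std;" at the top of the file, as in the example above.
sub vcl_recv {
    if (client.ip != "127.0.0.1" && std.port(server.ip) == 80 &&
        req.http.host ~ "(?i)^(www\.)?suip\.biz") {
        # Carry the HTTPS address in the reason text of the synthetic response
        return (synth(750, "https://" + req.http.host + req.url));
    }
}

sub vcl_synth {
    if (resp.status == 750) {
        # Read the address first: changing the status resets the reason text
        set resp.http.Location = resp.reason;
        set resp.status = 301;
        set resp.reason = "Moved Permanently";
        return (deliver);
    }
}
```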
How to delete cookies from all hosts except pages of a specific host
If you want cookies to be deleted for pages of all hosts, except for the pages of a specific host, then use the setting like this:
sub vcl_recv {
if (! req.http.host ~ "(www\.)?HERE\.DOMAIN") {
unset req.http.Cookie;
}
}
Instead of “HERE\.DOMAIN”, enter the domain for which cookies should not be deleted. Note that the dot is escaped with a backslash.
An example of a setting that deletes cookies from all domains, except for the pages of the domains wxmaxima.ru and softocracy.ru:
sub vcl_recv {
if (! req.http.host ~ "(www\.)?wxmaxima\.ru" && ! req.http.host ~ "(www\.)?softocracy\.ru") {
unset req.http.Cookie;
}
}
Attention:
1. Restart the varnish service for the changes to take effect:
systemctl restart varnish
2. In the config file use “unset req.http.Cookie;” only once! Every vcl_recv block is executed in turn, so if any one of several conditions matches, the cookies are removed even though another block was meant to keep them. Combine all the conditions you need with logical AND and OR into a single if statement in front of one “unset req.http.Cookie;”.
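When the protected hosts differ only in one part of the name, the conditions can be merged into one regular expression. A sketch equivalent in intent to the two-domain example above, with the pattern additionally anchored so that only these exact host names keep their cookies:

```vcl
sub vcl_recv {
    # Cookies survive only for wxmaxima.ru and softocracy.ru (with or without www.)
    if (req.http.host !~ "(?i)^(www\.)?(wxmaxima|softocracy)\.ru$") {
        unset req.http.Cookie;
    }
}
```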
How to remove cookies from all pages except specific URLs
If you want to exclude only certain pages from deleting cookies, then use the setting like this:
sub vcl_recv {
if (! req.url ~ "^/PATH/FOR/EXCLUDE/") {
unset req.http.Cookie;
}
}
An example of a setting that deletes cookies on all domains, except for pages of the wxmaxima.ru domain and pages on any domain whose URL begins with “/phpmyadmin”:
sub vcl_recv {
if (! req.http.host ~ "(www\.)?wxmaxima\.ru" && ! req.url ~ "^/phpmyadmin") {
unset req.http.Cookie;
}
}
Attention:
1. Restart the varnish service for the changes to take effect:
systemctl restart varnish
2. In the config file use “unset req.http.Cookie;” only once! Every vcl_recv block is executed in turn, so if any one of several conditions matches, the cookies are removed even though another block was meant to keep them. Combine all the conditions you need with logical AND and OR into a single if statement in front of one “unset req.http.Cookie;”.
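Grouping is often easier to read when the condition lists the cases in which cookies must be kept and is negated as a whole. The sketch below is logically the same rule as the wxmaxima.ru and phpmyadmin example above, only written in that form.

```vcl
sub vcl_recv {
    # Cookies are kept when EITHER condition holds; everywhere else they are removed
    if (!(req.http.host ~ "(www\.)?wxmaxima\.ru" || req.url ~ "^/phpmyadmin")) {
        unset req.http.Cookie;
    }
}
```

To protect one more host or path, add another || condition inside the parentheses rather than a second block.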
Adding and Removing HTTP Headers
Varnish Cache gives you the ability to modify, add, and remove HTTP headers for the request and response object.
Request headers
The vcl_recv subroutine is called at the beginning of the request, and this is where we will change the request headers. We will add a hello header with the value world and remove the user agent header.
sub vcl_recv {
...
set req.http.hello = "world";
unset req.http.user-agent;
...
}
The req.http object gives access to any request header. It is available in the client-side subroutines, including vcl_recv and vcl_deliver.
Response headers
vcl_backend_response is called when Varnish Cache receives the response headers from the upstream service. We will set the Cache-Control header to “public, max-age=600” and remove the Server header.
sub vcl_backend_response {
...
set beresp.http.cache-control = "public, max-age=600";
unset beresp.http.server;
...
}
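The beresp.http object gives access to any backend response header. It is available in vcl_backend_response (and vcl_backend_error); in vcl_deliver, the headers of the response sent to the client are accessed through resp.http instead.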
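Headers of the response sent to the client can be changed in vcl_deliver through resp.http. A small sketch that adds a diagnostic header showing whether the page came from the cache; the header name X-Cache is an assumption chosen for illustration, not something Varnish sets by itself.

```vcl
sub vcl_deliver {
    # obj.hits is the number of times this object has been served from the cache
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

Running curl -I against the site twice should then show the header change from MISS to HIT, which is a quicker check than comparing response times.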
Additional documentation: