Configuring Varnish for Drupal

Ki Kim's picture

Varnish Cache is an HTTP accelerator that sits in front of a web application and serves cached pages. It can speed up a web site significantly and take load off the web application. All web traffic then reaches the application through Varnish. An unintended side effect is that every incoming request looks as if it originates from a single IP, say 127.0.0.1, because every request really does come from a single source: Varnish.

Here is how to configure Drupal 7, Varnish, and the Apache web server to log the correct client IP addresses.

Setting up Varnish to forward to Apache

To use Varnish with Drupal 6, you had to run a version of Drupal called Pressflow, but all the Pressflow tweaks that Varnish needs made it into Drupal 7 core. I assume you already have Varnish installed for your site (I use version 3).

Varnish keeps its rules in a configuration file, typically at /etc/varnish/default.vcl, written in a C-like language called Varnish Configuration Language (VCL). You need VCL tuned for Drupal 7, and the people behind Pressflow have a nice example VCL to start from.

You may need to make a few changes in the file.

backend default {
  .host = "127.0.0.1";
  .port = "8080";
}

In my case, the hosting server runs many sites, so I changed .host to the site's internal IP and .port to 80.

backend default {
  .host = "192.168.1.100";
  .port = "80";
}

My instance of Varnish is configured to listen on port 6081 (a typical value). It either serves a cached page or hands the request over to the Apache web server. As far as Apache is concerned, all incoming requests come from Varnish, so all traffic appears to come from the same IP, which is the server's own.

Now you can visit your site at http://www.example.com:6081 and know that the pages are served by Varnish. You can check the response headers of a page to confirm that; the Firebug add-on for Firefox is one tool for doing so. This article shows various ways to tell if Varnish is working. When you are ready to deploy your website with Varnish, you need to make the site reachable through the standard HTTP port 80. One way is to configure the firewall so that incoming traffic on port 80 is forwarded to Varnish on port 6081. Another way is to configure Varnish itself to listen on port 80, in which case Apache must listen on a different port, say 8080.

Now we need to restore the original client IP in the Apache log files and the Drupal logs.
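As a rough illustration of the header check, here is a minimal Python sketch. It assumes Varnish is still adding its usual X-Varnish and Via response headers, which a customized VCL may strip, and the function names are made up for this example.

```python
from urllib.request import Request, urlopen


def looks_like_varnish(headers):
    """Return True if the response headers carry Varnish's usual fingerprints."""
    # Header names are case-insensitive, so normalise them before looking.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-varnish" in lowered:
        return True
    # Varnish normally also announces itself in the Via header.
    return "varnish" in lowered.get("via", "").lower()


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a dict."""
    with urlopen(Request(url, method="HEAD")) as response:
        return dict(response.headers.items())
```

Calling looks_like_varnish(fetch_headers("http://www.example.com:6081")) against your own site should then tell you whether the response passed through Varnish.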

Configuring Drupal 7 settings.php and Varnish

First, edit your site's settings.php, then uncomment and edit these lines:

$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('192.168.1.100');

This informs Drupal that a reverse proxy (Varnish) stands in front of Apache.

The IP address is that of Varnish, which here is the same as the internal IP of the site itself. In a simpler environment, '127.0.0.1' may just work.

Then go back to the Varnish VCL file (/etc/varnish/default.vcl) and add these lines to the "sub vcl_recv" section, near the top of the routine.
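To see why the proxy address matters, here is a simplified Python model of how a reverse-proxy-aware application can pick the client IP. It is a sketch of the general idea, not Drupal's actual code, and the function name and sample addresses are assumptions for illustration.

```python
def resolve_client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Pick the client IP the way a reverse-proxy-aware application might."""
    # Without a forwarded header, fall back to the connecting address.
    if not forwarded_for:
        return remote_addr
    # The header is a comma-separated chain; the connecting address ends it.
    chain = [part.strip() for part in forwarded_for.split(",") if part.strip()]
    chain.append(remote_addr)
    # Drop the known proxies and keep the right-most remaining address.
    untrusted = [ip for ip in chain if ip not in trusted_proxies]
    return untrusted[-1] if untrusted else remote_addr


# Request relayed by the trusted proxy: the forwarded address is used.
print(resolve_client_ip("192.168.1.100", "203.0.113.7", {"192.168.1.100"}))
# Request from an unknown address: the forwarded header is ignored.
print(resolve_client_ip("198.51.100.9", "203.0.113.7", {"192.168.1.100"}))
```

In this model the forwarded header only takes effect when the connecting address is listed as a proxy, which is why the address given in settings.php has to match the one Varnish connects from.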

sub vcl_recv {
  // ...
 
  # Let Drupal know client IP.
  remove req.http.X-Forwarded-For;
  set req.http.X-Forwarded-For = client.ip;
 
  // ...
}

This sets the HTTP header X-Forwarded-For to the request's original client IP. If for some reason you decide to name the header differently, say X-My-Own-Var, then you need to tell Drupal to expect it in settings.php:

$conf['reverse_proxy_header'] = 'HTTP_X_MY_OWN_VAR';

Note that the HTTP header name uses hyphens, but here it must be written in capitals, with hyphens turned into underscores and 'HTTP_' prepended.

Drupal will now save its logs with the correct user IP addresses.
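The naming rule can be written as a small Python helper. The function name is hypothetical and exists only to demonstrate the conversion.

```python
def server_key_for_header(header_name):
    """Convert an HTTP header name into the key PHP uses for it in $_SERVER."""
    # Upper-case the name, turn hyphens into underscores, and prepend HTTP_.
    return "HTTP_" + header_name.upper().replace("-", "_")


print(server_key_for_header("X-My-Own-Var"))     # HTTP_X_MY_OWN_VAR
print(server_key_for_header("X-Forwarded-For"))  # HTTP_X_FORWARDED_FOR
```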

Configuring Apache logs

We now turn our attention to fixing Apache's logs. My site is in a Plesk-managed environment, so the location of the files may be different for you. The Apache configuration for my site was at:

/var/www/vhosts/<your-domain.com>/conf/httpd.include

The file configures logs in two places: one for the secure host (usually port 443) and another for the non-secure host (usually port 80). Your site may not have a configuration for secure hosting.

CustomLog  /var/www/vhosts/<your-domain.com>/statistics/logs/access_log plesklog
ErrorLog  /var/www/vhosts/<your-domain.com>/statistics/logs/error_log

I added a line above them to define a LogFormat and pointed CustomLog at it:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" varnishlog
CustomLog  /var/www/vhosts/<your-domain.com>/statistics/logs/access_ssl_log varnishlog
ErrorLog  /var/www/vhosts/<your-domain.com>/statistics/logs/error_log

Note that the first two lines end in "varnishlog": the LogFormat line defines a format with that name, and CustomLog is told to use it.

The log format is a slightly modified version of the default, named "plesklog", which can be found in /etc/httpd/conf.d/zz010_psa_httpd.conf:

<IfModule mod_logio.c>
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" plesklog
</IfModule>
<IfModule !mod_logio.c>
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" plesklog
</IfModule>

I just replaced the first variable, %h, with %{X-Forwarded-For}i. That is, the remote host IP, which is always the same because every request comes from Varnish, is replaced with the HTTP header X-Forwarded-For that we set in the Varnish VCL file.

That's it. You have configured Varnish, Drupal 7, and Apache to log the correct user IPs.
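One consequence of this format is that Apache writes "-" in the first field when a request arrives without the header, for example when it reaches Apache directly instead of through Varnish. The Python sketch below reads that field; the function names and the sample log lines are invented for illustration.

```python
def client_field(log_line):
    """Return the first space-separated field of an access log line."""
    return log_line.split(" ", 1)[0]


def arrived_without_header(log_line):
    """True when Apache logged "-" because X-Forwarded-For was absent."""
    return client_field(log_line) == "-"


# Hypothetical log lines in the "varnishlog" format described above.
via_varnish = '203.0.113.7 - - [10/Oct/2012:13:55:36 -0700] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"'
direct_hit = '- - - [10/Oct/2012:13:55:40 -0700] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"'

print(client_field(via_varnish))           # 203.0.113.7
print(arrived_without_header(direct_hit))  # True
```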

TIP

If your site is protected by an HTTP password during development, you need to remove that protection to test properly with Varnish. One easy way to keep blocking public access to the site is to add code to the Varnish VCL file.

sub vcl_recv {
  if (client.ip != "12.34.56.78") {
    return ( error );
  }

  // ...
}

Varnish will serve an error page unless the traffic comes from 12.34.56.78, the allowed IP in this example.
The example Varnish configuration from Pressflow contains code, defined further down in the file, that generates an error page which refreshes every 5 seconds. Feel free to modify the error page if the refreshing bothers you.

References:
* https://fourkitchens.atlassian.net/wiki/display/TECH/Configure+Varnish+3...
* http://janezurevc.name/how-get-clients-ip-number-drupal-when-using-varnish
* https://www.varnish-cache.org/docs/2.1/faq/http.html

Comments

For those wanting to read more about using Drupal and Varnish and SESSION data, see http://www.flink.com.au/tips-tricks/erased-memories-cracked-varnish

Lullabot has an excellent tutorial on the .vcl file:

http://www.lullabot.com/articles/varnish-multiple-web-servers-drupal