Using Varnish HTTP to Boost HTTP Performance

In today's fast-paced digital landscape, web application performance is key to delivering seamless user experiences. Users expect fast-loading web pages and swift interactions. Any delays or sluggishness can lead to frustration, increased bounce rates, lost sales, and a tarnished reputation.

To ensure that your web application meets these expectations, you can leverage optimization technologies to enhance performance. Enter Varnish – a reliable HTTP accelerator that can revolutionize the speed and efficiency of your web application.

In this article, we will explore the power and potential of Varnish, guiding you through its key features, benefits, and use cases. We will share the exact steps to install, configure, and optimize it for maximum performance. Additionally, we will address security considerations and introduce some monitoring tools to ensure that Varnish remains robust and effective.

What is Varnish?

Varnish is a reverse proxy that speeds up websites by caching HTTP content. In a typical architecture, Varnish acts as an intermediary between clients and web servers. It stores and serves frequently accessed web resources from its local memory, resulting in reduced server load and faster response times.

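Varnish is open source and runs on Unix-like operating systems, including major Linux distributions, FreeBSD, and macOS. There is no native Windows build, so Windows users typically run it inside a Linux environment such as a virtual machine or container. Some of its key features are: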

Varnish is highly configurable. It offers a domain-specific language (DSL), known as the Varnish Configuration Language (VCL). Using VCL code, you can implement custom routing, alter requests, and implement connection pooling.
Varnish provides a robust set of caching features, including cache invalidation, defining uncacheable content, and native compression.
Varnish is easily extensible. You can use inline C code to extend your Varnish installation. (Note: Only use this feature if you have expertise in C development. Inline C code is compiled into Varnish, and any bugs in the code could crash Varnish.)
Varnish supports Edge Side Includes (ESI), which allows it to create web pages by combining different fragments. ESI can significantly boost hit rate and overall performance.
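
To make the ESI feature more concrete, here is a minimal VCL sketch that switches on ESI processing for a subset of pages. The /pages/ URL prefix is a hypothetical example, and the fragment assumes it is added to an existing VCL file (such as default.vcl) that already declares a backend.

```vcl
sub vcl_backend_response {
    # Hypothetical rule: only pages under /pages/ contain ESI tags
    if (bereq.url ~ "^/pages/") {
        # Ask Varnish to parse the response body for <esi:include> tags
        set beresp.do_esi = true;
    }
}
```

With this in place, a page that contains a tag such as <esi:include src="/fragments/header"/> is assembled by Varnish, and each fragment can be cached with its own TTL.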

Why should you use Varnish?

In addition to the obvious benefits of increased performance and speed, here are a few other reasons why you should use Varnish:

Scalability: Varnish supports multiple backends. Using VCL, you can implement efficient load balancing logic to increase scalability and availability.
Cost savings: By reducing server load and increasing overall throughput, Varnish helps you maximize your resource utilization. This can save costs by reducing the need for additional servers or bandwidth.
Security: By acting as a reverse proxy, Varnish shields your web servers from direct exposure to the internet, reducing your attack surface.

The Varnish workflow

To understand how Varnish works, let’s explore its workflow:

A client sends an HTTP request to access a web resource.
Varnish intercepts the request before it reaches the web server.
Varnish checks whether the requested content is present in its local memory. If it’s present, it’s served directly to the client, saving the round trip to the web server. If it’s not present, Varnish forwards the request to the web server.
The web server generates and sends the response to Varnish.
Varnish relays the response to the client. Depending on the configured caching strategy, Varnish may or may not cache the response for subsequent requests.
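
One way to observe this workflow is to have Varnish label each response as a cache hit or miss. The sketch below is an illustration rather than part of a default setup; the X-Cache header name is an arbitrary choice, and the fragment assumes it is added to an existing VCL file.

```vcl
sub vcl_deliver {
    # obj.hits counts how many times this object has been served from the cache
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

Requesting the same cacheable URL twice should then show MISS on the first response and HIT on the second.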

The importance of HTTP optimization

HTTP optimization encompasses a set of techniques that can be used to improve website performance. Examples of optimization strategies include caching, reducing the size of images, CSS minification, and using a Content Delivery Network (CDN).

Factors affecting HTTP performance

While optimizing the HTTP performance of your website, it’s important to consider the following factors:

Network latency: The time it takes for a request to travel from the client to the server and back again. Network latency can be influenced by factors such as geographical distance, network congestion, and connection quality.
Server response time: The time taken by the server to process a request and generate a response. Factors like server load, database queries, and code quality can impact response times.
File size and complexity: The size and complexity of web resources, including HTML, CSS, JavaScript, and multimedia files. Large files or complex code structures require more time to download and process, resulting in slower page loads.
Caching policies: Caching configuration and policies have a significant impact on HTTP performance. Efficient caching reduces the frequency of server requests by serving cached content directly. Conversely, improper caching policies can cause stale content or unnecessary requests, negatively affecting performance.
Bandwidth limitations: Limited bandwidth, especially in mobile or low-speed network environments, can significantly affect HTTP performance. Factors like large files, high request volume, and lack of optimization techniques can lead to high bandwidth utilization.

Why bother with HTTP optimization?

Let’s look at a few reasons why HTTP optimization is pivotal for online success:

The modern user is constantly bombarded with information. If a website doesn’t load in the first few seconds, the user is likely to abandon it.
Search engines, including Google, consider page load times when ranking websites. All else being equal, a faster-loading website tends to rank higher in search results.
Techniques like caching, compression, and configuration tuning can reduce the size of transmitted data, leading to reduced bandwidth usage.
An optimized web server helps deliver a faster and more responsive experience for mobile users.

Installing and configuring Varnish

In this section, we will share the steps to get started with Varnish on an Ubuntu server. Let’s begin.

Installing Varnish on Ubuntu

Varnish packages are published on the Packagecloud website. At the time of writing, the latest stable version of Varnish is 7.3.0. According to the official Varnish website, only three versions are supported (7.3.0, 7.2.1, and 6.0.11), as all other versions have reached end-of-life. Choose a version based on your specific requirements; we recommend installing 7.3.0.
Run the following command to get information about all the latest available packages:
```
sudo apt-get update
```
To fetch the right Varnish package, we must configure the repository that contains it. But first, run this command to install the dependencies needed to configure the Varnish package repository:
```
sudo apt-get install debian-archive-keyring curl gnupg apt-transport-https
```

This command adds the repository's GPG key to our package manager. The varnish73 repository holds the 7.3 series; substitute varnish60lts here and in the next step if you prefer the 6.0 LTS release.
```
curl -s -L https://packagecloud.io/varnishcache/varnish73/gpgkey | sudo apt-key add -
```
Finally, execute these commands to configure the Varnish package repository:
```
. /etc/os-release
sudo tee /etc/apt/sources.list.d/varnishcache_varnish73.list > /dev/null <<-EOF
deb https://packagecloud.io/varnishcache/varnish73/$ID/ $VERSION_CODENAME main
EOF
sudo tee /etc/apt/preferences.d/varnishcache > /dev/null <<-EOF
Package: varnish varnish-*
Pin: release o=packagecloud.io/varnishcache/*
Pin-Priority: 1000
EOF
```

Run the update command again so that information about the newly configured repository is also fetched.
```
sudo apt-get update
```
Finally, run this command to install the latest (7.3.0) version of Varnish:
```
sudo apt-get install varnish
```

Configuring Varnish on Ubuntu

Now that Varnish is installed on our system, we’ll go through some steps to configure it.

Step # 1 – The systemd configuration

The systemd configuration file for Varnish is located at /lib/systemd/system/varnish.service. You may tweak the settings in this file as per your needs. To do so, run the following command, which opens the file in the default editor:
```
sudo systemctl edit --full varnish
```

For instance, if you want to increase the maximum number of open file descriptors for Varnish, you can modify the LimitNOFILE parameter in the file, which defaults to 131072. You can also set the max size of the core file using the LimitCORE parameter, which defaults to infinity.

The ExecStart statement in the file includes the different command line options passed to the varnishd process at startup. You can pass your desired port, cache size, and any other options to Varnish by changing this command. For instance, we have changed the HTTP listening port to 80 and set the cache size to 5 gigabytes. Make sure that each backslash is the last character on its line. The ExecStart command is updated to:
```
ExecStart=/usr/sbin/varnishd \
          -a :80 \
          -a localhost:8443,PROXY \
          -p feature=+http2 \
          -f /etc/varnish/default.vcl \
          -s malloc,5g
```

Once you have made all the changes to the file, save it, and run the following command:
```
sudo systemctl daemon-reload
```

Step # 2 – Configuring Apache to run with Varnish

The next step is to configure your web server to run with Varnish. For this article, we will be using Apache, but you can configure any server that uses HTTP.

Edit the /etc/apache2/ports.conf configuration file and change the listening port from 80 to 8080.
Also, replace <VirtualHost *:80> with <VirtualHost *:8080> across all virtual host files. (You may use a port other than 8080, as long as you use the same one in both places.)

Step # 3 – Configuring your backends using VCL

The default VCL file is located at /etc/varnish/default.vcl. The default backend specification in the file uses the IP 127.0.0.1 and port 8080.
```
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}
```
If you chose a different port in the last step, specify that here. You can also add more backends to your Varnish setup using the same syntax.
```
backend python {
    .host = "127.0.0.1";
    .port = "7000";
}
```
The above definition configures another backend for our Python-based web application. Note that, by default, the VCL compiler rejects a backend that is defined but never referenced, so make sure your routing logic actually uses it.

Step # 4 – Restarting Apache and Varnish

Finally, we are ready to restart Apache and Varnish so that all the changes can be applied. Run the following command:
```
sudo systemctl restart apache2 varnish
```

Getting the most out of Varnish

Unlike most systems, Varnish doesn’t use configuration directives, and instead relies on VCL for configuration. Whether you want to implement an access control list, add health checks, use hashing for cached data, or serve content based on the user device, you can achieve it with VCL. Let’s look at a few tips and tricks to optimize Varnish configuration.

Configuring cache rules based on URL patterns and headers

By defining specific rules in VCL, you can control which content should be cached and for how long. One approach is to create cache rules based on URL patterns. Varnish allows you to match URLs using regular expressions and apply specific caching directives accordingly. Let’s look at an example:

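```
# beresp (the backend response) is only available in backend subroutines,
# so the TTL rules live in vcl_backend_response and match on bereq.url.
sub vcl_backend_response {
  # Match URLs starting with "/articles/"
  if (bereq.url ~ "^/articles/") {
    # Set a longer cache Time to Live (TTL) for articles
    set beresp.ttl = 1h;
  }

  # Match URLs ending with ".css" or ".js"
  if (bereq.url ~ "\.(css|js)$") {
    # Cache these static assets for a longer duration
    set beresp.ttl = 7d;
  }
}
```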

In the above code snippet, we are setting a longer TTL for the articles on our website, as they are less frequently updated. We are also caching static assets for a longer duration.

Similarly, you can also configure cache rules based on specific request or response headers. Here's an example:

```
sub vcl_backend_response {
  # Cache responses with a specific header value
  if (beresp.http.X-Cacheable == "true") {
    # Set a custom cache TTL for cacheable responses
    set beresp.ttl = 1h;
  }
}
```
In the above snippet, we set the TTL of the cached object to one hour if the backend response contains a particular header value.
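
The same mechanisms can be used to keep content out of the cache. The following sketch shows two common approaches under assumed names: the /admin/ and /checkout/ paths and the X-Cacheable: false header are hypothetical and should be adapted to your application.

```vcl
sub vcl_recv {
    # Hypothetical paths: always send these requests straight to the backend
    if (req.url ~ "^/(admin|checkout)/") {
        return (pass);
    }
}

sub vcl_backend_response {
    # Hypothetical opt-out header set by the application
    if (beresp.http.X-Cacheable == "false") {
        # Mark the object as uncacheable and remember that decision for 2 minutes
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
    }
}
```

The first rule decides based on the request alone, while the second lets the application opt out per response.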

Setting cache invalidation strategies

It’s important to have a cache invalidation strategy in place to ensure that users never get stale or outdated content. Varnish supports three different cache invalidation strategies:

HTTP purging: This method involves discarding an object from the cache. An example:
```
sub vcl_recv {
  if (req.method == "PURGE") {
    # Only clients listed in purge_acl may purge
    if (client.ip !~ purge_acl) {
      return (synth(403, "Forbidden"));
    }
    # Keep a scheme-stripped copy of the URL in a header (not needed by the purge itself)
    set req.http.X-Purge-URL = regsub(req.url, "^[^:]*:[/]{2}", "");
    return (purge);
  }
}
```
In the above code snippet, we first check whether the request method is PURGE, then verify that the client IP is part of the purge_acl access control list (which must be defined in the same VCL file), and finally return purge. This removes the object that matches the request's URL and Host header, along with its variants, from the cache.
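
The purge_acl list referenced in the purge logic has to be declared in the VCL file. A minimal sketch could look like the following; the internal network range is a hypothetical example, so replace it with the addresses that are allowed to purge in your environment.

```vcl
# Clients that are allowed to send PURGE requests
acl purge_acl {
    "localhost";
    "127.0.0.1";
    "::1";
    "192.168.10.0"/24;   # hypothetical internal network
}
```

It is safest to place the ACL above the vcl_recv subroutine that references it. A purge can then be triggered from an allowed host with a request such as curl -X PURGE http://localhost/articles/my-post (the path is illustrative).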

Bans: Another way to enforce cache invalidation is by banning certain content from being served by the cache. For example:

```
sub vcl_backend_response {
  # Add cache tags to the response headers
  set beresp.http.Cache-Tags = "category-123";
}

sub vcl_recv {
  # Invalidate cache when specific cache tags are present in the request
  if (req.http.Cache-Tags ~ "category-123") {
    # Invalidate all cached objects that carry this tag
    ban("obj.http.Cache-Tags ~ category-123");
  }
}
```
In the above code snippet, we tag every cached response with a Cache-Tags header, and impose a ban whenever an incoming request carries the matching tag in its own Cache-Tags header. The ban invalidates every cached object whose Cache-Tags header matches the expression.
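
Because a ban can invalidate large parts of the cache, it is usually worth restricting who may issue one. The sketch below is one possible variant, not the only way to do it: the BAN request method, the X-Ban-Tag header, and the ban_acl list are all names chosen for this illustration.

```vcl
# Clients that are allowed to issue bans (hypothetical list)
acl ban_acl {
    "localhost";
    "127.0.0.1";
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (client.ip !~ ban_acl) {
            return (synth(403, "Forbidden"));
        }
        # The tag to invalidate is passed in a custom request header
        if (!req.http.X-Ban-Tag) {
            return (synth(400, "X-Ban-Tag header required"));
        }
        # Invalidate every cached object whose Cache-Tags header matches the tag
        ban("obj.http.Cache-Tags ~ " + req.http.X-Ban-Tag);
        return (synth(200, "Ban added"));
    }
}
```

An allowed client could then invalidate a whole category with a request such as curl -X BAN -H "X-Ban-Tag: category-123" http://localhost/.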

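Force a cache miss: To make Varnish ignore the cached copy for a particular request and fetch a fresh one from the backend, you can set the req.hash_always_miss variable. The freshly fetched response is then cached and used for subsequent requests, which makes this a convenient way to refresh content.
```
sub vcl_recv {
  # Force a fresh fetch for requests with a specific query parameter
  if (req.url ~ "^/api" && req.url ~ "refresh=true") {
    set req.hash_always_miss = true;
    return (hash);
  }
}
```
In the above code snippet, we set req.hash_always_miss to true for all API requests whose URL contains refresh=true, so that each such request refreshes the cached object.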

Tuning Varnish for even better performance

Several parameters in Varnish can be adjusted to optimize performance based on your specific needs. Here are a few examples:

thread_pools: The number of worker thread pools (default: 2). Together with thread_pool_min and thread_pool_max, which set the number of threads per pool, it determines how many worker threads are available to handle traffic. Raising the per-pool limits is usually more effective than adding pools.
backend_idle_timeout: Specifies the idle timeout for unused backend connections. Setting an appropriate value helps in timely closure of unnecessary connections, preventing resource waste.
gzip_level: Sets the gzip compression level used when Varnish compresses HTTP responses. Adjusting the level (default: 6, minimum: 0, maximum: 9) balances the trade-off between size reduction and CPU usage.
http_resp_hdr_len and http_resp_size: These parameters control the maximum length of a single response header and the maximum total size of a backend response's status line and headers, respectively. Setting appropriate values for them ensures that Varnish can handle backends that send large headers.
default_keep: This parameter represents the default time for which Varnish keeps an object after its TTL and grace period have expired, so that it can be revalidated with a conditional request to the backend. Configuring an appropriate value ensures that expired objects don't occupy the cache for too long.

Intelligent routing between multiple backends

If you are using multiple backends, you can use VCL to implement logic to route requests based on factors such as request headers, URL patterns, or other conditions. Consider this example:

```
sub vcl_recv {
  if (req.http.User-Agent ~ "Mobile") {
    # Route mobile users to a specific backend server
    set req.backend_hint = mobile_backend;
  } else if (req.url ~ "^/auth/") {
    # Route auth requests to a dedicated backend server
    set req.backend_hint = auth_backend;
  } else {
    # Route all other requests to the default backend server
    set req.backend_hint = default_backend;
  }
}
```

In the above code snippet, we are routing mobile client requests to a dedicated mobile backend, authentication requests to the authentication backend, and all other requests to the default backend server.
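
The routing logic refers to three backends by name, so they have to be declared in the same VCL file. A sketch of matching definitions might look like this; the hosts and ports are placeholders, not values from a real deployment.

```vcl
# Hypothetical backends matching the names used in the routing logic
backend default_backend {
    .host = "127.0.0.1";
    .port = "8080";
}

backend mobile_backend {
    .host = "127.0.0.1";
    .port = "8081";
}

backend auth_backend {
    .host = "127.0.0.1";
    .port = "8082";
}
```

Declaring the backends above the vcl_recv subroutine keeps the file readable and avoids ordering issues.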

High availability considerations

Consider the following guidelines to ensure high availability:

Load balancing: Distribute traffic across multiple Varnish instances.
Redundancy: Deploy Varnish in a redundant configuration for failover.
Health monitoring: While defining backend servers in VCL, use the probe keyword to enforce automated health checks for the servers.
Scalability: Scale horizontally or vertically to handle increased workload.
Disaster recovery: Implement backup and recovery strategies for system failures or data loss.
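
As an illustration of the health monitoring guideline, the following sketch defines a probe, attaches it to two backends, and balances requests between them with the round-robin director from the bundled directors VMOD. The addresses, the /health endpoint, and the probe timings are assumptions made for this example.

```vcl
vcl 4.1;

import directors;

# Poll a (hypothetical) health endpoint every 5 seconds
probe healthcheck {
    .url = "/health";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;       # look at the last 5 probes
    .threshold = 3;    # at least 3 must succeed for the backend to be healthy
}

backend web1 {
    .host = "192.0.2.10";   # placeholder address
    .port = "8080";
    .probe = healthcheck;
}

backend web2 {
    .host = "192.0.2.11";   # placeholder address
    .port = "8080";
    .probe = healthcheck;
}

sub vcl_init {
    # Rotate requests across the backends
    new cluster = directors.round_robin();
    cluster.add_backend(web1);
    cluster.add_backend(web2);
}

sub vcl_recv {
    set req.backend_hint = cluster.backend();
}
```

The director takes a backend's probe status into account, so a server that fails its health checks stops receiving traffic until it recovers.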

Security considerations while using Varnish

Varnish offers several security controls to protect your web servers from cyberattacks.

Set up SSL/TLS: Varnish does not terminate TLS itself, but a TLS proxy such as Hitch can be placed in front of it to handle HTTPS traffic. To prevent sensitive data exposure, enable SSL/TLS.
Enable CLI authentication: Secure access to the Varnish CLI with Pre-Shared Key (PSK) authentication. This ensures that only authorized individuals can perform operations on the Varnish instance.
Use access control lists (ACLs): Use VCL to define ACLs, and use them to govern access to Varnish and your web servers.
Sanitize user content: Use VCL to sanitize user content, remove sensitive information, or apply custom security controls.
Rate limiting: With an extension such as the vsthrottle VMOD, Varnish can limit the number of requests that an IP address may send, or that specific resources may receive, within a given period. Rate limiting helps mitigate DDoS attacks and excessive resource consumption.
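
Several of these controls can be expressed in a few lines of VCL. The sketch below combines an ACL-protected path, a simple rate limit, and removal of revealing response headers. It assumes that the vsthrottle VMOD from the varnish-modules collection is installed, which is not the case in a stock installation; the /admin/ path, the network range, and the limit of 15 requests per 10 seconds are illustrative values.

```vcl
import vsthrottle;   # assumption: provided by the separately installed varnish-modules package

# Hypothetical list of clients allowed to reach the admin area
acl admin_acl {
    "127.0.0.1";
    "192.168.10.0"/24;
}

sub vcl_recv {
    # Block access to the admin area from outside the ACL
    if (req.url ~ "^/admin/" && client.ip !~ admin_acl) {
        return (synth(403, "Forbidden"));
    }

    # Allow each client at most 15 requests per 10 seconds
    if (vsthrottle.is_denied(client.identity, 15, 10s)) {
        return (synth(429, "Too Many Requests"));
    }
}

sub vcl_deliver {
    # Avoid advertising backend software details
    unset resp.http.Server;
    unset resp.http.X-Powered-By;
}
```

Without the VMOD, drop the import line and the rate-limiting condition; the remaining rules rely only on core VCL.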

Varnish tools for monitoring

Varnish is bundled with different tools for monitoring and reporting.

varnishtop is used for real-time monitoring and analysis of Varnish cache traffic, providing insights into the most frequently requested URLs and client IP addresses.
varnishhist can be used for analyzing the trends of response times in the Varnish cache, helping identify performance bottlenecks and optimize caching strategies.
varnishstat is used for displaying real-time statistics about the performance, health, and utilization of an instance.
varnishtest helps in automated testing and validation of Varnish configurations, ensuring proper functionality and performance.
varnishncsa is used for displaying HTTP request information in the NCSA common log file format.

Conclusion

Varnish is a reliable HTTP cache that plays a pivotal role in enhancing the performance and scalability of web architectures. With its diverse feature set, customizable VCL configuration, and built-in security features, Varnish empowers developers to optimize their applications and safeguard their web servers from potential threats.

However, it's important to remember that Varnish is just one part of the equation. For optimal website performance, monitoring performance from the users’ perspective is crucial. This is where Site24x7 comes in.

Site24x7 offers comprehensive website performance monitoring solutions that allow you to track the performance of your web applications from various geographical locations. By combining the power of Varnish with the insights provided by Site24x7, you can keep your website fast and catch performance regressions early.
