Caching is a technique that stores a copy of a given resource for fast retrieval. This technique can be found at any level of a web service, whether it is inside the backend application, on a proxy server, or even in the client web browser. The purpose is to reduce server load and increase responsiveness.
A web application usually runs behind a reverse proxy. It forwards incoming requests to a backend server and performs other tasks: TLS encryption, load balancing, caching, compression, etc. Nginx is a free and open source reverse proxy that implements all of these features.
This article shows how to configure reverse proxy caching with Nginx, increase availability by serving stale responses, limit request concurrency with locking, and normalize requests to increase cache efficiency. The prerequisites are basic knowledge of HTTP and prior experience with Nginx.
The Nginx documentation should be your go-to reference, particularly the following entries:
- ngx_http_proxy_module: documentation for the caching directives.
- Alphabetical index of variables: variables defined in the configuration.
- Configuration file measurement units: list of time and space specifiers.
This article assumes Nginx is configured with a virtual host answering requests for
http://app.example.com/. The application has some static files in
/srv/http/app, and the backend server is listening on a local port.
Before reloading the service, check for any configuration errors with
nginx -t. The full configuration after applying the lessons from this
article is available in § Conclusion.
§Define a caching zone
This section introduces the most important parameters of proxy_cache_path.
The path parameter indicates the directory for the cache entries, for example
/var/cache/nginx/app (note that the parent directory is not created automatically).
keys_zone=name:size specifies the name and the size of the shared memory area
that holds the keys; a size of 1M corresponds to 1 MB of shared memory and can
hold about 8000 keys.
The proxy cache works like a hash table with a key derived from
proxy_cache_key, set to $scheme$proxy_host$uri$is_args$args by default.
During request processing, these variables are substituted by their values.
The cache file location depends on the MD5 hash of proxy_cache_key and on the
levels parameter, which specifies how to split the digest into path segments.
The purpose is to prevent slowdowns caused by too many entries inside a single
directory.
levels=1:2 organizes the storage hierarchy with two directory levels. The path
corresponding to the hashed key is built as follows:
- The first-level directory is named after the last character of the MD5 digest.
- The second-level directory is named after the two preceding characters.
By default, when levels is not set, all the entries are stored in a single directory.
You might want to set
max_size to specify the maximum size of the cache on
disk (by default, it is unbounded).
When the cache exceeds max_size, the cache manager evicts:
- at most manager_files entries (by default, 100),
- in iterations of less than manager_threshold milliseconds (by default, 200 ms),
- separated by a pause of manager_sleep milliseconds (by default, 50 ms).
The loader process that indexes the cache when Nginx is starting has equivalent parameters.
Entries that haven't been used for the time specified with inactive are
evicted from the cache, whether they are expired or not. By default, inactive
is set to 10 minutes.
proxy_cache_path defines the shared memory zone, location, and properties of the cache:
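A minimal sketch combining the parameters above; the path, zone name, and durations are illustrative:

```nginx
# http context; /var/cache/nginx/app must exist and be writable by Nginx
proxy_cache_path /var/cache/nginx/app
                 levels=1:2
                 keys_zone=app:1m   # shared memory zone named "app", ~8000 keys
                 max_size=1g        # bound the size of the cache on disk
                 inactive=60m;      # evict entries unused for an hour
```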
To activate the cache, you need to set proxy_cache to the name of the shared
memory zone:
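For instance, inside the proxied location (the backend address is hypothetical):

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;  # hypothetical backend address
    proxy_cache app;                   # name of the keys_zone defined above
}
```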
It is now active, but you still have to specify which responses to cache.
To save responses to the cache, you can match request or response attributes such as the HTTP method, status code, or headers. You can declare these rules in the Nginx configuration or control the cache from the backend application.
proxy_cache_valid tells Nginx to cache responses with the matching status
codes for the specified duration. The following example caches responses with
status codes 200, 301, and 302 (the defaults when none are given) for 1 minute, and
all other responses, including errors, for 10 seconds:
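A configuration matching that description might look like this:

```nginx
proxy_cache_valid 200 301 302 1m;   # cacheable success and redirect responses
proxy_cache_valid any 10s;          # everything else, including errors
```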
While proxy_cache_valid includes responses based on the status code, you can
exclude resources using the following directives:
- proxy_cache_bypass: always queries the upstream server without looking at the cache, but still saves the response to the cache.
- proxy_no_cache: prevents the response from being saved to the cache (the cache may still be used to serve the request).
These two directives accept string parameters, including variables. If one of
these parameters is not empty and different from
0, then the cache is
bypassed or disabled for that request. For example:
- proxy_cache_bypass 1: always bypasses the cache.
- proxy_cache_bypass $http_pragma: bypasses the cache when the request contains the Pragma header.
- proxy_no_cache $cookie_sessionid: doesn't save responses to the cache for users with a sessionid cookie (but unauthenticated users visiting the same URL will still get a public cached response).
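A sketch combining both directives, assuming sessionid is the application's session cookie:

```nginx
# neither serve from nor save to the cache for authenticated sessions
proxy_cache_bypass $cookie_sessionid;
proxy_no_cache     $cookie_sessionid;
```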
You can apply caching directives to a specific location block, but for more complicated scenarios, you can use custom variables and conditionals, as the example in 9 Tips for Improving WordPress Performance demonstrates.
Because the default key includes $args, the cache may be filled with many
duplicate entries, unless the response somehow depends on the query parameters.
You can use proxy_cache_min_uses to set the minimum number of requests required
before the associated response gets saved to the cache:
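For example, to require three requests for the same key before caching the response:

```nginx
proxy_cache_min_uses 3;
```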
The configuration syntax is a little esoteric and has some pitfalls, so it may be easier to set the caching policy from the backend instead.
You can override the default cache policy set in the Nginx configuration using
the following response headers, which take precedence over proxy_cache_valid:
- X-Accel-Expires: indicates the response caching duration (0 to disable caching altogether).
- Vary: adds the listed request headers to the cache key.
- Set-Cookie: disables caching if present in the response.
Be careful with the Vary header, as it can render caching useless. For
instance, a key based on the entire Cookie header makes no sense because it
will likely be different for each logged-in user, so you will end up with as
many entries as there are users. See Best practices for using the Vary header
for more details.
If X-Accel-Expires is not present in the response, Nginx looks for the following headers:
- Expires: same as X-Accel-Expires, but in a more complicated format.
- Cache-Control: the modern way to set the caching policy.
As these last two headers also affect caching on the client, prefer
X-Accel-Expires to explicitly control reverse proxy caching:
- X-Accel-Expires: 0 to disable caching.
- X-Accel-Expires: 3600 to cache the response for an hour.
It is difficult to precisely explain how to configure caching because it highly
depends on the application, not to mention how
Cache-Control affects the
client web browser. Nevertheless, here's a general approach:
- For static resources, set expires to a long duration, and add fingerprints to their names for cache busting.
- Define a default cache policy with proxy_cache_valid for all shared resources, ensuring it is disabled for private ones such as user sessions.
- Always indicate private resources from the backend with the response header Cache-Control: no-store, to prevent any configuration mismatch that could hinder security (you really don't want to cache private webpages and serve them to everyone).
- Use X-Accel-Expires to override the previous policies when it is not possible to do so in the configuration.
To test the configuration, you can instrument the cache using the variable
$upstream_cache_status. For example, you can add the cache status (hit,
miss) to the response in an extension header such as X-Cache-Status:
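A one-line sketch; the X-Cache-Status name is arbitrary, any non-standard header works:

```nginx
add_header X-Cache-Status $upstream_cache_status;
```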
If reverse proxy caching is difficult to apply to some resources, then you will likely have to implement caching in the backend instead (which I leave as an exercise to the reader).
When a lot of requests arrive at the same time and the cache doesn't contain a valid entry, all the requests get forwarded to the backend server while the cache is being updated. The combination of stale responses and cache locking can prevent these bursts of requests to the backend.
With proxy_cache_use_stale updating, an expired entry is refreshed in the
background, and any request that arrives during the update receives the stale response:
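```nginx
proxy_cache_use_stale updating;   # serve the stale entry while it is being refreshed
```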
proxy_cache_use_stale can also return a
stale response when the backend is unavailable due to errors or timeouts:
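For example, assuming the listed conditions fit your availability requirements:

```nginx
# also fall back to a stale entry when the upstream misbehaves
proxy_cache_use_stale error timeout http_500 http_502 http_503 updating;
```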
Stale responses limit the number of concurrent requests to the upstream server. However, if no cache entry can be found, all the requests are passed through until it gets populated.
Enable proxy_cache_lock to prevent concurrent requests from updating the cache
simultaneously. Only the first request is passed upstream to update the cache,
while the others wait for its completion:
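```nginx
proxy_cache_lock on;
```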
The drawback is that requests waiting for the cache are blocked for at least
500 ms, and up to the duration set by
proxy_cache_lock_timeout (5 seconds
by default). These 500 ms come from the unconfigurable interval at which Nginx
wakes up the blocked request tasks to check whether the entry is present or
not. To overcome this problem, use the technique from the previous subsection
to return a stale response while the entry is being updated in the background:
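Combining the two directives might look like this:

```nginx
proxy_cache_lock      on;
proxy_cache_use_stale updating;   # serve the stale entry instead of blocking during refresh
```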
proxy_cache_lock is prone to misuse. If a resource is
configured for caching (e.g., with
proxy_cache_valid), but the backend always
prevents that (e.g., with
X-Accel-Expires: 0 or any other cache control
header), then the entry will always be missing and Nginx will never be able to
return a stale response during the update. Assuming the backend takes 100 ms to
answer, the proxy will only be able to make 10 background requests per second
to try to update the cache.
Meanwhile, the clients wait for
proxy_cache_lock_timeout to expire. Then,
their requests are all forwarded to the backend. This not only increases
latency, but also causes a surge of requests to the upstream server. Therefore,
make sure to keep Nginx synchronized with the backend when it excludes private
resources from the cache. You can disable caching based on the virtual host,
the location block, or an authentication cookie. As a workaround, you can also
disable proxy_cache_lock for the affected locations.
§Normalize request attributes
Response compression reduces bandwidth usage and loading time. Clients advertise the algorithms they support in a request header. Different values may correspond to the same response encoding, causing the cache to store multiple entries for the same response. Normalization is the process that maps these different values to a reduced set of entries, thereby improving caching effectiveness.
To send a compressed response, the server must choose the best encoding that
the client accepts, based on the value of the Accept-Encoding header. Clients
often support multiple compression algorithms, indicated by a comma-separated
list, for example Accept-Encoding: br, gzip, deflate.
HTML formatted content is also a good candidate for compression. You can enable live response compression with Nginx, but caching happens before compression, not after. That means responses from the cache are recompressed for each request. As a solution, you can implement compression in the backend instead.
Assuming Nginx caches responses based on Accept-Encoding, if a client sends a
request with Accept-Encoding: br, gzip, deflate, the Brotli-compressed
response is saved to a first entry identified by the value of this header. If
another client only indicates br, gzip, it gets the same Brotli-compressed
response, but it is recorded in a second cache entry.
To prevent caching the same response multiple times, you can perform header
normalization. The role of Nginx is to map the Accept-Encoding header to a
small set of values corresponding to the best compression scheme supported by
both the server and the client. For instance, if the header contains br, the
response should be compressed with Brotli; if it contains gzip, it should be
compressed with Gzip; otherwise, it shouldn't be compressed at all:
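A sketch using a map block in the http context; it stores the normalized value in $encoding, the variable used later in this article (Brotli support itself requires the third-party ngx_brotli module):

```nginx
# pick the best encoding supported by both sides, Brotli first
map $http_accept_encoding $encoding {
    default       "";      # no compression
    "~*\bbr\b"    br;
    "~*\bgzip\b"  gzip;
}
```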
Make sure to pass any normalized attributes to the backend:
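With the $encoding variable from the map above, that could be:

```nginx
# replace the client's header with the normalized value before proxying
proxy_set_header Accept-Encoding $encoding;
```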
By default, the cache doesn't look at the value of Accept-Encoding. If the
backend returns an encoded response but proxy_cache_key doesn't include the
encoding, a client that doesn't support Brotli may receive a previously
Brotli-encoded response from the cache. You could insert the header
Vary: Accept-Encoding in the response, which has the same effect as adding the
value of Accept-Encoding to the cache key. Unfortunately, Nginx uses the
original and immutable value of Accept-Encoding, not the normalized value set
with proxy_set_header.
As an alternative, you can add $encoding to proxy_cache_key. The default key is
$scheme$proxy_host$uri$is_args$args. If you just append $encoding to it, you
can get a key collision when the URL ends with the name of an encoding, so
$encoding should come before $uri or any other user-controlled fields:
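For example:

```nginx
# $encoding sits before the user-controlled parts of the key
proxy_cache_key $scheme$proxy_host$encoding$uri$is_args$args;
```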
This way, Nginx can cache responses with multiple encodings, without duplicate entries. Compared to live recompression, you can afford a higher compression level to decrease bandwidth, with no significant increase in server load.
For demonstration purposes, it is possible to do everything with Nginx, without
having to implement response compression in the backend, without modifying
proxy_cache_key, and with support for
Vary based on the normalized header
values. The trick is to chain multiple Nginx "servers".
The first endpoint is the one clients connect to. It normalizes the request
header Accept-Encoding and passes the request to the next endpoint in the
chain. This is necessary to get around the request header immutability.
The second, internal endpoint caches the response. If Vary: Accept-Encoding is
set, it associates the entry with the normalized Accept-Encoding given by the
first endpoint.
The third endpoint proxies the request to the backend server and performs live
response compression, adding
Vary: Accept-Encoding. A separate server is
required because compression happens after caching if they are both enabled.
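A condensed sketch of the chain; the loopback ports are hypothetical, the $encoding map from earlier is assumed, and only gzip is handled since Brotli would need a third-party module:

```nginx
# 1) public endpoint: normalizes Accept-Encoding
server {
    listen 80;
    server_name app.example.com;
    location / {
        proxy_set_header Accept-Encoding $encoding;
        proxy_pass http://127.0.0.1:8001;
    }
}

# 2) internal endpoint: caches the compressed response, honoring Vary
server {
    listen 127.0.0.1:8001;
    location / {
        proxy_cache app;
        proxy_cache_valid 200 1m;
        proxy_pass http://127.0.0.1:8002;
    }
}

# 3) internal endpoint: compresses the backend response
server {
    listen 127.0.0.1:8002;
    location / {
        gzip on;
        gzip_vary on;                      # adds Vary: Accept-Encoding
        proxy_pass http://127.0.0.1:8080;  # hypothetical backend address
    }
}
```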
Of course, copying the data around and opening extra connections is less efficient than a single server block.
§Conclusion
Caching can drastically increase server performance, even with a validity of only a few seconds in high-traffic scenarios.
Final recommendations to configure reverse proxy caching with Nginx:
- Cache static assets and public dynamic webpages.
- Apply header normalization to cache compressed responses.
- Always set Cache-Control for private resources.
- Beware of any mismatch between Nginx and the backend regarding private resources.
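As a recap, here is a consolidated sketch of the directives covered above; the paths, zone name, durations, and backend address are illustrative and should be adapted to your application:

```nginx
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app:1m
                 max_size=1g inactive=60m;

# normalize Accept-Encoding into $encoding (see § Normalize request attributes)
map $http_accept_encoding $encoding {
    default       "";
    "~*\bbr\b"    br;
    "~*\bgzip\b"  gzip;
}

server {
    listen 80;
    server_name app.example.com;

    location /static/ {
        root /srv/http/app;   # fingerprinted static assets, illustrative layout
        expires max;
    }

    location / {
        proxy_pass http://127.0.0.1:8080;   # hypothetical backend address
        proxy_set_header Accept-Encoding $encoding;

        proxy_cache app;
        proxy_cache_key $scheme$proxy_host$encoding$uri$is_args$args;
        proxy_cache_valid 200 301 302 1m;
        proxy_cache_valid any 10s;
        proxy_cache_use_stale error timeout updating;
        proxy_cache_lock on;

        add_header X-Cache-Status $upstream_cache_status;
    }
}
```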