Caching is a technique that stores a copy of a given resource for fast retrieval. This technique can be found at any level of a web service, whether it is inside the backend application, on a proxy server, or even in the client web browser. The purpose is to reduce server load and increase responsiveness.
A web application usually runs behind a reverse proxy. It forwards incoming requests to a backend server and performs other tasks: TLS encryption, load balancing, caching, compression, etc. Nginx is a free and open source reverse proxy that implements all of these features.
This article shows how to configure reverse proxy caching with Nginx, increase availability by serving stale responses, limit request concurrency with locking, and normalize requests to increase cache efficiency. The prerequisites are basic knowledge of HTTP and prior experience with Nginx.
The Nginx documentation should be your go-to reference, particularly the following entries:
- ngx_http_proxy_module: documentation for the caching directives.
- Alphabetical index of variables: variables defined in the configuration.
- Configuration file measurement units: list of time and space specifiers.
This article assumes Nginx is configured with a virtual host answering requests for
http://app.example.com/. The application has some static files in
/srv/http/app, and the backend server is listening on a local port.
Before reloading the service, check for any configuration errors with
nginx -t. The full configuration after applying the lessons from this
article is available in § Conclusion.
§Define a caching zone
This section introduces the most important parameters of proxy_cache_path.
The path parameter indicates the directory for the cache entries, for example
/var/cache/nginx/app (note that the parent directory is not created automatically).
keys_zone=name:size specifies the name and the size of the shared memory area
that holds the keys; a size of 1M corresponds to 1 MB of shared memory and can
hold about 8000 keys.
The proxy cache works like a hash table with a key derived from
proxy_cache_key, set to $scheme$proxy_host$uri$is_args$args by default.
During request processing, these variables are substituted by their values.
The cache file location depends on the MD5 hash of proxy_cache_key and on the
levels parameter, which specifies how to split the digest into path segments.
The purpose is to prevent slowdowns caused by too many entries inside a single
directory.
levels=1:2 organizes the storage hierarchy with two directory levels. The path
corresponding to the hashed key is built as follows:
- The first-level directory is named after the last character of the MD5 digest.
- The second-level directory is named after the two preceding characters.
By default, when levels is not set, all the entries are stored in a single directory.
You might want to set
max_size to specify the maximum size of the cache on
disk (by default, it is unbounded).
When the cache exceeds max_size, the cache manager evicts:
- at most manager_files entries (by default, 100),
- in iterations of less than manager_threshold milliseconds (by default, 200 ms),
- separated by a pause of manager_sleep milliseconds (by default, 50 ms).
The loader process that indexes the cache when Nginx is starting has equivalent parameters.
Entries that haven't been used for the time specified with inactive are
evicted from the cache, whether they are expired or not. By default, inactive
is set to 10 minutes.
proxy_cache_path defines the shared memory zone, location, and properties of the cache:
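A minimal sketch combining the parameters above; the path, zone name, and durations are illustrative:

```nginx
# http context; /var/cache/nginx/app must exist and be writable by Nginx
proxy_cache_path /var/cache/nginx/app
                 levels=1:2
                 keys_zone=app:1m   # shared memory zone named "app", ~8000 keys
                 max_size=1g        # bound the size of the cache on disk
                 inactive=60m;      # evict entries unused for an hour
```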
To activate the cache, you need to set proxy_cache to the name of the shared
memory zone:
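For instance, inside the proxied location (the backend address is hypothetical):

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;  # hypothetical backend address
    proxy_cache app;                   # name of the keys_zone defined above
}
```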
It is now active, but you still have to specify which responses to cache.
To save responses to the cache, you can match request or response attributes such as the HTTP method, status code, or headers. You can declare these rules in the Nginx configuration or control the cache from the backend application.
proxy_cache_valid tells Nginx to cache responses with the matching status
codes for the specified duration. The following example caches responses with
status codes 200, 301, and 302 (the defaults when none are given) for 1 minute, and
all other responses, including errors, for 10 seconds:
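A configuration matching that description might look like this:

```nginx
proxy_cache_valid 200 301 302 1m;   # cacheable success and redirect responses
proxy_cache_valid any 10s;          # everything else, including errors
```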
While proxy_cache_valid includes responses based on the status code, you can
exclude resources using the following directives:
- proxy_cache_bypass: always queries the upstream server without looking at the cache, but still saves the response to the cache.
- proxy_no_cache: prevents the response from being saved to the cache (the cache may still be used to serve the request).
These two directives accept string parameters, including variables. If one of
these parameters is not empty and different from
0, then the cache is
bypassed or disabled for that request. For example:
- proxy_cache_bypass 1: always bypasses the cache.
- proxy_cache_bypass $http_pragma: bypasses the cache when the request contains the Pragma header.
- proxy_no_cache $cookie_sessionid: doesn't save responses to the cache for users with a sessionid cookie (but unauthenticated users visiting the same URL will still get a public cached response).
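A sketch combining both directives, assuming sessionid is the application's session cookie:

```nginx
# neither serve from nor save to the cache for authenticated sessions
proxy_cache_bypass $cookie_sessionid;
proxy_no_cache     $cookie_sessionid;
```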
You can apply caching directives to a specific location block, but for more complicated scenarios, you can use custom variables and conditionals, as the example in 9 Tips for Improving WordPress Performance demonstrates.
Because the default key includes $args, the cache may be filled with many
duplicate entries, unless the response somehow depends on the query parameters.
You can use proxy_cache_min_uses to set the minimum number of requests required
before the associated response gets saved to the cache:
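For example, to require three requests for the same key before caching the response:

```nginx
proxy_cache_min_uses 3;
```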
The configuration syntax is a little esoteric and has some pitfalls, so it may be easier to set the caching policy from the backend instead.
You can override the default cache policy set in the Nginx configuration using
the following response headers, which take precedence over proxy_cache_valid:
- X-Accel-Expires: indicates the response caching duration (0 to disable caching altogether).
- Vary: adds the listed request headers to the cache key.
- Set-Cookie: disables caching if present in the response.
Be careful with the Vary header, as it can render caching useless. For
instance, a key based on the entire Cookie header makes no sense because it
will likely be different for each logged-in user, so you will end up with as
many entries as there are users. See Best practices for using the Vary header
for more details.
If X-Accel-Expires is not present in the response, Nginx looks for the following headers:
- Expires: same as X-Accel-Expires, but in a more complicated format.
- Cache-Control: the modern way to set the caching policy.
As these last two headers also affect caching on the client, prefer
X-Accel-Expires to explicitly control reverse proxy caching:
- X-Accel-Expires: 0 to disable caching.
- X-Accel-Expires: 3600 to cache the response for an hour.
It is difficult to precisely explain how to configure caching because it highly
depends on the application, not to mention how
Cache-Control affects the
client web browser. Nevertheless, here's a general approach:
- For static resources, set expires to a long duration, and add fingerprints to their names for cache busting.
- Define a default cache policy with proxy_cache_valid for all shared resources, ensuring it is disabled for private ones such as user sessions.
- Always indicate private resources from the backend with the response header Cache-Control: no-store, to prevent any configuration mismatch that could hinder security (you really don't want to cache private webpages and serve them to everyone).
- Use X-Accel-Expires to override the previous policies when it is not possible to do so in the configuration.
To test the configuration, you can instrument the cache using the variable
$upstream_cache_status. For example, you can add the cache status (hit,
miss) to the response in an extension header such as X-Cache-Status:
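A one-line sketch; the X-Cache-Status name is arbitrary, any non-standard header works:

```nginx
add_header X-Cache-Status $upstream_cache_status;
```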
If reverse proxy caching is difficult to apply to some resources, then you will likely have to implement caching in the backend instead (which I leave as an exercise to the reader).
When a lot of requests arrive at the same time and the cache doesn't contain a valid entry, all the requests get forwarded to the backend server while the cache is being updated. The combination of stale responses and cache locking can prevent these bursts of requests to the backend.
With proxy_cache_use_stale updating, an expired entry is refreshed in the
background, and any request that arrives during the update receives the stale response:
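```nginx
proxy_cache_use_stale updating;   # serve the stale entry while it is being refreshed
```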
proxy_cache_use_stale can also return a
stale response when the backend is unavailable due to errors or timeouts:
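For example, assuming the listed conditions fit your availability requirements:

```nginx
# also fall back to a stale entry when the upstream misbehaves
proxy_cache_use_stale error timeout http_500 http_502 http_503 updating;
```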
Stale responses limit the number of concurrent requests to the upstream server. However, if no cache entry can be found, all the requests are passed through until it gets populated.
Enable proxy_cache_lock to prevent concurrent requests from updating the cache
simultaneously. Only the first request is passed upstream to update the cache,
while the others wait for its completion:
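```nginx
proxy_cache_lock on;
```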
The drawback is that requests waiting for the cache are blocked for at least
500 ms, and up to the duration set by
proxy_cache_lock_timeout (5 seconds
by default). These 500 ms come from the unconfigurable interval at which Nginx
wakes up the blocked request tasks to check whether the entry is present or
not. To overcome this problem, use the technique from the previous subsection
to return a stale response while the entry is being updated in the background:
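Combining the two directives might look like this:

```nginx
proxy_cache_lock      on;
proxy_cache_use_stale updating;   # serve the stale entry instead of blocking during refresh
```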
proxy_cache_lock is prone to misuse. If a resource is
configured for caching (e.g., with
proxy_cache_valid), but the backend always
prevents that (e.g., with
X-Accel-Expires: 0 or any other cache control
header), then the entry will always be missing and Nginx will never be able to
return a stale response during the update. Assuming the backend takes 100 ms to
answer, the proxy will only be able to make 10 background requests per second
to try to update the cache.
Meanwhile, the clients wait for
proxy_cache_lock_timeout to expire. Then,
their requests are all forwarded to the backend. This not only increases
latency, but also causes a surge of requests to the upstream server. Therefore,
make sure to keep Nginx synchronized with the backend when it excludes private
resources from the cache. You can disable caching based on the virtual host,
the location block, or an authentication cookie. As a workaround, you can also
disable proxy_cache_lock for the affected locations.
§Normalize request attributes
Response compression reduces bandwidth usage and loading time. Clients advertise the algorithms they support in a request header. Different values may correspond to the same response encoding, causing the cache to store multiple entries for the same response. Normalization is the process that maps these different values to a reduced set of entries, thereby improving caching effectiveness.
To send a compressed response, the server must choose the best encoding that
the client accepts, based on the value of the Accept-Encoding header. Clients
often support multiple compression algorithms, indicated by a comma-separated
list, for example Accept-Encoding: br, gzip, deflate.
HTML formatted content is also a good candidate for compression. You can enable live response compression with Nginx, but caching happens before compression, not after. That means responses from the cache are recompressed for each request. As a solution, you can implement compression in the backend instead.
Assuming Nginx caches responses based on Accept-Encoding, if a client sends a
request with Accept-Encoding: br, gzip, deflate, the Brotli-compressed
response is saved to a first entry identified by the value of this header. If
another client only indicates br, gzip, it gets the same Brotli-compressed
response, but it is recorded in a second cache entry.
To prevent caching the same response multiple times, you can perform header
normalization. The role of Nginx is to map the Accept-Encoding header to a
small set of values corresponding to the best compression scheme supported by
both the server and the client. For instance, if the header contains br, the
response should be compressed with Brotli; if it contains gzip, it should be
compressed with Gzip; otherwise, it shouldn't be compressed at all:
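A sketch using a map block in the http context; it stores the normalized value in $encoding, the variable used later in this article (Brotli support itself requires the third-party ngx_brotli module):

```nginx
# pick the best encoding supported by both sides, Brotli first
map $http_accept_encoding $encoding {
    default       "";      # no compression
    "~*\bbr\b"    br;
    "~*\bgzip\b"  gzip;
}
```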
Make sure to pass any normalized attributes to the backend:
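With the $encoding variable from the map above, that could be:

```nginx
# replace the client's header with the normalized value before proxying
proxy_set_header Accept-Encoding $encoding;
```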
By default, the cache doesn't look at the value of Accept-Encoding. If the
backend returns an encoded response but proxy_cache_key doesn't include the
encoding, a client that doesn't support Brotli may receive a previously
Brotli-encoded response from the cache. You could insert the header
Vary: Accept-Encoding in the response, which has the same effect as adding the
value of Accept-Encoding to the cache key. Unfortunately, Nginx uses the
original and immutable value of Accept-Encoding, not the normalized value set
with proxy_set_header.
As an alternative, you can add $encoding to proxy_cache_key. The default key is
$scheme$proxy_host$uri$is_args$args. If you just append $encoding to it, you
can get a key collision when the URL ends with the name of an encoding, so
$encoding should come before $uri or any other user-controlled fields:
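For example:

```nginx
# $encoding sits before the user-controlled parts of the key
proxy_cache_key $scheme$proxy_host$encoding$uri$is_args$args;
```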
This way, Nginx can cache responses with multiple encodings, without duplicate entries. Compared to live recompression, you can afford a higher compression level to decrease bandwidth, with no significant increase in server load.
For demonstration purposes, it is possible to do everything with Nginx, without
having to implement response compression in the backend, without modifying
proxy_cache_key, and with support for
Vary based on the normalized header
values. The trick is to chain multiple Nginx "servers".
The first endpoint is the one clients connect to. It normalizes the request
header Accept-Encoding and passes the request to the next endpoint in the
chain. This is necessary to get around the request header immutability.
The second, internal endpoint caches the response. If Vary: Accept-Encoding is
set, it associates the entry with the normalized Accept-Encoding given by the
first endpoint.
The third endpoint proxies the request to the backend server and performs live
response compression, adding
Vary: Accept-Encoding. A separate server is
required because compression happens after caching if they are both enabled.
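A condensed sketch of the chain; the loopback ports are hypothetical, the $encoding map from earlier is assumed, and only gzip is handled since Brotli would need a third-party module:

```nginx
# 1) public endpoint: normalizes Accept-Encoding
server {
    listen 80;
    server_name app.example.com;
    location / {
        proxy_set_header Accept-Encoding $encoding;
        proxy_pass http://127.0.0.1:8001;
    }
}

# 2) internal endpoint: caches the compressed response, honoring Vary
server {
    listen 127.0.0.1:8001;
    location / {
        proxy_cache app;
        proxy_cache_valid 200 1m;
        proxy_pass http://127.0.0.1:8002;
    }
}

# 3) internal endpoint: compresses the backend response
server {
    listen 127.0.0.1:8002;
    location / {
        gzip on;
        gzip_vary on;                      # adds Vary: Accept-Encoding
        proxy_pass http://127.0.0.1:8080;  # hypothetical backend address
    }
}
```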
Of course, copying the data around and opening extra connections is less efficient than a single server block.
§Conclusion
Caching can drastically increase server performance, even with a validity of only a few seconds in high-traffic scenarios.
Final recommendations to configure reverse proxy caching with Nginx:
- Cache static assets and public dynamic webpages.
- Apply header normalization to cache compressed responses.
- Always set Cache-Control for private resources.
- Beware of any mismatch between Nginx and the backend regarding private resources.
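As a recap, here is a consolidated sketch of the directives covered above; the paths, zone name, durations, and backend address are illustrative and should be adapted to your application:

```nginx
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app:1m
                 max_size=1g inactive=60m;

# normalize Accept-Encoding into $encoding (see § Normalize request attributes)
map $http_accept_encoding $encoding {
    default       "";
    "~*\bbr\b"    br;
    "~*\bgzip\b"  gzip;
}

server {
    listen 80;
    server_name app.example.com;

    location /static/ {
        root /srv/http/app;   # fingerprinted static assets, illustrative layout
        expires max;
    }

    location / {
        proxy_pass http://127.0.0.1:8080;   # hypothetical backend address
        proxy_set_header Accept-Encoding $encoding;

        proxy_cache app;
        proxy_cache_key $scheme$proxy_host$encoding$uri$is_args$args;
        proxy_cache_valid 200 301 302 1m;
        proxy_cache_valid any 10s;
        proxy_cache_use_stale error timeout updating;
        proxy_cache_lock on;

        add_header X-Cache-Status $upstream_cache_status;
    }
}
```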