Caching is a technique that stores a copy of a given resource for fast retrieval. This technique can be found at any level of a web service, whether it is inside the backend application, on a proxy server, or even in the client web browser. The purpose is to reduce server load and increase responsiveness.
A web application usually runs behind a reverse proxy, which forwards incoming requests to the backend server and performs other tasks: TLS termination, load balancing, caching, compression, etc. Nginx is a free and open source reverse proxy that implements all of these features.
This article shows how to configure reverse proxy caching with Nginx, increase availability by serving stale responses, limit request concurrency with locking, and normalize requests to increase cache efficiency. The prerequisites are basic knowledge of HTTP and prior experience with Nginx.
§Introduction
The Nginx documentation should be your go-to reference, particularly the following entries:
- Module ngx_http_proxy_module: documentation for the caching directives.
- Alphabetical index of variables: variables defined in the configuration.
- Configuration file measurement units: list of time and space specifiers.
This article assumes Nginx is configured with a virtual host answering requests for http://app.example.com/. The application has some static files in /srv/http/app, and the backend server is listening on http://127.0.0.1:3000:
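A minimal virtual host along these lines might look like the following sketch (the server name, document root, and backend address are the ones assumed above; the /assets/ prefix is an illustrative assumption):

```nginx
server {
    listen 80;
    server_name app.example.com;

    # Static files served directly by Nginx.
    location /assets/ {
        root /srv/http/app;
    }

    # Everything else is forwarded to the backend application.
    location / {
        proxy_pass http://127.0.0.1:3000;
    }
}
```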
Before reloading the service, check for any configuration errors with the command nginx -t. The full configuration after applying the lessons from this article is available in § Conclusion.
§Define a caching zone
The proxy_cache_path directive defines a named cache location that you can enable with the proxy_cache directive.
§Parameters
This section introduces the most important parameters of proxy_cache_path.
§path (mandatory)
The path parameter indicates the directory for the cache entries, for example, /var/cache/nginx/app (note that the parent directory is not created automatically).
§keys_zone (mandatory)
keys_zone=name:size specifies the name and the size of the shared memory area that holds the keys. A size of 1M corresponds to 1 MB of shared memory that can hold about 8,000 keys.
§levels
The proxy cache works like a hash table with a key derived from proxy_cache_key, set to $scheme$proxy_host$uri$is_args$args by default. During request processing, these variables are substituted with their values. The cache file location depends on the MD5 hash of proxy_cache_key and the argument of the levels option, which specifies how to split the digest into segments. The purpose is to prevent slowdowns caused by too many entries inside a single directory.
For instance, levels=1:2 organizes the storage hierarchy with two directory levels. The path corresponding to the hashed key b1946ac92492d2347c6235b4d2611184 is /var/cache/nginx/app/4/18/b1946ac92492d2347c6235b4d2611184:
- The first-level directory is named after the last character of the hash (4).
- The second-level directory is named after the two preceding characters (18).

By default, all the entries are stored directly in path.
§max_size
You might want to set max_size to specify the maximum size of the cache on disk (by default, it is unbounded).
§manager_*
When the cache exceeds max_size, the cache manager evicts:
- at most manager_files entries (by default, 100),
- in iterations of less than manager_threshold milliseconds (by default, 200 ms),
- separated by a pause of manager_sleep milliseconds (by default, 50 ms).

The loader process that indexes the cache when Nginx is starting has equivalent parameters.
§inactive
Entries that haven't been used for the time specified with inactive are evicted from the cache, whether they are expired or not. By default, inactive is set to 10m.
§Configuration
proxy_cache_path defines the shared memory zone, location, and properties of the cache:
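For example, the following sketch creates a zone named app (the zone name, sizes, and durations are illustrative choices, not prescribed values):

```nginx
# In the http context: 10 MB of shared memory for keys (roughly 80,000 entries),
# at most 1 GB on disk, two-level directory hierarchy, and eviction of entries
# that have been unused for one hour.
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app:10m
                 max_size=1g inactive=1h;
```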
§Activation
To activate the cache, you need to set proxy_cache to the name of the shared memory zone:
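For instance, assuming the zone defined above is named app:

```nginx
location / {
    proxy_pass  http://127.0.0.1:3000;
    proxy_cache app;
}
```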
It is now active, but you still have to specify which responses to cache.
§Control caching
To save responses to the cache, you can match request or response attributes such as the HTTP method, status code, or headers. You can declare these rules in the Nginx configuration or control the cache from the backend application.
§Configuration
proxy_cache_valid tells Nginx to cache responses with the matching status codes for the specified duration. The following example caches responses with status codes 200, 301, and 302 (the defaults when none are given) for 1 minute, and all other responses, including errors, for 10 seconds:
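A sketch of that policy:

```nginx
proxy_cache_valid 1m;        # 200, 301, and 302 responses, cached for one minute
proxy_cache_valid any 10s;   # everything else, including errors, for ten seconds
```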
While proxy_cache_valid includes responses based on the status code, you can exclude resources using the following directives:
- proxy_cache_bypass: the response is not taken from the cache (the upstream server is always queried), but it can still be saved to the cache.
- proxy_no_cache: the response is not saved to the cache, but an existing entry can still be served.
These two directives accept string parameters, including variables. If one of these parameters is not empty and different from 0, then the cache is bypassed or disabled for that request. For example:
- proxy_cache_bypass 1: always bypasses the cache.
- proxy_cache_bypass $http_pragma: bypasses the cache when the request contains the Pragma header.
- proxy_no_cache $cookie_sessionid: doesn't cache responses for users with a sessionid cookie (but unauthenticated users visiting the same URL will get a public cached response).
You can apply caching directives to a specific location block, but for more complicated scenarios, you can use custom variables and conditionals, as this example from 9 Tips for Improving WordPress Performance demonstrates:
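Here is a sketch in the spirit of that example (the cookie and URI patterns are illustrative assumptions and may differ from the article's exact snippet): a $skip_cache variable is computed from request attributes and fed to both directives.

```nginx
set $skip_cache 0;

# Skip the cache for logged-in WordPress users and recent commenters.
if ($http_cookie ~* "comment_author|wordpress_logged_in|wp-postpass") {
    set $skip_cache 1;
}

# Never cache the administration area.
if ($request_uri ~* "/wp-admin/") {
    set $skip_cache 1;
}

location / {
    proxy_pass http://127.0.0.1:3000;
    proxy_cache app;
    proxy_cache_bypass $skip_cache;
    proxy_no_cache $skip_cache;
}
```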
Because proxy_cache_key contains $args, the cache may be filled with many duplicate entries, unless the response somehow depends on the query parameters.
Use proxy_cache_min_uses to set the minimum number of requests required before the associated response gets saved to the cache:
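For example (the threshold of 3 is arbitrary; the default is 1):

```nginx
proxy_cache_min_uses 3;   # save to the cache only after three requests for the same key
```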
The configuration syntax is a little esoteric and has some pitfalls, so as an alternative, you can set the caching policy from the backend.
§Headers
You can override the default cache policy set in the Nginx configuration using the following response headers, which take precedence over proxy_cache_valid:
- X-Accel-Expires: indicates the response caching duration in seconds (0 disables caching altogether).
- Vary: adds the listed request headers to the cache key.
- Set-Cookie: disables caching if present in the response.
Be careful with the Vary header, as it can render caching useless. For instance, a key based on the entire Cookie header makes no sense because it will likely be different for each logged-in user, so you will end up with as many entries as there are users. See Best practices for using the Vary header.
If X-Accel-Expires is not present in the response, Nginx looks for the following headers:
- Expires: same as X-Accel-Expires but in a more complicated format.
- Cache-Control: the modern way to set the caching policy.
As these last two headers also affect caching on the client, prefer X-Accel-Expires to explicitly control reverse proxy caching:
- X-Accel-Expires: 0 to disable caching.
- X-Accel-Expires: 3600 to cache the response for an hour.
§In practice
It is difficult to give precise guidance on configuring caching because it highly depends on the application, not to mention how Cache-Control affects the client web browser. Nevertheless, here's a general approach (see the sketch after this list):
- For static resources, set expires to a long duration, and add fingerprints to their names for cache busting.
- Define a default cache policy with proxy_cache_valid for all shared resources, ensuring it is disabled for private ones such as user sessions.
- Always indicate private resources from the backend with the response header Cache-Control set to private, no-cache, or no-store, to prevent any configuration mismatch that could hinder security (you really don't want to cache private webpages and serve them to everyone).
- Use X-Accel-Expires to override the previous policies when it is not possible to do so in the configuration.
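A rough sketch of these rules (the /assets/ and /account/ prefixes and the durations are assumptions used to illustrate the idea):

```nginx
# Fingerprinted static assets: cache for a long time on the client.
location /assets/ {
    root /srv/http/app;
    expires 1y;
}

# Default policy for shared, dynamic pages.
location / {
    proxy_pass http://127.0.0.1:3000;
    proxy_cache app;
    proxy_cache_valid 1m;
}

# Private area: never cached by the proxy, whatever the backend sends.
location /account/ {
    proxy_pass http://127.0.0.1:3000;
    proxy_cache off;
}
```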
To test the configuration, you can instrument the cache using the variable $upstream_cache_status. For example, you can add the cache status (hit, miss) to the response in the extension header X-Cache-Status:
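A minimal sketch:

```nginx
add_header X-Cache-Status $upstream_cache_status;
```

With this in place, a request such as curl -I http://app.example.com/ should report X-Cache-Status: MISS the first time and HIT on subsequent requests while the entry is valid.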
If reverse proxy caching is difficult to apply to some resources, then you will likely have to implement caching in the backend instead (which I leave as an exercise to the reader).
§Increase availability
When a lot of requests arrive at the same time and the cache doesn't contain a valid entry, all the requests get forwarded to the backend server while the cache is being updated. The combination of stale responses and cache locking can prevent these bursts of requests to the backend.
§Stale responses
Enable proxy_cache_background_update to update the cache in the background, and proxy_cache_use_stale with the updating parameter to return a stale response while the entry is being updated:
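A minimal sketch of this combination:

```nginx
proxy_cache_use_stale updating;
proxy_cache_background_update on;
```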
The entry is updated in the background, and any request that arrives during the update receives a stale response. proxy_cache_use_stale can also return a stale response when the backend is unavailable due to errors or timeouts:
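For example (the exact set of conditions is a choice; these cover connection errors, timeouts, and 5xx responses):

```nginx
proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
```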
Stale responses limit the number of concurrent requests to the upstream server. However, if no cache entry can be found, all the requests are passed through until it gets populated.
§Cache lock
Nginx provides proxy_cache_lock to prevent concurrent requests from updating the cache simultaneously. Only the first request is passed upstream to update the cache, while the others wait for its completion:
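Enabling it is a one-liner:

```nginx
proxy_cache_lock on;
```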
The drawback is that requests waiting for the cache are blocked for at least 500 ms, and up to the duration set by proxy_cache_lock_timeout (5 seconds by default). These 500 ms come from the unconfigurable interval at which Nginx wakes up the blocked request tasks to check whether the entry is present or not. To overcome this problem, use the technique from the previous subsection to return a stale response while the entry is being updated in the background:
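A sketch of the combined directives:

```nginx
proxy_cache_lock on;
proxy_cache_use_stale updating;
proxy_cache_background_update on;
```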
Unfortunately, proxy_cache_lock is prone to misuse. If a resource is configured for caching (e.g., with proxy_cache_valid), but the backend always prevents that (e.g., with X-Accel-Expires: 0 or any other cache control header), then the entry will always be missing and Nginx will never be able to return a stale response during the update. Assuming the backend takes 100 ms to answer, the proxy will only be able to make 10 background requests per second to try to update the cache.
Meanwhile, the clients wait for proxy_cache_lock_timeout to expire. Then, their requests are all forwarded to the backend. This not only increases latency, but also causes a surge of requests to the upstream server. Therefore, make sure to keep Nginx synchronized with the backend when it excludes private resources from the cache. You can disable caching based on the virtual host, the location block, or an authentication cookie. As a workaround, you can also decrease proxy_cache_lock_timeout.
§Normalize request attributes
Response compression reduces bandwidth usage and loading time. Clients advertise the algorithms they support in a request header. Different values may correspond to the same response encoding, causing the cache to store multiple entries for the same response. Normalization is the process that maps these different values to a reduced set of entries, thereby improving caching effectiveness.
§Compression
Compression can drastically reduce the size of static resources such as JavaScript or CSS. Instead of doing it live, you can pre-compress assets with a higher ratio, and configure Nginx to automatically return the appropriate response:
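A sketch using the static-compression modules (gzip_static comes with the standard ngx_http_gzip_static_module; brotli_static requires the third-party ngx_brotli module; the /assets/ prefix is an assumption):

```nginx
location /assets/ {
    root /srv/http/app;
    gzip_static on;      # serve app.js.gz when the client accepts gzip
    brotli_static on;    # serve app.js.br when the client accepts br (ngx_brotli)
}
```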
To send a compressed response, the server must choose the best encoding that the client accepts, based on the value of the Accept-Encoding header. Clients often support multiple compression algorithms, indicated by a comma-separated list, e.g., Accept-Encoding: br, gzip, deflate.
HTML formatted content is also a good candidate for compression. You can enable live response compression with Nginx, but caching happens before compression, not after. That means responses from the cache are recompressed for each request. As a solution, you can implement compression in the backend instead.
§Normalization
Assuming Nginx caches responses based on Accept-Encoding, if a client makes a request with Accept-Encoding: br, gzip, deflate, the Brotli-compressed response is saved to a first entry identified by the value of this header. If another client only indicates br, gzip, it gets the same Brotli-compressed response, but it is recorded to a second cache entry.
To prevent caching the same response multiple times, you can perform header normalization. The role of Nginx is to map the Accept-Encoding header to a set of values corresponding to the best compression scheme supported by both the server and the client. For instance, if the header contains br, the response should be compressed with Brotli; if the header contains gzip, it should be compressed with Gzip; otherwise, it shouldn't be compressed at all:
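A simplified sketch of such a mapping with the map directive, storing the result in a custom $encoding variable (a stricter version would match whole tokens rather than substrings):

```nginx
# In the http context: pick the best encoding supported by both sides.
map $http_accept_encoding $encoding {
    default  "";
    ~*br     br;     # Brotli preferred when the client advertises it
    ~*gzip   gzip;
}
```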
Make sure to pass any normalized attributes to the backend:
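With the $encoding variable from the map above:

```nginx
proxy_set_header Accept-Encoding $encoding;
```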
§Cache key
By default, the cache doesn't look at the value of Accept-Encoding. Because the backend returns an encoded response but proxy_cache_key doesn't include the encoding, a client that doesn't support Brotli may receive a previously Brotli-encoded response from the cache. You could insert the header Vary: Accept-Encoding in the response, which has the same effect as adding the value of Accept-Encoding to the cache key. Unfortunately, Nginx uses the original and immutable value for Accept-Encoding, not the normalized value set with proxy_set_header.
As an alternative, you can add $encoding to proxy_cache_key. The default value is $scheme$proxy_host$uri$is_args$args. If you just append $encoding, you can have a key collision when the URL ends with br or gzip. Hence, $encoding should come before $uri or any other user-controlled fields:
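For example, keeping the default fields and inserting $encoding before $uri:

```nginx
proxy_cache_key $scheme$proxy_host$encoding$uri$is_args$args;
```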
This way, Nginx can cache responses with multiple encodings, without duplicate entries. Compared to live recompression, you can afford a higher compression level to decrease bandwidth, with no significant increase in server load.
§Proxy chain
For demonstration purposes, it is possible to do everything with Nginx, without having to implement response compression in the backend, without modifying proxy_cache_key, and with support for Vary based on the normalized header values. The trick is to chain multiple Nginx "servers".
The first endpoint is the one clients connect to. It normalizes the request attributes, like Accept-Encoding, and passes them to the next endpoint in the chain. This is necessary to get around the request header immutability.
The second internal endpoint caches the response. If Vary: Accept-Encoding is set, it associates the entry with the normalized Accept-Encoding given by the previous endpoint.
The third endpoint proxies the request to the backend server and performs live response compression, adding Vary: Accept-Encoding. A separate server is required because compression happens after caching if they are both enabled.
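A minimal sketch of such a chain, assuming the $encoding map is defined as above and using arbitrary loopback ports (8001 and 8002 are assumptions):

```nginx
# 1. Public endpoint: normalizes Accept-Encoding for the rest of the chain.
server {
    listen 80;
    server_name app.example.com;
    location / {
        proxy_set_header Accept-Encoding $encoding;
        proxy_pass http://127.0.0.1:8001;
    }
}

# 2. Internal caching endpoint: stores the compressed responses, honoring Vary.
server {
    listen 127.0.0.1:8001;
    location / {
        proxy_cache app;
        proxy_cache_valid 1m;
        proxy_pass http://127.0.0.1:8002;
    }
}

# 3. Internal compression endpoint: proxies to the backend and compresses live.
server {
    listen 127.0.0.1:8002;
    location / {
        gzip on;
        gzip_vary on;    # adds Vary: Accept-Encoding to the response
        proxy_pass http://127.0.0.1:3000;
    }
}
```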
Of course, copying the data around and opening extra connections is less efficient than a single server block.
§Conclusion
Caching can drastically increase server performance, even when set to a few seconds in high traffic scenarios.
Final recommendations to configure reverse proxy caching with Nginx:
- Cache static assets and public dynamic webpages.
- Apply header normalization to cache compressed responses.
- Always set Cache-Control for private resources.
- Beware of any mismatch between Nginx and the backend regarding private resources.
Full configuration:
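The following listing assembles the snippets from this article into one configuration sketch (the zone name, sizes, durations, and the /assets/ prefix are the illustrative values used throughout; adapt them to your application):

```nginx
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app:10m
                 max_size=1g inactive=1h;

# Normalize Accept-Encoding to the best supported algorithm.
map $http_accept_encoding $encoding {
    default  "";
    ~*br     br;
    ~*gzip   gzip;
}

server {
    listen 80;
    server_name app.example.com;

    # Fingerprinted static assets.
    location /assets/ {
        root /srv/http/app;
        expires 1y;
    }

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Accept-Encoding $encoding;

        proxy_cache app;
        proxy_cache_key $scheme$proxy_host$encoding$uri$is_args$args;
        proxy_cache_valid 1m;
        proxy_cache_valid any 10s;

        # Availability: avoid bursts of requests to the backend.
        proxy_cache_lock on;
        proxy_cache_use_stale error timeout updating;
        proxy_cache_background_update on;

        # Expose the cache status for testing.
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```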
§Further readings
- A Guide to Caching with NGINX and NGINX Plus for an overview of the proxy caching features.
- The Benefits of Microcaching with NGINX for benchmarks with a low caching duration.
- Using $upstream_cache_status in access.log to measure caching effectiveness.