Here is a small trick we recently implemented for a customer:
The main premise was:
No clients should have to wait while the backend works. If a request is a miss, give the client a slightly stale/old page instead, and fetch a new page in the background.
Since Varnish and VCL are highly configurable, we can do this with a VCL hack and a small helper process.
The flow is as follows. A client requests an object that has just expired. In vcl_miss we notice this and switch the request to a backend that is marked sick. We also log the URL that just missed with std.log() before restarting request handling. Back in vcl_recv the usual sick-backend behaviour kicks in, and the slightly stale graced object is delivered to the client.
Outside Varnish, a small Python script tails the varnishlog output and watches for a special VCL_Log entry. When it sees one, it sends a request for the same URL to the local Varnish. In vcl_recv we detect this client and set req.hash_always_miss to force a refetch, and the Python script waits while the backend recreates the page.
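To make the helper side concrete, here is a minimal sketch of what such a script could look like. It is not the script we deployed. The log marker (`refetch-url:`), the `X-Refetch` header that vcl_recv would look for, and the Varnish address are all hypothetical names picked for illustration. It also assumes the VCL logs only the URL path and that varnishlog output is piped in on stdin.

```python
import re
import sys
import urllib.request

# Hypothetical marker that the VCL would pass to std.log(), e.g. "refetch-url: /path".
MARKER = "refetch-url:"
# Assumed address of the local Varnish instance.
VARNISH = "http://127.0.0.1:80"
# Hypothetical header that vcl_recv would check before setting req.hash_always_miss.
REFETCH_HEADER = "X-Refetch"

# Match any line that carries the VCL_Log tag followed by our marker and a URL.
LOG_RE = re.compile(r"\bVCL_Log\b.*?" + re.escape(MARKER) + r"\s*(\S+)")


def extract_url(line):
    """Return the logged URL if this log line carries the marker, else None."""
    match = LOG_RE.search(line)
    return match.group(1) if match else None


def build_request(url):
    """Build the refetch request, tagged so vcl_recv can recognise the helper."""
    return urllib.request.Request(VARNISH + url, headers={REFETCH_HEADER: "1"})


def main():
    # Expects varnishlog output on stdin, e.g.: varnishlog | python refetch.py
    for line in sys.stdin:
        url = extract_url(line)
        if url is None:
            continue
        try:
            # Blocks here while the backend recreates the page.
            with urllib.request.urlopen(build_request(url), timeout=60) as resp:
                resp.read()
        except OSError as exc:
            print("refetch failed for %s: %s" % (url, exc), file=sys.stderr)


if __name__ == "__main__":
    main()
```

This sketch handles one URL at a time, so log lines that arrive during a slow refetch simply queue up in the pipe. A real helper would probably also want to skip URLs that are already being refetched, since several clients can hit the same expired object in quick succession.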
All requests that come in while the refetch is underway will be served graced copies at full speed.
Your 95th percentile response time graphs will love this feature, and maybe even some of your users as well. Cool, huh?
More information about grace in Varnish: