questions about content delivery networks

Hello! Here are some questions & answers. The goal isn't to get all the questions "right". Instead, the goal is to learn something! If you find a topic you're interested in learning more about, I'd encourage you to look it up and learn more.

if you put your site "behind" a CDN, will the site's IP address be the CDN's or your backend server's?

the CDN's IP!

all requests will go to the CDN, and then the CDN will (if it needs to) make a request to your backend server

do CDNs have just one big datacenter where they cache content?

nope!

the whole point of a CDN is that they cache content on many servers in different cities around the world, so that your users can get a response quickly no matter where they are. Some CDNs have servers (points of presence, or "PoPs") in hundreds of cities around the world.

does a CDN only cache the body of an HTTP response (like an image)?

nope!

it can cache the whole HTTP response, including the status code and headers. So, for example, if you return a 404 by accident and set that response to be cached, then your site might still 404 even if you've fixed your backend server.
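
here's a minimal sketch in Python of why a cached 404 sticks around. `Response`, `TinyCache`, and `origin` are made-up names for illustration, not any real CDN's API, and this toy cache assumes every response was marked as cacheable.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    headers: dict
    body: bytes


class TinyCache:
    """Toy stand-in for one CDN cache node (hypothetical, not a real CDN API)."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: path -> Response
        self.store = {}

    def get(self, path):
        if path not in self.store:
            # cache miss: ask the origin and keep the WHOLE response,
            # status code and headers included
            self.store[path] = self.fetch_from_origin(path)
        return self.store[path]


# hypothetical origin server that is broken at first and fixed later
state = {"fixed": False}


def origin(path):
    if state["fixed"]:
        return Response(200, {"content-type": "text/html"}, b"hello")
    return Response(404, {"content-type": "text/plain"}, b"not found")


cdn = TinyCache(origin)
first = cdn.get("/about")    # origin is broken, so the 404 gets cached
state["fixed"] = True        # the backend is fixed...
second = cdn.get("/about")   # ...but the cached 404 is still served
print(first.status, second.status)  # prints: 404 404
```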

if you accidentally cache something you didn't mean to, can you fix it?

yes!

CDNs usually have a way to purge the cache -- the exact way you do it depends on the CDN. Sometimes it takes a few minutes for it to finish -- the CDN might need to go tell hundreds of servers all over the world to clear their caches.

you can usually choose to either just remove specific files from the cache or remove everything

how does a CDN know it should put an HTTP response in its cache?

it gets a request for it from a client!

when a CDN gets a request for a resource, it'll request it from your server and then (if appropriate) it'll put the resource in its cache so that it doesn't have to request it from your server next time.

can a CDN cache any type of HTTP response?

yes!

if you ask the CDN to (like by setting the response header Cache-Control: public, max-age=3600), it can usually cache any HTTP response you want.

sometimes they have limits on how big the HTTP response can be though, so you might not be able to cache a large video.
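
a Cache-Control header is a comma-separated list of directives, and some of them take a value after an `=`. Here's a rough sketch in Python of how a cache might split one up. `parse_cache_control` is a made-up helper, and it skips edge cases like quoted values that contain commas.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of directives.

    Simplified on purpose: quoted arguments containing commas are not handled.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # directives without an argument (like "public") map to None
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


parsed = parse_cache_control("public, max-age=3600")
print(parsed)  # prints: {'public': None, 'max-age': '3600'}
```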

can a CDN keep serving your site even if your server is down?

maybe!

a CDN can keep serving cached pages even if your server isn't running.

But if you've told it to only cache content for a certain amount of time (like 2 hours), the content might expire after a while and not be available anymore. And if the content wasn't cached at all, the CDN can't help you!
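
the three cases above (still fresh, expired, never cached) can be sketched like this in Python. `Entry` and `serve` are hypothetical names, times are plain seconds, and the origin fetch is just a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    body: bytes
    stored_at: float  # seconds, on the same clock as `now`
    max_age: float    # how many seconds the entry stays fresh


def serve(cache, path, now, origin_up):
    """Return a body, or None if there is nothing the CDN can serve."""
    entry = cache.get(path)
    if entry is not None and now - entry.stored_at < entry.max_age:
        return entry.body  # still fresh: the origin isn't needed at all
    if origin_up:
        return b"fresh copy from origin"  # placeholder for a real fetch
    return None  # expired or never cached, and the origin is down


# one page cached at time 0 with a 2 hour lifetime
cache = {"/": Entry(b"cached home page", stored_at=0, max_age=2 * 3600)}

print(serve(cache, "/", now=3600, origin_up=False))      # b'cached home page'
print(serve(cache, "/", now=3 * 3600, origin_up=False))  # None (expired)
print(serve(cache, "/new", now=3600, origin_up=False))   # None (never cached)
```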

if you use TLS for a site behind a CDN, can the CDN read your unencrypted site traffic?

yes!

if you want a CDN to cache content, it needs to be able to decrypt and read it.

often people handle this by only putting static content (like CSS/JS/images) on the domain behind a CDN, and using a separate domain for requests with user data. For example, https://github.githubassets.com/ is behind a CDN but https://github.com isn't.

do CDNs always cache resources?

nope!

if you want, you can usually configure your CDN to not cache at all and just proxy every request to your backend server.

is it possible to tell if someone is using a CDN for their website?

often, yes!

you can usually figure it out from the headers: run curl -I https://css-tricks.com and see what CDN they use!
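
if you wanted to automate that guess, here's a small Python sketch that looks for header names that commonly show up in responses from a few CDNs. The `CDN_HINTS` table is an assumption for illustration: it's incomplete, headers can be renamed or stripped, and a match is a hint, not proof.

```python
# Header names that commonly hint at a particular CDN. This table is an
# assumption for illustration: incomplete, and a match is only a hint.
CDN_HINTS = {
    "cf-ray": "Cloudflare",
    "x-amz-cf-id": "Amazon CloudFront",
    "x-served-by": "Fastly",
}


def guess_cdn(headers):
    """Return a best-guess CDN name from response headers, or None."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    for hint, cdn in CDN_HINTS.items():
        if hint in names:
            return cdn
    return None


# hypothetical response headers, like what `curl -I` might print
sample = {"Server": "cloudflare", "CF-RAY": "made-up-id"}
print(guess_cdn(sample))  # prints: Cloudflare
```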

is it possible to tell if you've been served a cached response?

often, yes!

the CDN will often set a response header like x-cache: HIT which you can use to tell if it was a cache hit or a cache miss. This is a nice way to debug if for example you're trying to make sure something isn't being cached -- check the response headers and make sure it's a cache miss!
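
here's a small Python sketch of that check. It assumes the CDN reports its cache status in an `x-cache` header whose value contains the word "hit" or "miss". The exact header name and value format vary by CDN, so treat `cache_status` as a starting point.

```python
def cache_status(headers):
    """Classify a response as 'hit', 'miss', or 'unknown' from x-cache."""
    value = ""
    for name, header_value in headers.items():
        if name.lower() == "x-cache":  # header names are case-insensitive
            value = header_value.lower()
    if "hit" in value:
        return "hit"
    if "miss" in value:
        return "miss"
    return "unknown"  # no x-cache header, or a format we don't recognize


print(cache_status({"X-Cache": "HIT"}))                # prints: hit
print(cache_status({"x-cache": "MISS"}))               # prints: miss
print(cache_status({"content-type": "text/html"}))     # prints: unknown
```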

can a CDN make requests faster even if it doesn't cache?

yes!

2 reasons:

  • the CDN can often terminate TLS closer to the client, which means the TLS handshake can be a lot faster. This can save a second or so if your backend servers are far away from the client. This works because the CDN will often keep a TLS connection open to the backend server, so it doesn't have to set up a new one for each request.
  • it might have access to faster routes to your backend server than the client does

there are also more ways a CDN can improve performance!
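
to make the TLS point concrete, here's some back-of-the-envelope arithmetic in Python. Everything in it is an assumption: 1 round trip for the TCP handshake, 2 for a TLS 1.2 handshake, 1 for the request itself, and made-up round trip times (250ms client to backend, 20ms client to CDN, 240ms CDN to backend).

```python
def time_to_first_response_ms(rtt_ms, setup_round_trips=3):
    """Time for a brand new connection to deliver its first response.

    Assumes 3 setup round trips (1 TCP + 2 TLS 1.2) plus 1 for the request.
    """
    return (setup_round_trips + 1) * rtt_ms


# client connects straight to a faraway backend (assumed 250ms round trip)
direct = time_to_first_response_ms(250)

# client connects to a nearby CDN server (assumed 20ms round trip), which
# forwards the request over an already-open connection to the backend
# (assumed 240ms round trip, no handshake needed)
via_cdn = time_to_first_response_ms(20) + 240

print(direct, via_cdn)  # prints: 1000 320
```

with these made-up numbers the CDN saves about 680ms on the first request, even though nothing was cached.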

if your website only supports HTTP/1.1, can the CDN accept HTTP/2 requests?

often, yes!

many CDNs can transparently translate HTTP/2 requests into HTTP/1.1 requests to your backend, so you can get a lot of the performance benefits of HTTP/2 without having to do any work at all.

is it possible to make sure the CDN only caches a response for a limited amount of time (like 10 minutes)?

yes!

You can do this by setting the Cache-Control response header, like Cache-Control: max-age=600

is it possible to allow a resource to be cached by browsers, but not by a CDN?

yes!

You can do this by setting the response header Cache-Control: private, max-age=3600. private means that the content should only be stored in a browser's cache, not in a shared cache like a CDN's.

This is useful if a response is cacheable but different for every user.
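
here's a simplified Python sketch of the decision a cache makes. `may_store` is a hypothetical helper that only looks at `private` and `no-store`. Real caches follow more rules than this.

```python
def may_store(cache_control, shared_cache):
    """Decide whether a cache may store a response (simplified).

    shared_cache=True models a CDN, shared_cache=False models a browser.
    """
    directives = {
        part.strip().split("=")[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    if "no-store" in directives:
        return False  # nobody may store it
    if shared_cache and "private" in directives:
        return False  # only a single user's browser may store it
    return True


header = "private, max-age=3600"
print(may_store(header, shared_cache=True))   # prints: False (CDN)
print(may_store(header, shared_cache=False))  # prints: True (browser)
```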

if you request the same URL from a CDN but with different headers, will you get the same cached response?

it depends!

by default, it'll always be the same response. But if the server sets the Vary response header, then the CDN will store a separate cached response for each value of the request headers that Vary names. For example, Vary: Accept-Encoding will make the CDN store both a compressed and an uncompressed version.
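
one way to picture this: the cache key is the URL plus the values of whichever request headers Vary lists. Here's a minimal Python sketch. `cache_key` is a made-up helper, and real CDNs may build and normalize their keys differently.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary."""
    wanted = [name.strip().lower() for name in vary.split(",") if name.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # headers not listed in Vary are ignored; missing ones count as ""
    return (url,) + tuple((name, lowered.get(name, "")) for name in sorted(wanted))


vary = "Accept-Encoding"
k1 = cache_key("/app.js", {"Accept-Encoding": "gzip"}, vary)
k2 = cache_key("/app.js", {}, vary)
k3 = cache_key("/app.js", {"Accept-Encoding": "gzip", "User-Agent": "x"}, vary)

print(k1 == k2)  # prints: False (different Accept-Encoding, separate entries)
print(k1 == k3)  # prints: True (User-Agent isn't in Vary, so it's ignored)
```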