-
-
Notifications
You must be signed in to change notification settings - Fork 989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add cache control and invalidation for docs site (prep for Fastly) #870
Conversation
The main issue I see with a delay is when doing security releases. It would be confusing if we didn't delay the blog post until the release notes are updated. |
The new page would still be accessible as soon as the docs rebuilt. But, any older cached pages not have a link to the new one until the cache expires. We could theoretically pick a set of key pages to invalidate when a release is pushed out, or even selectively set a shorter cache time for the release notes pages (in particular the TOC). |
Often there will be existing "stub release notes" for security releases that are pushed out before a release so it would be nice to have a solution. |
Ah, yes that could cause some confusion. What do you think of keeping the timeout at 5 minutes for anything under /releases/ and setting it to an hour (or more) for other URLs? Are there others that would be good to update more frequently? We could theoretically do the same thing for the whole dev tree if that would be useful, for example. |
Might be fine. There are also rare cases where a security patch modifies other documentation. |
Thinking about it more, I feel like were inevitably going to get occasional reports when someone is looking at some release notes for a new feature in the development version and the link to the feature won't work since the rest of the docs haven't updated. Maybe the development version of the docs shouldn't be cached longer either. I don't know -- probably worth writing to the mailing list to get a consensus. |
# make some allowance for temporary network failures for our .post() request below | ||
retry = Retry( | ||
total=5, | ||
method_whitelist=frozenset(['POST']), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small thing, but does this have to be a frozenset? Would {'POST'}
not be OK here?
# If only the dev docs have changed, we can purge only the surrogate key we've set | ||
# up for the dev docs release in Fastly. This will usually happen with every new commit | ||
# to django master (on the next hour, when the cron job runs). | ||
url = '%s/purge/dev-docs-key' % fastly_service_url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we do a urljoin
here in case fastly_service_url
is modified to include a trailing slash at any point?
@timgraham @MarkusH @apollo13 Any objections to merging and deploying this? I've added a separate cache just for the docs pages, so I think this should be good to go. But, please take a look and let me know what you think. |
Currently, djangoproject.com uses a per-site cache with a 5-minute timeout, which while appropriate for
www
&dashboard
seems like overkill fordocs
:https://github.com/django/djangoproject.com/blob/master/djangoproject/settings/common.py#L27
The vast majority of the documentation (esp. for releases) will rarely ever change. As discussed some below, the biggest issue we'll likely have to contend with is serving out-of-sync pages, e.g., a new release could in theory show up for the "Prev" link on the detail page for a point release (e.g., https://docs.djangoproject.com/en/2.1/releases/2.1.6/) but be missing from the TOC (https://docs.djangoproject.com/en/2.1/releases/) until the latter expires (I've seen a similar issue with Read The Docs).
I've updated this PR to set a
Surrogate-Control
header that Fastly will use to determine how often it refetches from the origin. Browsers will still see the 5-minuteCache-Control
header and re-fetch from Fastly at least that often.Then, in
update_docs
, we purge Django's per-view cache (just for docs pages, with a new'docs-pages'
cache in Redis) and the Fastly cache any time there's a change in git. If we're sure the change was only to the dev docs, we purge only that section of the site in Fastly; otherwise, we purge everything for simplicity.Per @MarkusH's suggestion I've added a separate cache just for
docs.views.document()
to keep this change from clearing all other cached pages on the site at the same time. I opted to go with Redis because it's simple to install and includes support for multiple databases, in the event we need other more granular caches in the future. For this cache I chose DB 2 not because anything else is using DB 1, but to future-proof it against other apps that may not think to set a DB when connecting to Redis.