Re: Use CDN for djangoproject.com

2019-02-24 Thread Tobias McNulty
Hi Tom,

Thanks for your message. I think we'll end up with Fastly since it would be
free, but I'm waiting to see their sponsorship contract. CloudFront would
work too but I don't know of any such open source sponsorship options with
AWS.

I will say wildcard purging looks a bit simpler in CloudFront, but your
idea of purging the whole cache only for non-dev builds could work (provided
we have a lower cache timeout or a single wildcard purge condition set up
for the dev builds, I guess).
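
As a rough illustration of the policy discussed in this thread (a short
timeout for dev and release-notes pages, a longer one elsewhere, and a full
purge only when a non-dev build has changes), here is a minimal Python
sketch. The function names, the exact TTL values, and the URL layout
(/<lang>/<version>/...) are assumptions made for illustration, not the
project's actual code. The purge-all endpoint and the Fastly-Key header
reflect my understanding of Fastly's public purge API and should be checked
against the current Fastly documentation before use.

```python
import re
from urllib.request import Request

SHORT_TTL = 5 * 60    # 5 minutes: low end of the 5-10 minute range discussed
LONG_TTL = 60 * 60    # 1 hour: low end of the 1-24 hour range discussed

# Assumed URL layout: /<lang>/<version>/..., as in /en/2.1/ or /en/dev/.
FREQUENTLY_UPDATED = re.compile(r"^/[^/]+/(dev/|[^/]+/releases/)")


def cache_ttl_seconds(path):
    """Return a short TTL for dev and release-notes pages, a long one otherwise."""
    return SHORT_TTL if FREQUENTLY_UPDATED.match(path) else LONG_TTL


def should_purge_all(version, checkout_changed):
    """Purge everything only when a non-dev docs build actually changed."""
    return checkout_changed and version != "dev"


def build_purge_all_request(service_id, api_token):
    """Build (but do not send) a Fastly purge-all request.

    service_id and api_token are placeholders supplied by the caller.
    """
    return Request(
        f"https://api.fastly.com/service/{service_id}/purge_all",
        method="POST",
        headers={"Fastly-Key": api_token, "Accept": "application/json"},
    )
```

With this split, dev builds never trigger a purge and instead rely on the
short TTL, while a changed stable build would send the request above once
(for example via urllib.request.urlopen) after the docs are rebuilt.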

Feel free to test and post any feedback about Fastly prior to the potential
transition here: https://django-docs.global.ssl.fastly.net/en/2.1/ (this is
set up on a free dev account, so no custom SSL)

For the sake of comparison I'm working on getting a distribution set up for
CloudFront too, but it won't be so simple to test (without a DNS or server
configuration change) since I don't think CloudFront supports passing a
custom Host header to the origin like Fastly does (i.e., you'll probably
need to edit /etc/hosts).

Cheers,
Tobias


On Sat, Feb 23, 2019, 7:15 PM Tom Forbes wrote:

> Sorry, I did not completely grok your message. I would be in favour of
> just invalidating the whole cache if needed; it seems the simplest
> solution. Invalidating most of the cache on every non-dev deploy would also
> be OK, I think.
>
> On Sun, 24 Feb 2019, 00:10 Tom Forbes wrote:
>
>> Which CDN are we going to use? Fastly has awesome sub-100ms global
>> invalidation, which we can trigger on every deploy, and Cloudflare has
>> something similar.
>>
>> On Sun, 24 Feb 2019, 00:00 Tobias McNulty wrote:
>>
>>> Hi all,
>>>
>>> An implementation question has come up regarding cache lifetime (see this
>>> PR). Right now, the whole site (including docs) has the site-wide Django
>>> cache enabled, with a timeout of 5 minutes. A couple of docs views
>>> (search_suggestions and search_description) have longer timeouts set (to
>>> 1 hour and 1 week, respectively).
>>>
>>> Once released, the vast majority of Django docs won't change much,
>>> except for the release notes section and any (likely minor) related updates
>>> to the docs themselves. To get the most benefit out of a CDN, it would
>>> obviously be desirable to set the timeout to something greater than 5
>>> minutes.
>>>
>>> At the same time, there are moments when a quick update to the docs *is*
>>> desired, and waiting an hour or more for any cached pages to expire may
>>> cause significant confusion, for example, in conjunction with a security
>>> release for which stubbed (non-final) release notes may have already been
>>> pushed out and cached.
>>>
>>> I see two main options at this point (which could even be combined):
>>>
>>> 1) Invalidate the whole cache (or at least some key release notes URLs)
>>> any time there's a docs build that has changes. It would be pretty easy to
>>> piggyback off of the existing business logic for avoiding a rebuild
>>> if the git checkout hasn't changed (in the update_docs management command).
>>> 2) Pick subsections of the docs (e.g., for anything matching
>>> '///releases/*' and perhaps the development docs) that would
>>> keep a shorter cache timeout of 5-10 minutes. All URLs not specifically
>>> requiring this special treatment would get a longer timeout, perhaps
>>> somewhere between 1 and 24 hours.
>>>
>>> So, some questions for the list:
>>>
>>> * Are there sections of the docs besides '///releases/'
>>> and '//dev/' that might update frequently and merit some combination
>>> of invalidation and/or a shorter cache time? And what's a good cache
>>> timeout for such pages?
>>> * How long are we comfortable waiting for *other* (not frequently
>>> updated) pages to time out, in the event they do change?
>>>
>>> Tobias
>>>
>>> On Fri, Feb 15, 2019 at 7:13 AM Tobias McNulty wrote:
>>>
>>>> Thanks for sharing the results.
>>>>
>>>> I did manage to get a domain set up with working SSL, in case you want
>>>> to use it: https://django-docs.global.ssl.fastly.net/en/2.1/
>>>>
>>>> Tobias
>>>>
>>>> On Thu, Feb 14, 2019, 11:49 PM Cheng C wrote:
>>>>
>>>>> Thanks for the test site, Tobias.
>>>>>
>>>>> Tested from Melbourne, Australia:
>>>>>
>>>>> https://docs.djangoproject.com/en/2.1/
>>>>> Average Ping: 268ms
>>>>> Browser: 22 requests, 238KB transferred, Finish: 2.72s,
>>>>> DOMContentLoaded: 1.37s, Load: 1.68s
>>>>>
>>>>> https://docs.djangoproject.com.global.prod.fastly.net/en/2.1/
>>>>> Average Ping: 28ms
>>>>> Browser:

Re: Use CDN for djangoproject.com

2019-02-24 Thread Tom Forbes
Awesome work! For my location (Lisbon, Portugal) it takes about 130ms to
retrieve the HTML for a docs page (
https://django-docs.global.ssl.fastly.net/en/2.1/intro/reusable-apps/ to be
specific). The same page on docs.djangoproject.com responds in 800–900ms.
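
A rough way to reproduce this kind of comparison is to time repeated
fetches of the same page from both hosts and compare the medians. The
sketch below is illustrative only: fetch_body and median_fetch_ms are
hypothetical helper names, and the result is the full download time seen
from a single client, so it will vary with location and network conditions.

```python
import statistics
import time
from urllib.request import urlopen


def fetch_body(url):
    """Download the full response body for url."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def median_fetch_ms(url, fetch=fetch_body, repeats=5):
    """Return the median wall-clock time, in milliseconds, to fetch url.

    fetch is injectable so the timing loop can be exercised without
    touching the network.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch(url)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)
```

Calling median_fetch_ms once with the Fastly test URL above and once with
the matching docs.djangoproject.com URL gives two numbers that can be
compared directly.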





Re: Use CDN for djangoproject.com

2019-02-24 Thread Tobias McNulty
Tom,

That's great! Thanks for the feedback.

I've updated the PR with something along the lines of what you suggested
(along with the corresponding configuration in Fastly).

Take a look and let me know what you think.

Cheers,
Tobias


