Re: GSoC 2012: Security Enhancements

Russell Keith-Magee Thu, 05 Apr 2012 23:43:21 -0700

Hi Rohan,

Apologies for the lack of response. Anyone who has put effort into writing up a 
proposal certainly deserves a response of some kind, so we've dropped the ball 
here.


In our defence, here's a couple of the reasons why your proposal probably 
hasn't got a wild response:
 
 * You've picked a project on your own, rather than one that is on the list of 
suggested projects. Part of the GSoC process is mentoring, and you need to have 
a mentor that can spare the time, and has the technical skills to review your 
work. Most of the projects on the suggested list already have champions inside 
the community, so if you pick one of them, you're likely to get a response. If 
you pick your own project, you also have to find someone to get sufficiently 
enthused about it. 

 * You've picked a very gnarly problem. Security issues are the very model of 
an 'anti-bikeshed'. If you read the original discussion about Bikesheds [1], 
it's all about how everyone gives their opinion on "simple" topics, but 
everyone leaves hard problems alone. In Django's context -- *everyone* has an 
opinion about contrib.auth.User because it seems like a simple problem. 
However, security is all about subtle issues and expert knowledge. Therefore, 
your pool of experts is much smaller.  

[1] http://bikeshed.com/

Regarding your project proposal itself: I can't really address the technical 
merits, because I don't have any expertise on CORS, or the subtleties of the 
CSRF changes you're proposing. What you've proposed certainly sounds interesting 
on the surface, but I'd really want to see someone like Paul McMillan comment 
on the technical specifics. Ideally, Paul would also mentor the project, since 
he's Django's resident security expert, and he'd need to sign off on anything 
that was bound for trunk.

What I can do is point at the things that look like problems from a GSoC 
perspective. In particular, of the three sections to your project plan, two of 
them (unified tokenization and django-secure) involve merging existing projects 
into trunk. This is problematic, because one of the conditions of GSoC is that 
the student writes the bulk of the code. While integrating these two code pools 
may well be very valuable contributions to Django, they're not good from the 
perspective of a GSoC project.

So - apologies for not responding sooner. Unfortunately, I suspect that while 
your project probably has merit, the community isn't in a position to support 
your ambition at the moment. If Paul wants to swoop in at the last minute and 
prove me wrong, I'd be a very happy man -- I hate seeing someone enthusiastic 
get turned away -- but absent that, it's only fair that we be honest with you 
about your chances.

Yours,
Russ Magee %-)

On 06/04/2012, at 2:09 PM, Rohan Jain wrote:

> Hi again,
> 
> I really couldn't understand the response this post has got. It
> deserved at least a little feedback, positive or negative. I guess I
> won't be submitting this over Melange.
> 
> Still, I have put some effort and research into the proposal. So if
> possible I would like to know if it had anything of value. Maybe
> someone could work on that, even me if I get the time.
> 
> --
> Rohan
> 
> 
> On 23:40 +0530 / 31 Mar, Rohan Jain wrote:
>> Hi,
>> 
>> I am Rohan Jain, a 4th (final) year B.Tech undergraduate student
>> from Indian Institute of Technology, Kharagpur. I have been using
>> django for over a year and generally look into the code base to find
>> out about various implementations. I have made attempts to make some
>> minor contributions and if selected this would be my first major one.
>> 
>> More about Me: <http://www.rohanjain.in/about/>
>> IRC, Github: crodjer
>> 
>> I am interested in contributing some security enhancements to django
>> as my Summer of Code project. Below is the 1st draft of my proposal
>> regarding this. A pretty version of this is available at:
>> https://gist.github.com/2203174
>> 
>> 
>> #Abstract
>> 
>> Django is a reasonably secure framework. It provides an API and
>> development patterns which transparently take care of the common web
>> security issues. But still there are security features which need
>> attention. I propose to work on integrating the existing work on a
>> centralized token system and on improving CSRF checking without any
>> compromises. If time permits I will also attempt the integration of
>> django-secure.
>> 
>> #Description
>> ##Centralized tokenization
>> There are multiple places in django which use one kind of token or
>> another:
>> 
>> - contrib.auth (random password, password reset)
>> - formtools
>> - session (backends)
>> - cache
>> - csrf
>> - etags
>> 
>> Token generation is pretty common around the framework. At present
>> each application has its own token system, which has to be
>> maintained separately. Instead, there should be a centralized token
>> system which provides an abstract API for everyone to use. In fact, I
>> have seen that some apps use `User.objects.make_random_password`
>> from contrib.auth for random generation, since they can be sure it
>> will be maintained in the future. To me this looks kind of weird.
>> At the last DjangoCon, a lot of work on this was done in [Yarko's
>> fork][yarko-fork].
>> 
>> I had a discussion with Yarko Tymciurak regarding this. The work is
>> nearly ready for a merge, with only some tasks left. In the initial
>> period of my SoC I can work on these to ensure that the significant
>> work already done gets into django and is updated for 1.5.
>> 
>> - Porting more stuff to the new system (README.sec in
>>   [yarko's fork][yarko-fork])
>> - Testing - See if the current coverage of the tests is enough, write
>>   them if not.
>> - Compatibility issues
>> - API Documentation
>> 
>> I will study the changes done at djangocon and then attempt the tasks
>> mentioned above.
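
The idea of a single token helper shared by every caller can be sketched with the standard library alone. This is only an illustration under stated assumptions: it assumes a modern Python 3 (for the `secrets` module), and the names `make_token`, `tokens_equal` and `DEFAULT_CHARS` are hypothetical, not the API from Yarko's fork or from Django.

```python
import hmac
import secrets
import string

# Hypothetical default alphabet; callers may pass their own.
DEFAULT_CHARS = string.ascii_letters + string.digits


def make_token(length=32, allowed_chars=DEFAULT_CHARS):
    """Return a random string drawn from allowed_chars using the OS CSPRNG."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not allowed_chars:
        raise ValueError("allowed_chars must not be empty")
    return "".join(secrets.choice(allowed_chars) for _ in range(length))


def tokens_equal(a, b):
    """Compare two tokens without leaking where they first differ."""
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))
```

With one helper like this, password reset, sessions and CSRF would all draw randomness from the same audited place instead of borrowing `make_random_password`.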
>> 
>> ##CSRF Improvements
>> 
>> Cross-Origin Resource Sharing (CORS):  
>> W3C has a working draft regarding [CORS][w3c-cors-draft], which opens
>> up the possibility of allowing client-side cross-origin requests.
>> This immediately brings to mind the ability to develop an API which
>> can be exposed directly to the web browser. This would let us get rid
>> of proxies and other hacks used to achieve this.
>> Currently all the major browsers support this: Chrome (all versions),
>> Firefox (> 3.0), IE (> 7.0), Safari (> 3.2), Opera (> 12.0).
>> I introduce it here as some later parts of the post refer to it.
>> 
>> ###Origin checking
>> 
>> With CORS around, the need for a CSRF token can be dropped, at least
>> in some browsers. [Ticket #16859][orig-check-ticket] is an attempt at
>> that. But it was rejected because it neglected the case where
>> `CSRF_COOKIE_DOMAIN` is set (refer to the closing comment on the
>> ticket for details). So to handle this we need to simulate the
>> checking of the CSRF cookie domain as web browsers do it. Maybe:
>> 
>> ```python
>> request.META.get('HTTP_ORIGIN', '').endswith(settings.CSRF_COOKIE_DOMAIN)
>> ```
>> 
>> As the closing comment points out, we can't do this with secure
>> requests. They essentially need to be checked against the referrer or
>> origin, at least for now. We can not be sure that some untrusted or
>> insecure subdomain has not already set the cookie or cookie domain.
>>
>> To deal with this, we have to consider https separately as it is
>> being done now. So it will be something like:
>> 
>> 
>> ```python
>> def process_view(self, request, callback, callback_args, callback_kwargs):
>> 
>>    # Same initial setup
>> 
>>    if request.method not in ('GET', 'HEAD', 'OPTIONS', 'TRACE'):
>> 
>>        host = request.get_host()
>>        origin = request.META.get('HTTP_ORIGIN', "")
>>        cookie_domain = settings.CSRF_COOKIE_DOMAIN
>> 
>>        if request.is_secure():
>>            good_referer = 'https://%s/' % host
>>            referer = origin or request.META.get('HTTP_REFERER')
>>            # Do the same origin checks here
>> 
>>        # We are insecure, so care less
>>        # A better way for this check can be used if needed
>>        # (cookie_domain defaults to None, so guard before endswith)
>>        elif origin and cookie_domain and origin.endswith(cookie_domain):
>>            # Safe, accept request
>>            return self._accept(request)
>> 
>>        # Some unsupported browser
>>        # Do the conventional checks here
>> ```
>> 
>> If the above were to be implemented, the setting `CSRF_COOKIE_DOMAIN`
>> should be deprecated for something like `CSRF_ALLOWED_DOMAIN` which
>> makes more sense.
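
One reason a "better way for this check" may be needed: a plain string suffix test accepts `http://evilexample.com` when the configured domain is `example.com` without a leading dot, and rejects the bare domain when the dot is present. A stricter comparison on the parsed host could look like the sketch below. It uses only the standard library, and `origin_matches_domain` is a hypothetical helper name, not an existing Django function.

```python
from urllib.parse import urlsplit


def origin_matches_domain(origin, cookie_domain):
    """Return True if the Origin's host is cookie_domain or a subdomain of it."""
    if not origin or not cookie_domain:
        return False
    try:
        # hostname is lower-cased and stripped of any port; it is None
        # for values such as "null" that carry no network location.
        host = urlsplit(origin).hostname
    except ValueError:
        return False
    if host is None:
        return False
    domain = cookie_domain.lstrip(".").lower()
    # Require an exact match or a match on a label boundary.
    return host == domain or host.endswith("." + domain)
```

Matching on a label boundary is what keeps `evilexample.com` out while still letting `api.example.com` through.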
>> 
>> I would also suggest making the CSRF cookie HTTP-only. There doesn't
>> currently seem to be a reason why the cookie would need to be
>> accessed in the browser.
>> 
>> ###Less restrictive secure requests
>> 
>> The current CSRF system is pretty much secure as it is. But CSRF
>> protection places too many restrictions on https. It says no to all
>> the requests, without honouring any tokens. It kind of has to, thanks
>> to the way browsers allow cookie access. A cookie accessible through
>> subdomains means that any subdomain, secure or insecure, can set the
>> CSRF token, which could be really serious for the site security. To
>> get around this, currently one has to completely exempt views from
>> CSRF and may or may not handle CSRF attacks. This can be dangerous.
>> Also, if a person has a set of sites which talk to each other through
>> clients and decides to run them over https, it would need some
>> modifications.
>> 
>> Django should behave under https similarly as it does under http
>> without compromising any security. So, we need to make sure that the
>> CSRF token is always set by a trusted site. Signing the data with the
>> same key, probably `settings.SECRET_KEY`, across the sites looks apt
>> for this, using `django.core.signing`. We can have `get_token` and
>> `set_token` methods which abstract the signing process.
>> This can be done in two ways:
>> 
>> - Store CSRF data in the session data in case `contrib.sessions` is
>>   installed. Then the data will automatically be signed with the
>>   secret key or will not be stored in the client as cookies at all.
>> 
>> - In case it is absent from the installed apps, fall back to custom
>>   signing
>> 
>> ```python
>> from django.conf import settings
>> from django.core.signing import TimestampSigner
>> from django.utils.crypto import constant_time_compare
>> 
>> # The first positional argument of TimestampSigner is the key, so the
>> # "csrf-token" namespace is passed as the salt instead.
>> signer = TimestampSigner(salt="csrf-token")
>> CSRF_COOKIE_MAX_AGE = 60 * 60 * 24 * 7 * 52
>> 
>> 
>> def get_unsigned_token(request):
>>    # BadSignature exception needs to be handled somewhere
>>    return signer.unsign(request.META.get("CSRF_COOKIE", ""),
>>                         max_age=CSRF_COOKIE_MAX_AGE)
>> 
>> def set_signed_token(response, token):
>>    response.set_cookie(settings.CSRF_COOKIE_NAME,
>>                        signer.sign(token),
>>                        max_age=CSRF_COOKIE_MAX_AGE,
>>                        domain=settings.CSRF_COOKIE_DOMAIN,
>>                        path=settings.CSRF_COOKIE_PATH,
>>                        secure=settings.CSRF_COOKIE_SECURE
>>                        )
>> 
>> 
>> def get_token(request):
>>    if 'django.contrib.sessions' in settings.INSTALLED_APPS:
>>        # Sessions are dict-like, so use item access
>>        return request.session['csrf_token']
>>    else:
>>        return get_unsigned_token(request)
>> 
>> def set_token(request, response, token):
>>    if 'django.contrib.sessions' in settings.INSTALLED_APPS:
>>        request.session['csrf_token'] = token
>>    else:
>>        set_signed_token(response, token)
>> 
>> # Comparing to the token in the request
>> constant_time_compare(request_csrf_token, get_token(request))
>> 
>> ```
>> 
>> Now, doing this is not as simple as the above code block makes it
>> look. There is a lot which can and probably will go wrong with this
>> approach:
>> 
>> - Even when the token is signed, other domains can completely replace
>>   the CSRF token cookie, although that won't get them through the
>>   CSRF check.
>> 
>> - This sort of couples CSRF with sessions, a contrib app. Currently
>>   nothing except some of the other contrib apps is tied to sessions.
>>   It will break if sessions were to be removed in future or the API
>>   changed. Also, this means that if one website is using sessions
>>   for CSRF, all of the others must be too.
>> 
>> - If this were successfully implemented, does it expose any other
>>   critical security flaws? Will it cause compatibility issues?
>> 
>> As Paul McMillan said, "This is a hard problem", so I'll delegate
>> figuring this out to future me. I will look into
>> [The Tangled Web][tangled-web] and
>> [Google's Browser Security Handbook][gobrowsersec] for ideas,
>> again suggested by Paul on IRC.
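
The signed-cookie fallback in this section relies on a sign/unsign round trip with an expiry. What that involves can be sketched with the standard library only. This is an illustration of the general HMAC-plus-timestamp technique, not Django's `TimestampSigner` implementation or wire format; `SECRET_KEY` here is a stand-in for `settings.SECRET_KEY`, and `sign`, `unsign` and the local `BadSignature` class are hypothetical names.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"replace-me"  # stand-in for settings.SECRET_KEY


class BadSignature(Exception):
    """Raised when a value is malformed, tampered with, or too old."""


def _mac(payload):
    digest = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def sign(token, now=None):
    """Return 'token:timestamp:mac' for the given token."""
    timestamp = int(time.time() if now is None else now)
    payload = "%s:%d" % (token, timestamp)
    return "%s:%s" % (payload, _mac(payload))


def unsign(signed, max_age, now=None):
    """Return the original token, or raise BadSignature."""
    try:
        payload, mac = signed.rsplit(":", 1)
        token, timestamp = payload.rsplit(":", 1)
        issued = int(timestamp)
    except ValueError:
        raise BadSignature("malformed value")
    # Constant-time comparison on bytes, so non-ASCII input cannot raise.
    if not hmac.compare_digest(mac.encode("utf-8"), _mac(payload).encode("utf-8")):
        raise BadSignature("signature mismatch")
    if (time.time() if now is None else now) - issued > max_age:
        raise BadSignature("signature expired")
    return token
```

A subdomain that does not know the key can still overwrite the cookie, but whatever it writes fails `unsign`, which is the property the first concern in the list above depends on.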
>> 
>> ###Better CORS Support
>> Since we are already introducing Origin checking, we can go one step
>> further and try to provide better support for CORS in browsers that
>> support it. A tuple/list setting which specifies the allowed domains
>> will be provided. Using this, the various access control response
>> headers will be set when the request origin is among the allowed
>> domains. For the CSRF check, just see if the http origin is present
>> in the allowed domains.
>> 
>> ```python
>> 
>> def set_cors_headers(response, origin):
>>    response['Access-Control-Allow-Origin'] = origin
>> 
>> def process_response(self, request, response):
>> 
>>    origin = request.META.get('HTTP_ORIGIN', "")
>> 
>>    if origin in settings.CSRF_ALLOWED_DOMAINS:
>>        set_cors_headers(response, origin)
>> 
>>    # Middleware must hand the response back
>>    return response
>> 
>> def process_request(self, request):
>> 
>>    # Use origin in settings.CSRF_ALLOWED_DOMAINS here instead of
>>    # origin.endswith
>>    pass
>> 
>> ```
>> 
>> Probably, something similar to the above will be needed to incorporate
>> the CORS support.
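
The allow-list idea can be tried outside Django with a plain dict standing in for the response headers. This is a minimal sketch under stated assumptions: `ALLOWED_ORIGINS` and `apply_cors_headers` are hypothetical names, the origins are made-up examples, and only the `Access-Control-Allow-Origin` and `Vary` headers are handled.

```python
# Hypothetical allow-list; entries are full origins (scheme + host).
ALLOWED_ORIGINS = ("https://app.example.com", "https://admin.example.com")


def apply_cors_headers(headers, origin, allowed_origins=ALLOWED_ORIGINS):
    """Add CORS headers to a dict-like `headers` if `origin` is allow-listed.

    Returns True when the headers were added, False otherwise.
    """
    if origin not in allowed_origins:
        return False
    headers["Access-Control-Allow-Origin"] = origin
    # The response now differs per Origin, so caches must key on it.
    existing = headers.get("Vary")
    headers["Vary"] = "Origin" if not existing else existing + ", Origin"
    return True
```

Comparing the whole origin string for exact membership avoids the suffix-matching pitfalls of `endswith`, at the cost of listing every permitted origin explicitly.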
>> 
>> ##Integrating django-secure
>> A really useful app for catching security configuration related
>> mistakes is [carljm's django-secure][django-secure]. It is especially
>> useful for finding issues that might have been introduced while
>> making quick changes to settings during development. This project is
>> popular and useful enough that it can be shipped with django. I
>> haven't been able to give this enough time yet. I can think of two
>> ways of integrating this:
>> 
>> - Dropping it in as a contrib app  
>>   This seems pretty straightforward and would require a minimal
>>   amount of changes.
>> 
>> - Distribute around the framework:  
>>   Like CSRF, this can also be distributed framework wide and hence it
>>   won't be optional to have. Apps can still define custom checks in
>>   the same way as when `django-secure` was installed as a pluggable
>>   application.
>> 
>> The app might also need some changes whilst being integrated:
>> 
>> - More security checks, if required
>> - Adjust according to the changes introduced above.
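
The notion of pluggable security checks can be illustrated with a small registry of functions that each inspect the settings and return warnings. This is a hypothetical sketch, not django-secure's actual API: `register`, `run_checks` and the check functions are made-up names, and a plain dict stands in for the settings object. `SESSION_COOKIE_SECURE` and `SESSION_COOKIE_HTTPONLY` are real Django setting names.

```python
CHECKS = []


def register(check):
    """Decorator that adds a check function to the registry."""
    CHECKS.append(check)
    return check


@register
def check_session_cookie_secure(settings):
    if not settings.get("SESSION_COOKIE_SECURE", False):
        return ["SESSION_COOKIE_SECURE is not True"]
    return []


@register
def check_session_cookie_httponly(settings):
    if not settings.get("SESSION_COOKIE_HTTPONLY", False):
        return ["SESSION_COOKIE_HTTPONLY is not True"]
    return []


def run_checks(settings):
    """Run every registered check and collect the warnings."""
    warnings = []
    for check in CHECKS:
        warnings.extend(check(settings))
    return warnings
```

An app would add its own check by decorating another function with `register`, which is the kind of extension point both integration options above would need to keep.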
>> 
>> #Plan
>> I think that the centralized tokenization and CSRF enhancement tasks
>> will be enough to span the SoC period. If after a thorough
>> implementation and testing of these I still have time, django-secure
>> integration can be looked into.
>> 
>> ##Timeline
>> I have listed the tasks above in a chronological order. I'll add a
>> more granular timeline in one of the next drafts.
>> 
>> 
>> [yarko-fork]: https://github.com/yarko/django
>> [w3c-cors-draft]: http://www.w3.org/TR/access-control/
>> [orig-check-ticket]: https://code.djangoproject.com/ticket/16859
>> [tangled-web]: http://www.amazon.com/The-Tangled-Web-Securing-Applications/dp/1593273886/
>> [gobrowsersec]: http://code.google.com/p/browsersec/wiki/Main
>> [django-secure]: https://github.com/carljm/django-secure
>> 
>> 
>> --
>> Thanks
>> Rohan Jain
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers" group.
> To post to this group, send email to django-developers@googlegroups.com.
> To unsubscribe from this group, send email to 
> django-developers+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/django-developers?hl=en.
> 

