Re: GSoC Check-in: Security Enhancements

2012-06-12 Thread Rohan Jain
I have done some work on CSRF revolving around origin header checking.
The origin header is fairly new and is not yet implemented in a
uniform fashion in the major browsers, so it cannot be solely relied
upon for CSRF checks. Instead we check if the header exists and use it
only for rejection of requests.

If an `HTTP_ORIGIN` header is sent by the browser, it checks if it
matches to be from a valid origin. In case it is not the request is
rejected right away. Otherwise it proceeds for further CSRF checks 

Because of the way browser cookies behave, I had to do some unpleasant
tweaks to origin header checking such that `COOKIE_DOMAIN_SETTING` is
followed. From the commit:

if  not ((good_origin.startswith('.')
  and good_origin.count('.') is origin.count('.')
  and origin.endswith(good_origin))
 or origin[origin.find('://')+3:] == good_origin):

All this is because using cookies open the possibility for allowing
cross subdomain requests (a side effect?). I had a chat with Paul for
breaking (fixing) this behaviour. The plan is to:

 - Make the strict referer checking, currently implemented only for
   HTTPS, default for http requests too. The http scheme which is more
   popular definitely deserves better checking.
 - Make the origin checking work similar to the way referer checking
   is done. Also, will it be safe to bypass referer checking in case
   of origin header being present?
 - Implement a PERMITTTED_DOMAINS setting which lets administrators
   explicitly mention the domains which are permitted to make the
   otherwise restricted requests. This gives more control to the
   administrators.
 - The PERMITTED_DOMAINS setting, a list, accepts patterns. Regex
   cannot be used here because characters like `.`, `-` will need
   escaping. So I decided to settle on the simpler unix style globs.
   The pattern matching is done through the `fnmatch` library
   function, documented [here][fnmatch-docs]. One can use this setting
   like this:

PERMITTED_DOMAINS = [
'www.example.com',  # Exactly the domain `www.example.com`
'*.supertrusteddomain.com', # All subdomains to 
`supertrusteddomain.com`
'?.services.example.com',   # Single letter subdomains to 
`.services.example.com`
]

I have done an initial implementation of these, changes in [pull
request #95][pull-95]. I'll now proceed to clean these up, writing
better tests and documentation for these. Also with these, we can
completely get rid of the cookie based CSRF check system.

--
Thanks
Rohan Jain

[fnmatch-docs]: http://docs.python.org/library/fnmatch.html
[pull-95]: https://github.com/django/django/pull/95

On 20:10 +0530 / 21 May, Rohan Jain wrote:
> Hi,
> 
> Since my last check in I worked on improvements to
> contrib.sessions:
> 
>  - Introduction of signing framework
>  - Session expiry checks (for Ticket [#18194][0]
>  - And some other trivial patches.
> 
> The tests (existing and the one which I added) are passing.
> 
> These changes are in my [Pull Request #78][1] over github.
> Paul, could you please review it to see if the patches are usable.
> 
> Next, I'll make the changes which may be required in documentation
> because of the above.
> Today is official start date of the GSoC project, so I'll now start
> concentrating more on the project now.
> 
> Rohan Jain
> 
> [0]: https://code.djangoproject.com/ticket/18194
> [1]: https://github.com/django/django/pull/78
> 
> On Mon, May 7, 2012 at 12:21 PM, Rohan Jain  wrote:
> > Hi,
> >
> > Last week I looked into the Ticket [#18194][0]:
> >
> >  - Trivial attempts to handle the issue.
> >  - Wrote a minor initial patch.
> >  - The test fails for Cache and Cookie backend.
> >
> > Also, I looked at the talks from Paul regarding advanced security
> > topics at py/django cons. Realised that why I should not attempt
> > anything related to encryption in my project.
> >
> > There is high academic pressure currently, so I am not able to give
> > enough time to these. I think the situation will be better this
> > weekend onwards.
> >
> > I'll try to work on:
> >
> >  - Write tests which emulate the problem in #18194 well, and then work
> > on the final fix.
> >  - Start looking into resources useful for my project, like [The
> > Tangled Web][1].
> >
> > Rohan Jain
> >
> >
> > [0]: https://code.djangoproject.com/ticket/18194
> > [1]: 
> > http://www.amazon.com/The-Tangled-Web-Securing-Applications/dp/1593273886

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: json vs simplejson

2012-06-12 Thread Vinay Sajip
On Jun 11, 10:51 pm, Luke Plant  wrote:


> We've switched internally from json to simplejson. Our 1.5 release notes
> say:

Do you mean the other way around?

> You can safely change any use of django.utils.simplejson to json
>
> I just found a very big difference between json and simplejson
>
> >>> simplejson.loads('{"x":"y"}')
>
> {'x': 'y'}
>
> >>> json.loads('{"x":"y"}')
>
> {u'x': u'y'}
>
> i.e. simplejson returns bytestrings if the string is ASCII (it returns
> unicode objects otherwise), while json returns unicode objects always.
>
> This was, unfortunately, a very unfortunate design decision on the part
> of simplejson - json is definitely correct here - and a very big change
> in semantics. It led to one very difficult to debug error for me already.

Right. And on Python 3, the json module (correctly) doesn't accept
byte-strings at all.

> So, this is a shout out to other people to watch out for this, and a
> call for ideas on what we can do to mitigate the impact of this. It's
> likely to crop up in all kinds of horrible places, deep in libraries
> that you can't do much about. In my case I was loading config, including
> passwords, from a config file in JSON, and the password was now
> exploding inside smtplib because it was a unicode object.

This is one place where there are limitations in the 2.x stdlib -
other places include cStringIO and cookies. For example, if you pass a
Unicode object to a cStringIO.StringIO, it doesn't complain, but does
the wrong thing:

>>> from cStringIO import StringIO; StringIO(u'abc').getvalue()
'a\x00b\x00c\x00'
>>>

Fun and games ...

I'm not sure there's any easy way out, other than comprehensive
testing.

Regards,

Vinay Sajip

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: json vs simplejson

2012-06-12 Thread Luke Plant
On 12/06/12 06:14, Alex Ogier wrote:

> This seemed strange to me because the standard library json shipping
> with python 2.7.3 is in fact simplejson 2.0.9, so I did some digging.
> It turns out that if the C extensions have been compiled and you pass
> a str instance to loads(), then you get that behavior in both
> versions. This isn't documented anywhere, but here's the offending
> pieces:
> 
> http://hg.python.org/releasing/2.7.3/file/7bb96963d067/Modules/_json.c#l419
> https://github.com/simplejson/simplejson/blob/master/simplejson/_speedups.c#L527
> 
> If the C extensions aren't enabled, or you pass a unicode string to
> loads(), then you get the "proper" behavior as documented. I'm not
> sure how you are triggering this optimized, iffy behavior in
> django.utils.simplejson though, without also triggering it in the
> standard library. Did you ever install simplejson with 'pip install
> simplejson' such that Django picked it up? Can you try running 'from
> django.utils import simplejson; print simplejson.__version__'?

Thanks for digging into that.

(BTW, in reply to Vinay, yes I meant "from simplejson to json", not the
other way around).

I've found the same difference of behaviour on both a production machine
where I'm running my app (CentOS machine, using a virtualenv, Python
2.7.3), and locally on my dev machine which is currently running Debian,
using the Debian Python 2.7.2 packages.

In both cases, json is always returning unicode objects, which implies I
don't have the C extensions for the json module according to your
analysis. I don't know enough about how this is supposed to work to
understand why.

It also implies I probably not the only one affected by this, if it's
happened on two quite different machines. Looking at this discussion:

http://stackoverflow.com/questions/712791/json-and-simplejson-module-differences-in-python

it seems that lots of people don't have the C extension for json
(reporting json 10x slower than simplejson).

Luke

-- 
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: json vs simplejson

2012-06-12 Thread Luke Plant
On 12/06/12 10:58, Vinay Sajip wrote:
> 
> I'm not sure there's any easy way out, other than comprehensive
> testing.

There is another issue I found.

Django's DateTimeAwareJSONEncoder now subclasses json.JSONEncoder
instead of simplejson.JSONEncoder. The two are not perfectly compatible.
simplejson.dumps() passes the keyword argument 'namedtuple_as_object' to
the JSON encoder class that you pass in, but json.JSONEncoder doesn't
accept that argument, resulting in a TypeError.

So any library that uses Django's JSONEncoder subclasses, but uses
simplejson.dumps() (either via 'import simplejson' or 'import
django.utils.simplejson') will break. I found this already with
django-piston.

I think we at least need a bigger section in the release notes about this.

Luke

-- 
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: json vs simplejson

2012-06-12 Thread Alex Ogier
On Jun 12, 2012 6:54 AM, "Luke Plant"  wrote:
> I've found the same difference of behaviour on both a production machine
> where I'm running my app (CentOS machine, using a virtualenv, Python
> 2.7.3), and locally on my dev machine which is currently running Debian,
> using the Debian Python 2.7.2 packages.
>
> In both cases, json is always returning unicode objects, which implies I
> don't have the C extensions for the json module according to your
> analysis. I don't know enough about how this is supposed to work to
> understand why.
>

I'm not sure why no one is getting speedups from simplejson, but I can
tell you that on python 2.6+ django.utils.simplejson.loads should be
an alias for json.loads:

>>> import json
>>> json.loads('{"a":"b"}')
{u'a': u'b'}
>>> from django.utils import simplejson
>>> simplejson.loads('{"a":"b"}')
{u'a': u'b'}
>>> json.loads == simplejson.loads
True

Best,
Alex Ogier

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: json vs simplejson

2012-06-12 Thread Alex Ogier
On Tue, Jun 12, 2012 at 7:19 AM, Luke Plant  wrote:
>
> There is another issue I found.
>
> Django's DateTimeAwareJSONEncoder now subclasses json.JSONEncoder
> instead of simplejson.JSONEncoder. The two are not perfectly compatible.
> simplejson.dumps() passes the keyword argument 'namedtuple_as_object' to
> the JSON encoder class that you pass in, but json.JSONEncoder doesn't
> accept that argument, resulting in a TypeError.
>
> So any library that uses Django's JSONEncoder subclasses, but uses
> simplejson.dumps() (either via 'import simplejson' or 'import
> django.utils.simplejson') will break. I found this already with
> django-piston.
>

Wait, 'import simplejson' works? Then that explains your problems. You
are using a library you installed yourself that has C extensions,
instead of the system json. If you switch to a system without
simplejson installed, then you should see the "proper" behavior from
django.utils.simplejson.loads(). If your program depends on some
optimized behavior of the C parser such as returning str instances
when it finds ASCII, it is bugged already on systems without
simplejson. If Django depends on optimized behavior, then it is a bug,
and a ticket should be filed.

Best,
Alex Ogier

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: json vs simplejson

2012-06-12 Thread Luke Plant
On 12/06/12 13:28, Alex Ogier wrote:

> Wait, 'import simplejson' works? Then that explains your problems. You
> are using a library you installed yourself that has C extensions,
> instead of the system json. If you switch to a system without
> simplejson installed, then you should see the "proper" behavior from
> django.utils.simplejson.loads(). If your program depends on some
> optimized behavior of the C parser such as returning str instances
> when it finds ASCII, it is bugged already on systems without
> simplejson. If Django depends on optimized behavior, then it is a bug,
> and a ticket should be filed.

I agree my existing program had a bug. I had simplejson installed
because a dependency pulled it in (which means it can be difficult to
get rid of).

The thing I was flagging up was that the release notes say "You can
safely change any use of django.utils.simplejson to json." I'm just
saying the two differences I've found probably warrant at least some
documentation.

The second issue is difficult to argue as a bug in my program or
dependencies. Django has moved from a providing a JSONEncoder object
that supported a certain keyword argument to one that doesn't. We could
'fix' it to some extent:

class DjangoJSONEncoder(json.JSONEncoder):
def __init__(self, *args, **kwargs):
kwargs.pop('namedtuple_as_object')
super(DjangoJSONEncoder, self).__init__(*args, **kwargs)

But like that, it would create more problems if the json module ever
gained that keyword argument in the future.


Luke


-- 
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: json vs simplejson

2012-06-12 Thread Alex Ogier
On Tue, Jun 12, 2012 at 8:49 AM, Luke Plant  wrote:
>
> I agree my existing program had a bug. I had simplejson installed
> because a dependency pulled it in (which means it can be difficult to
> get rid of).
>
> The thing I was flagging up was that the release notes say "You can
> safely change any use of django.utils.simplejson to json." I'm just
> saying the two differences I've found probably warrant at least some
> documentation.
>
> The second issue is difficult to argue as a bug in my program or
> dependencies. Django has moved from a providing a JSONEncoder object
> that supported a certain keyword argument to one that doesn't. We could
> 'fix' it to some extent:
>
> class DjangoJSONEncoder(json.JSONEncoder):
>    def __init__(self, *args, **kwargs):
>        kwargs.pop('namedtuple_as_object')
>        super(DjangoJSONEncoder, self).__init__(*args, **kwargs)
>
> But like that, it would create more problems if the json module ever
> gained that keyword argument in the future.
>

Like loads(), json.JSONEncoder is just an alias for
simplejson.JSONEncoder, and we need to support versions of simplejson
down to 1.9 which is what python 2.6 ships with. This
'namedtuple_as_object' thing seems to only appear as of simplejson
2.2, which means that depending on it is a bug that appears on any
system without a recent version of simplejson (for example, the
version that was bundled with Django doesn't support it). Depending on
this kwarg is a bug in Django, and should be fixed.

https://github.com/simplejson/simplejson/blob/namedtuple-object-gh6/simplejson/encoder.py

It's clear that people have begun to depend on the quirky ways in
which simplejson diverged from its earlier codebase. I found the place
where that unicode "proper behavior" was fixed, so apparently in
Python's stdlib they undid the C optimizations at some point. So I was
incorrect earlier, and the C speedups work "properly" with Python
stdlib's patch.

http://bugs.python.org/issue11982

Basically, anyone who depended on features of simplejson added after
1.9, or its wonky optimizations, already had arguably broken code in
that it only worked when simplejson is installed. I'm torn as to
whether we should add a note about these subtle problems when
switching to json, recommend that people switch to simplejson instead,
or undeprecate django.utils.simplejson as a necessary wart (we can
still stop vendoring simplejson though).

Best,
Alex Ogier

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



ModelForms and the Rails input handling vulnerability

2012-06-12 Thread Luke Plant
Hi all,

On django-core we've been discussing an issue that was security related.
However, we decided it wasn't urgent enough to warrant a security
release, and so we're moving the discussion here (especially because we
couldn't agree on what to do).

For the record, I brought up the issue initially, and favour the most
secure of the options outlined below, but I'll try to summarise with as
little bias as I can!

The issue was brought up by the Rails input handling vulnerability a
while back [1]. To sum that up, it seems that a common Rails idiom is
something like, using Django syntax:

   MyModel.objects.create(**request.POST)

(only more complicated). To avoid people having access to privileged
fields, you are supposed to set an attribute on MyModel to restrict the
fields that can be set in this way. This is a really poor design choice
(which they've finally admitted [2]), since it is insecure-by-default,
and led to the embarrassing github incident, and probably many more
unknown vulnerabilities in Rails apps.

For us, we don't have this issue because we have the forms layer that
sits between the HTTP request and model creation, and that's the
recommended way to do things.

However, in the case of ModelForms, you can easily get to a situation
where you've effectively got the same thing as Rails. In theory you've
got the forms layer, but in reality your form was autogenerated using
all the fields on the model. This happens if you use a ModelForm without
explicitly defining Meta.fields - which is the easiest thing to do since
it is less work.

While you can use 'Meta.exclude' to remove specific fields, you still
have to remember to do that.

This is particularly dangerous in two fairly common situations:

1) You add new fields to the model that are not supposed to be publicly
edited.

2) In the template, you are listing form fields individually, not just
doing {{ form.as_p }}, so you can't even see the fact that other fields
are editable, but the view/form code does allow an attacker to edit
those fields.

Also, the UpdateView CBV makes it very easy to be vulnerable - it will
generate a ModelForm class with all fields editable by default.

== Options ==

There were three main courses of action that we talked about on django-core.

= Option 1: Leave things as they are, just document the problem. This
was the least popular view on django-core.

= Option 2: Deprecate Meta.exclude, but still allow a missing
Meta.fields attribute to indicate that all fields should be editable.

The policy here is essentially this: if any fields the model are
sensitive, assume all are potentially, and require whitelisting, not
blacklisting.

= Option 3: Make the 'Meta.fields' attribute a required attribute, which
must list all the model fields that should be editable. Without it,
ModelForms would not work at all.

This also means deprecating Meta.exclude (it's redundant once you are
saying that all fields should be explicitly whitelisted).


Note that the admin would not be affected under any of these options -
the admin has its own mechanism for setting the Meta.fields, and "all
fields editable by default" is a good policy for something like the admin.

For either 2) or 3), we would be making the change in Django 1.6, with a
deprecation path - faster than our normal deprecation path, but not
immediate.

Also the UpdateView and CreateView CBVs would need modifying under at
least option 3, to match the requirements of the ModelForms they will
need to generate.

== Comparison ==

== Option 1: This is the least secure, but most convenient - you have
several ways to specify which fields should be editable, you can use
whichever you like. "We're all consenting adults here".

An argument in favour of keeping this is if we don't, people will just
use 'fields = [f.name for f in MyModel._meta.fields]' anyway.

== Option 2: the retains some of the convenience of option 1, while
encouraging more careful handling of "sensitive" models.

== Option 3: the most secure, the least convenient. You have to list all
fields for every ModelForm (except for cases of sub-classing forms,
where the base class's Meta.fields would still work, of course).
"Explicit is better than implicit".

The option doesn't make an assumption that models are either 'sensitive'
or not. It is also more consistent than option 2: if you add a field to
a model, you know that if it is meant to be publicly editable, you have
to edit the ModelForms to add it, and if it is not meant to be editable,
you don't, because the list is always "opt in".

 ~  ~  ~

So - discuss! If you have other options to bring to the table, please
do. Apologies to the core devs if I missed or misrepresented something.


Thanks,

Luke


[1] http://chrisacky.posterous.com/github-you-have-let-us-all-down
[2] http://weblog.rubyonrails.org/2012/3/21/strong-parameters/

-- 
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/

-- 
You received thi

Re: ModelForms and the Rails input handling vulnerability

2012-06-12 Thread Alex Ogier
On Tue, Jun 12, 2012 at 7:16 PM, Luke Plant  wrote:
> Hi all,
>
> On django-core we've been discussing an issue that was security related.
> However, we decided it wasn't urgent enough to warrant a security
> release, and so we're moving the discussion here (especially because we
> couldn't agree on what to do).
>
> For the record, I brought up the issue initially, and favour the most
> secure of the options outlined below, but I'll try to summarise with as
> little bias as I can!
>
> The issue was brought up by the Rails input handling vulnerability a
> while back [1]. To sum that up, it seems that a common Rails idiom is
> something like, using Django syntax:
>
>   MyModel.objects.create(**request.POST)
>
> (only more complicated). To avoid people having access to privileged
> fields, you are supposed to set an attribute on MyModel to restrict the
> fields that can be set in this way. This is a really poor design choice
> (which they've finally admitted [2]), since it is insecure-by-default,
> and led to the embarrassing github incident, and probably many more
> unknown vulnerabilities in Rails apps.
>
> For us, we don't have this issue because we have the forms layer that
> sits between the HTTP request and model creation, and that's the
> recommended way to do things.
>
> However, in the case of ModelForms, you can easily get to a situation
> where you've effectively got the same thing as Rails. In theory you've
> got the forms layer, but in reality your form was autogenerated using
> all the fields on the model. This happens if you use a ModelForm without
> explicitly defining Meta.fields - which is the easiest thing to do since
> it is less work.
>
> While you can use 'Meta.exclude' to remove specific fields, you still
> have to remember to do that.
>
> This is particularly dangerous in two fairly common situations:
>
> 1) You add new fields to the model that are not supposed to be publicly
> edited.
>
> 2) In the template, you are listing form fields individually, not just
> doing {{ form.as_p }}, so you can't even see the fact that other fields
> are editable, but the view/form code does allow an attacker to edit
> those fields.
>
> Also, the UpdateView CBV makes it very easy to be vulnerable - it will
> generate a ModelForm class with all fields editable by default.
>
> == Options ==
>
> There were three main courses of action that we talked about on django-core.
>
> = Option 1: Leave things as they are, just document the problem. This
> was the least popular view on django-core.
>
> = Option 2: Deprecate Meta.exclude, but still allow a missing
> Meta.fields attribute to indicate that all fields should be editable.
>
> The policy here is essentially this: if any fields the model are
> sensitive, assume all are potentially, and require whitelisting, not
> blacklisting.
>
> = Option 3: Make the 'Meta.fields' attribute a required attribute, which
> must list all the model fields that should be editable. Without it,
> ModelForms would not work at all.
>
> This also means deprecating Meta.exclude (it's redundant once you are
> saying that all fields should be explicitly whitelisted).
>
>
> Note that the admin would not be affected under any of these options -
> the admin has its own mechanism for setting the Meta.fields, and "all
> fields editable by default" is a good policy for something like the admin.
>
> For either 2) or 3), we would be making the change in Django 1.6, with a
> deprecation path - faster than our normal deprecation path, but not
> immediate.
>
> Also the UpdateView and CreateView CBVs would need modifying under at
> least option 3, to match the requirements of the ModelForms they will
> need to generate.
>
> == Comparison ==
>
> == Option 1: This is the least secure, but most convenient - you have
> several ways to specify which fields should be editable, you can use
> whichever you like. "We're all consenting adults here".
>
> An argument in favour of keeping this is if we don't, people will just
> use 'fields = [f.name for f in MyModel._meta.fields]' anyway.
>
> == Option 2: the retains some of the convenience of option 1, while
> encouraging more careful handling of "sensitive" models.
>
> == Option 3: the most secure, the least convenient. You have to list all
> fields for every ModelForm (except for cases of sub-classing forms,
> where the base class's Meta.fields would still work, of course).
> "Explicit is better than implicit".
>
> The option doesn't make an assumption that models are either 'sensitive'
> or not. It is also more consistent than option 2: if you add a field to
> a model, you know that if it is meant to be publicly editable, you have
> to edit the ModelForms to add it, and if it is not meant to be editable,
> you don't, because the list is always "opt in".
>

I'm +1 on option 1: just keep it the way it is. The reason is that we
aren't in the same situation as Rails. We have explicit ModelForms
that conform loudly and obviously to the model. These are better t

Re: ModelForms and the Rails input handling vulnerability

2012-06-12 Thread Karen Tracey
On Tue, Jun 12, 2012 at 10:10 PM, Alex Ogier  wrote:

> No one can sneak extra unexpected fields past a developer by editing HTML
> client side, because if the field wasn't rendered to HTML it's not
> going to validate.
>

But it may. If you have a template which renders specific fields, and yet
the form is set to allow a wider set of fields than are actually rendered,
client-side editing CAN result in the form allowing change to a field that
had not been rendered in the template. The Django ModelForm doesn't know
what fields were actually rendered in the HTML, it only knows what fields
have been included/excluded from the ModelForm. You can post data for a
field that was not rendered and it may pass validation and get saved.

Karen

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ModelForms and the Rails input handling vulnerability

2012-06-12 Thread Gary Reynolds
On 13 June 2012 09:16, Luke Plant  wrote:

> = Option 2: Deprecate Meta.exclude, but still allow a missing
> Meta.fields attribute to indicate that all fields should be editable.
>
> The policy here is essentially this: if any fields the model are
> sensitive, assume all are potentially, and require whitelisting, not
> blacklisting.
>

+1 for option 2 as I've always felt like Meta.exclude was the wrong way to
do go building a ModelForm which was supposed to have a subset of fields.

Given that the underlying Model definition may change, it would be easy to
all of a sudden expose a newly added field. This would be particularly bad
when in your template you use {{ form.as_p }} or similar. I never use
Meta.exclude for this reason, always giving a whitelist anyway.

+0 for option 3 as I think the convenience of a missing Meta.fields is
worthwhile, but I could live without it.

Another option could be to split ModelForm such that you could mix and
match the functionality you require. Moving forward the ModelForm class
could provide option 3 behaviour, and a newly introduced class could
provide the option 2 behaviour. The only difference would be the alias to
Meta.fields not existing results in all model fields, but this would be an
explicit step.

Gary

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ModelForms and the Rails input handling vulnerability

2012-06-12 Thread Alex Ogier
On Tue, Jun 12, 2012 at 11:43 PM, Karen Tracey  wrote:
> On Tue, Jun 12, 2012 at 10:10 PM, Alex Ogier  wrote:
>>
>> No one can sneak extra unexpected fields past a developer by editing HTML
>> client side, because if the field wasn't rendered to HTML it's not
>> going to validate.
>
>
> But it may. If you have a template which renders specific fields, and yet
> the form is set to allow a wider set of fields than are actually rendered,
> client-side editing CAN result in the form allowing change to a field that
> had not been rendered in the template. The Django ModelForm doesn't know
> what fields were actually rendered in the HTML, it only knows what fields
> have been included/excluded from the ModelForm. You can post data for a
> field that was not rendered and it may pass validation and get saved.
>
> Karen
>

Oof, you are right. I hadn't considered the wrench that templating
throws into the works. I've always done {% for field in form.fields %}
myself, but that's a bad assumption to make.

Best,
Alex Ogier

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: ModelForms and the Rails input handling vulnerability

2012-06-12 Thread Anssi Kääriäinen
On 13 kesä, 02:16, Luke Plant  wrote:
> For the record, I brought up the issue initially, and favour the most
> secure of the options outlined below, but I'll try to summarise with as
> little bias as I can!

> = Option 1: Leave things as they are, just document the problem. This
> was the least popular view on django-core.
>
> = Option 2: Deprecate Meta.exclude, but still allow a missing
> Meta.fields attribute to indicate that all fields should be editable.
>
> The policy here is essentially this: if any fields the model are
> sensitive, assume all are potentially, and require whitelisting, not
> blacklisting.
>
> = Option 3: Make the 'Meta.fields' attribute a required attribute, which
> must list all the model fields that should be editable. Without it,
> ModelForms would not work at all.

I support option 2. I will try to keep the reasoning part short. The
end of post is devoted to another idea.

Here are the reasons why I think Option 3 isn't worth the trouble:

1. Users will learn to use fields = [f.attname for f in
Permission._meta.fields]. At that point we haven't gained anything in
security, but made modelforms less convenient to use.

2. The fix doesn't actually fix the issue for all cases. You are still
allowed to use the same form in a way that renders additional fields
based on user permissions or if the user is authenticated. If you
reuse the same form in multiple views for example, or your template
contains some logic to add/remove fields dynamically.

Yes, option 3 is a bit more secure than option 2, but I don't find the
cost-benefit ratio correct. It will still not protect users from this
admittedly easy to do security mistake. So, I am -0 to option 3.



The short part is now over. I will throw in another idea I think is at
least worth presenting. The idea is that ModelForms contain attribute
contains_secure_data. Defaults to True. If True, then one must use
allowed_fields when initialising the form from request. So, you could
not do this for secure form:
MyForm(request.POST)
instead you would need to do:
MyForm(request.POST, allowed_fields=('field1', 'field2'))

Assume your model Document has three fields: title, body and creator.
If the creator edits his document, he would likely be able to change
the creator to somebody else when editing his document if the form is
defined like this using 1.4 behavior:

class DocumentForm:
   class Meta:
   model = Document

But, the developer could create this form:

class DocumentForm:
class Meta:
model = Document
fields = ('title', 'body')
contains_secure_data = False

This would would not need allowed_fields for form.__init__().

Alternatively, assume title shall not be changed on document edit.
Then the developer could use this form:

class DocumentForm:
class Meta:
model = Document
contains_secure_data = True

The developer would then use this form in create as:
DocumentForm(..., allowed_fields=('title', 'body'))
and in update as:
DocumentForm(..., allowed_fields=('body',))

My opinion is that the form.Meta isn't actually the right place to set
the security restriction. The same form can be used in different
places with different security restriction needs. The security
restriction should be done when the request data is added to the form.

The contains_secure_data could be thrown out for simplicity. However,
I do think it has its place. If allowed_fields (or form.meta.fields)
is always required, it leads to bad practices like using the "all
model.meta._fields". Ask the user if the form is security sensitive.
Only if it is, then ask for the field restrictions.

I'm not saying the above is the correct option. Just throwing in
ideas...

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



QuerySet refactoring

2012-06-12 Thread Anssi Kääriäinen
I am asking for directions about what to do about
django.db.models.sql.query (actually sql.*). I would like to refactor
the code in small incremental steps. However, this will bring internal
API breakages, and will likely add some more bugs temporarily.

While the ORM mostly works, it IMHO needs some polish. The reasons why
I see ORM refactoring as needed:
  - The code is too complex and underdocumented currently. Some
examples are query.add_q() and compiler.fill_related_selections().
  - Because of the above, adding new features is hard. There are some
long standing bugs which are hard to fix. There are 407 open ORM or
models.Model tickets, of which 247 are bugs).
  - I believe the ORM could be made faster. The ORM currently uses 4x
the time in Python compared to the time the DB needs to parse, plan
and execute a query like this: Model.objects.get(pk=1).

Why incremental rewrite instead of total rewrite? Total rewrite will
likely take so much time as to never actually get done. The underlying
structure of the ORM is good enough. There are things which would
likely be done in different way in total rewrite, but there isn't
anything blocker quality.

Why not use SQLAlchemy or some other ORM? I am no expert of
SQLAlchemy, but I believe it doesn't actually do the same thing as
Django's ORM. The complexity of Django's ORM comes from the need to
handle things like subqueries for negated multijoin lookups and
checking when to use LEFT JOIN, when INNER JOIN. SQLAlchemy doesn't do
that as far as I know.

I have no need to make the  ORM generic enough for no-SQL databases. I
don't believe generating a generic query.py class for no-SQL databases
is the correct approach. 80% of the code in query.py deal with joins,
null handling, subqueries and things like that. None of those would be
common to no-SQL DBs. Instead, no-SQL databases need to deal with
structured records, which isn't a problem for the SQL side of the ORM.
The common things are lookup handling and the API. The API is already
separate from sql.*. The lookup handling should be, too.

So, what I am trying to do? Things like:

https://code.djangoproject.com/ticket/16759 - patch:
https://github.com/akaariai/django/compare/ticket_16759
  - This is a performance improvement for query.clone() which should
alone make the ORM around 20% faster. This is definite DDN stuff, so
don't worry, I won't be committing this one currently.

https://code.djangoproject.com/ticket/17000 - patch:https://github.com/
django/django/pull/92, 
https://github.com/akaariai/django/compare/refactor_utils_tree
  - Make the utils.tree saner to use. Make query.add_q and
where.as_sql() cleaner. Fixes a couple of bugs.

https://code.djangoproject.com/ticket/16715 - somewhat old patch in
ticket.
  - Unify and fix join promotion logic.

There are more tickets somewhat ready in Trac.

The above are small steps to making the ORM easier to handle. The long
term goals at the moment are just cleanup. This cleanup should allow
new features (conditional aggregates, custom lookups etc), but is not
the immediate goal.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.