Re: Smart input filtering

Luke Plant Mon, 21 Nov 2005 14:58:58 -0800

On Sun, 20 Nov 2005 20:21:54 -0600 Jacob Kaplan-Moss wrote:

> I think this is a brilliant idea.  As far as I know no other  
> frameworks take input security quite as seriously as they should,
> and I'd love nothing more than to become the OpenBSD of web
> frameworks.
> 
> That said, I'm less than thrilled about the request.GET['foo'].as_int()
> syntax.  Having a bunch of as_* methods seems slightly crufty to
> me.  I *really* like having the default request.GET['foo'] return a  
> "safe" object... but the as_* methods kinda weird me out.  I'd have  
> to play with it, but perhaps GET.__getitem__ could return a string  
> subclass that acted "safe" unless cast to an "unsafe" object
> explicitly?


This highlights the problem with this approach - what is "safe"?  Safe
for output in HTML? Safe against SQL injection attacks (assume for a
second someone isn't using Django's DB backend)?  Safe for inserting
directly into PDF or PostScript? Safe to be eval'ed in Python?
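
To make that concrete, here is a minimal sketch in present-day Python
(standard library only, not Django code; the sample string and the
"comments" table are made up for illustration).  The same raw input
needs a different treatment for each destination, so no single
pre-filtered form can serve as "safe" for both:

```python
import html
import sqlite3

# Hypothetical user input containing both markup and a quote character.
raw = "<b>O'Reilly</b>"

# HTML context: escape markup characters when the value is output.
as_html = html.escape(raw)

# SQL context: bind the raw value as a parameter instead of escaping it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")
conn.execute("INSERT INTO comments (body) VALUES (?)", (raw,))
stored = conn.execute("SELECT body FROM comments").fetchone()[0]

print(as_html)  # HTML-escaped form, only meaningful inside an HTML page
print(stored)   # the database keeps the original text unchanged
```

Had the input been HTML-escaped on the way in, the database would now
hold the escaped text, which is wrong for every consumer that is not an
HTML page.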

I have seen this kind of thing in action and hated it.  In ASP.NET (my
day job, got to put the bread on the table somehow, etc. </violins>),
they have something very like this - input containing nasty HTML gets
an exception thrown back at the user (in debug mode at least, I presume
a 403 if debug is off). The problem is that suddenly someone is unable
to use your app.
Yesterday their blog post/comment/whatever worked fine, but some
character they have added today has triggered the filter, and they are
now getting an exception so that the app never sees their input, or
parts of their input are mysteriously being stripped out when they go
back and check.  And as an application developer, you've no idea this
has happened.

I think I could live with it if you *always* had to specify what the
output format of GET and POST should be, but a magical default format
is a really bad idea IMO.  The magic is never good enough to rely on for
security, because it is impossible to know what the app is going to do
with the data.  So you can end up developing *less* secure apps - if
you are encouraging the developer to believe that the default output is
"safe" (against an unspecified attack), you are telling them they don't
need to worry about security, but they always do. You also have to know
all about browser peculiarities, as someone else mentioned, and you can
never protect against tomorrow's bugs and exploits.

You *cannot* make development at this level idiot-proof.  Until you
can, you should never say "it's OK to be an idiot here, we'll watch out
for you." (I hope that doesn't sound arrogant -- I know perfectly well
I am capable of being an idiot and not sanitising input, but the
framework encouraging me not to think won't help me do better).

So, I strongly think GET['foo'] should either be raw or you should 
be forced to do GET.rawstring('foo','default').
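
A rough sketch of that second option, assuming a hypothetical
`QueryData` class standing in for request.GET (this is not Django's
actual API; only the rawstring('foo', 'default') call shape comes from
the proposal above).  Plain indexing is refused, so there is no magic
default to mistake for "safe":

```python
class QueryData:
    """Hypothetical stand-in for request.GET that never filters input."""

    def __init__(self, params):
        self._params = dict(params)

    def rawstring(self, key, default=None):
        # Return the submitted value untouched, or the default if absent.
        return self._params.get(key, default)

    def __getitem__(self, key):
        # No implicit default format: the caller must ask explicitly.
        raise TypeError("use rawstring(key, default) to read input")


GET = QueryData({"foo": "<script>alert(1)</script>"})
value = GET.rawstring("foo", "default")    # raw, unmodified input
missing = GET.rawstring("bar", "default")  # key absent, default returned
```

Whatever escaping the value needs is then done at the point of use,
where the output format is actually known.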

Luke

-- 
OSBORN'S LAW
    Variables won't, constants aren't.

Luke Plant || L.Plant.98 (at) cantab.net || http://lukeplant.me.uk/
