#33981: Squelching stack traces from DisallowedHost / Django's handling of vulnerability scans in production

2022-11-13 Thread Rob Banagale
I'm writing to request support for reconsidering the wontfix of #33981 
, which squelches the stack 
trace resulting from SuspiciousOperation DisallowedHost introduced in #13910 
.

#33981 should be reconsidered because discussion of the related issue, 
#13910 , focused strictly on behavioral consistency and code correctness 
but did not include discussion of the impact of this change.

The problem of excessive logs due to vulnerability scans in production 
deployments of Django goes back 10 years to #19866 
.

The main argument for a noisy DisallowedHost exception back then was to 
reduce the challenge of debugging an improper production configuration of 
Django for production.

In line with that argument, one of the arguments for closing #33981 was the 
desire to "see an exception when a traceback is logged."

However, the concern of needing logging around *this specific exception* 
was seemingly handled 
 when 
SuspiciousOperation was originally downgraded to a warning:

"I'm a bit concerned that changing it to a warning leaves us overly-prone 
to not-quickly-detected mis-configuration, but maybe we need to address 
that via something like #17101  
instead."

#17101 was the feature implementation of the --deploy option for the Django 
management command. This feature, combined with the clearly explained 
documentation 
in Deployment Checklist 

 
seems to reasonably handle this concern.

However, earlier this year, when PR #13910 
 provided consistency to 
logging calls from response_for_exception(), the change added a stack trace 
to SuspiciousOperation, (which includes DisallowedHost) and seriously 
buffed the log noise from ongoing vulnerability scans.

Two months ago, someone submitted a PR #33981 
, suggesting the inclusion of 
stack traces on DisallowedHost was a mistake. 

#33981's author makes the following points:

   1. "Requiring a user to write a custom formatter in order to eliminate 
   an exception trace dump from the logs is, in my view, a useability mistake."
   
   2. Edited down for brevity: 
   
   "There are places in the framework where exception handlers serve as a 
   catch-all for exceptions...tracebacks are needed when logs are written from 
   these locations as they are so general and it's too late to find a clean 
   way to handle them all.
   
   SuspiciousOperation, is a handled exception, handled cleanly and 
   specifically. The probability you want a traceback is low. So why generate 
   a traceback for this well handled exception? That just has the effect of 
   potentially distracting from any other more urgent tracebacks."

I believe both of these points have merit consideration above and beyond 
the historical context and objection handling provided above.

The challenge of production deployment for Django comes up all too often. 
The --deploy option and docs reasonably deal with alerting users to 
ALLOWED_HOSTS configuration gotchas.

SuspiciousOperation exceptions, in general, focus on user behavior around 
common security concerns. Warnings alone provide clear enough indication of 
what has been attempted with stack traces offering limited value. 

In the case of DisallowedHost, including these traces arguably sets every 
production deployment of a Django project up for logs that mostly consist 
of one stack trace being displayed over and over again. 

I want to address one other point: People have complained about the noise 
of DissallowedHost when it was only a Warning in the logs.  

In years past, I've seen these complaints about vulnerability scanning log 
spam deflected to developer's failure to properly configure their web 
server or reverse proxy. ~"Your stack is wrong. This should never reach 
python."

These complaints persist and varying use cases make these kind of 
configuration changes are non-trivial 
.
 
This solution to DisallowedHost log noise blows problem of "I just want to 
be able to see my Django application's debug logs" up into a new 
stratosphere of challenge in infrastructure configuration.

The author of #33981 was referred to this mailing list to seek agreement of 
additional consideration from this group. 

Though the author seemed impassioned to express their frustration and 
desire for serious consideration of their points and solution, I did not 
see that they took this costly additional step.

I have again been dealing with this default behavior interfering with good 
faith efforts to view debug logging on an i

Re: Advancing the "content negotiation" and "modernising request object" proposals.

2022-11-13 Thread 'st...@jigsawtech.co.uk' via Django developers (Contributions to Django itself)
I also agree with raising a UnsupportedMediaType and having custom handlers 
for 400, 415 is always useful IMO

On Saturday, 12 November 2022 at 12:24:45 UTC Adam Johnson wrote:

> I would have a slight preference for raising an UnsupportedMediaType as 
>> well and letting that percolate to a 415 as it seems more correct from a 
>> content negotiation perspective.
>
>
> Thinking about it again I think I have a slight preference too.
>
> I guess this would warrant adding a urlconf option for a custom handler415 
> view, like handler400 etc. ?
>
> On Fri, Nov 11, 2022 at 4:58 PM charettes  wrote:
>
>> > DRF’s behaviour feels more correct to me, since it allows terser views 
>> that don’t check the content type explicitly. But it’s less backwards 
>> compatible. I’m not sure which I prefer.
>>
>> Given the .data attribute would be a new feature of the request object I 
>> assume we don't have any backward compatiblity concerns to worry about as 
>> long as we document the behaviour of .data properly and leave .POST 
>> unchanged? I would have a slight preference for raising an 
>> UnsupportedMediaType as well and letting that percolate to a 415 as it 
>> seems more correct from a content negotiation perspective.
>>
>> Le vendredi 11 novembre 2022 à 11:22:44 UTC-5, Adam Johnson a écrit :
>>
>>> This first-step solution is good with me. It will allow everyone to 
>>> switch to request.data (etc.). And there’d be a clear way to use your own 
>>> logic to set request.data if needed: write a middleware (or view decorator, 
>>> view class, etc.).
>>>
>>> What should request.data be/do in the case of an unsupported content 
>>> type? Currently request.POST returns an empty QueryDict. But DRF raises 
>>> UnsupportedMediaType if it has no matching parser, which is translated into 
>>> a 415 Unsupported Media Type response.
>>>
>>> DRF’s behaviour feels more correct to me, since it allows terser views 
>>> that don’t check the content type explicitly. But it’s less backwards 
>>> compatible. I’m not sure which I prefer.
>>>
>>> On Thu, Nov 10, 2022 at 3:14 AM charettes  wrote:
>>>
 Hello Carlton,

 This is not an area of the code base I'm heavily involved with but the 
 increment approach you are proposing over this lack of feature support for 
 basic content negotiation seems like a sane approach to gradually make the 
 landscape better in this area without trying to get everything just right 
 in a single stab.

 Adding ``request.data`` with support limited to JSON bodies at first 
 seems the minimal step to lay some foundations towards revisiting the 
 inclusion of very core/HTTP centric features that are sadly only available 
 in DRF at the moment.

 +1 from me.

 Cheers,
 Simon

 Le mercredi 9 novembre 2022 à 06:32:53 UTC-5, carlton...@gmail.com a 
 écrit :

> Hi all.
>
> I'm looking for a high-level sanity check if you would. 
>
> I've been trying to see a way forward through a nest of issues around 
> two concrete proposals:
>
> 1. Adding "content negotiation" to the request object, allowing 
> automatical parsing of different content types, such as JSON, as well as 
> allowing that to work for all request methods, rather than just POST. 
>
> 2. Modernising the API for the request object, adding attributes such 
> as `request.data`, and `request.query_params`, etc., rather than the 
> uppercase POST, GET, and so on.
>
> The first is a major stepping stone towards having (JSON or other) API 
> support in core — the "merge DRF into core" request that comes up 
> frequently. (The other main side of that would be a review of 
> serialization 
> and forms, in light of developments such as Pydantic, attrs/cattrs, and 
> django-readers, but that is **not** on topic here.) This was first 
> suggested in 2011, but has made little progress in that time. [0][1]
>
> [0]: https://groups.google.com/g/django-developers/c/4c4xT3ULNLk
> [1]: https://code.djangoproject.com/ticket/21442
>
> The second Adam Johnson proposed 2020, and was nearly merged bar 
> Mariusz
> **blinking** at the size of the distruption, particularly for 
> documentation
> throughout the community, for no change in behaviour. [2][3]
>
> There was an inconclusive discussion about whether we right there[4] 
> but, at the time I linked the modernisation to the content negotiation 
> issue, as the feature needed to pay for the change. 
>
> [2]: 
> https://groups.google.com/g/django-developers/c/Kx8BfU-z4_E/m/lFXTF0IMCQAJ
> [3]: https://code.djangoproject.com/ticket/32259
> [4]: 
> https://forum.djangoproject.com/t/technical-board-vote-on-ticket-32259-modernize-request-attribute-names/10255
>
> Digging further into the history, with a mind to move these issues 
> forward, having **not** merged Adam's patch