Yes, every time you get data from an untrusted source you must validate 
it. As well as *every time you change model attributes, ever*. There seems 
to be a widespread frame of mind in Django that validation is something you 
only need to do with data from untrusted sources. As someone who has had 
to deal with the consequences of this pattern in mission critical systems, 
this terrifies me, and I consider it *extremely* harmful. Untrusted users 
are not the only place you can get bad data from. Bugs can happen anywhere, 
and no data source can be considered "safe". It happens *all the time*. 
Nothing is more dangerous than a developer who says "don't worry, I'll 
remember to do everything perfectly 100% of the time". This is why 
model-level validation is the default in other ORMs. Django is not somehow 
immune to this fundamental property of software.

I am aware there are patterns to work around this in Django. My position is 
that skipping validation should be the rare edge case and not the easy 
naive path. Unless Django's stated purpose is to be a cute toy for making 
blogs, and robust infrastructure is off-label, but that's not what I see in 
the wild.

On Friday, October 7, 2022 at 12:01:30 AM UTC-7 carlton...@gmail.com wrote:

> > ... the duplication I was referring to is having both Forms and 
> Serializers do validation.
>
> That's a separate issue. 
>
> Can we merge various aspects of DRF into Django, so that it better handles 
> building JSON APIs? Yes, clearly. One step of that is better content type 
> handling, another is serializers. (There are others). 
> On the serializer front, it would be a question of making django.forms 
> better able to handle list-like (possibly do-able with FormSet) and nested 
> data, and so on. 
> Not a small project, but with things like django-readers, and 
> Pydantic (and django-ninja), and attrs/cattrs showing new ideas, 
> re-thinking about serialization in Django is about due. 
>
> But the issue is here: 
>
> > ... I also don't relish the thought of needing to use a Form or 
> Serializer every time I alter a Model's data.
>
> I'm like literally, "¿Qué? 😳" - Every single time you get data from an 
> untrusted source you simply **must** validate it before use. ("Filter 
> input, escape output", I was drilled.) That applies exactly the same to a 
> CSV file as it does to HTTP request data. (That your CSV is malformed is 
> axiomatic no? :) 
>
> If you want to enforce validation, with a single call, write a method (on 
> a manager likely) that encapsulates your update logic (and runs the 
> validation before save). Then always use that in your code. (That's long 
> been a recommended pattern 
> <https://www.dabapps.com/blog/django-models-and-encapsulation/>.) But 
> don't skip the validation layer on your incoming data. 
>
> I would be -1 to `validate` kwarg to `save()` — that's every user ever 
> wondering *should I use it?* every time. (Same for a setting.)
> Rather — is this a docs issue? — we should re-emphasise the importance of 
> the validation layer. 
> Then if folks want a convenience API to do both tasks, they're free to 
> write that for their models. (This is what Uri has done for Speedy Net. 
> It's not a bad pattern.) 
>
> On Fri, 7 Oct 2022 at 04:34, Aaron Smith <aa...@aaronsmith.co> wrote:
>
>> James - to clarify, the duplication I was referring to is having both 
>> Forms and Serializers do validation. I often work with web apps where data 
>> for the same model can arrive via user input, serializer, or created in 
>> some backend process e.g. Celery. If forms/serializers are your validation 
>> layer, you need to duplicate it and worry about how to keep them from 
>> diverging over time as there's no single source of truth. I also don't 
>> relish the thought of needing to use a Form or Serializer every time I 
>> alter a Model's data.
>>
>> Perhaps we think about validation differently. I consider it to be 
>> critical to maintain complex systems with any kind of confidence, any time 
>> data is being created or changed, regardless of where that change comes 
>> from. Bugs can happen anywhere and validation is the best (only?) option to 
>> prevent data-related bugs.
>> On Thursday, October 6, 2022 at 12:03:28 PM UTC-7 James Bennett wrote:
>>
>>> On Thu, Oct 6, 2022 at 9:00 AM Aaron Smith <aa...@aaronsmith.co> wrote:
>>>
>>>> James - The problem with moving validation up the stack, i.e. to 
>>>> logical branches from Model (Form, Serializer) is that you must duplicate 
>>>> validation logic if your data comes from multiple sources or domains (web 
>>>> forms *and* API endpoints *and* CSVs polled from S3). Duplication leads 
>>>> to divergence leads to horrible data integrity bugs and no amount of test 
>>>> coverage can guarantee safety. Even if you consider Django to be "only a 
>>>> web framework" I would still argue that validation should be centralized 
>>>> in the data storage layer. Validity is a core property of data. 
>>>> Serialization and conversion change between sources and are a different 
>>>> concern than validation.
>>>>
>>>
>>> I would flip this around and point out that the duplication comes from 
>>> seeing the existing data conversion/validation layer and deciding not to 
>>> use it.
>>>
>>> There's nothing that requires you to pass in an HttpRequest instance to 
>>> use a form or a serializer -- you can throw a dict of data from any source 
>>> into one and have it convert/validate for you.  Those APIs are also 
>>> designed to be easy to check and easy to return useful error messages from 
>>> on failed validation, while a model's save() has no option other than to 
>>> throw an exception at you and demand you parse the details out of it 
>>> (because it was designed as part of an overall web framework that already 
>>> had the validation layer elsewhere).
>>>
>>> So I would argue, once again, that the solution to your problem is to 
>>> use the existing data conversion/validation utilities (forms or 
>>> serializers) regardless of the source of the data. If you refuse to, I 
>>> don't think that's Django's problem to solve.
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/35ca622a-60d7-4084-846b-baa409ba19d3n%40googlegroups.com.
