Re: Model-level validation
I see a lot of people mentioning that other ORMs do validation, but not picking up on a key difference:

Many ORMs are designed as standalone packages. For example, in Python SQLAlchemy is a standalone DB/ORM package, and other languages have similar popular ORMs.

But Django's ORM isn't standalone. It's tightly integrated into Django, and Django is a web framework. And once you focus *specifically* on the web framework use case, suddenly things start going differently.

For example: data on the web is "stringly-typed" (effectively, since HTTP doesn't really have data types) and comes in via HTML's form mechanism or other string-y formats like JSON or XML payloads. So you need not just data *validation*, but data *conversion* which works for the web use case.

And since the web use case inevitably involves supporting forms/payloads that don't persist to a relational data store -- think of, for example, a contact form that sends an email, or forms that store their results client-side for things like language or theme preferences -- you inevitably end up needing to do data conversion and validation *independently of the ORM*.

And at that point, you have to start asking tough questions about whether it's worth having *two* conversion and validation layers, just because "every other ORM has this, so we have to put one in the ORM".

Which basically is where Django is. Yes, there are utilities to do your data conversion and validation in the ORM layer if you want to. But Django is, first and foremost, a web framework, which needs to support the web use case I've described above, and so its primary conversion/validation layer can never be the ORM.

Personally, I wish model-level validation had never been added even as an option, because in a web framework like Django it's conceptually the wrong place to put the validation logic.
Though that battle was lost many years ago, I'd be *strongly* against trying to expand it or start forcing the ORM to default to doing validation work that, in Django, properly belongs to the forms layer (or to serializers if you use DRF).

So: Django ships with ModelForm, which does the hard work of auto-deriving as much validation logic as possible from your model definition so you don't have to repeat it. DRF ships with ModelSerializer, which does the same thing for its validation/conversion layer. I would strongly urge people to use them. Trying to force all that validation back into the model layer misses the bigger picture of what Django is and how it works.

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAL13Cg9KHxksNOAVhcOQWS80%2BP5wJbE48V-Z17h15n-krfUVcA%40mail.gmail.com.
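As a rough illustration of the "stringly-typed" point in the message above: a form-encoded request body parsed with the standard library yields only strings, so something has to convert as well as validate, and in the contact-form case nothing is ever persisted. The sketch below is framework-free on purpose. `convert_contact_form`, the field names, and the error messages are all hypothetical and are not part of Django.

```python
from urllib.parse import parse_qs

# A form-encoded request body, as a browser might submit it.
body = "name=Ada&age=36&subscribe=on"

# parse_qs returns every value as a list of strings; no types survive HTTP.
raw = {key: values[0] for key, values in parse_qs(body).items()}


def convert_contact_form(raw):
    """Hypothetical converter: turn string fields into typed values.

    Returns (cleaned, errors). Nothing here touches a database; the
    contact-form case needs conversion and validation without an ORM.
    """
    cleaned, errors = {}, {}

    name = raw.get("name", "").strip()
    if name:
        cleaned["name"] = name
    else:
        errors["name"] = "This field is required."

    try:
        age = int(raw.get("age", ""))
    except ValueError:
        errors["age"] = "Enter a whole number."
    else:
        if age < 0:
            errors["age"] = "Age cannot be negative."
        else:
            cleaned["age"] = age

    # HTML checkboxes send "on" when ticked and are absent otherwise.
    cleaned["subscribe"] = raw.get("subscribe") == "on"
    return cleaned, errors


cleaned, errors = convert_contact_form(raw)
print(raw)      # every value is still a str here
print(cleaned)  # typed values after conversion
```

In Django itself this job belongs to a form or serializer class rather than a hand-written function; the sketch only shows why conversion and validation travel together for web input.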
Re: Model-level validation
Uri - that's a great upgrade path (or should I say, non-upgrade path). I agree with `VALIDATE_MODELS_BY_DEFAULT`.

Rails also skips validations for some operations, like `update_column`, but those methods are prominently marked "use with caution", and the other ORMs I've used follow a similar pattern. For `bulk_create` there sounds like a legitimate reason not to validate everything, so excluding it seems reasonable as long as there's a prominent "use with caution" statement in the docs.

On Wednesday, October 5, 2022 at 8:35:36 PM UTC-7 Uri wrote:

> אורי
> u...@speedy.net
>
> On Thu, Oct 6, 2022 at 6:11 AM Aaron Smith wrote:
>
>> It sounds like there is little support for this being the default. But I'd like to propose something that might satisfy the different concerns:
>>
>> 1) A `validate` kwarg for `save()`, defaulted to `False`. This maintains backwards compatibility and also moves the validation behavior users coming to Django from other frameworks likely expect, in a more user friendly way than overriding save to call `full_clean()`.
>>
>> And/or...
>>
>> 2) An optional Django setting (`VALIDATE_MODELS_DEFAULT`?) to change the default behavior to `True`. The `validate` kwarg above would override this per call, allowing unvalidated saves when necessary.
>>
>> These changes would be simple, backwards compatible, and give individual projects the choice to make Django behave like other ORMs with regard to validation. This being the Django developers mailing list I should not be surprised that most people here support the status quo, but in my personal experience, having had this conversation with dozens of coworkers over the years - 100% of them expressed a strong desire for Django to do this differently.
>
> +1
>
> I would suggest having a setting "VALIDATE_MODELS_BY_DEFAULT", which is true or false (true by default), whether to call full_clean() on save(), with an option to call it with "validate=True" or "validate=False" to override this default. Maybe also allow changing the default for specific models.
>
> This is similar to forms that have `def save(self, commit=True):`, and you can call them with "commit=True" or "commit=False" to save or not save the results to the database. I also suggest that VALIDATE_MODELS_BY_DEFAULT will be true by default from some specific future version of Django, so that if users don't want it, they will have to manually set it to false.
>
> We should still remember that there are bulk actions such as bulk_create() or update(), that bypass save() completely, so we have to decide how to handle them if we want our data to be always validated.
>
> Uri Rodberg, Speedy Net.
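The proposal quoted above (a `validate` keyword on `save()` whose default comes from a project setting) could be sketched roughly as follows. This is a framework-free illustration, not Django code. `Record` and the local `ValidationError` are hypothetical stand-ins, and the setting name `VALIDATE_MODELS_BY_DEFAULT` is only the one suggested in the thread; no such setting or keyword argument exists in Django.

```python
# Stand-in for a project-level setting. The name follows the thread's
# suggestion and does not exist in Django.
VALIDATE_MODELS_BY_DEFAULT = False


class ValidationError(Exception):
    """Local stand-in for django.core.exceptions.ValidationError."""


class Record:
    """Hypothetical model-like class; not Django's Model."""

    def __init__(self, name, quantity):
        self.name = name
        self.quantity = quantity
        self.saved = False

    def full_clean(self):
        # Plays the role of Model.full_clean(): raise on invalid state.
        if not self.name:
            raise ValidationError("name must not be empty")
        if self.quantity < 0:
            raise ValidationError("quantity must be non-negative")

    def save(self, validate=None):
        # None means "use the project default"; an explicit True or
        # False overrides the setting for this call only.
        if validate is None:
            validate = VALIDATE_MODELS_BY_DEFAULT
        if validate:
            self.full_clean()
        self.saved = True  # a real ORM would write to the database here


# With the default left at False, an invalid record is saved unchecked.
unchecked = Record(name="", quantity=-1)
unchecked.save()

# Opting in per call makes the same data fail before it is "written".
checked = Record(name="", quantity=-1)
try:
    checked.save(validate=True)
except ValidationError as exc:
    print("rejected:", exc)
```

Using `None` as the default of `validate` keeps "caller said nothing" distinct from an explicit `False`, which is what allows a per-call override in either direction. As the quoted message notes, bulk operations such as `bulk_create()` or `update()` bypass `save()` entirely, so a flag like this would not cover them.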
Re: Model-level validation
James - The problem with moving validation up the stack, i.e. to logical branches from Model (Form, Serializer), is that you must duplicate validation logic if your data comes from multiple sources or domains (web forms *and* API endpoints *and* CSVs polled from S3). Duplication leads to divergence, which leads to horrible data integrity bugs, and no amount of test coverage can guarantee safety.

Even if you consider Django to be "only a web framework", I would still argue that validation should be centralized in the data storage layer. Validity is a core property of data. Serialization and conversion change between sources and are a different concern than validation.

On Thursday, October 6, 2022 at 12:47:19 AM UTC-7 James Bennett wrote:
> [...]
Re: Model-level validation
On Thu, Oct 6, 2022 at 9:00 AM Aaron Smith wrote:
> [...]

I would flip this around and point out that the duplication comes from seeing the existing data conversion/validation layer and deciding not to use it.

There's nothing that requires you to pass in an HttpRequest instance to use a form or a serializer -- you can throw a dict of data from any source into one and have it convert/validate for you. Those APIs are also designed to be easy to check and easy to return useful error messages from on failed validation, while a model's save() has no option other than to throw an exception at you and demand you parse the details out of it (because it was designed as part of an overall web framework that already had the validation layer elsewhere).

So I would argue, once again, that the solution to your problem is to use the existing data conversion/validation utilities (forms or serializers) regardless of the source of the data. If you refuse to, I don't think that's Django's problem to solve.
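To make the "dict of data from any source" argument concrete, here is a minimal framework-free sketch of one validation path fed from a web form, a JSON payload, and a CSV row. `SignupValidator` is hypothetical; it loosely mirrors the `is_valid()` / `errors` / `cleaned_data` shape of Django forms but is not Django code, and the field names and rules are invented for illustration.

```python
import csv
import io
import json


class SignupValidator:
    """Hypothetical form-like validator fed with a plain dict."""

    def __init__(self, data):
        self.data = data
        self.errors = {}
        self.cleaned_data = {}

    def is_valid(self):
        # Collect per-field errors instead of raising, so callers can
        # inspect and report them without parsing an exception.
        self.errors, self.cleaned_data = {}, {}

        email = str(self.data.get("email", "")).strip()
        if "@" in email:
            self.cleaned_data["email"] = email.lower()
        else:
            self.errors["email"] = "Enter a valid email address."

        try:
            seats = int(self.data.get("seats", ""))
        except (TypeError, ValueError):
            self.errors["seats"] = "Enter a whole number."
        else:
            if seats < 1:
                self.errors["seats"] = "At least one seat is required."
            else:
                self.cleaned_data["seats"] = seats

        return not self.errors


# Three sources, one validation path. No HTTP request is involved.
from_web = {"email": "Ada@Example.com", "seats": "2"}
from_api = json.loads('{"email": "bob@example.com", "seats": 0}')
from_csv = next(csv.DictReader(io.StringIO("email,seats\nnot-an-email,3\n")))

for source in (from_web, from_api, from_csv):
    validator = SignupValidator(source)
    if validator.is_valid():
        print("ok", validator.cleaned_data)
    else:
        print("rejected", validator.errors)
```

The design choice being illustrated is the return shape: a validator that accumulates field-keyed errors can be driven from any caller, whereas a `save()` that validates can only signal failure by raising.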
Re: Model-level validation
James - to clarify, the duplication I was referring to is having both Forms and Serializers do validation. I often work with web apps where data for the same model can arrive via user input, arrive via a serializer, or be created in some backend process, e.g. Celery. If forms/serializers are your validation layer, you need to duplicate it and worry about how to keep the copies from diverging over time, as there's no single source of truth. I also don't relish the thought of needing to use a Form or Serializer every time I alter a Model's data.

Perhaps we think about validation differently. I consider it critical for maintaining complex systems with any kind of confidence, any time data is being created or changed, regardless of where that change comes from. Bugs can happen anywhere and validation is the best (only?) option to prevent data-related bugs.

On Thursday, October 6, 2022 at 12:03:28 PM UTC-7 James Bennett wrote:
> [...]