On 08/20/2014 04:28 PM, Ivan Kharlamov wrote:
> On 08/20/2014 03:52 PM, Ivan Kharlamov wrote:
>> On 08/20/2014 12:46 PM, Marc Tamlyn wrote:
>>> I'd say ArrayField is a straight up data field at the moment. It stores
>>> 0-1 lists of data. It's no different to CommaSeparatedIntegerField
>>> (seriously, why does that exists...)
>>>
>>> *If* PG gets the relevant update that will allow `integer[] references`
>>> (i.e. ArrayField(ForeignKey)) then this would be different, and would be
>>> more like a m2m field.
>>>
>>> There is an argument that it's 0-N anyway, but in the implementation
>>> both within Django and in the database I don't think the distinction is
>>> useful at the point, from an ORM point of view in any case. For a forms
>>> point of view it's quite different.
>>>
>>> On 20 August 2014 09:19, Russell Keith-Magee <russ...@keith-magee.com>
>>> wrote:
>>>
>>> On Mon, Aug 18, 2014 at 6:03 PM, Anssi Kääriäinen
>>> <anssi.kaariai...@thl.fi> wrote:
>>>
>>> On Monday, August 18, 2014 7:45:17 AM UTC+3, Russell Keith-Magee
>>> wrote:
>>>
>>> I understand what you're driving at here, and I've had
>>> similar thoughts over the course of the SoC. The catch is
>>> that this makes the API for get_fields() fairly complicated.
>>>
>>> If every field fits into one specific type, then
>>> get_fields() just requires a single boolean flag (do I
>>> include fields of type X) for each field type. We can also
>>> easily add new field types by adding new booleans to the API.
>>>
>>> However, if a field fits into multiple categories, then it's
>>> impossible (or, at least, exceedingly complicated) to make a
>>> single call to get_fields() that will specify all your field
>>> requirements. "Get me all non-virtual data fields" requires
>>> "virtual=False, data=True, m2m=False", but "Get all virtual
>>> data fields that represent m2ms" requires "virtual=True,
>>> data=False, m2m=True".
>>> You can't pass in both sets of
>>> arguments at the same time, so you either have to make
>>> multiple calls to get_fields(), or you have to invent some
>>> sort of query syntax for get_fields() that allows union
>>> queries.
>>>
>>> Plus, at the end of the day, get_fields() is abstracted
>>> behind highly cached and optimised properties for key
>>> lookups. These properties are effectively a cached call to
>>> get_fields() with a specific set of arguments - so even if
>>> get_fields() doesn't expose a "one category per field"
>>> requirement, the API will require, at some level, names that
>>> have clear (and preferably non-overlapping) membership.
>>>
>>> If fields are in multiple categories then users will want to do
>>> the full range of set operation on the categories. Encoding that
>>> in to the API doesn't sound promising.
>>>
>>> I don't think users actually want to get fields based on
>>> the suggested categorization. I feel we get an easier to
>>> use and more flexible API if we have higher level
>>> categories and allow fields to match multiple
>>> categories. As a practical example if I want all
>>> relation fields, that is going to be hard using the
>>> suggested API. Getting all relation fields is a more
>>> realistic use case than getting related virtual objects.
>>>
>>> Quite probably true. As a point of interest, the current (as
>>> in, 1.6) API actually doesn't differentiate between category
>>> (a) "pure data" and category (b) "relating data (i.e., FK)"
>>> fields - if you ask for "data fields" you get pure data
>>> *and* foreign keys. So, at least as far as Django's own
>>> usage is concerned, you're correct in saying that taxonomy
>>> I've described isn't fully required.
>>>
>>> Daniel's survey of internal usage reveals that there are
>>> three use cases for getting a list of fields in Django's
>>> internal API:
>>>
>>> * Get all data and m2m fields (i.e., categories a, b, and
>>> d).
>>> This is effectively "all fields on *this* model"
>>>
>>> * Get all data, m2m, related objects, related m2m, and
>>> virtual fields (i.e., categories a, b, d, f, g, h, i -
>>> excluding c and e because Django doesn't currently have any
>>> fields of this type). This is "all fields on this model, or
>>> related to this model"
>>>
>>> * Get all m2m fields (i.e., category d)
>>>
>>> So - at the very least, we need names to describe those
>>> three groups. My intention with describing a richer taxonomy
>>> is to try and give names to other groupings of interest.
>>>
>>> If we want to have all fields to match single and only
>>> single category, then we need to redefine the categories
>>> to make sure ForeignKeys as virtual fields are possible,
>>> and that more esoteric custom join based fields fit in
>>> to the categorization.
>>>
>>> Agreed - that's why I threw this out there for discussion :-)
>>>
>>> Properties like "data", "virtual", "external", "related",
>>> "relating" - these are high level concepts describing the
>>> way a field manifests. However, that doesn't mean we need to
>>> expose these properties as part of the formal API.
>>>
>>> Part of the underlying problem here -- lets say we roll out
>>> Django 1.7 with some version of this API, and in 1.8,
>>> foreign key fields change to become virtual. That
>>> effectively becomes backwards incompatible for queries that
>>> are sensitive to a "virtual" flag; but it doesn't change the
>>> underlying need to identify that a field is a foreign key.
>>> We need to capture the latter use case, but not necessarily
>>> the former.
>>>
>>> Could we go with a minimal API for get_fields()? Instead of
>>> having categorization on the get_fields() API, we could provide
>>> field flags for the categories. With field flags it is
>>> straightforward to filter the return list of get_fields().
>>> As an example, fetching those fields which are relations but which
>>> aren't virtual: [f for f in get_fields() if f.relational and not
>>> f.virtual]. If this path is taken, then I am not sure how
>>> minimal the get_fields() API should be. We likely need flags for
>>> at least if the field is defined on local, parent or some remote
>>> model.
>>>
>>> As for changing ForeignKey to virtual field plus concrete field
>>> representation - I just realized this will be backwards
>>> incompatible no matter what we do regarding categorization. An
>>> all-fields including get_fields() call will return separate
>>> author (virtual) and author_id (concrete) fields after the
>>> split. I am not sure what we can do about this. It would be very
>>> unfortunate if we can't refactor the way ForeignKeys work due to
>>> the meta API. Any ideas how we can avoid the backwards
>>> compatibility trap?
>>>
>>> I think Daniel and I might have come up with a way to meet both
>>> these requirements - a minimalist API for get_fields, with at least
>>> some protection against the known incoming backwards compatibility
>>> issue.
>>>
>>> The summary so far: it appears that a complex taxonomy isn't
>>> especially helpful - firstly, because any complex taxonomy is going
>>> to have edge cases that are hard to categorize, but also because a
>>> complex taxonomy leads to a much more complex internal API that is
>>> going to be prone to backwards compatibility problems.
>>>
>>> So - instead of worrying about 'virtual' and other properties like
>>> that, lets look at why the _meta API is fundamentally used - to get
>>> a list of fields that need to be handled in data processing. This
>>> primarily means forms, but other forms of serialisation are also
>>> included.
>>> In these use cases, there are always going to be per-field
>>> differences (even a CharField and an IntegerField require *slightly*
>>> different handling), so we won't focus on internal representations,
>>> storage mechanisms, or anything like that. Instead, lets focus on
>>> cardinality - a field represents some sort of data that has a
>>> cardinality with the object on which it is stored. If something has
>>> cardinality 1, you can display a single field. If it's cardinality
>>> N, you need to display a list, or some sort of inline.
>>>
>>> This results in 3 categories that are mutually exclusive:
>>>
>>> a) "Data fields": Fields of cardinality 0-1:
>>>
>>> * A CharField stores 0 or 1 strings (0 is the case of a nullable
>>> field).
>>>
>>> * An IntegerField stores 0 or 1 integers.
>>>
>>> * A FileField stores 0 or 1 file paths.
>>>
>>> * An ImageField stores 0 or 1 file paths - although in being
>>> modified, it might modify some other fields.
>>>
>>> * A ForeignKey stores 0 or 1 references to another object.
>>>
>>> * A GenericForeignKey stores 0 or 1 references to another object.
>>>
>>> * A notional "DocumentField" on a NoSQL store references 0 or 1
>>> external documents.
>>>
>>> b) "ManyToMany Fields": Fields that are locally defined that
>>> represent a cardinality 0-N relationship with another object:
>>>
>>> * Many to Many fields store 0-N references to a second model.
>>>
>>> c) "Related Objects": Fields that represent a cardinality 0-N
>>> relationship with this object, but aren't locally defined:
>>>
>>> * The 'related' side of a ForeignKey
>>>
>>> * The 'related' side of a ManyToMany
>>>
>>> * A GenericRelation representing the reverse side of a
>>> GenericForeignKey
>>>
>>> These three types are mutually exclusive - you either have
>>> cardinality 1 *or* cardinality N, not both; and you're either
>>> locally defined on this object or you're not.
>>> I can't think of an
>>> example of "cardinality 1 data that isn't defined on this object",
>>> but it would fit into this taxonomy if it were needed; I also can't
>>> think of a field definition that would span models.
>>>
>>> In addition to this basic classification, a field can be marked as
>>> "hidden". The immediate use for this is to hide the related_name='+'
>>> case of a FK or M2M. Looking forward, it would be used to mask
>>> fields that exist, but aren't intended to be user visible - for
>>> example, in the potential future case where a ForeignKey is split in
>>> two, or a Composite Key, there would be a "hidden" integer field (or
>>> fields) storing the actual data, and a virtual (but non-hidden)
>>> field that is the public API for manipulating the relationship. This
>>> would also be backwards compatible, because the "visible" field list
>>> hasn't changed.
>>>
>>> Fields are also tracked according to their parentage; this is used
>>> by tools interacting with inheritance relationships to know which
>>> fields are actually on this model, and which are inherited from a
>>> base class.
>>>
>>> This yields the following formal API for _meta:
>>>
>>> * get_fields(data, many_to_many, related, include_hidden,
>>> include_parents)
>>>
>>> * @property data_fields (=> get_fields(data=True,
>>> many_to_many=False, related=False, include_hidden=False,
>>> include_parents=True)
>>>
>>> * @property many_to_many_fields (=> get_fields(data=False,
>>> many_to_many=True, related=False, include_hidden=False,
>>> include_parents=True)
>>>
>>> * @property related_objects (=> get_fields(data=False,
>>> many_to_many=False, related=True, include_hidden=False,
>>> include_parents=True)
>>>
>>> Does this sound any more sane as an API?
>>>
>>> My one lingering question is whether the "many_to_many"
>>> name/category is too explicit.
>>> I can conceive how an ArrayField
>>> could be considered a data field (it stores 0-1 arrays of data), or
>>> a "many_to_many" field (because it stores 0-N instances of some
>>> data). This all hinges on whether the definition for that field
>>> category is that it is a relationship with another *model*, or if
>>> it's just cardinality N data. It's trivial to call it a Data field
>>> and just leave it at that, but I'm wondering if there might be
>>> benefit in broadening the definition of "many_to_many".
>>>
>>> Russ %-)
>>
>> When I look at this situation from the point of view of forms, there are
>>
>> 1. Fields of cardinality 0-1
>> 2. Fields of cardinality 0-N
>>
>> and
>>
>> a. Fields that do not represent reference to another model (object)
>> b. Fields that represent reference to another model (object)
>>
>> 1. and 2. are mutually exclusive; a. and b. are also mutually exclusive.
>>
>> IMO, this way the future Django form would not need to care whether the
>> field is m2m or ArrayField(ForeignKey)) or ListField(EmbeddedModelField)
>> because all of them would be 2.&b.
>>
>> One may also want to add two mutually-exclusive subcategories to b:
>>
>> b1. Relationship is locally defined
>> b2. Relationship is not locally defined.
>
> To add more examples to my proposition:
>
> 1) CharField(), IntegerField(), FileField(), ImageField()
>
> are all members of both: a. and 1.
>
> 2) ArrayField(), DictionaryField()
>
> are all members of both: a. and 2.
>
> 3) ForeignKey(), GenericForeignKey(), EmbeddedModelField(),
> GenericRelation(),
>
> are all members of both: b. and 1.
>
> 4) ManyToManyField(), ArrayField(ForeignKey), ListField(EmbeddedModelField)
>
> are all members of both: b. and 2.
>
> As Collin Anderson wrote about "virtual" fields on 08/18/2014 07:12 PM:
>
>> Also, I think we should avoid discriminating between "virtual" and
>> non-virtual (as with local vs parent). Why should it matter how a field
>> is stored in the database?
>> I think the distinction will make it harder
>> to use non-relational databases.
>
> One may want to expand his statement and say that the form, ideally,
> should not care whether the field relationship is locally defined or not.
>
> Which is not to say that b1 and b2 subcategories are not useful at all,
> but they should not be needed in form representations.
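
For illustration only, the two axes from the quoted proposal (1./2. and a./b.) could be sketched in plain Python roughly as follows. This is a minimal sketch under my own assumptions: FieldTraits, EXAMPLES and form_widget_kind are hypothetical names, nothing here is part of Django's _meta API, and the widget labels are placeholders. Only a few of the example fields from the quoted lists are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldTraits:
    """Hypothetical descriptor for the two axes; not a Django class."""
    multi_valued: bool      # False: holds 0-1 values (1.); True: holds 0-N values (2.)
    references_model: bool  # False: plain data (a.); True: refers to another model (b.)


# A few of the example fields from the quoted message, keyed by class name.
EXAMPLES = {
    "CharField": FieldTraits(multi_valued=False, references_model=False),
    "IntegerField": FieldTraits(multi_valued=False, references_model=False),
    "ArrayField": FieldTraits(multi_valued=True, references_model=False),
    "ForeignKey": FieldTraits(multi_valued=False, references_model=True),
    "GenericForeignKey": FieldTraits(multi_valued=False, references_model=True),
    "ManyToManyField": FieldTraits(multi_valued=True, references_model=True),
}


def form_widget_kind(traits: FieldTraits) -> str:
    """Choose a (placeholder) widget family using only the two flags."""
    if traits.references_model:
        return "multiple-choice" if traits.multi_valued else "single-choice"
    return "list-input" if traits.multi_valued else "single-input"


if __name__ == "__main__":
    for name, traits in EXAMPLES.items():
        print(f"{name}: {form_widget_kind(traits)}")
```

The point of the sketch is that a form would branch on the two flags alone, without needing to know whether a field is an m2m or some other 0-N reference type, or whether the relationship is locally defined.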

Excuse me for posting multiple emails at a time, but I'd like to make a
correction:

It just occurred to me that I misused the term 'cardinality'. The best way
to correct myself is to replace this:

1. Fields of cardinality 0-1
2. Fields of cardinality 0-N

with this:

1. Fields that can have 0-1 values.
2. Fields that can have 0-N values.

Thanks for the brilliant work and best regards,
Ivan

>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Django developers" group.
>>> To unsubscribe from this group and stop receiving emails from it,
>>> send an email to django-developers+unsubscr...@googlegroups.com.
>>> To post to this group, send email to
>>> django-developers@googlegroups.com.
>>> Visit this group at http://groups.google.com/group/django-developers.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/django-developers/CAJxq84_OcibE72RKB9T60BJW9AtY8_YYhmhM5dXH36TtW3KsYw%40mail.gmail.com
>>> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-developers/53F497B1.1010303%40gmail.com.
For more options, visit https://groups.google.com/d/optout.