Hi, this is my proposal for the customizable serialization idea.
There are two formats: a formatted Google Docs version that's easy on the eyes ( https://docs.google.com/a/vivekn.co.cc/document/pub?id=1GMWW42sY8cLZ2XRtVEDA9BQzmsqnCNULzskDMwqSUXI ) and a plain text version that follows.

-------------------------------------------------------------------------------------------------------------------

GSoC Proposal: Customizable Serialization for Django

=======
Synopsis
=======

Django provides a serialization framework that is very useful for loading and saving fixtures, but it is not very flexible if one wants to provide an API for a Django application or use a serialization format different from those Django defines. Also, the current handling of foreign keys and many-to-many relationships is not really useful outside the context of fixtures.

I propose to solve this problem with a class-based serialization framework that would allow the user to customize the serialization output to a much greater degree and create new output formats on the fly. The main features of this framework would be:

1. Users can specify the serialization model as a class with configurable field options and methods, similar to Django’s Models API.
2. Users can specify new output formats and get a greater level of control over namespaces, tags and key-value mappings in XML, YAML and JSON.
3. Metadata and unicode conversion methods can be added to model fields through methods defined on the serializer class.
4. Better handling of foreign keys and many-to-many fields, with a custom level of nesting.
5. A permission system to provide varying levels of data access.
6. Backward compatibility to ensure the smooth processing of database fixtures.
=================
Implementation Details
=================

---------------------------------------
Modes and Configurations
---------------------------------------

I would like to provide building-block configurations for XML, YAML and JSON which the user can customize. These would be based more or less on the existing skeletal structures in core.serializers and core.serializers.base. There will also be a new serializer configuration called TextSerializer that can represent any arbitrary format. I will provide a ‘fixture’ mode to ensure backward compatibility and the seamless working of the ``loaddata`` and ``dumpdata`` commands.

---------------------------------------
Adding metadata to a field
---------------------------------------

The user can define methods beginning with “meta_” to add metadata about each field. Methods starting with “meta2_” can be used to add metadata at the model level. Here is an example:

    class ExampleSerializer(serializers.Serializer):
        ...
        def meta_foo(self, field):
            '''
            Extract some metadata from field and return it.
            It would be displayed with the attribute ``foo``
            '''

As an intermediate step, all data will be stored in a dict of string to object/dict mappings and converted to the desired format only at the output stage.

In JSON, the metadata would be represented inside an object:

    "key": {"foo": "bar", "value": value}

instead of

    "key": value

In XML, two options would be provided through a field option in the class: representing the metadata as individual tags or as tag attributes.

    class Serializer(XMLSerializer):
        metadata_display_mode = TAGS  # or ATTRIBUTES

The output would be like:

    <field>
        <metadata1>..</metadata1>
        ...
        <Value>Value</Value>
    </field>

OR

    <field name="" md1="" ... > Value </field>

To select which fields get which metadata, the arguments should be passed to the ``serialize()`` method as:

    data = ExampleSerializer.serialize(queryset, fields=('field1', ('field2', ['foo'])))

Each field can be specified in two ways:

1. As a string: no metadata will be added.
2. As a 2-element tuple, with the first element a string representing the field name and the second a list of strings representing the metadata attributes to be applied to that field.

Instead of manually specifying the attributes for each field, the user can apply all metadata functions to all fields by setting the ``use_all_metadata`` parameter of ``serialize()`` to True.

The existing implementation of ``model.name`` and ``model.pk`` can be described using “meta2_” functions. These will be provided as ``meta2_name`` and ``meta2_pk`` to facilitate loading and dumping of fixtures.

---------------------------------------------------
Datatypes and Unicode conversion
---------------------------------------------------

The user can specify the protected types (the types that will be passed “as is” without any conversion) as a field variable. The unicode conversion function for each type can be specified as a method named “unicode_xxx”, where 'xxx' is the type name. If no method is provided for a type, a default conversion function will be used.

    class Example(Serializer):
        ...
        protected_types = (int, str, type(None), bool)
        ...
        def unicode_tuple(self, obj):
            # Do something with the object
            pass

-------------------------------------------------
Output formatting and conversion
-------------------------------------------------

The user can specify the format of the output, the grouping of fields, tags, namespaces, indentation and much more. Here are some examples:

1. For text-based serializers, a custom template would be provided:

    class Foobar(TextSerializer):
        # Simple string template; meta_xxx would be replaced by meta_xxx(field),
        # as mentioned above.
        field_format = "%(key)s :: { %(value)f, %(meta_d1)s, %(meta_d2)s }"

        # The three parameters below are required for text mode.
        field_separator = ";"
        wrap_begin = "[["  # For external wrapping structure
        wrap_end = "]]"

        indent = 4  # Indent by 4 spaces at each level. Default is 0.

2. For markup-based serializers, users can provide strings for the tag names of fields, field values and models.

    class XMLFoo(XMLSerializer):
        mode = "xml"
        indent = 2
        metadata_display_mode = TAGS
        field_tag_name = "object"  # Now all fields will be rendered as <object>...</object>
        model_tag_name = "model"
        # If metadata_display_mode is set to ``TAGS``, this sets the tag name
        # of the value of the model field.
        value_tag_name = "value"

3. A class field ``wrap_fields`` will be provided to wrap all fields of a model into a group, as is done now. If ``wrap_fields`` is set to “all_fields”, for example, then all the fields would be serialized inside an object called “all_fields”. If ``wrap_fields`` is not set, there will be no grouping.

---------------------------------------
Related models and nesting
---------------------------------------

I will replace the current “start_object -> handle_object -> end_object” sequence with a single method for handling a model, so that related models can be handled easily using recursion. A ``nesting_depth`` option would be provided to the user as a field variable. The default value would be 0, which matches the current behaviour.

Serializing only specific fields of related models can be done through the ``fields`` argument in the call to serialize. A field of a related model would be written as “Model_name.field_name” instead of just “field_name”.

Instead of the list ``_current``, I would use a separate list for each level of nesting.

---------------------------------------------------------
New features in the serialize() function
---------------------------------------------------------

Apart from the changes I’ve proposed for the ``fields`` argument of serialize, I would like to add a couple of features:

• An ``exclude`` argument: a list of fields to exclude from the model. It would also contain the fields to exclude in related models.
• An ``extras`` argument, which would allow properties and data returned by some methods to be serialized.
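As a rough illustration of how ``nesting_depth``, ``exclude`` and ``extras`` could interact, here is a minimal sketch in plain Python. It does not use Django: ``Record``, ``Book`` and ``serialize_record`` are hypothetical names made up for this example, relations are plain object references, and many-to-many fields are left out. It only shows the recursion and the “Model_name.field_name” convention for related fields, not the proposed implementation.

```python
class Record:
    """Hypothetical stand-in for a model instance (not a Django class)."""

    def __init__(self, model_name, pk, **fields):
        self.model_name = model_name
        self.pk = pk
        self.fields = fields


def serialize_record(record, nesting_depth=0, exclude=(), extras=(), _related=False):
    """Return a dict for `record`, expanding related records up to `nesting_depth`."""
    output = {}
    for name, value in record.fields.items():
        # Fields of related records are matched as "Model_name.field_name";
        # fields of the top-level record are matched by their bare name.
        key = "%s.%s" % (record.model_name, name) if _related else name
        if key in exclude:
            continue
        if isinstance(value, Record):
            if nesting_depth > 0:
                # Recurse one level deeper into the related record.
                value = serialize_record(value, nesting_depth - 1, exclude, _related=True)
            else:
                # Depth exhausted: fall back to the primary key.
                value = value.pk
        output[name] = value
    # Extras are properties or methods of the top-level record only.
    for name in extras:
        attribute = getattr(record, name)
        output[name] = attribute() if callable(attribute) else attribute
    return output


class Book(Record):
    def title_length(self):
        return len(self.fields["title"])


author = Record("Author", 7, name="Ann", email="ann@example.com")
book = Book("Book", 1, title="Dune", author=author)

flat = serialize_record(book)
nested = serialize_record(
    book, nesting_depth=1, exclude=["Author.email"], extras=["title_length"]
)
print(flat)
print(nested)
```

With ``nesting_depth`` left at 0 the related author collapses to its primary key; with a depth of 1 it is expanded, minus the excluded ``Author.email`` field, and the ``title_length`` method is added as an extra.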
-----------------------------------
Permission Framework
-----------------------------------

When creating an API, there may be a need to give different people varying levels of access to data. For this I propose a permission framework, where the user can choose to restrict data to certain groups while defining a model. A different name should probably be used, so that it is not confused with the “Permission” model used in contrib.auth and contrib.admin. Here’s an example:

    class User(models.Model):
        name = models.CharField(max_length=128)  # No restrictions
        picture_url = models.URLField(restrict_to=('friends', 'self', 'admins'))
        security_question = models.CharField(max_length=200, restrict_to=('self', 'admins'))
        security_answer = models.CharField(max_length=200, restrict_to=('admins',))

Here different permission groups like ‘self’, ‘friends’ and ‘admins’ are created through a field option. To use this, specify the permission_level in the call to serialize:

    data = serializers.serialize(queryset, permission_level='friends')

If no permission_level is given, only unrestricted fields will be serialized.

-----------------------------------------------------------------
Representing the existing serialization model
-----------------------------------------------------------------

Here is an implementation of the existing serialization format in JSON. This would be the ‘fixture’ mode that I’ve mentioned above.

    class JsonFixture(JSONSerializer):
        wrap_fields = "fields"
        nesting_depth = 0

        def meta2_pk(self, model):
            ...

        def meta2_model(self, model):
            ...

In XML:

    class XMLFixture(XMLSerializer):
        wrap_fields = "fields"
        nesting_depth = 0
        metadata_display_mode = ATTRIBUTES
        indent = 4
        field_tag_name = "field"
        model_tag_name = "object"

        def meta2_pk(self, model):
            ...

        def meta2_model(self, model):
            ...

        def meta_type(self, field):
            ...
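To make the fixture mode a little more concrete, here is a small sketch of how “meta2_” methods and ``wrap_fields`` could combine to reproduce the current JSON fixture layout (``model``, ``pk`` and a ``fields`` object). It is plain Python without Django; ``Row``, ``SketchSerializer`` and ``serialize_object`` are hypothetical names for this illustration, and the method discovery via ``dir()`` and ``getattr`` is just one possible approach.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for a model instance: app label, model name, pk, field data.
Row = namedtuple("Row", "app_label model_name pk data")


class SketchSerializer:
    """Hypothetical base class, not the proposed API."""

    wrap_fields = None  # Name of the wrapping object, or None for no grouping.

    def serialize_object(self, row):
        entry = {}
        # Every method named meta2_<key> adds a model-level entry under <key>.
        for attr in dir(self):
            if attr.startswith("meta2_"):
                entry[attr[len("meta2_"):]] = getattr(self, attr)(row)
        if self.wrap_fields:
            entry[self.wrap_fields] = dict(row.data)
        else:
            entry.update(row.data)
        return entry

    def serialize(self, rows):
        return json.dumps([self.serialize_object(row) for row in rows], sort_keys=True)


class JsonFixture(SketchSerializer):
    wrap_fields = "fields"

    def meta2_pk(self, row):
        return row.pk

    def meta2_model(self, row):
        return "%s.%s" % (row.app_label, row.model_name)


row = Row("library", "book", 1, {"title": "Dune"})
output = JsonFixture().serialize([row])
print(output)
```

Because the model-level keys come from method names, a serializer without any “meta2_” methods and without ``wrap_fields`` would emit just the bare field mapping.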
===================
Deliverables and Timeline
===================

I would be working for about 40-45 hours each week, and I would be writing tests, exceptions and error messages along with development. This would more or less be my timeline:

Till May 23

I will familiarize myself with community best practices and the version control systems used by Django, read the code of all the modules related to the serialization framework, look at implementations of other serialization libraries in different languages and go through all the model and regression tests related to serialization.

Weeks 1 to 2

I will use this time to set up the basic foundations of the project by:
1. Writing the skeletal structure of the serializer, based on the current implementations in core.serializers.base and core.serializers.python.
2. Setting up basic configurations for JSON, YAML, XML and Text, and creating the fixture mode.
3. Making changes to loaddata and dumpdata in core.management.commands to ensure backward compatibility.
4. Using a dict as temporary storage before the final ‘dumping’ stage.
5. Modifying the deserializers to handle custom formats of serialization and specifying the requirements for deserialization.

Week 3

1. Implement the metadata methods at field and model level using getattr and similar methods.
2. Make changes to the fields argument of ``serialize()``.
3. Represent the metadata in the JSON/YAML, XML and other output formats.

Week 4

1. Implement the unicode and datatype conversion methods in a way similar to the metadata methods.
2. Give the user the choice of ‘protected’ types for the serialization.

Week 5

1. Provide all the configurable options for output formatting as discussed above.
2. Add support for string templates and their parsing.
3. Parse the dict used for temporary storage to generate XML and custom text outputs.

Week 6

1. Implement serialization of related models like foreign keys and many-to-many fields using recursion.
2. Integrate with the ``fields`` argument of ``serialize`` and specify the format for representing a related model.
3. Implement the nesting depth limit feature.

Week 7

1. Implement the ``exclude`` feature in serialize(), which will allow the user to choose fields to exclude while serializing.
2. Add an ``extras`` argument to serialize(), allowing the user to specify additional properties of a model which are not field variables but are derived from them and defined as methods or properties on the model by the user.

Week 8

1. Implement the permissions framework, which would give different users varying levels of access to data based on their permission level.
2. Integrate with the Models API and field options.
3. Add the permission_level argument to serialize().

Weeks 9 - 10

1. Write documentation for the project and provide many examples.
2. Write a few tutorials on how to use the framework.
3. Write some project-level tests, do extensive final testing and refine the project.

=====
About
=====

I am Vivek Narayanan, a second-year undergraduate student at the Institute of Technology, Varanasi, majoring in Electronics Engineering. I’m really passionate about computers and have been programming for the past 5-6 years. I have some experience in Python, C/C++, Java, JavaScript, HTML, PHP, Haskell and ActionScript. Python is my favorite language and I’ve been using it for a couple of years. While working on a web application, I stumbled upon Django a few months back, and I really love its elegant approach to everything. I have submitted patches to tickets #15299 [1], #12489 [2] and #8809 [3] on the Django Trac.

Some of the projects I’ve worked on are:

1. blitzx86: An assembler for the Intel x86 architecture using lex and yacc. [4]
2. An assembler for the dlx architecture supporting pipeline optimizations. [5]
3. mapTheGraph! - A location-based social networking application on web and Android platforms.
4. A social networking game based on forex trading, which is under development.
5. pyzok - A Python-based LAN chat server. [6]
6. aeroFox - A .NET-based open source browser for Windows with transparent windows that managed over 100,000 downloads. [7]

I am a fast learner and can grasp new technologies and languages in a short period of time. Among other things, I enjoy playing tennis and reading books on a wide variety of topics.

====
Links
====

[1] http://code.djangoproject.com/ticket/15299
[2] http://code.djangoproject.com/ticket/12489
[3] http://code.djangoproject.com/ticket/8809
[4] https://github.com/vivekn/blitz8086
[5] https://github.com/vivekn/dlx-compiler
[6] https://github.com/vivekn/pyzok
[7] http://sourceforge.net/projects/aerofox/files/aerofox/0.4.8.7/