Hi,

This is my proposal for the customizable serialization idea:

It is available in two formats: a formatted Google Docs version that's
easy on the eyes (
https://docs.google.com/a/vivekn.co.cc/document/pub?id=1GMWW42sY8cLZ2XRtVEDA9BQzmsqnCNULzskDMwqSUXI
) and a plain-text version that follows.

-------------------------------------------------------------------------------------------------------------------
GSoC Proposal: Customizable Serialization for Django

=======
Synopsis
=======
Django provides a serialization framework that is very useful for
loading and saving fixtures, but it is not very flexible if one wants
to provide an API for a Django application or use a serialization
format different from those defined by Django. Also, the current
handling of foreign keys and many-to-many relationships is not really
useful outside the context of fixtures.

I propose to solve this problem with a class-based serialization
framework that would allow the user to customize the serialization
output to a much greater degree and create new output formats on the
fly. The main features of this framework would be:

   1. Specifying the serialization model as a class with configurable
field options and methods, similar to Django’s Models API.
   2. Specifying new output formats, with a greater level of control
over namespaces, tags and key-value mappings in XML, YAML and JSON.
   3. Adding metadata and unicode conversion methods to model fields
through class methods.
   4. Better handling of foreign keys and many-to-many fields, with a
custom level of nesting.
   5. A permission system to provide varying levels of data access.
   6. Backward compatibility to ensure the smooth processing of
database fixtures.

=================
Implementation Details
=================
---------------------------------------
Modes and Configurations
---------------------------------------
I would like to provide building-block configurations for XML, YAML
and JSON which the user can customize. These would be based more or
less on the existing skeletal structures in django.core.serializers
and django.core.serializers.base. There will also be a new Serializer
configuration called TextSerializer that can represent any arbitrary
format. I will provide a ‘fixture’ mode to ensure backward
compatibility and the seamless working of the ``loaddata`` and
``dumpdata`` commands.

---------------------------------------
Adding metadata to a field
---------------------------------------

The user can define methods beginning with “meta_” to add metadata
about each field, and methods beginning with “meta2_” to add metadata
at the model level. Here is an example:

class ExampleSerializer(serializers.Serializer):
        ...

        def meta_foo(self, field):
            '''
            Extract some metadata from field and return it.
            It would be displayed with the attribute ``foo``.
            '''

As an intermediate step, all data will be stored in a dict that maps
strings to objects or nested dicts. This dict would be converted to
the desired format at the output stage.
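
As a rough illustration, here is a minimal sketch of how the ``meta_`` lookup might populate that intermediate dict. It is plain Python with no Django dependency, and ``Field``, ``SketchSerializer``, ``build_entry`` and ``to_dict`` are hypothetical names chosen for this sketch, not part of the proposed API:

```python
from collections import namedtuple

# Hypothetical stand-in for a model field: just a name and a value.
Field = namedtuple("Field", ["name", "value"])


class SketchSerializer(object):
    """Hypothetical base class; not the proposed implementation."""

    def build_entry(self, field, metadata=()):
        # Without metadata the field keeps its plain value.
        if not metadata:
            return field.value
        entry = {}
        for attr in metadata:
            # Look up the user-defined meta_<attr> method by name.
            entry[attr] = getattr(self, "meta_" + attr)(field)
        entry["value"] = field.value
        return entry

    def to_dict(self, fields, metadata_for=None):
        # metadata_for maps a field name to the metadata names to apply.
        metadata_for = metadata_for or {}
        return dict(
            (f.name, self.build_entry(f, metadata_for.get(f.name, ())))
            for f in fields
        )


class ExampleSerializer(SketchSerializer):
    def meta_foo(self, field):
        # Illustrative metadata: the Python type name of the value.
        return type(field.value).__name__


fields = [Field("title", "Hello"), Field("pages", 12)]
intermediate = ExampleSerializer().to_dict(fields, {"pages": ["foo"]})
print(intermediate)
```

Fields without metadata stay as plain values, while fields with metadata become nested dicts carrying a ``value`` key.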

In JSON the metadata would be represented inside an object:

        "key": {"foo": "bar", "value": value}

instead of

        "key": value

In XML, two options would be provided, to represent the metadata as
individual tags or with tag attributes, through a field option in the
class.

class Serializer(XMLSerializer):
        metadata_display_mode = TAGS # or ATTRIBUTES

The output would be like:

<field>
   <metadata1>..</metadata1>
   ...
   <Value>Value</Value>
</field>

OR

<field name="" md1 = "" ... > Value </field>
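
The two display modes could be sketched with the standard library's ``xml.etree.ElementTree``. This is only an illustration: ``render_field`` and the string constants are hypothetical, and keeping the ``name`` attribute in both modes is an assumption of this sketch:

```python
import xml.etree.ElementTree as ET

# Hypothetical constants standing in for the proposed TAGS / ATTRIBUTES.
TAGS = "tags"
ATTRIBUTES = "attributes"


def render_field(name, value, metadata, mode):
    """Render one field as an XML string in the chosen metadata mode."""
    field = ET.Element("field", {"name": name})
    if mode == ATTRIBUTES:
        # Metadata becomes attributes; the value is the element text.
        for key, meta_value in sorted(metadata.items()):
            field.set(key, meta_value)
        field.text = value
    else:
        # Metadata becomes child tags, followed by a separate value tag.
        for key, meta_value in sorted(metadata.items()):
            ET.SubElement(field, key).text = meta_value
        ET.SubElement(field, "value").text = value
    return ET.tostring(field, encoding="unicode")


as_tags = render_field("title", "Hello", {"foo": "bar"}, TAGS)
as_attributes = render_field("title", "Hello", {"foo": "bar"}, ATTRIBUTES)
print(as_tags)
print(as_attributes)
```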

To select which fields would have which metadata, the arguments should
be passed in the ``serialize()`` method as:

        data = ExampleSerializer.serialize(
            queryset, fields=('field1', ('field2', ['foo'])))

Each field can be specified in two ways:

1. As a string: no metadata will be added.

2. As a 2-element tuple, with the first element a string representing
field name and the second a list of strings representing the metadata
attributes to be applied on that field.

Instead of manually specifying the attributes for each field, the user
can add all metadata functions for all the fields using the
``use_all_metadata`` parameter in ``serialize()`` and setting it to
True.
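
A minimal sketch of how the ``fields`` argument might be normalized before serialization. The ``normalize_fields`` helper and its ``all_metadata`` parameter are hypothetical and only illustrate the two accepted forms and the ``use_all_metadata`` switch:

```python
def normalize_fields(fields, all_metadata=(), use_all_metadata=False):
    """Return a list of (field_name, metadata_names) pairs."""
    normalized = []
    for spec in fields:
        if isinstance(spec, str):
            # Plain string: the field carries no metadata.
            name, metadata = spec, []
        else:
            # 2-element tuple: (field name, list of metadata names).
            name, metadata = spec
        if use_all_metadata:
            # Override per-field choices with every known metadata name.
            metadata = all_metadata
        normalized.append((name, list(metadata)))
    return normalized


fields = ("field1", ("field2", ["foo"]))
print(normalize_fields(fields))
print(normalize_fields(fields, all_metadata=["foo", "bar"],
                       use_all_metadata=True))
```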

The existing implementation of ``model.name`` and ``model.pk`` can be
described using “meta2_” functions. These will be provided as
``meta2_name`` and ``meta2_pk`` to facilitate loading and dumping of
fixtures.

---------------------------------------------------
Datatypes and Unicode conversion
---------------------------------------------------

The user can specify the protected types (the types that will be
passed “as is” without any conversion) as a field variable.

The unicode conversion functions for each type can be specified as
methods - “unicode_xxx”, where 'xxx' represents the type name. If no
method is provided for a type, a default conversion function will be
used.

class Example(Serializer):
        ...
        # type(None) is used because NoneType is not a builtin name
        protected_types = (int, str, type(None), bool)
        ...

        def unicode_tuple(self, obj):
            # Do something with the object
            pass
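
The dispatch described above might look roughly like the following sketch. ``SketchSerializer``, ``convert`` and ``default_conversion`` are hypothetical names, and the sketch uses Python 3, where ``str`` plays the role of unicode:

```python
class SketchSerializer(object):
    """Hypothetical base class illustrating the unicode_xxx dispatch."""

    # Types passed through unchanged (an assumed default set).
    protected_types = (int, str, type(None), bool)

    def convert(self, value):
        if isinstance(value, self.protected_types):
            return value
        # Look for unicode_<typename>; fall back to the default conversion.
        method = getattr(self, "unicode_" + type(value).__name__, None)
        if method is None:
            return self.default_conversion(value)
        return method(value)

    def default_conversion(self, value):
        return str(value)


class Example(SketchSerializer):
    def unicode_tuple(self, obj):
        # Illustrative conversion: join the items with commas.
        return ", ".join(str(item) for item in obj)


example = Example()
print(example.convert(42))
print(example.convert((1, 2, 3)))
print(example.convert(1.5))
```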

-------------------------------------------------
Output formatting and conversion
-------------------------------------------------
The user can specify the format of the output, the grouping of
fields, tags, namespaces, indentation and much more. Here are some
examples:

1. For text-based serializers, a custom template would be provided:

class Foobar(TextSerializer):
        # Simple string template; meta_xxx would be replaced by
        # meta_xxx(field), as I’ve mentioned above.
        field_format = "%(key)s :: { %(value)f, %(meta_d1)s, %(meta_d2)s }"

        # The three parameters below are required for text mode
        field_separator = ";"
        wrap_begin = "[[" # For external wrapping structure
        wrap_end = "]]"

        indent = 4 # indent by 4 spaces, each level. Default is 0.
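
To make the template idea concrete, here is a small sketch of how such a class could render a record with Python's ``%`` string formatting. ``TextSketch``, its ``render`` method and the two-argument ``meta_d1`` are hypothetical, and the ``indent`` option is left out for brevity:

```python
class TextSketch(object):
    """Hypothetical text serializer; only illustrates the template idea."""

    field_format = "%(key)s :: { %(value)s, %(meta_d1)s }"
    field_separator = ";"
    wrap_begin = "[["
    wrap_end = "]]"

    def meta_d1(self, key, value):
        # Illustrative metadata: the Python type name of the value.
        return type(value).__name__

    def render(self, record):
        rendered = []
        for key, value in record:
            context = {
                "key": key,
                "value": value,
                "meta_d1": self.meta_d1(key, value),
            }
            # Fill the template from the per-field context dict.
            rendered.append(self.field_format % context)
        body = (self.field_separator + " ").join(rendered)
        return self.wrap_begin + body + self.wrap_end


output = TextSketch().render([("title", "Hello"), ("pages", 12)])
print(output)
```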

2. For markup-based serializers, users can provide strings for the tag
names of fields, field values and models.

class XMLFoo(XMLSerializer):
        mode = "xml"
        indent = 2
        metadata_display_mode = TAGS

        # Now all fields will be rendered as <object>...</object>
        field_tag_name = "object"
        model_tag_name = "model"

        # If metadata_display_mode is set to ``TAGS``, this sets the
        # tag name of the value of the model field
        value_tag_name = "value"

3. A class field ``wrap_fields`` will be provided to wrap all fields
of a model into a group, as it is done now. If ``wrap_fields`` is set
to “all_fields”, for example, then all the fields would be serialized
inside an object called “all_fields”. If ``wrap_fields`` is not set,
there will be no grouping.

---------------------------------------
Related models and nesting
---------------------------------------

I will replace the current “start_object -> handle_object ->
end_object” sequence with a single method for handling a model, so
that related models can be handled easily using recursion. A
``nesting_depth`` option would be provided to the user as a field
variable. Its default value would be 0, which matches the current
behaviour. Serializing only specific fields of related models can be
done by using the ``fields`` argument in the call to serialize. A
field of a related model would be written as “Model_name.field_name”
instead of just “field_name”.

Instead of the single ``_current`` list, I would use a separate list
for each level of nesting.
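
The recursion with a depth limit could be sketched as follows. This uses plain dicts in place of model instances; ``serialize_object`` and the ``pk``/``fields`` record layout are assumptions made for the sketch only:

```python
def serialize_object(obj, nesting_depth=0):
    """Serialize a dict-based record, expanding relations up to a depth."""
    result = {}
    for name, value in obj["fields"].items():
        if isinstance(value, dict) and "fields" in value:
            # The value is a related record.
            if nesting_depth > 0:
                result[name] = serialize_object(value, nesting_depth - 1)
            else:
                # Depth exhausted: fall back to the primary key.
                result[name] = value["pk"]
        else:
            result[name] = value
    return result


publisher = {"pk": 7, "fields": {"name": "Acme"}}
author = {"pk": 3, "fields": {"name": "Ann", "publisher": publisher}}
book = {"pk": 1, "fields": {"title": "Hello", "author": author}}

print(serialize_object(book, nesting_depth=0))
print(serialize_object(book, nesting_depth=1))
```

With a depth of 0 a related record collapses to its primary key, which mirrors how foreign keys are written today.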

---------------------------------------------------------
New features in the serialize() function
---------------------------------------------------------
Apart from the changes I’ve proposed for the ``fields`` argument of
serialize, I would like to add a couple of features:

• An ``exclude`` argument, which would be a list of fields to exclude
from the model. It could also name fields to exclude in related
models.

• An ``extras`` argument, which would allow properties and data
returned by some methods to be serialized.
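
A small sketch of how ``exclude`` and ``extras`` might be applied to a single object. ``Book`` and ``select_data`` are hypothetical names, and a plain Python class stands in for a model:

```python
class Book(object):
    """Plain class standing in for a model (hypothetical)."""

    def __init__(self, title, price, pages):
        self.title = title
        self.price = price
        self.pages = pages

    @property
    def price_per_page(self):
        return self.price / self.pages

    def slug(self):
        return self.title.lower().replace(" ", "-")


def select_data(obj, field_names, exclude=(), extras=()):
    data = {}
    for name in field_names:
        if name not in exclude:
            data[name] = getattr(obj, name)
    for name in extras:
        attr = getattr(obj, name)
        # Call methods; use properties and plain attributes as they are.
        data[name] = attr() if callable(attr) else attr
    return data


book = Book("Hello World", 25.0, 100)
selected = select_data(book, ["title", "price", "pages"],
                       exclude=["price"], extras=["slug", "price_per_page"])
print(selected)
```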

-----------------------------------
Permission Framework
-----------------------------------
While creating an API, there may arise a need to give different people
varying levels of access to data. For this I propose a permission
framework, where the user can choose to restrict data to certain
groups while defining a model. I guess a different name should be
used, so that it is not confused with the “Permission” model used in
contrib.auth and contrib.admin. Here’s an example:

class User(models.Model):
        name = CharField(max_length=128)        # No restrictions
        picture_url = URLField(restrict_to=('friends', 'self', 'admins'))
        security_question = CharField(max_length=200,
                                      restrict_to=('self', 'admins'))
        # The trailing comma makes this a one-element tuple
        security_answer = CharField(max_length=200, restrict_to=('admins',))

Here different permission groups like ‘self’, ‘friends’ and ‘admins’
are created as a field option. To use this, specify the
permission_level in the call to serialize:

data = serializers.serialize(queryset, permission_level='friends')

If no permission_level is given, only unrestricted fields will be
serialized.
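
The filtering rule could be sketched like this. ``RESTRICTIONS`` and ``visible_fields`` are hypothetical; the dict simply mirrors the ``restrict_to`` options from the example model, with ``None`` meaning unrestricted:

```python
# Hypothetical per-field restrictions, mirroring the restrict_to option.
RESTRICTIONS = {
    "name": None,  # unrestricted
    "picture_url": ("friends", "self", "admins"),
    "security_question": ("self", "admins"),
    "security_answer": ("admins",),
}


def visible_fields(restrictions, permission_level=None):
    """Return the sorted field names visible at a permission level."""
    visible = []
    for name, restrict_to in restrictions.items():
        # Unrestricted fields are always visible; restricted ones only
        # when the given level is listed for that field.
        if restrict_to is None or permission_level in restrict_to:
            visible.append(name)
    return sorted(visible)


print(visible_fields(RESTRICTIONS))
print(visible_fields(RESTRICTIONS, permission_level="friends"))
print(visible_fields(RESTRICTIONS, permission_level="admins"))
```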

-----------------------------------------------------------------
Representing the existing serialization model
-----------------------------------------------------------------
Here is an implementation of the existing serialization format in
JSON; this would be the ‘fixture’ mode that I’ve mentioned above.

class JsonFixture(JSONSerializer):
        wrap_fields = "fields"
        nesting_depth = 0

        def meta2_pk(self, model):
            ...

        def meta2_model(self, model):
            ...

In XML:

class XMLFixture(XMLSerializer):
        wrap_fields = "fields"
        nesting_depth = 0
        metadata_display_mode = ATTRIBUTES
        indent = 4

        field_tag_name = "field"
        model_tag_name = "object"

        def meta2_pk(self, model):
            ...

        def meta2_model(self, model):
            ...

        def meta_type(self, field):
            ...
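
For reference, the JSON layout that the fixture mode has to reproduce is a list of objects with ``pk``, ``model`` and ``fields`` keys. The sketch below shows how ``wrap_fields`` could produce that shape; ``to_fixture_entry`` is a hypothetical helper, not existing Django code:

```python
import json


def to_fixture_entry(model_label, pk, field_values, wrap_fields="fields"):
    """Build one fixture-style entry from model-level and field data."""
    # Model-level metadata (what meta2_pk / meta2_model would supply).
    entry = {"pk": pk, "model": model_label}
    if wrap_fields:
        # Group all fields under the configured key.
        entry[wrap_fields] = dict(field_values)
    else:
        # No grouping: fields sit next to the model-level keys.
        entry.update(field_values)
    return entry


entry = to_fixture_entry("auth.user", 1, {"username": "alice"})
flat = to_fixture_entry("auth.user", 1, {"username": "alice"},
                        wrap_fields=None)
print(json.dumps([entry], sort_keys=True))
```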

===================
Deliverables and Timeline
===================

I would be working for about 40-45 hours each week and I would be
writing tests, exceptions and error messages along with development.
This would more or less be my timeline:

Till May 23

I will familiarize myself with community best practices and the
version control systems used by Django, read the code of all the
relevant modules related to the serialization framework, look at
implementations of other serialization libraries in different
languages and go through all the model and regression tests related to
serialization.

Weeks 1 to 2

I will use this time to set up the basic foundations of the project
by:

   1. Writing the skeletal structure of the serializer, based on the
current implementations in django.core.serializers.base and
django.core.serializers.python.
   2. Setting up basic configurations for JSON, YAML, XML, Text and
creating the fixture mode.
   3. Making changes to loaddata and dumpdata in
django.core.management.commands to ensure backward compatibility.
   4. Using a dict as temporary storage before the final ‘dumping’
stage.
   5. Modifying the deserializers to handle custom formats of
serialization and specifying the requirements for deserialization.

Week 3

   1. Implementation of the metadata methods at field and model level
using getattr and similar methods.
   2. Make changes to the fields argument of ``serialize()``.
   3. Representation of output formats of the metadata in JSON/YAML,
XML etc.

Week 4

   1. Implementation of the unicode and datatype conversion methods in
a way similar to the metadata methods.
   2. Providing the user the choice of ‘protected’ types for the
serialization.

Week 5

   1. Provide all the configurable options for output formatting as
discussed above.
   2. Add support for string templates and their parsing.
   3. Parsing the dict used for temporary storage to generate XML and
custom text outputs

Week 6

   1. Implement serialization of related models like foreign keys and
many-to-many fields using recursion.
   2. Integrate with ``fields`` argument of ``serialize`` and specify
the format for representing a related model.
   3. Implement nesting depth limit feature.

Week 7

   1. Implement the ``exclude`` feature in serialize() which will
allow the user to choose fields to exclude while serializing.
   2. Adding an ``extras`` argument to serialize(), allowing the user
to specify additional properties of a model, which are not field
variables but derivatives of field variables and defined as methods or
properties in the model by the user.

Week 8

   1. Implement the permissions framework, which would give varying
levels of access to data to different users based on their permission
level.
   2. Integrate with Models API and field options.
   3. Add permission_level argument to serialize().

Weeks 9 - 10

   1. Write documentation for the project and provide many examples.
   2. Write a few tutorials on how to use the framework.
   3. Write some project-level tests, do some extensive final testing
and refine the project.

=====
About
=====

I am Vivek Narayanan, a second-year undergrad student at the Institute
of Technology, Varanasi, majoring in Electronics Engineering. I’m
really passionate about computers and have been programming for the
past 5-6 years. I have some experience in Python, C/C++, Java,
JavaScript, HTML, PHP, Haskell and ActionScript. Python is my favorite
language and I’ve been using it for a couple of years. While working
on a web application, I stumbled upon Django a few months back and I
really love its elegant approach to everything.

I have submitted patches to tickets #15299 [1], #12489 [2] and #8809
[3] on the Django Trac.

Some of the projects I’ve worked on are:

   1. blitzx86: An assembler for the Intel x86 architecture using lex
and yacc. [4]
   2. An assembler for the DLX architecture supporting pipeline
optimizations. [5]
   3. mapTheGraph! - A location-based social networking application on
web and Android platforms.
   4. A social networking game based on forex trading, which is under
development.
   5. pyzok - A Python-based LAN chat server. [6]
   6. aeroFox - A .NET-based open source browser for Windows with
transparent windows, which has had over 100,000 downloads. [7]

I am a fast learner and can grasp new technologies / languages in a
short period of time. Among other things, I enjoy playing tennis and
reading books on a wide variety of topics.

====
Links
====

[1] http://code.djangoproject.com/ticket/15299

[2] http://code.djangoproject.com/ticket/12489

[3] http://code.djangoproject.com/ticket/8809

[4] https://github.com/vivekn/blitz8086

[5] https://github.com/vivekn/dlx-compiler

[6] https://github.com/vivekn/pyzok

[7] http://sourceforge.net/projects/aerofox/files/aerofox/0.4.8.7/
