Hi,
In the past three weeks my project has changed a lot. First of all, I
changed the output of the first phase of serialization. Previously it was
Python native datatypes. At some point I added a dictionary with metadata
to it, which was used in the second phase of serialization. Now the first
phase returns an ObjectWithMetadata, which is a wrapper around the Python
native datatypes. It's a bit hackish, so I don't know if it is a good
solution:

    class ObjectWithMetadata(object):
        def __init__(self, obj, metadata=None, fields=None):
            self._object = obj
            self.metadata = metadata or {}
            self.fields = fields or {}

        def get_object(self):
            return self._object

        def __getattribute__(self, attr):
            if attr not in ['_object', 'metadata', 'fields', 'get_object']:
                return self._object.__getattribute__(attr)
            else:
                return object.__getattribute__(self, attr)

        # there are a few more methods like this (for acting like a
        # MutableMapping and Iterable) and all are similar
        def __getitem__(self, key):
            return self._object.__getitem__(key)

        ...

Thanks to this, an ObjectWithMetadata acts like the object stored in
_object in almost all cases (including isinstance tests), and there is a
place to store additional data.
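
To make the behaviour described above concrete, here is a small sketch. It assumes CPython 3 and uses a condensed copy of the class, with only __getitem__ from the elided special methods; the sample dictionary and metadata values are made up for illustration.

```python
class ObjectWithMetadata(object):
    # Condensed copy of the wrapper, for illustration only.
    def __init__(self, obj, metadata=None, fields=None):
        self._object = obj
        self.metadata = metadata or {}
        self.fields = fields or {}

    def get_object(self):
        return self._object

    def __getattribute__(self, attr):
        # Everything except the wrapper's own names is looked up on the
        # wrapped object, including __class__.
        if attr not in ['_object', 'metadata', 'fields', 'get_object']:
            return self._object.__getattribute__(attr)
        return object.__getattribute__(self, attr)

    def __getitem__(self, key):
        return self._object.__getitem__(key)


# Made-up sample data.
wrapped = ObjectWithMetadata({'title': 'Hello'}, metadata={'pk': 1})

print(isinstance(wrapped, dict))  # isinstance falls back to wrapped.__class__
print(wrapped['title'])           # special method defined on the wrapper
print(sorted(wrapped.keys()))     # ordinary attribute, delegated to the dict
print(wrapped.metadata)           # stored on the wrapper itself
print(type(wrapped).__name__)     # type() is not fooled by the delegation
```

The last line matters for the rest of this mail: code that checks the concrete type instead of calling isinstance still sees the wrapper.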
I didn't change deserialization, so its output is Python native
datatypes without the wrapper. I don't know if this is good, because the
two directions are not symmetric:

    Django object -> Python native datatype packed in ObjectWithMetadata ->
    json -> Python native datatype -> Django object

All dumpdata formats work now (xml, json, yaml). All tests pass, but
there is a problem with the order of fields in yaml. It will be fixed
soon.
I made a new format, new_xml, which is structured like json and yaml. It
is easier to parse.
Old:

    <object pk="1" model="serializers.article">
      <field to="serializers.author" name="author" rel="ManyToOneRel">1</field>
      <field to="serializers.category" name="categories" rel="ManyToManyRel">
        <object pk="1"></object>
        <object pk="2"></object>
      </field>
    </object>

New:

    <object pk="1" model="serializers.article">
      <fields>
        <author to="serializers.author" rel="ManyToOneRel">1</author>
        <categories to="serializers.category" rel="ManyToManyRel">
          <object>1</object>
          <object>2</object>
        </categories>
      </fields>
    </object>

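
As a rough illustration of why the new layout is easier to parse, the sketch below reads the new format with the standard library's xml.etree.ElementTree. The function name parse_object and the shape of the returned dictionary are assumptions made for this example, not part of the branch.

```python
import xml.etree.ElementTree as ET

NEW_XML = """<object pk="1" model="serializers.article">
  <fields>
    <author to="serializers.author" rel="ManyToOneRel">1</author>
    <categories to="serializers.category" rel="ManyToManyRel">
      <object>1</object>
      <object>2</object>
    </categories>
  </fields>
</object>"""


def parse_object(xml_text):
    """Hypothetical helper: turn one <object> element into a dict."""
    root = ET.fromstring(xml_text)
    fields = {}
    # Each child of <fields> is named after the model field, so no
    # lookup of a separate name="..." attribute is needed.
    for node in root.find('fields'):
        children = list(node)
        if children:
            # Nested <object> elements: a to-many relation, keep the keys.
            fields[node.tag] = [child.text for child in children]
        else:
            fields[node.tag] = node.text
    return {'pk': root.get('pk'), 'model': root.get('model'), 'fields': fields}


print(parse_object(NEW_XML))
```

The result has the same nesting as the json and yaml output (pk, model, fields), with every value still a string.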
There is also a problem with json and serialization to a stream: json
uses an extension written in C (_json) for performance, and this leads
to exceptions when ObjectWithMetadata is used, so before passing objects
to json.dump they should be unpacked from ObjectWithMetadata.
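
One possible way to do the unpacking is a recursive helper like the sketch below. The name unwrap is hypothetical, the class is a condensed copy of the wrapper, and the sample data is made up. The helper checks type() rather than isinstance, because the wrapper deliberately passes isinstance tests for the wrapped type.

```python
import json


class ObjectWithMetadata(object):
    # Condensed copy of the wrapper, for illustration only.
    def __init__(self, obj, metadata=None, fields=None):
        self._object = obj
        self.metadata = metadata or {}
        self.fields = fields or {}

    def get_object(self):
        return self._object

    def __getattribute__(self, attr):
        if attr not in ['_object', 'metadata', 'fields', 'get_object']:
            return self._object.__getattribute__(attr)
        return object.__getattribute__(self, attr)


def unwrap(value):
    """Hypothetical helper: strip wrappers at every nesting level."""
    while type(value) is ObjectWithMetadata:
        value = value.get_object()
    if isinstance(value, dict):
        return {key: unwrap(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [unwrap(item) for item in value]
    return value


# Made-up sample: a wrapped dict containing another wrapped dict.
inner = ObjectWithMetadata({'title': 'Hello'}, metadata={'model': 'article'})
outer = ObjectWithMetadata({'pk': 1, 'fields': inner})

plain = unwrap(outer)
print(json.dumps(plain, sort_keys=True))
```

The metadata is dropped by this step, so anything the second phase needs from it has to be read before unwrapping.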
There is probably no chance of achieving one of the most important
requirements that I specified: using a single Serializer to serialize
Django models to multiple formats:

    serializers.serialize('json', objects, serializer=MySerializer)
    serializers.serialize('xml', objects, serializer=MySerializer)

The trouble is with xml (as always ;). In xml every (model) field must
be converted to a string before it reaches the xml serializer. In json
and yaml, if a field has a protected type (string, int, datetime etc.)
then nothing is done with it. The conversion has to happen in the first
phase, because only there is field.value_to_string available, the field
method used to convert a field value to a string. It can be overridden
by the user, so simply calling smart_unicode in the second phase instead
isn't enough.
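
The difference can be shown without Django using stand-in classes. Everything in the sketch below (Field, TagListField, Article) is a hypothetical stand-in written for this example; only the value_to_string(obj) method name mirrors the real field API, and str() stands in for a generic conversion such as smart_unicode.

```python
class Field(object):
    """Hypothetical stand-in for a model field."""

    def __init__(self, name):
        self.name = name

    def value_from_object(self, obj):
        return getattr(obj, self.name)

    def value_to_string(self, obj):
        # Default behaviour: a generic string conversion.
        return str(self.value_from_object(obj))


class TagListField(Field):
    """Hypothetical custom field whose author overrides value_to_string."""

    def value_to_string(self, obj):
        return ','.join(self.value_from_object(obj))


class Article(object):
    def __init__(self, tags):
        self.tags = tags


article = Article(['django', 'xml'])
field = TagListField('tags')

# First phase: the field is available, so the override is respected.
first_phase = field.value_to_string(article)

# Second phase: only the raw value is left, so a generic conversion
# produces a different string than the field author intended.
second_phase = str(article.tags)

print(first_phase)
print(second_phase)
```

Here the two strings differ, which is why the conversion cannot simply be postponed to the format-specific phase.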
Most important tasks in TODO:
 - handling natural keys
 - tests
    x correctness
    x performance (I suspect my solution will be slower than the one
      currently used in Django, but by how much?)
 - documentation

https://github.com/grapo/django/tree/soc2012-serialization/django/core/serializers
--
Piotr Grabowski