Re: [GSOC] NoSQL Support for the ORM

Waldemar Kornewald Wed, 07 Apr 2010 14:56:02 -0700

On Wed, Apr 7, 2010 at 5:22 PM, Alex Gaynor <alex.gay...@gmail.com> wrote:
>> Other issues that spring to mind:
>>
>>  * What about nonSQL datatypes? List/Set types are a common feature of
>> Non-SQL backends, and are The Right Way to solve a whole bunch of
>> problems. How do you propose to approach these datatypes? What (if
>> any) overlap exists between the use of set data types and m2m? Is
>> there any potential overlap between supporting List/Set types and
>> supporting Arrays in SQL?
>>
>
> Is there overlap between List/Set and Arrays in SQL?  Probably.  In my
> opinion there's no reason, once we have a good clean separation of
> concerns in the architecture, that implementing a ListField would be
> particularly hard.  If we happened to include one in Django, all the
> better (from the perspective of interoperability).


Do all SQL DBs provide an array type? PostgreSQL has one and I think it
can exactly mimic NoSQL lists, but I couldn't find an equivalent in
SQLite or MySQL. Does this possibly stand in the way of integrating
an official ListField into Django, or is it OK to have a field that
isn't supported on all DBs? Or can we fall back to storing the list
items in separate entities in that case?
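
A minimal sketch of that last fallback, assuming a hypothetical "post"
table with a list of tags. The table names and the save_list/load_list
helpers are made up for illustration and are not Django API. Each list
item gets its own row, with a position column to preserve order:

```python
import sqlite3

# Hypothetical schema: one row per list item, keyed by owner and position.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE post_tags ("
    "post_id INTEGER REFERENCES post(id), "
    "position INTEGER, "
    "value TEXT, "
    "PRIMARY KEY (post_id, position))"
)


def save_list(conn, post_id, items):
    """Replace the stored list for post_id with items, keeping order."""
    conn.execute("DELETE FROM post_tags WHERE post_id = ?", (post_id,))
    conn.executemany(
        "INSERT INTO post_tags (post_id, position, value) VALUES (?, ?, ?)",
        [(post_id, i, v) for i, v in enumerate(items)],
    )


def load_list(conn, post_id):
    """Rebuild the Python list from its item rows."""
    rows = conn.execute(
        "SELECT value FROM post_tags WHERE post_id = ? ORDER BY position",
        (post_id,),
    )
    return [row[0] for row in rows]


def posts_containing(conn, value):
    """Return ids of posts whose list contains value."""
    rows = conn.execute(
        "SELECT DISTINCT post_id FROM post_tags "
        "WHERE value = ? ORDER BY post_id",
        (value,),
    )
    return [row[0] for row in rows]
```

A "list contains value" filter then becomes a plain lookup on the value
column, which is roughly what a containment query on a native array
type would provide.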

>>  * How does a non-SQL backend integrate with syncdb and other setup
>> tools? What about inspectdb?
>>
>
> Most, but not all, non-relational databases don't require table setup
> the way relational DBs do.  MongoDB doesn't require anything at all;
> by contrast, Cassandra requires an XML configuration file.  How to
> handle these is a little touchy, but basically I think syncdb should
> stay conceptually pure, generating "tables"; if extra config is needed,
> backends should ship custom management commands.

Essentially, I agree, but I would add things like auto-generated
CouchDB views to the syncdb process (since syncdb on SQL already takes
care of creating indexes, too).

>>  * Why the choice of MongoDB specifically? Do you have particular
>> experience with MongoDB? Does MongoDB have features that make it a
>> good choice?
>>
>
> MongoDB offers a wide range of filtering options, which from my
> perspective means it presents a greater test of the flexibility of the
> developed APIs.  For this reason GAE would also be a good choice.
> Something like Riak or Cassandra, which basically only have native
> support for get(pk=3), would be a poor test of the flexibility of the
> API.

MongoDB really is a good choice. Out of the box (without manual index
definitions) it provides more features than GAE and most other NoSQL
DBs. MongoDB and GAE should also have the simplest backends.

Why should the Cassandra/CouchDB/Riak/Redis/etc. backend only support
pk=... queries? There's no reason why the backend couldn't maintain
indexes for the other fields and transparently support filters on any
field. I mean, you don't really want developers to manually create and
query separate indexing models for mapping one field value to its
respective primary key in the primary model table. We can do much
better than that.
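
To illustrate, here is a toy sketch in plain Python, not any real
backend's API. IndexedKVStore is a hypothetical in-memory store whose
only native lookup is by primary key. The wrapper keeps a
value -> pks index per declared field up to date on every write, so
equality filters work on those fields without the developer touching
an index model:

```python
class IndexedKVStore:
    """Toy store: native get-by-pk plus backend-maintained field indexes."""

    def __init__(self, indexed_fields):
        self._rows = {}  # pk -> dict of field values
        # field name -> {field value -> set of pks}
        self._indexes = {field: {} for field in indexed_fields}

    def _unindex(self, pk, row):
        # Remove pk from every index entry that the old row occupied.
        for field, index in self._indexes.items():
            if field in row:
                pks = index.get(row[field])
                if pks is not None:
                    pks.discard(pk)
                    if not pks:
                        del index[row[field]]

    def put(self, pk, row):
        # Drop stale index entries before writing the new version.
        old = self._rows.get(pk)
        if old is not None:
            self._unindex(pk, old)
        self._rows[pk] = dict(row)
        for field, index in self._indexes.items():
            if field in row:
                index.setdefault(row[field], set()).add(pk)

    def delete(self, pk):
        row = self._rows.pop(pk, None)
        if row is not None:
            self._unindex(pk, row)

    def get(self, pk):
        # The only lookup the underlying store supports natively.
        return self._rows[pk]

    def filter(self, **conditions):
        # Resolve each condition through its index, then intersect pks.
        pk_sets = []
        for field, value in conditions.items():
            if field not in self._indexes:
                raise ValueError("no index declared for field %r" % field)
            pk_sets.append(self._indexes[field].get(value, set()))
        if pk_sets:
            matching = set.intersection(*pk_sets)
        else:
            matching = set(self._rows)
        return [self._rows[pk] for pk in sorted(matching)]
```

A real backend would have to persist these indexes in the DB itself
and deal with consistency between the row and its index entries, which
this sketch ignores.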

>>  * Given that you're only proposing a single proof-of-concept backend,
>> have you given any thought to the needs of other backends? It's not
>> hard to envisage that Couch, Cassandra, GAE etc will all have slightly
>> different requirements and problems. Is there a common ground that
>> exists between all data store backends? If there isn't, how do you
>> know that what you are proposing will be sufficient to support them?
>>
>
> To a certain extent this is a matter of knowing the feature sets of
> the databases and, hopefully, having a mentor who is knowledgeable
> about them.  The reality is that, under the GSOC time constraints,
> attempting to write complete backends for multiple databases would
> probably be impossible.

Well, you might be able to quickly adapt the MongoDB backend to GAE
(within GSoC time constraints) due to their similarity. Anyway, there
is common ground between the NoSQL DBs, but this depends heavily on
what problem we agree to solve. If we only provide exactly the
features that each DB supports natively, they'll appear dissimilar
because they take very different approaches to indexing. And if
indexing isn't abstracted and automated, NoSQL support doesn't really
make sense with Django. OTOH, if the goal is to build an abstraction
around their indexes, they can all look very similar from the
perspective of Django's ORM (of course they have different "features"
like sharding or eventual consistency or being in-memory DBs or
supporting fast writes or reads or having transactions or ..., but in
the end only a few of these features have any influence on Django's
ORM at all).

Bye,
Waldemar Kornewald
