Re: NoSQL Support for the ORM

Waldemar Kornewald Fri, 09 Apr 2010 01:43:18 -0700

On Thu, Apr 8, 2010 at 11:03 PM, flo...@gmail.com <flo...@gmail.com> wrote:
> On Apr 8, 12:32 pm, Waldemar Kornewald <wkornew...@gmail.com> wrote:
>
>> What I'm proposing is not a complete emulation of all features at all
>> cost, but simply an automation of the things that are possible and in
>> wide use on nonrel DBs. Moreover, you'd only use these features where
>> actually needed, so this would be a helper that replaces exactly the
>> code you'd otherwise write by hand - nothing more. Denormalization,
>> counters, etc. indeed go over the network, but people still do it
>> because there is no alternative (CouchDB being an exception, but there
>> we can auto-generate a view, so the index is created on the DB: same
>> game, different location).
>
> "Denormalization, counters, etc." is a completely orthogonal problem.
> Solving those problems would help even those who are using relational
> databases, in fact.  But just because it's useful, and precisely
> because it's orthogonal, means it doesn't belong in this summer of
> code project.


I guess we have a misunderstanding here. I never wanted to have it in
this GSoC project. It's clearly out of scope. I just want to make sure
that the refactoring will not make emulation more difficult later.

> I think what you're going to run into is that since CouchDB,
> Cassandra, MongoDB, GAE, Redis, Riak, Voldemort, etc. are all so
> vastly different, that attempting to do *any* emulation will result in
> serious pain down the line.

You are absolutely right that those DBs are vastly different (*at the
low level*) and that's why Django's ORM would suck as a crippled
low-level replacement for the native NoSQL APIs. I can't imagine how
you would map, for instance, Redis' list management features to the
ORM. For some problems you won't get around using Redis' native API
if you want an efficient solution, no matter how extensive the
emulation layer is. Unless I've misunderstood, Alex wants to build
just a low-level API replacement, but that doesn't make sense
because the native NoSQL APIs are much better at this task than
Django's ORM will ever be.

But there's a second use-case: Working with object-like data (with
relations between objects). That's where people write indexing code by
hand (column indexes, denormalization indexes, counters, etc.) which
is very unproductive. This use-case is pretty common and it maps
pretty well to Django's ORM *at the high level* because the high-level
usage can look the same on all DBs:
* get by username: User.objects.filter(username=...)
* join via denormalization index:
  Profile.objects.filter(age=21, user__username=...)
* keep counter for number of votes for each video: video.vote_set.count()
* etc.

An abstraction/emulation layer can save you a lot of work because you
won't have to maintain the required indexes by hand and those indexes
don't make your query code more complicated. Also, it makes your code
portable (except where you needed the native API for optimization).
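To make the index maintenance above a bit more concrete, here is a
rough sketch in plain Python (no Django involved). KeyValueStore and
IndexedCollection are hypothetical names invented for this
illustration, and the store is just an in-memory stand-in for a nonrel
DB that only supports lookups by key. It is one possible way such a
layer could keep column indexes current on every save, not a proposal
for the actual ORM design:

```python
class KeyValueStore:
    """Hypothetical stand-in for a nonrel DB: access by key only."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)

    def delete(self, key):
        self._rows.pop(key, None)


class IndexedCollection:
    """Hypothetical emulation layer that maintains column indexes on save."""

    def __init__(self, store, name, indexed_fields):
        self.store = store
        self.name = name
        self.indexed_fields = tuple(indexed_fields)

    def _row_key(self, pk):
        return f"{self.name}:row:{pk}"

    def _index_key(self, field, value):
        return f"{self.name}:index:{field}:{value!r}"

    def _index_pks(self, field, value):
        # Return a copy of the set of primary keys stored for this value.
        return set(self.store.get(self._index_key(field, value)) or ())

    def save(self, pk, row):
        # Remove stale index entries first if the row already exists.
        old = self.store.get(self._row_key(pk))
        if old is not None:
            for field in self.indexed_fields:
                pks = self._index_pks(field, old[field])
                pks.discard(pk)
                key = self._index_key(field, old[field])
                if pks:
                    self.store.put(key, pks)
                else:
                    self.store.delete(key)
        # Write the row, then add it to the index entry of each field.
        self.store.put(self._row_key(pk), dict(row))
        for field in self.indexed_fields:
            pks = self._index_pks(field, row[field])
            pks.add(pk)
            self.store.put(self._index_key(field, row[field]), pks)

    def filter(self, **conditions):
        # Answer the query purely from index entries (no table scan).
        # With no conditions this sketch simply returns an empty list.
        result = None
        for field, value in conditions.items():
            if field not in self.indexed_fields:
                raise ValueError(f"no index on {field!r}")
            pks = self._index_pks(field, value)
            result = pks if result is None else result & pks
        return [self.store.get(self._row_key(pk)) for pk in sorted(result or ())]


store = KeyValueStore()
users = IndexedCollection(store, "user", ["username", "age"])
users.save(1, {"username": "alice", "age": 21})
users.save(2, {"username": "bob", "age": 21})
users.save(1, {"username": "alice", "age": 22})  # update moves index entries
```

The calling code only uses save() and filter(); the index bookkeeping
that you would otherwise write by hand stays inside the layer.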

This also means that there is a clear separation of purpose: the
native API for a few optimizations and Django's ORM for object-like
data. We already use that distinction when we combine raw SQL with the
ORM.

It's no problem to automatically maintain such indexes on NoSQL DBs.
As long as you're free to store whatever you want in the DB and run
background tasks, you have full control over everything.
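As an illustration of the background-task part, here is a small
plain-Python sketch of a vote counter that is updated from a queue of
pending changes. VoteCounter and its method names are made up for this
example, and the dicts and deque only stand in for DB tables and a
task queue. It assumes that a slightly stale counter is acceptable
until the task has run:

```python
from collections import deque


class VoteCounter:
    """Hypothetical sketch: a denormalized per-video vote counter."""

    def __init__(self):
        self.votes = {}         # vote_id -> video_id (the primary data)
        self.counts = {}        # video_id -> number of votes (denormalized)
        self.pending = deque()  # queued (video_id, delta) counter updates

    def add_vote(self, vote_id, video_id):
        # Ignore duplicate vote ids so the counter cannot drift upwards.
        if vote_id in self.votes:
            return
        self.votes[vote_id] = video_id
        self.pending.append((video_id, +1))

    def remove_vote(self, vote_id):
        video_id = self.votes.pop(vote_id, None)
        if video_id is not None:
            self.pending.append((video_id, -1))

    def run_background_task(self):
        # Apply all queued deltas; on a real DB this would run in a worker.
        while self.pending:
            video_id, delta = self.pending.popleft()
            self.counts[video_id] = self.counts.get(video_id, 0) + delta

    def count(self, video_id):
        # Cheap read of the stored counter instead of counting vote rows.
        return self.counts.get(video_id, 0)


counter = VoteCounter()
counter.add_vote("v1", "video-a")
counter.add_vote("v2", "video-a")
counter.add_vote("v2", "video-a")  # duplicate, ignored
count_before_task = counter.count("video-a")
counter.run_background_task()
count_after_task = counter.count("video-a")
```

At the high level, something like video.vote_set.count() could then
read the stored counter instead of counting rows.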

> It simply doesn't seem reasonable to claim that whatever refactoring
> Alex does, will make "it more difficult or even impossible to
> implement an emulation layer" because all he would be doing is
> decoupling SQL from the query class.  That can *only* make your goal
> easier.

When I said the ORM refactoring should not make it more difficult to
implement the emulation layer, Alex said that he was "vehemently
opposed" to emulating SQL features extensively. I don't know what
"extensively" means, but if we have to make a design decision during
the refactoring and Alex says "I don't care about that feature for
NoSQL" we might end up with an ORM that actually makes it more
difficult. That's why I find it important that we agree on a common
goal for the refactoring.

Bye,
Waldemar Kornewald

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.
