Re: NoSQL Support for the ORM

Waldemar Kornewald Thu, 08 Apr 2010 10:08:40 -0700

On Thu, Apr 8, 2010 at 6:14 PM, Alex Gaynor <alex.gay...@gmail.com> wrote:
> On Wed, Apr 7, 2010 at 4:43 PM, Waldemar Kornewald <wkornew...@gmail.com> 
> wrote:
>> On Wed, Apr 7, 2010 at 5:12 PM, Alex Gaynor <alex.gay...@gmail.com> wrote:
>>> No.  I am vehemently opposed to attempting to extensively emulate the
>>> features of a relational database in a non-relational one.  People
>>> talk about the "object relational" impedance mismatch, much less the
>>> "object-relational non-relational" one.  I have no interest in
>>> attempting to support any attempts at emulating features that just
>>> don't exist on the databases they're being emulated on.
>>
>> This decision has to be based on the actual needs of NoSQL developers.
>> Did you actually work on non-trivial projects that needed
>> denormalization and in-memory JOINs and manually maintained counters?
>> I'm not making this up. The "dumb" key-value store API is not enough.
>> People are manually writing lots of code for features that could be
>> handled by an SQL emulation layer. Do we agree until here?
>>
>
> No, we don't.  People are designing their data in ways that fit their
> datastore. If all people did was implement a relational model in
> userland code on top of non-relational databases then they'd really be
> missing the point.


Then you're calling everyone a fool. :) What do you call a CouchDB or
Cassandra index mapping usernames to user pks? Its purpose is exactly
to do something that relational DBs provide out of the box. You can't
deny that people do in fact manually maintain such indexes.

So, your suggestion is to write code like this:

# ----------
from django.db import models

class User(models.Model):
    username = models.CharField(max_length=200)
    email = models.CharField(max_length=200)
    ...

class UsernameUser(models.Model):
    username = models.CharField(primary_key=True, max_length=200)
    user_id = models.IntegerField()

class EmailUser(models.Model):
    email = models.CharField(primary_key=True, max_length=200)
    user_id = models.IntegerField()

def add_user(username, email):
    user = User.objects.create(username=username, email=email)
    UsernameUser.objects.create(username=username, user_id=user.id)
    EmailUser.objects.create(email=email, user_id=user.id)
    return user

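def get_user_by_username(username):
    # Look up the pk in the hand-maintained index table, then fetch the user.
    user_id = UsernameUser.objects.get(username=username).user_id
    return User.objects.get(id=user_id)

def get_user_by_email(email):
    # Same two-step lookup through the email index table.
    user_id = EmailUser.objects.get(email=email).user_id
    return User.objects.get(id=user_id)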

get_user_by_username('marcus')
get_user_by_email('mar...@marcus.com')
# ----------

What I'm proposing allows you to just write this:

# ----------
class User(models.Model):
    username = models.CharField(max_length=200)
    email = models.CharField(max_length=200)
    ...

User.objects.get(username='marcus')
User.objects.get(email='mar...@marcus.com')
# ----------

Are you seriously saying that people should use the first version of
the code when they work with a simplistic NoSQL DB (note, it's how
they work today with those DBs)?
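
To make the difference concrete, here is a minimal sketch in plain Python
(not Django, and not an actual implementation of the proposal) of a layer
that keeps such lookup indexes up to date by itself on top of a dumb
key-value store. The class name IndexedStore, its methods, and the sample
email address are all made up for illustration, and it assumes that
indexed values are unique:

```python
# Hypothetical sketch: a thin layer over a dict-based key-value store that
# maintains the "username -> pk" and "email -> pk" indexes automatically.
# Assumes every indexed value is unique and present on each record.

class IndexedStore:
    def __init__(self, indexed_fields):
        self.rows = {}  # primary key -> record dict
        # One "index table" per indexed field: field value -> primary key.
        self.indexes = {field: {} for field in indexed_fields}
        self.next_pk = 1

    def create(self, **fields):
        pk = self.next_pk
        self.next_pk += 1
        self.rows[pk] = dict(fields, id=pk)
        # The step that add_user() does by hand in the first version.
        for field, index in self.indexes.items():
            index[fields[field]] = pk
        return self.rows[pk]

    def get(self, **lookup):
        # Only single-field lookups are supported in this sketch.
        (field, value), = lookup.items()
        if field == 'id':
            return self.rows[value]
        # Two-step lookup: index table first, then the record itself.
        pk = self.indexes[field][value]
        return self.rows[pk]


users = IndexedStore(indexed_fields=['username', 'email'])
users.create(username='marcus', email='marcus@example.com')
by_name = users.get(username='marcus')
by_email = users.get(email='marcus@example.com')
```

The caller only ever sees create() and get(); the index bookkeeping that
the first version spreads over three models and three helper functions
lives in one place.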

>> * Django apps written for NoSQL will be portable across all NoSQL DBs
>> without any code changes and in the worst case require only minor
>> changes to switch to SQL
>> * the resulting code is shorter and easier to understand than with a
>> separate API which would only add another layer of indirection you'd
>> have to think about *every* (!) single time you work with models (and
>> if you have to think about this while writing model code you end up
>> with potentially a lot more bugs, as is actually the case in practice)
>> * developers won't have to use and learn a different models API (you'd
>> only need to learn an API for specifying "optimization" rules, but the
>> models would still be the same)
>>
>
> Uhh, the whole point of this is that there is only a single API.

And what you're suggesting is an API whose semantics are different on
every single backend? How is that better? The indexing API would at
least look and behave the same on all backends, so it's a "learn once
and use anywhere" experience.

>> What if you filter on one field defined in the parent class and
>> another field defined on the child class? Emulating this query would
>> be either very inefficient and (for large datasets) possibly return no
>> results, at all, or require denormalization which I'd find funny in
>> the case of MTI because it brings us back to single-table inheritance,
>> but it might be the only solution that works efficiently on all NoSQL
>> DBs.
>>
>
> Filters on base fields can be implemented fairly easily on databases
> with IN queries.  Otherwise I suppose it raises an exception.

How would that be implemented with an IN filter (you have two
different tables)? What would the (pseudo-)code look like?

Bye,
Waldemar Kornewald

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.
