RFC: Django history tracking

2006-06-14 Thread Uros Trebec

Hi, everyone!

First: introduction. My name is Uros Trebec and I was lucky enough to
be selected to implement my idea of "history tracking" in Django. I guess
at least some of you think this is a very nice feature to have in a web
framework, so I would like to thank all of you who voted for my Summer of
Code proposal! Thank you!

Ok, to get right to the point: this is a Request For Comment. I would
like to know what you think about my idea for the implementation and how
I can make it better. Here is what I have in mind so far...

(Just for reference: http://zabica.org/~uros/soc/ . Here you can find
my initial project proposal and some diagrams.)


1. PROPOSAL:
The main idea is to keep a content history for every change to a Model.
The current change tracking is very limited, to say the least, so I will
extend/replace it so that one can actually see how something was changed.

1.1 SCOPE:
Changes will have to be made in different parts of Django. Most of
the things should be taken care of inside django.db, except diff-ing
and merging.


USAGE

2. MODELS:
The easiest way to imagine how this will work is to look at an actual
use case. So, let's see how Bob would use this feature.

2.1. Basic models:
To enable history tracking, Bob has to add a History sub-class to each
model that he would like to track:

    from django.db import models

    class Post(models.Model):
        author = models.CharField(maxlength=100)
        title = models.CharField(maxlength=100)
        content = models.TextField()
        date = models.DateField()

        # An empty History sub-class turns on tracking of all fields.
        class History:
            pass

This works much like using the Admin subclass. The difference is that if
the History subclass is present, the database will have to include two
tables for this class:

(the main table - not changed):

    CREATE TABLE app_post (
        "id" serial NOT NULL PRIMARY KEY,
        "author" varchar(100) NOT NULL,
        "title" varchar(100) NOT NULL,
        "content" text NOT NULL,
        "date" datestamp NOT NULL
    );


(and the history table):

    CREATE TABLE app_post_history (
        "id" serial NOT NULL PRIMARY KEY,
        "change_date" datestamp NOT NULL,   -- required for datetime revert
        "parent_id" integer NOT NULL REFERENCES app_post (id),
        "author" varchar(100) NOT NULL,     -- data from app_post
        "title" varchar(100) NOT NULL,      -- data from app_post
        "content" text NOT NULL,            -- data from app_post
        "date" datestamp NOT NULL           -- data from app_post
    );

I think this would be enough to save a "basic full" version of a
changed record. "parent_id" is a ForeignKey to app_post.id, so Bob can
find the saved revisions for a record from app_post, and when he
selects a record from _history he knows which record it belongs to.
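
A minimal sketch of the two-table idea, using the standard-library sqlite3
module instead of Django so it can be run on its own. The table and column
names follow the SQL above; the save_with_history() helper is hypothetical,
and copying the row *before* the update (so the history holds the old
version) is an assumption, since the proposal does not say which version is
stored.

```python
import sqlite3

# Hypothetical sketch: not Django API. Column types are simplified to what
# SQLite accepts; names mirror app_post and app_post_history from the RFC.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_post (
        id INTEGER PRIMARY KEY,
        author TEXT NOT NULL,
        title TEXT NOT NULL,
        content TEXT NOT NULL,
        date TEXT NOT NULL
    );
    CREATE TABLE app_post_history (
        id INTEGER PRIMARY KEY,
        change_date TEXT NOT NULL,
        parent_id INTEGER NOT NULL REFERENCES app_post (id),
        author TEXT NOT NULL,
        title TEXT NOT NULL,
        content TEXT NOT NULL,
        date TEXT NOT NULL
    );
""")


def save_with_history(conn, post_id, **changes):
    """Copy the current row into the history table, then apply the changes.

    Column names in `changes` are interpolated into SQL, so they must come
    from trusted code, never from user input.
    """
    conn.execute(
        "INSERT INTO app_post_history "
        "(change_date, parent_id, author, title, content, date) "
        "SELECT datetime('now'), id, author, title, content, date "
        "FROM app_post WHERE id = ?",
        (post_id,),
    )
    assignments = ", ".join(f"{column} = ?" for column in changes)
    conn.execute(
        f"UPDATE app_post SET {assignments} WHERE id = ?",
        (*changes.values(), post_id),
    )


conn.execute(
    "INSERT INTO app_post (id, author, title, content, date) "
    "VALUES (1, 'bob', 'First', 'Hello', '2006-06-14')"
)
save_with_history(conn, 1, title="First post", content="Hello, world")

history = conn.execute(
    "SELECT parent_id, title, content FROM app_post_history"
).fetchall()
current = conn.execute(
    "SELECT title, content FROM app_post WHERE id = 1"
).fetchone()
```

After the call, app_post holds the edited row and app_post_history holds one
snapshot whose parent_id points back at it.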


2.2. Selective models:
But what if Bob doesn't want to keep every piece of information in the
history (why someone would want to keep an incomplete track of a record
is beyond me, but never mind)? Maybe the 'author' and 'date' of a post
can't change, so he would like to leave those out. At the same time, he
would like to know who made each change, even though he does not need
that information when using Post.

Again, this works like the Admin subclass when defining which fields to
use:

    class Post(models.Model):
        author = models.CharField(maxlength=100)
        title = models.CharField(maxlength=100)
        content = models.TextField()
        date = models.DateField()

        class History:
            # Only these fields are copied into the history table.
            track = ('title', 'content')
            # Extra columns that exist only in the history table.
            additional = {
                "changed_by": models.CharField(maxlength=100),
            }


In this case "app_post_history" would look like this:

    CREATE TABLE app_post_history (
        "id" serial NOT NULL PRIMARY KEY,
        "change_date" datestamp NOT NULL,   -- required for datetime revert
        "parent_id" integer NOT NULL REFERENCES app_post (id),
        "title" varchar(100) NOT NULL,      -- data from app_post
        "content" text NOT NULL,            -- data from app_post
        "changed_by" varchar(100) NOT NULL  -- new field
    );
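
The effect of 'track' and 'additional' on what ends up in a history row can
be sketched in plain Python. The build_history_row() helper and its argument
names are hypothetical; they only illustrate the selection logic described
above and are not part of Django or of the proposal.

```python
# Hypothetical helper, for illustration only.
def build_history_row(record, track=None, additional=None):
    """Return the values that would be stored in the history table.

    record     -- mapping of field name to current value
    track      -- field names to keep (None means every field)
    additional -- extra history-only values, e.g. changed_by
    """
    fields = record.keys() if track is None else track
    row = {name: record[name] for name in fields}
    row.update(additional or {})
    return row


post = {
    "author": "bob",
    "title": "First post",
    "content": "Hello, world",
    "date": "2006-06-14",
}
row = build_history_row(
    post,
    track=("title", "content"),
    additional={"changed_by": "alice"},
)
```

With no 'track' given, the helper falls back to copying every field, which
matches the basic model from section 2.1.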



3. VIEWS
3.1. Listing the change-sets:
Ok, so after a few edits Bob would like to see when and what was
added/changed in one specific record.

A view should probably work something like this:

from django.history import list_changes

Re: RFC: Django history tracking

2006-06-19 Thread Uros Trebec

Hi!

> There was a similar thread on this earlier where I commented about a
> slightly different way to store the changes:
> http://groups.google.com/group/django-users/browse_thread/thread/f36f4e48f9579fff/0d3d64b25f3fd506?q=time_from&rnum=1

Thanks for this one, I already found something useful.

> To summarize, in the past I've used a time_from/time_thru pair of
> date/time columns to make it more efficient to retrieve the version of
> a row as it looked at a particular point in time. Your design of just
> using change_date makes this more difficult.

I don't know what you mean exactly, but I'm not using just
change_date. The ID in *_history table defines the "revision/version
number", so you don't have to use "change_date" to get the exact
revision.
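
The two lookups being discussed (an exact revision by history ID, and the
state of a record at a point in time) can be compared with a rough sketch.
The rows below are an in-memory stand-in for app_post_history, and both
function names are made up for illustration.

```python
from datetime import datetime

# Hypothetical stand-in for rows of app_post_history.
history = [
    {"id": 1, "parent_id": 7, "change_date": datetime(2006, 6, 14, 9, 0), "title": "Draft"},
    {"id": 2, "parent_id": 8, "change_date": datetime(2006, 6, 14, 9, 30), "title": "Other"},
    {"id": 3, "parent_id": 7, "change_date": datetime(2006, 6, 15, 12, 0), "title": "Final"},
]


def get_revision(rows, revision_id):
    """Look up one exact revision by its history-table ID."""
    for row in rows:
        if row["id"] == revision_id:
            return row
    return None


def revision_as_of(rows, parent_id, moment):
    """Return the latest revision of a record saved at or before `moment`."""
    candidates = [
        row for row in rows
        if row["parent_id"] == parent_id and row["change_date"] <= moment
    ]
    return max(candidates, key=lambda row: row["change_date"], default=None)
```

The point-in-time lookup has to compare and sort on change_date, which is the
cost a time_from/time_thru pair is meant to reduce.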

> I can also think of use cases where I want the versioning to track both
> date and time since I would expect multiple changes on the same day.

This one is my fault. What I meant was using datetime for that field,
for exactly those reasons. Good catch!

> Maybe these could also be options?

Such ideas are always welcome. I will try and make it as versatile as possible.

Regards,
Uros

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers
-~--~~~~--~~--~--~---



Re: RFC: Django history tracking

2006-06-19 Thread Uros Trebec

> Sounds nice, this is a feature I'm currently looking for... but I've
> already started my own implementation.

Nice! Do you have anything in code yet? Any bottlenecks?


> I would just share it with you.
>
> I've build a single table History with :
> - "change"; a  text field which will contain a python pickled
> dictionary: { field: old_value} in case you update a record.

How does this help/work? Why dictionary? Can you explain?

> - type: type of modification (update, delete, insert).

Is this really necessary? How do you make use of it?

> - "obj": the table object. This can come from ContentType

I don't understand...

> - "obj_id": the id of the impacted object.
> - create_date: a timestamp automatically set.


> I'm using it by sub-classing the save methods in each model I want to
> see the history.
> This is quite flexible, because you can decide which field you want to
> track.

I agree. But I fail to see the need for not versioning the whole record/row.
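
For reference, the pickled {field: old_value} dictionary quoted above could
look roughly like this. The changed_fields() helper is a hypothetical name,
and the records are plain dicts rather than Django model instances.

```python
import pickle


def changed_fields(old, new):
    """Return {field: old_value} for every field whose value differs."""
    return {name: old[name] for name in old if old[name] != new.get(name)}


old = {"title": "First", "content": "Hello", "author": "bob"}
new = {"title": "First post", "content": "Hello", "author": "bob"}

# pickle.dumps() returns bytes; storing it in a text column would need an
# extra encoding step that is left out of this sketch.
change = pickle.dumps(changed_fields(old, new))

# Undoing the update: overlay the stored old values on the current record.
restored = {**new, **pickle.loads(change)}
```

Only the fields that actually changed are stored, which is the main
difference from versioning the whole row.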


> To facilitate, yet one step further, it would be nice to have a
> PickledField within Model.models of django.

Can you elaborate on that?


> Feedbacks are welcome.

Same here! :) And thanks for your feedback!

Regards,
Uros




Re: RFC: Django history tracking

2006-06-19 Thread Uros Trebec

Hi!

> Great to see that your RFC is pretty much exactly what I was thinking
> (feature and implementation-wise) when I posted
> http://groups.google.com/group/django-developers/browse_thread/thread/d90001b1d043253e/77d36caaf8cfb071

I'm glad! Thanks for the link too.

> It would be nice to record "who" made the change (optionally when there
> is a user with an id available).

I was thinking of not pushing the use of such fields, because there is
no easy way to figure out how each application handles
accounts/users. But it's something that should be made possible
with additional/custom fields, IMHO.


> I thought that storing complete row copies on both inserts and updates
> to original object isn't that bad - it certainly simplifies the
> machinery.

This is true.

> Because the way I was considering using this feature would
> read history tables very infrequent their size wasn't a big factor in
> my mind.

I'm sort-of undecided about this. On one hand you can potentially have
a lot more data to handle, but on the other, you don't need multiple
SELECTs and merging when you want a version from way back.
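
The trade-off can be illustrated with made-up data: with delta storage an
old version has to be rebuilt by merging several rows, while with full
copies it is a single lookup. The rebuild() function and both data
structures are hypothetical.

```python
# Delta storage: newest-first list of {field: old_value} entries for one record.
current = {"title": "Third", "content": "c"}
deltas = [
    {"title": "Second"},                 # undoes the latest change
    {"title": "First", "content": "a"},  # undoes the change before that
]


def rebuild(current, deltas, steps_back):
    """Merge `steps_back` deltas into the current row to recover an older version."""
    version = dict(current)
    for delta in deltas[:steps_back]:
        version.update(delta)
    return version


# Full-copy storage: every old version is stored whole, keyed by revision.
full_copies = {
    1: {"title": "First", "content": "a"},
    2: {"title": "Second", "content": "c"},
}
```

Going back N versions touches N delta rows, but only one full-copy row.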

What do others think about this?


> An admin to view change history "diff" colored output and to revert to
> arbitrary previous version would be an obvious future addition.

I agree. And I do have it on my todo list, but it's not "feature
critical", so it will have to wait until the machinery is done. Or
maybe not... hmm...

Regards,
Uros




Re: RFC: Django history tracking

2006-06-19 Thread Uros Trebec

On 6/19/06, IanSparks <[EMAIL PROTECTED]> wrote:
> Although you have a date field  in your example model it might not hurt
> to add an automatic timestamp to a model that uses versioning in this
> way.

Changing the versioned model because of use of versioning is something
I would like to avoid. Forcing such things might not be a good idea.
But if there is no other way...

> One that relies on a "data" field that could be changed by a user
> doesn't seem safe to me.

I don't know what you mean by this?

> I'd also like an automatic userid stamp on there over and above the
> "author" which again is a data field not a hidden system field.

As I said before, this is not something that can be easily done,
because of the various ways of handling users/accounts. Or am I missing
the point here?

> You might also consider some automatic "revision number" system which
> increments every time the record is changed. This makes it easier to
> "roll back" to the previous entry and can be a lifesaver if something
> happens to whatever is providing the dates.

Every record in the *_history table has its own ID, which I was going to
use as the "revision number". And to make it easier to find the "previous
revision" I was thinking of adding a "prev_rev" column to the table.
What do you think? Would this be enough?
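
A rough sketch of how a "prev_rev" column could be followed. Because the
history table's ID is shared by every tracked record, the revisions of one
record are not necessarily consecutive, which is where the back-pointer
helps. The data and the revision_chain() name are hypothetical.

```python
# Hypothetical history rows keyed by ID; prev_rev is None for a first revision.
history = {
    1: {"parent_id": 7, "prev_rev": None, "title": "Draft"},
    2: {"parent_id": 8, "prev_rev": None, "title": "Other"},
    3: {"parent_id": 7, "prev_rev": 1, "title": "Edited"},
    5: {"parent_id": 7, "prev_rev": 3, "title": "Final"},
}


def revision_chain(rows, revision_id):
    """Yield revision IDs from `revision_id` back to the first revision."""
    while revision_id is not None:
        yield revision_id
        revision_id = rows[revision_id]["prev_rev"]


chain = list(revision_chain(history, 5))
```

Rolling back one step is then a single lookup of rows[revision_id]["prev_rev"].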

> I think the framework suggested is a great start. I would be interested
> in seeing a feature that tied changes not just to the user who made the
> change but also to the "session" that they made the change in. i.e. if
> my system allows "Dave" to have two active sessions at different
> computers I'd like to track what he did in each session not just what
> date the changes occurred. This is very helpful for user complaints and
> fraud detection.

Hmm, very interesting idea! Do you have any suggestions on how this
would be best implemented? I must admit, I don't have much knowledge
of how Django works internally, so any help that I can get would be very
much appreciated!

Thanks!

Regards,
Uros




Re: FullHistory branch (Was: Branch Merges)

2006-11-07 Thread Uros Trebec

Hi guys!

I am sorry for not responding to your emails, I've been quite busy with
some other django-related projects.

Anyway, I really appreciate your interest in my work and I'll try to
get the branch up to speed ASAP.

If you have any patches, usage problems or questions, please contact me
directly (or CC: the message) because I often overlook what's going on
here.

best regards,
Uros





svnmerge log policy?

2007-03-14 Thread Uros Trebec

Hi everyone,

I am the maintainer of "full-history" branch and I am currently trying
to bring it up to date.

I am trying to merge the current trunk into the branch, and the only way
to do that successfully was to use the "svnmerge" tool.

Now, the merge apparently went well, without any conflicts, so I would
like to commit the merge. What I am wondering about is how I
should write the commit log... Will a simple "Merged revisions
3642-4724 via svnmerge from http://code.djangoproject.com/svn/django/trunk"
suffice, or should I use the whole "svnmerge-commit-message.txt", which
is 3184 lines long (it includes the log entry of every commit to trunk
from revision 3642 to 4724)?

Best regards,
Uros Trebec

PS: Just for reference, here is what I did:
$ svn co http://code.djangoproject.com/svn/django/branches/full-history
(revision 4731)
$ svnmerge init -r 3641 (last trunk merge)
$ svnmerge avail
$ svnmerge merge -r 

Was this ok?

