Dynamically Generate Methods

2011-11-18 Thread GZ
Hi,

I have a class Record and a list key_attrs that specifies the names of
all attributes that correspond to a primary key.

I can write a function like this to get the primary key:

def get_key(instance_of_record):
   return tuple(instance_of_record.__dict__[k] for k in key_attrs)

However, since key_attrs are determined at the beginning of the
program while get_key() will be called over and over again, I am
wondering if there is a way to dynamically generate a get_key method
with the key attributes expanded to avoid the list comprehension/
generator.

For example, if key_attrs=['A','B'], I want the generated function to
be equivalent to the following:

def get_key(instance_of_record):
   return (instance_of_record.__dict__['A'], instance_of_record.__dict__['B'])

I realize I can use eval or exec to do this. But is there any other
way to do this?

Thanks,
gz




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Dynamically Generate Methods

2011-11-20 Thread GZ
Hi All,

I see. It works.

Thanks,
GZ

On Nov 18, 12:04 pm, Ian Kelly  wrote:
> On Fri, Nov 18, 2011 at 7:51 AM, GZ  wrote:
> > Hi,
>
> > I have a class Record and a list key_attrs that specifies the names of
> > all attributes that correspond to a primary key.
>
> > I can write a function like this to get the primary key:
>
> > def get_key(instance_of_record):
> >   return tuple(instance_of_record.__dict__[k] for k in key_attrs)
>
> > However, since key_attrs are determined at the beginning of the
> > program while get_key() will be called over and over again, I am
> > wondering if there is a way to dynamically generate a get_key method
> > with the key attributes expanded to avoid the list comprehension/
> > generator.
>
> (Accidentally sent this to the OP only)
>
> This is exactly what the attrgetter factory function produces.
>
> from operator import attrgetter
> get_key = attrgetter(*key_attrs)
>
> But if your attribute names are variable and arbitrary, I strongly
> recommend you store them in a dict instead.  Setting them as instance
> attributes risks that they might conflict with the regular attributes
> and methods on your objects.
>
> Cheers,
> Ian
>
>
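
A minimal sketch of the attrgetter approach suggested above. The Record class and its attribute values here are hypothetical stand-ins; only key_attrs and get_key come from the thread.

```python
from operator import attrgetter


class Record(object):
    """Hypothetical stand-in for the poster's Record class."""

    def __init__(self, A, B, C):
        self.A = A
        self.B = B
        self.C = C


key_attrs = ['A', 'B']

# Built once at startup; each call returns a tuple of the named attributes.
get_key = attrgetter(*key_attrs)

rec = Record(1, 'x', 3.0)
key = get_key(rec)
```

One caveat worth keeping in mind: with a single attribute name, attrgetter returns the bare value rather than a one-element tuple, so a key_attrs list of length one would need separate handling.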



Close as Many Files/External resources as possible in the face of exceptions

2011-11-20 Thread GZ
Hi,

Here is my situation. A parent object owns a list of files (or other
objects with a close() method). The close() method can sometimes fail
and raise an exception. When the parent object's close() method is
called, it needs to close as many of the files it owns as possible, even
if the close() method of some files fails. I also want to re-raise at
least one of the original exceptions so that the outer program can
handle it.

What I came up with is something like this, supposing L is a list that
holds all the file objects.

is_closed = set()
try:
    for f in L:
        f.close()
        is_closed.add(f)
except:
    try:
        raise  # re-raise immediately, keeping context intact
    finally:
        for f in L:  # close the rest of the file objects
            if f not in is_closed:
                try:
                    f.close()
                except:
                    pass

It will re-raise the first exception and preserve the context and
close as many other files as possible while ignoring any further
exceptions.

But this looks really awkward. And in the case that two files fail to
close, I am not sure the best strategy is to ignore the second
failure.

What is a better way of handling such situation?

Thanks,
gz
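
One possible simplification, sketched for Python 3: loop once, remember the first failure, and re-raise it after every close() has been attempted. The name close_all is an assumption made for illustration, and ignoring later failures is a policy choice, not the only reasonable one.

```python
def close_all(closeables):
    """Try to close every object, then re-raise the first failure, if any."""
    first_error = None
    for obj in closeables:
        try:
            obj.close()
        except Exception as exc:
            # Keep only the first failure; later ones are dropped here,
            # but they could just as well be logged or collected in a list.
            if first_error is None:
                first_error = exc
    if first_error is not None:
        raise first_error
```

Because each close() sits in its own try block, no second pass over the list is needed and no object has close() called on it twice.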



Re: How to Get Data from DictReader for CSV Files

2011-11-21 Thread GZ
Hi,

On Nov 21, 7:42 am, ray  wrote:
> I don't see how to get my data from the output.  I can see the data in
> the rows but it is mixed in with the field names.  That is, the data I
> get comes out as:
> fieldname1 : data1 , fieldname2 : data2 , etc.
>
> import csv
> linelist=open( "C:/Users/me/line_list_r0.csv", "rb" )
> csvReader= csv.DictReader( linelist, dialect='excel' )
> for data in csvReader:
>         print data
> linelist.close()
>
> I want to pass this data as arrays or lists to another module such as:
> myfunction(data1, data2, data3)
>
> How do I get the data I want out of the pair fieldname1 : data1?
>
> Thanks,
> ray
>
>

It returns a dict(). You can reference the fields with
data['fieldname1'], etc.
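
A short sketch of that lookup, written for Python 3. The in-memory sample, the three column names, and the body of myfunction are placeholders standing in for the real file and function.

```python
import csv
import io

# In-memory stand-in for the CSV file; the column names are hypothetical.
sample = io.StringIO("fieldname1,fieldname2,fieldname3\n1,2,3\n4,5,6\n")


def myfunction(data1, data2, data3):
    """Placeholder for the function the values are passed to."""
    return (data1, data2, data3)


results = []
for row in csv.DictReader(sample, dialect='excel'):
    # Each row maps a column name to that row's cell value (as a string).
    results.append(
        myfunction(row['fieldname1'], row['fieldname2'], row['fieldname3'])
    )
```

If the field names are not needed at all, csv.reader yields each row as a plain list instead.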


Generator Question

2011-12-21 Thread GZ
Hi,

I am wondering what would be the best way to return an iterator that
has zero items.

I just noticed the following two are different:

def f():
   pass
def g():
   if 0:  yield 0
   pass

for x in f(): print x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not iterable

for x in g(): print x
#loop exits without any errors

Now the question here is this:

def h():
    if condition:
        # I would like to return an iterator with zero length
    else:
        for ...: yield x

In other words, when a certain condition is met, I want to yield
nothing. How can I do this?

Thanks,
gz
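
Two ways this can be sketched. The parameters condition and items are assumptions added so the example is self-contained; they are not part of the original h().

```python
def h(condition, items):
    """Generator that yields nothing when condition is true."""
    if condition:
        return  # a bare return ends the generator without yielding
    for x in items:
        yield x


def empty():
    """Non-generator alternative: an iterator over an empty tuple."""
    return iter(())
```

Any function containing a yield is a generator, so the bare return in h() does not hand back None to the caller; it simply stops the iteration.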







Re: Generator Question

2011-12-24 Thread GZ
I see. Thanks for the clarification.

On Dec 22, 12:35 am, Steven D'Aprano  wrote:
> On Wed, 21 Dec 2011 21:45:13 -0800, GZ wrote:
> > Now the question here is this:
>
> > def h():
> >     if condition:
> >        #I would like to return an iterator with zero length
> >     else:
> >        for ...: yield x
>
> > In other words, when certain condition is met, I want to yield nothing.
> > How to do?
>
> Actually, there's an even easier way.
>
> >>> def h():
> ...     if not condition:
> ...         for c in "abc":
> ...             yield c
> ...
> >>> condition = False
> >>> list(h())
> ['a', 'b', 'c']
> >>> condition = True
> >>> list(h())
> []
>
> --
> Steven
>
>



Test None for an object that does not implement ==

2011-12-24 Thread GZ
Hi,

I ran into a weird problem. I have a piece of code that looks like the
following:

f(, a=None, c=None):
assert  (a==None)==(c==None)


The problem is that == is not implemented sometimes for values in a
and c, causing an exception NotImplementedError.

I ended up doing assert (not a)==(not c), but I think this code has
other issues, for example, when a=[] and c=['a'], the assertion will
fail, although a is not None.

So how do I reliably test if a value is None or not?

Thanks,
gz
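
A sketch of the usual identity test. The NoEq class is hypothetical and only imitates an object whose == raises; both_or_neither is an illustrative helper name.

```python
class NoEq(object):
    """Hypothetical class whose == raises, as described in the post."""

    def __eq__(self, other):
        raise NotImplementedError("comparison not supported")


def both_or_neither(a, c):
    # 'is' compares object identity and never calls __eq__, so it is safe
    # for arbitrary objects and does not confuse [] or 0 with None.
    return (a is None) == (c is None)
```

Since None is a singleton, `x is None` is true only for None itself, which avoids both the exception and the empty-list false positive mentioned above.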


Nested Function Question

2012-01-06 Thread GZ
Hi,

I am reading the documentation of functools.partial
(http://docs.python.org/library/functools.html#functools.partial) and found
the following 'reference implementation' of functools.partial.

def partial(func, *args, **keywords):
    def newfunc(*fargs, **fkeywords):
        newkeywords = keywords.copy()
        newkeywords.update(fkeywords)
        return func(*(args + fargs), **newkeywords)
    newfunc.func = func
    newfunc.args = args
    newfunc.keywords = keywords
    return newfunc

I don't understand why the below 3 lines are needed:

newfunc.func = func
newfunc.args = args
newfunc.keywords = keywords


It is as if they are trying to prevent garbage collection, but I don't
get why it is needed. As long as something holds a reference to newfunc,
nothing will be freed, because newfunc in turn references keywords and
args. If nothing is referencing newfunc, then everything should be
freed.

Thanks,
GZ
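
For what it is worth, the functools documentation lists func, args and keywords as read-only attributes of partial objects, so the three assignments look like a way for the reference implementation to mimic that introspection interface rather than anything related to garbage collection. A small sketch, with power as a made-up example function:

```python
import functools


def power(base, exponent):
    """Made-up example function."""
    return base ** exponent


square = functools.partial(power, exponent=2)

# The attributes expose what was frozen into the partial object.
wrapped = square.func       # the original callable
frozen_args = square.args   # positional arguments captured at creation
frozen_kw = square.keywords # keyword arguments captured at creation
```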


low level python read's

2007-01-07 Thread gz
Hi!

I wanted to use python to test a simple character device (on linux) and
I'm running into strange behaviour of read.
I have a short buffer inside my device and the idea is that it blocks
reads when the buffer is empty. For reads that ask for more characters
than the buffer holds, the device should return the number of bytes
actually read.

In c i can write:
f = open("/dev/testdevice",O_RDWR);
read(f,buffer,1000);

and i see in my device, that everything is ok.

Now I'd love to reproduce this in python:
f = open("/dev/testdevice", "r+",0) # for unbuffered access
f.read(1000)

..but now i see in the device logs that python issued 2 reads! One
that got the whole buffer (fewer than 1000 chars), and after that python
tries to read more (and hangs in my device, since the buffer is
empty).

So how do i stop python from trying to be smart and just read *at most*
1000 chars and let it go if he(it?*) reads fewer?



grzes.
p.s *is python a "he" or an "it"?
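
One way to get the C behaviour is to bypass the file object and use the os module, which works on raw file descriptors. A minimal sketch, written for Python 3; read_at_most is an illustrative name, and the device path would be passed in by the caller.

```python
import os


def read_at_most(path, n):
    """Open path read/write and issue one low-level read for up to n bytes."""
    fd = os.open(path, os.O_RDWR)
    try:
        # os.read wraps the read() system call, so it returns whatever one
        # call delivers (possibly fewer than n bytes) instead of looping
        # until n bytes have arrived.
        return os.read(fd, n)
    finally:
        os.close(fd)
```

For the device in question this would be called as read_at_most("/dev/testdevice", 1000), assuming the device behaves as described.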



pyopenglcontext binaries for 2.5 on win32

2007-11-12 Thread gz
no, I don't have them... I need them :)

I'd like to thank Giovanni Bajo for providing binaries for the various
package dependencies, and getting me going with pyopengl.

Unfortunately I only managed to run a basic example, where there's no
animation. The glwindow only gets redrawn when it's resized, moved...
well, generally whenever it is redrawn as a window.

I would greatly appreciate some hints about how to process the gui
events in the gl portion, and how to run a continuous animation in wx +
pyopengl.

I suspect the whole thing would be way easier with pyopenglcontext,
but I can't seem to find a binary for python 2.5.
I can't get it to install with mingw and don't have vc currently
installed. If someone has successfully built it, please share.

Although, I think, an example of a running opengl spinning cube,
embedded in some wx menu + buttons, capable of handling, say, mouse
clicks in the glwindow, would work best for me.

I'm not even that keen on wx. I chose it purely on the basis that
wx is generally brought up here more frequently than qt.
(Didn't qt have some licensing change in the last few months that
could potentially change that?)



How to use a class property to store function variables?

2010-04-27 Thread GZ
I want to store a reference to a function into a class property.

So I am expecting that:

class A:
 fn = lambda x: x

fn = A.fn
fn(1)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unbound method <lambda>() must be called with A instance as
first argument (got int instance instead)


The problem is that A.fn is treated as an unbound method. I really want
A.fn to be a variable that stores a reference to a function. Is there
any way to achieve this?

Thanks,
GZ


Re: How to use a class property to store function variables?

2010-04-27 Thread GZ
Hi Chris,

On Apr 27, 6:43 pm, Chris Rebert  wrote:
> On Tue, Apr 27, 2010 at 4:36 PM, GZ  wrote:
> > I want to store a reference to a function into a class property.
>
> > So I am expecting that:
>
> > class A:
> >     fn = lambda x: x
>
> > fn = A.fn
> > fn(1)
>
> > Traceback (most recent call last):
> >  File "<stdin>", line 1, in <module>
> > TypeError: unbound method <lambda>() must be called with A instance as
> > first argument (got int instance instead)
>
> > The problem is that A.fn is treated as an unbound method. I really want
> > A.fn to be a variable that stores a reference to a function. Is there
> > any way to achieve this?
>
> Use the staticmethod() decorator:
>
> class A(object):
>     @staticmethod
>     def fn(x):
>         return x
>
> #rest same as before
>
> Cheers,
> Chris
> -- http://blog.rebertia.com

I do not think it will help me. I am not trying to define a function
fn() in the class, but rather I want to make it a "function reference"
so that I can initialize it any way I like later.

For example, I want to be able to write the following:

A.fn = lambda x : x*x
f = A.fn
f(1)
A.fn = lambda x : x^2
f= A.fn
f(2)

In other words, I want to make A.fn a reference to a function not
known to me at the time I define class A. I want to be able to
initialize it later.




Re: How to use a class property to store function variables?

2010-04-27 Thread GZ
On Apr 27, 9:20 pm, alex23  wrote:
> GZ  wrote:
> > I do not think it will help me. I am not trying to define a function
> > fn() in the class, but rather I want to make it a "function reference"
> > so that I can initialize it any way I like later.
>
> It always helps to try an idea out before dismissing it out of hand.
> Experimentation in the interpreter is cheap and easy.
>
> >>> class A(object):
> ...   fn = staticmethod(lambda x: x*x)
> ...
> >>> A.fn(10)
> 100
> >>> A.fn = staticmethod(lambda x: x**x)
> >>> A.fn(3)
> 27
> >>> def third(x): return x/3
> ...
> >>> A.fn = staticmethod(third)
> >>> A.fn(9)
> 3
>
> However, I'm assuming you're wanting to do something like this:
>
> >>> class B(object):
>
> ...   def act(self):
> ...     print self.fn()
>
> That is, providing a hook in .act() that you can redefine on demand.
> If so, note that you only need to decorate functions as staticmethods
> if you're assigning them to the class. If you intend on overriding on
> _instances_, you don't:
>
> >>> B.fn = staticmethod(lambda: 'one') # assign on class
> >>> b = B() # instantiate
> >>> b.act() # act on instance
> one
> >>> B.fn = staticmethod(lambda: 'two') # assign on class
> >>> b.act() # existing instance calls new version on class
> two
> >>> b.fn = staticmethod(lambda: 'three') # assign on instance
> >>> b.act()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in act
> TypeError: 'staticmethod' object is not callable
> >>> b.fn = lambda: 'three' # look Ma, no staticmethod!
> >>> b.act()
> three
>
> Incidentally, this is known as the Strategy pattern, and you can see a
> simple example of it in Python here:
> http://en.wikipedia.org/wiki/Strategy_pattern#Python
>
> Hope this helps.

Ah, this totally works. The key is to use the staticmethod function.
Thanks a lot.
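
The same idea as a plain script, written for Python 3. The variable names other than A and fn are made up for illustration. In Python 3 a function looked up on the class is no longer wrapped as an unbound method, but staticmethod still matters when the attribute is reached through an instance.

```python
class A(object):
    # Placeholder; staticmethod stops Python from binding an instance to it.
    fn = staticmethod(lambda x: x)


# Later, once the real function is known, rebind the class attribute.
A.fn = staticmethod(lambda x: x * x)
square_via_class = A.fn(3)       # looked up on the class
square_via_instance = A().fn(3)  # looked up on an instance; no self is passed

# The attribute can be replaced again at any time.
A.fn = staticmethod(lambda x: x ** 3)
cube_via_class = A.fn(2)
```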


Re: How to use a class property to store function variables?

2010-04-27 Thread GZ
Another question: I am not sure how staticmethod works internally. And
the python doc does not seem to say. What does it do?


Re: How to use a class property to store function variables?

2010-04-28 Thread GZ
On Apr 28, 1:20 am, Chris Rebert  wrote:
> On Tue, Apr 27, 2010 at 11:02 PM, GZ  wrote:
> > Another question: I am not sure how staticmethod works internally. And
> > the python doc does not seem to say. What does it do?
>
> It involves the relatively arcane magic of "descriptors".
> See http://docs.python.org/reference/datamodel.html#implementing-descriptors
> or for a more complete but advanced explanation, the "Static methods
> and class methods" section of
> http://www.python.org/download/releases/2.2.3/descrintro/
>
> Understanding exactly how staticmethod() and friends work is not too
> essential in practice though.
>
> Cheers,
> Chris
> -- http://blog.rebertia.com

Got it. I appreciate your help.
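
To make the descriptor idea concrete, here is a rough pure-Python imitation. MyStaticMethod is an illustrative name and not the real built-in, which is implemented in C; the sketch only shows the mechanism.

```python
class MyStaticMethod(object):
    """Illustrative imitation of staticmethod; not the real implementation."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        # Attribute lookup on a class or an instance calls __get__.
        # Returning the function unchanged means nothing gets bound to it.
        return self.func


class A(object):
    fn = MyStaticMethod(lambda x: x * x)
```

Ordinary functions are descriptors too: their __get__ returns a bound method when accessed through an instance, which is the behaviour this wrapper suppresses.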


Remembering the context

2010-04-28 Thread GZ
Hi All,

I am looking at the following code:

def fn():

    def inner(x):
        return tbl[x]

    tbl={1:'A', 2:'B'}
    f1 = inner   # I want to make a frozen copy of the values of tbl in f1
    tbl={1:'C', 2:'D'}
    f2 = inner
    return (f1,f2)

f1,f2 = fn()
f1(1)  # output C
f2(1) # output C

What I want is for f1 to make a frozen copy of tbl at the time f1 is
made and f2 to make another frozen copy of tbl at the time f2 is made.
In other words, I want f1(1)=='A' and f2(1)=='C'.

One way to do this is to use functools.partial

import functools

def fn():

    def inner(tbl, x):
        return tbl[x]

    tbl={1:'A', 2:'B'}
    f1 = functools.partial(inner,tbl)   # I want to make a frozen copy of the values of tbl in f1
    tbl={1:'C', 2:'D'}
    f2 = functools.partial(inner,tbl)
    return (f1,f2)

I am wondering if there is any other way to do this.
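
One alternative, sketched below, is a small factory function that captures its own copy of the table. make_inner and frozen are names invented for this example.

```python
def fn():
    def make_inner(tbl):
        frozen = dict(tbl)  # shallow copy taken at the moment of creation

        def inner(x):
            return frozen[x]

        return inner

    tbl = {1: 'A', 2: 'B'}
    f1 = make_inner(tbl)
    tbl = {1: 'C', 2: 'D'}
    f2 = make_inner(tbl)
    return (f1, f2)


f1, f2 = fn()
```

Each call to make_inner creates a new scope, so f1 and f2 close over different frozen variables. The dict() copy additionally protects against later in-place changes to the original table, which functools.partial alone would not do. A default argument, as in `def inner(x, tbl=tbl)`, is another common way to bind the current value.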


Diff of Text

2010-06-03 Thread GZ
Hi All,

I am looking for an algorithm that can compare two source code files
line by line and find the minimum diff. I have looked at the difflib
included in python. The problem is that it is designed to make the
diff results easier for humans to read, rather than to minimize the size
of the output differential. I would like an algorithm implementation
that gives the absolute minimum difference between the two files.

Can you help me?

Thanks,
gz


Re: Diff of Text

2010-06-04 Thread GZ
Hi Pat,

On Jun 4, 2:55 pm, Patrick Maupin  wrote:
> On Jun 3, 9:54 pm, GZ  wrote:
>
> > Hi All,
>
> > I am looking for an algorithm that can compare two source code files
> > line by line and find the minimum diff. I have looked at the difflib
> > included in python. The problem is that it is designed to make the
> > diff results easier for humans to read, rather than to minimize the size
> > of the output differential. I would like an algorithm implementation
> > that gives the absolute minimum difference between the two files.
>
> > Can you help me?
>
> > Thanks,
> > gz
>
> There's an "rsync.py" module in pypi -- one would think that would
> have to solve that same problem...
>
> Regards,
> Pat

No, rsync does not solve my problem.

I want a library that works like unix 'diff', i.e. compares two
strings line by line and outputs the difference. Python's difflib does
not work perfectly for me, because the resulting differences are
pretty big. I would like an algorithm that generates the smallest
differences.


Re: Diff of Text

2010-06-04 Thread GZ
On Jun 4, 8:37 pm, Lie Ryan  wrote:
> On 06/05/10 07:51, GZ wrote:
>
> > Hi Pat,
>
> > On Jun 4, 2:55 pm, Patrick Maupin  wrote:
> >> On Jun 3, 9:54 pm, GZ  wrote:
>
> >>> Hi All,
>
> >>> I am looking for an algorithm that can compare two source code files
> >>> line by line and find the minimum diff. I have looked at the difflib
> >>> included in python. The problem is that it is designed to make the
> >>> diff results easier for humans to read, rather than to minimize the size
> >>> of the output differential. I would like an algorithm implementation
> >>> that gives the absolute minimum difference between the two files.
>
> >>> Can you help me?
>
> >>> Thanks,
> >>> gz
>
> >> There's an "rsync.py" module in pypi -- one would think that would
> >> have to solve that same problem...
>
> >> Regards,
> >> Pat
>
> > No, rsync does not solve my problem.
>
> > I want a library that does unix 'diff' like function, i.e. compare two
> > strings line by line and output the difference. Python's difflib does
> > not work perfectly for me, because the resulting differences are
> > pretty big. I would like an algorithm that generates the smallest
> > differences.
>
> is n=0 not short enough?
>
> pprint.pprint(list(difflib.context_diff(s, t, n=0)))

This still does not do what I want it to do. It only displays the diff
results in a different format. I want a different algorithm to
generate a smaller diff -- in other words, fewer differences.


vector addition

2010-06-05 Thread GZ
Hi,

I am looking for a fast internal vector representation so that
(a1,b1,c1)+(a2,b2,c2)=(a1+a2,b1+b2,c1+c2).

So I have a list

l = ['aa','bb','ca','de',...]

I want to count all items that start with an 'a', 'b', and 'c'.

What I can do is:

count_a = sum(int(x[0]=='a') for x in l)
count_b = sum(int(x[0]=='b') for x in l)
count_c = sum(int(x[0]=='c') for x in l)

But this loops through the list three times, which can be slow.

I'd like to have something like this:
count_a, count_b, count_c =
sum( (int(x[0]=='a'),int(x[0]=='b'),int(x[0]=='c'))   for x in l)

I hesitate to use numpy array, because that will literally create and
destroy a ton of the arrays, and is likely to be slow.
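
For the counting case specifically, a single pass with collections.Counter may be enough, without any vector type. A minimal sketch with made-up sample data:

```python
from collections import Counter

l = ['aa', 'bb', 'ca', 'de', 'ab']  # hypothetical sample data

# One pass over the list, tallying items by their first character.
counts = Counter(x[0] for x in l)

# Missing keys count as zero, so no special-casing is needed.
count_a, count_b, count_c = counts['a'], counts['b'], counts['c']
```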








Re: Diff of Text

2010-06-05 Thread GZ
Hi Lie,

On Jun 5, 2:53 am, Lie Ryan  wrote:
> On 06/05/10 15:43, GZ wrote:
>
> > On Jun 4, 8:37 pm, Lie Ryan  wrote:
> >> On 06/05/10 07:51, GZ wrote:
> >>> No, rsync does not solve my problem.
>
> >>> I want a library that does unix 'diff' like function, i.e. compare two
> >>> strings line by line and output the difference. Python's difflib does
> >>> not work perfectly for me, because the resulting differences are
> >>> pretty big. I would like an algorithm that generates the smallest
> >>> differences.
>
> >> is n=0 not short enough?
>
> >> pprint.pprint(list(difflib.context_diff(s, t, n=0)))
>
> > This still does not do what I want it to do. It only displays the diff
> > results in a different format. I want a different algorithm to
> > generate a smaller diff -- in other words less differences
>
> No, I meant I was confirming that you already have turned off context
> lines (i.e. the n=0 part), right?
>
> Also, what's the nature of the changes? You might be able to minimize
> difflib's output by using word-based or character-based diff-ing instead
> of traditional line-based diff-ing.
>
> diff output is fairly compressible, so you might want to look at zipping
> the output.

Thanks for your response.

The verboseness of the format is not really my problem and I only care
about line by line comparison for now.

Let me think of a better way to express what I mean by a "smaller
diff." After I diff the two strings, I will have something like this:

  AAA
- BBB
+ CCC
+ DDD
- EEE

It means the first line does not change, the second line is replaced
by the third line, the fourth line is new, and the fifth line is
deleted.

I define the "smallness" of the diff algorithm as "the sum of the
total number of minuses and pluses". In my above example, it is 4 (two
minuses and 2 pluses). Note that no matter what format we use to
represent the diff, this number is the same.

Python's difflib does not really minimize this number. It tries to
make this number small, but also tries to yield matches that “look
right” to people at the cost of increasing this number
(http://docs.python.org/library/difflib.html).

What I am looking for is an algo that can really minimize this number.
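
Under the metric defined above (the total number of pluses and minuses), a minimal diff can be derived from a longest common subsequence of the two line lists. The sketch below uses the textbook dynamic-programming table; minimal_diff is an illustrative name, not a library function, and the approach needs time and memory proportional to len(old) * len(new).

```python
def minimal_diff(old, new):
    """Return (tag, line) pairs with the fewest possible '+' and '-' entries."""
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:], new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the top-left corner, emitting one entry per step.
    out = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:
            out.append((' ', old[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            out.append(('-', old[i]))
            i += 1
        else:
            out.append(('+', new[j]))
            j += 1
    out.extend(('-', line) for line in old[i:])
    out.extend(('+', line) for line in new[j:])
    return out


diff = minimal_diff(['AAA', 'BBB', 'EEE'], ['AAA', 'CCC', 'DDD'])
changes = sum(1 for tag, _ in diff if tag != ' ')
```

The number of changed lines is len(old) + len(new) minus twice the LCS length, which is the smallest value this metric can take.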


Re: Diff of Text

2010-06-05 Thread GZ
On Jun 5, 8:42 pm, Ben Finney  wrote:
> GZ  writes:
> > Let me think of a better way to express what I mean by a "smaller
> > diff." After I diff the two strings, I will have something like this:
>
> >   AAA
> > - BBB
> > + CCC
> > + DDD
> > - EEE
>
> > It means the first line does not change, the second line is replaced
> > by the third line, the fourth line is new, and the fifth line is
> > deleted.
>
> Are you drawing a distinction between:
>
>   * “line FOO is replaced by line BAR”
>   * “line FOO is deleted, line BAR is added”
>
> Your wording seems to make that distinction, but I don't see how it's
> useful or meaningful in a discussion about diff. Are they not exactly
> the same?
>
> --
>  \     “Injustice is relatively easy to bear; what stings is justice.” |
>   `\                                                 —Henry L. Mencken |
> _o__)                                                                  |
> Ben Finney

I should distinguish between modifications and additions. In my above
example, one line is modified/replaced, one line is added and one line
is deleted. There are a total of 3 edits. I am looking for an
alternative python library other than difflib that minimizes this
number (edit distance).


Sequential Object Store

2010-08-07 Thread GZ
Hi All,

I need to store a large number of large objects to file and then
access them sequentially. I am talking about a few thousand
objects, each a few hundred kilobytes in size, with a total file
size of a few gigabytes. I tried shelve, but it is not good at
sequentially accessing the data. In essence, shelve.keys() takes
forever.

I am wondering if there is a module that can persist a stream of
objects without having to load everything into memory. (For this
reason, I think Pickle is out, too, because it needs everything to be
in memory.)

Thanks,
GZ


Re: Sequential Object Store

2010-08-09 Thread GZ
Hi Alex,

On Aug 7, 6:54 pm, Alex Willmer  wrote:
> On Aug 7, 5:26 pm, GZ  wrote:
>
> > I am wondering if there is a module that can persist a stream of
> > objects without having to load everything into memory. (For this
> > reason, I think Pickle is out, too, because it needs everything to be
> > in memory.)
>
> From the pickle docs it looks like you could do something like:
>
> try:
>     import cPickle as pickle
> except ImportError:
>     import pickle
>
> file_obj = open('whatever', 'wb')
> p = pickle.Pickler(file_obj)
>
> for x in stream_of_objects:
>     p.dump(x)
>     p.memo.clear()
>
> del p
> file_obj.close()
>
> then later
>
> file_obj = open('whatever', 'rb')
> p = pickle.Unpickler(file_obj)
>
> while True:
>     try:
>         x = p.load()
>         do_something_with(x)
>     except EOFError:
>         break
>
> Your loading loop could be wrapped in a generator function, so only
> one object should be held in memory at once.

This totally works!

Thanks!
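
The generator wrapping mentioned at the end of the quoted reply could look roughly like this in Python 3. dump_stream and load_stream are names made up for the sketch.

```python
import pickle


def dump_stream(path, objects):
    """Pickle objects one after another into a single file."""
    with open(path, 'wb') as file_obj:
        pickler = pickle.Pickler(file_obj, pickle.HIGHEST_PROTOCOL)
        for obj in objects:
            pickler.dump(obj)
            # Forget already-written objects so the memo does not keep
            # references to them and grow without bound.
            pickler.clear_memo()


def load_stream(path):
    """Yield the pickled objects back one at a time."""
    with open(path, 'rb') as file_obj:
        unpickler = pickle.Unpickler(file_obj)
        while True:
            try:
                yield unpickler.load()
            except EOFError:
                return  # end of file: stop the generator
```

A loop such as `for obj in load_stream('whatever'):` then holds only one loaded object at a time, provided the caller does not keep references to earlier ones.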


minidom help -- line number

2010-08-13 Thread GZ
Hi All,

I am writing a little program that reads the minidom tree built from
an xml file. I would like to print out the line number of the xml file
for the parts of the tree that are not valid. But I cannot find
a way to map minidom nodes to line numbers.

Can anyone give me some help?

Thanks,
gz


Re: minidom help -- line number

2010-08-14 Thread GZ
On Aug 14, 12:07 pm, Thomas Jollans  wrote:
> On Saturday 14 August 2010, it occurred to GZ to exclaim:
>
> > Hi All,
>
> > I am writing a little program that reads the minidom tree built from
> > an xml file. I would like to print out the line number of the xml file
> > on the parts of the tree that are not valid. But I do not seem to find
> > a way to correspond minidom nodes to line numbers.
>
> The DOM does not contain things like line number information. You work with
> the structure of the document, not the appearance of the file you happen to be
> using as a source. You can't use line numbers with minidom.
>
> For stream-based parsers like SAX and eXpat (both in the standard library)
> this makes more sense, and they both allow you to check the current line
> number in one way or another.

So I am basically out of luck if I want to tie back to the original
file for file error reporting etc. sounds like a deficiency to me.
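
For reference, the SAX route mentioned in the quoted reply can be sketched as follows, written for Python 3. LineRecorder and the sample document are made up for the example; setDocumentLocator and getLineNumber are part of the standard xml.sax interface.

```python
import xml.sax


class LineRecorder(xml.sax.ContentHandler):
    """Record the line number reported when each start tag is seen."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.locator = None
        self.lines = []

    def setDocumentLocator(self, locator):
        # The parser hands over a locator before parsing starts.
        self.locator = locator

    def startElement(self, name, attrs):
        self.lines.append((name, self.locator.getLineNumber()))


handler = LineRecorder()
xml.sax.parseString(b"<root>\n  <item/>\n  <item/>\n</root>", handler)
```

After parsing, handler.lines holds one (tag name, line number) pair per element, which can be used in error messages or attached to a tree built by hand.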