Re: [Tutor] Replacement for __del__

2009-05-19 Thread A.T.Hofkamp

Strax-Haber, Matthew (LARC-D320) wrote:

A folder is created during object instantiation. This is necessary because
multiple other methods depend on the existence of that folder, and in the
future things may be running in parallel so it has to be there for the
entire life of the object. Before the object is deleted, I want that temp
folder to be deleted (ala shutil.rmtree). Is there a better way to do this?


It sounds like the object has 2 functions, namely managing your temp disk 
space, and the functionality of whatever the other methods of the object do.


I would suggest splitting the temp disk space management off into a separate function 
that acts as a kind of main function:


def main():
  # make temp disk space

  obj = YourObject()
  obj.doit()  # Run the application

  # Application is finished here, cleanup the disk space.

This gives you explicit control when the disk space is cleared.

For a multi-threaded app you may want to wait somewhere until all threads are 
finished before proceeding to clean up the disk.
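
A minimal sketch of that split, assuming Python 3 and the standard tempfile and shutil modules. YourObject, its workdir argument and doit() are placeholders standing in for the original poster's class, not its real interface:

```python
import os
import shutil
import tempfile


class YourObject(object):
    """Hypothetical stand-in for the real class; it only uses the folder."""

    def __init__(self, workdir):
        self.workdir = workdir

    def doit(self):
        # Pretend to do some work that needs the temp folder.
        path = os.path.join(self.workdir, "scratch.txt")
        with open(path, "w") as f:
            f.write("intermediate data")
        return path


def main():
    workdir = tempfile.mkdtemp()      # make temp disk space
    try:
        obj = YourObject(workdir)
        return obj.doit()             # run the application
    finally:
        # Runs on normal exit and on exceptions alike.
        shutil.rmtree(workdir)
```

The object no longer owns the folder, so nothing depends on when (or whether) it gets garbage collected.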



Regardless of whether I should be using __del__, can anyone explain this to
me (to satisfy curiosity):
My class sub-types dict. In __init__, there is a call to
self.update(pickle.load()). For some reason, this triggers
__del__. Why?


Somewhere an instance is created and garbage collected.

This is why you really don't want to use __del__. It is a very good way to make 
your app behave strangely and give you weeks of debugging fun :p
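
One plausible way this happens, sketched under the assumption that the pickle holds an instance of the same class (the original post does not say). pickle.load() builds a new instance without calling __init__, self.update() copies its items, and the temporary instance is then discarded, which runs __del__. The Store class and the deleted list are made up for the illustration:

```python
import pickle

deleted = []   # records every __del__ call, for illustration only


class Store(dict):
    def __init__(self, data=None):
        dict.__init__(self)
        if data is not None:
            # pickle.loads() returns a *second* Store instance here;
            # nothing keeps a reference to it once update() is done.
            self.update(pickle.loads(data))

    def __del__(self):
        deleted.append(id(self))


original = Store()
original["key"] = "value"
data = pickle.dumps(original)

copy = Store(data)   # the temporary unpickled Store dies inside __init__
```

On CPython, with its reference counting, the deleted list should hold one entry right after the last line, although neither original nor copy has gone away.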



Albert

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] performance loss -- profiling

2009-05-19 Thread spir
On Mon, 18 May 2009 23:16:13 +0100,
"Alan Gauld" wrote:

> "spir"  wrote
> 
> > Also, it's the first time I really have to cope with machine time; 
> > so I'm totally new to techniques like using a profiler. 
> > Any hints on the topic are heartily welcome :-)
> 
> Profilers are a bit like debuggers. Very useful when needed 
> but usually a point of last resort.
> 
> First, what OS are you on?
> Linux/Unix has a host of tools that can help pinpoint the causes, 
> such as whether there is excess disk I/O or memory use or 
> if it's another process stealing CPU resource.
> 
> Windows can do these things too but the tools and techniques 
> are totally different - and require a bit more experience to 
> interpret IMHO.
> 
> But my first bet would be to check the CPU and memory usage 
> to see if either is high. Also, if it uses the network to go to 
> a server (databases? files?), then check network traffic.
> 
> If you can spot the nature of the hot spot you can often guess 
> where in the code the issues are likely to be.
> 
> Other common causes to consider are:
> - Locking of files? ie concurrent access issues?
> - Queuing for shared resources? Have you started running multiple 
>   instances?
> - data base query time - have you increased the volume of 
>   data being searched? Did you index the key columns?
> - Directory path traversal - have you moved the location of 
>   any key files? Either the executables/scripts or the data?
> 
> Just some ideas before you resort to profiling.

Thank you for all these tips, Alan.
I've found the naughty bug.
[It brings another question: see separate thread.]

For the record, the app in question is a parsing and processing tool (see 
http://spir.wikidot.com/pijnu). I've paid a lot of attention to usability, esp. 
feedback to the developer. To help users fix grammars, which is always a difficult 
job, parse errors are very complete and even include a visual pointer to the 
failure location in the source text.
The main trick is that "wrapper" patterns also report errors from sub-patterns. 
For instance, a sequence will tell the user which pattern in the sequence has 
failed and add this pattern's own error to the output; a choice will report why 
_each_ pattern in the choice has failed. To do this, wrapper patterns have to 
record sub-pattern errors: _my_ error was that they recorded the error message 
(which is complicated to create) instead of a reference to the error object. 
Error objects correctly don't write their message until it is requested -- 
but wrapper patterns asked them to do it.
As a consequence, the app spent ~98% of its time writing error messages that 
never got printed ;-) My biggest test's running time had risen from ~1.5s to 
more than 1 min!
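
The fix can be sketched as follows, with made-up class names (ParseError, SequenceError) that are not pijnu's real API. The wrapper keeps a reference to the sub-pattern's error object and only asks for its text when its own message is requested:

```python
class ParseError(object):
    """Hypothetical error object whose message is built on demand."""

    builds = 0   # counts how often the costly message is produced

    def __init__(self, pattern, source, pos):
        self.pattern, self.source, self.pos = pattern, source, pos

    def message(self):
        ParseError.builds += 1
        pointer = " " * self.pos + "^"
        return "%s failed here:\n%s\n%s" % (self.pattern, self.source, pointer)


class SequenceError(ParseError):
    """Wrapper error: stores the sub-error itself, not its message."""

    def __init__(self, pattern, source, pos, sub_error):
        ParseError.__init__(self, pattern, source, pos)
        self.sub_error = sub_error          # cheap: just a reference

    def message(self):
        own = ParseError.message(self)
        return own + "\ncaused by: " + self.sub_error.message()


inner = ParseError("digit", "12a4", 2)
outer = SequenceError("number", "12a4", 2, inner)
```

Creating inner and outer builds no text at all; the cost is only paid if outer.message() is actually called.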

Denis
--
la vita e estrany


[Tutor] clean text

2009-05-19 Thread spir
Hello,

This is a follow-up to the post on performance issues.
Using a profiler, I realized that inside error message creation, most of the 
time was spent in a tool func used to clean up source text output.
The issue is that when the source text holds control chars such as \n, then the 
error message is hard to read. My solution is to replace such chars with 
their repr():

def _cleanRepr(text):
    ''' text with control chars replaced by repr() equivalent '''
    result = ""
    for char in text:
        n = ord(char)
        if (n < 32) or (n > 126 and n < 160):
            char = repr(char)[1:-1]
        result += char
    return result

For some reason, this func is extremely slow. While the rest of error message 
creation looks very complicated, this seemingly innocent function consumes > 90% 
of the time. I cannot simply use repr(text), because repr will replace all 
non-ASCII characters. I need to replace only control characters.
How else could I do that?

Denis
--
la vita e estrany


Re: [Tutor] Replacement for __del__

2009-05-19 Thread spir
On Tue, 19 May 2009 09:12:34 +0200,
"A.T.Hofkamp" wrote:

> A folder is created during object instantiation.

Do you mean a filesystem directory? I may be wrong, but it seems there is a bit 
of confusion between things and their representation as Python objects.

You shouldn't call __del__, i.e. try to do garbage collection yourself instead of 
letting Python do it. This is for the Python object side. But I think (can someone 
confirm or contradict?) there is no harm in overloading a class's __del__ to 
accomplish additional tasks such as deleting temp disk space/dirs/files that 
need to exist only during object creation.
If this looks like what you intend to do, then there are alternatives such as 
using "try...finally" to ensure cleanup (also "with"?).
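
For the "with" variant, a small sketch assuming Python 3 and the standard contextlib module; temp_folder is a made-up helper name:

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temp_folder():
    """Create a temporary directory and remove it when the block ends."""
    path = tempfile.mkdtemp()
    try:
        yield path
    finally:
        shutil.rmtree(path)


with temp_folder() as folder:
    existed_inside = os.path.isdir(folder)   # folder is usable here
exists_after = os.path.isdir(folder)         # already removed here
```

Python 3.2 and later also ship tempfile.TemporaryDirectory, which can be used in a with statement in the same way.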

Sorry if I misunderstood.

Denis
--
la vita e estrany


Re: [Tutor] performance loss -- profiling

2009-05-19 Thread Kent Johnson
On Mon, May 18, 2009 at 2:49 PM, spir  wrote:
> Hello,
>
> I have a big performance problem with an app I'm currently working on.
> It "suddenly" runs at least 5 times slower than it used to. The issue is that, 
> being in a finalization phase, I'm touching things here and there all the 
> time. But the performance change is visible only when running on big test data, 
> which I haven't done for a few days.
>
> As a consequence I'm unable to find what bug I've introduced that causes such 
> a change. If you know of any rational method to diagnose such problems, thank 
> you very much.
>
> Also, it's the first time I really have to cope with machine time; so I'm 
> totally new to techniques like using a profiler. Any hints on the topic are 
> heartily welcome :-)

Are you using a source code control system? If so, you can back out
changes and see where the performance changes. Some SCCS will even
help you with this, for example the Mercurial bisect command automates
a binary search through the repository changesets.

The Python profiler is not hard to run. Interpreting the results is
more difficult :-) See the docs to get started:
http://docs.python.org/library/profile.html

Because profiling can slow your code down considerably, if you have a
general idea where the slowdown is, you can just profile that.

The output of the profiler is a file which you can load in an
interactive session using pstats.Stats. You will want to sort the
stats by cumulative time and print. Look for something that is taking
an unexpectedly large amount of the time, or that is being called more
than expected.
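
A short sketch of those steps, assuming Python 3; slow_function is a made-up workload. Instead of going through a stats file, it hands the Profile object straight to pstats.Stats, which accepts either:

```python
import cProfile
import io
import pstats


def slow_function():
    """Made-up workload so there is something to measure."""
    return sum(i * i for i in range(100000))


profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("cumulative").print_stats(10)   # ten most expensive entries
report = out.getvalue()
print(report)
```

In the printed table, the "cumtime" column is the cumulative time and "ncalls" shows how often each function was called.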

RunSnakeRun looks like a useful program for visualizing the profile results:
http://www.vrplumber.com/programming/runsnakerun/

PProfUI is a simpler profile viewer that I have found useful:
http://webpages.charter.net/erburley/pprofui.html

Once you find a hot spot, you have to figure out why it is hot. Is it
being called too much? Is there a better algorithm or data structure?
The profiler won't help with this part. line_profiler might be useful,
I haven't used it:
http://packages.python.org/line_profiler/

Kent


Re: [Tutor] clean text

2009-05-19 Thread spir
On Tue, 19 May 2009 11:36:17 +0200,
spir wrote:

> Hello,
> 
> This is a follow-up to the post on performance issues.
> Using a profiler, I realized that inside error message creation, most of
> the time was spent in a tool func used to clean up source text output. The
> issue is that when the source text holds control chars such as \n, then the
> error message is hard to read. My solution is to replace such chars with
> their repr():
> 
> def _cleanRepr(text):
>     ''' text with control chars replaced by repr() equivalent '''
>     result = ""
>     for char in text:
>         n = ord(char)
>         if (n < 32) or (n > 126 and n < 160):
>             char = repr(char)[1:-1]
>         result += char
>     return result

Changed to:

def _cleanRepr(text):
    ''' text with control chars replaced by repr() equivalent '''
    chars = []
    for char in text:
        n = ord(char)
        if (n < 32) or (n > 126 and n < 160):
            char = repr(char)[1:-1]
        chars.append(char)
    return ''.join(chars)

But what else can I do?

Denis
--
la vita e estrany


Re: [Tutor] clean text

2009-05-19 Thread Kent Johnson
On Tue, May 19, 2009 at 5:36 AM, spir  wrote:
> Hello,
>
> This is a follow-up to the post on performance issues.
> Using a profiler, I realized that inside error message creation, most of the 
> time was spent in a tool func used to clean up source text output.
> The issue is that when the source text holds control chars such as \n, then 
> the error message is hard to read. My solution is to replace such chars 
> with their repr():
>
> def _cleanRepr(text):
>        ''' text with control chars replaced by repr() equivalent '''
>        result = ""
>        for char in text:
>                n = ord(char)
>                if (n < 32) or (n > 126 and n < 160):
>                        char = repr(char)[1:-1]
>                result += char
>        return result
>
> For some reason, this func is extremely slow. While the rest of error message 
> creation looks very complicated, this seemingly innocent function consumes 
> > 90% of the time. I cannot simply use repr(text), because repr will replace 
> all non-ASCII characters. I need to replace only control characters.
> How else could I do that?

I would get rid of the calls to ord() and repr() to start. There is no
need for ord() at all, just compare characters directly:
    if char < '\x20' or '\x7e' < char < '\xa0':

To eliminate repr() you could create a dict mapping chars to their
repr and look up in that.

You should also look for a way to get the loop into C code. One way to
do that would be to use a regex to search and replace. The replacement
pattern in a call to re.sub() can be a function call; the function can
do the dict lookup. Here is something to try:

import re

# Create a dictionary of replacement strings
repl = {}
for i in range(0, 32) + range(127, 160):
    c = chr(i)
    repl[c] = repr(c)[1:-1]

def sub(m):
    ''' Helper function for re.sub(). m will be a Match object. '''
    return repl[m.group()]

# Regex to match control chars
controlsRe = re.compile(r'[\x00-\x1f\x7f-\x9f]')

def replaceControls(s):
    ''' Replace all control chars in s with their repr() '''
    return controlsRe.sub(sub, s)


for s in [
    'Simple test',
    'Harder\ntest',
    'Final\x00\x1f\x20\x7e\x7f\x9f\xa0test'
]:
    print replaceControls(s)


Kent


Re: [Tutor] clean text

2009-05-19 Thread A.T.Hofkamp

spir wrote:

def _cleanRepr(text):
    ''' text with control chars replaced by repr() equivalent '''
    chars = []
    for char in text:
        n = ord(char)
        if (n < 32) or (n > 126 and n < 160):
            char = repr(char)[1:-1]
        chars.append(char)
    return ''.join(chars)

But what else can I do?


You seem to break down the string to single characters, replace a few of them, 
and then build the whole string back.


Maybe you can insert larger chunks of text that do not need modification, ie 
something like


def _cleanRepr(text):
    chars = []
    start = 0
    # enumerate() supplies the index needed for slicing
    for idx, char in enumerate(text):
        n = ord(char)
        if n < 32 or 126 < n < 160:
            chars.append(text[start:idx])
            chars.append(repr(char)[1:-1])
            start = idx + 1
    chars.append(text[start:])
    return ''.join(chars)


An alternative to the above is to keep track of the first occurrence of each 
of the chars you want to split on (after some 'start' position), and compute 
the next point to break the string as the min of all those positions instead 
of slowly 'walking' to it by testing each character separately.


That would reduce the number of iterations you do in the loop, at the cost of 
maintaining a large number of positions of the next breaking point.
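
A rough sketch of that alternative, assuming Python 3. The names CONTROL_MAP and clean_by_next_break are made up, and str.find() does the search for the next occurrence of each character:

```python
# Precomputed replacement for every control character (illustrative table).
CONTROL_MAP = dict((chr(n), repr(chr(n))[1:-1])
                   for n in list(range(0, 32)) + list(range(127, 160)))


def clean_by_next_break(text, replacements=CONTROL_MAP):
    # Position of the next occurrence of each char that occurs at all.
    next_pos = {}
    for ch in replacements:
        pos = text.find(ch)
        if pos != -1:
            next_pos[ch] = pos

    chunks = []
    start = 0
    while next_pos:
        # The next breaking point is the smallest recorded position.
        ch = min(next_pos, key=next_pos.get)
        idx = next_pos[ch]
        chunks.append(text[start:idx])       # untouched chunk
        chunks.append(replacements[ch])      # escaped control char
        start = idx + 1
        pos = text.find(ch, start)           # look for its next occurrence
        if pos == -1:
            del next_pos[ch]
        else:
            next_pos[ch] = pos
    chunks.append(text[start:])
    return ''.join(chunks)
```

Whether this beats the per-character loop depends on how many distinct control characters the text contains, so it is worth timing on real data.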



Albert


Re: [Tutor] performance loss -- profiling

2009-05-19 Thread Tim Golden

Kent Johnson wrote:

The Python profiler is not hard to run. Interpreting the results is
more difficult :-) See the docs to get started:
http://docs.python.org/library/profile.html


Also, it's quite useful to run it as a module:

 python -mcProfile 

You have a certain amount of configurability via
the -s param.

TJG


Re: [Tutor] clean text

2009-05-19 Thread Sander Sweers
2009/5/19 spir :
> def _cleanRepr(text):
>        ''' text with control chars replaced by repr() equivalent '''
>        result = ""
>        for char in text:
>                n = ord(char)
>                if (n < 32) or (n > 126 and n < 160):
>                        char = repr(char)[1:-1]
>                result += char
>        return result

Is this faster? It replaces all occurrences of each control character
in ctrlcmap for the whole string you pass to it.

ctrlcmap = dict((chr(x), repr(chr(x))[1:-1]) for x in range(160)
                if x < 32 or x > 126 and x < 160)
teststring = chr(127) + 'mytestring'

def _cleanRepr(text, ctrlcmap):
    for ctrlc in ctrlcmap.keys():
        text = text.replace(ctrlc, ctrlcmap[ctrlc])
    return text

>>> teststring
'\x7fmytestring'

>>> _cleanRepr(teststring, ctrlcmap)
'\\x7fmytestring'

Greets
Sander


Re: [Tutor] clean text

2009-05-19 Thread Kent Johnson
By the way, the timeit module is very helpful for comparing the speed
of different implementations of an algorithm such as are being
presented in this thread. You can find examples in the list archives:
http://search.gmane.org/?query=timeit&group=gmane.comp.python.tutor
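
For instance, a comparison could be set up roughly like this, assuming Python 3. The two functions and the sample string are illustrative stand-ins, not the exact code posted in this thread:

```python
import re
import timeit

CONTROL_MAP = dict((chr(n), repr(chr(n))[1:-1])
                   for n in list(range(0, 32)) + list(range(127, 160)))
CONTROL_RE = re.compile(r'[\x00-\x1f\x7f-\x9f]')


def clean_loop(text):
    # Python-level loop with a dict lookup per character.
    return ''.join(CONTROL_MAP.get(c, c) for c in text)


def clean_regex(text):
    # The regex engine does the scanning; Python runs only on matches.
    return CONTROL_RE.sub(lambda m: CONTROL_MAP[m.group()], text)


sample = "line one\nline two\tend\x7f" * 20

loop_time = timeit.timeit(lambda: clean_loop(sample), number=200)
regex_time = timeit.timeit(lambda: clean_regex(sample), number=200)
print("loop:  %.4f s" % loop_time)
print("regex: %.4f s" % regex_time)
```

The numbers depend on the machine and on the mix of characters in the sample, so it pays to time with strings that resemble the real input.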

Kent


[Tutor] odbc connection with python

2009-05-19 Thread mustafa akkoc
How can I make an ODBC connection from Python? I also want to build a GUI
project after connecting to the database. Does anyone have documentation?

-- 
Mustafa Akkoc


Re: [Tutor] clean text

2009-05-19 Thread python
Denis,

Untested idea: 

1. Fill a dict with pre-calculated repr() values for chars you want to
replace (replaceDict)

2. Create a set() of chars that you want to replace (replaceSet).

3. Replace if (n < 32) ... test with if char in replaceSet

4. Lookup the replacement via replaceDict[ char ] vs. calculating via
repr()

5. Have result = list(), then replace result += char with result.append(
char )

6. Return ''.join( result )

Does this help?
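
The six steps above could look roughly like this, assuming Python 3; apart from replaceDict and replaceSet, the names are invented for the sketch:

```python
# Steps 1 and 2: precomputed replacements and the set of chars to replace.
CONTROL_CODES = list(range(0, 32)) + list(range(127, 160))
replaceDict = dict((chr(n), repr(chr(n))[1:-1]) for n in CONTROL_CODES)
replaceSet = set(replaceDict)


def cleanRepr(text):
    result = []                              # step 5: collect pieces in a list
    for char in text:
        if char in replaceSet:               # step 3: set membership test
            char = replaceDict[char]         # step 4: lookup instead of repr()
        result.append(char)
    return ''.join(result)                   # step 6
```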

Malcolm


Re: [Tutor] odbc connection with python

2009-05-19 Thread Emile van Sebille

On 5/19/2009 5:47 AM mustafa akkoc said...
how can i make odbc connection language and i wanna make gui project 
after connecting database anyone has document ? 



There's an odbc module for Python (it ships with the pywin32 extensions rather 
than the standard library).  I'd start with the docs on that and 
then google 'python odbc example' for more info and examples.


Emile



Re: [Tutor] Replacement for __del__

2009-05-19 Thread David Stanek
On Tue, May 19, 2009 at 5:55 AM, spir  wrote:
> On Tue, 19 May 2009 09:12:34 +0200,
> "A.T.Hofkamp" wrote:
>
>> A folder is created during object instantiation.
>
> Do you mean a filesystem directory? I may be wrong, but it seems there is a 
> bit of confusion between things and their representation as Python objects.
>
> You shouldn't call __del__, i.e. try to do garbage collection yourself instead 
> of letting Python do it. This is for the Python object side.

Correct.

> But I think (someone confirms/contradicts?) there is no harm in overloading a 
> class's __del__ to accomplish additional tasks such as deleting temp disk 
> space/dirs/files that need to exist only during object creation.

Incorrect. I usually tell people to never supply a __del__
implementation. By the time it's executed many of the resources that
you want to clean up could have already been garbage collected. To
protect against this I see people doing things like::

    def __del__(self):
        try:
            self.f.close()
        except:
            pass

To me this is just silly.

-- 
David
blog: http://www.traceback.org
twitter: http://twitter.com/dstanek


Re: [Tutor] clean text

2009-05-19 Thread Lie Ryan

spir wrote:

> Hello,
> 
> This is a follow-up to the post on performance issues.
> Using a profiler, I realized that inside error message creation, most of the 
> time was spent in a tool func used to clean up source text output.
> The issue is that when the source text holds control chars such as \n, then the 
> error message is hard to read. My solution is to replace such chars with 
> their repr():
> 
> def _cleanRepr(text):
>     ''' text with control chars replaced by repr() equivalent '''
>     result = ""
>     for char in text:
>         n = ord(char)
>         if (n < 32) or (n > 126 and n < 160):
>             char = repr(char)[1:-1]
>         result += char
>     return result
> 
> For some reason, this func is extremely slow. While the rest of error message 
> creation looks very complicated, this seemingly innocent function consumes 
> > 90% of the time. I cannot simply use repr(text), because repr will replace 
> all non-ASCII characters. I need to replace only control characters.
> How else could I do that?
> 
> Denis



If you're using Python 3.x, how about using str.translate()?

In Python 3.x, str.translate() accepts a dictionary argument which can do 
a single-char to multi-char replacement.


controls = list(range(0, 32)) + list(range(127, 160))
table = {char: repr(chr(char))[1:-1] for char in controls}

def _cleanRepr(text):
    return text.translate(table)




Re: [Tutor] clean text

2009-05-19 Thread spir
On Tue, 19 May 2009 11:36:17 +0200,
spir wrote:

[...]

Thank you Albert, Kent, Sander, Lie, Malcolm.

This time regex wins! Thought it wouldn't because of the additional func call 
(too bad we cannot pass a mapping to re.sub). Actually the diff. is very small 
;-) The relevant  change is indeed using a dict.
Replacing string concat with ''.join() is slower (tested with 10 times and 100 
times bigger strings too). Strange...
Membership test in a set is only very slightly faster than in dict keys.

I did a test with random strings of typical length for my app. Timing is ~ 
stable.

===
### various funcs ###
# original
def cleanRepr0(text):
    ''' text with control chars replaced by repr() equivalent '''
    chars = ""
    for char in text:
        n = ord(char)
        if (n < 32) or (n > 126 and n < 160):
            char = repr(char)[1:-1]
        chars += char
    return chars

# use list
def cleanRepr1(text):
    chars = []
    for char in text:
        n = ord(char)
        if (n < 32) or (n > 126 and n < 160):
            char = repr(char)[1:-1]
        chars.append(char)
    return ''.join(chars)

control_chars = set( chr(n) for n in (range(0, 32) + range(127, 160)) )
control_char_map = dict( (c, repr(c)[1:-1]) for c in control_chars )

# use map
def cleanRepr2(text):
    chars = ""
    for char in text:
        if char in control_char_map:
            char = control_char_map[char]
        chars += char
    return chars

# use map & set
def cleanRepr3(text):
    chars = []
    for char in text:
        if char in control_chars:
            char = control_char_map[char]
        chars.append(char)
    return ''.join(chars)

# same with string concatenation; this later definition is the one timed
def cleanRepr3(text):
    chars = ""
    for char in text:
        if char in control_chars:
            char = control_char_map[char]
        chars += char
    return chars

import re
controlsRe = re.compile(r'[\x00-\x1f\x7f-\x9f]')

# use regex
def substChar(m):
    ''' Helper function for re.sub(). m will be a Match object. '''
    return control_char_map[m.group()]

def cleanRepr4(text):
    return controlsRe.sub(substChar, text)


### timing ###
# helper func to generate random string
from time import time
import random

def randomString():
    count = random.randrange(11,111)
    chars = [chr(random.randrange(1, 255)) for n in range(count)]
    return ''.join(chars)

def timeAll():
    t0=t1=t2=t3=t4=0
    for n in range():
        s = randomString()
        t = time() ; cleanRepr0(s) ; t0 += time() - t
        t = time() ; cleanRepr1(s) ; t1 += time() - t
        t = time() ; cleanRepr2(s) ; t2 += time() - t
        t = time() ; cleanRepr3(s) ; t3 += time() - t
        t = time() ; cleanRepr4(s) ; t4 += time() - t
    print ( "original:   %.3f\n"
            "list:       %.3f\n"
            "map:        %.3f\n"
            "map & set:  %.3f\n"
            "regex:      %.3f\n"
            %(t0,t1,t2,t3,t4) )

timeAll()
===
==>
original:   0.692
list:       0.829
map:        0.364
map & set:  0.349
regex:      0.341
===

Denis
--
la vita e estrany


Re: [Tutor] clean text

2009-05-19 Thread python
Denis,

Thank you for sharing your detailed analysis with the list.

I'm glad I didn't bet money on the winner :)  ... I'm just as surprised
as you that the regex solution was the fastest.

Malcolm



Re: [Tutor] clean text

2009-05-19 Thread Emile van Sebille

On 5/19/2009 10:19 AM spir said...

On Tue, 19 May 2009 11:36:17 +0200,
spir wrote:

[...]

Thank you Albert, Kent, Sander, Lie, Malcolm.

This time regex wins! Thought it wouldn't because of the additional func call 
(too bad we cannot pass a mapping to re.sub). Actually the diff. is very small 
;-) The relevant  change is indeed using a dict.
Replacing string concat with ''.join() is slower (tested with 10 times and 100 
times bigger strings too). Strange...
Membership test in a set is only very slightly faster than in dict keys.


Hmm... this seems faster assuming it does the same thing...

xlate = dict( (chr(c),chr(c)) for c in range(256))
xlate.update(control_char_map)

def cleanRepr5(text):
    return "".join([ xlate[c] for c in text ])


Emile







Re: [Tutor] clean text

2009-05-19 Thread spir
On Tue, 19 May 2009 10:49:15 -0700,
Emile van Sebille wrote:

> On 5/19/2009 10:19 AM spir said...
> > On Tue, 19 May 2009 11:36:17 +0200,
> > spir wrote:
> > 
> > [...]
> > 
> > Thank you Albert, Kent, Sander, Lie, Malcolm.
> > 
> > This time regex wins! Thought it wouldn't because of the additional func
> > call (too bad we cannot pass a mapping to re.sub). Actually the diff. is
> > very small ;-) The relevant  change is indeed using a dict. Replacing
> > string concat with ''.join() is slower (tested with 10 times and 100
> > times bigger strings too). Strange... Membership test in a set is only
> > very slightly faster than in dict keys.
> 
> Hmm... this seems faster assuming it does the same thing...
> 
> xlate = dict( (chr(c),chr(c)) for c in range(256))
> xlate.update(control_char_map)
> 
> def cleanRepr5(text):
>  return "".join([ xlate[c] for c in text ])
> 
> 
> Emile

Thank you, Emile.
I thought about this solution (having a dict for all chars). But I cannot use it 
because later I will extend the app to cope with unicode (~ 100,000 chars). So 
I really need to filter which chars have to be converted.
A useful help I guess would be to have a builtin func that returns the 
conventional char/string repr without "'...'" around.

Denis

PS
By the way, you don't need (anymore) to build a list comprehension for an outer 
func that walks through a sequence:
   "".join( xlate[c] for c in text )
is a shortcut for
   "".join( (xlate[c] for c in text) )
[a generator expression already inside () needs no additional parens -- as long 
as there is no additional arg -- see PEP 289 
http://www.python.org/dev/peps/pep-0289/]
--
la vita e estrany
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Replacement for __del__

2009-05-19 Thread Alan Gauld


"David Stanek"  wrote 

>> But I think (someone confirms/contradicts?) there is no harm 
>> in overloading a class's __del__ to accomplish additional tasks 
>> such as deleting temp disk space/dirs/files that need to exist 
>> only during object creation.
>
> Incorrect. 


Why?


> I usually tell people to never supply a __del__
> implementation. By the time it's executed many of the resources that
> you want to clean up could have already been garbage collected. 


True, but the things Denis refers to - temp disk space etc - will 
never be garbage collected. The GC only tidies up the stuff in 
memory.


If the object, as seems to be the case here, creates a file on disk 
which it uses as a semaphore/lock then GC will leave that file 
on the disk. A __del__ would enable it to be removed.



> def __del__(self):
>     try:
>         self.f.close()
>     except:
>         pass


This is, I agree, pointless, but very different to:

def __del__(self):
    try:
        os.remove(self.lockname)
    except:
        pass


Of course the question of whether using disk storage to indicate 
object lifetime is a good design pattern is another issue altogether!


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] clean text

2009-05-19 Thread Kent Johnson
On Tue, May 19, 2009 at 1:19 PM, spir  wrote:
> Thank you Albert, Kent, Sander, Lie, Malcolm.
>
> This time regex wins! Thought it wouldn't because of the additional func call 
> (too bad we cannot pass a mapping to re.sub). Actually the diff. is very 
> small ;-) The relevant  change is indeed using a dict.

The substChar() function is only called when a control character is
found, so the relative time between the regex version and the next
best will depend on the character mix. Your random strings seem a bit
heavy on control chars.

My guess is that the reason regex is a win is because it gets rid of
the explicit Python-coded loop.

> Replacing string concat with ''.join() is slower (tested with 10 times and 
> 100 times bigger strings too). Strange...
> Membership test in a set is only very slightly faster than in dict keys.

String concatenation has been optimized for this use case in recent
versions of Python.

Kent


Re: [Tutor] clean text

2009-05-19 Thread Alan Gauld


"spir"  wrote


def _cleanRepr(text):
    ''' text with control chars replaced by repr() equivalent '''
    result = ""
    for char in text:
        n = ord(char)
        if (n < 32) or (n > 126 and n < 160):
            char = repr(char)[1:-1]
        result += char
    return result


I haven't read the rest of the responses yet but my first 
suggestion is to invert the process and use replace()


Instead of looping through every char in the string, search 
for the illegal chars. A regex should enable you to do 
a single findall search. If you find any you can loop 
over the instances using replace() or sub() to change 
them. The string translate() method might even let you do all 
changes in one go.


That should be faster I think.
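
A sketch of that inverted approach, assuming Python 3; CONTROL_RE and clean_with_findall are names invented for the example:

```python
import re

CONTROL_RE = re.compile(r'[\x00-\x1f\x7f-\x9f]')


def clean_with_findall(text):
    # One findall pass collects the distinct control chars actually present.
    for ch in set(CONTROL_RE.findall(text)):
        # The escaped form contains no control chars, so order is irrelevant.
        text = text.replace(ch, repr(ch)[1:-1])
    return text
```

Text without any control characters passes through with a single regex scan and no replace() calls.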

HTH


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] clean text

2009-05-19 Thread Emile van Sebille

On 5/19/2009 11:22 AM spir said...


> I thought about this solution (having a dict for all chars). But I cannot use it 
> because later I will extend the app to cope with unicode (~ 100,000 chars). So 
> I really need to filter which chars have to be converted.


That seems somewhat of a premature optimization.  Dicts are very 
efficient -- I don't imagine 100k+ entries will slow it down, but then 
that's what should be tested so you'll know.



> A useful help I guess would be to have a builtin func that returns conventional 
> char/string repr without "'...'" around.


Like this?

>>> print repr(''.join(chr(ii) for ii in range(20,40)))
'\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\''
>>>

Emile



Re: [Tutor] odbc connection with python

2009-05-19 Thread Alan Gauld


"mustafa akkoc"  wrote

> how can i make odbc connection language and i wanna make gui project after
> connecting database anyone has document ?


There are lots of GUI options for python but if you want to do a database
centred GUI and have no previous knowledge to leverage then dabo
is probably your best bet. It includes a GUI builder and has strong links
to databases.

http://dabodev.com/

Caveat: I've only read the web pages, not used it! But it has had some
good reviews on this list before.


--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/




Re: [Tutor] Replacement for __del__

2009-05-19 Thread spir
On Tue, 19 May 2009 19:27:16 +0100,
"Alan Gauld" wrote:

> > def __del__(self):
> >     try:
> >         self.f.close()
> >     except:
> >         pass
> 
> This is, I agree, pointless, but very different to:
> 
> def __del__(self):
>     try:
>         os.remove(self.lockname)
>     except:
>         pass
> 
> 
> Of course the question of whether using disk storage to indicate 
> object lifetime is a good design pattern is another issue altogether!

But there are many other (sensible) kinds of uses for temp storage on disk, 
including some for the developer's own feedback.

Denis
--
la vita e estrany


Re: [Tutor] Allow only one instance of a process

2009-05-19 Thread Roger
On Sunday 17 May 2009 21:54:54 Kent Johnson wrote:
> On Sat, May 16, 2009 at 10:26 PM, Sylvain Ste-Marie
>
>  wrote:
> > I'm currently writing a script to batch convert video for my psp
> >
> > It basically looks into a folder for video and launch ffmpeg:
> >
> > ffmpeg -i "videoname" -f psp -r 29.97 -b 768k -ar 24000 -ab 64k -s
> > 368x208 "videoname.mp4"
> >
> > my idea is basically to check for pid but how do i do that?
>
> Look at psutil:
> http://code.google.com/p/psutil/
>
> But you will have a race condition if you do something like this:
>
> if ffmpeg is not running:
>   start ffmpeg
>
> A file lock is a better solution. Google 'python file lock' for recipes.
>

As a Java programmer just starting with Python, this answer surprised me. I
would've been googling for the Python equivalent of the Singleton pattern. 
I guess it's going to take longer than I thought to get my head around the
differences. 

With the lock file solution, if your program crashes, don't you have to 
undertake some form of manual/automated recovery to remove any left
over lock files?
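
One way around the stale-file worry on Unix-like systems, offered as a sketch since the thread does not name a particular recipe: take an OS-level lock on the file with fcntl.flock() instead of relying on the file's mere existence. The kernel drops such a lock when the process exits, even after a crash, so a leftover file does no harm. acquire_single_instance_lock is a made-up name, the code assumes Python 3, and fcntl is not available on Windows:

```python
import fcntl
import os


def acquire_single_instance_lock(path):
    """Return an open file descriptor holding the lock, or None if busy."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking exclusive lock; fails at once if someone holds it.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd   # keep it open for as long as the program runs
```

A script would call this once at startup and exit if it gets None back; closing the descriptor, or simply terminating, releases the lock.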

Regards


Re: [Tutor] Allow only one instance of a process

2009-05-19 Thread spir
On Tue, 19 May 2009 23:09:33 +0300,
Roger wrote:

> As a Java programmer just starting with Python, this answer surprised me. I
> would've been googling for the Python equivalent of the Singleton pattern. 
> I guess it's going to take longer than I thought to get my head around the
> differences. 

http://steve.yegge.googlepages.com/singleton-considered-stupid

Denis
--
la vita e estrany


Re: [Tutor] Allow only one instance of a process

2009-05-19 Thread Kent Johnson
On Tue, May 19, 2009 at 4:09 PM, Roger  wrote:

> As a Java programmer just starting with Python, this answer surprised me. I
> would've been googling for the Python equivalent of the Singleton pattern.
> I guess it's going to take longer than I thought to get my head around the
> differences.

A Singleton is a way to allow only a single instance of a class within
a single running program (process). The OP was asking how to ensure he
only created one process running a program. That is a very different
problem.

And yeah, singletons are evil and usually better served by modules in Python.

Kent