On Tue, May 19, 2009 at 5:36 AM, spir <denis.s...@free.fr> wrote: > Hello, > > This is a follow the post on performance issues. > Using a profiler, I realized that inside error message creation, most of the > time was spent in a tool func used to clean up source text output. > The issue is that when the source text holds control chars such as \n, then > the error message is hardly readible. MY solution is to replace such chars > with their repr(): > > def _cleanRepr(text): > ''' text with control chars replaced by repr() equivalent ''' > result = "" > for char in text: > n = ord(char) > if (n < 32) or (n > 126 and n < 160): > char = repr(char)[1:-1] > result += char > return result > > For any reason, this func is extremely slow. While the rest of error message > creation looks very complicated, this seemingly innocent consume > 90% of the > time. The issue is that I cannot use repr(text), because repr will replace > all non-ASCII characters. I need to replace only control characters. > How else could I do that?
I would get rid of the calls to ord() and repr() to start. There is no need for ord() at all, just compare characters directly: if char < '\x20' or '\x7e' < char < '\xa0': To eliminate repr() you could create a dict mapping chars to their repr and look up in that. You should also look for a way to get the loop into C code. One way to do that would be to use a regex to search and replace. The replacement pattern in a call to re.sub() can be a function call; the function can do the dict lookup. Here is something to try: import re # Create a dictionary of replacement strings repl = {} for i in range(0, 32) + range(127, 160): c = chr(i) repl[c] = repr(c)[1:-1] def sub(m): ''' Helper function for re.sub(). m will be a Match object. ''' return repl[m.group()] # Regex to match control chars controlsRe = re.compile(r'[\x00-\x1f\x7f-\x9f]') def replaceControls(s): ''' Replace all control chars in s with their repr() ''' return controlsRe.sub(sub, s) for s in [ 'Simple test', 'Harder\ntest', 'Final\x00\x1f\x20\x7e\x7f\x9f\xa0test' ]: print replaceControls(s) Kent _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor