Re: PATCH: Speed up direct string concatenation by 20+%!

2006-10-04 Thread Nicko
Larry Hastings wrote:
> It's *slightly* slower for two:
>
> def addTwoThings(a, b):
>     return a + b
> for i in range(1000):
>     x = addTwoThings("aaa", "bbb")
...
> But starts paying off already, even with three:
>
> def addThreeThings(a, b, c):
>     return a + b + c
> for i in range(1000):
>     x = addThreeThings("aaa", "bbb", "ccc")

I note that in both of those tests you never actually realise the
concatenated string.  Can you give us figures for these tests with the
concatenated string actually forced to be computed?
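
For instance, something like this (a sketch on my part; I'm assuming
that indexing into the result is enough to force the lazy
concatenation to be rendered):

def addTwoThings(a, b):
    return a + b
for i in range(1000):
    x = addTwoThings("aaa", "bbb")
    c = x[0]    # touching the buffer should force the concatenation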

Cheers,
Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can I do it using python?? about xterm and telnet

2006-07-03 Thread Nicko
placid wrote:
> Jim Segrave wrote:
> > In article <[EMAIL PROTECTED]>,
> > valpa <[EMAIL PROTECTED]> wrote:
> > >I'm a net admin for about 20 unix servers, and I need to frequently
> > >telnet on to them and configure them.
> > >It is a tiring job to open a xterm and telnet, username, password to
> > >each server.
> >
> > Don't use telnet. it's clumsy and has security issues.
>
> if youre behind a firewall then it shouldnt matter.

No, no, no!  If you have 20 unix servers then this is likely not a tiny
company.  Most security breaches (according to the FBI/CSI computer
crime survey) are perpetrated by insiders.  If you log in using telnet,
and have to enter passwords that allow configurations to be changed,
then anyone on the local net can get those passwords.  Use SSH instead.
Even SSH with passwords is hugely more secure than telnet.
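
If you want to script those logins from Python, the third-party
paramiko library can do the whole job in a few lines.  A rough sketch
(the host name, user and command are placeholders):

import paramiko

client = paramiko.SSHClient()
client.load_system_host_keys()
client.connect("server1.example.com", username="admin")
stdin, stdout, stderr = client.exec_command("uptime")
print stdout.read()
client.close()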

Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python loops

2006-09-02 Thread Nicko
Fredrik Lundh wrote:
> [EMAIL PROTECTED] wrote:
>
> > I thought the xrange was preferred?  for x in xrange(length):
>
> preferred by premature optimization freaks, perhaps.

There's a huge difference between not being profligate with resources
and premature optimisation. In the case of the idiom "for i in
range(x):..." there is absolutely no utility whatsoever in creating
and recording the list of objects. Unless it makes a difference to
code structure or maintainability, I think that not creating stacks of
objects you don't need is basic code hygiene, not freakish premature
optimisation.

>   in practice, if the
> range is reasonably small and you're going to loop over all the integers,
> it doesn't really matter.

This is true, but getting into habits that don't matter most of the
time, but have a performance and stability impact some of the time, is
worth discouraging.

> (the range form creates a list and N integers up front; the xrange form
> creates an iterator object up front and N integers while you're looping.
> what's faster depends on what Python version you're using, and some-
> times also on the phase of the moon)

Using range() is only faster on lists so small that the cost is tiny
anyway. On any substantial loop it is quite a bit slower, and has been
since Python 2.3.
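
It's easy enough to measure for yourself.  A rough sketch using the
timeit module (absolute numbers will vary with your Python version and
machine):

from timeit import Timer

for fn in ("range", "xrange"):
    t = Timer("for i in %s(1000000): pass" % fn)
    print fn, min(t.repeat(3, 10))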

Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python loops

2006-09-03 Thread Nicko
Fredrik Lundh wrote:
> Nicko wrote:
>
> > ... In the case of the idiom "for i in
> > range(x):..." there absolutely no utility whatsoever in creating and
> > recording the list of objects.
>
> for short lists, both objects create the *same* number of objects.

This is true for long lists too, if you iterate over the full range,
but what I wrote was "creating and recording". The range() function
generates a variable-sized, potentially large object and retains all of
the items in the range, while xrange() generates a fairly small,
fixed-size object and only hangs on to one item at a time.
Furthermore, it's not at all uncommon for loops to be terminated
early. With range() you incur the cost of creating all the objects,
and a list large enough to hold them, irrespective of whether you are
going to use them.

> if you cannot refrain from pulling arguments out of your ass, you not
> really the right person to talk about hygiene.

I'm impressed by your mature argument. Clearly, in the face of such
compelling reasoning, I shall have to concede that we should all
generate our range lists up front.

Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python loops

2006-09-04 Thread Nicko
Steve Holden wrote:
> Nicko wrote:
> > Fredrik Lundh wrote:
> >>if you cannot refrain from pulling arguments out of your ass, you not
> >>really the right person to talk about hygiene.
> >
> > I'm impressed by your mature argument. Clearly, in the face of such
> > compelling reasoning, I shall have to concede that we should all
> > generate our range lists up front.
> >
> I'm impressed that you think any of this will be news to the effbot,
> whose sagacity is exceeded only by his irritability in the face of
> ignorance.

Well, I may not have written as much "award winning" Python software
as Fredrik, but sagacity usually implies wisdom and good judgement
rather than mere knowledge.  Still, I'm hard pressed to see why
suggesting that, when I want to iterate a number of times, I should
use an iterator that goes around that many times (and enjoy the
bounded storage requirements that result), rather than writing down
the whole list of numbers and then selecting each in turn, should
warrant such an outburst.

One wonders if the authors of PEP 3100, the outline plan for Python
3.0, are all premature optimisation freaks too. After all, it states
that sooner or later the built-in range() function will return an
iterator, and those of us who prefer to iterate rather than count on
our fingers/lists will have yet another optimisation: we'll no longer
need to put an "x" in front of our ranges.

Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list index()

2007-09-01 Thread Nicko
On Aug 30, 7:00 pm, Steve Holden <[EMAIL PROTECTED]> wrote:

> You can also generate the files that are in one directory but ot the
> other with
>
> (afiles | bfiles) - (afiles & bfiles)

Or just (afiles ^ bfiles).
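
For instance, with two small sets the symmetric difference gives the
same answer:

afiles = set(["a.txt", "b.txt"])
bfiles = set(["b.txt", "c.txt"])
print (afiles | bfiles) - (afiles & bfiles)   # set(['a.txt', 'c.txt'])
print afiles ^ bfiles                         # the same set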

Nicko

--
(lambda f: lambda *a:f(f,*a))(
lambda f,l,i:l[i][1]+f(f,l,l[i][0]) if l[i][0]>0 else "")(
sorted(enumerate('[EMAIL PROTECTED]'),key=lambda a:a[1]),0)[1:]

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Database in memory

2007-04-10 Thread Nicko
Jim wrote:
> I have an application that will maintain an in-memory database in the
> form of a list of lists.  Does anyone know of a way to search for and
> retreive "records" from such a structure?

The answer very much depends on the manner in which you want to do the
look-up.  If you only need to do exact-match look-up for items with
unique keys (e.g. find the single record where SSN=1234567890) then
using a dictionary is by far the best solution.  It's fast and it's
easy.
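
For instance, a sketch assuming each record is a tuple with the SSN in
field 0:

ssn_index = dict((rec[0], rec) for rec in records)
match = ssn_index.get("1234567890")   # None if there is no such record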

If you expect to do exact-match look-up where the keys are not unique
then build a dictionary containing 'set' objects which are the sets of
records which have the given key. This lets you neatly find the
intersection of selections on multiple criteria (e.g. matches =
zipcode_index["94101"] & hometype_index["condo"] ).
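
Building such an index is only a couple of lines.  A sketch, again
assuming tuple records with the ZIP code in, say, field 3:

zipcode_index = {}
for rec in records:
    zipcode_index.setdefault(rec[3], set()).add(rec)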

If you need to do range matching (e.g. 2 <= salary < 5) then
your best bet is to keep a list of the records sorted in the ordering
of the key, do a binary search to find where the lower and upper
bounds lie within the sorted list and then take a slice.  If you also
have some index dictionaries containing sets then you can combine
these two methods with something like 'matches =
set(salary_index[lo_sal:hi_sal]) & zipcode_index["81435"] '
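
A sketch of the range look-up using the bisect module, assuming
salary_index is a list of (salary, record) tuples kept sorted:

import bisect

def salary_range(salary_index, lo_sal, hi_sal):
    # records with lo_sal <= salary < hi_sal
    lo = bisect.bisect_left(salary_index, (lo_sal,))
    hi = bisect.bisect_left(salary_index, (hi_sal,))
    return set(rec for (sal, rec) in salary_index[lo:hi])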

Having said all that, if you think that there is any possibility that
you might ever want to expand the functionality of your program to
require either (a) more complex and flexible searching and/or (b)
putting the database somewhere else, then I would strongly suggest
that you use PySQLite.  SQLite is an efficient in-memory database with
an SQL engine and the Python interface conforms to the DB-API spec, so
you won't need to change your code (much) if you want to move the
database to some MySQL, Oracle, Sybase or DB2 server at a later date.
Furthermore, SQLite is included in Python 2.5 as standard.
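
For example, a minimal in-memory table with the sqlite3 module from
Python 2.5 (the table and column names are just for illustration):

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE homes (zipcode TEXT, hometype TEXT)")
db.execute("INSERT INTO homes VALUES (?, ?)", ("94101", "condo"))
for row in db.execute("SELECT * FROM homes WHERE zipcode = ?", ("94101",)):
    print row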

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Database in memory

2007-04-10 Thread Nicko
On Apr 10, 1:10 pm, "Nicko" <[EMAIL PROTECTED]> wrote:
> If you expect to do exact-match look-up where the keys are not unique
> then build a dictionary containing 'set' objects which are the sets of
> records which have the given key. This lets you neatly find the
> intersection of selections on multiple criteria (e.g. matches =
> zipcode_index["94101"] & hometype_index["condo"] ).

Just FYI, if you're going to go this route then the items that you are
indexing have to be hashable, which the built in 'list' type is not.
Tuples are, or you can make some custom class (or your own subtype of
list) which implements the __hash__ method based on some 'primary key'
value from your data.  Or you could just go for SQLite...
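
A sketch of the latter, hashing on a 'primary key' in field 0 (note
that equality remains ordinary list equality):

class Record(list):
    def __hash__(self):
        # hash on the primary key field so Records can live in sets
        return hash(self[0])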

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Iteration for Factorials

2007-10-26 Thread Nicko
On Oct 25, 2:36 am, Paul Rubin <http://[EMAIL PROTECTED]> wrote:
> Lou Pecora <[EMAIL PROTECTED]> writes:
> > There might even be an array method that can be adapted to get the
> > product.  Is there a product method? (analogous to a sum method)
>
> The "reduce" function which is being removed from python in 3.0.
>
> import operator
> def factorial(n):
>   return reduce(operator.mul, xrange(1,n+1))

Since reduce is being removed, and Guido is known not to like its use
anyway, I propose the following code for Py2.5 and later:

import math
def fact(n):
    return math.exp(sum((math.log(i) for i in range(1, n+1)))) if n >= 0 else None

If you don't like the rounding errors you could try:

def fact(n):
    d = {"p": 1L}
    def f(i): d["p"] *= i
    map(f, range(1, n+1))
    return d["p"]

It is left as an exercise for the reader to work out why this code
will not work on Py3K.

Nicko


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: division by 7 efficiently ???

2007-02-01 Thread Nicko
On Feb 1, 3:13 am, [EMAIL PROTECTED] wrote:
> Its not an homework. I appeared for EA sports interview last month. I
> was asked this question and I got it wrong. I have already fidlled
> around with the answer but I don't know the correct reasoning behind
> it.

In that case, observe that a/b == a * (1/b), and if b is constant you
can compute (1/b) in advance.  Since the problem was set by game
programmers I'd hazard that they are OK with relatively limited
precision and the answer that they were looking for was:
a = (b * 04444444445L) >> 32
Note that the constant there is in octal.
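
A quick check of that in Python; the trick is exact for values of b up
to about 2**32/3:

RECIP7 = ((1 << 32) + 6) / 7   # ceiling(2**32 / 7), 04444444445 in octal

def div7(b):
    return (b * RECIP7) >> 32

for b in range(100000):
    assert div7(b) == b // 7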

Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: division by 7 efficiently ???

2007-02-02 Thread Nicko
On Feb 2, 4:21 pm, "Bart Ogryczak" <[EMAIL PROTECTED]> wrote:
> On Feb 1, 2:00 pm, "Nicko" <[EMAIL PROTECTED]> wrote:
>
> > precision and the answer that they were looking for was:
> > a = (b * 045L) >> 32
> > Note that the constant there is in octal.
>
> 04444444445L? Shouldn't it be 04444444444L?
> Or more generally,
> const = (1<<bitPrecision)/den
> a = (b * const)>>bitPrecision

It's to do with rounding. What you actually need is
ceiling((1<<bitPrecision)/den), not the floor that integer division
gives you.  With the floor the approximation to 1/7 is fractionally
too small, so the multiply and shift comes out one too low for some
values; b = 7 is the first such case.

Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: division by 7 efficiently ???

2007-02-02 Thread Nicko
On Feb 1, 8:25 pm, "Krypto" <[EMAIL PROTECTED]> wrote:
> The correct answer as told to me by a person is
>
> (N>>3) + ((N-7*(N>>3))>>3)
>
> The above term always gives division by 7

No, it doesn't.  The above term tends towards N * (9/64), with some
significant rounding errors.  9/64 is a fairly poor (6-bit)
approximation of 1/7, but the principle is the same as in the solution
I proposed above.
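
For instance, a quick check in Python:

def approx_div7(n):
    return (n >> 3) + ((n - 7 * (n >> 3)) >> 3)

print approx_div7(14), 14 // 7   # prints 1 2: wrong already at N = 14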

Nicko

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sorting Large File (Code/Performance)

2008-01-25 Thread Nicko
On Jan 24, 9:26 pm, [EMAIL PROTECTED] wrote:
> > If you really have a 2GB file and only 2GB of RAM, I suggest that you don't 
> > hold your breath.
>
> I am limited with resources. Unfortunately.

As long as you have at least as much disc space spare as you need to
hold a copy of the file then this is not too hard.  Split the file
into chunks that are small enough to fit in memory, sort each chunk
and write it to a file and then interleave the chunks.  Below is a
cheap and cheesy outline of code to do this, from which you can start.

For files which are hugely larger than your available memory you can
do this recursively but for files which are 10 to 100 times too big
the single-pass code below will probably work just fine.  The
complexity is technically O(n*(log(c) + n/c)), where n is the size of
the input and c is the chunk size; once n/c (the number of chunks)
exceeds log(c), the cost of merging the chunks will start to dominate,
though a recursive version would be slowed by needing a lot more disc
access.

#!/usr/bin/env python
from itertools import islice
from tempfile import TemporaryFile
import sys

# Tweak this number to fill your memory
lines_per_chunk = 10

chunkfiles = []
mergechunks = []

while True:
    chunk = list(islice(sys.stdin, lines_per_chunk))
    if not chunk:
        break
    chunk.sort()
    f = TemporaryFile()
    f.writelines(chunk)
    f.seek(0)
    # Record each chunk's first line for the merge, and advance the
    # file past it so that it doesn't get emitted twice.
    mergechunks.append((f.readline(), len(chunkfiles)))
    chunkfiles.append(f)

while mergechunks:
    # The next line is order O(n) in the number of chunks
    (line, fileindex) = min(mergechunks)
    mergechunks.remove((line, fileindex))
    sys.stdout.write(line)
    nextline = chunkfiles[fileindex].readline()
    if nextline == "":
        chunkfiles[fileindex].close()
    else:
        mergechunks.append((nextline, fileindex))

-- 
http://mail.python.org/mailman/listinfo/python-list