Re: [Tutor] puzzling for-loop behavior

Luke Paireepinart Wed, 24 Jun 2009 05:17:19 -0700

Yash wrote:

Hello,
I am a newbie to python. I am trying to practice writing a code in
python and am trying to write a function to generate prime numbers
using sieve of Eratosthenes. (code follows at the end of mail)


The way it is coded, if I change the for loop over "j" from
for j in range( (2*i*(i + 1)), np, (2*i +1)):
to
for j in range( (2*i*(i + 1)), np, 2*(2*i +1)):
I should get the same output for the two (which I do).

The second for loop has to make roughly half the iterations compared
to the first, because I increased the step-size by factor 2.
However when I time the code for the two different for loops, I
observe that the code takes longer to run as compared the first for
loop.

How do you know it's taking longer to run?  How are you profiling your code?

I just ran your code through timeit, and you're correct, changing thatline does make the code take longer to run.The reason for this is not extremely clear-cut and you'll have to dofurther analysis to confirm this (look into the "timeit" module amongothers)

but here is my suspicion:
try printing the return from primes

print primes(1000)

try the line the original way you had it, and then try it with thesecond line. You do NOT get the same output.

In fact, it's significantly different.

let's see exactly how many extra numbers we have with the second loopstyle (primes2 is the same function, but with your second loop definition.)


x = set(primes(10000))
y = set(primes2(10000))
print len(y)
print len(x)
print len(y - x)

this gives us the fact that there are 1881 incorrect results in "y"!

Also, it claims there are 1229 primes less than 10,000 which seemsunrealistically-high to me (though I very well may be incorrect.) so youmay want to look into that.

So what it comes down to is that the reason the second way is takinglonger is because it's simply got a larger set that it's dealing with -when you iterate over the items to generate plst after your for loop,it's having to iterate over a larger set, and when returning the list,it's having to return a larger amount of data (although I don't thinkthat would actually have any speed difference as it would just returnthe reference to the set...)

The answer to your original question - why does it take longer toexecute over a smaller range - it doesn't. It won't take longer toexecute over a smaller range.When you suspect something like this, that is completelycounter-intuitive, write a new program that tests JUST that feature, andcompare based on that.As you've seen from this problem, there can be minor nuances in yourprogram that effect runtime in very odd ways that you did not expect.

Moral of the story: print your output, use various methods at yourdisposal to check that they're actually equal (use set subtraction /intersections and such, rather than relying on eyeballing it) whencomparing data sets.

Also a few style notes ( just helpful suggestions, not meant to beoverly critical):1) please, please! don't use semicolons at the end of your lines. It isNOT required syntax in Python and it therefore just looks cluttered /unnecessary.2) you do not need to call print() as a function unless you're usingPython 3.0, but it doesn't hurt to. you can just use print 'somestring'instead.3) be consistent with your spacing. for example, in some cases you havelines ("else :") with an extra space before the colon, and sometimes youdon't ("if plst[i]:"). I would suggest you use no space between thecondition and the colon, but it doesn't really matter so long as you'reconsistent. Just don't be inconsistent.4) don't put the condition of your "if" statement in parenthesis if it'snot necessary (the line "if ( Nmax < 2 ) :" is functionally equivalentto "if Nmax < 2:")5) you catch an exception if Nmax can't be coerced to an integer, outputa "sorry, Nmax couldn't be converted" message, and then you go ahead anduse Nmax later in the function. Except clauses do NOT terminate afunction automatically. You need to either force a return of thefunction in your except clauses (after your "print" statements, put a"return []" or something of that sort), or not bother doing the typecoercion in the first place, or just don't catch the exceptions yourselfand let them bubble up to the next level of execution.

This seems to be counter intuitive. Can someone explain this behavior ?

I hope I have already done so.

Also if I replace the for loop by while loop, it takes longer to run,
which I guess could be because in case of for loop it kind of knows
how many iterations to make, but in while loop it has to keep checking
every time.

If you would like a detailed explanation of why one program takes longerto run than the other, reply to this thread withsome code with both functions, the one with the "for" loop and the onewith the "while" loop, and I'll try to explain why.It's not something inherent in while loops vs. for loops, it's somethingspecific to your application, so we can't help you analyze it withoutactual code.


HTH,
-Luke
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] puzzling for-loop behavior

Reply via email to