Python 2-3 compatibility

2013-06-02 Thread Jason Swails
Hello Everyone,

I have a Python script that I wrote to support a project that I work on
(that runs primarily on Unix OSes).  Given its support role in this
package, this script should not introduce any other dependencies.  As a
result, I wrote the script in Python 2, since every Linux currently ships
with 2.4--2.7 as its system Python (RHEL 5, still popular in my field,
ships with 2.4).

However, I've heard news that Ubuntu is migrating to Python 3 soon (next
release??), and that's a platform we actively try to support due to its
popularity.  I've tried writing the code to support both 2 and 3 as much as
possible, but with priority put on supporting 2.4 over 3.

Now that Python 3-compatibility is about to become more important, I'm
looking for a way to catch and store exceptions in a compatible way.

Because Python 2.4 and 2.5 don't support the

except Exception as err:

syntax, I've used

except Exception, err:

Is there any way of getting this effect in a way compatible with Py2.4 and
3.x?  Of course I could duplicate every module with 2to3 and do
sys.version_info-conditional imports, but I'd rather avoid duplicating all
of the code if possible.
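
For what it's worth, the closest thing I've found so far is pulling the
exception object back out of sys.exc_info(), which I believe parses under
both 2.4 and 3.x -- a minimal sketch (risky_operation() here is just a
placeholder):

import sys

try:
    risky_operation()
except Exception:
    # sys.exc_info()[1] is the exception instance; no 'as'/',' syntax needed
    err = sys.exc_info()[1]
    print(err)

but I'm not sure whether there's a cleaner way.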

Any suggestions are appreciated.

Thanks!
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyWart: The problem with "print"

2013-06-02 Thread Jason Swails
On Sun, Jun 2, 2013 at 1:20 PM, Chris Angelico  wrote:

>
> Hmm. Could be costly. Hey, you know, Python has something for testing that.
>
> >>> timeit.timeit('debugprint("asdf")','def debugprint(*args):\n\tif not
> DEBUG: return\n\tprint(*args)\nDEBUG=False',number=100)
> 0.5838018519113444
>
> That's roughly half a second for a million calls to debugprint().
> That's a 580ns cost per call. Rather than fiddle with the language,
> I'd rather just take this cost. Oh, and there's another way, too: If
> you make the DEBUG flag have effect only on startup, you could write
> your module thus:
>

This is a slightly contrived demonstration... The time lost in a function
call is not just the time it takes to execute that function.  If it
consistently increases the frequency of cache misses then the cost is much
greater -- possibly by orders of magnitude if the application is truly
bound by the bandwidth of the memory bus and the CPU pipeline is almost
always saturated.

I'm actually with RR in terms of eliminating the overhead involved with
'dead' function calls, since there are instances when optimizing in Python
is desirable.  I actually recently adjusted one of my own scripts to
eliminate branching and improve data layout to achieve a 1000-fold
improvement in efficiency (~45 minutes to 0.42 s. for one example) --- all
in pure Python.  The first approach was unacceptable, the second is fine.
 For comparison, if I add a 'deactivated' debugprint call into the inner
loop (executed 243K times in this particular test), then the time of the
double-loop step that I optimized takes 0.73 seconds (nearly doubling the
duration of the whole step).  The whole program is too large to post here,
but the relevant code portion is shown below:

i = 0
begin = time.time()
for mol in owner:
    for atm in mol:
        blankfunc("Print i %d" % i)
        new_atoms[i] = self.atom_list[atm]
        i += 1
self.atom_list = new_atoms
print "Done in %f seconds." % (time.time() - begin)

from another module:

DEBUG = False

[snip]

def blankfunc(instring):
    if DEBUG:
        print instring

Also, you're often not passing a constant literal to the debug print --
you're doing some type of string formatting or substitution if you're
really inspecting the value of a particular variable, and this also takes
time.  In the test I gave the timings for above, I passed a string with the
counter substituted into it to the 'dead' debug function.  Copy-and-pasting
your timeit experiment on my machine yields different timings (Python 2.7):

>>> import sys
>>> timeit.timeit('debugprint("asdf")','def debugprint(*args):\n\tif not DEBUG: return\n\tsys.stdout.write(*args)\nDEBUG=False',number=100)
0.15644001960754395

which is ~150 ns/function call, versus ~1300 ns/function call.  And there
may be even more extreme examples; this is just one I was able to cook up
quickly.
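
One way to keep most of that formatting cost out of the disabled path --
just a sketch, not what I actually did -- is to hand the format string and
the arguments to the debug function separately, so the substitution is only
paid for when DEBUG is on:

import sys

DEBUG = False

def debugprint(fmt, *args):
    if not DEBUG:
        return
    # The % substitution only happens when debugging is enabled
    sys.stdout.write((fmt % args) + '\n')

debugprint("Print i %d", 42)   # cheap no-op while DEBUG is False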

This is, I'm sad to say, where my alignment with RR ends.  While I use
prints in debugging all the time, it can also become a 'crutch', just like
reliance on valgrind or gdb.  If you don't believe me, you've never hit a
bug that 'magically' disappears when you add a debugging print statement
;-).

The easiest way to eliminate these 'dead' calls is to simply comment-out
the print call, but I would be quite upset if the interpreter tried to
outsmart me and do it automagically as RR seems to be suggesting.  And if
you're actually debugging, then you typically only add a few targeted print
statements -- not too hard to comment-out.  If it is, and you're really
that lazy, then by all means add a new, greppable function call and use a
sed command to comment those lines out for you.

BTW: *you* in the above message refers to a generic person -- none of my
comments were aimed at anybody in particular.

All the best,
Jason

P.S. All that said, I would agree with ChrisA's suggestion that the
overhead is negligible in most cases...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyWart: The problem with "print"

2013-06-02 Thread Jason Swails
On Sun, Jun 2, 2013 at 11:10 PM, Dan Sommers  wrote:

> On Sun, 02 Jun 2013 20:16:21 -0400, Jason Swails wrote:
>
> > ... If you don't believe me, you've never hit a bug that 'magically'
> > disappears when you add a debugging print statement ;-).
>
> Ah, yes.  The Heisenbug.  ;-)
>

Indeed.  Being in the field of computational chemistry/physics, I was
almost happy to have found one just to say I was hunting a Heisenbug.  It
seems to be a term geared more toward the physics-oriented programming
crowd.


> We used to run into those back in the days of C and assembly language.
>
> They're much harder to see in the wild with Python.
>

Yea, I've only run into Heisenbugs with Fortran or C/C++.  Every time I've
seen one it's been due to an uninitialized variable somewhere -- something
valgrind is quite good at pinpointing.  (And yes, a good portion of our
code is -still- in Fortran -- but at least it's F90+ :).
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyWart: The problem with "print"

2013-06-03 Thread Jason Swails
On Mon, Jun 3, 2013 at 1:12 PM, Ian Kelly  wrote:

> On Sun, Jun 2, 2013 at 6:16 PM, Jason Swails 
> wrote:
> > I'm actually with RR in terms of eliminating the overhead involved with
> > 'dead' function calls, since there are instances when optimizing in
> Python
> > is desirable.  I actually recently adjusted one of my own scripts to
> > eliminate branching and improve data layout to achieve a 1000-fold
> > improvement in efficiency (~45 minutes to 0.42 s. for one example) ---
> all
> > in pure Python.  The first approach was unacceptable, the second is fine.
> > For comparison, if I add a 'deactivated' debugprint call into the inner
> loop
> > (executed 243K times in this particular test), then the time of the
> > double-loop step that I optimized takes 0.73 seconds (nearly doubling the
> > duration of the whole step).
>
> It seems to me that your problem here wasn't that the time needed for
> the deactivated debugprint was too great. Your problem was that a
> debugprint that executes 243K times in 0.73 seconds is going to
> generate far too much output to be useful, and it had no business
> being there in the first place.  *Reasonably* placed debugprints are
> generally not going to be a significant time-sink for the application
> when disabled.


Well in 'debug' mode I wouldn't use an example that executed the loop 200K
times -- I'd find one that executed a manageable couple dozen, maybe.
 When 'disabled,' the print statement won't do anything except consume
clock cycles and potentially displace useful cache (the latter being the
more harmful, since most applications are bound by the memory bus).  It's
better to eliminate this dead call when you're not in 'debugging' mode.
(When active, it certainly would've taken more than 0.73 seconds.)
Admittedly, such loops should be tight enough that debugging statements
inside the inner loop are generally unnecessary, but perhaps not always.

But unlike RR, who suggests some elaborate interpreter-wide, ambiguous
ignore-rule to squash out all of these functions, I'm simply suggesting
that sometimes it's worth commenting-out debug print calls instead of 'just
leaving them there because you won't notice the cost' :).

> The easiest way to eliminate these 'dead' calls is to simply comment-out
> the
> > print call, but I would be quite upset if the interpreter tried to
> outsmart
> > me and do it automagically as RR seems to be suggesting.
>
> Indeed, the print function is for general output, not specifically for
> debugging.  If you have the global print deactivation that RR is
> suggesting, then what you have is no longer a print function, but a
> misnamed debug function.
>

Exactly.  I was just trying to make the point that it is -occasionally-
worth spending the time to comment-out certain debug calls rather than
leaving 'dead' function calls in certain places.

All the best,
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyWart: The problem with "print"

2013-06-03 Thread Jason Swails
On Mon, Jun 3, 2013 at 1:12 PM, Ian Kelly  wrote:

> On Sun, Jun 2, 2013 at 6:16 PM, Jason Swails 
> wrote:
> > I'm actually with RR in terms of eliminating the overhead involved with
> > 'dead' function calls, since there are instances when optimizing in
> Python
> > is desirable.  I actually recently adjusted one of my own scripts to
> > eliminate branching and improve data layout to achieve a 1000-fold
> > improvement in efficiency (~45 minutes to 0.42 s. for one example) ---
> all
> > in pure Python.  The first approach was unacceptable, the second is fine.
> > For comparison, if I add a 'deactivated' debugprint call into the inner
> loop
> > (executed 243K times in this particular test), then the time of the
> > double-loop step that I optimized takes 0.73 seconds (nearly doubling the
> > duration of the whole step).
>
> It seems to me that your problem here wasn't that the time needed for
> the deactivated debugprint was too great. Your problem was that a
> debugprint that executes 243K times in 0.73 seconds is going to
> generate far too much output to be useful, and it had no business
> being there in the first place.  *Reasonably* placed debugprints are
> generally not going to be a significant time-sink for the application
> when disabled.


Well in 'debug' mode I wouldn't use an example that executed the loop 200K
times -- I'd find one that executed a manageable couple dozen, maybe.
 When 'disabled,' the print statement won't do anything except consume
clock cycles and potentially displace useful cache (the latter being the
more harmful, since most applications are bound by the memory bus).  It's
better to eliminate this dead call when you're not in 'debugging' mode.
 Admittedly such loops should be tight enough that debugging statements
inside the inner loop are generally unnecessary, but perhaps not always.

But unlike RR, who suggests some elaborate interpreter-wide, ambiguous
ignore-rule to squash out all of these functions, I'm simply suggesting
that sometimes it's worth commenting-out debug print calls instead of 'just
leaving them there because you won't notice the cost' :).

> The easiest way to eliminate these 'dead' calls is to simply comment-out
> the
> > print call, but I would be quite upset if the interpreter tried to
> outsmart
> > me and do it automagically as RR seems to be suggesting.
>
> Indeed, the print function is for general output, not specifically for
> debugging.  If you have the global print deactivation that RR is
> suggesting, then what you have is no longer a print function, but a
> misnamed debug function.
>

Exactly.  I was just trying to make the point that it is -occasionally-
worth spending the time to comment-out certain debug calls rather than
leaving 'dead' function calls in certain places.

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyWart: The problem with "print"

2013-06-03 Thread Jason Swails
ack, sorry for the double-post.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bools and explicitness [was Re: PyWart: The problem with "print"]

2013-06-04 Thread Jason Swails
On Tue, Jun 4, 2013 at 11:44 AM, Rick Johnson
wrote:

>
> This implicit conversion seems like a good idea at first,
> and i was caught up in the hype myself for some time: "Hey,
> i can save a few keystrokes, AWESOME!". However, i can tell
> you with certainty that this implicit conversion is folly.
> It is my firm belief that truth testing a value that is not
> a Boolean should raise an exception. If you want to convert
> a type to Boolean then pass it to the bool function:
>
> lst = [1,2,3]
> if bool(lst):
> do_something
>
> This would be "explicit enough"


if lst:
    do_something

is equivalent to

if bool(lst):
    do_something

Why not just have your editor autobool so you can spend more time coding
and less time stamping around?  That way the person who finds booled code
more readable can have what he wants and the people who find it less
readable can have what they want.

Win-win

BTW, you should do pointless comparisons like

if condition is True:
    do_something

rather than

if condition == True:
    do_something
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python netcdf

2013-06-05 Thread Jason Swails
On Wed, Jun 5, 2013 at 9:07 PM, Sudheer Joseph  wrote:

> Dear Members,
>   Is there a way to get the time:origin attribute from a
> netcdf file as string using the Python netcdf?
>

Attributes of the NetCDF file and attributes of each of the variables can
be accessed via the dot-operator, as per standard Python.

For instance, if your NetCDF file has a Conventions attribute, you can
access it via:

ncfile.Conventions

If your variable time has an attribute "origin", you can get it via:

ncfile.variables['time'].origin

Of course there's the question of what NetCDF bindings you're going to use.
The options that I'm familiar with are ScientificPython's NetCDFFile class
(Scientific.IO.NetCDF.NetCDFFile), pynetcdf (which is just ScientificPython's
class in a standalone format), and the netCDF4 package.
 Each option has a similar API with attributes accessed the same way.

An example with netCDF4 (which is newer, has NetCDF 4 capabilities, and
appears to be more supported):

from netCDF4 import Dataset

ncfile = Dataset('my_netcdf_file.nc', 'r')

origin = ncfile.variables['time'].origin

etc. etc.

The variables and dimensions of a NetCDF file are stored in dictionaries,
and the data from variables are accessible via slicing:

time_data = ncfile.variables['time'][:]

The slice returns a numpy ndarray.

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic Python for incrementing pairs

2013-06-07 Thread Jason Swails
On Fri, Jun 7, 2013 at 10:32 PM, Tim Chase wrote:

> Playing around, I've been trying to figure out the most pythonic way
> of incrementing multiple values based on the return of a function.
> Something like
>
>   def calculate(params):
> a = b = 0
> if some_calculation(params):
>   a += 1
> if other_calculation(params):
>   b += 1
> return (a, b)
>
>   alpha = beta = 0
>   temp_a, temp_b = calculate(...)
>   alpha += temp_a
>   beta += temp_b
>
> Is there a better way to do this without holding each temporary
> result before using it to increment?
>

alpha = beta = 0
alpha, beta = (sum(x) for x in zip( (alpha, beta), calculate(...) ) )

It saves a couple lines of code, but at the expense of readability IMO.  If
I was reading the first, I'd know exactly what was happening immediately.
 If I was reading the second, it would take a bit to decipher.  In this
example, I don't see a better solution to what you're doing.

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic Python for incrementing pairs

2013-06-08 Thread Jason Swails
On Sat, Jun 8, 2013 at 2:47 AM, Peter Otten <[email protected]> wrote:
>
> You can hide the complexity in a custom class:
>
> >>> class T(tuple):
> ... def __add__(self, other):
> ... return T((a+b) for a, b in zip(self, other))
> ...
> >>> t = T((0, 0))
> >>> for pair in [(1, 10), (2, 20), (3, 30)]:
> ... t += pair
> ...
> >>> t
> (6, 60)
>
> (If you are already using numpy you can do the above with a numpy.array
> instead of writing your own T.)
>

I do this frequently when I want data structures that behave like vectors
but don't want to impose the numpy dependency on users. (Although I usually
inherit from a mutable sequence so I can override __iadd__ and __isub__).
It seemed overkill for the provided example, though...
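
For what it's worth, a minimal sketch of the sort of class I mean (the name
and the list base are purely illustrative):

class Vector(list):
    """A list supporting element-wise in-place += and -=."""

    def __iadd__(self, other):
        for i, val in enumerate(other):
            self[i] += val
        return self

    def __isub__(self, other):
        for i, val in enumerate(other):
            self[i] -= val
        return self

v = Vector([0, 0])
v += (1, 10)
v += (2, 20)
# v is now [3, 30]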

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie: question regarding references and class relationships

2013-06-10 Thread Jason Swails
On Mon, Jun 10, 2013 at 7:26 PM, Dave Angel  wrote:

> On 06/10/2013 06:54 PM, Chris Angelico wrote:
>
>> On Tue, Jun 11, 2013 at 8:39 AM, Grant Edwards 
>> wrote:
>>
>>> On 2013-06-10, Terry Jan Reedy  wrote:
>>>
>>>  Another principle similar to 'Don't add extraneous code' is 'Don't
>>>  rebind builtins'.
>>>
>>> OK, we've all done it by accident (especially when starting out), but
>>> are there people that rebind builtins intentionally?
>>>
>>
>> There are times when you don't care what you shadow, like using id for
>> a database ID.
>>
>> ChrisA
>>
>>
> And times where you're deliberately replacing a built-in
>
> try:
>input = raw_input
> except 


Yes but this is a hack to coerce Python2/3 compatibility.  You're no doubt
correct that it's intentional rebinding with a definite aim, but if the
PyArchitects had their way, this would be unnecessary (and discouraged) as
well.

The first time I remember rebinding a builtin was completely accidental
(and at the very beginning of me learning and using Python).

# beginner's crappy code
range = [0, 20]

# bunches of code

for i in range(len(data)):
    if data[i] > range[0] and data[i] < range[1]:
        do_something

TypeError: 'list' object is not callable... # what the heck does this mean??


That one drove me nuts. Took me hours to find.  I still avoid rebinding
builtins just from the memory of the experience :)

--Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: My son wants me to teach him Python

2013-06-13 Thread Jason Swails
On Thu, Jun 13, 2013 at 11:33 PM, Rick Johnson  wrote:

> On Thursday, June 13, 2013 3:18:57 PM UTC-5, Joshua Landau wrote:
>
> > [...]
> > GUI is boring. I don't give a damn about that. If I had it
> > my way, I'd never write any interfaces again (although
> > designing them is fine). Console interaction is faster to
> > do and it lets me do the stuff I *want* to do quicker.
>
> And are you willing to provide *proof* that the console is
> faster? Or is this merely just your "opinion"? I would be
> ready and willing to compete in a "Pepsi challenge" to
> disprove your claim if needed.  For instance, if i want to
> open a text file on my machine, i merely navigate to the
> file via my file browser interface, using clicks along the
> way, and then the final double click will open the text file
> using it's default program. Are you telling me you can type
> the address faster (much less remember the full path) than i
> can point and click? And if you think you're going to cheat
> by creating an "environment variable", well i can still win
> by doing the same thing with a "shortcut".
>

One of my favorite parts about the Mac over Windows, actually, is that I
can open up a console.  Coupled with MacPorts or some other equivalent
package manager, I have what amounts to a working Linux environment
(almost).  The "open" command is particularly useful (equivalent to
double-clicking, with the ability to specify "-a <application>" to invoke a
right-click->select program->[scan through program list]->click much more
easily).

Coupled with tab completion (as Chris mentioned), and a full history of
visited directories, I can navigate my file system in a console much faster
than I can navigate in the GUI.  It matters little to my productivity how
fast you can navigate a GUI.

But batch processing is, in general, much easier to do in the console in my
experience.  Two tasks I've wanted to do that were of general interest to
computer users (not specifically my work), and that I wouldn't have bothered
to do in a GUI environment:

1) Autocrop a whole bunch of images (~100) to remove extraneous white space
around all of the edges. In the console with imagemagick:

bash$ for image in *.png; do convert $image -trim tmp.png; mv tmp.png $image; done

2) Compress my library of 2000 jpegs, since I didn't need high-quality
jpegs AND raw images from my camera on my disk consuming space:

bash$ for image in `find . -name "*.jpg"`; do convert $image -quality 70 tmp.jpg; mv tmp.jpg $image; done

Using the console I was able to do both tasks in ~20 seconds quite easily.


> > Also - Python is pretty much the only language that
> > *isn't* overkill; once you take more than the first few
> > steps a language that's *consistent* will help more with
> > learning, a mon avis, than these "quicker" languages ever
> > will. Python is consistent and simple.
>
> Your statement is an oft cited misconception of the Python
> neophyte. I'm sorry to burst your bubble whilst also raining
> on your parade, but Python is NOT consistent. And the more i
> learn about Python the more i realize just how inconsistent
> the language is. Guido has the correct "base idea", however
> he allowed the evolution to become a train wreck.
>

You're right.  NameErrors should not be listed with the full traceback --
the last entry on the call stack is obviously the right way to treat this
special-case exception [1].  BTW, this comment amounts to Contradiction. [2]

It's sometimes difficult to reconcile several of your comments...


> > [...]
> > Basically, "kid" is a *very* generic term and there are
> > people who like GUIs and there are people who like
> > internals
>
> Your statement is true however it ignores the elephant in
> the room. You can "prefer" console over GUI all day long but
> that does not negate the fact that GUI's outperform the
> console for many tasks. With the exception of text based
> games, the console is as useful for game programming as
> [something not useful]
>

Just because you click through a GUI faster does not mean that everyone
else does, too.

And I've developed some Tkinter-based apps (that you can even download if
you were so interested)---I did all the development on the command-line
using vim and git.

All the best,
Jason

[1] http://mail.python.org/pipermail/python-list/2013-March/642963.html
[2]
https://upload.wikimedia.org/wikipedia/commons/a/a7/Graham%27s_Hierarchy_of_Disagreement1.svg
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: My son wants me to teach him Python

2013-06-14 Thread Jason Swails
On Fri, Jun 14, 2013 at 3:21 AM, Chris Angelico  wrote:

> On Fri, Jun 14, 2013 at 4:13 PM, Steven D'Aprano
>  wrote:
> > Here's another Pepsi Challenge for you:
> >
> > There is a certain directory on your system containing 50 text files, and
> > 50 non-text files. You know the location of the directory. You want to
> > locate all the text files in this directory containing the word
> > "halibut", then replace the word "halibut" with "trout", but only if the
> > file name begins with a vowel.
>
> That sounds extremely contrived, to be honest.


I agree that it sounds contrived, but I've found analogous tasks to be
quite common in the program suite I work on, actually.

We have a set of regression tests for obvious reasons.  To give an order of
magnitude estimate here, there are over 1100 saved test files that get
compared when we run the test suite.

When a change is made to the information reporting (for instance, if we
added a new input variable) or version number that is printed in the output
files, we have ourselves ~2K files.  We then have to scan through all 2K
files (some of which are ASCII, others of which are binary), typically
armed with a regex that identifies the formatting change we just
implemented and change the saved test files (all file names that end in
.save) to the 'new' format. Our task is to find only those files that end
in .save and replace only those files that differ only by the trivial
formatting change to avoid masking a bug in the test suite. [I'm actually
doing this now]
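
Schematically, the script amounts to little more than the following (the
pattern, replacement, and directory name here are made up purely for
illustration):

import os
import re

# Hypothetical formatting change: a bumped version string in the header
pattern = re.compile(r'Version 12\.0')
replacement = 'Version 12.1'

for dirpath, dirnames, filenames in os.walk('test'):
    for fname in filenames:
        if not fname.endswith('.save'):
            continue
        path = os.path.join(dirpath, fname)
        with open(path) as f:
            text = f.read()
        new_text = pattern.sub(replacement, text)
        if new_text != text:
            with open(path, 'w') as f:
                f.write(new_text)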

On the whole, it sounds quite similar to Steven's example (only
significantly more files), and is something not even RR could do in a GUI
faster than I can run a script.
-- 
http://mail.python.org/mailman/listinfo/python-list


on git gc --aggressive [was Re: Version Control Software]

2013-06-16 Thread Jason Swails
On Sat, Jun 15, 2013 at 11:55 PM, rusi  wrote:

> On Jun 16, 4:14 am, Chris Angelico  wrote:
> > On Sun, Jun 16, 2013 at 12:16 AM, Roy Smith  wrote:
> > > The advantage of DVCS is that everybody has a full copy of the repo.
> > > The disadvantage of the DVCS is that every MUST have a full copy of the
> > > repo.  When a repo gets big, you may not want to pull all of that data
> > > just to get the subtree you need.
> >
> > Yeah, and depending on size, that can be a major problem. While git
> > _will_ let you make a shallow clone, it won't let you push from that,
> > so it's good only for read-only repositories (we use git to manage
> > software deployments at work - shallow clones are perfect) or for
> > working with patch files.
> >
> > Hmm. ~/cpython/.hg is 200MB+, but ~/pike/.git is only 86MB. Does
> > Mercurial compress its content? A tar.gz of each comes down, but only
> > to ~170MB and ~75MB respectively, so I'm guessing the bulk of it is
> > already compressed. But 200MB for cpython seems like a lot.
>
> [I am assuming that you have run  "git gc --aggressive" before giving
> those figures]
>

Off-topic, but this is a bad idea in most cases.  This is a post including
an email from Torvalds proclaiming how and why git gc --aggressive is dumb
in 99% of cases and should rarely be used:

http://metalinguist.wordpress.com/2007/12/06/the-woes-of-git-gc-aggressive-and-how-git-deltas-work/

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Version Control Software

2013-06-16 Thread Jason Swails
On Sun, Jun 16, 2013 at 9:30 AM, Chris “Kwpolska” Warrick <
[email protected]> wrote:

> On Sun, Jun 16, 2013 at 1:14 AM, Chris Angelico  wrote:
> > Hmm. ~/cpython/.hg is 200MB+, but ~/pike/.git is only 86MB. Does
> > Mercurial compress its content? A tar.gz of each comes down, but only
> > to ~170MB and ~75MB respectively, so I'm guessing the bulk of it is
> > already compressed. But 200MB for cpython seems like a lot.
>
> Next time, do a more fair comparison.
>
> I created an empty git and hg repository, and created a file promptly
> named “file” with DIGIT ONE (0x31; UTF-8/ASCII–encoded) and commited
> it with “c1” as the message, then I turned it into “12” and commited
> as “c2” and did this one more time, making the file “123” at commit
> named “c3”.
>
> [kwpolska@kwpolska-lin .hg@default]% cat * */* */*/* 2>/dev/null | wc -c
> 1481
> [kwpolska@kwpolska-lin .git@master]% cat * */* */*/* */*/*/* 2>/dev/null
> | wc -c
> 16860 ← WRONG!
>
> There is just one problem with this: an empty git repository starts at
> 15216 bytes, due to some sample hooks.  Let’s remove them and try
> again:
>
> [kwpolska@kwpolska-lin .git@master]% rm hooks/*
> [kwpolska@kwpolska-lin .git@master]% cat * */* */*/* */*/*/* */*/*/*
> 2>/dev/null | wc -c
> 2499
>
> which is a much more sane number.  This includes a config file (in the
> ini/configparser format) and such.  According to my maths skils (or
> rather zsh’s skills), new commits are responsible for 1644 bytes in
> the git repo and 1391 bytes in the hg repo.
>

This is not a fair comparison, either.  If we want to do a fair comparison
pertinent to this discussion, let's convert the cpython mercurial
repository into a git repository and allow the git repo to repack the diffs
the way it deems fit.

I'm using the git-remote-hg.py script [
https://github.com/felipec/git/blob/fc/master/contrib/remote-helpers/git-remote-hg.py]
to clone a mercurial repo into a native git repo.  Then, in one of the rare
cases, using git gc --aggressive. [1]

The result:

Git:
cpython_git/.git $ du -h --max-depth=1
40K ./hooks
145M ./objects
20K ./logs
24K ./refs
24K ./info
146M .

Mercurial:
cpython/.hg $ du -h --max-depth=1
209M ./store
20K ./cache
209M .


And to help illustrate the equivalence of the two repositories:

Git:

cpython_git $ git log | head; git log | tail

commit 78f82bde04f8b3832f3cb6725c4bd9c8d705d13b
Author: Brett Cannon 
Date:   Sat Jun 15 23:24:11 2013 -0400

Make test_builtin work when executed directly

commit a7b16f8188a16905bbc1d49fe6fd940078dd1f3d
Merge: 346494a af14b7c
Author: Gregory P. Smith 
Date:   Sat Jun 15 18:14:56 2013 -0700
Author: Guido van Rossum 
Date:   Mon Sep 10 11:15:23 1990 +

Warning about incompleteness.

commit b5e5004ae8f54d7d5ddfa0688fc8385cafde0e63
Author: Guido van Rossum 
Date:   Thu Aug 9 14:25:15 1990 +

Initial revision

Mercurial:

cpython $ hg log | head; hg log | tail

changeset:   84163:5b90da280515
bookmark:master
tag: tip
user:Brett Cannon 
date:Sat Jun 15 23:24:11 2013 -0400
summary: Make test_builtin work when executed directly

changeset:   84162:7dee56b6ff34
parent:  84159:5e8b377942f7
parent:  84161:7e06a99bb821
user:Guido van Rossum 
date:Mon Sep 10 11:15:23 1990 +
summary: Warning about incompleteness.

changeset:   0:3cd033e6b530
branch:  legacy-trunk
user:Guido van Rossum 
date:Thu Aug 09 14:25:15 1990 +
summary: Initial revision

They both appear to have the same history.  In this particular case, it
seems that git does a better job in terms of space management, probably due
to the fact that it doesn't store duplicate copies of identical source code
that appears in different files (it tracks content, not files).

That being said, from what I've read both git and mercurial have their
advantages, both in the performance arena and the features/usability arena
(I only know how to really use git).  I'd certainly take a DVCS over a
centralized model any day.

All the best,
Jason

[1] I know I just posted in this thread about --aggressive being bad, but
the packing from the translation was horrible --> the translated git repo
was ~2 GB in size.  An `aggressive' repacking was necessary to allow git to
decide how to pack the diffs.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using Python to automatically boot my computer at a specific time and play a podcast

2013-06-16 Thread Jason Swails
On Sun, Jun 16, 2013 at 3:06 PM, C. N. Desrosiers wrote:

> Hi,
>
> I'm planning to buy a Macbook Air and I want to use it as a sort of alarm.
>  I'd like to write a program that boots my computer at a specific time,
> loads iTunes, and starts playing a podcast.  Is this sort of thing possible
> in Python?
>

Python cannot do this by itself, as has already been mentioned.

If you're using a Mac, you can schedule your computer to turn on (and/or
off) using System Preferences->Energy Saver->Schedule...

Then run a Python script in a cron job.
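
The script itself can be tiny.  A minimal sketch of the sort of thing I
mean ('open -a' is the OS X launcher mentioned above; driving iTunes beyond
that would go through AppleScript):

#!/usr/bin/env python
import subprocess

# Launch iTunes
subprocess.call(['open', '-a', 'iTunes'])

# Telling iTunes to start playing would be something like (untested):
# subprocess.call(['osascript', '-e', 'tell application "iTunes" to play'])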

In fact, you could do this in bash ;)

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is the argparse module so inflexible?

2013-06-27 Thread Jason Swails
On Thu, Jun 27, 2013 at 2:50 PM, Ethan Furman  wrote:

>
> If the OP is writing an interactive shell, shouldn't `cmd` be used instead
> of `argparse`?  argparse is, after all, intended for argument parsing of
> command line scripts, not for interactive work.
>

He _is_ using cmd.  He's subclassed cmd.Cmd and trying to use argparse to
handle argument parsing in the Cmd.precmd method to preprocess the user
input.

Let's say that one of the user's commands requires an integer but they give
a float or a string by accident: I believe the OP wants ArgumentParser to
pass the appropriate ArgumentError exception up to the calling routine
rather than catching it and calling ArgumentParser.error(), thereby
destroying information about the exception type and instead requiring him
to rely on parsing the far less reliable error message (since it's
reasonable to expect that kind of thing to change from version to version).
 This way his calling routine can indicate to the user what was wrong with
the last command and soldier on without bailing out of the program (or
doing something ugly like catching a SystemExit and parsing argparse's
error message).

As it stands now, ArgumentParser has only limited utility in this respect
since all parsing errors result in a SystemExit exception being thrown.
 Having subclassed cmd.Cmd myself in one of my programs and written my own
argument parsing class to service it, I can appreciate what the OP is
trying to do (and it's clever IMO).  While argparse was not specifically
designed for what the OP is trying to do, it would satisfy his needs nicely
except for the previously mentioned issue.

An alternative is, of course, to simply subclass ArgumentParser and copy
over all of the code that catches an ArgumentError to eliminate the
internal exception handling and instead allow the exceptions to propagate up
the call stack.

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is the argparse module so inflexible?

2013-06-27 Thread Jason Swails
On Thu, Jun 27, 2013 at 8:22 PM, Tim Chase wrote:

> On 2013-06-28 09:02, Cameron Simpson wrote:
> > On 27Jun2013 11:50, Ethan Furman  wrote:
> > | If the OP is writing an interactive shell, shouldn't `cmd` be used
> > | instead of `argparse`?  argparse is, after all, intended for
> > | argument parsing of command line scripts, not for interactive
> > work.
> >
> > I invoke command line scripts interactively. There's no special
> > case here.
> >
> > To add to the use case stats, I also subclass cmd and parse
> > interactive command lines. I'm beginning to be pleased I'm still
> > using Getopt for that instead of feeling I'm lagging behind the
> > times.
>
> I too have several small utilities that use a combination of cmd.Cmd,
> shlex.shlex(), and command-processing libraries.  However, much like
> Cameron's code using getopt, my older code is still using optparse
> which gives me the ability to override the error() method's default
> sys.exit() behavior and instead raise the exception of your choice.
>

There's nothing in argparse preventing this.  There's still an
ArgumentParser.error() method that you can override to raise an exception.
 The problem is that the original exception ArgumentParser raised when it
hit a parsing error was lost as soon as the parsing routine caught said
exception.

Therefore, your new error() method must parse the message being passed to
it in order to determine what error occurred and raise the corresponding
exception of your choice, or simply settle with telling the user there was
a generic argument parsing error that they have to figure out.
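
Just to illustrate, a minimal sketch of that kind of subclass (the choice
of ValueError is arbitrary):

import argparse

class NonExitingParser(argparse.ArgumentParser):
    def error(self, message):
        # By the time error() is called, argparse has already swallowed the
        # original ArgumentError; all we have left is the message string.
        raise ValueError(message)

parser = NonExitingParser(prog='shell')
parser.add_argument('-n', type=int)
try:
    parser.parse_args(['-n', 'not_an_int'])
except ValueError as err:
    print('bad command: %s' % err)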

Being a prolific user of argparse myself (I use it or optparse in nearly
every script I write, although I greatly prefer argparse), I recognize it
as an incredibly feature-packed, convenient, easy-to-use library.  It's too
bad that the utility of this library for non-commandline argument parsing
is limited by a seemingly unnecessary feature.

Of course, in RRsPy4k this whole module will just be replaced with "raise
PyWart('All interfaces must be graphical')" and this whole thread will be
moot. :)

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Important features for editors

2013-07-04 Thread Jason Swails
On Thu, Jul 4, 2013 at 2:52 PM, Ferrous Cranus  wrote:

> On 4/7/2013 9:40 PM, Grant Edwards wrote:
>
>  On 2013-07-04, ??  wrote:
>>
>>>
>>> If you guys want to use it i can send you a patch for it. I know its
>>> illegal thing to say but it will help you use it without buying it.
>>>
>>
>> A new low.  Now he's offering to help people steal others' work.
>>
>
> Like you never downloaded serials/keygens/patch/cracks for warez and
> torrents websites.
>
> What a hypocritism.my intensions was to help the OP.


No, I don't.  Ever.  And I imagine many others on this list are the same
way.  If I don't want to pay the asking price for a product, I won't use it.
 Period.  There's an open source solution to almost everything.  Learn to
use that instead.

Since many people on this list are in the business of making software, I'd
be willing to bet you are in the minority here. (The rather despised
minority, based on previous comments).  Even people that strongly believe
all software should be free would avoid Sublime Text (or start an inspired
open source project), not crack it.  Example: look at the history of git.

What you are doing (and offering to help others do in a public forum)
damages the very product you claim to like.  Commercial software is
maintained and improved by the funding provided by product sales, which you
deprive them of by your behavior.

The original offer was misguided at best, but this attempted defense that
casts everyone else down to your level (and avoids admitting wrongdoing) is
reprehensible.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Callable or not callable, that is the question!

2013-07-11 Thread Jason Swails
On Thu, Jul 11, 2013 at 9:05 AM, Ulrich Eckhardt <
[email protected]> wrote:

> Hello!
>
> I just stumbled over a case where Python (2.7 and 3.3 on MS Windows) fail
> to detect that an object is a function, using the callable() builtin
> function. Investigating, I found out that the object was indeed not
> callable, but in a way that was very unexpected to me:
>
> class X:
>     @staticmethod
>     def example():
>         pass
>     test1 = example
>     test2 = [example,]
>
> X.example() # OK
> X.test1() # OK
> X.test2[0]() # TypeError: 'staticmethod' object is not callable
>

Interestingly, you can actually use this approach to 'fake' staticmethod
before staticmethod was even introduced.  By accessing example from inside
the test2 class attribute list, there is no instance bound to that method
(even if an instance was used to access it).

Using Python 3.3:

Python 3.3.2 (default, Jun  3 2013, 08:29:09)
[GCC 4.5.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class X:
...     def example(): pass
...     test = example,
...
>>> X.test[0]()
>>>

Using Python 2.0 (pre-staticmethod):

Python 2.0.1 (#1, Aug 28 2012, 20:25:41)
[GCC 4.5.3] on linux3
Type "copyright", "credits" or "license" for more information.
>>> class X:
...     def example(): pass
...     test = example,
...
>>> X.test[0]()
>>> staticmethod
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: There is no variable named 'staticmethod'

Once you change test into an instance attribute, you get back to the
expected behavior

Python 3.3.2 (default, Jun  3 2013, 08:29:09)
[GCC 4.5.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class X:
...     def example(self): pass
...     test = example,
...
>>> inst = X()
>>> inst.example()
>>> inst.test[0]()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: example() missing 1 required positional argument: 'self'
>>> inst.test = inst.example,
>>> inst.test[0]()
>>>

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Strange behaviour with os.linesep

2013-07-23 Thread Jason Swails
On Tue, Jul 23, 2013 at 7:42 AM, Vincent Vande Vyvre <
[email protected]> wrote:

> On Windows a script where de endline are the system line sep, the files
> are open with a double line in Eric4, Notepad++ or Gedit but they are
> correctly displayed in the MS Bloc-Notes.
>
> Example with this code:
> --**
> # -*- coding: utf-8 -*-
>
> import os
> L_SEP = os.linesep
>
> def write():
> strings = ['# -*- coding: utf-8 -*-\n',
> 'import os\n',
> 'import sys\n']
> with open('writetest.py', 'w') as outf:
> for s in strings:
> outf.write(s.replace('\n', L_SEP))
>

I must ask why you are setting strings with a newline line ending only to
replace them later with os.linesep.  This seems convoluted compared to
doing something like

def write():
    strings = ['#-*- coding: utf-8 -*-', 'import os', 'import sys']
    with open('writetest.py', 'w') as outf:
        for s in strings:
            outf.write(s)
            outf.write(L_SEP)

Or something equivalent.

If, however, the source strings come from a file you've created somewhere
(and are loaded by reading in that file line by line), then I can see a
problem.  DOS line endings are a carriage return plus a newline ('\r\n'),
whereas standard UNIX files use just newlines ('\n').  Therefore, if you are
using the code:

s.replace('\n', L_SEP)

in Windows, using a Windows-generated file, then what you are likely doing
is converting the string sequence '\r\n' into '\r\r\n', which is not what
you want to do.  I can imagine some text editors interpreting that as two
endlines (since there are 2 \r's).  Indeed, when I execute the code:

>>> l = open('test.txt', 'w')
>>> l.write('This is the first line\r\r\n')
>>> l.write('This is the second\r\r\n')
>>> l.close()

on UNIX and open the resulting file in gedit, it is double-spaced, but if I
just dump it to the screen using 'cat', it is single-spaced.

If you want to make your code a bit more cross-platform, you should strip
out all types of end line characters from the strings before you write
them.  So something like this:

with open('writetest.py', 'w') as outf:
    for s in strings:
        outf.write(s.rstrip('\r\n'))
        outf.write(L_SEP)

Hope this helps,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: tkinter progress bar

2013-07-23 Thread Jason Swails
On Tue, Jul 23, 2013 at 5:38 AM,  wrote:

> Dear Christian,
>
> Thanks for the help. Can you please add a source example as I am new with
> Tkinter.
>

 http://docs.python.org/2/library/ttk.html#progressbar

You can do something like this:

#!/usr/bin/env python

import Tkinter as tk
import ttk
import time

class MainApp(tk.Frame):

    def __init__(self, master):
        tk.Frame.__init__(self, master)
        self.progress = ttk.Progressbar(self, maximum=10)
        self.progress.pack(expand=1, fill=tk.BOTH)
        # Start stepping the bar when it is clicked
        self.progress.bind("<Button-1>", self._loop_progress)

    def _loop_progress(self, *args):
        for i in range(10):
            self.progress.step(1)
            # Necessary to update the progress bar appearance
            self.update()
            # Busy-wait
            time.sleep(2)


if __name__ == '__main__':
    root = tk.Tk()
    app = MainApp(root)
    app.pack(expand=1, fill=tk.BOTH)
    root.mainloop()


This is a simple stand-alone app that (just) demonstrates how to use the
ttk.Progressbar widget.

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Strange behaviour with os.linesep

2013-07-23 Thread Jason Swails
On Tue, Jul 23, 2013 at 9:26 AM, Vincent Vande Vyvre <
[email protected]> wrote:

> Le 23/07/2013 15:10, Vincent Vande Vyvre a écrit :
>
>  The '\n' are in the original file.
>>
>> I've tested these other versions:
>>
>> --**-
>> def write():
>> strings = ['# -*- coding: utf-8 -*-\n',
>> 'import os\n',
>> 'import sys\n']
>> with open('writetest.py', 'w') as outf:
>> txt = L_SEP.join([s.rstip() for s in strings]):
>> outf.write(txt)
>> --
>>
>> --**-
>> def write():
>> strings = ['# -*- coding: utf-8 -*-',
>> 'import os',
>> 'import sys']
>> with open('writetest.py', 'w') as outf:
>> txt = L_SEP.join( strings):
>> outf.write(txt)
>> --
>>
>> Las, no changes, always correctly displayed in MS bloc-notes but with
>> double line in other éditors.
>>
>>
> Also with:
>
> --**--
> def count():
> with open('c:\\Users\\Vincent\\**writetest.py', 'r') as inf:
> lines = inf.readlines()
> for l in lines:
> print(l, len(l))
>

Unrelated comment, but in general it's (much) more efficient to iterate
through a file rather than iterate through a list of strings generated by
readlines():

def count():
    with open('c:\\Users\\Vincent\\writetest.py', 'r') as inf:
        for l in inf:
            print(l, len(l))

It's also fewer lines of code.

('# -*- coding: utf-8 -*-\r\n', 25)
> ('import os\r\n', 11)
> ('import sys', 10)


Then it seems like there is an issue with your text editors that do not
play nicely with DOS-style line endings.  Gedit on my linux machine
displays the line endings correctly (that is, '\r\n' is rendered as a
single line break).

Good luck,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Critic my module

2013-07-27 Thread Jason Swails
You've gotten plenty of good advice from people discussing the coding and
coding style itself, so I'll provide some feedback from the vantage point of
a prospective user.


On Thu, Jul 25, 2013 at 9:24 AM, Devyn Collier Johnson <
[email protected]> wrote:

> Aloha Python Users!
>
>I made a Python3 module that allows users to use certain Linux shell
> commands from Python3 more easily than using os.system(),
> subprocess.Popen(), or subprocess.getoutput(). This module (once placed
> with the other modules) can be used like this
>
> import boash; boash.ls()
>

I actually wrote a program recently in which I wanted access to the unix
"ls" command, and I wanted it to behave as close to the real UNIX "ls" as
possible.

This would seem like a perfect use-case for your module, but the problem is
that the 'ls' command in your module does not behave much like the real
'ls' command.  You never let any of the 'system' commands in your module
access any arguments.  More often than not, I use "ls" with several
command-line arguments, like:

ls --color=auto -lthr dir_basename*/

Even if you're just spawning 'ls' directly, this is actually non-trivial to
implement.  You need globbing on all non-option arguments, and you may want
to pass up the return code somehow, depending on what the user wants to do:

[bash ]$ ls nodir
ls: nodir: No such file or directory
[bash ]$ echo $?
1

Also, 'ls' run interactively in a terminal behaves like "ls -C", which is
not what you get when it is spawned from your module (its output is a pipe,
not a terminal).  In the framework of my program, my 'ls' command looks like
this:

class ls(Action):
    """
    Lists directory contents. Like UNIX 'ls'
    """
    needs_parm = False

    def init(self, arg_list):
        from glob import glob
        self.args = []
        # Process the argument list to mimic the real ls as much as possible
        while True:
            try:
                arg = arg_list.get_next_string()
                if not arg.startswith('-'):
                    # Glob this argument
                    globarg = glob(arg)
                    if len(globarg) > 0:
                        self.args.extend(globarg)
                    else:
                        self.args.append(arg)
                else:
                    self.args.append(arg)
            except NoArgument:
                break

    def __str__(self):
        from subprocess import Popen, PIPE
        process = Popen(['/bin/ls', '-C'] + self.args, stdout=PIPE, stderr=PIPE)
        out, err = process.communicate('')
        process.wait()
        return out + err

[I have omitted the Action base class, which processes the user
command-line arguments and passes it to the init() method in arg_list --
this listing was just to give you a basic idea of the complexity of getting
a true-er 'ls' command].

Your 'uname' command is likewise limited (and the printout looks strange):
>>> print(platform.uname())
('Linux', 'Batman', '3.3.8-gentoo', '#1 SMP Fri Oct 5 14:14:57 EDT 2012',
'x86_64', 'AMD FX(tm)-6100 Six-Core Processor')

Whereas:

[bash $] uname -a
Linux Batman 3.3.8-gentoo #1 SMP Fri Oct 5 14:14:57 EDT 2012 x86_64 AMD
FX(tm)-6100 Six-Core Processor AuthenticAMD GNU/Linux

You may want to change that to:

def uname():
    print(' '.join(platform.uname()))

Although again, oftentimes people want only something specific from uname
(like -m or -n).

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


question about numpy, subclassing, and a DeprecationWarning

2012-06-27 Thread Jason Swails
Hello,

I'm running into an unexpected issue in a program I'm writing, and I was
hoping someone could provide some clarification for me.  I'm trying to
subclass numpy.ndarray (basically create a class to handle a 3D grid).
 When I instantiate a numpy.ndarray, everything works as expected.  When I
call numpy.ndarray's constructor directly within my subclass, I get a
deprecation warning about object.__init__ not taking arguments.  Presumably
this means that ndarray's __init__ is somehow (for some reason?) calling
object's __init__...

This is some sample code:

>>> import numpy as np
>>> class derived(np.ndarray):
...     def __init__(self, stuff):
...         np.ndarray.__init__(self, stuff)
...
>>> l = derived((2,3))
__main__:3: DeprecationWarning: object.__init__() takes no parameters
>>> l
derived([[  8.87744455e+159,   6.42896975e-109,   5.56218818e+180],
         [  1.79996515e+219,   2.41625066e+198,   5.15855295e+307]])
>>>

Am I doing something blatantly stupid?  Is there a better way of going
about this?  I suppose I could create a normal class and just put the grid
points in a ndarray as an attribute to the class, but I would rather
subclass ndarray directly (not sure I have a good reason for it, though).
 Suggestions on what I should do?
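
For what it's worth, the numpy docs on subclassing seem to suggest routing
the constructor arguments through __new__ rather than __init__ -- something
like the sketch below -- but I'm not sure whether that is the right pattern
for what I'm doing:

import numpy as np

class Grid(np.ndarray):
    def __new__(cls, shape):
        # ndarray does its real construction in __new__, not __init__
        return np.ndarray.__new__(cls, shape)

g = Grid((2, 3))   # no DeprecationWarning this way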

Thanks!
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: consistent input() for Python 2 and 3

2012-08-02 Thread Jason Swails
On Thu, Aug 2, 2012 at 5:49 AM, Ulrich Eckhardt <
[email protected]> wrote:

> Hi!
>
> I'm trying to write some code that should work with both Python 2 and 3.
> One of the problems there is that the input() function has different
> meanings, I just need the raw_input() behaviour of Python 2.
>
>
> My approach is to simply do this:
>
>   try:
>   # redirect input() to raw_input() like Python 3
>   input = raw_input
>   except NameError:
>   # no raw input, probably running Python 3 already
>   pass
>
>
> What do you think? Any better alternatives?
>

Depending on how much user input is needed in your application, you can
always use the 'cmd' module: http://docs.python.org/library/cmd.html

It is present in both Python 2 and Python 3 and should just 'do the right
thing'.  It also seamlessly integrates readline (if present),
command-completion, and provides a built-in help menu for defined commands.
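
A minimal sketch, just to give the flavor (the command names here are made
up):

import cmd

class MyShell(cmd.Cmd):
    prompt = '>> '

    def do_greet(self, line):
        """greet [name] -- print a friendly greeting"""
        print('Hello, %s!' % (line or 'world'))

    def do_quit(self, line):
        """quit -- leave the shell"""
        return True  # returning a true value stops cmdloop()

if __name__ == '__main__':
    MyShell().cmdloop()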

It's written in pure Python, and in my opinion, the best form of
documentation for that module is the source code itself.

I haven't used it in Python 3, but I imagine it can be used in a way that
easily supports Python 2 and 3.  If you have only one or two places where
you need user-input, this is probably overkill.

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to get initial absolute working dir reliably?

2012-08-18 Thread Jason Swails
On Sat, Aug 18, 2012 at 11:19 AM, kj  wrote:

>
> Basically, I'm looking for a read-only variable (or variables)
> initialized by Python at the start of execution, and from which
> the initial working directory may be read or computed.
>

This will work for Linux and Mac OS X (and maybe Cygwin, but unlikely for
native Windows): try the PWD environment variable.

>>> import os
>>> os.getcwd()
'/Users/swails'
>>> os.getenv('PWD')
'/Users/swails'
>>> os.chdir('..')
>>> os.getcwd()
'/Users'
>>> os.getenv('PWD')
'/Users/swails'

Of course this environment variable can still be messed with, but there
isn't much reason to do so generally (if I'm mistaken here, someone please
correct me).

Hopefully this is of some help,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Make error when installing Python 1.5

2012-08-28 Thread Jason Swails
On Sun, Aug 26, 2012 at 9:54 PM, Steven D'Aprano <
[email protected]> wrote:

> Yes, you read the subject line right -- Python 1.5. Yes, I am nuts ;)
>
> (I like having old versions of Python around for testing historical
> behaviour.)
>
> On Debian squeeze, when I try to build Python 1.5, I get this error:
>
> fileobject.c:590: error: conflicting types for ‘getline’
> /usr/include/stdio.h:651: note: previous declaration of ‘getline’ was here
> make[1]: *** [fileobject.o] Error 1
> make[1]: Leaving directory `/home/steve/personal/python/Python-1.5.2/
> Objects'
> make: *** [Objects] Error 2
>

FWIW, I got the same error when I tried (Gentoo, with both GCC 4.1.2 and
4.5.3), and it worked just fine when I tried it on a CentOS 5 machine
(consistent with your observations).  There's a reasonably easy fix,
though, that appears to work.

You will need the compile line for that source file (and you'll need to go
into the Objects/ dir).  For me it was:

gcc -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o fileobject.o fileobject.c

Following Cameron's advice, use the -E flag to produce a pre-processed
source file, such as the command below:

gcc -E -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o fileobject_.c fileobject.c

Edit this fileobject_.c file and remove the stdio prototype of getline.
 Then recompile using the original compile line (on fileobject_.c):

gcc -g -O2 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o fileobject.o fileobject_.c

For me this finishes fine.  Then go back to the top-level directory and
resume "make".  It finished for me (and seems to be working):

Batman src # python1.5
Python 1.5.2 (#1, Aug 28 2012, 20:13:23)  [GCC 4.5.3] on linux3
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import sys
>>> dir(sys)
['__doc__', '__name__', '__stderr__', '__stdin__', '__stdout__', 'argv',
'builtin_module_names', 'copyright', 'exc_info', 'exc_type', 'exec_prefix',
'executable', 'exit', 'getrefcount', 'hexversion', 'maxint', 'modules',
'path', 'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval',
'setprofile', 'settrace', 'stderr', 'stdin', 'stdout', 'version']
>>> sys.version
'1.5.2 (#1, Aug 28 2012, 20:13:23)  [GCC 4.5.3]'
>>>

Good luck,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: portable way of locating an executable (like which)

2012-09-20 Thread Jason Swails
On Thu, Sep 20, 2012 at 5:06 PM, Gelonida N  wrote:

> I'd like to implement the equivalent functionality of the unix command
> /usr/bin/which
>
> The function should work under Linux and under windows.
>
> Did anybody already implement such a function.
> If not, is there a portable way of splitting the environment variable PATH?
>

I've used the following in programs I write:

import os

def which(program):
    def is_exe(fpath):
        return os.path.exists(fpath) and os.access(fpath, os.X_OK)

    fpath, fname = os.path.split(program)
    if fpath:
        if is_exe(program):
            return program
    else:
        for path in os.getenv("PATH").split(os.pathsep):
            exe_file = os.path.join(path, program)
            if is_exe(exe_file):
                return exe_file
    return None

IIRC, I adapted it from StackOverflow.  I know it works on Linux and Mac OS
X, but not sure about Windows (since I don't know if PATH works the same
way there).

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tkinter Create/Destory Button

2012-10-22 Thread Jason Swails
On Fri, Oct 19, 2012 at 4:11 PM,  wrote:

> I am trying to create a button in Tkinter and then when it is pressed
> delete it/have it disappear. Does anyone know the simplest script to do
> that with. Thanks for your help.
>

Note that there is a _big_ difference between having a button 'disappear'
and having it destroyed or deleted.  If it 'disappears', it still exists
and can be re-activated whenever you want (as long as you keep an active
reference to it).  Once you destroy the widget, there is no recovering it.

My suggestion is to provide some code that you've tried, and allow people
to help explain why it did not work the way you thought it would.

And take a look at http://effbot.org/tkinterbook/tkinter-index.htm -- it
has proven to be a helpful reference to me.  The actual code to do what you
want is not complex (it took me ~20 lines), but learning how to do it is
quite helpful.

Good luck,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


confusion with decorators

2013-01-30 Thread Jason Swails
Hello,

I was having some trouble understanding decorators and inheritance and all
that.  This is what I was trying to do:

# untested
class A(object):
    def _protector_decorator(fcn):
        def newfcn(self, *args, **kwargs):
            return fcn(self, *args, **kwargs)
        return newfcn

    @_protector_decorator
    def my_method(self, *args, **kwargs):
        """ do something here """

class B(A):
    def _protector_decorator(fcn):
        def newfcn(self, *args, **kwargs):
            raise MyException('I do not want B to be able to access the '
                              'protected functions')
        return newfcn

The goal of all that was to be able to change the behavior of my_method
inside class B simply by redefining the decorator. Basically, what I want
is B.my_method() to be decorated by B._protector_decorator, but in the code
I'm running it's decorated by A._protector_decorator.

I presume this is because once the decorator is applied to my_method in
class A, A.my_method is immediately bound to the new, 'decorated' function,
which is subsequently inherited (and not decorated, obviously), by B.
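
A stripped-down illustration of what I think is going on (made-up names,
Python 2 syntax):

def deco_a(fcn):
    def newfcn(self):
        print 'wrapped by A'
        return fcn(self)
    return newfcn

class A(object):
    @deco_a                   # my_method is replaced by newfcn right here,
    def my_method(self):      # while the body of class A is being executed
        pass

class B(A):
    pass                      # B simply inherits that already-wrapped object

B().my_method()               # prints 'wrapped by A'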

Am I correct here?  My workaround was to simply copy the method from class
A to class B, after which B._protector_decorator decorated the methods in
B.  While this doesn't make the use of decorators completely pointless (the
decorators actually do something in each class, it's just different), it
does add a bunch of code duplication which I was at one point hopeful to
avoid.

I'm still stumbling around with decorators a little, but this exercise has
made them a lot clearer to me.

Thanks!
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: confusion with decorators

2013-01-31 Thread Jason Swails
On Thu, Jan 31, 2013 at 12:46 AM, Steven D'Aprano <
[email protected]> wrote:

> On Wed, 30 Jan 2013 19:34:03 -0500, Jason Swails wrote:
>
> > Hello,
> >
> > I was having some trouble understanding decorators and inheritance and
> > all that.  This is what I was trying to do:
> >
> > # untested
> > class A(object):
> >def _protector_decorator(fcn):
> >   def newfcn(self, *args, **kwargs):
> >  return fcn(self, *args, **kwargs)
> >   return newfcn
>
> Well, that surely isn't going to work, because it always decorates the
> same function, the global "fcn".
>

I don't think this is right.  fcn is a passed function (at least if it acts
as a decorator) that is declared locally in the _protector_decorator scope.
 Since newfcn is bound in the same scope and fcn is not defined inside
newfcn, I'm pretty sure that newfcn will just grab the fcn passed into the
decorator.

The following code illustrates what I'm trying to say (I think):

test.py:
#!/usr/bin/env python

a = 3

print 'Global namespace:', a

def myfunc(a):
   def nested_func():
      print 'nested_func a is:', a, 'id(a) =', id(a)

   print 'passed a is:', a, 'id(a) = ', id(a)
   nested_func()

myfunc(10)

$ python test.py
Global namespace: 3
passed a is: 10 id(a) =  6416096
nested_func a is: 10 id(a) = 6416096

Likewise, newfcn will use the function bound to the passed argument to the
decorator.  This syntax appears to work in my 'real' program.


> You probably want to add an extra parameter to the newfcn definition:
>
> def newfcn(self, fcn, *args, **kwargs):
>

I don't think I want to do that, since fcn  will simply become the first
argument that I pass to the decorated myfunc(), and if it's not callable
I'll get a traceback.

> Also, I trust you realise that this is a pointless decorator that doesn't
> do anything useful? It just adds an extra layer of indirection, without
> adding any functionality.
>

Naturally.  I tried to contrive the simplest example to demonstrate what I
wanted.  In retrospect I should've picked something functional instead.

> > Am I correct here?  My workaround was to simply copy the method from
> > class A to class B, after which B._protector_decorator decorated the
> > methods in B.
>
> That's not a work-around, that's an anti-pattern.
>
> Why is B inheriting from A if you don't want it to be able to use A's
> methods? That's completely crazy, if you don't mind me saying so. If you
> don't want B to access A's methods, simply don't inherit from A.
>
> I really don't understand what you are trying to accomplish here.
>

Again, my example code is over-simplified.  A brief description of my class
is a list of 'patch' (diff) files with various attributes.  If I want
information from any of those files, I instantiate a Patch instance (and
cache it for later use if desired) and return any of the information I want
from that patch (like when it was created, who created it, what files will
be altered in the patch, etc.).

But a lot of these patches are stored online, so I wanted a new class (a
RemotePatchList) to handle lists of patches in an online repository.  I can
do many of the things with an online patch that I can with one stored
locally, but not everything, hence my desire to squash the methods I don't
want to support.

I'd imagine a much more sensible approach is to generate a base class that
implements all methods common to both and simply raises an exception in
those methods that aren't.  I agree it doesn't make much sense to inherit
from an object that has MORE functionality than you want.

However, my desire to use decorators was not to disable methods in one
class vs. another.  The _protector_decorator (a name borrowed from my
actual code), is designed to wrap a function call inside a try/except, to
account for specific exceptions I might raise inside.  One of my classes
deals with local file objects, and the other deals with remote file objects
via urllib.  Naturally, the latter has other exceptions that can be raised,
like HTTPError and the like.  So my desire was to override the decorator to
handle more types of exceptions, but leave the underlying methods intact
without duplicating them.

I can do this without decorators easily enough, but I thought the decorator
syntax was a bit more elegant and I saw an opportunity to learn more about
them.

Possibly Java.
>

I took a Java class in high school once ~10 years ago... haven't used it
since. :)  Truth be told, outside of Python, the languages I can work in
are Fortran (and to a much lesser extent), C and C++.

import functools
>

I need to support Python 2.4, and the docs suggest this is 2.5+.  Too bad,
too, since functools appears pretty useful.

Thanks for the help!
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: confusion with decorators

2013-01-31 Thread Jason Swails
On Thu, Jan 31, 2013 at 10:28 AM, Chris Angelico  wrote:

>
> >> Well, that surely isn't going to work, because it always decorates the
> >> same function, the global "fcn".
> >
> >
> > I don't think this is right.  fcn is a passed function (at least if it
> acts
> > as a decorator) that is declared locally in the _protector_decorator
> scope.
> > Since newfcn is bound in the same scope and fcn is not defined inside
> > newfcn, I'm pretty sure that newfcn will just grab the fcn passed into
> the
> > decorator.
>
> Yet it adds a level of indirection that achieves nothing. Why not simply:
> def _protector_decorator(fcn):
>   return fcn
>
> ? I'm not understanding the purpose here.
>

Bad example.  A better (longer) one that is closer to my true use-case:


from somewhere.exceptions import MyTypeError
from somewhere.different import AuthorClass, RemoteAuthorClass
from urllib2 import HTTPError

class A(object):

   authorclass = AuthorClass

   def __init__(self, obj_list):
      """
      Instantiate a list of obj_list objects that may have an "author"
      attribute
      """
      self.things = []
      for o in obj_list:
         if not isinstance(o, self.authorclass):
            raise MyTypeError('Bad type given to constructor')
         self.things.append(o)

   def _protector(fcn):
      def newfcn(self, *args, **kwargs):
         try:
            return fcn(self, *args, **kwargs) # returns a string
         except AttributeError:
            return 'Attribute not available.'
         except IndexError:
            return 'Not that many AuthorClasses loaded'

      return newfcn

   @_protector
   def author(self, idx):
      return self.things[idx].author

   @_protector
   def description(self, idx):
      return self.things[idx].description

   @_protector
   def hobbies(self, idx):
      return self.things[idx].hobbies

class B(A):

   authorclass = RemoteAuthorClass

   def _protector(fcn):
      def newfcn(self, *args, **kwargs):
         try:
            return fcn(self, *args, **kwargs)
         except AttributeError:
            return 'Attribute not available'
         except IndexError:
            return 'Not that many RemoteAuthorClasses loaded'
         except HTTPError:
            return 'Could not connect'
      return newfcn

Basically, while RemoteAuthorClass and AuthorClass are related (via
inheritance), the RemoteAuthorClass has the potential for HTTPError's now.
 I could just expand the A class decorator to catch the HTTPError, but
since that should not be possible in AuthorClass, I'd rather not risk
masking a bug.  I'm under no impressions that the above code will decorate
A-inherited functions with the B-decorator (I know it won't), but that's
the effect I'm trying to achieve...

Thanks!
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: confusion with decorators

2013-01-31 Thread Jason Swails
On Thu, Jan 31, 2013 at 11:00 AM, Jason Swails wrote:

>
>
> On Thu, Jan 31, 2013 at 10:28 AM, Chris Angelico  wrote:
>
>>
>> >> Well, that surely isn't going to work, because it always decorates the
>> >> same function, the global "fcn".
>> >
>> >
>> > I don't think this is right.  fcn is a passed function (at least if it
>> acts
>> > as a decorator) that is declared locally in the _protector_decorator
>> scope.
>> > Since newfcn is bound in the same scope and fcn is not defined inside
>> > newfcn, I'm pretty sure that newfcn will just grab the fcn passed into
>> the
>> > decorator.
>>
>> Yet it adds a level of indirection that achieves nothing. Why not simply:
>> def _protector_decorator(fcn):
>>   return fcn
>>
>> ? I'm not understanding the purpose here.
>>
>
> Bad example.  A better (longer) one that is closer to my true use-case:
>
>
> from somewhere.exceptions import MyTypeError
> from somewhere.different import AuthorClass, RemoteAuthorClass
> from urllib2 import HTTPError
>
> class A(object):
>
>authorclass = AuthorClass
>
>def __init__(self, obj_list):
>   """
>   Instantiate a list of obj_list objects that may have an "author"
>   attribute
>   """
>   self.things = []
>   for o in obj_list:
>  if not isinstance(o, self.authorclass):
> raise MyTypeError('Bad type given to constructor')
>  self.things.append(o)
>
>def _protector(fcn):
>   def newfcn(self, *args, **kwargs):
>  try:
> return fcn(self, *args, **kwargs) # returns a string
>  except AttributeError:
> return 'Attribute not available.'
>  except IndexError:
> return 'Not that many AuthorClasses loaded'
>
>   return newfcn
>
>@_protector
>def author(self, idx):
>   return self.things[idx].author
>
>@_protector
>def description(self, idx):
>   return self.things[idx].description
>
>@_protector
>def hobbies(self, idx):
>   return self.things[idx].hobbies
>
> class B(A):
>
>authorclass = RemoteAuthorClass
>
>def _protector(fcn):
>   def newfcn(self, *args, **kwargs):
>  try:
> return fcn(self, *args, **kwargs)
>  except AttributeError:
> return 'Attribute not available'
>  except IndexError:
> return 'Not that many RemoteAuthorClasses loaded'
>  except HTTPError:
> return 'Could not connect'
>   return fcn
>
> Basically, while RemoteAuthorClass and AuthorClass are related (via
> inheritance), the RemoteAuthorClass has the potential for HTTPError's now.
>  I could just expand the A class decorator to catch the HTTPError, but
> since that should not be possible in AuthorClass, I'd rather not risk
> masking a bug.  I'm under no impressions that the above code will decorate
> A-inherited functions with the B-decorator (I know it won't), but that's
> the effect I'm trying to achieve...
>

The approach I'm switching to here is to make the decorators wrappers
instead that are passed the functions that need to be called.  Basically,
wrap at run-time rather than 'compile time' (i.e., when the Python code is
'compiled' into class definitions).  That way each child of the main class
can simply change the wrapping behavior by implementing the wrapping
functions instead of duplicating all of the code.  And since this part of
the code is not performance-intensive, I don't care about the overhead of
extra function calls.

It seems to me to be the more appropriate course of action here, since
decorators don't seem to naturally lend themselves to what I'm trying to do.

--Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: confusion with decorators

2013-01-31 Thread Jason Swails
On Thu, Jan 31, 2013 at 6:16 PM, Steven D'Aprano <
[email protected]> wrote:

>
> Normally, subclasses should extend functionality, not take it away. A
> fundamental principle of OO design is that anywhere you could sensibly
> allow an instance, should also be able to use a subclass.
>
> So if you have a Patch class, and a RemotePatch subclass, then everything
> that a Patch can do, a RemotePatch can do too, because RemotePatch
> instances *are also* instances of Patch.
>
> But the rule doesn't go in reverse: you can't necessarily use a Patch
> instance where you were using a RemotePatch. Subclasses are allowed to do
> *more*, but they shouldn't do *less*.
>
> On the other hand, if you have a Patch class, and a RemotePatchList class,
> inheritance does not seem to be the right relationship here. A
> RemotePatchList does not seem to be a kind of Patch, but a kind of list.
>
>
> > I'd imagine a much more sensible approach is to generate a base class
> that
> > implements all methods common to both and simply raises an exception in
> > those methods that aren't.  I agree it doesn't make much sense to inherit
> > from an object that has MORE functionality than you want.
>
> If a method is not common to both, it doesn't belong in the base class. The
> base should only include common methods.
>

Yes, I agree here.  The only reason I was considering NOT doing this was
because I wanted to control the exception that gets raised rather than let
through a simple NameError.  The reason, in case you care, is that I like
creating my own custom excepthook() which optionally suppresses tracebacks
of the base exception class of my program (which can be overridden by a
--debug option of some sort).

That way I don't worry about returning error codes and the like and my
exceptions double as error messages which don't scare users away.  Of
course, if I didn't raise the exception myself, then I definitely want to
know what line that error occurred on so I can fix it (since that typically
means it's a bug or error I did not handle gracefully).
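
Roughly, the excepthook I have in mind looks like this (a sketch only; the
MyProgramError base class and the DEBUG flag are made-up stand-ins):

import sys
import traceback

DEBUG = False        # imagine a --debug option flips this to True

class MyProgramError(Exception):
    """ base class for all of the 'expected' errors I raise myself """

def excepthook(exc_type, exc_value, exc_tb):
    if issubclass(exc_type, MyProgramError) and not DEBUG:
        # one of 'my' errors: print the message, hide the scary traceback
        sys.stderr.write('%s: %s\n' % (exc_type.__name__, exc_value))
    else:
        # anything else is a genuine bug -- show the full traceback
        traceback.print_exception(exc_type, exc_value, exc_tb)

sys.excepthook = excepthook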

I suppose I could get the same effect by dumping everything into a main()
function somewhere and wrapping that in a try/except where I catch my base
class, but I like the flexibility of the excepthook approach.


> In fact, I'm usually rather suspicious of base classes that don't ever get
> used except as a base for subclassing. I'm not saying it's wrong, but it
> could be excessive abstraction. Abstraction is good, but you can have too
> much of a good thing. If the base class is not used, consider a flatter
> hierarchy:
>
> class Patch:  ...
> class RemotePatch(Patch):  ...
>
>
> rather than:
>
> class PatchBase:  ...
> class Patch(PatchBase):  ...
> class RemotePatch(Patch):  ...
>
> although this is okay:
>
> class PatchBase:  ...
> class Patch(PatchBase):  ...
> class RemotePatch(PatchBase):  ...
>

This last one is what I've settled on.  Patch and RemotePatch have common
functionality.  But RemotePatch can be downloaded and Patch can be parsed
through (in my app, if you're going to spend the time to parse through the
whole RemotePatch, it just gets downloaded and instantiated as a Patch).
 So this last form of inheritance made the most sense to me.


>
>
> > However, my desire to use decorators was not to disable methods in one
> > class vs. another.  The _protector_decorator (a name borrowed from my
> > actual code), is designed to wrap a function call inside a try/except, to
> > account for specific exceptions I might raise inside.
>
> Ah, your example looked like you were trying to implement some sort of
> access control, where some methods were flagged as "protected" to prevent
> subclasses from using them. Hence my quip about Java. What you describe
> here makes more sense.
>
>
> > One of my classes
> > deals with local file objects, and the other deals with remote file
> > objects
> > via urllib.  Naturally, the latter has other exceptions that can be
> > raised,
> > like HTTPError and the like.  So my desire was to override the decorator
> > to handle more types of exceptions, but leave the underlying methods
> > intact without duplicating them.
>
> >>> decorated(3)
> 4
>
> One way to do that is to keep a list of exceptions to catch:
>
>
> class Patch:
> catch_these = [SpamException, HamException]
> def method(self, arg):
> try:
> do_this()
> except self.catch_these:
> do_that()
>
> The subclass can then extend or replace that list:
>
> class RemotePatch(Patch):
> catch_these = Patch.catch_these + [EggsException, CheeseException]
>

Ha! I use this technique all the time to avoid code duplication (it's used
several times in the program I'm writing).  It didn't even occur to me in
this context... Thanks for pointing this out!

As always, the time you put into responses and helping is appreciated.

All the best,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: TypeError: 'module' object is not callable

2011-08-11 Thread Jason Swails
On Thu, Aug 11, 2011 at 8:56 PM, Forafo San  wrote:

>
> Thank you all for your replies. When I do a
>
> from Univariate import Univariate
>
> the TypeError disappears and everything is fine.  Clearly this was an error
> that a newbie such as myself is likely to make because of little experience
> with Python. However, this isn't something I'm likely to forget.
>
> I will also adopt the style recommendations.  Thanks, again.
> --
> http://mail.python.org/mailman/listinfo/python-list
>

As a beginner, I found that it was useful to note how these 2 approaches
differ.  In the first approach:

import Univariate
a = Univariate.Univariate(foo)

what you're doing is loading everything in Univariate in a separate
namespace (the Univariate namespace).  The other approach:

from Univariate import Univariate
a = Univariate(foo)

what you're doing is loading *just* Univariate into your top level
namespace.  It's important to note this difference because it has
potentially critical implications as far as clobbering existing functions
that happen to have the same name (if you're loading into your top level
namespace with, for instance, "from mymodule import *") versus keeping
everything in a separate namespace so stuff doesn't get overwritten.

This is more applicable to scripts/programs you write that import a number
of different modules, all of whom may contain objects with the same name.
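
A tiny, contrived example of the clobbering danger (Python 2):

def sqrt(x):
    return 'my very own sqrt of %s' % x

from math import *    # silently rebinds sqrt (among many other names)

print sqrt(2)          # 1.4142135623730951 -- the local definition is gone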

All the best,
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Ten rules to becoming a Python community member.

2011-08-16 Thread Jason Swails
On Tue, Aug 16, 2011 at 9:14 PM, Steven D'Aprano <
[email protected]> wrote:

> David Monaghan wrote:
>
> > On Tue, 16 Aug 2011 13:13:10 -0700 (PDT), rantingrick
> >  wrote:
> >
> >>If conciseness is all you seek then perhaps you prefer the following?
> >>
> >>ORIGINAL: "I used to wear wooden shoes"
> >>CONCISE:  "I wore wooden shoes"
> >
> >>ORIGINAL: "I have become used to wearing wooden shoes"
> >>CONCISE:  "I like wearing wooden shoes"
> >
> >>However as you can see much of the rich information is missing.
> >
> > Indeed. Neither of your two concise examples has the same meaning of the
> > originals.
>
> The second one is considerably different. Consider:
>
> "I have become used to getting up at 3am to be flogged for an hour by my
> boss. Between the sleep deprivation and the scar tissue on my back, I
> hardly feel a thing any more."
>
> versus
>
> "I like getting up at 3am to be flogged for an hour by my boss. I get all
> tingly in my man-bits, if you know what I mean."
>
> The first case is more subtle. The implication of "I used to wear..." is
> that you did back in the past, but no longer do, while "I wore..." has no
> such implication. It merely says that in the past you did this, whether you
> still do or don't is irrelevant.
>

Meh.  We can come up with examples all over the place to support any of our
assertions.  The context is what matters.  In a newspaper article, you'd
often prefer to use "The president wore shoes" to "The president used to
wear shoes" because the extra words and space make a difference, and you
want to be concise while letting people know what happened.  In a public
forum catering to native and non-native English speakers alike, common
phrases like "used to" and "supposed to" are well known and understood.  As
such, I argue they are supposed to be used (to) often.

In such circumstances as these, I say keep your language concise and simple,
and your words will reach the most people (and the fewest killfiles,
perhaps).  As the wise man says, "It's not only quiet people that don't say
much".  (And here RR joins his silent majority).

Peace,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


typing question

2011-08-27 Thread Jason Swails
Hello everyone,

This is probably a basic question with an obvious answer, but I don't quite
get why the type(foo).__name__ works differently for some class instances
and not for others.  If I have an "underived" class, any instance of that
class is simply of type "instance".  If I include an explicit base class,
then its type __name__ is the name of the class.

$ python
Python 2.7.2 (default, Aug 26 2011, 22:35:24)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyClass:
... pass
...
>>> foo = MyClass()
>>> type(foo)
<type 'instance'>
>>> type(foo).__name__
'instance'
>>> class MyClass1():
... pass
...
>>> bar = MyClass1()
>>> type(bar)
<type 'instance'>
>>> type(bar).__name__
'instance'
>>> class MyClass2(object):
... pass
...
>>> foobar = MyClass2()
>>> type(foobar)
<class '__main__.MyClass2'>
>>> type(foobar).__name__
'MyClass2'

I can't explain this behavior (since doesn't every class inherit from object
by default? And if so, there should be no difference between any of my class
definitions).  I would prefer that every approach give me the name of the
class (rather than the first 2 just return 'instance').  Why is this not the
case?  Also, is there any way to access the name of the class of
foo or bar in the above example?
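
For what it's worth, the class object is always reachable through the
instance itself, so this gets the name even in the old-style cases above:

>>> foo.__class__.__name__
'MyClass'
>>> bar.__class__.__name__
'MyClass1'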

Thanks!
Jason

P.S.  I'll note that my "preferred" behavior is how python3.2 actually
operates

$ python3.2
Python 3.2.1 (default, Aug 26 2011, 23:20:19)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class MyClass:
... pass
...
>>> foo = MyClass()
>>> type(foo).__name__
'MyClass'


-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: killing a script

2011-08-29 Thread Jason Swails
On Sun, Aug 28, 2011 at 10:41 PM, Russ P.  wrote:

>
> > You could look at the return value of os.system, which may tell you the
> > exit status of the process.
>
> Thanks for the suggestion. Yeah, I guess I could do that, but it seems
> that there should be a simpler way to just kill the "whole enchilada."
> Hitting Control-C over and over is a bit like whacking moles.
>

Agreed.  I had written a program that had a similar problem.  As others have
suggested, you need to either wrap os.system in another function that
analyzes the return value of the call or use another approach in which the
Python program itself sees the SIGINT (I ended up changing to
subprocess.Popen classes since they are more customizable and SIGINT is
captured by the Python program itself rather than the child process).

Another thing you can consider doing is to define your scripts' behavior if
it captures a SIGINT.

(Untested)

import signal, sys

def sigint_handler(signum, frame):
    # handlers are passed the signal number and the current stack frame
    sys.stdout.write('Caught an interrupt signal!\n')
    sys.exit(1)

signal.signal(signal.SIGINT, sigint_handler)

**rest of your program**

Of course, the SIGINT signal won't be caught if it isn't seen by the main
Python process, so this still won't do anything if you use an
unprotected/unwrapped os.system command.
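
If you do want to stay with os.system, a Unix-only (untested) sketch of
checking its return value for a SIGINT looks like this -- the command name
here is made up:

import os
import signal
import sys

status = os.system('./some_long_command')
if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGINT:
    sys.stderr.write('Child was interrupted; bailing out\n')
    sys.exit(1)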

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Optparse buggy?

2011-09-01 Thread Jason Swails
On Thu, Sep 1, 2011 at 5:12 PM, Fulvio  wrote:

> Hello,
>
> I'm on python3.2, trying some experiment with OptionParser but no success
>
> >>> from optparse import OptionParser as parser
> >>> parser.add_option('-n','--new', dest='new')
>

Here you've imported parser as an alias to the OptionParser class.  You can
only use add_option() on an instance of that class.  Try this:

from optparse import OptionParser

parser = OptionParser()
parser.add_option('-n','--new',dest='new')

However, note that argparse has replaced optparse as of Python 2.7 (optparse
is deprecated in 2.7 and later)...
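
If you can require 2.7+, the argparse spelling is very similar:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-n', '--new', dest='new')
opts = parser.parse_args()
print(opts.new)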

HTH,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


backwards-compatibility

2011-02-26 Thread Jason Swails
Hello,

I have a question I was having a difficult time finding with a quick google
search, so I figured someone on here might know.  For the sake of backwards
compatibility (and supporting systems whose default python is OLD), I'd like
to rewrite some code to be compliant with Pythons as old as 2.4.  For this
reason I've already had to do away with all "{1}".format(item), but I'm
running into new problems with the packages I've written.  For instance, I
have a package "package1" with "subpackage1".  One of the modules in
subpackage1 imports the exceptions module from package1, and I do that like
this:

from ..exceptions import MyException

Which is perfectly fine by python2.5, 2.6, and 2.7; but unacceptable in
python2.4.  Any thoughts?

Another python2.6 feature I'm using is

except Exception as err:
   print err

Is there any way of rewriting this so I can still print the error message in
python2.5/2.4?

Thanks!
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: backwards-compatibility

2011-03-01 Thread Jason Swails
> subpackage1 imports the exceptions module from package1, and I do that
> like
> > this:
> >
> > from ..exceptions import MyException
> >
>
> You'll have to import that using the absolute import. It would be
> "from package1.exceptions import MyException".
>

Ah; I didn't quite see how something in subpackage1 would know to look up a
directory to see if it was in another package (I thought I would have to
play games with PYTHONPATH).  Works like a charm though.  Thanks!


>
> > Which is perfectly fine by python2.5, 2.6, and 2.7; but unacceptable in
> > python2.4.  Any thoughts?
> >
> > Another python2.6 feature I'm using is
> >
> > except Exception as err:
> >print err
> >
>
> except Exception, err :
>

Ah, great.  And it also works for python2.6 and 2.7.
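
For the archives: another spelling that should work everywhere from 2.4
through 3.x is to pull the exception out of sys.exc_info() (a sketch;
risky_call is a stand-in for whatever might fail):

import sys

try:
    risky_call()
except Exception:
    err = sys.exc_info()[1]    # the exception instance, no 'as'/',' needed
    print(err)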


> > Is there any way of rewriting this so I can still print the error message
> in
> > python2.5/2.4?
> > 
>

Many Unix OSes (especially on supercomputers) have painfully out-of-date
system python versions, so unfortunately I have to maintain compatibility
with these super old versions.

Thanks again!
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Get Path of current Script

2011-03-14 Thread Jason Swails
On Mon, Mar 14, 2011 at 11:25 AM, Alexander Schatten wrote:

> They don't. Hm, ok, I am always for best practices. If there is a
> better way to do it I am open for suggestions ;-) How would the best
> practice be to load configuration data from a file.
>
> I mean, this is something very common: you write a program or a script
> and want to load some configuration data.
>

For *nix, many utilities publish conf files in the user's home directory in
some sort of .conf file.  That makes it easy to give each user their own
.conf file (if multiple users will use it), and avoids any kind of
permission issues that arise if your script is in a folder whose write
permissions are turned off.  It's also where common resource files are loaded
(same kind of idea).

Examples: ~/.bashrc, ~/.vimrc, ~/.bash_profile, ~/.cshrc, etc.

Alternatives are hard-coding the install directory location as part of the
install process, which is done sometimes as well.  This is easily accessible
from python via

os.environ['HOME']

or

os.getenv('HOME')
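
A minimal sketch of that pattern (the '.myprogramrc' name is made up;
ConfigParser is spelled 'configparser' in Python 3):

import os
from ConfigParser import SafeConfigParser

conf_file = os.path.join(os.path.expanduser('~'), '.myprogramrc')
parser = SafeConfigParser()
parser.read([conf_file])    # read() quietly skips files that do not exist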

All the best,
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problems of Symbol Congestion in Computer Languages

2011-03-14 Thread Jason Swails
> The metric system is defined to such a ridiculous level of precision
> because we have the technology, and the need, to measure things to that
> level of precision. Standards need to be based on something which is
> universal and unchanging.
>

Other systems of measure (for instance, atomic units and the light-year) are
based on physical constants instead of the size of a stick in France.  As
long as these don't change relative to one another, these approaches are
formally equivalent.  We can't be sure that's true, though.  If that's the
case, then we can't possibly keep an unchanging system of measurement.

Not to disagree with Steven, as these arguments are irrelevant in (almost
all) current scientific research; just to pose thoughts.

Food for thought,
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: dynamic assigments

2011-03-24 Thread Jason Swails
On Thu, Mar 24, 2011 at 4:05 PM, Steven D'Aprano <
[email protected]> wrote:

> On Thu, 24 Mar 2011 19:39:21 +0100, Seldon wrote:
>
> > Hi, I have a question about generating variable assignments dynamically.
> [...]
> > Now, I would like to use data contained in this list to dynamically
> > generate assignments of the form "var1 = value1", ecc where var1 is an
> > identifier equal (as a string) to the 'var1' in the list.
>
> Why on earth would you want to do that?
>

Doesn't optparse do something like that?  (I was actually looking for a way
to do this type of thing)  OptionParser.add_option() etc. etc. where you add
a variable and key, yet it assigns it as a new attribute of the Options
class that it returns upon OptionParser.parse_args().

I would see this as a way of making a class like that generalizable.  I
agree that __main__ (or something contained within) would certainly need to
know the name of each variable to be even remotely useful, but that doesn't
mean that each part inside does as well.

It's certainly doable with dicts and just using key-pairs, but i'd rather do
something like

opts.variable

than

opts.dictionary['variable']

Just my 2c.  Thanks to JM for suggesting setattr, btw.
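
A minimal sketch of the setattr approach (names are made up):

class Options(object):
    pass

def build_options(pairs):
    opts = Options()
    for name, value in pairs:
        setattr(opts, name, value)
    return opts

opts = build_options([('var1', 'value1'), ('verbose', True)])
print(opts.var1)    # 'value1'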

--Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Converting an array of string to array of float

2011-03-25 Thread Jason Swails
I'm guessing you have something like

list1=['1.0', '2.3', '4.4', '5.5', ...], right?

You can do this:

for i in range(len(list1)):
  list1[i] = float(list1[i])

There's almost certainly a 1-liner you can use, but this should work.
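
For completeness, the 1-liner is just a list comprehension (or map):

list1 = [float(x) for x in list1]
# or, on Python 2:  list1 = map(float, list1)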

--Jason

On Fri, Mar 25, 2011 at 8:19 AM, joy99  wrote:

> Dear Group,
>
> I got a question which might be possible but I am not getting how to
> do it.
>
> If I have a list, named,
> list1=[1.0,2.3,4.4,5.5]
>
> Now each element in the array holds the string property if I want to
> convert them to float, how would I do it?
>
> Extracting the values with for and appending to a blank list it would
> not solve the problem. If appended to a blank list, it would not
> change the property.
>
> If any one of the learned members can kindly suggest any solution?
>
> Thanks in advance.
> Best Regards,
> Subhabrata.
> --
> http://mail.python.org/mailman/listinfo/python-list
>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some questions on pow and random

2011-03-27 Thread Jason Swails
2011/3/27 joy99 

> Dear Group,
>
> I have two questions one related to pow() and other is related to
> random.
> My questions are as below:
>
> (i) By the standard definition of likelihood estimation, if x ∈ X,
> where X is a countable set of types, p is probability and f is
> frequency, we get
> L(f;p) = Π_{x ∈ X} p(x)^f(x)
>
> My question is python provides two functions,
> (a) pow for power.
> (b) reduce(mul, list)
>
> Now, how to combine them? If any one can suggest any help?
> As p(x)f(x), would be huge would pow support it?
>

Not quite sure what you mean by "combine".  p(x)f(x) implies multiplying to
me, but "combine" suggests you mean something like p(f(x)) or the like.  In
any case, as long as the returned result of one of the functions is a valid
argument for the other, you can use the second approach.  And as long as the
results of both functions can be multiplied together (i.e. that operation is
defined), you can do that as well.

At most, pow would be limited by the floating point number in python.
sys.float_info gives me the following:

>>> sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308,
min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15,
mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

So if the numbers may be bigger than 1.8e+308, then you'll need to find an
alternative way of doing this.  May I suggest recasting the problem in terms
of logs if possible (that pushes the largest representable value up to
roughly 10^(1.8e308)).  You can of course always back out the full value
afterwards.
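
A quick sketch of what I mean, with made-up numbers:

import math

probs = [0.5, 0.25, 0.125, 0.125]   # p(x) for each type x
freqs = [10, 20, 40, 30]            # f(x) for each type x

log_L = sum(f * math.log(p) for p, f in zip(probs, freqs))
# math.exp(log_L) recovers the likelihood itself, if it is representable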


> (b) Suppose we have two distributions p(x1) and p(x2), of the Model M,
> the E of EM algorithm, without going into much technical details is,
> P0(x1,x2), P1(x1,x2) 
>
> Now I am taking random.random() to generate both x1 and x2 and trying
> to multiply them, is it fine? Or should I take anything else?
>
>
I see no reason why you can't multiply them...  I'm not exactly sure what
you're trying to get here, though.

--Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Python program termination and exception catching

2011-04-10 Thread Jason Swails
Hello everyone,

This may sound like a bit of a strange desire, but I want to change the way
in which a python program quits if an exception is not caught.  The program
has many different classes of exceptions (for clarity purposes), and they're
raised whenever something goes wrong.  Most I want to be fatal, but others
I'd want to catch and deal with.

Is there any way to control Python's default exit strategy when it hits an
uncaught exception (for instance, call another function that exits
"differently")?

An obvious way is to just catch every exception and manually call that
function, but then I fill up my script with trys and excepts which hurts
readability (and makes the code uglier) and quashes tracebacks; neither of
which I want to do.

Any thoughts?

Thanks!
Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python program termination and exception catching

2011-04-10 Thread Jason Swails
On Sun, Apr 10, 2011 at 12:34 PM, Laszlo Nagy  wrote:

>  On 2011.04.10. 21:25, Jason Swails wrote:
>
> Hello everyone,
>
> This may sound like a bit of a strange desire, but I want to change the
> way in which a python program quits if an exception is not caught.  The
> program has many different classes of exceptions (for clarity purposes), and
> they're raised whenever something goes wrong.  Most I want to be fatal, but
> others I'd want to catch and deal with.
>
> Well, the application quits when all of it threads are ended. Do you want
> to catch those exception only in the last threads? Or do you want to do it
> in all threads? Or just the main thread?
>

The problem here is that the threads in this case are MPI threads, not
threads spawned during execution. (mpi4py).  I want exceptions to be fatal
as they normally are, but it *must* call MPI's Abort function instead of
just dying, because that will strand the rest of the processes and cause an
infinite hang if there are any subsequent communication attempts.

Hopefully this explains it more clearly?

Thanks!
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python program termination and exception catching

2011-04-10 Thread Jason Swails
On Sun, Apr 10, 2011 at 4:49 PM, Jerry Hill  wrote:

> On Sun, Apr 10, 2011 at 3:25 PM, Jason Swails 
> wrote:
> >
> > Hello everyone,
> >
> > This may sound like a bit of a strange desire, but I want to change the
> way in which a python program quits if an exception is not caught.  The
> program has many different classes of exceptions (for clarity purposes), and
> they're raised whenever something goes wrong.  Most I want to be fatal, but
> others I'd want to catch and deal with.
> >
> > Is there any way to control Python's default exit strategy when it hits
> an uncaught exception (for instance, call another function that exits
> "differently")?
>
> When an exception is raised and uncaught, the interpreter calls
> sys.excepthook. You can replace sys.excepthook with your own function.
>  See http://docs.python.org/library/sys.html#sys.excepthook
>

This is exactly what I was looking for.  Thank you!  I can just redefine
sys.excepthook to call MPI's Abort function and print the Tracebacks;
exactly what I wanted.
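
For the archives, the hook ends up looking roughly like this (a sketch; the
communicator and error code are whatever your program uses):

import sys
import traceback
from mpi4py import MPI

def mpi_excepthook(exc_type, exc_value, exc_tb):
    # print the traceback the way Python normally would...
    traceback.print_exception(exc_type, exc_value, exc_tb)
    # ...then make sure no other ranks are left waiting on a dead process
    MPI.COMM_WORLD.Abort(1)

sys.excepthook = mpi_excepthook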

MPI threading doesn't work in the same way as, for instance, the threading
modules in Python's stdlib.  It doesn't spawn additional threads from some
'main' thread.  Instead, all of the threads are launched simultaneously at
the beginning and run the same program, dividing the workload based on their
rank, so I think my application is immune to this bug.

Thanks again!
Jason


> If your program is threaded, you may need to look at this bug:
> http://bugs.python.org/issue1230540.  It describes a problem with
> replacing sys.excepthook when using the threading module, along with
> some workarounds.
>
> There's a simple example of replacing excepthook here:
> http://code.activestate.com/recipes/65287/
>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Feature suggestion -- return if true

2011-04-11 Thread Jason Swails
On Mon, Apr 11, 2011 at 7:12 PM, James Mills
wrote:

>
> > Are you saying the two snippets above are equivalent?
>
> def foo(n):
>x = n < 5
>if x:
>return x
>
> is functionally equivalent to:
>
> def foo(n):
>return n < 5
>
>
This is only true if n < 5.  Otherwise, the first returns None and the
second returns False.

>>> def foo(n):
... x = n < 5
... if x: return x
...
>>> def foo1(n):
... return n < 5
...
>>> foo(4)
True
>>> foo1(4)
True
>>> foo(6)
>>> foo1(6)
False
>>>

--Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [Mac OSX] TextWrangler "run" command not working properly

2011-04-15 Thread Jason Swails
On Thu, Apr 14, 2011 at 1:52 PM, Fabio  wrote:

> Then, I started to use TexWrangler, and I wanted to use the "shebang"
> menu, and "run" command.
> I have the "#! first line" pointing to the 2.6 version.
> It works fine, as long as I don't import the libraries, in which case it
> casts an error saying:
>
> ImportError: No module named scipy
>
> Maybe for some reason it points to the old 2.5 version.
> But I might be wrong and the problem is another...
>
>
TextWrangler doesn't launch a shell session that sources your typical
resource files (i.e. .bashrc, etc.), so any changes you make in an
interactive terminal session probably WON'T be loaded in TextWrangler.

See this website about setting environment variables for native Mac OS X
applications to see them:
http://www.astro.washington.edu/users/rowen/AquaEnvVar.html

Maybe if you prepend your Python 2.6 (MacPorts?) location to your PATH,
it'll find it.

--Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Egos, heartlessness, and limitations

2011-04-16 Thread Jason Swails
On Wed, Apr 13, 2011 at 7:03 PM, Ryan Kelly  wrote:

> It's also an oft-cited troll conspiracy that Guido hangs out on
> python-list and posts under various pseudonyms.  I think it would be
> kinda fun if he did...
>

What if one of those was rr?  I can't imagine he'd have that much time for
self-entertainment, but that would be pretty awesome (terrible?).

--Jason
-- 
http://mail.python.org/mailman/listinfo/python-list


Very strange issues with collections.Mapping

2018-01-18 Thread Jason Swails
Hello!

I am running into a very perplexing issue that is very rare, but creeps up
and is crashing my app.

The root cause of the issue comes down to the following check returning
true:

isinstance([], collections.Mapping)

Obviously you can get this behavior if you register `list` as a subclass of
the Mapping ABC, but I'm not doing that.  Because the issue is so rare (but
still common enough that I need to address it), it's hard to reproduce in a
bench test.

What I am going to try is to essentially monkey-patch
collections.Mapping.register with a method that dumps a stack trace
whenever it's called at the time of initial import so I can get an idea of
where this method could *possibly* be getting called with a list as its
argument.
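
One way to sketch that patch (untested; it hooks abc.ABCMeta.register, so
the trace fires for a registration on *any* ABC, not just Mapping):

import abc
import traceback

_real_register = abc.ABCMeta.register

def _noisy_register(cls, subclass):
    print('--- %s.register(%r) called from:' % (cls.__name__, subclass))
    traceback.print_stack()
    return _real_register(cls, subclass)

abc.ABCMeta.register = _noisy_register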

The annoying thing here is that wherever the bug is happening, the crash
happens *way* far away (in a third-party library).  I've also determined it
as the root cause of two crashes that seem completely unrelated (except
that the crash is caused by a list not behaving like a dict shortly after
isinstance(obj, collections.Mapping) returns True).  These are the
libraries I'm using:

amqp
billiard
celery
dj-database-url
Django
django-redis-cache
enum34
gunicorn
kombu
newrelic
psycopg2
pyasn1
pytz
redis
requests
rsa
six
vine
voluptuous

It's a web application, as you can probably tell.  The main reason I ask
here is because I'm wondering if anybody has encountered this before and
managed to hunt down which of these libraries is doing something naughty?

Thanks!
Jason

-- 
Jason M. Swails
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Tkinter problem: TclError> couldn't connect to display ":0

2013-12-29 Thread Jason Swails
On Sun, Dec 29, 2013 at 10:29 PM, Steven D'Aprano wrote:

> On Mon, 30 Dec 2013 10:30:11 +1100, Chris Angelico wrote:
>
> > On Mon, Dec 30, 2013 at 10:22 AM, Steven D'Aprano
> >  wrote:
> >> So you need to X-forward from the remote machine to the machine you are
> >> physically on, or perhaps it's the other way (X is really weird). I
> >> have no idea how to do that, but would love to know.
> >
> > With SSH, that's usually just "ssh -X target", and it'll mostly work.
>
> Holy cow, it works! Slowly, but works.
>

I usually use "ssh -Y".  The -Y argument toggles trusted forwarding.  From
the ssh man-page:

 -Y      Enables trusted X11 forwarding.  Trusted X11 forwardings are not
         subjected to the X11 SECURITY extension controls.

I've found -Y is a bit faster than -X in my experience (I've never really
had many problems with X forwarding on LANs -- even with OpenGL windows).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What use of these _ prefix members?

2016-01-12 Thread Jason Swails
On Tue, Jan 12, 2016 at 9:12 AM, me  wrote:

> On 2016-01-10, Peter Otten <[email protected]> wrote:
> > >>> class Derived(Base):
> > ...     def _init(self, x):
> > ...         super()._init(x)
> > ...         print("do something else with", x)
> > ...
> > >>> Derived(42)
> > do something with 42
> > do something else with 42
> > <__main__.Derived object at 0x7f8e6b3e9b70>
> >
>
> I think you are doing inheritance wrong.
>

There's nothing "wrong" about this, and there are times this type of
pattern is justified.  Sure, *this* example doesn't make sense to do it
this way, but it is just an illustrative example.  I would even call this
type of pattern pythonic.

> AFAIK you should call directly the __init__() of the parent class, and
> pass *args and **kwargs instead.
>

Sometimes there's no need to call __init__ on the parent class directly,
and the base class's __init__ is sufficient for the derived class.  And
perhaps initialization requires numerous "steps" that are easiest to grok
when split out into different, private sub-methods. For example:

class Derived(Base):
    def __init__(self, arg1, arg2, arg3):
        self._initial_object_setup()
        self._process_arg1(arg1)
        self._process_arg23(arg2, arg3)
        self._postprocess_new_object()

This makes it clear what is involved in the initialization of the new
object.  And it allows the functionality to be split up into more atomic
units.  It also has the added benefit of subclasses being able to more
selectively override base class functionality.  Suppose Derived only needs
to change how it reacts to arg1 -- all Derived needs to implement directly
is _process_arg1.  This reduces code duplication and improves
maintainability, and is a pattern I've used myself and like enough to use
again (not necessarily in __init__, but outside of being automatically
called during construction I don't see anything else inherently "specialer"
about __init__ than any other method).

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why generators take long time?

2016-01-19 Thread Jason Swails
On Tue, Jan 19, 2016 at 2:27 AM, Arshpreet Singh  wrote:

>
> I was playing with Generators and found that using Generators time is bit
> more than list-comprehensions or I am doing it wrong?
>
>
> Function with List comprehensions:
>
> def sum_text(number_range):
> return sum([i*i for i in xrange(number_range)])
>
> %timeit sum_text(1)
> 1 loops, best of 3: 14.8 s per loop
>
> Using generator Expressions:
>
> def sum_text(number_range):
>
> return sum((i*i for i in xrange(number_range)))
>
> %timeit sum_text(1)
>
> 1 loops, best of 3: 16.4 s per loop
>

Steven already pointed out the additional overhead in a generator
expression vs. a list comprehension.  In addition to the memory savings you
get via generator expressions, though, you can also get significant time
savings when generator expressions have the ability to short-circuit.

For instance, have a look at the following:

In [1]: import random

In [2]: %timeit all(random.random() < 0.5 for i in range(1000))
The slowest run took 4.85 times longer than the fastest. This could mean
that an intermediate result is being cached
10 loops, best of 3: 3.57 µs per loop

In [3]: %timeit all([random.random() < 0.5 for i in range(1000)])
1000 loops, best of 3: 422 µs per loop

In [4]: %timeit any(random.random() < 0.5 for i in range(1000))
10 loops, best of 3: 3.18 µs per loop

In [5]: %timeit any([random.random() < 0.5 for i in range(1000)])
1000 loops, best of 3: 408 µs per loop

This is using IPython with Python 3.5.  The difference here is that for
functions that short-circuit (like any and all), the generator expression
does not have to exhaust all of its elements (particularly since for each
element there's a 50-50 chance of being True or False in each case).  In
this case, the difference is a couple orders of magnitude.  The larger the
range argument is, the bigger this difference.

Also, in Python 2, the generator expression does not leak into the global
namespace, while the list comprehension does:

Python 2.7.10 (default, Jul 14 2015, 19:46:27)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> list(i for i in range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> i
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'i' is not defined
>>> [i for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> i
9

Python 3 does not leak the iterator variable in either case.  However, it
would be madness to have code actually relying on this behavior :).

At the end of the day, I use list comprehensions in the following
circumstances:

- I *know* I won't blow memory with a too-large list
- I want to iterate over the object multiple times or I want/may want
non-sequential access
- I know I want all the elements I'm creating (i.e., no chance of
short-circuiting)

I use generator expressions when

- I *might* want to

All the best,
Jason

P.S. There is a "cross-over" point where the memory requirements of the
list comp passes the generator overhead.  For instance:


In [17]: %timeit sum(i for i in range(1000))
1 loops, best of 3: 2.08 s per loop

In [18]: %timeit sum([i for i in range(1000)])
1 loops, best of 3: 1.86 s per loop

In [19]: %timeit sum(i for i in range(1))
1 loops, best of 3: 21.8 s per loop

In [20]: %timeit sum([i for i in range(1)])
1 loops, best of 3: 26.1 s per loop

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why generators take long time?

2016-01-19 Thread Jason Swails
On Tue, Jan 19, 2016 at 3:19 PM, Jason Swails 
wrote:

>
> I use generator expressions when
>
> - I *might* want to
>

​I forgot to finish my thought here.  I use generator expressions when I
don't want to worry about memory, there's a decent chance of
short-circuiting​, or I just want to do some simple iteration where the
iterator will not be the bottleneck (which is almost always).

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ts.plot() pandas: No plot!

2016-02-01 Thread Jason Swails
On Sun, Jan 31, 2016 at 9:08 PM, Paulo da Silva <
[email protected]> wrote:

> Às 01:43 de 01-02-2016, Mark Lawrence escreveu:
> > On 01/02/2016 00:46, Paulo da Silva wrote:
> ...
>
> >>
> >
> > Is it as simple as adding a call to ts.show() ?
> >
> Thanks for the clue!
> Not so simple however.
> Needed to do
> import matplotlib.pyplot as plt
> plt.show()
>

What you saw ts.plot() return were the matplotlib artists (the things that
will be drawn on whatever "canvas" is provided -- either saved to an image
or drawn to a GUI widget).  So whenever you see this kind of return value,
you know you need to call the matplotlib.pyplot.show function in order to
generate a canvas widget (with whatever backend you choose) and draw it.

If you want to do this kind of interactive plotting (reminiscent, I've
heard, of Matlab), I would highly recommend checking out IPython.  You can
use IPython's notebook or qtconsole and embed plots from matplotlib
directly in the viewer.  For example, try this:

ipython qtconsole

This opens up a window, then use the magic command "%matplotlib inline" to
have all plots sent directly to the ipython console you are typing commands
in.  I've found that kind of workflow quite convenient for directly
interacting with data.

HTH,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [STORY-TIME] THE BDFL AND HIS PYTHON PETTING ZOO

2016-02-08 Thread Jason Swails
On Sun, Feb 7, 2016 at 2:58 AM, Chris Angelico  wrote:

>
> Would writing a script to figure out whether there are more
> statisticians or programmers be a statistician's job or a
> programmer's?
>

Yes.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Computational Chemistry Analysis

2016-02-28 Thread Jason Swails
On Wed, Feb 24, 2016 at 8:01 PM, Feagans, Mandy 
wrote:

> Dear Python,
>
>
> Hi! I am a student interested in conducting computational analysis of
> protein-ligand binding for drug development analysis. Recently, I read of
> an individual using a python program for their studies of protein-ligand
> binding. As I have been reading about Python programs, however, all I have
> been able to find are programs that use the python script (i.e. Autodock).
> I was hoping to see if there were any programs specifically run through the
> python programming that ran similar analysis to Autodock?
>

Being trained as a computational chemist myself (and having learned Python
by writing a program for carrying out protein-ligand free energy
calculations: http://pubs.acs.org/doi/abs/10.1021/ct300418h), I had to
answer...

There are a number of programs out there that are aimed at computational
modeling of biomolecules, and many of them are written in Python or have
Python interfaces (pymol, chimera, MMPBSA.py, OpenMM to name a very small
few).  However, this is definitely not the right forum to ask such
questions as a very small number of people who frequent this list are
computational chemists (I check it only occasionally).

Looking for a program to do something because it's written in Python is the
wrong way to go about this.  You need to design your experiment (i.e., what
you want to test and what you hope to learn), then try to design a set of
calculations and analyses that will allow you to probe your underlying
hypothesis.  Then you should pick the software to perform these analyses
based on the best choice.  That choice is very frequently whatever others
in your lab are using.  A research group builds up experience (based
originally on the experience of the PI, typically) in a set of programs
they use for their computational experiments, and deviating from that set
of programs essentially discards potentially decades worth of experience.
So my suggestion -- ask other group members or the professor what softwares
they use, and consult Google if you wish to branch out a little.

You can also find more applicable mailing lists to ask questions about
computational chemistry, like the CCL or molecular dynamics news (the CCL
being the defacto "Computational Chemistry List").  That should help give
perhaps more helpful places to start.

FWIW, I did all my work with the AMBER and OpenMM software suites, (and
wrote a substantial amount of code for both projects).  But those are far
from the only options out there.

HTH,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python extension using a C library with one 'hello' function

2014-11-04 Thread Jason Swails
On Tue, 2014-11-04 at 16:22 +0630, Veek M wrote:
> https://github.com/Veek/Python/tree/master/junk/hello
> doesn't work.
> I have:
> hello.c which contains: int hello(void);
> hello.h
> 
> To wrap that up, i have:
> hello.py -> _hello (c extension) -> pyhello.c -> method py_hello()
> 
> People using this will do:
> python3.2>> import hello
> python3.2>> hello.hello()
> 
> It doesn't compile/work. 
> 
> deathstar> python setup.py build_ext --inplace
> running build_ext
> building '_hello' extension
> creating build
> creating build/temp.linux-x86_64-3.2
> gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -
> D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -
> Werror=format-security -fPIC -I/usr/include/python3.2mu -c pyhello.c -o 
> build/temp.linux-x86_64-3.2/pyhello.o
> pyhello.c:15:6: warning: character constant too long for its type [enabled 
> by default]
> pyhello.c:15:5: warning: initialization makes pointer from integer without a 
> cast [enabled by default]
> pyhello.c:15:5: warning: (near initialization for 'hellomethods[0].ml_name') 
> [enabled by default]
> pyhello.c:15:5: warning: initialization from incompatible pointer type 
> [enabled by default]
> pyhello.c:15:5: warning: (near initialization for 'hellomethods[0].ml_meth') 
> [enabled by default]
> pyhello.c:15:5: warning: initialization makes integer from pointer without a 
> cast [enabled by default]
> pyhello.c:15:5: warning: (near initialization for 
> 'hellomethods[0].ml_flags') [enabled by default]
> gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro 
> build/temp.linux-x86_64-3.2/pyhello.o -o 
> /root/github/junk/hello/_hello.cpython-32mu.so
> 
> 
> >>> import hello
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "hello.py", line 1, in <module>
> from _hello import *
> ImportError: ./_hello.cpython-32mu.so: undefined symbol: hello

When I try your code, I get the error:

ImportError: dynamic module does not define init function (PyInit__hello)

There were a couple other problems as well.  Like 'hello' in
hellomethods instead of "hello" (note the double quotes).  Also, NULL is
no longer acceptable as a METH_XXXARGS replacement, you need to set it
to METH_NOARGS (or MET_VARARGS if you plan on accepting arguments).

I've found that you also need a NULL sentinel in the hellomethods array
to avoid segfaults on my Linux box.

After fixing these problems, you still need to add "hello.c" to the list
of sources in setup.py to make sure that module is built.
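
In other words, something along these lines in setup.py:

from distutils.core import setup, Extension

setup(name='hello',
      ext_modules=[Extension('_hello', sources=['pyhello.c', 'hello.c'])])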

I've submitted a PR to your github repo showing you the changes
necessary to get your module working on my computer.

Hope this helps,
Jason

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python extension using a C library with one 'hello' function

2014-11-04 Thread Jason Swails
On Tue, 2014-11-04 at 21:45 +0630, Veek M wrote:
> Jason Swails wrote:
> 
> > I've submitted a PR to your github repo showing you the changes
> > necessary to get your module working on my computer.
> 
> Segfaults :p which is an improvement :)

What operating system are you running this on?  It works fine for me on
Linux:

bash$ ls
hello.c  hello.h  hello.py  pyhello.c  setup.py
bash$ python3.4 setup.py build_ext --inplace
running build_ext
building '_hello' extension
creating build
creating build/temp.linux-x86_64-3.4
x86_64-pc-linux-gnu-gcc -pthread -fPIC -I/usr/include/python3.4 -c pyhello.c -o 
build/temp.linux-x86_64-3.4/pyhello.o
pyhello.c:15:5: warning: initialization from incompatible pointer type [enabled 
by default]
 {"hello", py_hello, METH_NOARGS, py_hello_doc},
 ^
pyhello.c:15:5: warning: (near initialization for ‘hellomethods[0].ml_meth’) 
[enabled by default]
x86_64-pc-linux-gnu-gcc -pthread -fPIC -I/usr/include/python3.4 -c hello.c -o 
build/temp.linux-x86_64-3.4/hello.o
x86_64-pc-linux-gnu-gcc -pthread -shared build/temp.linux-x86_64-3.4/pyhello.o 
build/temp.linux-x86_64-3.4/hello.o -L/usr/lib64 -lpython3.4 -o 
/home/swails/BugHunter/CAPI/Python/junk/hello/_hello.cpython-34.so
bash$ python3.4
Python 3.4.1 (default, Aug 24 2014, 10:04:41) 
[GCC 4.7.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hello
>>> hello.hello()
hello world
0

You can get rid of the warning by casting py_hello to (PyCFunction)...
maybe that's causing your segfault?
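
That is, the method table entry would become something like (using the
names from your file):

    {"hello", (PyCFunction) py_hello, METH_NOARGS, py_hello_doc},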

All the best,
Jason

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python extension using a C library with one 'hello' function

2014-11-04 Thread Jason Swails
On Tue, Nov 4, 2014 at 11:09 AM, Veek M  wrote:

> static PyMethodDef hellomethods[] = {
> {"hello", py_hello, METH_VARARGS, py_hello_doc},
> {NULL, NULL, 0, NULL},
> };
>
> It's basically the METH_VARARGS field that's giving the problem. Switching
> it to NULL gives,
> SystemError: Bad call flags in PyCFunction_Call. METH_OLDARGS is no longer
> supported!
>

Yes, I got that problem too, which is why I switched it to METH_NOARGS.

> and METH_NOARGS doesn't work in 3.2
>

It does for me:

Python 3.2.5 (default, Aug 24 2014, 10:06:23)
[GCC 4.7.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hello
>>> hello.hello()
hello world
0
>>>

As you can see -- this is a Python 3.2 built with GCC 4.7 (on Gentoo
Linux).  It also works on Python 3.1 and 3.0 (but obviously doesn't work
for Python 2.X).  I can't tell why you're having so many problems...  Try
doing a "git clean -fxd" to make sure you don't have leftover files lying
around somewhere that are causing grief.

Also, you need to add "-g" to the compiler arguments to make sure you build
with debug symbols if you want a meaningful traceback.

Good luck,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python extension using a C library with one 'hello' function

2014-11-04 Thread Jason Swails
On Tue, 2014-11-04 at 23:03 +0630, Veek M wrote:
> okay got it working - thanks Jason! The 3.2 docs are slightly different.

What did you need to do to get it working?

-- 
https://mail.python.org/mailman/listinfo/python-list


Neat little programming puzzle

2014-12-15 Thread Jason Swails
This was a problem posed to me, which I solved in Python.  I thought it was
neat and enjoyed the exercise of working through it; feel free to ignore.
For people looking for little projects to practice their skills with (or a
simple distraction), read on.

You have a triangle of numbers such that each row has 1 more number than
the row before it, like so:

      1
     3 2
    8 6 1
  5 10 15 2

The task is to find the maximum possible sum through the triangle that you
can compute by adding numbers that are adjacent to the value chosen in the
row above.  In this simple example, the solution is 1+3+6+15=25.  As the
number of rows increases, the possible paths through the triangle grows
exponentially (and it's not enough to just look at the max value in each
row, since they may not be part of the optimum pathway, like the '8' in row
3 of the above example).

The challenge is to write a program to compute this maximum sum for the
triangle at https://gist.github.com/swails/17ef52f3084df708816d.

I liked this problem because naive solutions scale as O(2^N), begging for a
more efficient approach.
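
For reference, one linear-time approach (stop reading here if you want to
work it out yourself!) is to collapse the triangle from the bottom row up,
keeping the best attainable sum at each position.  A quick sketch:

def max_path_sum(triangle):
    # triangle is a list of rows, each row a list of ints
    best = list(triangle[-1])                    # start from the last row
    for row in reversed(triangle[:-1]):
        best = [val + max(best[i], best[i + 1])  # better of the two adjacent
                for i, val in enumerate(row)]    # entries in the row below
    return best[0]

print(max_path_sum([[1], [3, 2], [8, 6, 1], [5, 10, 15, 2]]))  # -> 25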

Have fun,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: A question about how plot from matplotlib works

2015-02-19 Thread Jason Swails
On Thu, Feb 19, 2015 at 5:47 AM, ast  wrote:

> Hello
>
>  import numpy as np
 import matplotlib.pyplot as plt
 x = np.arange(10)
 y = x**2
 x

>>> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
>> y

>>> array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])
>
>> plt.plot(x,y)

>>> []
>
>> plt.show()

>>>
>
> The question is:
>
> plt.plot() creates an object "matplotlib.lines.Line2D" but this object is
> not referenced. So this object should disapear from memory. But
> this doesn't happens since plt.show() draws the curve on a graphic
> window. So how does it work ?


​A reference to it is put in the "active" Axes instance of the
matplotlib.pyplot namespace.  There are many things that will prevent an
object from being garbage-collected (a common source of references are
caches). [1]

​In general, matplotlib has many containers.  In particular, Line2D objects
generated by the "plot" function are added to the Axes instance from which
"plot" was called.  When you don't explicitly specify an Axes object from
which to plot, matplotlib.pyplot applies it to some "default" Axes instance
living in the matplotlib.pyplot namespace.​

This is done to give matplotlib more of a Matlab-like feel.  To demonstrate
this, let's go try and FIND that reference to the lines:

>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> x = np.arange(10)
>>> y = x ** 2
>>> x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> y
array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])
>>> lines, = plt.plot(x, y)
>>> id(lines)
4466622800
>>> lines
<matplotlib.lines.Line2D object at 0x...>
>>> del lines
>>> # Now let's find those lines
... active_axes = plt.gca() # Get Current Axes
>>> dir(active_axes)
[..., get_lines, ...] <-- this is snipped for brevity
>>> active_axes.get_lines()
<a list of 1 Line2D objects>
>>> active_axes.get_lines()[0]
<matplotlib.lines.Line2D object at 0x...>
>>> id(active_axes.get_lines()[0])
4466622800

And there we have it!  Success!  (Note, my comment indicates that the gca
in plt.gca() stands for "Get Current Axes").  I also snipped the list of
attributes in active_axes that I got from the "dir" command, since that
list is HUGE, but the method we want is, rather expectedly, "get_lines".

In *my* personal opinion, the matplotlib API is quite intuitive, such that,
coupled with Python's native introspective functions (like dir() and id())
and "help" function in the interpreter, I rarely have to consult
StackOverflow or even the API documentation online to do what I need.

For instance, you want to change the color or thickness of the error bar
hats on error bars in your plot?  Either save a reference to them when they
are generated (by plt.errorbar, for instance), or go *find* them inside the
Axes you are manipulating and set whatever properties you want.

Hope this helps,
Jason

[1] OK, so there are not many *things* -- only if there are active,
non-circular references will the object *not* be garbage-collected, loosely
speaking.  But there are many reasons and places that such references are
generated inside many APIs... caching being one of the most popular.

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallelization of Python on GPU?

2015-02-26 Thread Jason Swails
On Wed, 2015-02-25 at 18:35 -0800, John Ladasky wrote:
> I've been working with machine learning for a while.  Many of the
> standard packages (e.g., scikit-learn) have fitting algorithms which
> run in single threads.  These algorithms are not themselves
> parallelized.  Perhaps, due to their unique mathematical requirements,
> they cannot be paralleized.  
> 
> When one is investigating several potential models of one's data with
> various settings for free parameters, it is still sometimes possible
> to speed things up.  On a modern machine, one can use Python's
> multiprocessing.Pool to run separate instances of scikit-learn fits.
> I am currently using ten of the twelve 3.3 GHz CPU cores on my machine
> to do just that.  And I can still browse the web with no observable
> lag.  :^)
> 
> Still, I'm waiting hours for jobs to finish.  Support vector
> regression fitting is hard.
> 
> What I would REALLY like to do is to take advantage of my GPU.  My
> NVidia graphics card has 1152 cores and a 1.0 GHz clock.  I wouldn't
> mind borrowing a few hundred of those GPU cores at a time, and see
> what they can do.  In theory, I calculate that I can speed up the job
> by another five-fold.
> 
> The trick is that each process would need to run some PYTHON code, not
> CUDA or OpenCL.  The child process code isn't particularly fancy.  (I
> should, for example, be able to switch that portion of my code to
> static typing.)
> 
> What is the most effective way to accomplish this task?

GPU computing is a lot more than simply saying "run this on a GPU".  To
realize the performance gains promised by a GPU, you need to tailor your
algorithms to take advantage of their hardware... SIMD reigns supreme
where thread divergence and branching are far more expensive than they
are in CPU computing.  So even if you decide to somehow translate your
Python code into a CUDA kernel, there is a good chance that you will be
woefully disappointed in the resulting speedup (or even moreso if you
actually get a slowdown :)).  For example, a simple reduction is more
expensive on a GPU than it is on a CPU for small arrays.  A dot product,
for example, has a part that's super fast on the GPU (element-by-element
multiplication), and then a part that gets a lot slower (summing up all
elements of the resulting multiplication).  Each core on the GPU is a
lot slower than a CPU (which is why a 1000-CUDA-core GPU doesn't run
anywhere near 1000x faster than a CPU), so you really only get gains
when they can all work efficiently together.

Another example -- matrix multiplies are *fast*.  Diagonalizations are
slow (which is why in my field where diagonalizations are common
requirements, they are often done on the CPU while *building* the matrix
is done on the GPU).
> 
> I came across a reference to a package called "Urutu" which may be
> what I need, however it doesn't look like it is widely supported.

Urutu seems to be built on PyCUDA and PyOpenCL (which are both written
by the same person; Andreas Kloeckner at UIUC in the United States).

Another package I would suggest looking into is numba, from Continuum
Analytics: https://github.com/numba/numba.  Unlike Urutu, their package
is built on LLVM and Python bindings they've written to implement
numpy-aware JIT capabilities.  I believe they also permit compiling down
to a GPU kernel through LLVM.  One downside I've experienced with that
package is that LLVM does not yet have a stable API (as I understand
it), so they often lag behind support for the latest versions of LLVM.
> 
> I would love it if the Python developers themselves added the ability
> to spawn GPU processes to the Multiprocessing module!

I would be stunned if this actually happened.  If you're worried about
performance, you get at least an order of magnitude performance boost by
going to numpy or writing the kernel directly in C or Fortran.  CPython
itself just isn't structured to run on a GPU... maybe pypy will tackle
that at some point in the probably-distant future.

All the best,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallelization of Python on GPU?

2015-02-26 Thread Jason Swails
On Thu, 2015-02-26 at 14:02 +1100, Steven D'Aprano wrote:
> John Ladasky wrote:
> 
> 
> > What I would REALLY like to do is to take advantage of my GPU.
> 
> I can't help you with that, but I would like to point out that GPUs 
> typically don't support IEE-754 maths, which means that while they are 
> likely significantly faster, they're also likely significantly less 
> accurate. Any any two different brands/models of GPU are likely to give 
> different results. (Possibly not *very* different, but considering the mess 
> that floating point maths was prior to IEEE-754, possibly *very* different.)

This hasn't been true in NVidia GPUs manufactured since ca. 2008.

> Personally, I wouldn't trust GPU floating point for serious work. Maybe for 
> quick and dirty exploration of the data, but I'd then want to repeat any 
> calculations using the main CPU before using the numbers anywhere :-)

There is a *huge* dash toward GPU computing in the scientific computing
sector.  Since I started as a graduate student in computational
chemistry/physics in 2008, I watched as state-of-the-art supercomputers
running tens of thousands to hundreds of thousands of cores were
overtaken in performance by a $500 GPU (today the GTX 780 or 980) you
can put in a desktop.  I went from running all of my calculations on a
CPU cluster in 2009 to running 90% of my calculations on a GPU by the
time I graduated in 2013... and for people without as ready access to
supercomputers as myself the move was even more pronounced.

This work is very serious, and numerical precision is typically of
immense importance.  See, e.g.,
http://www.sciencedirect.com/science/article/pii/S0010465512003098 and
http://pubs.acs.org/doi/abs/10.1021/ct400314y

In our software, we can run simulations on a GPU or a CPU and the
results are *literally* indistinguishable.  The transition to GPUs was
accompanied by a series of studies that investigated precisely your
concerns... we would never have started using GPUs if we didn't trust
GPU numbers as much as we did from the CPU.

And NVidia is embracing this revolution (obviously) -- they are putting
a lot of time, effort, and money into ensuring the success of GPU high
performance computing.  It is here to stay in the immediate future, and
refusing to use the technology will leave those that *could* benefit
from it at a severe disadvantage. (That said, GPUs aren't good at
everything, and CPUs are also here to stay.)

And GPU performance gains are outpacing CPU performance gains -- I've
seen about two orders of magnitude improvement in computational
throughput over the past 6 years through the introduction of GPU
computing and improvements in GPU hardware.

All the best,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Switching between cmd.CMD instances

2014-04-02 Thread Jason Swails
On Wed, Apr 2, 2014 at 1:03 AM, Josh English wrote:

> I have a program with several cmd.Cmd instances. I am trying to figure out
> what the best way to organize them should be.
>
> I've got my BossCmd, SubmissionCmd, and StoryCmd objects.
>
> The BossCmd object can start either of the other two, and this module
> allows the user switch back and forth between them. Exiting either of the
> sub-command objects returns back to the BossCmd.
>
> I have defined both a do_done and do_exit method on the sub-commands.
>
> Is it possible to flag BossCmd so when either of the other two process
> do_exit, the BossCmd will also exit?
>

I have an app that also has a number of cmd.Cmd subclasses to implement
different interpreter layers.  I haven't needed to implement what you're
talking about here (exiting one interpreter just drops you down to a
lower-level interpreter).  However, it's definitely possible.  You can have
your SubmissionCmd and StoryCmd take a "master" (BossCmd) object in its
__init__ method and store the BossCmd as an instance attribute.

From there, you can implement a method interface that the child Cmd
subclasses can call to indicate to BossCmd that do_exit has been called and
it should quit after the child's cmdloop returns.  So something like this:

class SubmissionCmd(cmd.Cmd):
    # your stuff
    def __init__(self, master):
        cmd.Cmd.__init__(self, *your_args)
        self.master = master

    def do_exit(self, line):
        self.master.child_has_exited()

class BossCmd(cmd.Cmd):
    # your stuff
    def child_has_exited(self):
        self.exit_on_return = True # this should be set False in __init__

    def do_submit(self, line):
        subcmd = SubmissionCmd(self)
        subcmd.cmdloop()
        if self.exit_on_return: return True

Untested and incomplete, but you get the idea.

HTH,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [OFF-TOPIC] How do I find a mentor when no one I work with knows what they are doing?

2014-04-08 Thread Jason Swails
On Tue, Apr 8, 2014 at 3:07 AM, James Brewer  wrote:

> I'm sure there will be a substantial amount of arrogance perceived from
> this question, but frankly I don't think that I have anything to learn from
> my co-workers, which saddens me because I really like to learn and I know
> that I have a lot of learning to do.
>
> I've been employed as a software engineer for about eight months now and I
> feel like I haven't learned nearly as much as I should. Sure, I've picked
> up little tidbits of information here and there, but I'm no more confident
> in my ability to build anything more complex than a basic crud app than I
> was the day I started.
>
> Things I'm interested include contributing to both Python and Django,
> database design and data modeling, API design, code quality, algorithms and
> data structures, and software architecture, among other things.
>
> Basically, I want to be a better engineer. Where can I find someone
> willing to point me in the right direction and what can I offer in return?
>

Find something that interests you (you've done that already).  Clone the
repository of whatever interests you (Django, Python, etc.).  Then start
reading their source code.  Maybe pick up a bug report that you think you
can understand and work on coming up with a solution -- that will lead to
more targeted reading than simply perusing it at random.  Chances are
someone will fix it before you get a chance, but just seeing how _others_
have designed their software and implemented it will help you learn a lot.
 The more you do that, the more you will understand the overall framework
of the project.

Best of all would be to choose a project that you use regularly.  Once you
become more experienced and knowledgeable about the project you've chosen,
you can start contributing back to it.  Everybody wins.

At least I've learned a lot doing that.

Good luck,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Adding R squared value to scatter plot

2014-05-21 Thread Jason Swails

On Wed, May 21, 2014 at 7:59 AM, Jamie Mitchell  wrote:

> I have made a plot using the following code:
>
> python2.7
> import netCDF4
> import matplotlib.pyplot as plt
> import numpy as np
>
>
> swh_Q0_con_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/south_west/swhcontrol_swest_annavg1D.nc','r')
> hs_Q0_con_sw=swh_Q0_con_sw.variables['hs'][:]
>
> swh_Q3_con_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q3/swh/controlperiod/south_west/swhcontrol_swest_annavg1D.nc','r')
> hs_Q3_con_sw=swh_Q3_con_sw.variables['hs'][:]
>
> swh_Q4_con_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q4/swh/controlperiod/south_west/swhcontrol_swest_annavg1D.nc','r')
> hs_Q4_con_sw=swh_Q4_con_sw.variables['hs'][:]
>
> swh_Q14_con_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q14/swh/controlperiod/south_west/swhcontrol_swest_annavg1D.nc','r')
> hs_Q14_con_sw=swh_Q14_con_sw.variables['hs'][:]
>
> swh_Q16_con_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q16/swh/controlperiod/south_west/swhcontrol_swest_annavg1D.nc','r')
> hs_Q16_con_sw=swh_Q16_con_sw.variables['hs'][:]
>
> swh_Q0_fut_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/south_west/swh2050s_swest_annavg1D.nc','r')
> hs_Q0_fut_sw=swh_Q0_fut_sw.variables['hs'][:]
>
> swh_Q3_fut_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q3/swh/2050s/south_west/swh2050s_swest_annavg1D.nc','r')
> hs_Q3_fut_sw=swh_Q3_fut_sw.variables['hs'][:]
>
> swh_Q4_fut_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q4/swh/2050s/south_west/swh2050s_swest_annavg1D.nc','r')
> hs_Q4_fut_sw=swh_Q4_fut_sw.variables['hs'][:]
>
> swh_Q14_fut_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q14/swh/2050s/south_west/swh2050s_swest_annavg1D.nc','r')
> hs_Q14_fut_sw=swh_Q14_fut_sw.variables['hs'][:]
>
> swh_Q16_fut_sw=netCDF4.Dataset('/data/cr1/jmitchel/Q16/swh/2050s/south_west/swh2050s_swest_annavg1D.nc','r')
> hs_Q16_fut_sw=swh_Q16_fut_sw.variables['hs'][:]
>
> fit_Q0_sw=np.polyfit(hs_Q0_con_sw,hs_Q0_fut_sw,1)
> fit_fn_Q0_sw=np.poly1d(fit_Q0_sw)
>
> plt.plot(hs_Q0_con_sw,hs_Q0_fut_sw,'g.')
> plt.plot(hs_Q0_con_sw,fit_fn_Q0_sw(hs_Q0_con_sw),'g',label='Q0 no pert')
>
> fit_Q3_sw=np.polyfit(hs_Q3_con_sw,hs_Q3_fut_sw,1)
> fit_fn_Q3_sw=np.poly1d(fit_Q3_sw)
>
> plt.plot(hs_Q3_con_sw,hs_Q3_fut_sw,'b.')
> plt.plot(hs_Q3_con_sw,fit_fn_Q3_sw(hs_Q3_con_sw),'b',label='Q3 low sens')
>
> fit_Q4_sw=np.polyfit(hs_Q4_con_sw,hs_Q4_fut_sw,1)
> fit_fn_Q4_sw=np.poly1d(fit_Q4_sw)
>
> plt.plot(hs_Q4_con_sw,hs_Q4_fut_sw,'y.')
> plt.plot(hs_Q4_con_sw,fit_fn_Q4_sw(hs_Q4_con_sw),'y',label='Q4 low sens')
>
> fit_Q14_sw=np.polyfit(hs_Q14_con_sw,hs_Q14_fut_sw,1)
> fit_fn_Q14_sw=np.poly1d(fit_Q14_sw)
>
> plt.plot(hs_Q14_con_sw,hs_Q14_fut_sw,'r.')
> plt.plot(hs_Q14_con_sw,fit_fn_Q14_sw(hs_Q14_con_sw),'r',label='Q14 high
> sens')
>
> fit_Q16_sw=np.polyfit(hs_Q16_con_sw,hs_Q16_fut_sw,1)
> fit_fn_Q16_sw=np.poly1d(fit_Q16_sw)
>
> plt.plot(hs_Q16_con_sw,hs_Q16_fut_sw,'c.')
> plt.plot(hs_Q16_con_sw,fit_fn_Q16_sw(hs_Q16_con_sw),'c',label='Q16 high
> sens')
>
> plt.legend(loc='best')
> plt.xlabel('Significant Wave Height annual averages NW Scotland 1981-2010')
> plt.ylabel('Significant Wave Height annual averages NW Scotland 2040-2069')
> plt.title('Scatter plot of Significant Wave Height')
> plt.show()
>
> --
>
> What I would like to do is display the R squared value next to the line of
> best fits that I have made.
>
> Does anyone know how to do this with matplotlib?
>

You can add plain text or annotations with arrows using any of the API
functions described here:
http://matplotlib.org/1.3.1/users/text_intro.html (information
specifically regarding the text call is here:
http://matplotlib.org/1.3.1/api/pyplot_api.html#matplotlib.pyplot.text)

You can also use LaTeX typesetting here, so you can make the text something
like r'$R^2$' to display R^2 with "nice" typesetting. (I typically use raw
strings for matplotlib text strings with LaTeX formulas in them since LaTeX
makes extensive use of the \ character.)

The onus is on you, the programmer, to determine _where_ on the plot you
want the text to appear.  Since you know what you are plotting, you can
write a quick helper function that will compute the optimal (to you)
location for the label to occur based on where things are drawn on the
canvas.  There is a _lot_ of flexibility here so you should be able to get
your text looking exactly how (and where) you want it.
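
For example, a minimal, self-contained sketch (the data here are random
numbers made up for illustration, and the label is just dropped near the
middle of the fitted line -- adapt the coordinates and the R^2 computation
to your own data):

import numpy as np
import matplotlib.pyplot as plt

x = np.random.rand(50)
y = 2 * x + 0.1 * np.random.rand(50)
fit_fn = np.poly1d(np.polyfit(x, y, 1))
# R^2 = 1 - SS_res / SS_tot
r2 = 1 - np.sum((y - fit_fn(x))**2) / np.sum((y - y.mean())**2)
plt.plot(x, y, 'g.')
plt.plot(x, fit_fn(x), 'g')
# place the label near the middle of the fitted line; tweak to taste
plt.text(x.mean(), fit_fn(x.mean()), r'$R^2 = %.3f$' % r2)
plt.show()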

Hope this helps,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to make Tkinter Listbox entry insensitive?

2013-10-10 Thread Jason Swails
On Thu, Oct 10, 2013 at 2:29 PM, Skip Montanaro  wrote:

> > Removing inappropriate entries is not much of a hack.
>
> True, but then I have to go through the trouble of adding them back in
> should they become valid again. :-)
>

It seems that this could be handled fairly straight-forwardly by
subclassing either Listbox or Frame to implement your own, custom widget.
 The trick is to retain references to every entry within the widget, but
only embed it in the viewable area if it happens to be a valid entry at
that point.  Then all that's left is to hook events up to the proper
callbacks that implement various actions of your custom widgets using what
Tkinter is capable of doing.

Personally I prefer to subclass Frame since it allows me the maximum
flexibility (I think 90+% of the widgets I've written for my own
Tkinter-based programs do this).
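
A very rough sketch of what I mean (the names and the notion of "valid"
here are made up for illustration; a real widget would be more involved):

import Tkinter as tk   # "import tkinter as tk" on Python 3

class FilteredListbox(tk.Frame):
    """Keeps every entry ever added, but only shows the currently valid ones."""

    def __init__(self, master, validator=lambda item: True, **kwargs):
        tk.Frame.__init__(self, master)
        self.listbox = tk.Listbox(self, **kwargs)
        self.listbox.pack(fill=tk.BOTH, expand=True)
        self.entries = []            # references to *all* entries live here
        self.validator = validator   # decides which entries are valid right now

    def add(self, item):
        self.entries.append(item)
        self.refresh()

    def refresh(self):
        # re-populate the visible Listbox from the retained entries
        self.listbox.delete(0, tk.END)
        for item in self.entries:
            if self.validator(item):
                self.listbox.insert(tk.END, item)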

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Building C++ modules for python using GNU autotools, automake, whatever

2015-02-26 Thread Jason Swails
On Thu, 2015-02-26 at 07:57 -0800, [email protected] wrote:
> Hi,
> 
> I'm a complete neophyte to the whole use of GNU
> autotools/automake/auto... .  (I'm not sure what it should be called
> anymore.)  Regardless, I'm porting a library project, for which I'm a
> team member, to using this toolset for building in Linux.  I'm to the
> point now of writing the Makefile.am file for the actual library.
> (There are several other static libraries compiled first that are
> sucked into this shared object file.)  
> 
> I found some references here:
> http://www.gnu.org/savannah-checkouts/gnu/automake/manual/html_node/Python.html,
>  which seemed to be just what I was after.  However, I've got a big question 
> about a file named "module.la" instead of "module.so" which is what we 
> compile it to now.

I certainly hope module.la is not what it gets compiled to.  Open it up
with a text editor :).  It's just basically a description of the library
that libtool makes use of.  In the projects that I build, the .la files
are all associated with a .a archive or a .so (/.dylib for Macs).
Obviously, static archives won't work for Python (and, in particular, I
believe you need to compile all of the objects as position independent
code, so you need to make sure the appropriate PIC flag is given to the
compiler... for g++ that would be -fPIC).
> 
> I guess I should have mentioned some background.  Currently, we build
> this tool through some homegrown makefiles.  This has worked, but
> distribution is difficult and our product must now run on an embedded
> platform (so building it cleanly requires the use of autotools).  
> 
> Basically, I need this thing to install
> to /usr/lib/python2.6/site-packages when the user invokes "make
> install".  I thought the variables and primaries discussed at the link
> above were what I needed.  However, what is a "*.la"?  I'm reading up
> on libtool now, but will it function the same way as a *.so?

To libtool, yes... provided that you *also* have the .so with the same
base name as the .la.  I don't think compilers themselves make any use
of .la files, though.

HTH,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallelization of Python on GPU?

2015-02-26 Thread Jason Swails
On Thu, 2015-02-26 at 16:53 +, Sturla Molden wrote:
> GPU computing is great if you have the following:
> 
> 1. Your data structures are arrays floating point numbers.

It actually works equally great, if not better, for integers.

> 2. You have a data-parallel problem.

This is the biggest one, IMO. ^^^

> 3. You are happy with single precision.

NVidia GPUs have double-precision maths in hardware since compute
capability 1.3 (GTX 280).  That's ca. 2008.  In optimized CPU code, you
still get ~50% benefit going from double to single precision (it's
rarely ever that high, but 20-30% is commonplace in my experience of
optimized code).  It's admittedly a bigger hit on most GPUs, but there
are ways to work around it (e.g., fixed precision), and you can still do
double precision work where it's needed.  One of the articles I linked
previously demonstrates that a hybrid precision model (based on fixed
precision) provides exactly the same numerical stability as double
precision (which is much better than pure single precision) for that
application.

Double precision can often be avoided in many parts of a calculation,
using it only where those bits matter (like accumulators with
potentially small contributions, subtractions of two numbers of similar
magnitude, etc.).

> 4. You have time to code erything in CUDA or OpenCL.

This is the second biggest one, IMO. ^^^

> 5. You have enough video RAM to store your data.

Again, it can be worked around, but the frequent GPU->CPU xfers involved
if you can't fit everything on the GPU can be painstaking to limit its
potentially devastating effects on performance.

> 
> For Python the easiest solution is to use Numba Pro.

Agreed, although I've never actually tried PyCUDA before...

All the best,
Jason

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Parallelization of Python on GPU?

2015-02-26 Thread Jason Swails
On Thu, Feb 26, 2015 at 4:10 PM, Sturla Molden 
wrote:

> On 26/02/15 18:48, Jason Swails wrote:
>
>> On Thu, 2015-02-26 at 16:53 +, Sturla Molden wrote:
>>
>>> GPU computing is great if you have the following:
>>>
>>> 1. Your data structures are arrays floating point numbers.
>>>
>>
>> It actually works equally great, if not better, for integers.
>>
>
> Right, but not complicated data structures with a lot of references or
> pointers. It requires data are laid out in regular arrays, and then it acts
> on these arrays in a data-parallel manner. It is designed to process
> vertices in parallel for computer graphics, and that is a limitation which
> is always there. It is not a CPU with 1024 cores. It is a "floating point
> monster" which can process 1024 vectors in parallel. You write a tiny
> kernel in a C-like language (CUDA, OpenCL) to process one vector, and then
> it will apply the kernel to all the vectors in an array of vectors. It is
> very comparable to how GLSL and Direct3D vertex and fragment shaders work.
> (The reason for which should be obvious.) The GPU is actually great for a
> lot of things in science, but it is not a CPU. The biggest mistake in the
> GPGPU hype is the idea that the GPU will behave like a CPU with many cores.


Very well summarized.  At least in my field, though, it is well-known that
GPUs are not 'uber-fast CPUs'.  Algorithms have been redesigned, programs
rewritten to take advantage of their architecture.  It has been a *massive*
investment of time and resources, but (unlike the Xeon Phi coprocessor [1])
has reaped most of its promised rewards.

​--Jason

[1] I couldn't resist the jab.  The Xeon Phi costs several times as much as
the top-of-the-line NVidia gaming card, yet the GPU is about 15-20x faster...
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Adding a 'struct' into new python type

2015-03-07 Thread Jason Swails
On Sat, Mar 7, 2015 at 4:15 AM, Lakshmipathi.G 
wrote:

> Hi,
>
> I'm following this example :
> http://nedbatchelder.com/text/whirlext.html#h_making_a_type and trying
> to add
> new data into 'CountDict' type
>
> Adding a simple 'char' works well.
>
> typedef struct {
>PyObject_HEAD
>PyObject * dict;
>int count;
>char c;  //add this and placed an entry into PyMemberDef as T_CHAR.
> } CountDict;
>
> I can access  'c' from python code,no issues so far.
>
> Now I want to added 'struct type' into this 'CountDict' type.
> struct test {
> int x;
> };
>
> typedef struct {
>PyObject_HEAD
>PyObject * dict;
>int count;
>char c;
>struct test t1; //?? how to add this
> } CountDict;
>
> How to do achieve this? (Due to legacy code dependency, I can't use
> ctype/cpython etc).
> thanks for any help/pointers.
>

​The way *I* would do this is turn struct test into another Python object.
Then instead of defining it in CountDict as type "struct test", define it
as the PyStructTest that you assigned it to when you turned it into a
Python class.

For example, in that website you linked to, CountDict was turned into a
CountDictType Python type.  So do something similar to "test" in which you
turn it into a PyStructTest type in the same way.  Then declare it as a
PyStructTest pointer instead of a struct test.
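
In other words, roughly (just a sketch of the declarations -- PyStructTest
would of course need its own full PyTypeObject definition, not shown here):

/* the wrapped struct becomes its own Python object */
typedef struct {
    PyObject_HEAD
    int x;                  /* the payload from `struct test` */
} PyStructTest;

typedef struct {
    PyObject_HEAD
    PyObject * dict;
    int count;
    char c;
    PyStructTest * t1;      /* declared as the Python type, not `struct test` */
} CountDict;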

I'm not sure you can get around an approach like this.  I believe that in
order to access data from Python, it needs to be a Python type.  The
struct-to-Python-type conversion *knows* how to translate basic types, like
char, double, float, int, and long into their Python equivalents; but it
can't handle something as potentially complicated as an arbitrary struct.
Of course, if all you want to do is access t1 from C, then I think what you
have is fine.

​Good luck,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Brilliant or insane code?

2015-03-18 Thread Jason Swails
On Wed, 2015-03-18 at 00:35 +, Mark Lawrence wrote:
> I've just come across this 
> http://www.stavros.io/posts/brilliant-or-insane-code/ as a result of 
> this http://bugs.python.org/issue23695
> 
> Any and all opinions welcomed, I'm chickening out and sitting firmly on 
> the fence.

I'll go with clever, as most others have said here; I don't have much to
add to what Steven, Dan, Oscar, and Chris have said.  FTR, I do prefer
the:

def func(arr):
i = iter(arr)
return list(zip(i, i, i)) # py3's zip is py2's izip

I will poke at his benchmarking some, though; since they're really
unfairly biased toward the "magical" zip(*([iter(arr)]*3)) syntax:

He compares 3 different options:

In [5]: %timeit [(arr[3*x], arr[3*x+1], arr[3*x+2]) for x in range(len(arr)/3)]
10 loops, best of 3: 41.3 ms per loop

In [6]: %timeit numpy.reshape(arr, (-1, 3))
10 loops, best of 3: 25.3 ms per loop

In [7]: timeit zip(*([iter(arr)]*3))
100 loops, best of 3: 13.4 ms per loop

The first one spends a substantial amount of time doing the same
calculation -- 3*x.  You can trivially shave about 25% of the time off
that with no algorithmic changes just by avoiding tripling x so much:

In [8]: %timeit [(arr[x], arr[x+1], arr[x+2]) for x in xrange(0, len(arr), 3)]
10 loops, best of 3: 26.6 ms per loop

Any compiler would optimize that out, but the Python interpreter
doesn't.  As for numpy -- the vast majority of the time spent there is
in data copy.  As Oscar pointed out, numpy is almost always faster and
better at what numpy does well. If you run this test on a numpy array
instead of a list:

In [10]: %timeit numpy.reshape(arr2, (-1, 3))
10 loops, best of 3: 1.91 µs per loop

So here, option 2 is really ~4 orders of magnitude faster; but that's a
little cheating since no data is ever copied (reshape always returns a
view).  Doing an actual data copy, but always living in numpy, is a
little closer to pure Python (but still ~1 order of magnitude faster):

In [14]: %timeit np.array(arr2.reshape((-1, 3)))
1000 loops, best of 3: 307 µs per loop

So the iter-magic is really about 10-100x slower than an equivalent
numpy variant (and ~10^4x slower than an optimized numpy one), and only ~2x
faster than the more explicit option.

But it's still a nice trick (and one I may make use of in the
future :)).

Thanks for the post,
Jason


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Best way to calculate fraction part of x?

2015-03-24 Thread Jason Swails
On Mon, Mar 23, 2015 at 8:38 PM, Emile van Sebille  wrote:

> On 3/23/2015 5:52 AM, Steven D'Aprano wrote:
>
>  Are there any other, possibly better, ways to calculate the fractional
>> part
>> of a number?
>>
>
> float (("%6.3f" % x)[-4:])


In general you lose a lot of precision this way...
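
For example (x is an arbitrary value; math.modf and the % operator are the
usual full-precision alternatives):

import math
x = 123.456789
print(float(("%6.3f" % x)[-4:]))  # 0.457 -- only three decimal digits survive
print(math.modf(x)[0])            # full-precision fractional part
print(x % 1.0)                    # ditto
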
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is considered an "advanced" topic in Python?

2015-05-29 Thread Jason Swails
On Fri, May 29, 2015 at 5:06 PM,  wrote:

>
> > For example, the new (in 3.4) Enum class uses a metaclass.
> >
> >class SomeEnum(Enum):
> >   first = 1
> >   second = 2
> >   third = 3
> >
> > The metaclass changes normal class behavior to:
> >
> >- support iterating: list(SomeEnum) --> [SomeEnum.first,
> SomeEnum.second, SomeEnum.third]
> >- support a length:  len(SomeEnum) --> 3
> >- not allow new instances to be created:  --> SomeEnum(1) is
> SomeEnum(1)  # True
> >
> > --
> > ~Ethan~
>
> Regarding the first two, you can implement __iter__ and __len__ functions
> to create that functionality, though those functions would operate on an
> instance of the class, not the class itself.
>
> As for the third, can't you override the __new__ function to make attempts
> to create a new instance just return a previously created instance?
>

​Of course, but with metaclasses you don't *have* to (in fact, that's the
type of thing that the metaclass could do behind the scenes).

​In this case, any Enum subclass should *always* behave as Ethan described
-- without metaclasses that wouldn't necessarily be true (e.g., if the
person creating an Enum subclass didn't bother to correctly implement
__new__, __iter__, and __len__ for their subclass).

By using metaclasses, you can declare an Enum subclass as simply as Ethan
showed, since the metaclass does all of the dirty work implementing the
desired behavior at the time the class object is constructed (subclasses
inherit their parent class's metaclass).

In this use-case, you can think of it as a kind of "implicit class
factory".  Suppose you have a prescription for how you modify a class
definition (i.e., by implementing certain behavior in __new__ or __init__)
that you wrap up into some function "tweak_my_class".  The metaclass would
allow the class definition to be the equivalent of something like:

class MyClass(SubClass):
# whatever

MyClass = tweak_my_class(MyClass)

Or as a class decorator

@tweak_my_class
class MyClass(SubClass):
# whatever
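
In metaclass form, the same idea looks roughly like this (Python 3 syntax;
tweak_my_class is still just a hypothetical stand-in for the prescription):

def tweak_my_class(cls):
    cls.tweaked = True        # stand-in for whatever the prescription does
    return cls

class TweakMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        return tweak_my_class(cls)   # applied implicitly at class creation

class SubClass(metaclass=TweakMeta):
    pass

class MyClass(SubClass):      # tweak_my_class runs automatically here
    pass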

(In fact, the `six` module allows you to implement metaclasses as
decorators to work with both Python 2 and Python 3, but I think metaclasses
are more powerful in Py3 than they are in Py2).​
​
They are cool ideas, and I've used them in my own code, but they do have a
kind of magic-ness to them -- especially in codes that you didn't write but
are working on.  As a result, I've recently started to prefer alternatives,
but in some rare cases (like Enum, for example), they are just the best
solution.

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Memory error while using pandas dataframe

2015-06-10 Thread Jason Swails
On Mon, Jun 8, 2015 at 3:32 AM, naren  wrote:

> Memory Error while working with pandas dataframe.
>
> Description of Environment Windows 7 python 3.4.2 32-bit version pandas
> 0.16.0
>
> We are running into the error described below. Any help provided will be
> sincerely appreciated.
>
> We are able to read a 300MB Csv file into a dataframe using the read_csv
> function. While working with the dataframe we ran into memory error. We
> used the pd.Concat function to concatenate two dataframes. So we decided to
> use chunksize for lazy reading. Chunking returns an object of type
> TextFileReader.
>
>
> http://pandas.pydata.org/pandas-docs/stable/io.html#iterating-through-files-chunk-by-chunk
>
> We are able to iterate over this object once as a debugging measure. The
> iterator gets exhausted after iterating once. So we are not able to convert
> the TextFileReader object back into a dataframe, using the pd.concat
> function.
>
​It looks like you already figured out what your problem is.  The
TextFileReader is exhausted (i.e., at EOF), so you end up getting None from
it.​


​What is your question?  You want to be able to iterate through
TextFileReader again?

If so, try rewinding the file object that you passed to pd.concat.  If you
saved a reference to the file object, just call "seek(0)" on that object.
If you didn't, access it as the "f" attribute on the TextFileReader object
and call "seek(0)" on that instead.

That might work.  Otherwise, you should be more specific with your question
and provide a full segment of code that is as small as possible to
reproduce the error you're seeing.

HTH,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bug in floating point multiplication

2015-07-02 Thread Jason Swails
On Thu, Jul 2, 2015 at 10:52 AM, Steven D'Aprano 
wrote:

> Despite the title, this is not one of the usual "Why can't Python do
> maths?" "bug" reports.
>
> Can anyone reproduce this behaviour? If so, please reply with the version
> of
> Python and your operating system. Printing sys.version will probably do.
>
>
> x = 1 - 1/2**53
> assert x == 0.
> for i in range(1, 100):
> if int(i*x) == i:
> print(i); break
>
>
> Using Jython and IronPython, the loop runs to completion. That is the
> correct behaviour, or so I am lead to believe. Using Python 2.6, 2.7 and
> 3.3 on Centos and Debian, it prints 2049 and breaks. That should not
> happen. If you can reproduce that (for any value of i, not necessarily
> 2049), please reply.
>

​
As others have suggested, this is almost certainly a 32-bit vs. 64-bit
issue.  Consider the following C program:

// maths.c
#include <stdio.h>
#include <math.h>

int main() {
    double x;
    int i;
    x = 1-pow(0.5, 53);

    for (i = 1; i < 100; i++) {
        if ((int)(i*x) == i) {
            printf("%d\n", i);
            break;
        }
    }

    return 0;
}

For the most part, this should be as close to an exact transliteration of
your Python code as possible.

Here's what I get when I try compiling and running it on my 64-bit (Gentoo)
Linux machine with 32-bit compatible libs:

swails@batman ~/test $ gcc maths.c
swails@batman ~/test $ ./a.out
swails@batman ~/test $ gcc -m32 maths.c
swails@batman ~/test $ ./a.out
2049

That this happens at the C level in 32-bit mode is highly suggestive, I
think, since I believe these are the actual machine ops that CPython float
maths execute under the hood.

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bug in floating point multiplication

2015-07-03 Thread Jason Swails
On Fri, Jul 3, 2015 at 11:13 AM, Oscar Benjamin 
wrote:

> On 2 July 2015 at 18:29, Jason Swails  wrote:
> >
> > As others have suggested, this is almost certainly a 32-bit vs. 64-bit
> > issue.  Consider the following C program:
> >
> > // maths.h
> > #include 
> > #include 
> >
> > int main() {
> > double x;
> > int i;
> > x = 1-pow(0.5, 53);
> >
> > for (i = 1; i < 100; i++) {
> > if ((int)(i*x) == i) {
> > printf("%d\n", i);
> > break;
> > }
> > }
> >
> > return 0;
> > }
> >
> > For the most part, this should be as close to an exact transliteration of
> > your Python code as possible.
> >
> > Here's what I get when I try compiling and running it on my 64-bit
> (Gentoo)
> > Linux machine with 32-bit compatible libs:
> >
> > swails@batman ~/test $ gcc maths.c
> > swails@batman ~/test $ ./a.out
> > swails@batman ~/test $ gcc -m32 maths.c
> > swails@batman ~/test $ ./a.out
> > 2049
>
> I was unable to reproduce this on my system. In both cases the loops
> run to completion. A look at the assembly generated by gcc shows that
> something different goes on there though.
>
> The loop in the 64 bit one (in the main function) looks like:
>
> $ objdump -d a.out | less
> ...
> 400555:  pxor   %xmm0,%xmm0
> 400559:  cvtsi2sdl -0xc(%rbp),%xmm0
> 40055e:  mulsd  -0x8(%rbp),%xmm0
> 400563:  cvttsd2si %xmm0,%eax
> 400567:  cmp-0xc(%rbp),%eax
> 40056a:  jne400582 
> 40056c:  mov-0xc(%rbp),%eax
> 40056f:  mov%eax,%esi
> 400571:  mov$0x400624,%edi
> 400576:  mov$0x0,%eax
> 40057b:  callq  400410 
> 400580:  jmp40058f 
> 400582:  addl   $0x1,-0xc(%rbp)
> 400586:  cmpl   $0xf423f,-0xc(%rbp)
> 40058d:  jle400555 
> ...
>
> Where is the 32 bit one looks like:
>
> $ objdump -d a.out.32 | less
> ...
>  804843e:  fildl  -0x14(%ebp)
>  8048441:  fmull  -0x10(%ebp)
>  8048444:  fnstcw -0x1a(%ebp)
>  8048447:  movzwl -0x1a(%ebp),%eax
>  804844b:  mov$0xc,%ah
>  804844d:  mov%ax,-0x1c(%ebp)
>  8048451:  fldcw  -0x1c(%ebp)
>  8048454:  fistpl -0x20(%ebp)
>  8048457:  fldcw  -0x1a(%ebp)
>  804845a:  mov-0x20(%ebp),%eax
>  804845d:  cmp-0x14(%ebp),%eax
>  8048460:  jne8048477 
>  8048462:  sub$0x8,%esp
>  8048465:  pushl  -0x14(%ebp)
>  8048468:  push   $0x8048520
>  804846d:  call   80482f0 
>  8048472:  add$0x10,%esp
>  8048475:  jmp8048484 
>  8048477:  addl   $0x1,-0x14(%ebp)
>  804847b:  cmpl   $0xf423f,-0x14(%ebp)
>  8048482:  jle804843e 
> ...
>
> So the 64 bit one is using SSE instructions and the 32-bit one is
> using x87. That could explain the difference you see at the C level
> but I don't see it on this CPU (/proc/cpuinfo says Intel(R) Core(TM)
> i5-3427U CPU @ 1.80GHz).
>

​Hmm.  Well that could explain why you don't get the same results as me.
My CPU is an AMD FX(tm)-6100 Six-Core Processor (from /proc/cpuinfo).  My
objdump looks the same as yours for the 64-bit version, but for 32-bit it
looks like:

...
 804843a:   db 44 24 14             fildl  0x14(%esp)
 804843e:   dc 4c 24 18             fmull  0x18(%esp)
 8048442:   dd 5c 24 08             fstpl  0x8(%esp)
 8048446:   f2 0f 2c 44 24 08       cvttsd2si 0x8(%esp),%eax
 804844c:   3b 44 24 14             cmp    0x14(%esp),%eax
 8048450:   75 16                   jne    8048468
 8048452:   8b 44 24 14             mov    0x14(%esp),%eax
 8048456:   89 44 24 04             mov    %eax,0x4(%esp)
 804845a:   c7 04 24 10 85 04 08    movl   $0x8048510,(%esp)
 8048461:   e8 8a fe ff ff          call   80482f0
 8048466:   eb 0f                   jmp    8048477
 8048468:   83 44 24 14 01          addl   $0x1,0x14(%esp)
 804846d:   81 7c 24 14 3f 42 0f    cmpl   $0xf423f,0x14(%esp)
 8048474:   00
 8048475:   7e c3                   jle    804843a
...


However, I have no experience looking at raw assembler, so I can't discern
what it is I'm even looking at (nor do I know what explicit SSE
instructions look like in assembler).

I have a Mac that runs an Intel Core i5, and, like you, both 32- and 64-bit
versions run to completion.  Which is at least consistent with what others
are seeing with Python.

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Should iPython Notebook replace Idle

2015-07-03 Thread Jason Swails
On Fri, Jul 3, 2015 at 10:01 PM, Sayth Renshaw 
wrote:

> In future releases of Python should ipython Notebooks replace idle as the
> default tool for new users to learn python?


> This would as I see it have many benefits?
>
> 1. A nicer more usual web interface for new users.
> 2. Would allow the python documentation and tutorials to be distributed as
> ipython notebooks which would allow new users to play and interact with the
> tutorials as they proceed. No download separate code retyping just edit run
> and play.
> 3. Would allow teachers to setup notebooks knowing that all users have the
> same default environment, no need for setting up virtualenvs etc.
> 4. Strengthen the learning base and for new python developers as a whole.
>
> Thoughts?
>

IPython and IDLE are different.  IPython is *just* an interactive Python
interpreter with a ton of tweaks and enhancements.  IDLE, by contrast, is
an upscale interpreter (not *nearly* as feature-complete as IPython),
but it's also an IDE.  AFAICT, IPython does not do this.

Also, look at the IPython dependencies for its core functionalities:

- jinja2
- sphinx
- pyzmq
- pygments
- tornado
- PyQt | PySide

None of these are part of the Python standard library.  By contrast, IDLE
is built entirely with stdlib components (tkinter for the GUI).  AFAIK,
nothing in the stdlib depends on anything outside of it.  And addition to
the Python stdlib imposes some pretty serious restrictions on a library.
If the IPython team agreed to release their tools with the stdlib instead
of IDLE, they'd have to give up a lot of control over their project:

- License
- Release schedule
- Development environment

Everything gets swallowed into Python.  I can't imagine this ever happening.

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bug in floating point multiplication

2015-07-06 Thread Jason Swails
On Mon, Jul 6, 2015 at 11:44 AM, Oscar Benjamin 
wrote:

> On Sat, 4 Jul 2015 at 02:12 Jason Swails  wrote:
>
>> On Fri, Jul 3, 2015 at 11:13 AM, Oscar Benjamin <
>> [email protected]> wrote:
>>
>>> On 2 July 2015 at 18:29, Jason Swails  wrote:
>>>
>>> Where is the 32 bit one looks like:
>>>
>>> $ objdump -d a.out.32 | less
>>> ...
>>>
>>  804843e:  fildl  -0x14(%ebp)
>>>  8048441:  fmull  -0x10(%ebp)
>>>  8048444:  fnstcw -0x1a(%ebp)
>>>  8048447:  movzwl -0x1a(%ebp),%eax
>>>  804844b:  mov$0xc,%ah
>>>  804844d:  mov%ax,-0x1c(%ebp)
>>>  8048451:  fldcw  -0x1c(%ebp)
>>>  8048454:  fistpl -0x20(%ebp)
>>>  8048457:  fldcw  -0x1a(%ebp)
>>>  804845a:  mov-0x20(%ebp),%eax
>>>  804845d:  cmp-0x14(%ebp),%eax
>>>  8048460:  jne8048477 
>>>  8048462:  sub$0x8,%esp
>>>  8048465:  pushl  -0x14(%ebp)
>>>  8048468:  push   $0x8048520
>>>  804846d:  call   80482f0 
>>>  8048472:  add$0x10,%esp
>>>  8048475:  jmp8048484 
>>>  8048477:  addl   $0x1,-0x14(%ebp)
>>>  804847b:  cmpl   $0xf423f,-0x14(%ebp)
>>>  8048482:  jle804843e 
>>> ...
>>>
>>> So the 64 bit one is using SSE instructions and the 32-bit one is
>>> using x87. That could explain the difference you see at the C level
>>> but I don't see it on this CPU (/proc/cpuinfo says Intel(R) Core(TM)
>>> i5-3427U CPU @ 1.80GHz).
>>>
>>
>> ​Hmm.  Well that could explain why you don't get the same results as me.
>> My CPU is a
>> AMD FX(tm)-6100 Six-Core Processor
>> ​ (from /proc/cpuinfo).  My objdump looks the same as yours for the
>> 64-bit version, but for 32-bit it looks like:
>>
>
> So if we have different generated machine instructions it suggests a
> difference in the way it was compiled rather than in the hardware itself.
> (Although it could be that the compilers were changed because the hardware
> was inconsistent in this particular usage).
>

​I had assumed that the different compilations resulted from different
underlying hardware (i.e., the instructions used on the Intel chip were
either unavailable, or somehow deemed heuristically inferior on the AMD
chip).​

>
>
>> However, I have no experience looking at raw assembler, so I can't
>> discern what it is I'm even looking at (nor do I know what explicit SSE
>> instructions look like in assembler).
>>
>
> The give away is that SSE instructions use the XMM registers so where you
> see %xmm0 etc data is being loaded ready for SSE instructions. I'll
> translate the important part of the 32 bit code below:
>

​Oh of course.  I *have* seen/worked with code that uses routines from
xmmintrin.h, so I should've been able to piece the xmm together with the
SSE instructions.
​

> [
> ​snip]​
> This means that the x87 register will be storing a higher precision result
> in its 80 bit format. This result will have to be rounded by the FSTPL
> instruction.
>
> If you look at the assembly output I showed you'll see the instructions
> FNSTCW/FLDCW (x87 store/load control word) which are used to manipulate the
> control word to tell the FPU how to perform this kind of rounding. The fact
> that we don't see it in your compiled output could indicate a C compiler
> bug which could in turn explain the different behaviour people see from
> Python.
>
> To understand exactly why 2049 is the number where it fails consider that
> it is the smallest integer that requires 12 bits of mantissa in floating
> point format. The number 1-.5**53 has a mantissa that is 53 ones:
>
> >>> x = 1-.5**53
> >>> x.hex()
> '0x1.fp-1'
>
> When extended to 80 bit real-extended format with a 64 bit mantissa it
> will have 11 trailing zeros. So I think multiplication of x with any
> integer less than 2049 can be performed exactly by the FMULL instruction. I
> haven't fully considered what impact that would have but it seems
> reasonable that this is why 2049 is the first number that fails.
>

​Wow.  Great discussion and description -- I'm convinced.​

Thanks a lot,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Should non-security 2.7 bugs be fixed?

2015-07-22 Thread Jason Swails
I am a little late to the party, but I feel that I have something to
contribute to this discussion.  Apologies for the top-post, but it's not
really in response to any particular question; it's more of a "this is my
story with Python 2.7".
six to support both Py2 and 3 in the same code base and I'm slowly making
the transition.

I have written Python code for primarily Linux machines only since 2008,
and until the last year or two, exclusively for Python 2.  In my brief,
early forays into Python 3, I experienced mostly discomfort.  Discovering
2to3 alleviated some of that. But my general lack of need for anything
beyond ASCII meant that when my abstracted file reading/writing routines
started returning/requiring a mishmash of bytes and str depending on what
kind of file was opened sent me into a TypeError/AttributeError rabbithole
and made me give up.

But Python 2 continued to get developed (eventually culminating in Python
2.7 and its continual improvements), and many good features of Python 3
found their way into Python 2.7 for a while.  As I got better with Python 2.7,
and finally abandoned Python <2.7, I revisited Python 3 support through the
six module and found that it was really not too bad maintaining a single
code base for both Python 2.7 and Python 3+.  While I was working on that,
though, I still had Python 2.7 to rely on as a safety net.  Basically,
Python 2.7 was my gateway into Python 3.  Where it was more similar to
Python 3, the transition was easier (modules like urllib and website
processing as well as Tkinter coding are some of my more difficult tasks,
since I have to constantly look up where stuff has moved and figure out how
compatible they are between versions). I have also since discovered
TextIOWrapper and come to understand the nature of the bytes/str split and
when to expect which, so that's no longer a problem (even though I still
never need Unicode).

And the more I use six, the more I find that I'm discarding Python 2
baggage (like range and zip in exchange for xrange and izip) and using the
Python 3 replacements through six or __future__ (absolute imports, print
function, division, ...).  And gems like OrderedDict being made available
to Python 2.7 did a lot to convince me to shed my allegiance to Python
<=2.6, getting me even closer to Python 3.

Where I used to see the Py3 overhaul of I/O as an incomprehensible mess
(because it broke all my code!), it now appears as an elegant solution and
I find myself having to patch (fortunately only a few) deficiencies in
Python 2 that simply don't exist in Python 3's superior design.  For
example:

# Works in Python 3, not in Python 2
from six.moves.urllib.request import urlopen
import bz2
import gzip
from io import TextIOWrapper

TextIOWrapper(bz2.BZ2File(urlopen('http://www.somewebsite.com/path/to/file.bz2')))
TextIOWrapper(gzip.GzipFile(fileobj=urlopen('http://www.somewebsite.com/path/to/file.gz')))

So for Python 2, my file handling routine has to download the entire
contents to a BytesIO and feed *that* to bz2.decompress or gzip.GzipFile,
which can be a bottleneck if I only want to inspect the headers of many
large files (which I sometimes do).  But the workaround exists and my code
can be written to support both Python 2 and Python 3 without much hassle.
If I run that code under Python 3, I get a huge performance boost in some
corner cases thanks to the improved underlying design.
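
For the curious, the Python 2 workaround I'm describing looks roughly like
this (the URLs are placeholders):

from io import BytesIO
import bz2
import gzip
from six.moves.urllib.request import urlopen

# read the *entire* remote file into memory first...
data = urlopen('http://www.somewebsite.com/path/to/file.bz2').read()
text = bz2.decompress(data)
# ...or wrap it in a BytesIO so GzipFile gets a seekable file object
data = urlopen('http://www.somewebsite.com/path/to/file.gz').read()
text = gzip.GzipFile(fileobj=BytesIO(data)).read()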

Python 3 is the future, and thanks to how *good* Python 2.7 is, I am ready
to make that leap and shed some extra code baggage when the popular
versions of the popular Linux distros make the move to a default system
Python 3 (and they will... someday).

I know my experiences don't hold true for everybody, but I also don't think
they are uncommon (I know several colleagues that share many aspects of
them).  And for me, the *better* Python 2.7 becomes, and the longer it's
kept around, the easier (and more fun!) it makes my transition to Python
3.  So for me at least, arguments like "don't make Python 2.7 too good or
people won't switch" are not only wrong, but in actuality
counter-productive.

Apologies for the novel,
Jason

On Sat, Jul 18, 2015 at 7:36 PM, Terry Reedy  wrote:

> I asked the following as an off-topic aside in a reply on another thread.
> I got one response which presented a point I had not considered.  I would
> like more viewpoints from 2.7 users.
>
> Background: each x.y.0 release normally gets up to 2 years of bugfixes,
> until x.(y+1).0 is released.  For 2.7, released summer 2010, the bugfix
> period was initially extended to 5 years, ending about now.  At the spring
> pycon last year, the period was extended to 10 years, with an emphasis on
> security and build fixed.  My general question is what other fixes should
> be made?  Some specific forms of this question are the following.
>
> If the vast majority of Python programmers are focused on 2.7, why are
> volunteers to help fix 2

Re: Is there a way to install ALL Python packages?

2015-07-22 Thread Jason Swails
On Tue, Jul 21, 2015 at 10:58 PM, ryguy7272  wrote:

> On Monday, July 20, 2015 at 10:57:47 PM UTC-4, ryguy7272 wrote:
> > I'd like to install ALL Python packages on my machine.  Even if it takes
> up 4-5GB, or more, I'd like to get everything, and then use it when I need
> it.  Now, I'd like to import packages, like numpy and pandas, but nothing
> will install.  I figure, if I can just install everything, I can simply use
> it when I need it, and if I don't need it, then I just won't use it.
> >
> > I know R offers this as an option.  I figure Python must allow it too.
> >
> > Any idea  how to grab everything?
> >
> > Thanks all.
>
>
> Thanks for the tip.  I just downloaded and installed Anaconda.  I just
> successfully ran my first Python script.  So, so happy now.  Thanks again!!
>

​Anaconda (or the freely-available miniconda) is what I was going to
suggest.  If you are coming from R, then you likely want the scientific
computing stack (scipy, numpy, pandas, scikit-learn, pytables, statsmodels,
matplotlib, ... the list goes on).  And for that (which links extensively
to C and even Fortran libraries), conda blows pip out of the water.​  There
are other solutions (like Enthought's Canopy distribution, for example),
but conda is so nice that I really have little incentive to try others.
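
If it helps anyone else making the same move from R, a typical first step
after installing miniconda is something along these lines (the package
names here are the usual conda ones):

conda install numpy scipy pandas matplotlib scikit-learn statsmodels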

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How Do I ............?

2015-07-31 Thread Jason Swails
On Fri, Jul 31, 2015 at 9:21 PM, Steve Burrus 
wrote:

> How Do I access tkinter in Python 3.4 anyway? I've tried and  tried but
> cannot do it.
>

​You import it.

If I play mind-reader for a second, I suspect you're trying to do in Python
3 what you did in Python 2.  That won't work -- the Tkinter module layout
has completely changed between Python 2 and Python 3.  For starters,
instead of doing:

import Tkinter

like you did in Python 2, you need to do

import tkinter

in Python 3.  There are several other changes, like formerly standalone
modules that are now submodules of the tkinter package (e.g., tkMessageBox
is now tkinter.messagebox).  To get a more complete list of name changes, you can
Google something like "tkinter python 2 to python 3", which will give you a
page like this:
http://docs.pythonsprints.com/python3_porting/py-porting.html.

Personally, I don't bother with that.  I have my working Tkinter code from
Python 2 and simply look at what the "2to3" module spits out during its
conversion.  That's often a good way to figure out "how the heck do I do
something in Python 3" when you have a script written for Python 2 that
works.
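
If you want one script that runs under both Python 2 and 3 without going
through 2to3, a minimal sketch of the usual try/except import dance
(covering only the names mentioned above) is:

try:
    import tkinter as tk            # Python 3
    from tkinter import messagebox
except ImportError:
    import Tkinter as tk            # Python 2
    import tkMessageBox as messagebox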

If that doesn't answer your question, then chances are your Python wasn't
built with Tkinter support (like the system Pythons in many Linux
distributions).  In that case you need to install the appropriate package
(depends on your distro).
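(On Debian/Ubuntu-style systems, for example, that package is typically
called python3-tk; other distros use different names.)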

​Or if that *still* doesn't answer your question, then provide enough
information so that someone can actually figure out what went wrong ;).​

HTH,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: problem with netCDF4 OpenDAP

2015-08-13 Thread Jason Swails
On Thu, Aug 13, 2015 at 6:32 AM, Tom P  wrote:

> I'm having a problem trying to access OpenDAP files using netCDF4.
> The netCDF4 is installed from the Anaconda package. According to their
> changelog, openDAP is supposed to be supported.
>
> netCDF4.__version__
> Out[7]:
> '1.1.8'
>
> Here's some code:
>
> url = '
> http://www1.ncdc.noaa.gov/pub/data/cmb/ersst/v3b/netcdf/ersst.201507.nc'
> nc = netCDF4.Dataset(url)
>
> I get the error -
> netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.__init__
> (netCDF4/_netCDF4.c:9551)()
>
> RuntimeError: NetCDF: file not found
>
>
> However if I download the same file, it works -
> url = '/home/tom/Downloads/ersst.201507.nc'
> nc = netCDF4.Dataset(url)
> print nc
>  . . . .
>
> Is it something I'm doing wrong?


​Yes.  URLs are not files and cannot be opened like normal files.  netCDF4
*requires* a local file as far as I can tell.

All the best,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: problem with netCDF4 OpenDAP

2015-08-14 Thread Jason Swails

> On Aug 14, 2015, at 3:18 AM, Tom P  wrote:
> 
> Thanks for the reply but that is not what the documentation says.
> 
> http://unidata.github.io/netcdf4-python/#section8
> "Remote OPeNDAP-hosted datasets can be accessed for reading over http if a 
> URL is provided to the netCDF4.Dataset constructor instead of a filename. 
> However, this requires that the netCDF library be built with OPenDAP support, 
> via the --enable-dap configure option (added in version 4.0.1).”

​Huh, so it does.  Your error message says "file not found", though, which 
suggested to me that it's trying to interpret the NetCDF file as a local file 
instead of a URL.  Indeed, when I run that example, the traceback is more 
complete (the traceback you printed had omitted some information):

>>> netCDF4.Dataset('http://www1.ncdc.noaa.gov/pub/data/cmb/ersst/v3b/netcdf/ersst.201507.nc')
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or 
SCAN_ERROR
context: (the server's HTML 404 page) 404 Not Found -- The requested URL
/pub/data/cmb/ersst/v3b/netcdf/ersst.201507.nc.dds was not found on this
server.
Traceback (most recent call last):
  File "", line 1, in 
  File "netCDF4/_netCDF4.pyx", line 1547, in netCDF4._netCDF4.Dataset.__init__ 
(netCDF4/_netCDF4.c:9551)
RuntimeError: NetCDF: file not found

So it’s clear that netCDF4 is at least *trying* to go online to look for the 
file, but it simply can’t find it.  Since the docs say it’s linking to libcurl, 
I tried using curl to download the file (curl -# 
http://www1.ncdc.noaa.gov/pub/data/cmb/ersst/v3b/netcdf/ersst.201507.nc > 
test.nc) and it worked fine.  What’s more, it *seems* like the file 
(/pub/.../ersst.201507.nc.dds) was decorated with the ‘.dds’ suffix for some 
reason (not sure if the server redirected the request there or not).  But this 
looks like a netCDF4 issue.  Perhaps you can go to their project page on Github 
and file an issue there -- they will be more likely to have your answer than 
people here.

HTH,
Jason

> 
> and for the Anaconda package -
> http://docs.continuum.io/anaconda/changelog
> "2013-05-08: 1.5.0:
> Highlights:
>  updates to all important packages: python, numpy, scipy, ipython, 
> matplotlib, pandas, cython
>  added netCDF4 (with OpenDAP support) on Linux and MacOSX"
> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list

--
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: netcdf read

2015-09-01 Thread Jason Swails
On Tue, Sep 1, 2015 at 2:07 PM, Chris Angelico  wrote:

> On Wed, Sep 2, 2015 at 3:23 AM,   wrote:
> > I'm starting in the Python scripts. I run this script:
> >
> >
> > import numpy as np
> >
> > import netCDF4
> >
> > f = netCDF4.Dataset('uwnd.mon.ltm.nc','r')
> >
> >
> > f.variables
> >
> >
> > and I had the message:
> >
> >
> > netcdf4.py
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > NameError: name 'netcdf4' is not defined
> >
> >
> > What can I do to solve this.
>
> My crystal ball tells me you're probably running Windows.
>

​Or Mac OS X.  Unless you go out of your way to specify otherwise, the
default OS X filesystem is case-insensitive.

All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Script To Remove Files Made Either By Python Or Git

2015-10-09 Thread Jason Swails
On Fri, Oct 9, 2015 at 6:08 AM, Joshua Stokes 
wrote:

> Hi
>
> Is there an available script to remove file created by either using the
> Python module or by using git?
>

​There's always this nugget:

git clean -fxd

This will get rid of *all* untracked files in the current directory of a
git repo (and recursively all subdirectories).  You can optionally specify
a directory at the end of that command.

Careful with this sledgehammer, though, as it will also trash any untracked
source code files as well (and you may never get them back).
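
If you're not sure what it will catch, a dry run with the -n flag, e.g.

git clean -nxd

will list what would be removed without actually deleting anything.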

HTH,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Strong typing implementation for Python

2015-10-12 Thread Jason Swails
On Mon, Oct 12, 2015 at 1:50 PM, Bartc  wrote:

> On 12/10/2015 18:20, Marko Rauhamaa wrote:
>
>> Bartc :
>>
>> (Example, calling fib(40) on the example below took 90 seconds on
>>> Python 3.4, 11 seconds with PyPy, but only 1.8 seconds running the
>>> equivalent with FreeBasic:
>>>
>>
>> I don't know what you need fibonacci numbers for,
>>
>
> It's a benchmark that gives you an idea of how efficient an implementation
> is at doing function calls.
>
> but speed is not the
>> essence of most programming tasks.
>>
>
> They've been wasting their time with PyPy then! Everyone likes a bit more
> speed. It can mean being able to have a solution all within the same
> language.


​Marko said most.  Not all.  And I would agree with that (I'm a
computational scientist, where we put more emphasis on performance than
almost anywhere else).  A lot of our tools need to be hand-optimized using
either assembler or compiler intrinsics to get the most performance
possible out of the machine, but the *vast* majority of our daily
programming does not require this.  Only the most computationally intensive
kernels do (which themselves are small portions of the main simulation
engines!).

Performance only matters when it allows you to do something that you
otherwise couldn't.  pypy makes some things possible that otherwise wouldn't
be, but there's a reason why CPython is still used overwhelmingly more than
pypy for scientific computing.

​All the best,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Stylistic question regarding no-op code and tests

2015-10-14 Thread Jason Swails
Hi everyone,

I'd like to get some opinions about some coding constructs that may seem at
first glance to serve no purpose, but do have *some* non-negligible
purpose, and I think I've come to the right place :).

The construct is this:

def my_function(arg1, arg2, filename=None):
    """ Some function. If a file is given, it is processed """
    # Some code that performs my_function
    if filename is not None:
        process_file(filename)
    else:
        pass

My question is, what do you think of the "else: pass" statement?  It is a
complete no-op and is functionally equivalent to the same code with those
lines removed.  Up until earlier today, I would look at that and cringe (I
still do, a little).

What I recently realized, though, is that what this construct allows is for
the coverage testing package (which I have recently started employing for
my project... thanks Ned and others!) to detect whether or not both code
paths are covered in the test suite.

I think my opinion here is that this construct is useful when the two code
paths are operationally very different from each other and one of them is
an unusual path that you are not 100% sure is well-covered in your test
suite, but that your first instinct should still be to avoid such code.

What do you think?

Thanks!
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Stylistic question regarding no-op code and tests

2015-10-15 Thread Jason Swails
On Wed, Oct 14, 2015 at 10:07 PM, Ben Finney 
wrote:

> Jason Swails  writes:
>
> > What I recently realized, though, is that what this construct allows is
> > for the coverage testing package (which I have recently started
> > employing for my project... thanks Ned and others!) to detect whether
> > or not both code paths are covered in the test suite.
>
> Coverage.py has, for many releases now, had good measurement of branch
> coverage by your tests. Enable it with the ‘--branch’ option to ‘run’
>

​Oh, cool.  I'm actually using coverage indirectly through nose, so I
haven't really looked through the coverage docs (although nosetests has a
--cover-branches option that toggles this feature).  Now I can go back to
cringing about "else: pass" in peace :).
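
(For the archives, the two spellings mentioned above look roughly like
this, with the test target being a placeholder:

coverage run --branch my_tests.py
nosetests --with-coverage --cover-branches)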

Thanks!
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is the meaning of Py_INCREF a static PyTypeObject?

2015-11-12 Thread Jason Swails
On Thu, Nov 12, 2015 at 3:05 AM, Xiang Zhang <[email protected]> wrote:

> Recently I am learning Python C API.
>
> When I read the tutorial <
> https://docs.python.org/3/extending/newtypes.html#the-basics>, defining
> new types, I feel confused. After PyType_Ready(&noddy_NoddyType) comes
> Py_INCREF(&noddy_NoddyType). Actually noddy_NoddyType is a static struct so
> I don't understand why I need to Py_INCREF it. Since it's Py_INCREFed, does
> it mean sometimes we also need to Py_DECREF it? But then it seems that
> type_dealloc will be invoked and it will fail assert(type->tp_flags &
> Py_TPFLAGS_HEAPTYPE);
>

​It is a module attribute, so when the module is imported it has to have a
single reference (the reference *in* the module).  If you don't INCREF it,
then it will have a refcount of 0, and immediately be ready for garbage
collection.  So if you try to use the type from the module, you could get a
segfault because it's trying to use an object (type definition) that was
already destroyed.

Note that you don't *always* have to INCREF objects after you create them
in C.  Some macros and functions do that for you.  And in some cases, all
you want or need is a borrowed reference.  In those cases, Py_INCREF is
unnecessary.

The DECREF will be done when it's normally done in Python.  If you do
something like

import noddy
del noddy.NoddyType

​All that's really doing is removing NoddyType from the noddy namespace and
Py_DECREFing it.  Alternatively, doing

import noddy
noddy.NoddyType = 10 # rebind the name

Then the original object NoddyType was pointing to will be DECREFed and
NoddyType will point to an object taking the value of 10.

HTH,
Jason
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Matplotlib Colouring outline of histogram

2014-06-20 Thread Jason Swails
On Fri, Jun 20, 2014 at 4:10 AM, Jamie Mitchell  wrote:

> Hi folks,
>
> Instead of colouring the entire bar of a histogram i.e. filling it, I
> would like to colour just the outline of the histogram. Does anyone know
> how to do this?
> Version - Python2.7
>

Look at the matplotlib.pyplot.hist function documentation:
http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist

In addition to the listed parameters, you'll see that the "Other Parameters"
it accepts are those that can be applied to the created Patch objects (which are
the actual rectangles).  For the Patch keywords, see the API documentation
on the Patch object (
http://matplotlib.org/api/artist_api.html#matplotlib.patches.Patch). So you
can do one of two things:

1) Pass the necessary Patch keywords to effect what you want

e.g. (untested):
import matplotlib.pyplot as plt

plt.hist(dataset, bins=10, range=(-5, 5), normed=True,
 edgecolor='b', linewidth=2, facecolor='none', # Patch options
)

plt.show()

2) Iterate over the Patch instances returned by plt.hist() and set the
properties you want.

e.g. (untested):
import matplotlib.pyplot as plt

n, bins, patches = plt.hist(dataset, bins=10, range=(-5, 5), normed=True)
for patch in patches:
    patch.set_edgecolor('b') # color of the lines around each bin
    patch.set_linewidth(2) # Set width of bin edge
    patch.set_facecolor('none') # set no fill
    # Anything else you want to do

plt.show()

Approach (1) is the "easy" way, and is there to satisfy the majority of use
cases.  However, approach (2) is _much_ more flexible.  Suppose you wanted
to highlight a particular region of your data with a specific facecolor or
edgecolor -- you can apply the features you want to individual patches
using approach (2).  Or if you wanted to highlight a specific bin with
thicker lines.

This is a common theme in matplotlib -- you can use keywords to apply the
same features to every part of a plot or you can iterate over the drawn
objects and customize them individually.  This is a large part of what
makes matplotlib nice to me -- it has a "simple" mode as well as a
predictable API for customizing a plot in almost any way you could possibly
want.

HTH,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Matplotlib Colouring outline of histogram

2014-06-20 Thread Jason Swails
On Fri, Jun 20, 2014 at 10:27 AM, Jamie Mitchell <
[email protected]> wrote:

>
> That's great Jason thanks for the detailed response, I went with the
> easier option 1!
>
> I am also trying to put hatches on my histograms like so:
>
> plt.hist(dataset,bins=10,hatch=['*'])
>
> When it comes to plt.show() I get the following error message:
> ​[snip]
>
>   File
> "/usr/local/sci/lib/python2.7/site-packages/matplotlib-1.3.1-py2.7-linux-x86_64.egg/matplotlib/path.py",
> line 888, in hatch
> hatch_path = cls._hatch_dict.get((hatchpattern, density))
> TypeError: unhashable type: 'list'
>
> Do you have any idea why this is happening?
>

lists are mutable types, so they are not hashable (and therefore cannot be
used as dictionary keys).  You need an immutable type (which _is_
hashable) to act as a dictionary key, like strings, tuples, and basic
number types (int, float, etc.).
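
A quick illustration at the interpreter:

>>> d = {}
>>> d[('*',)] = 1   # a tuple is hashable, so it works as a key
>>> d[['*']] = 1    # a list is not
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'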

The hatch should be a string (allowable symbols are given in the API
documentation).  So try

plt.hist(dataset, bins=10, hatch='*')

HTH,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: This Python 3 is killing Python thread is killing me.

2014-07-16 Thread Jason Swails
On Wed, Jul 16, 2014 at 11:40 AM, Mark Lawrence 
wrote:

> On 16/07/2014 18:32, Deb Wyatt wrote:
>
>> Can you all stop already with the non python US bashing?  Please?
>>
>> Deb in WA, USA
>>
>>
> rr started it with a fairly impressive piece of trolling but as you've
> asked so politely I will happily oblige.


​I honestly don't understand why you haven't kill-filed him yet.  I can
understand people wanting to respond to jmf to prevent newbies and the
Unicode-ignorant from thinking the FSR is not a good thing (or
fundamentally wrong), although I've killfiled him as well. [1] But nobody
will confuse rr's posts with something of value (their only possible use
can be to populate http://en.wikipedia.org/wiki/List_of_logical_fallacies
and http://en.wikipedia.org/wiki/Ad_hominem).  There's nobody to protect
from rr-induced misconceptions. [http://xkcd.com/386/]

My life lurking and learning on Python-list has been dramatically improved
since I began to instantiate filters; I highly recommend it.

Cheers,
Jason

[1] Seen one and you've seen them all, and I'm no unicode expert.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: This Python 3 is killing Python thread is killing me.

2014-07-16 Thread Jason Swails
On Wed, Jul 16, 2014 at 1:27 PM, Mark Lawrence 
wrote:

>
> The difference between our most illustrious resident unicode expert and rr
> is that the former has only said anything of use once, whereas the latter
> does know about tkinter/IDLE.  rr doesn't show up that often, the MIRUC has
> been spewing his mistruths for nearly two years and IMHO should have been
> booted into touch a long time ago.


"Know about" is awfully vague.  Start with 5 things:

1) A browser opened to effbot
2) An open python interpreter
3) A willingness to build widgets and widget collections as Frame subclasses
4) A willingness to fingerpaint with Canvas objects to create custom widgets
5) A useful(ish) program to write

Within a couple hours I learned everything I later saw in all of rr's
Tkinter posts, albeit without the irrelevant condescension. (I've written 3
Tkinter-based GUIs, all simple... I'm no expert with it.)

But that's quite beside the point.  When rr says something 'useful' about
using Python, he probably doesn't need to be corrected.  When he doesn't,
it's often incoherent verbiage with big words, flashy/illogical/outrageous
comparisons, and ad hominem attacks aimed at everyone that's disagreed with
him on the interwebs.  While I occasionally found it satisfying to fire
back and bask in my own logical and moral superiority, the little corner of
my life I devote to python-list is far more peaceful and fulfilling (not to
mention productive) now.

One last opinion before I sign off on this thread, I make an active effort
to attach my name to useful contributions on the web and cut down on the
useless.  I don't want my name associated with the idea "a lot of what he
sends is useless ranting or useless retaliation thereof".  With the volume
of material available on the web, I try to be careful not to make a poor
impression with anything I author (although that is unavoidable sometimes).

All the best,
Jason

P.S. And nobody will think you're just like  if you
don't bite back in a public forum.
-- 
https://mail.python.org/mailman/listinfo/python-list

