date:20140918

Controlling ALLUSERS property in non-interactive MSI installer (Python 3.4.1)

2014-09-18 Thread norman . ives

Hey list

I need to install a private copy of Python on Windows 7 and 8, for use only by 
one specific tool. The install should happen as part of the installation of the 
tool.

I can control TARGETDIR and extensions (via ADDLOCAL) but I'm failing to 
install "just for this user".

I've been trying variations on

msiexec /i python-3.4.1.msi /qb! ALLUSERS=0 WHICHUSERS=JUSTME

I found the WHICHUSERS property by inspecting the install log (/l*v).

This is not working - I still end up with an install using ALLUSERS=1, 
according to the log.

Can anyone comment on this issue? Typical users will not have admin rights, so 
I assume in that case ALLUSERS will default to 0, but I have not yet tested.

Thanks for your attention!
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Why `divmod(float('inf'), 1) == (float('nan'), float('nan'))`

2014-09-18 Thread cool-RR

On Thursday, September 18, 2014 6:12:08 AM UTC+3, Steven D'Aprano wrote:
> cool-RR wrote:
> > Chris, why is this invariant `div*y + mod == x` so important? Maybe it's
> > more important to return a mathematically reasonable result for the
> > floor-division result than to maintain this invariant?
>
> You keep talking about floor(inf) == inf being "mathematically reasonable",
> but I'm not convinced that it is. Can you justify why you think it is
> mathematically reasonable?

http://i.imgur.com/9SoBbXG.png

> [1] But *which* mathematical infinity? One of the cardinal infinities, the
> alephs, or one of the ordinal infinities, the omegas and the epsilons?

The alephs are about sizes of sets, they have nothing to do with limits. When 
talking about limits, which is what this is about, there is no need for any 
variety of infinities.


Thanks,
Ram.

Hierarchical consolidation in Python

2014-09-18 Thread ap501228

I am looking for some tips as to how Python could be used to solve a simple 
business problem involving consolidation of financial data across a company  
with a number of business units rolling up to a department and departments 
rolling up to the whole organization.

Company = Department(1)+Department(2)+...Department (N)
Department(1)= Unit(1,1)+Unit(1,2)+...+Unit(1,K)
...
Department(N)= Unit(N,1)+Unit(N,2)+..+Unit(N,K)

Suppose,for each unit, Unit(i,j) we have a time series of 3 variables:

Income(i,j) , Expenses(i,j) and Surplus(i,j) = Income(i,j)- Expenses(i,j) 

Required to find:

(1)Income, Expenses and Surplus consolidated for all units within a Department; 
and
(2)Income, Expenses and Surplus consolidated for all departments within the 
company.

I would welcome any help in expressing this problem directly in Python. Also, 
are there any Python modules or packages that enable problems of this type to 
be solved. Thanks in advance for any help.

Re: Hierarchical consolidation in Python

2014-09-18 Thread Chris Angelico

On Thu, Sep 18, 2014 at 6:57 PM,   wrote:
> Required to find:
>
> (1)Income, Expenses and Surplus consolidated for all units within a 
> Department; and
> (2)Income, Expenses and Surplus consolidated for all departments within the 
> company.
>
> I would welcome any help in expressing this problem directly in Python. Also, 
> are there any Python modules or packages that enable problems of this type to 
> be solved. Thanks in advance for any help.

This actually sounds more like a data-driven problem than a
code-driven one. If you store everything in a database table, with one
row for each business unit, and their departments, income, and
expenses, you could then do queries like this:

SELECT sum(income), sum(expenses), sum(income-expenses) as surplus
FROM units GROUP BY department

and that'd give you the per-department stats. Drop the GROUP BY to get
stats for the whole company.
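
As a rough sketch of the database approach, the same kind of query can be run from Python with the standard-library sqlite3 module. The table layout, the extra department column in the SELECT, and the sample figures are all made up for illustration; a real schema would also need a period column for the time series.

```python
import sqlite3

# Hypothetical table and sample figures, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE units (department TEXT, unit TEXT, income REAL, expenses REAL)"
)
conn.executemany(
    "INSERT INTO units VALUES (?, ?, ?, ?)",
    [
        ("Sales", "North", 120.0, 80.0),
        ("Sales", "South", 200.0, 150.0),
        ("Support", "Desk", 50.0, 70.0),
    ],
)

# Per-department totals: one row per department.
per_department = conn.execute(
    "SELECT department, sum(income), sum(expenses), sum(income - expenses) "
    "FROM units GROUP BY department ORDER BY department"
).fetchall()

# Company-wide totals: the same sums without the GROUP BY.
company = conn.execute(
    "SELECT sum(income), sum(expenses), sum(income - expenses) FROM units"
).fetchone()

print(per_department)
print(company)
```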

If the job's simple enough, or if you already have the data in some
other format, you could do the same thing in Python code. It's
basically the same thing again - accumulate data by department, or
across the whole company.
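
A minimal pure-Python sketch of that accumulation, assuming the data is already in memory as one (income, expenses) pair per period for each unit. The dictionary layout, the names consolidate and with_surplus, and the numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical input: one (income, expenses) pair per period for each unit.
units = {
    ("Sales", "North"): [(100, 60), (120, 80)],
    ("Sales", "South"): [(200, 150), (210, 140)],
    ("Support", "Desk"): [(50, 70), (55, 65)],
}

def consolidate(units):
    """Roll units up to departments, and departments up to the company."""
    departments = defaultdict(dict)   # department -> period -> [income, expenses]
    company = {}                      # period -> [income, expenses]
    for (department, _unit), series in units.items():
        for period, (income, expenses) in enumerate(series):
            # Add each unit's figures to both its department and the company.
            for totals in (departments[department], company):
                row = totals.setdefault(period, [0, 0])
                row[0] += income
                row[1] += expenses
    return departments, company

def with_surplus(totals):
    """Turn period -> [income, expenses] into (income, expenses, surplus) tuples."""
    return [(inc, exp, inc - exp) for _, (inc, exp) in sorted(totals.items())]

departments, company = consolidate(units)
for name in sorted(departments):
    print(name, with_surplus(departments[name]))
print("Company", with_surplus(company))
```

Surplus is derived after summing rather than stored, since it is just income minus expenses at every level.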

ChrisA

Re: Why `divmod(float('inf'), 1) == (float('nan'), float('nan'))`

2014-09-18 Thread Steven D'Aprano

Marko Rauhamaa wrote:

> Maybe IEEE had some specific numeric algorithms in mind when it
> introduced inf and nan. However, I have a feeling an exception would be
> a sounder response whenever the arithmetics leaves the solid ground.

I'm afraid that you're missing the essential point of INF and quiet NANs,
namely that they *don't* cause an exception. That is their point.

Back in the Dark Ages of numeric computing, prior to IEEE-754, all was
chaos. Numeric computing was a *mess*. To give an example of how bad it
was, there were well-known computers where:

x != 0

would pass, but then:

1.0/x

would fail with a Division By Zero error (which could mean a segfault).
Another machine could have 1.0*x overflow; a whole class of IBM machines
had 1.0*x lop off one bit of precision each time you called it, so that
multiplying by one would gradually and irreversibly change the number. Chip
designers had a cavalier attitude towards the accuracy of floating point
arithmetic, preferring to optimise for speed even when the result was
wrong. Writing correct, platform-independent floating point code was next
to impossible.

When IEEE-754 was designed, the target was low-level languages similar to C,
Pascal, Algol, Lisp, etc. There were no exceptions in the Python sense, but
many platforms provided signals, where certain operations could signal an
exceptional case and cause an interrupt. IEEE-754 standardised those
hardware-based signals and required any compliant system to provide them.

But it also provided a mechanism for *not* interrupting a long running
calculation just because an exception occurred. Remember that not all
exceptions are necessarily fatal. You can choose whether exceptions in a
calculation will cause a signal, or quietly continue. It even defines two
different kinds of NANs, signalling and quiet NANs: signalling NANs are
supposed to signal, always, and quiet NANs are supposed to either silently
propagate or signal, whichever you choose.

Instead of peppering your code with dozens, even hundreds of Look Before You
Leap checks for error conditions, or installing a single signal handler
which will catch exceptions from anywhere in your application, you have the
choice of also allowing calculations to continue to the end even if they
reach an exceptional case. You can then inspect the result and decide what
to do: report an error, re-do the calculation with different values, skip
that iteration, whatever is appropriate.
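
To illustrate the "run to the end, then inspect" style with Python floats, here is a small sketch. The function mean_ratio and its inputs are invented for the example; the only behaviour relied on is that a NaN operand makes the result NaN.

```python
import math

def mean_ratio(numerators, denominators):
    """Average of n/d; a NaN in any term quietly propagates to the result."""
    total = 0.0
    for n, d in zip(numerators, denominators):
        total += n / d
    return total / len(numerators)

nan = float("nan")
good = mean_ratio([1.0, 2.0], [2.0, 4.0])
bad = mean_ratio([1.0, nan], [2.0, 4.0])

# No checks inside the loop: inspect the final result once and decide.
for label, result in (("good", good), ("bad", bad)):
    if math.isnan(result):
        print(label, "-> invalid input somewhere; skipping this result")
    else:
        print(label, "->", result)
```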

The standard even gives NANs a payload, so that they can carry diagnostic
information. For instance, NAN[1] might mean 0/0, while NAN[2] might mean
INF-INF. The Standard Apple Numerics Environment (SANE) in the 1980s and
90s supported that, and it worked really well. Alas, I don't know any other
language or library that even offers a way to inspect the NAN payload, let
alone promises to set it consistently.

In any case, other error handling strategies continue to work, or at least
they are supposed to work.

A good way to understand how the IEEE-754 standard is supposed to work is to
read and use the decimal.py module. (Strictly speaking, decimal doesn't
implement IEEE-754, but another, similar, standard.) Python's binary
floats, which are a thin wrapper around the platform C libraries, are sad
and impoverished compared to what IEEE-754 offers.
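
As a short sketch of what the decimal module offers here: the same operation can either raise or quietly return a NaN, depending on whether the corresponding trap is enabled in the context. The variable names are only for illustration.

```python
from decimal import Decimal, InvalidOperation, localcontext

inf = Decimal("Infinity")

# With the trap enabled, INF - INF raises a Python exception.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = True
    try:
        inf - inf
        trapped = False
    except InvalidOperation:
        trapped = True

# With the trap disabled, the same operation quietly returns NaN and
# records the condition in the context flags for later inspection.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False
    ctx.clear_flags()
    quiet = inf - inf
    flagged = bool(ctx.flags[InvalidOperation])

print(trapped, quiet, flagged)
```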

-- 
Steven


Re: Hierarchical consolidation in Python

2014-09-18 Thread Mark Lawrence


On 18/09/2014 09:57, [email protected] wrote:

> I am looking for some tips as to how Python could be used to solve a simple
> business problem involving consolidation of financial data across a company
> with a number of business units rolling up to a department and departments
> rolling up to the whole organization.
>
> Company = Department(1)+Department(2)+...Department (N)
> Department(1)= Unit(1,1)+Unit(1,2)+...+Unit(1,K)
> ...
> Department(N)= Unit(N,1)+Unit(N,2)+..+Unit(N,K)
>
> Suppose,for each unit, Unit(i,j) we have a time series of 3 variables:
>
> Income(i,j) , Expenses(i,j) and Surplus(i,j) = Income(i,j)- Expenses(i,j)
>
> Required to find:
>
> (1)Income, Expenses and Surplus consolidated for all units within a Department;
> and
> (2)Income, Expenses and Surplus consolidated for all departments within the
> company.
>
> I would welcome any help in expressing this problem directly in Python. Also,
> are there any Python modules or packages that enable problems of this type to
> be solved. Thanks in advance for any help.



Maybe complete overkill here but this animal http://pandas.pydata.org/ 
is worth looking at.  I'll quote "pandas is an open source, BSD-licensed 
library providing high-performance, easy-to-use data structures and data 
analysis tools for the Python programming language.".


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread cool-RR

My function gets an iterable of an unknown type. I want to check whether it's 
ordered. I could check whether it's a `set` or `frozenset`, which would cover 
many cases, but I wonder if I can do better. Is there a nicer way to check 
whether an iterable is ordered or not? 


Thanks,
Ram.

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Chris Angelico

On Thu, Sep 18, 2014 at 9:55 PM, cool-RR  wrote:
> My function gets an iterable of an unknown type. I want to check whether it's 
> ordered. I could check whether it's a `set` or `frozenset`, which would cover 
> many cases, but I wonder if I can do better. Is there a nicer way to check 
> whether an iterable is ordered or not?
>

An iterable is always ordered. You call next() and you get the next
value. Are you asking if there's a way to find out if the order
matters? Not easily. What's your use-case? Why do you need to know?

Also, you're still using Google Groups, which means your formatting is
b0rked. Please can you use something better, or else look into fixing
this.

ChrisA

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Roy Smith

In article ,
 Chris Angelico  wrote:

> On Thu, Sep 18, 2014 at 9:55 PM, cool-RR  wrote:
> > My function gets an iterable of an unknown type. I want to check whether 
> > it's ordered. I could check whether it's a `set` or `frozenset`, which 
> > would cover many cases, but I wonder if I can do better. Is there a nicer 
> > way to check whether an iterable is ordered or not?
> >
> 
> An iterable is always ordered. You call next() and you get the next
> value.

I suspect what he meant was "How can I tell if I'm iterating over an 
ordered collection?", i.e. iterating over a list vs. iterating over a 
set.

Is there anything which requires an iterator to be deterministic?  For 
example, let's say I have an iterable, i, and I do:

list1 = [item for item in i]
list2 = [item for item in i]

am I guaranteed that list1 == list2?  It will be for all the collections 
I can think of in the standard library, but if I wrote my own class with 
an __iter__() which yielded the items in a non-deterministic order, 
would I be violating something other than the principle of least 
astonishment?

Best approach to get data from web page continuously

2014-09-18 Thread Juan Christian

I'll write a Python (3.4.1) script to fetch new data (topics)
from this page (http://steamcommunity.com/app/440/tradingforum)
continuously.

All the topics follow this structure: <a href="http://steamcommunity.com/app/440/tradingforum/TOPIC_ID/">

It will work like this: I'll get the latest topics, do some background
checks on user level, inventory value, account age, and other
things, and if the user passes the checks, I'll print some info and links in
the terminal. The only thing I need to know to start is: what's the best
way to get this data? Beautiful Soup 4 + requests? urllib? Others?

Thanks.

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Chris Angelico

On Thu, Sep 18, 2014 at 10:58 PM, Roy Smith  wrote:
> I suspect what he meant was "How can I tell if I'm iterating over an
> ordered collection?", i.e. iterating over a list vs. iterating over a
> set.

Right, which is what I meant by asking if the order mattered. When you
iterate over a set, you'll get some kind of order (because iterables
have to have an order), but it won't mean anything.

> Is there anything which requires an iterator to be deterministic?  For
> example, let's say I have an iterable, i, and I do:
>
> list1 = [item for item in i]
> list2 = [item for item in i]
>
> am I guaranteed that list1 == list2?  It will be for all the collections
> I can think of in the standard library, but if I wrote my own class with
> an __iter__() which yielded the items in a non-deterministic order,
> would I be violating something other than the principle of least
> astonishment?

It's not guaranteed. If you do exactly that, with no operations in
between, then yes, all the stdlib collections will (AFAIK) give you
matching lists, but that's definitely not required by iterator
protocol.

The one thing you can rely on (and therefore must comply with, when
you design an iterable) is that iteration will hit every element
exactly once. Implementing that on most collections means returning
the values in some internal representational order, something that's
consistent across the lifetime of the iterator; having that not be
consistent across multiple iterators would be the exception, not the
rule.

There would be a few special cases where you specifically *want to*
have the lists differ, though. Imagine DNS records: perhaps you have
four A records for some name, and you want to distribute load between
them [1]. You could have your name server do something like this:

class _cl_iter:
    def __init__(self, lst, start):
        self._data = lst
        self._start = self._next = start
    def __iter__(self): return self
    def __next__(self):
        if self._next is None: raise StopIteration
        val = self._data[self._next]
        self._next = (self._next + 1) % len(self._data)
        if self._next == self._start: self._next = None
        return val

class CircularList:
    def __init__(self, it):
        self._data = list(it)
        self._next = -1
    def __iter__(self):
        self._next = (self._next + 1) % len(self._data)
        return _cl_iter(self._data, self._next)

Every time you iterate over a given CircularList, you'll get the same
results, but starting at a different point:

>>> lst = CircularList(("192.0.2.1","192.0.2.2","192.0.2.3","192.0.2.4"))
>>> list(lst)
['192.0.2.1', '192.0.2.2', '192.0.2.3', '192.0.2.4']
>>> list(lst)
['192.0.2.2', '192.0.2.3', '192.0.2.4', '192.0.2.1']
>>> list(lst)
['192.0.2.3', '192.0.2.4', '192.0.2.1', '192.0.2.2']
>>> list(lst)
['192.0.2.4', '192.0.2.1', '192.0.2.2', '192.0.2.3']

So if you have a whole bunch of these for your different A records,
and you return them in iteration order to each client, you'll end up
with different clients getting them in a different order, with minimal
extra storage space. (This is Py3 code; to make it Py2 compatible,
probably all you need to do is subclass object and rename __next__ to
next.)

But this is a pretty unusual case. I would expect that most objects
will either iterate consistently until mutated, or return only what
wasn't previously consumed (like an iterator, which is itself
iterable).

ChrisA

[1] This isn't true load-balancing, of course, but it's a simple way
to distribute requests a bit. It's better than nothing.

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Terry Reedy

On 9/18/2014 8:58 AM, Roy Smith wrote:

> I suspect what he meant was "How can I tell if I'm iterating over an
> ordered collection?", i.e. iterating over a list vs. iterating over a
> set.

One can check whether the iterable is a tuple, list, range, or tuple or 
list iterator (the latter not being re-iterable).

>>> type(iter([]))
<class 'list_iterator'>
>>> type(iter(()))
<class 'tuple_iterator'>
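
One way to generalise that type check is to test against the abstract base classes in collections.abc instead of a fixed list of concrete types. This is only a heuristic sketch; the function name looks_ordered and its three-valued return are assumptions made for the example, and nothing in the language guarantees that a Sequence's order is meaningful to the caller.

```python
from collections.abc import Sequence, Set

def looks_ordered(iterable):
    """Heuristic only: True for sequence types, False for sets, else None."""
    if isinstance(iterable, Sequence):     # list, tuple, range, str, ...
        return True
    if isinstance(iterable, Set):          # set, frozenset, ...
        return False
    return None                            # unknown: iterators, generators, custom classes

print(looks_ordered([1, 2, 3]))
print(looks_ordered(frozenset([1, 2, 3])))
print(looks_ordered(iter([1, 2, 3])))
```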

> Is there anything which requires an iterator to be deterministic?

No. An iterator can yield random numbers, input from a non-deterministic 
source -- human or mechanical, or items from a collection in shuffled 
order.  Generators that do so can easily be turned into the __iter__ 
method of a class.

> For example, let's say I have an iterable, i, and I do:
>
> list1 = [item for item in i]
> list2 = [item for item in i]

If i is an iterator or other non-reiterable, list2 will be empty.
If i is an instance of a class with a non-deterministic __iter__ method, 
list2 will not necessarily be either empty or a copy of list1.
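
Both cases can be reproduced in a few lines. The Shuffled class below is a made-up example of a class with a non-deterministic __iter__, not something from the standard library.

```python
import random

class Shuffled:
    """Hypothetical container whose __iter__ yields items in a random order."""
    def __init__(self, items):
        self._items = list(items)
    def __iter__(self):
        items = self._items[:]
        random.shuffle(items)
        return iter(items)

it = iter([1, 2, 3])            # an iterator: can only be consumed once
first = [item for item in it]
second = [item for item in it]
print(first, second)            # the second pass finds nothing left

s = Shuffled(range(10))
a = [item for item in s]
b = [item for item in s]
print(sorted(a) == sorted(b))   # same elements each time...
print(a == b)                   # ...but the order may or may not match
```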

> am I guaranteed that list1 == list2?

Clearly not.

> It will be for all the collections I can think of in the standard
> library, but if I wrote my own class with
> an __iter__() which yielded the items in a non-deterministic order,
> would I be violating something other than the principle of least
> astonishment?

There should not be any astonishment.  'Iterable' is a much broader 
category than 'deterministically re-iterable iterable'.

--
Terry Jan Reedy


Re: Why `divmod(float('inf'), 1) == (float('nan'), float('nan'))`

2014-09-18 Thread Grant Edwards

On 2014-09-18, Steven D'Aprano  wrote:
> Marko Rauhamaa wrote:
>
>> Maybe IEEE had some specific numeric algorithms in mind when it
>> introduced inf and nan. However, I have a feeling an exception would be
>> a sounder response whenever the arithmetics leaves the solid ground.
>
> I'm afraid that you're missing the essential point of INF and quiet NANs,
> namely that they *don't* cause an exception. That is their point.

And it is a very important point.  I spent a number of years working
with floating point in process control where the non-signalling
(propagating) behavior of IEEE inf and NaNs was exactly the right
thing.

You've got a set of inputs, and a set of outputs each of which depends
on some (but not usually all) of the inputs.  When one of the inputs
goes invalid (NaN) or wonky (creating an infinity), it's vitally
important that the computations _all_ got carried out and _all_ of the
outputs got calculated and updated.  Not updating an output was simply
not an option.

Some outputs end up as NaNs or infinities and some are valid, but
_all_ of them get set to the proper value for the given set of inputs.

Using exceptions would have required a whole "shadow" set of
calculations and logic to try to figure out which outputs were still
valid and could be calculated, and which ones were not valid.  It
would have at least tripled the amount of code required -- and it
probably wouldn't have worked right.
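
A toy version of that situation, with invented signal names and formulas: every output is computed on every pass, and a bad input only taints the outputs that actually depend on it.

```python
import math

def compute_outputs(inputs):
    """Every output is always computed; bad inputs only taint what depends on them."""
    return {
        "flow_total": inputs["flow_a"] + inputs["flow_b"],
        "temp_delta": inputs["temp_out"] - inputs["temp_in"],
        "efficiency": inputs["flow_a"] / inputs["temp_in"],
    }

# Hypothetical readings: one sensor (flow_b) has gone invalid.
readings = {"flow_a": 3.0, "flow_b": float("nan"), "temp_in": 20.0, "temp_out": 35.0}
outputs = compute_outputs(readings)
for name, value in sorted(outputs.items()):
    status = "INVALID" if math.isnan(value) or math.isinf(value) else "ok"
    print(name, value, status)
```

No exception handling or dependency tracking is needed: the NaN itself carries the "this output is not valid" information.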

-- 
Grant Edwards               grant.b.edwards        Yow! My mind is a potato
                                  at               field ...
                              gmail.com

Re: Controlling ALLUSERS property in non-interactive MSI installer (Python 3.4.1)

2014-09-18 Thread Joel Goldstick

On Thu, Sep 18, 2014 at 3:22 AM,   wrote:
> Hey list
>
> I need to install a private copy of Python on Windows 7 and 8, for use only 
> by one specific tool. The install should happen as part of the installation 
> of the tool.
>
> I can control TARGETDIR and extensions (via ADDLOCAL) but I'm failing to 
> install "just for this user".
>
> I've been trying variations on
>
> msiexec /i python-3.4.1.msi /qb! ALLUSERS=0 WHICHUSERS=JUSTME
>
> I found the WHICHUSERS property by inspecting the install log (/l*v).
>
> This is not working - I still end up with an install using ALLUSERS=1, 
> according to the log.
>
> Can anyone comment on this issue? Typical users will not have admin rights, 
> so I assume in that case ALLUSERS will default to 0, but I have not yet 
> tested.
>
> Thanks for your attention!

You may want to check out this:
http://virtualenvwrapper.readthedocs.org/en/latest/install.html
VirtualEnv creates a python environment that is separate from that on
the system, and is accessible by a user

--
Joel Goldstick
http://joelgoldstick.com

Re: the python shell window is already executing a command

2014-09-18 Thread Seymore4Head

On Wed, 17 Sep 2014 23:50:56 -0400, Terry Reedy 
wrote:

>On 9/17/2014 9:34 PM, Seymore4Head wrote:
>> On Wed, 17 Sep 2014 18:56:47 -0400, Terry Reedy 
>
>>> A little digging with Idle's grep (Find in Files) shows that the message
>>> is produced by this code in idlelib/PyShell.py, about 825.
>>>
>>>  def display_executing_dialog(self):
>>>  tkMessageBox.showerror(
>>>  "Already executing",
>>>  "The Python Shell window is already executing a command; "
>>>  "please wait until it is finished.",
>>>  master=self.tkconsole.text)
>>>
>>> This function is only called here (about line 735)
>>>  def runcommand(self, code):
>>>  "Run the code without invoking the debugger"
>>>  # The code better not raise an exception!
>>>  if self.tkconsole.executing:
>>>  self.display_executing_dialog()
>>>  
>>>
>>> How is this run?  Run-Module F5 invokes
>>> ScriptBinding.run_module_event(116) and thence _run_module_event (129).
>>> This method includes this:
>>>  if PyShell.use_subprocess:
>>>  interp.restart_subprocess(with_cwd=False)
>>>
>>> restart_subprocess includes these lines (starting at 470):
>>>  # Kill subprocess, spawn a new one, accept connection.
>>>  self.rpcclt.close()
>>>  self.terminate_subprocess()
>>>  console = self.tkconsole
>>>  ...
>>>  console.executing = False  # == self.tkconsole
>>>  ...
>>>  self.transfer_path(with_cwd=with_cwd)
>>>
>>> transfer_path calls runcommand but only after tkconsole.executing has
>>> been set to False.  But this only happens if PyShell.use_subprocess is
>>> True, which it normally is, but not if one starts Idle with the -n option.
>>>
>>> After conditionally calling interp.restart_subprocess, _run_module_event
>>> directly calls interp.runcommand, which can fail when running with -n.
>>> Are you?  This is the only way I know to get the error message.  If so,
>>> the second way to not get the error message is to not use -n and run
>>> normally.
>>
>> Sorry.  I don't speak python yet.  Quite a few of the above terms are
>> new to me.
>>
>> It may be that I was trying to run the program again before the current
>> one was finished.  In the past I was getting the error when I was
>> (almost) sure the program had finished.  I will be more careful in the
>> future, but I will also keep an eye out for the problem to repeat.
>> I just tried to run the above program again and gave it more time to
>> finish and I did not get the error, so it could well be I was jumping
>> the gun.
>
>My question was "How do you start Idle?"
>(It can make a difference.)

The way I start IDLE is to go to my programs folder and right click on
file.py in the directory and select "edit with IDLE".

Re: Best approach to get data from web page continuously

2014-09-18 Thread Joel Goldstick

On Thu, Sep 18, 2014 at 9:30 AM, Juan Christian
 wrote:
> I'll write a Python (3.4.1) script to fetch new data (topics)
> from this page (http://steamcommunity.com/app/440/tradingforum)
> continuously.
>
> All the topics follow this structure: <a href="http://steamcommunity.com/app/440/tradingforum/TOPIC_ID/">
>
> It will work like this: I'll get the latest topics, do some background
> checks on user level, inventory value, account age, and other
> things, and if the user passes the checks, I'll print some info and links in
> the terminal. The only thing I need to know to start is: what's the best
> way to get this data? Beautiful Soup 4 + requests? urllib? Others?

Requests is a lot simpler than urllib.  I've used BS4.  There is
something called scrapy that is similar I think
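
Whichever library does the fetching, the parsing step can be sketched with the standard library alone. This is only an illustration run on a hard-coded sample string: the URL pattern is taken from the example link in the question, the assumption that TOPIC_ID is numeric is a guess, and the names TopicLinkParser and new_topics are made up. Fetching the real page (and polling it politely) is left out.

```python
from html.parser import HTMLParser
import re

# Assumed URL shape, based on the example link in the question; the
# numeric TOPIC_ID pattern is a guess.
TOPIC_RE = re.compile(r"^http://steamcommunity\.com/app/440/tradingforum/(\d+)/?$")

class TopicLinkParser(HTMLParser):
    """Collect topic IDs from <a href="..."> tags."""
    def __init__(self):
        super().__init__()
        self.topic_ids = []
    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = TOPIC_RE.match(href)
        if match:
            self.topic_ids.append(match.group(1))

def new_topics(html, seen):
    """Return topic IDs found in html that are not already in seen, and record them."""
    parser = TopicLinkParser()
    parser.feed(html)
    fresh = [t for t in parser.topic_ids if t not in seen]
    seen.update(fresh)
    return fresh

sample = (
    '<a href="http://steamcommunity.com/app/440/tradingforum/111/">one</a>'
    '<a href="http://steamcommunity.com/app/440/tradingforum/222/">two</a>'
    '<a href="http://steamcommunity.com/app/440/discussions/">other</a>'
)
seen = set()
print(new_topics(sample, seen))   # first poll: both topics are new
print(new_topics(sample, seen))   # second poll: nothing new
```

Keeping a set of already-seen IDs is what makes repeated polling cheap: only topics that appear for the first time go on to the background checks.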

>
> Thanks.
>
>



-- 
Joel Goldstick
http://joelgoldstick.com

Best practice for opening files for newbies?

2014-09-18 Thread chris . barker

Folks,

I'm in the position of teaching Python to beginners (beginners to Python, 
anyway).

I'm teaching Python2 -- because that is still what most of the code "in the 
wild" is in. I do think I'll transition to Python 3 fairly soon, as it's not 
too hard for folks to back-port their knowledge, but for now, it's Py2 -- and 
I'm hoping not to have that debate on this thread. 

But I do want to keep the 2->3 transition in mind, so where it's not too hard, 
want to teach approaches that will transition well to py3.

So: there are way too many ways to open a simple file to read or write a bit of 
text (or binary):

open()
file()
io.open()
codecs.open()

others???

I'm thinking that the way to go now with modern Py2 is:

from io import open

then use open() .

IIUC, this will give the user an open() that behaves the same way as py3's 
open() (identical?).

The only issue (so far) I've run into is this:

In [51]: f = io.open("test_file.txt", 'w')

In [52]: f.write("some string")
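---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-...> in <module>()
----> 1 f.write("some string")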

TypeError: must be unicode, not str

I'm OK with that -- I think it's better for folks learning py2 now to get used 
to Unicode up front anyway.

But any other issues? Is this a good way to go?

By the way: I note that the default encoding for io.open on my system (OS-X) is 
utf-8, despite:
In [54]: sys.getdefaultencoding()
Out[54]: 'ascii'

How is that determined?
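
A small sketch that may help with both points. As far as I know, when no encoding is passed, io.open() takes its default from locale.getpreferredencoding(False), which is unrelated to sys.getdefaultencoding(). The file name and the sample text are arbitrary, and the code is written so that it should run unchanged on both 2.7 and 3.x.

```python
import io
import locale
import os
import sys
import tempfile

# The default text encoding used by io.open() comes from the locale,
# not from sys.getdefaultencoding().
print(locale.getpreferredencoding(False), sys.getdefaultencoding())

path = os.path.join(tempfile.mkdtemp(), "test_file.txt")

# Passing encoding explicitly avoids depending on the machine's locale.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"some string \u2713")   # must be unicode text, not bytes

with io.open(path, "r", encoding="utf-8") as f:
    text = f.read()

with io.open(path, "rb") as f:
    raw = f.read()                   # bytes, exactly as stored on disk

print(len(text), len(raw))           # characters vs. encoded bytes
```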

-CHB


hashlib suddenly broken

2014-09-18 Thread Larry Martell

I am on a mac running 10.8.5, python 2.7

Suddenly, many of my scripts started failing with:

ValueError: unsupported hash type sha1

Googling this showed that it's an issue with hashlib with a common
cause being a file called hashlib.py that gets in the way of the
interpreter finding the standard hashlib module, but that doesn't seem
to be the case:

>>> import hashlib
ERROR:root:code for hash sha1 was not found.
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 139, in <module>
    globals()[__func_name] = __get_hash(__func_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 103, in __get_openssl_constructor
    return __get_builtin_constructor(name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor
    raise ValueError('unsupported hash type %s' % name)
ValueError: unsupported hash type sha1

And that file has not changed any time recently:

$ ls -l /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py
-rw-r--r--  1 root  wheel  5013 Apr 12  2013 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py

This just started happening yesterday, and I cannot think of anything
that I've done that could cause this.

Re: Best practice for opening files for newbies?

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 2:19 AM,   wrote:
> So: there are way too many ways to open a simple file to read or write a bit 
> of text (or binary):
>
> open()

Personally, I'd just use this, all the way through - and not importing
from io, either. But others may disagree.

Be clear about what's text and what's bytes, everywhere. When you do
make the jump to Py3, you'll have to worry about text files vs binary
files, and if you need to support Windows as well as Unix, you need to
get that right anyway, so just make sure you get the two straight.
Going Py3 will actually make your job quite a bit easier, there; but
even if you don't, save yourself a lot of trouble later on by keeping
the difference very clear. And you can save yourself some more
conversion trouble by tossing this at the top of every .py file you
write:

from __future__ import print_function, division, unicode_literals

But mainly, just go with the simple open() call and do the job the
easiest way you can. And go Py3 as soon as you can, because ...

> because that is still what most of the code "in the wild" is in.

... this statement isn't really an obvious truth any more (it's hard
to say what "most" code is), and it's definitely not going to remain
so for the long-term future. For people learning Python today, unless
they plan on having a really short career in programming, more of
their time will be after 2020 than before it, and Python 3 is the way
to go.

Plus, it's just way WAY easier to get Unicode right in Py3 than in
Py2. Save yourself the hassle!

ChrisA

Re: Why `divmod(float('inf'), 1) == (float('nan'), float('nan'))`

2014-09-18 Thread chris . barker

On Wednesday, September 17, 2014 11:22:42 PM UTC-7, [email protected] wrote:
> >>> 1e300*1e300
> inf
> >>> exp(1e300)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: math range error

FWIW, numpy is a bit more consistent:

In [89]: numpy.exp(1e300)
Out[89]: inf

This is more critical in numpy, because that result may have been one of a
huge array of values -- you really don't want the entire array operation to
raise an Exception because of one odd value.

It'd be nice if Python's math module did more than simply wrap the default
implementation of the underlying C lib -- it's gotten better over the years 
(Inf and NaN used to be really hard to get), but still not quite what it could 
be.
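
For anyone who wants the quiet behaviour from the math module today, a thin wrapper is one option. This is just a sketch; exp_or_inf is a made-up name, not something the standard library provides.

```python
import math

def exp_or_inf(x):
    """Like math.exp, but return inf on overflow instead of raising."""
    try:
        return math.exp(x)
    except OverflowError:
        return float("inf")

print(1e300 * 1e300)      # float multiplication overflows quietly
print(exp_or_inf(1e300))  # the wrapper mirrors that behaviour
print(exp_or_inf(1.0))
```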

-Chris


Re: hashlib suddenly broken

2014-09-18 Thread John Gordon

In  Larry Martell 
 writes:

> Googling this showed that it's an issue with hashlib with a common
> cause being a file called hashlib.py that gets in the way of the
> interpreter finding the standard hashlib module, but that doesn't seem
> to be the case:

Perhaps hashlib imports some other module which has a local module of the
same name?
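
One way to check for shadowing is to ask Python where it found each module. The sketch below uses Python 3 APIs (importlib.util.find_spec, hashlib.algorithms_available), so it would need adapting for the 2.7 install in question, and the helper module names are CPython implementation details that vary between versions.

```python
import hashlib
import importlib.util

# Where was hashlib actually loaded from? A path outside the standard
# library would point to a shadowing file.
print(hashlib.__file__)

# Which of the C helper modules can be found? These names are CPython
# implementation details and may not all exist on a given version.
for name in ("_hashlib", "_sha1", "_md5"):
    spec = importlib.util.find_spec(name)
    print(name, "->", spec.origin if spec else "not found")

print(sorted(hashlib.algorithms_available))
print(hashlib.sha1(b"abc").hexdigest())
```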

SHA1 has been deprecated for some time.  Maybe a recent OS update finally
got rid of it altogether?

-- 
John Gordon                   Imagine what it must be like for a real medical doctor to
[email protected]              watch 'House', or a real serial killer to watch 'Dexter'.


Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Tim Chase

On 2014-09-18 08:58, Roy Smith wrote:
> I suspect what he meant was "How can I tell if I'm iterating over
> an ordered collection?", i.e. iterating over a list vs. iterating
> over a set.
> 
> list1 = [item for item in i]
> list2 = [item for item in i]
> 
> am I guaranteed that list1 == list2?  It will be for all the
> collections I can think of in the standard library,

For stdlib *collections*, yes, but if you're just talking generic
iterators, then it can become exhausted in the first:

  with open('example.txt') as f:
    list1 = [item for item in f]
    list2 = [item for item in f]
    assert list1 == list2, "Not equal"

The OP would have to track the meta-information regarding whether the
iterable was sorted.

At least for dicts, order is guaranteed by the specs as long as the
container isn't modified between iterations[1], but I don't see any
similar claim for sets.

You can always test the thing:

  def foo(iterable):
    if isinstance(iterable, (set, frozenset)):
      iterable = sorted(iterable)
    for thing in iterable:
      do_stuff(thing)

but nothing prevents that from being called with an unsorted list.

That said, sorting in the stdlib is pretty speedy on pre-sorted lists,
so I'd just start by sorting whatever it is that you have, unless
you're positive it's already sorted.

-tkc

[1]
https://docs.python.org/2/library/stdtypes.html#dict.items

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Why `divmod(float('inf'), 1) == (float('nan'), float('nan'))`

2014-09-18 Thread Ian Kelly

On Thu, Sep 18, 2014 at 10:35 AM,   wrote:
> It's be nice if Python's math module did more than simply wrap the default i 
> implementation of the underlying C lib -- it's gotten better over the years 
> (Inf and NaN used to be really hard to get), but still not quite what it 
> could be.

I think there's not a whole lot that can be done due to backward
compatibility issues. Python 3 did make some progress (e.g. math.floor
now returns an int instead of a float).
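
For reference, a small sketch of the behaviour being discussed, assuming
CPython 3. The helper floor_or_none is a made-up name for illustration only.

```python
import math

# Python 3: math.floor returns an int for finite floats.
floored = math.floor(2.7)

# divmod with an infinite dividend: CPython returns a pair of NaNs.
quotient, remainder = divmod(float('inf'), 1)


def floor_or_none(x):
    """Hypothetical helper: floor x, or None where no int result exists."""
    try:
        return math.floor(x)
    except (OverflowError, ValueError):
        # Infinity raises OverflowError, NaN raises ValueError.
        return None
```

Because the Python 3 result is an int, math.floor has no way to hand back
infinity, which is why it raises rather than returning inf.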
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Steven D'Aprano

Larry Martell wrote:

> I am on a mac running 10.8.5, python 2.7
> 
> Suddenly, many of my scripts started failing with:
> 
> ValueError: unsupported hash type sha1
[...]
> This just started happening yesterday, and I cannot think of anything
> that I've done that could cause this.

Ah, the ol' "I didn't change anything, I swear!" excuse *wink*

But seriously... did you perhaps upgrade Python prior to yesterday? Or
possibly an automatic update ran?

Check the creation/last modified dates on:

/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py

but I expect that's probably not where the problem lies. My *wild guess* is
that your system updated SSL, and removed some underlying SHA-1 library
needed by hashlib. SHA-1 is pretty old, and there is now a known attack on
it, so some over-zealous security update may have removed it.

If that's the case, it really is over-zealous, for although SHA-1 is
deprecated, the threat is still some years away. Microsoft, Google and
Mozilla have all announced that they will continue accepting it until 2017.
I can't imagine why Apple would remove it so soon.

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 3:07 AM, Steven D'Aprano
 wrote:
> but I expect that's probably not where the problem lies. My *wild guess* is
> that your system updated SSL, and removed some underlying SHA-1 library
> needed by hashlib. SHA-1 is pretty old, and there is now a known attack on
> it, so some over-zealous security update may have removed it.

Or, more likely, the actual code for sha1 is imported from somewhere
else, and *that* module is what's been shadowed. What happens if you
change directory to something with absolutely no .py files in it, then
start interactive Python and try importing hashlib? Maybe you have an
openssl.py or something.
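
One way to check for shadowing is to ask the import system where it would load
a module from. This is a minimal sketch that assumes Python 3.4 or later
(importlib.util.find_spec does not exist in the 2.7 under discussion, where
inspecting hashlib.__file__ after import gives similar information).
module_origin is a made-up helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


hashlib_path = module_origin("hashlib")
```

If the reported path points into the current directory or a project folder
rather than the standard library, a local file is shadowing the real module.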

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

program to generate data helpful in finding duplicate large files

2014-09-18 Thread David Alban

greetings,

i'm a long time perl programmer who is learning python.  i'd be interested
in any comments you might have on my code below.  feel free to respond
privately if you prefer.  i'd like to know if i'm on the right track.  the
program works, and does what i want it to do.  is there a different way a
seasoned python programmer would have done things?  i would like to learn
the culture as well as the language.  am i missing anything?  i know i'm
not doing error checking below.  i suppose comments would help, too.

i wanted a program to scan a tree and for each regular file, print a line
of text to stdout with information about the file.  this will be data for
another program i want to write which finds sets of duplicate files larger
than a parameter size.  that is, using output from this program, the sets
of files i want to find are on the same filesystem on the same host
(obviously, but i include hostname in the data to be sure), and must have
the same md5 sum, but different inode numbers.

the output of the code below is easier for a human to read when paged
through 'less', which on my mac renders the ascii nuls as "^@" in reverse
video.

thanks,
david


usage: dupscan [-h] [--start-directory START_DIRECTORY]

scan files in a tree and print a line of information about each regular file

optional arguments:
  -h, --help            show this help message and exit
  --start-directory START_DIRECTORY, -d START_DIRECTORY
                        specifies the root of the filesystem tree to be
                        processed




#!/usr/bin/python

import argparse
import hashlib
import os
import re
import socket
import sys

from stat import *

ascii_nul = chr(0)

# from: http://stackoverflow.com/questions/1131220/get-md5-hash-of-big-files-in-python
# except that i use hexdigest() rather than digest()
def md5_for_file(f, block_size=2**20):
  md5 = hashlib.md5()
  while True:
    data = f.read(block_size)
    if not data:
      break
    md5.update(data)
  return md5.hexdigest()

thishost = socket.gethostname()

parser = argparse.ArgumentParser(description='scan files in a tree and print a line of information about each regular file')
parser.add_argument('--start-directory', '-d', default='.',
  help='specifies the root of the filesystem tree to be processed')
args = parser.parse_args()

start_directory = re.sub( '/+$', '', args.start_directory )

for directory_path, directory_names, file_names in os.walk( start_directory ):
  for file_name in file_names:
    file_path = "%s/%s" % ( directory_path, file_name )

    lstat_info = os.lstat( file_path )

    mode = lstat_info.st_mode

    if not S_ISREG( mode ) or S_ISLNK( mode ):
      continue

    f = open( file_path, 'r' )
    md5sum = md5_for_file( f )

    dev   = lstat_info.st_dev
    ino   = lstat_info.st_ino
    nlink = lstat_info.st_nlink
    size  = lstat_info.st_size

    sep = ascii_nul

    print "%s%c%s%c%d%c%d%c%d%c%d%c%s" % ( thishost, sep, md5sum, sep,
      dev, sep, ino, sep, nlink, sep, size, sep, file_path )

exit( 0 )



-- 
Our decisions are the most important things in our lives.
***
Live in a world of your own, but always welcome visitors.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: program to generate data helpful in finding duplicate large files

2014-09-18 Thread Chris Kaynor

On Thu, Sep 18, 2014 at 11:11 AM, David Alban  wrote:

> #!/usr/bin/python
>
> import argparse
> import hashlib
> import os
> import re
> import socket
> import sys
>
> from stat import *
>

Generally, "from ... import *" imports are discouraged as they tend to clutter
your namespace and can accidentally override already-imported
functions/variables. It's more Pythonic to use the other import forms
(or "import ... as") and reference names through the module, as you are doing
everywhere else. The main case where "from ... import *" is recommended is API
imports (for example, importing the API of one module into another, such as
for inter-platform, inter-version, or accelerator support).

>
> ascii_nul = chr(0)
>
> # from: http://stackoverflow.com/questions/1131220/get-md5-hash-of-big-files-in-python
> # except that i use hexdigest() rather than digest()
> def md5_for_file(f, block_size=2**20):
>   md5 = hashlib.md5()
>   while True:
>     data = f.read(block_size)
>     if not data:
>       break
>     md5.update(data)
>   return md5.hexdigest()
>
> thishost = socket.gethostname()
>
> parser = argparse.ArgumentParser(description='scan files in a tree and print a line of information about each regular file')
> parser.add_argument('--start-directory', '-d', default='.',
>   help='specifies the root of the filesystem tree to be processed')
> args = parser.parse_args()
>
> start_directory = re.sub( '/+$', '', args.start_directory )
>

I'm not sure this is actually needed. It's also not platform-independent, as
some platforms (e.g., Windows) primarily use "\" instead.

>
> for directory_path, directory_names, file_names in os.walk( start_directory ):
>   for file_name in file_names:
>     file_path = "%s/%s" % ( directory_path, file_name )
>

os.path.join would be more cross-platform than the string formatting.
Basically, this line would become

file_path = os.path.join(directory_path, file_name)

os.path.join will also ensure that, regardless of the inputs, the paths
will only be joined by a single slash.
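
A short sketch of the os.path.join behaviour, with made-up directory and file
names. One caveat worth knowing: join avoids doubling a trailing separator on
the left-hand part, but it does not collapse separators that are already
repeated inside an input; os.path.normpath does that.

```python
import os

directory_path = os.path.join("photos", "2014")
file_path = os.path.join(directory_path, "img.jpg")

# A trailing separator on the first part is not doubled.
joined = os.path.join("photos" + os.sep, "img.jpg")

# Separators already repeated in the input are kept as-is by join;
# normpath collapses them.
normalized = os.path.normpath("photos" + os.sep * 3 + "img.jpg")
```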

>     lstat_info = os.lstat( file_path )
>
>     mode = lstat_info.st_mode
>
>     if not S_ISREG( mode ) or S_ISLNK( mode ):
>       continue
>
>     f = open( file_path, 'r' )
>     md5sum = md5_for_file( f )
>

The Pythonic thing to do here would be to use a "with" statement to ensure
the file is closed in a timely manner. This requires Python 2.6 or newer
(2.5 works as well with a future directive).
This would require the above two lines to become:

with open( file_path, 'r' ) as f:
    md5sum = md5_for_file( f )

I do note that you never explicitly close the files (which is done via the
with statement in my example). While generally fine as CPython will close
them automatically when no longer referenced, it's not a good practice to
get into. Other versions of Python may have delays before the file is
closed, which could then result in errors if processing a huge number of
files. The with statement will ensure the file is closed immediately after
the md5 computation finishes, even if there is an error computing the md5.
Note that in any case, the OS should automatically close the file when the
process exits, but this is likely even worse than relying on Python to
close them for you.

Additionally, you may want to specify binary mode by using open(file_path,
'rb') to ensure platform-independence (on Windows, plain 'r' opens the file
in text mode, which converts "\r\n" to "\n" while reading and so changes
the data being hashed). Additionally, some platforms will treat binary files
differently.

You may also want to put some additional error handling in here. For
example, the file could be deleted between the "walk" call and the "open"
call, the file may not be readable (locked by other processes, incorrect
permissions, etc). Without knowing your use case, you may need to deal with
those cases, or maybe having the script fail out with an error message is
good enough.
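
As a rough sketch of what that error handling could look like, assuming that
skipping unreadable files is acceptable for the use case. safe_md5 is a
made-up name, and returning None on failure is just one possible policy.

```python
import hashlib


def safe_md5(path, block_size=2**20):
    """Return the hex MD5 of the file at path, or None if it cannot be read."""
    md5 = hashlib.md5()
    try:
        # Binary mode, so the bytes hashed are the bytes on disk.
        with open(path, 'rb') as f:
            while True:
                data = f.read(block_size)
                if not data:
                    break
                md5.update(data)
    except (IOError, OSError):
        # File vanished, permission denied, locked by another process, etc.
        return None
    return md5.hexdigest()
```

The caller can then skip any path for which None comes back, or log it to
stderr, rather than having the whole scan die on one bad file.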

>     dev   = lstat_info.st_dev
>     ino   = lstat_info.st_ino
>     nlink = lstat_info.st_nlink
>     size  = lstat_info.st_size
>
>     sep = ascii_nul
>
>     print "%s%c%s%c%d%c%d%c%d%c%d%c%s" % ( thishost, sep, md5sum, sep,
>       dev, sep, ino, sep, nlink, sep, size, sep, file_path )
>

You could use sep.join([thishost, md5sum, dev, ino, nlink, size, file_path])
rather than a string format here, provided all the input values are
strings (you can call the str function on the non-string values to convert
them, which will do the same as the "%s" formatter).
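
Spelled out, the join approach might look like this. The field values below
are placeholders standing in for the script's variables, not real output.

```python
sep = chr(0)

# Placeholder values standing in for the script's variables.
thishost = "myhost"
md5sum = "d41d8cd98f00b204e9800998ecf8427e"
file_path = "./a.txt"
dev, ino, nlink, size = 2049, 131, 1, 0

fields = [thishost, md5sum, dev, ino, nlink, size, file_path]

# str.join takes a single iterable of strings, so convert each field first.
line = sep.join(str(field) for field in fields)
```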

I don't know how much control you have over the output format (you said you
intend to pipe this output into other code), but if you can change it, I
would suggest either using a pure binary format, using a more
human-readable separator than chr(0), or at least providing an argument to
the script to set the separator (I believe Linux has a -0 argument for many
of i

Re: program to generate data helpful in finding duplicate large files

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 4:11 AM, David Alban  wrote:
> i'm a long time perl programmer who is learning python.  i'd be interested
> in any comments you might have on my code below.  feel free to respond
> privately if you prefer.  i'd like to know if i'm on the right track.

Sure! Happy to help out. But first, a comment about your English, as
shown above: It's conventional to capitalize in certain places, and it
does make your prose more readable. Just as PEP 8 does for Python
code, tidy spelling and linguistic conventions make it easier for
other "English programmers" (wordsmiths?) to follow what you're doing.

I've trimmed out any parts of your code that I don't have comments on.

> ascii_nul = chr(0)

You use this in exactly one place, to initialize sep. I'd either set
sep up here, or use ascii_nul down below (probably the former). But if
you're going to have a named constant, the Python convention is to
upper-case it: ASCII_NUL = chr(0).

> def md5_for_file(f, block_size=2**20):
>   md5 = hashlib.md5()
>   while True:
>     data = f.read(block_size)
>     if not data:
>       break
>     md5.update(data)
>   return md5.hexdigest()

This is used by opening a file and then passing the open file to this
function. Recommendation: Pass a file name, and have this function
open and close the file itself. Then it'll also be a candidate for the
obvious tidyup of using the file as a context manager - the 'with'
statement guarantees that the file will be promptly closed.
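
A minimal sketch of that recommendation. md5_for_path is a made-up name, and
opening in binary mode is an assumption on top of what is suggested here.

```python
import hashlib


def md5_for_path(path, block_size=2**20):
    """Open the named file, hash it in blocks, and close it promptly."""
    md5 = hashlib.md5()
    with open(path, 'rb') as f:
        # iter(callable, sentinel) keeps calling f.read until it returns b''.
        for data in iter(lambda: f.read(block_size), b''):
            md5.update(data)
    return md5.hexdigest()
```

The caller then never holds an open file object at all, so there is nothing
left to forget to close.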

> thishost = socket.gethostname()

(Be aware that this won't always give you a useful value. You may want
to provide a default.)

> start_directory = re.sub( '/+$', '', args.start_directory )

Hello, Perl programmer :) When there's text to manipulate, you first
reach for a regular expression. What's this looking to do, exactly?
Trim off any trailing slashes? Python has a clearer way to write that:

start_directory = args.start_directory.rstrip("/")

Lots of text manipulation in Python is done with methods on the string
itself. Have an explore - there's all sorts of powerful stuff there.
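
Side by side, with a made-up example path. Note one edge case that applies to
both spellings: a start directory of "/" is stripped down to an empty string,
so a fallback may be wanted.

```python
import re

path = "/var/tmp///"

with_regex = re.sub('/+$', '', path)
with_method = path.rstrip("/")

# Edge case: the root directory becomes '' with either approach,
# so fall back to "/" when nothing is left.
root = "/".rstrip("/") or "/"
```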

> for directory_path, directory_names, file_names in os.walk( start_directory
> ):

Personally, I believe in Huffman coding my names. For locals that
exist for the sole purpose of loop iteration (particularly the third
one, which you use in one place), I'd use shorter names:

for path, dirs, files in os.walk(start_directory):
    for fn in files:

> lstat_info = os.lstat( file_path )
> dev   = lstat_info.st_dev
> ino   = lstat_info.st_ino
> nlink = lstat_info.st_nlink
> size  = lstat_info.st_size
>
> sep = ascii_nul
>
> print "%s%c%s%c%d%c%d%c%d%c%d%c%s" % ( thishost, sep, md5sum, sep, dev,
> sep, ino, sep, nlink, sep, size, sep, file_path )

Python 3 turns print into a function, which is able to do this rather
more cleanly. You can call on the new behaviour in a Python 2 program
by putting this immediately under your shebang:

from __future__ import print_function

And then you can write the print call like this:

print(thishost, md5sum, dev, ino, nlink, size, file_path, sep=chr(0))

Or, eliminating all the assignment to locals:

st = os.lstat( file_path )
print(thishost, md5sum, st.st_dev, st.st_ino, st.st_nlink,
      st.st_size, file_path, sep=chr(0))

That's shorter than your original line, *and* it doesn't have all the
assignments :)
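
A self-contained sketch of the print-function approach, assuming Python 3 (so
no __future__ import is needed). emit is a made-up helper name, the field
values are placeholders, and writing to an in-memory buffer is only there to
make the result easy to inspect.

```python
import io
import sys


def emit(fields, out=sys.stdout, sep=chr(0)):
    """Print the fields on one line, separated by sep."""
    print(*fields, sep=sep, file=out)


buffer = io.StringIO()
emit(["myhost", "abc123", 2049, 131, 1, 0, "./a.txt"], out=buffer)
record = buffer.getvalue()
```

print converts each argument with str() itself, which is why the integer
fields need no explicit formatting.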

> exit( 0 )

Unnecessary - if you omit this, you'll exit 0 implicitly at the end of
the script.

Standard rule of programming: There's always another way to do it.
Standard rule of asking for advice: There's always someone who will
advocate another way of doing it. It's up to you to decide which
advice is worth following, which is worth taking note of but not
actually following, and which is to be thrown out as rubbish :)

All the best!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: program to generate data helpful in finding duplicate large files

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 4:45 AM, Chris Kaynor  wrote:
>> from stat import *
>
>
> Generally, from import * imports are discouraged as they tend to populate
> your namespace and have issues with accidentally overriding imported
> functions/variables. Generally, its more Pythonic to use the other imports
> (or import as) and reference with the namespace, as you are doing everywhere
> else. The main case where from import * is recommended is API imports (for
> example, importing the API of one module into another, such as for
> inter-platform, inter-version, or accelerator support).
>

I was going to say the same thing, except that this module
specifically is documented as recommending that. I still don't like
"import *", but either this is a special case, or the docs need to be
changed.
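
For comparison, the namespaced spelling is not much longer. This is only a
sketch, and is_regular_file is a made-up helper name. Since os.lstat does not
follow symlinks, a symlink's mode is never "regular", so the extra S_ISLNK
test in the original condition is not needed.

```python
import os
import stat


def is_regular_file(path):
    """True for regular files only; symlinks and directories are False."""
    mode = os.lstat(path).st_mode
    return stat.S_ISREG(mode)
```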

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Best practice for opening files for newbies?

2014-09-18 Thread chris . barker

On Thursday, September 18, 2014 9:38:00 AM UTC-7, Chris Angelico wrote:
> On Fri, Sep 19, 2014 at 2:19 AM,   wrote:
> > So: there are way too many ways to open a simple file to read or write a 
> > bit of text (or binary):
> > open()
> 
> Personally, I'd just use this, all the way through - and not importing
> from io, either. But others may disagree.

Well, the catch there is that it's seriously tricky to work with 
non-ASCII-compatible text files if you do that...

> Be clear about what's text and what's bytes, everywhere. When you do
> make the jump to Py3, you'll have to worry about text files vs binary
> files, and if you need to support Windows as well as Unix, you need to
> get that right anyway, so just make sure you get the two straight.

yup -- I've always emphasized that point, but from a py2 perspective (and with 
the built-in open() file object), what is a utf-8 encoded file? text or bytes? 
It's bytes -- and you need to do the decoding yourself. Why make people do 
that? 

In the past, I started with open(), ignored unicode for a while, then when I 
introduced unicode, I pointed them to codecs.open() (I hadn't discovered 
io.open yet). Maybe I should stick with this approach, but it feels like a bad 
idea.

> Save yourself a lot of trouble later on by keeping 
> the difference very clear.

exactly -- but it's equally clear, and easier and more robust, to have two types 
of files: binary and text, where text requires a known encoding. Rather than 
three types: binary, ascii text, and encoded text, which is really binary that 
you can then decode to make text.

Think of something as simple and common as looping through the lines in a file!
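
A minimal sketch of that line loop with io.open, which behaves the same on
Python 2.6+ and Python 3. read_lines is a made-up name, and UTF-8 as the
default encoding is an assumption for the example.

```python
import io


def read_lines(path, encoding='utf-8'):
    """Return the decoded lines of a text file, without trailing newlines."""
    with io.open(path, encoding=encoding) as f:
        # Each line arrives already decoded to unicode text.
        return [line.rstrip(u'\n') for line in f]
```

With the plain py2 open(), the same loop would hand back undecoded byte
strings, and every caller would have to remember to decode them.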

> And you can save yourself some more
> conversion trouble by tossing this at the top of every .py file you
> write:
> 
> from __future__ import print_function, division, unicode_literals

yup -- I've been thinking of recommending that to my students as well -- 
particularly unicode_literals

> But mainly, just go with the simple open() call and do the job the 
> easiest way you can. And go Py3 as soon as you can, because ...

A discussion for another thread

Thanks,
-Chris

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: the python shell window is already executing a command

2014-09-18 Thread Terry Reedy


On 9/18/2014 11:24 AM, Seymore4Head wrote:
> On Wed, 17 Sep 2014 23:50:56 -0400, Terry Reedy
> wrote:

>> My question was "How do you start Idle?"
>> (It can make a difference.)
>
> The way I start IDLE is to go to my programs folder and right click on
> file.py in the directory and select "edit with IDLE".

A couple more questions: after you run the file once, is there a warning 
above the first >>> prompt?  If, after the program stops, you see a 
second >>> prompt and run

  >>> import sys; len(sys.modules), 'array' in sys.modules

what is the result?

If you run the program multiple times and get the error message, please 
cut and paste the whole message and the lines above, up to 10 or 15.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Larry Martell

On Thu, Sep 18, 2014 at 10:47 AM, John Gordon  wrote:
> In  Larry Martell 
>  writes:
>
>> Googling this showed that it's an issue with hashlib with a common
>> cause being a file called hashlib.py that gets in the way of the
>> interpreter finding the standard hashlib module, but that doesn't seem
>> to be the case:
>
> Perhaps hashlib imports some other module which has a local module of the
> same name?

It's failing on the 'import _sha' in hashlib.py:

 66     def __get_builtin_constructor(name):
 67         try:
 68             if name in ('SHA1', 'sha1'):
 69  ->             import _sha
 70                 return _sha.new

(Pdb) s
ImportError: 'No module named _sha'



>
> SHA1 has been deprecated for some time.  Maybe a recent OS update finally
> got rid of it altogether?

I did not do an OS, or any other upgrade or install.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Larry Martell

On Thu, Sep 18, 2014 at 11:18 AM, Chris Angelico  wrote:
> On Fri, Sep 19, 2014 at 3:07 AM, Steven D'Aprano
>  wrote:
>> but I expect that's probably not where the problem lies. My *wild guess* is
>> that your system updated SSL, and removed some underlying SHA-1 library
>> needed by hashlib. SHA-1 is pretty old, and there is now a known attack on
>> it, so some over-zealous security update may have removed it.
>
> Or, more likely, the actual code for sha1 is imported from somewhere
> else, and *that* module is what's been shadowed. What happens if you
> change directory to something with absolutely no .py files in it, then
> start interactive Python and try importing hashlib? Maybe you have an
> openssl.py or something.

I still get the same error.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Larry Martell

On Thu, Sep 18, 2014 at 11:07 AM, Steven D'Aprano
 wrote:
> Larry Martell wrote:
>
>> I am on a mac running 10.8.5, python 2.7
>>
>> Suddenly, many of my scripts started failing with:
>>
>> ValueError: unsupported hash type sha1
> [...]
>> This just started happening yesterday, and I cannot think of anything
>> that I've done that could cause this.
>
> Ah, the ol' "I didn't change anything, I swear!" excuse *wink*
>
> But seriously... did you perhaps upgrade Python prior to yesterday? Or
> possibly an automatic update ran?

No, I did not upgrade or install anything.

> Check the creation/last modified dates on:
>
> /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py

That was in my original post:

$ ls -l 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py
-rw-r--r--  1 root  wheel  5013 Apr 12  2013
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py


> but I expect that's probably not where the problem lies. My *wild guess* is
> that your system updated SSL, and removed some underlying SHA-1 library
> needed by hashlib. SHA-1 is pretty old, and there is now a known attack on
> it, so some over-zealous security update may have removed it.
>
> If that's the case, it really is over-zealous, for although SHA-1 is
> deprecated, the threat is still some years away. Microsoft, Google and
> Mozilla have all announced that they will continue accepting it until 2017.
> I can't imagine why Apple would removed it so soon.


Do you know how I could check and see if I have SHA-1 and when my SSL
was updated?
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Larry Martell

On Thu, Sep 18, 2014 at 1:22 PM, Larry Martell  wrote:
> On Thu, Sep 18, 2014 at 11:07 AM, Steven D'Aprano
>  wrote:
>> Larry Martell wrote:
>>
>>> I am on a mac running 10.8.5, python 2.7
>>>
>>> Suddenly, many of my scripts started failing with:
>>>
>>> ValueError: unsupported hash type sha1
>> [...]
>>> This just started happening yesterday, and I cannot think of anything
>>> that I've done that could cause this.
>>
>> Ah, the ol' "I didn't change anything, I swear!" excuse *wink*
>>
>> But seriously... did you perhaps upgrade Python prior to yesterday? Or
>> possibly an automatic update ran?
>
> No, I did not upgrade or install anything.
>
>> Check the creation/last modified dates on:
>>
>> /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py
>
> That was in my original post:
>
> $ ls -l 
> /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py
> -rw-r--r--  1 root  wheel  5013 Apr 12  2013
> /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py
>
>
>> but I expect that's probably not where the problem lies. My *wild guess* is
>> that your system updated SSL, and removed some underlying SHA-1 library
>> needed by hashlib. SHA-1 is pretty old, and there is now a known attack on
>> it, so some over-zealous security update may have removed it.
>>
>> If that's the case, it really is over-zealous, for although SHA-1 is
>> deprecated, the threat is still some years away. Microsoft, Google and
>> Mozilla have all announced that they will continue accepting it until 2017.
>> I can't imagine why Apple would removed it so soon.
>
>
> So you know how I could check and see if I have SHA-1 and when my SSL
> was updated?

Nothing appears to have been recently changed:

$ ls -la 
/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/OpenSSL
total 224
drwxr-xr-x  12 root  wheel 408 Jun 20  2012 .
drwxr-xr-x  41 root  wheel1394 Apr 13  2013 ..
-rwxr-xr-x   1 root  wheel  124736 Apr 12  2013 SSL.so
-rw-r--r--   1 root  wheel 965 Apr 12  2013 __init__.py
-rw-r--r--   1 root  wheel 991 Apr 12  2013 __init__.pyc
-rwxr-xr-x   1 root  wheel  168544 Apr 12  2013 crypto.so
-rwxr-xr-x   1 root  wheel   40864 Apr 12  2013 rand.so
drwxr-xr-x  12 root  wheel 408 Jun 20  2012 test
-rw-r--r--   1 root  wheel1010 Apr 12  2013 tsafe.py
-rw-r--r--   1 root  wheel1775 Apr 12  2013 tsafe.pyc
-rw-r--r--   1 root  wheel 176 Apr 12  2013 version.py
-rw-r--r--   1 root  wheel 293 Apr 12  2013 version.pyc
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: the python shell window is already executing a command

2014-09-18 Thread Seymore4Head

On Thu, 18 Sep 2014 15:05:53 -0400, Terry Reedy 
wrote:

>On 9/18/2014 11:24 AM, Seymore4Head wrote:
>> On Wed, 17 Sep 2014 23:50:56 -0400, Terry Reedy 
>> wrote:
>
>>> My question was "How do you start Idle?"
>>> (I can make a difference.)
>>
>> The way I start IDLE is to go to my programs folder and right click on
>> file.py in the directory and select "edit with IDLE".
>
>A couple more questions; after you run the file once, is there a warning 
>above the first >>> prompt?  If, after the program stop and you see a 
>second >>> prompt and run
> >>> import sys; len(sys.modules), 'array' in sys.modules
>what is the result?
>
>If you run the program multiple times and get the error message, please 
>cut and paste the whole message and the lines above, up to 10 or 15.

I think it might be that I was trying to re run the program too soon.
I haven't messed with any programming too much, but I did re run the
program I posted and gave it more time to finish.  That seems to have
been the problem.

If I run into the problem again after making sure the program has
finished, I will update.

Thanks
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: program to generate data helpful in finding duplicate large files

2014-09-18 Thread Peter Otten

David Alban wrote:

> *sep = ascii_nul*
> 
> *print "%s%c%s%c%d%c%d%c%d%c%d%c%s" % ( thishost, sep, md5sum, sep,
> dev, sep, ino, sep, nlink, sep, size, sep, file_path )*

file_path may contain newlines, therefore you should probably use "\0" to 
separate the records. The other fields may not contain whitespace, so it's 
safe to use " " as the field separator. When you deserialize the record you 
can prevent the file_path from being broken by providing maxsplit to the 
str.split() method:

for record in infile.read().split("\0"):
print(record.split(" ", 6))

Splitting into records without reading the whole data into memory left as an 
exercise ;)
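
One possible take on that exercise, as a sketch only. iter_records and
chunk_size are made-up names, and infile is assumed to be a text-mode file
object.

```python
def iter_records(infile, sep="\0", chunk_size=2**16):
    """Yield sep-terminated records from infile one at a time."""
    buffer = ""
    while True:
        chunk = infile.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        parts = buffer.split(sep)
        # The last part may be an incomplete record; keep it for next time.
        buffer = parts.pop()
        for record in parts:
            yield record
    if buffer:
        # Data after the final separator, if the input did not end with one.
        yield buffer
```

Each record can then be split with record.split(" ", 6) as above, so a
file_path containing spaces or newlines stays in one piece. Memory use is
bounded by chunk_size plus the longest single record.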

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread John Gordon

In  Larry Martell 
 writes:

> It's failing on the 'import _sha' in hashlib.py:

>  66   def __get_builtin_constructor(name):
>  67try:
>  68  if name in ('SHA1', 'sha1'):
>  69   ->import _sha
>  70  return _sha.new

> (Pdb) s
> ImportError: 'No module named _sha'

This appears to differ from the error you originally reported:

>   File 
> "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py",
> line 91, in __get_builtin_constructor
> raise ValueError('unsupported hash type %s' % name)
> ValueError: unsupported hash type sha1

Could there be two different versions of hashlib.py on your system?

-- 
John Gordon Imagine what it must be like for a real medical doctor to
[email protected] 'House', or a real serial killer to watch 'Dexter'.

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Ned Deily

Larry Martell wrote:
> On Thu, Sep 18, 2014 at 1:22 PM, Larry Martell  
> wrote:
> > On Thu, Sep 18, 2014 at 11:07 AM, Steven D'Aprano
> >  wrote:
> >> Larry Martell wrote:
> >>> I am on a mac running 10.8.5, python 2.7
> >>> Suddenly, many of my scripts started failing with:
> >>>
> >>> ValueError: unsupported hash type sha1
> >> [...]
> >>> This just started happening yesterday, and I cannot think of anything
> >>> that I've done that could cause this.
[...]
> > So you know how I could check and see if I have SHA-1 and when my SSL
> > was updated?

IIRC, the _sha extension module is only built for Python 2.7 if the 
necessary OpenSSL libraries (libssl and libcrypto) are not available 
when Python is built.  They are available on OS X so, normally, you 
won't see an _sha.so with Pythons there.  hashlib.py first tries to 
import _hashlib.so, checks that it was built with the corresponding 
OpenSSL API, and then calls it.  On OS X many Python builds, including 
the Apple system Pythons and the python.org Pythons, are dynamically 
linked to the system OpenSSL libs in /usr/lib.  From your original post, 
I'm assuming you are using the Apple-supplied system Python 2.7 on OS X 
10.8.5.  If so, you should see something like this:

$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.8.5
BuildVersion:   12F45
$ /usr/bin/python2.7
Python 2.7.2 (default, Oct 11 2012, 20:14:37)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import _hashlib
>>> dir(_hashlib)
['__doc__', '__file__', '__name__', '__package__', 'new', 'openssl_md5',
'openssl_sha1', 'openssl_sha224', 'openssl_sha256', 'openssl_sha384',
'openssl_sha512']
>>> _hashlib.__file__
'/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_hashlib.so'
>>> ^D
$ otool -L '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_hashlib.so'
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_hashlib.so:
    /usr/lib/libssl.0.9.8.dylib (compatibility version 0.9.8, current version 47.0.0)
    /usr/lib/libcrypto.0.9.8.dylib (compatibility version 0.9.8, current version 47.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
$ ls -l /usr/lib/libssl.0.9.8.dylib
-rwxr-xr-x  1 root  wheel  620848 Sep 18 13:13 /usr/lib/libssl.0.9.8.dylib
$ ls -l /usr/lib/libcrypto.0.9.8.dylib
-rwxr-xr-x  1 root  wheel  2712368 Sep 18 13:13 /usr/lib/libcrypto.0.9.8.dylib

Note that this was taken *after* installing the latest 10.8.5 Security 
Update for 10.8 (Security Update 2014-004, 
http://support.apple.com/kb/ht6443) which was just released today; that 
includes an updated OpenSSL.  But, I tried this today just before 
installing the update and it worked the same way, with older 
modification dates.  The python.org Python 2.7.x should look very 
similar but with /Library/Frameworks paths instead of 
/System/Library/Frameworks.  Other Pythons (e.g. MacPorts or Homebrew) 
may be using their own copies of OpenSSL libraries.

-- 
 Ned Deily,
 [email protected]

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Christian Heimes

On 18.09.2014 21:23, Larry Martell wrote:
> On Thu, Sep 18, 2014 at 11:18 AM, Chris Angelico  wrote:
>> On Fri, Sep 19, 2014 at 3:07 AM, Steven D'Aprano
>>  wrote:
>>> but I expect that's probably not where the problem lies. My *wild guess* is
>>> that your system updated SSL, and removed some underlying SHA-1 library
>>> needed by hashlib. SHA-1 is pretty old, and there is now a known attack on
>>> it, so some over-zealous security update may have removed it.
>>
>> Or, more likely, the actual code for sha1 is imported from somewhere
>> else, and *that* module is what's been shadowed. What happens if you
>> change directory to something with absolutely no .py files in it, then
>> start interactive Python and try importing hashlib? Maybe you have an
>> openssl.py or something.
> 
> I still get the same error.

Python's implementation of SHA-1 comes either from _hashlib (which
wraps OpenSSL) or from _sha (which uses code from LibTomCrypt and
doesn't require external dependencies). Python 2.7 doesn't have a _sha
module if OpenSSL is available at compile time.

Please try to import _hashlib and see what happens. On Linux:

>>> import _hashlib
>>> _hashlib.__file__
'/usr/lib/python2.7/lib-dynload/_hashlib.x86_64-linux-gnu.so'
>>> _hashlib.openssl_sha1()

>>> _hashlib.openssl_sha1().hexdigest()
'da39a3ee5e6b4b0d3255bfef95601890afd80709'
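
The same probe can be rolled into one small function. This is only a sketch: probe_sha1 is a made-up name, and the list of candidate modules (_hashlib, _sha, _sha1) is an assumption about what a given build might ship, not a statement about any particular Python.

```python
import hashlib
import importlib


def probe_sha1():
    """Report which low-level hash modules import and whether SHA-1 works."""
    report = {}
    # Candidate extension modules; which ones exist depends on the build.
    for name in ("_hashlib", "_sha", "_sha1"):
        try:
            mod = importlib.import_module(name)
            report[name] = getattr(mod, "__file__", "built-in")
        except ImportError as exc:
            report[name] = "missing (%s)" % exc
    # Hash the empty string through the public API.
    try:
        report["sha1"] = hashlib.sha1(b"").hexdigest()
    except ValueError as exc:
        report["sha1"] = "failed (%s)" % exc
    return report


if __name__ == "__main__":
    for key, value in sorted(probe_sha1().items()):
        print("%-10s %s" % (key, value))
```

On a healthy install the "sha1" entry should be the digest of the empty string shown above.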
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Larry Martell

On Thu, Sep 18, 2014 at 2:21 PM, John Gordon  wrote:
> In  Larry Martell 
>  writes:
>
>> It's failing on the 'import _sha' in hashlib.py:
>
>>  66   def __get_builtin_constructor(name):
>>  67try:
>>  68  if name in ('SHA1', 'sha1'):
>>  69   ->import _sha
>>  70  return _sha.new
>
>> (Pdb) s
>> ImportError: 'No module named _sha'
>
> This appears to differ from the error you originally reported:
>
>>   File 
>> "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/hashlib.py",
>> line 91, in __get_builtin_constructor
>> raise ValueError('unsupported hash type %s' % name)
>> ValueError: unsupported hash type sha1

It's the lower level error that triggers the initial error I reported.
The ImportError is caught and the ValueError is reported.
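
A stripped-down illustration of that pattern follows. It is not the real hashlib source, only a sketch of how a caught ImportError can surface as a ValueError; get_builtin_constructor is a stand-in name.

```python
def get_builtin_constructor(name):
    """Simplified stand-in for the fallback lookup quoted above."""
    try:
        if name in ("SHA1", "sha1"):
            import _sha  # not present when the build relies on OpenSSL
            return _sha.new
    except ImportError:
        # The low-level failure is swallowed here ...
        pass
    # ... and the caller only ever sees this higher-level error.
    raise ValueError("unsupported hash type %s" % name)
```

So stepping through in pdb shows the ImportError, while the traceback printed to the user shows only the ValueError.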

> Could there be two different versions of hashlib.py on your system?

No, I checked and there is only the ones for the various python
versions. And none that were recently installed or modified. And you
can see the full path reported by python is the expected one.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Larry Martell

On Thu, Sep 18, 2014 at 2:44 PM, Ned Deily  wrote:
> In article
> ,
>  Larry Martell  wrote:
>> On Thu, Sep 18, 2014 at 1:22 PM, Larry Martell 
>> wrote:
>> > On Thu, Sep 18, 2014 at 11:07 AM, Steven D'Aprano
>> >  wrote:
>> >> Larry Martell wrote:
>> >>> I am on a mac running 10.8.5, python 2.7
>> >>> Suddenly, many of my scripts started failing with:
>> >>>
>> >>> ValueError: unsupported hash type sha1
>> >> [...]
>> >>> This just started happening yesterday, and I cannot think of anything
>> >>> that I've done that could cause this.
> [...]
>> > So you know how I could check and see if I have SHA-1 and when my SSL
>> > was updated?
>
> IIRC, the _sha1 extension module is only built for Python 2.7 if the
> necessary OpenSSL libraries (libssl and libcrypto) are not available
> when Python is built.  They are available on OS X so, normally, you
> won't see an _sha1.so with Pythons there.  hashlib.py first tries to
> import _hashlib.so and check that if it was built with the corresponding
> OpenSSL API and then calls it.  On OS X many Python builds, including
> the Apple system Pythons and the python.org Pythons, are dynamically
> linked to the system OpenSSL libs in /usr/lib.  From your original post,
> I'm assuming you are using the Apple-supplied system Python 2.7 on OS X
> 10.8.5.

Yes, I am using the Apple-supplied system Python 2.7 on OS X 10.8.5.

> If so, you should see something like this:
>
> $ sw_vers
> ProductName:   Mac OS X
> ProductVersion:   10.8.5
> BuildVersion:  12F45
> $ /usr/bin/python2.7
> Python 2.7.2 (default, Oct 11 2012, 20:14:37)
> [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on
> darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import _hashlib
> >>> dir(_hashlib)
> ['__doc__', '__file__', '__name__', '__package__', 'new', 'openssl_md5',
> 'openssl_sha1', 'openssl_sha224', 'openssl_sha256', 'openssl_sha384',
> 'openssl_sha512']
> >>> _hashlib.__file__
> '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/l
> ib-dynload/_hashlib.so'
> >>> ^D
> $ otool -L
> '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/l
> ib-dynload/_hashlib.so'
> /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/li
> b-dynload/_hashlib.so:
>    /usr/lib/libssl.0.9.8.dylib (compatibility version 0.9.8, current
> version 47.0.0)
>    /usr/lib/libcrypto.0.9.8.dylib (compatibility version 0.9.8, current
> version 47.0.0)
>    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current
> version 169.3.0)
> $ ls -l /usr/lib/libssl.0.9.8.dylib
> -rwxr-xr-x  1 root  wheel  620848 Sep 18 13:13
> /usr/lib/libssl.0.9.8.dylib
> $ ls -l /usr/lib/libcrypto.0.9.8.dylib
> -rwxr-xr-x  1 root  wheel  2712368 Sep 18 13:13
> /usr/lib/libcrypto.0.9.8.dylib

I get identical output, with the exception of the mod dates on those 2 files:

$ ls -l /usr/lib/libssl.0.9.8.dylib
-rwxr-xr-x  1 root  wheel  620768 Sep 19  2013 /usr/lib/libssl.0.9.8.dylib
$ ls -l /usr/lib/libcrypto.0.9.8.dylib
-rwxr-xr-x  1 root  wheel  2724720 Sep 19  2013 /usr/lib/libcrypto.0.9.8.dylib

> Note that this was taken *after* installing the latest 10.8.5 Security
> Update for 10.8 (Security Update 2014-004,
> http://support.apple.com/kb/ht6443) which was just released today; that
> includes an updated OpenSSL.

Do you think I should install this update? Perhaps that would restore
whatever is missing.

> But, I tried this today just before
> installing the update and it worked the same way, with older
> modification dates.  The python.org Python 2.7.x should look very
> similar but with /Library/Frameworks paths instead of
> /System/Library/Frameworks.  Other Pythons (e.g. MacPorts or Homebrew)
> may be using their own copies of OpenSSL libraries.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Larry Martell

On Thu, Sep 18, 2014 at 2:49 PM, Christian Heimes  wrote:
> On 18.09.2014 21:23, Larry Martell wrote:
>> On Thu, Sep 18, 2014 at 11:18 AM, Chris Angelico  wrote:
>>> On Fri, Sep 19, 2014 at 3:07 AM, Steven D'Aprano
>>>  wrote:
>>>> but I expect that's probably not where the problem lies. My *wild guess* is
>>>> that your system updated SSL, and removed some underlying SHA-1 library
>>>> needed by hashlib. SHA-1 is pretty old, and there is now a known attack on
>>>> it, so some over-zealous security update may have removed it.
>>>
>>> Or, more likely, the actual code for sha1 is imported from somewhere
>>> else, and *that* module is what's been shadowed. What happens if you
>>> change directory to something with absolutely no .py files in it, then
>>> start interactive Python and try importing hashlib? Maybe you have an
>>> openssl.py or something.
>>
>> I still get the same error.
>
> Python's implementation of SHA-1 comes either from _hashlib (which
> wraps OpenSSL) or from _sha (which uses code from LibTomCrypt and
> doesn't require external dependencies). Python 2.7 doesn't have a _sha
> module if OpenSSL is available at compile time.
>
> Please try to import _hashlib and see what happens. On Linux:
>
> >>> import _hashlib
> >>> _hashlib.__file__
> '/usr/lib/python2.7/lib-dynload/_hashlib.x86_64-linux-gnu.so'
> >>> _hashlib.openssl_sha1()
> 
> >>> _hashlib.openssl_sha1().hexdigest()
> 'da39a3ee5e6b4b0d3255bfef95601890afd80709'


$ python
Python 2.7.2 (default, Oct 11 2012, 20:14:37)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import _hashlib
>>> _hashlib.__file__
'/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_hashlib.so'
>>> _hashlib.openssl_sha1()
Traceback (most recent call last):
  File "", line 1, in 
ValueError: unsupported hash type
>>> _hashlib.openssl_sha1().hexdigest()
Traceback (most recent call last):
  File "", line 1, in 
ValueError: unsupported hash type
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Christian Heimes

On 18.09.2014 23:39, Larry Martell wrote:
> $ python
> Python 2.7.2 (default, Oct 11 2012, 20:14:37)
> [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import _hashlib
> >>> _hashlib.__file__
> '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_hashlib.so'
> >>> _hashlib.openssl_sha1()
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError: unsupported hash type
> >>> _hashlib.openssl_sha1().hexdigest()
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError: unsupported hash type
> 

For unknown reasons your OpenSSL version doesn't support SHA-1. Please
try these two commands on the command line to check version and digest
support of your OpenSSL:

  $ echo -n '' | openssl dgst -sha1 -hex
  (stdin)= da39a3ee5e6b4b0d3255bfef95601890afd80709

  $ openssl version
  OpenSSL 1.0.1f 6 Jan 2014


Please also check which OpenSSL libcrypto is used by the _hashlib.so
shared library. On OS X, otool -L should give output similar to that of
ldd on Linux:

  $ otool -L
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_hashlib.so
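
A rough equivalent of the two command-line checks can also be run from inside Python. This is a sketch under one assumption worth stating: ssl.OPENSSL_VERSION reports the library the ssl module was linked against, which is not necessarily the openssl binary found on the PATH. The name check_openssl_sha1 is made up for the example.

```python
import hashlib
import ssl

# SHA-1 digest of the empty string, as printed by `openssl dgst -sha1` above.
EMPTY_SHA1 = "da39a3ee5e6b4b0d3255bfef95601890afd80709"


def check_openssl_sha1():
    """Return (OpenSSL version string, True if SHA-1 of '' is correct)."""
    version = ssl.OPENSSL_VERSION
    try:
        ok = hashlib.new("sha1", b"").hexdigest() == EMPTY_SHA1
    except ValueError:
        # Raised when no backend offers the requested digest.
        ok = False
    return version, ok


if __name__ == "__main__":
    print(check_openssl_sha1())
```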

Christian

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Ned Deily

In article 
,
 Larry Martell  wrote:
> Do you think I should install this update? Perhaps that would restore
> whatever is missing.

Yes. You should install the update in any case and it's unlikely to make 
the hashlib situation worse :=)

-- 
 Ned Deily,
 [email protected]

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: program to generate data helpful in finding duplicate large files

2014-09-18 Thread Gregory Ewing


Chris Angelico wrote:
> On Fri, Sep 19, 2014 at 4:45 AM, Chris Kaynor  wrote:
>
>> from stat import *
>
> I was going to say the same thing, except that this module
> specifically is documented as recommending that. I still don't like
> "import *", but either this is a special case, or the docs need to be
> changed.


I think it's something of a special case. The main issue with
import * is that it makes it hard for someone reading the code
to tell where names are coming from.

However, all the names in the stat module are prefixed with
S_ or ST_ and are well-known stat-related names from the unix
C library, so there is less room for confusion in this case.
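
For comparison, here is a small sketch of the same kind of code written with a plain "import stat", so each name shows where it comes from. The function name describe is hypothetical.

```python
import os
import stat


def describe(path):
    """Return (kind, size in bytes, hard-link count) for a path."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        kind = "dir"
    elif stat.S_ISREG(st.st_mode):
        kind = "file"
    else:
        kind = "other"
    # The ST_* constants are indexes into the tuple form of the stat result.
    return kind, st[stat.ST_SIZE], st[stat.ST_NLINK]
```

The prefixed names read much the same either way, which is the point about this module being a mild special case.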

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Roy Smith

In article ,
 Chris Angelico  wrote:

> The one thing you can rely on (and therefore must comply with, when
> you design an iterable) is that iteration will hit every element
> exactly once. 

Does it actually say that somewhere?  For example:

for i in bag.pick_randomly_with_replacement(n=5):
   print i

shouldn't do that.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 9:52 AM, Roy Smith  wrote:
> In article ,
>  Chris Angelico  wrote:
>
>> The one thing you can rely on (and therefore must comply with, when
>> you design an iterable) is that iteration will hit every element
>> exactly once.
>
> Does it actually say that somewhere?  For example:
>
> for i in bag.pick_randomly_with_replacement(n=5):
>    print i
>
> shouldn't do that.

When you pick randomly from a population, you create a new population,
which may have duplicates compared to the original. (For efficiency's
sake it probably won't all actually exist immediately, but
conceptually it does exist.) That's what you're iterating over - not
the bag itself.
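
A sketch of that distinction, with a made-up Bag class (the name and the method come from Roy's example, the implementation is only an assumption):

```python
import random


class Bag(object):
    """Toy container used only to illustrate the point above."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        # Iterating the bag itself visits every element exactly once.
        return iter(self._items)

    def pick_randomly_with_replacement(self, n):
        # This yields a *new* population of n draws; duplicates are
        # possible and some elements may never appear.
        for _ in range(n):
            yield random.choice(self._items)


bag = Bag("abc")
once_each = sorted(bag)
draws = list(bag.pick_randomly_with_replacement(n=5))
```

Here once_each always holds the three elements, while draws holds five values taken from them.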

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: the python shell window is already executing a command

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 5:05 AM, Terry Reedy  wrote:
> A couple more questions; after you run the file once, is there a warning
> above the first >>> prompt?  If, after the program stop and you see a second
> >>> prompt and run
> >>> import sys; len(sys.modules), 'array' in sys.modules
> what is the result?

What's significant about the array module here? I'm a little puzzled.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: hashlib suddenly broken

2014-09-18 Thread Steven D'Aprano

Ned Deily wrote:

> In article
> ,
>  Larry Martell  wrote:
>> Do you think I should install this update? Perhaps that would restore
>> whatever is missing.
> 
> Yes. You should install the update in any case and it's unlikely to make
> the hashlib situation worse :=)

However, it is likely to make it impossible to diagnose the problem and stop
it from happening again.

It's not normal behaviour to have functionality just disappear overnight
like this. If Larry is telling the truth that there were no updates
running, *how did the sha-1 library disappear*?

Larry, I recommend that you try Christian's suggestions before upgrading:

  $ echo -n '' | openssl dgst -sha1 -hex
  (stdin)= da39a3ee5e6b4b0d3255bfef95601890afd80709

  $ openssl version
  OpenSSL 1.0.1f 6 Jan 2014

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Steven D'Aprano

cool-RR wrote:

> My function gets an iterable of an unknown type. I want to check whether
> it's ordered. I could check whether it's a `set` or `frozenset`, which
> would cover many cases, but I wonder if I can do better. Is there a nicer
> way to check whether an iterable is ordered or not?

See the collections.abc module:

https://docs.python.org/3/library/collections.abc.html

I think what you want is:

import collections.abc
isinstance(it, collections.abc.Sequence)

Prior to 3.3, you would use:

# Untested.
import collections
isinstance(it, collections.Sequence)
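
A short sketch of what that check accepts and rejects, written so it imports on both old and new versions. Note that Sequence is narrower than "ordered": an OrderedDict, for example, keeps its order but is not a Sequence. The helper name is_sequence is made up.

```python
try:
    from collections.abc import Sequence  # Python 3.3+
except ImportError:
    from collections import Sequence      # older versions


def is_sequence(it):
    """True for list-like, indexable containers; False for sets, dicts, generators."""
    return isinstance(it, Sequence)


results = {
    "list": is_sequence([1, 2, 3]),
    "tuple": is_sequence((1, 2, 3)),
    "str": is_sequence("abc"),
    "set": is_sequence({1, 2, 3}),
    "dict": is_sequence({"a": 1}),
    "generator": is_sequence(x for x in [1, 2, 3]),
}
```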



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Steven D'Aprano

Roy Smith wrote:

> Is there anything which requires an iterator to be deterministic?

Absolutely not.

py> import random
py> def spam():
...     while True:
...         n = random.randint(0, 10)
...         s = ' '.join(['spam']*n)
...         if not s:
...             return
...         yield s + '!'
...
py> for s in spam():
...     print(s)
...
spam spam spam spam spam spam!
spam spam!
spam spam spam spam spam spam spam spam spam!
spam spam spam spam spam spam spam!
py>

> For 
> example, let's say I have an iterable, i, and I do:
> 
> list1 = [item for item in i]
> list2 = [item for item in i]

Don't do that. Just write:

list1 = list(i)
list2 = list(i)

> am I guaranteed that list1 == list2?

No.

However, as far as I am aware, there are no built-ins that will fail that
test, yet. Although the iteration order of dicts and sets is arbitrary, I
think that (at least to date) it will be the same order every time you
iterate over the dict or set within a single run of the Python interpreter.
(Provided the dict or set hasn't changed.)

That's not a language guarantee though. It's an implementation detail. In
principle, it could be different each time:

s = set("abcd")
list(s)
=> returns ['d', 'a', 'b', 'c']
list(s)
=> returns ['c', 'a', 'd', 'b']
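
There is also a fully deterministic way for the two lists to differ, with no randomness involved: if i is a one-shot iterator rather than a container, the first pass consumes it. A small sketch (snapshot_twice is a made-up helper):

```python
def snapshot_twice(iterable):
    """Build two lists from the same iterable, one after the other."""
    list1 = list(iterable)
    list2 = list(iterable)
    return list1, list2


# A list can be iterated as often as you like.
a, b = snapshot_twice([1, 2, 3])

# An iterator is exhausted by the first pass, so the second list is empty.
c, d = snapshot_twice(iter([1, 2, 3]))
```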

> It will be for all the collections 
> I can think of in the standard library, but if I wrote my own class with
> an __iter__() which yielded the items in a non-deterministic order,
> would I be violating something other than the principle of least
> astonishment?

Nope.

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Is there a canonical way to check whether an iterable is ordered?

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 3:15 PM, Steven D'Aprano
 wrote:
> However, as far as I am aware, there are no built-ins that will fail that
> test, yet. Although the iteration order of dicts and sets is arbitrary, I
> think that (at least to date) it will be the same order every time you
> iterate over the dict or set within a single run of the Python interpreter.
> (Provided the dict or set hasn't changed.)
>
> That's not a language guarantee though. It's an implementation detail. In
> principle, it could be different each time:
>
> s = set("abcd")
> list(s)
> => returns ['d', 'a', 'b', 'c']
> list(s)
> => returns ['c', 'a', 'd', 'b']

Possibly for the set, but the dict is guaranteed some measure of stability:

https://docs.python.org/3.4/library/stdtypes.html#dict-views
"""If keys, values and items views are iterated over with no
intervening modifications to the dictionary, the order of items will
directly correspond."""

Also, a little above:
"""
iter(d)

Return an iterator over the keys of the dictionary. This is a shortcut
for iter(d.keys()).
"""

So if iterating over d.keys() and then d.values() with no mutations is
guaranteed to give the same order, then so is iterating over d.keys(),
then d.keys(), then d.values(), and since there's no magic in
iterating over d.values(), it logically follows that iterating over
d.keys() twice will give the same order.

But yes, it's conceivable that the set might change iteration order
arbitrarily. I don't know of any good reason for it to, but it
certainly isn't forbidden.
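
The documented correspondence can be checked directly. The dict contents here are arbitrary sample data:

```python
d = {"spam": 1, "eggs": 2, "ham": 3}

# With no intervening modification, keys and values line up pairwise,
# so zipping them reproduces the items.
pairs = list(zip(d.keys(), d.values()))

# Iterating the dict itself is the same as iterating its keys.
keys_direct = list(d)
keys_view = list(d.keys())
```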

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: program to generate data helpful in finding duplicate large files

2014-09-18 Thread Steven D'Aprano

David Alban wrote:

> *#!/usr/bin/python*
> 
> *import argparse*
> *import hashlib*
> *import os*
> *import re*
> *import socket*
> *import sys*

Um, how did you end up with leading and trailing asterisks? That's going to
stop your code from running.

> *from stat import **

"import *" is slightly discouraged. It's not that it's bad, per se, it's
mostly designed for use at the interactive interpreter, and it can lead to
a few annoyances if you don't know what you are doing. So be careful of
using it when you don't need to.

[...]
> *start_directory = re.sub( '/+$', '', args.start_directory )*

I don't think you need to do that, and you certainly don't need to pull out
the nuclear-powered bulldozer of regular expressions just to crack the
peanut of stripping trailing slashes from a string.

start_directory = args.start_directory.rstrip("/")

ought to do the job.
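
A quick side-by-side on a few sample paths (the paths are made up). One edge case is worth knowing: a bare "/" becomes the empty string under either approach, so a script that may be handed the root directory might want to special-case it.

```python
import re


def strip_trailing_slashes(path):
    """Remove any run of trailing slashes without using a regex."""
    return path.rstrip("/")


samples = ["/tmp/foo///", "/tmp/foo", "relative/dir/", "/"]
comparison = [
    (s, strip_trailing_slashes(s), re.sub("/+$", "", s)) for s in samples
]
```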

[...]
> *f = open( file_path, 'r' )*
> *md5sum = md5_for_file( f )*

You never close the file, which means Python will close it for you, when it
is good and ready. In the case of some Python implementations, that might
not be until the interpreter shuts down, which could mean that you run out
of file handles!

Better is to explicitly close the file:

f = open(file_path, 'r')
md5sum = md5_for_file(f)
f.close()

or if you are using a recent version of Python and don't need to support
Python 2.4 or older:

with open(file_path, 'r') as f:
    md5sum = md5_for_file(f)

(The "with" block automatically closes the file when you exit the indented
block.)
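
For completeness, a sketch of how the pieces might fit together. md5_for_file is defined in David's script, which is not reproduced here, so the version below is only a guess at its shape; md5_of_path is a made-up wrapper. It opens the file in binary mode, which is an assumption on my part but is the safe choice when hashing file contents.

```python
import hashlib


def md5_for_file(f, block_size=1024 * 1024):
    """Hash an open binary file in chunks so large files fit in memory."""
    md5 = hashlib.md5()
    # iter(callable, sentinel) keeps calling f.read until it returns b"".
    for chunk in iter(lambda: f.read(block_size), b""):
        md5.update(chunk)
    return md5.hexdigest()


def md5_of_path(file_path):
    # The with block closes the file even if hashing raises.
    with open(file_path, "rb") as f:
        return md5_for_file(f)
```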

> *sep = ascii_nul*

Seems a strange choice of a delimiter.

> *print "%s%c%s%c%d%c%d%c%d%c%d%c%s" % ( thishost, sep, md5sum, sep,
> dev, sep, ino, sep, nlink, sep, size, sep, file_path )*

Arggh, my brain! *wink*

Try this instead:

s = '\0'.join([thishost, md5sum, dev, ino, nlink, size, file_path])
print s

> *exit( 0 )*

No need to explicitly call sys.exit (just exit won't work) at the end of
your code. If you exit by falling off the end of your program, Python uses
an exit code of zero. Normally, you should only call sys.exit to:

- exit with a non-zero code;

- to exit early.
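
A tiny sketch of both uses at once. The function run and the exit code 2 are made up for illustration:

```python
import sys


def run(paths):
    """Process some paths; bail out early with a non-zero code if none given."""
    if not paths:
        # Early exit with a non-zero status: sys.exit raises SystemExit.
        sys.exit(2)
    return len(paths)
```

When run returns normally and the script simply ends, the exit status is zero without any explicit call.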

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: program to generate data helpful in finding duplicate large files

2014-09-18 Thread Chris Angelico

On Fri, Sep 19, 2014 at 3:45 PM, Steven D'Aprano
 wrote:
> David Alban wrote:
>> *import sys*
>
> Um, how did you end up with leading and trailing asterisks? That's going to
> stop your code from running.

They're not part of the code, they're part of the mangling of the
formatting. So this isn't a code issue, it's a mailing list /
newsgroup one. David, if you set your mail/news client to send plain
text only (not rich text or HTML or formatted or anything like that),
you'll avoid these problems.

>> *sep = ascii_nul*
>
> Seems a strange choice of a delimiter.

But one that he explained in his body :)

>> *print "%s%c%s%c%d%c%d%c%d%c%d%c%s" % ( thishost, sep, md5sum, sep,
>> dev, sep, ino, sep, nlink, sep, size, sep, file_path )*
>
> Arggh, my brain! *wink*
>
> Try this instead:
>
> s = '\0'.join([thishost, md5sum, dev, ino, nlink, size, file_path])
> print s

That won't work on its own; several of the values are integers. So
either they need to be str()'d or something in the output system needs
to know to convert them to strings. I'm inclined to the latter option,
which simply means importing print_function from __future__ and
setting sep=chr(0).
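
The str() route looks roughly like this. It is a sketch with invented sample values, and format_record is a hypothetical helper; the print_function route would instead be print(thishost, md5sum, ..., sep=chr(0)) after the __future__ import.

```python
def format_record(fields, sep="\0"):
    """Join mixed str/int fields with a separator, converting each to str."""
    return sep.join(str(field) for field in fields)


# Sample values only; the real script takes these from os.stat and hashlib.
record = format_record(["myhost", "d41d8cd9", 2049, 131073, 1, 0, "/tmp/x"])
```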

>> *exit( 0 )*
>
> No need to explicitly call sys.exit (just exit won't work) at the end of
> your code.

Hmm, you sure exit won't work? I normally use sys.exit to set return
values (though as you say, it's unnecessary at the end of the
program), but I tested it (Python 2.7.3 on Debian) and it does seem to
be functional. Do you know what provides it?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
