Re: [Python-Dev] Python 2.7 patch levels turning two digit

2014-06-23 Thread Ethan Furman

On 06/23/2014 01:04 PM, Antoine Pitrou wrote:

On 23/06/2014 15:27, M.-A. Lemburg wrote:


Not sure what you mean. We've had binary wininst distributions
for Windows for more than a decade, and egg and msi distributions
for 8 years :-)

But without access to the VS 2008 compiler that is needed to
compile those extensions,


It does seem to be available:
http://www.microsoft.com/en-us/download/details.aspx?id=13276

What am I missing?


Is that VS 2008 /with/ the SP, or just the SP?

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 2.7 patch levels turning two digit

2014-06-23 Thread Ethan Furman

On 06/21/2014 02:48 PM, Ethan Furman wrote:

On 06/21/2014 02:37 PM, M.-A. Lemburg wrote:


My answers to these are: 1. We should use dynamic linking
instead and not let OpenSSL bugs trigger Python releases; 2.
It's not a big problem; 3. Yes, please, since it is difficult
for people to develop and debug their extensions with a
2008 compiler, when the rest of the world has long moved on.


+1  (assuming not incredibly difficult and those that can are willing ;)


Revising this to:

+1, -0, -1

It seems to me the intention of supporting 2.7 for so long was not to give ourselves additional nightmares, but to
provide a basic level of support for those who need more time before migrating.  One of the reasons to migrate
is to avoid future pain (pain is an excellent motivator -- it's why we don't go to the doctor when we're healthy, right?
;)  If getting new or updated modules becomes more painful then that's motivation to upgrade -- not motivation for us to
make both our lives (with the extra work) and everyone else's lives (why isn't this module working? oh, wrong compiler)
more difficult.


--
~Ethan~


Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-24 Thread Ethan Furman

On 06/24/2014 12:54 PM, Ned Deily wrote:


Yes, we are committed to maintaining
Python 2.7 for multiple years but that doesn't mean we have to fix every
open issue or even most open issues.  Any or all of the above costs may
apply to any changes we make.  For many of our users, the best
maintenance policy for Python 2.7 would be the least change possible.


+1

We need to keep 2.7 running, but we don't need to kill ourselves doing it.  If a bug has been there for a while, the 
affected users are probably working around it by now.  ;)


--
~Ethan~


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-26 Thread Ethan Furman

On 06/26/2014 04:36 PM, Tim Delaney wrote:

On 27 June 2014 09:28, MRAB wrote:


Personally, I'd prefer the name 'iterdir' because it emphasises that
it's an iterator.


Exactly what I was going to post (with the added note that there's an obvious
symmetry with listdir).

+1 for iterdir rather than scandir

Other than that:

+1 for adding [it] to the stdlib


+1 for all of above

--
~Ethan~


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-29 Thread Ethan Furman

On 06/29/2014 05:28 AM, Nick Coghlan wrote:


So, here's my alternative proposal: add an "ensure_lstat" flag to
scandir() itself, and don't have *any* methods on DirEntry, only
attributes.

That would make the DirEntry attributes:

 is_dir: boolean, always populated
 is_file: boolean, always populated
 is_symlink: boolean, always populated
 lstat_result: stat result, may be None on POSIX systems if
ensure_lstat is False

(I'm not particularly sold on "lstat_result" as the name, but "lstat"
reads as a verb to me, so doesn't sound right as an attribute name)


+1

--
~Ethan~


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-29 Thread Ethan Furman

On 06/29/2014 04:12 AM, Jonas Wielicki wrote:


If the flag is set to False, all the fields in the DirEntry will be
None, for consistency, even on Windows.


-1

This consistency is unnecessary.

--
~Ethan~


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-30 Thread Ethan Furman

On 06/30/2014 03:07 PM, Tim Delaney wrote:

On 1 July 2014 03:05, Ben Hoyt wrote:


So, here's my alternative proposal: add an "ensure_lstat" flag to
scandir() itself, and don't have *any* methods on DirEntry, only
attributes.
...
Most importantly, *regardless of platform*, the cached stat result (if
not None) would reflect the state of the entry at the time the
directory was scanned, rather than at some arbitrary later point in
time when lstat() was first called on the DirEntry object.


I'm torn between whether I'd prefer the stat fields to be populated
on Windows if ensure_lstat=False or not. There are good arguments each
 way, but overall I'm inclining towards having it consistent with POSIX
- don't populate them unless ensure_lstat=True.

+0 for stat fields to be None on all platforms unless ensure_lstat=True.


If a Windows user just needs the free info, why should s/he have to pay the price of a full stat call?  I see no reason 
to hold the Windows side back and not take advantage of what it has available.  There are plenty of posix calls that 
Windows is not able to use, after all.


--
~Ethan~


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-30 Thread Ethan Furman

On 06/30/2014 04:15 PM, Tim Delaney wrote:

On 1 July 2014 08:38, Ethan Furman wrote:

On 06/30/2014 03:07 PM, Tim Delaney wrote:


I'm torn between whether I'd prefer the stat fields to be populated
on Windows if ensure_lstat=False or not. There are good arguments each
way, but overall I'm inclining towards having it consistent with POSIX
- don't populate them unless ensure_lstat=True.

+0 for stat fields to be None on all platforms unless ensure_lstat=True.


If a Windows user just needs the free info, why should s/he have to pay
the price of a full stat call?  I see no reason to hold the Windows side
 back and not take advantage of what it has available.  There are plenty
of posix calls that Windows is not able to use, after all.


On Windows ensure_lstat would be either a NOP (if the fields are
always populated), or it simply determines if the fields get populated.
No extra stat call.


I suppose the exact behavior is still under discussion, as there are only two or three fields one gets "for free" on
Windows (I think...), whereas an os.stat call would get everything available for the platform.




On POSIX it's the difference between an extra stat call or not.


Agreed on this part.

Still, no reason to slow down the Windows side by throwing away info 
unnecessarily -- that's why this PEP exists, after all.

--
~Ethan~


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-30 Thread Ethan Furman

On 06/30/2014 06:28 PM, Ben Hoyt wrote:

I suppose the exact behavior is still under discussion, as there are only
two or three fields one gets "for free" on Windows (I think...), whereas an
os.stat call would get everything available for the platform.


No, Windows is nice enough to give you all the same stat_result fields
during scandir (via FindFirstFile/FindNextFile) as a regular
os.stat().


Very nice.  Even less reason then to throw it away.  :)

--
~Ethan~


Re: [Python-Dev] My summary of the scandir (PEP 471)

2014-07-01 Thread Ethan Furman

On 07/01/2014 07:59 AM, Jonas Wielicki wrote:


I had the idea to treat a failing lstat() inside scandir() as if the
entry wasn’t found at all, but in this context, this seems wrong too.


Well, os.walk supports passing in an error handler -- perhaps scandir should as 
well.
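
For reference, a minimal sketch of the os.walk error-handler model being pointed at.  The onerror parameter of os.walk is real; the handler name record_and_skip and the helper walk_names are only illustrative.

```python
import os

errors = []

def record_and_skip(exc):
    # os.walk calls the handler with the OSError raised while listing a directory.
    errors.append(exc)

def walk_names(top):
    """Collect file names under top, recording listing errors instead of raising."""
    names = []
    for dirpath, dirnames, filenames in os.walk(top, onerror=record_and_skip):
        names.extend(filenames)
    return names
```

Without onerror, os.walk silently ignores such listing errors; with it, the caller decides whether to record, skip, or re-raise.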

--
~Ethan~


Re: [Python-Dev] My summary of the scandir (PEP 471)

2014-07-01 Thread Ethan Furman

On 07/01/2014 02:20 PM, Paul Moore wrote:


Please, let's stick to a low-level wrapper round the OS API for the
first iteration of this feature. Enhancements can be added later, when
real-world usage has proved their value.


+1


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 04:22 AM, Andreas Maier wrote:


Where is the discrepancy between the documentation of == and its default 
implementation on object documented?


There seems to be no discrepancy (at least, you have not shown it), but to answer the question about why the default 
equals operation is an identity test:


  - all objects should be equal to themselves (there is only one that isn't, 
and it's weird)

  - equality tests should not, as a general rule, raise exceptions -- they 
should return True or False
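
A small sketch of the two points above, assuming the "weird" one is the float NaN (my reading; the class name Thing is made up for illustration):

```python
class Thing:
    """No __eq__ defined, so the default comparison inherited from object is used."""

a = Thing()
b = Thing()

same_object_equal = (a == a)        # True: an object is equal to itself
distinct_objects_equal = (a == b)   # False: no exception, just "not equal"

nan = float("nan")
nan_equals_itself = (nan == nan)    # False: NaN is not equal to itself
```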

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 09:56 AM, Andreas Maier wrote:

On 07.07.2014 17:55, Ethan Furman wrote:

On 07/07/2014 04:22 AM, Andreas Maier wrote:


Where is the discrepancy between the documentation of == and its
default implementation on object documented?


There seems to be no discrepancy (at least, you have not shown it),


The documentation states consistently that == tests the equality of the value 
of an object. The default implementation
of == in both 2.x and 3.x tests the object identity. Is that not a discrepancy?


One could say that the value of an object is the object itself.  Since different objects are different, then they are 
not equal.



but to answer the question about why the default equals operation is an
identity test:

   - all objects should be equal to themselves (there is only one that
isn't, and it's weird)


I agree. But that is not a reason to conclude that different objects (as per 
their identity) should be unequal. Which is
what the default implementation does.


Python cannot know which values are important in an equality test, and which 
are not.  So it refuses to guess.

Think of a chess board, for example.  Are any two black pawns equal?  All 16 pawns came from the same Pawn class, the 
only differences would be in the color and position, but the movement type is the same for all.


So equality for a pawn might mean the same color, or it might mean color and position, or it might mean can move to the
same position... it's up to the programmer to decide which of the possibilities is the correct one.  Quite frankly, having
equality mean identity in this case also makes a lot of sense.
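
To make the pawn example concrete, here is a rough sketch.  The class names and the choice of "same color" as the meaning of equality are only assumptions for illustration; any of the other meanings would be just as easy to write.

```python
class PlainPawn:
    """No __eq__: two pawns are equal only if they are the same object."""
    def __init__(self, color, position):
        self.color = color
        self.position = position

class ColorPawn(PlainPawn):
    """The programmer decides: pawns are equal when their colors match."""
    def __eq__(self, other):
        if not isinstance(other, ColorPawn):
            return NotImplemented
        return self.color == other.color

    def __hash__(self):
        # Keep hashing consistent with the color-based equality above.
        return hash(self.color)

plain_equal = PlainPawn("black", "a7") == PlainPawn("black", "a7")   # False
color_equal = ColorPawn("black", "a7") == ColorPawn("black", "b7")   # True
```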




   - equality tests should not, as a general rule, raise exceptions --
they should return True or False


Why not? Ordering tests also raise exceptions if ordering is not implemented.


Besides the pawn example, this is probably a matter of practicality over purity -- equality tests are used extensively
throughout Python, and having exceptions raised at possibly any moment would be neither a fun nor a productive environment.


Ordering is much less frequent, and since we already tried always ordering things, falling back to type name if 
necessary, we have discovered that that is not a good trade-off.  So now if one tries to order things without specifying 
how it should be done, one gets an exception.
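
A quick sketch of the difference in Python 3 (Widget is just a placeholder class with no comparison methods):

```python
class Widget:
    """Defines neither __eq__ nor __lt__."""

first, second = Widget(), Widget()

# Equality falls back to identity and quietly answers False.
equal = (first == second)

# Ordering has no default, so Python 3 raises instead of guessing.
try:
    first < second
except TypeError as exc:
    ordering_error = exc
else:
    ordering_error = None
```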


--
~Ethan~


Re: [Python-Dev] Tracker Stats

2014-07-07 Thread Ethan Furman

On 07/07/2014 12:01 PM, francis wrote:

On 06/23/2014 10:12 PM, R. David Murray wrote:


The stats graphs are based on the data generated for the
weekly issue report.  I have a patched version of that
report that adds the bug/enhancement info.  I'll try to dig
it up this week; someone ping me if I forget :)  I think
the patch will need to be updated based on Ezio's changes.


ping


pong


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 08:29 AM, Andreas Maier wrote:


So the Python 2.7 implementation shows the same discrepancy as Python 3.x 
regarding the == and != default implementation.


Why do you see this as a discrepancy?

Just because two instances from the same object have the same value does not mean they are equal.  For a real-life 
example, look at twins:  biologically identical, yet not equal.


looking-forward-to-the-rebuttal-mega-thread'ly yrs,
--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 04:36 PM, Andreas Maier wrote:

On 2014-07-07 19:43, Ethan Furman wrote:


Python cannot know which values are important in an equality test, and which 
are not.  So it refuses to guess.


Well, one could argue that using the address of an object for its value 
equality test is pretty close to guessing,
considering that given a sensible definition of value equality, objects of 
different identity can very well be equal but
will always be considered unequal based on the address.


And what would be this 'sensible definition'?



So we have many cases of classes whose designers thought about whether a 
sensible definition of equality was needed, and
decided that an address/identity-based equality definition was just what they 
needed, yet they did not want to or could
not use the "is" operator?


1) The address of the object is irrelevant.  While that is what CPython uses, 
it is not what every Python uses.

2) The 'is' operator is specialized, and should only rarely be needed.  If 
equals is what you mean, use '=='.

3) If Python forced us to write our own __eq__ /for every single class/ what would happen?  Well, I suspect quite a few 
would make their own 'object' to inherit from, and would have the fallback of __eq__ meaning object identity. 
Practicality beats purity.




Can you give me an example for such a class (besides type object)? (I.e. a 
class that does not have __eq__() and
__ne__() but whose instances are compared with == or !=)


I never add __eq__ to my classes until I come upon a place where I need to check if two instances of those classes are 
'equal', for whatever I need equal to mean in that case.



Ordering is much less frequent, and since we already tried always ordering 
things, falling back to type name if
necessary, we have discovered that that is not a good trade-off.  So now if one 
tries to order things without
specifying how it should be done, one gets an exception.


In Python 2, the default ordering implementation on type object uses the 
identity (address) as the basis for ordering.
In Python 3, that was changed to raise an exception. That seems to be in sync 
with what you are saying.

Maybe it would have been possible to also change that for the default equality 
implementation in Python 3. But it was
not changed. As I wrote in another response, we now need to document this 
properly.


Doc patches are gratefully accepted.  :)

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 04:49 PM, Benjamin Peterson wrote:


Probably the best argument for the behavior is that "x is y" should
imply "x == y", which precludes raising an exception. No such invariant
is desired for ordering, so default implementations of < and > are not
provided in Python 3.


Nice.  This bit should definitely make it into the doc patch if not already in 
the docs.

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 05:12 PM, Andreas Maier wrote:

On 2014-07-07 18:09, Ethan Furman wrote:


Just because two instances from the same object have the same value does not 
mean they are equal.  For a real-life
example, look at twins:  biologically identical, yet not equal.


I think they *are* equal in Python if they have the same value, by definition, 
because somewhere the Python docs state
that equality compares the object's values.


And is personality of no value, then?



The reality though is that value is more vague than equality test (as it was 
already pointed out in this thread): A
class designer can directly implement what equality means to the class, but he 
or she cannot implement an accessor
method for the value. The value plays a role only indirectly as part of 
equality and ordering tests.


Not sure what you mean by this.

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 06:58 PM, Steven D'Aprano wrote:

On Mon, Jul 07, 2014 at 04:52:17PM -0700, Ethan Furman wrote:

On 07/07/2014 04:49 PM, Benjamin Peterson wrote:


Probably the best argument for the behavior is that "x is y" should
imply "x == y", which precludes raising an exception. No such invariant
is desired for ordering, so default implementations of < and > are not
provided in Python 3.


Nice.  This bit should definitely make it into the doc patch if not already
in the docs.


However, saying this should not preclude classes where this is not the
case, e.g. IEEE-754 NANs. I would not like this wording (which otherwise
is very nice) to be used in the future to force reflexivity on object
equality.

https://en.wikipedia.org/wiki/Reflexive_relation

To try to cut off arguments:

- Yes, it is fine to have the default implementation of __eq__
   assume reflexivity.

- Yes, it is fine for standard library containers (lists, dicts,
   etc.) to assume reflexivity of their items.

- I'm fully aware that some people think the non-reflexivity of
   NANs is logically nonsensical and a mistake. I do not agree
   with them.

- I'm not looking to change anything here, the current behaviour
   is fine, I just want to ensure that an otherwise admirable doc
   change does not get interpreted in the future in a way that
   prevents classes from defining __eq__ to be non-reflexive.


+1
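
For anyone following along, a short sketch of what "containers assume reflexivity of their items" looks like in practice with a NaN.  This is illustrative only; the variable names are made up.

```python
nan = float("nan")
items = [nan]

# NaN itself is non-reflexive.
nan_is_reflexive = (nan == nan)                      # False

# Lists check identity before equality, so the same NaN object is "found".
found_in_list = (nan in items)                       # True
same_object_lists_equal = (items == [nan])           # True

# Two distinct NaN objects get no identity shortcut.
distinct_nan_lists_equal = ([float("nan")] == [float("nan")])   # False
```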


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 06:18 PM, Andreas Maier wrote:

On 2014-07-08 01:50, Ethan Furman wrote:


I never add __eq__ to my classes until I come upon a place where I need to 
check if two instances of those classes are
'equal', for whatever I need equal to mean in that case.


With that strategy, you would not be hurt if the default implementation raised 
an exception in case the two objects are
not identical. ;-)


Yes, I would.  Not identical means not equal until I say otherwise.  Raising an exception instead of returning False 
(for __eq__) would be horrible.


--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Ethan Furman

On 07/07/2014 08:34 PM, Stephen J. Turnbull wrote:

Ethan Furman writes:


And what would be this 'sensible definition' [of value equality]?


I think that's the wrong question.  I suppose Andreas's point is that
when the programmer doesn't provide a definition, there is no such
thing as a "sensible definition" to default to.  I disagree, but given
that as the point of discussion, asking what the definition is, is moot.


He eventually made that point, but until he did I thought he meant that there was such a sensible default definition, he 
just wasn't sharing what he thought it might be with us.




2) The 'is' operator is specialized, and should only rarely be
   needed.


Nitpick: Except that it's the preferred way to express identity with
singletons, AFAIK.  ("if x is None: ...", not "if x == None: ...".)


Not a nit at all, at least in my code -- the number of times I use '==' far outweighs the number of times I use 'is'. 
Thus, 'is' is rare.


(Now, of course, I'll have to go measure that assertion and probably find out I 
am wrong :/ ).
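
For the singleton case Stephen mentions, a tiny sketch of why 'is' is preferred there: '==' can be hijacked by a class's own __eq__, while 'is' cannot.  AlwaysEqual is a made-up class for illustration.

```python
class AlwaysEqual:
    """A deliberately unhelpful __eq__ that claims equality with everything."""
    def __eq__(self, other):
        return True

value = AlwaysEqual()

looks_like_none = (value == None)   # True: __eq__ decides, and it lies
really_is_none = (value is None)    # False: identity cannot be overridden
```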

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 11:05 AM, Ben Hoyt wrote:

Only exposing what the OS provides for free will make the API too difficult
to use in the common case. But is there a nice way to expand the API that
will allow the user who is trying to avoid extra expense know what
information is already available?

Even if the initial version doesn't have a way to check what information is
there for free, ensuring there is a clean way to add this in the future
would be really nice.


We could easily add ".had_type" and ".had_lstat" properties (not sure
on the names), that would be true if the is_X information and lstat
information was fetched, respectively. Basically both would always be
True on Windows, but on POSIX only had_type would be True if d_type is
present and != DT_UNKNOWN.

I don't feel this is actually necessary, but it's not hard to add.

Thoughts?


Better to just have the attributes be None if they were not fetched.  None is better than hasattr anyway, at least in 
the respect of not having to catch exceptions to function properly.


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 12:34 PM, Ben Hoyt wrote:


Better to just have the attributes be None if they were not fetched.  None
is better than hasattr anyway, at least in the respect of not having to
catch exceptions to function properly.


The thing is, is_dir() and lstat() are not attributes (for a good
reason). Please read the relevant "Rejected ideas" sections and let us
know what you think. :-)


I did better than that -- I read the whole thing!  ;)

-1 on the PEP's implementation.

Just like an attribute does not imply a system call, having a method named 'is_dir' /does/ imply a system call, and not 
having one can be just as misleading.


If we have this:

size = 0
for entry in scandir('/some/path'):
    size += entry.st_size

  - on Windows, this should Just Work (if I have the names correct ;)
  - on Posix, etc., this should fail noisily with either an AttributeError
('entry' has no 'st_size') or a TypeError (cannot add None)

and the solution is equally simple:

for entry in scandir('/some/path', stat=True):

  - if not Windows, perform a stat call at the same time

Now, of course, we might get errors.  I am not a big fan of wrapping everything in try/except, particularly when we 
already have a model to follow -- os.walk:


for entry in scandir('/some/path', stat=True, onerror=record_and_skip):

If we don't care if an error crashes the script, leave off onerror.

If we don't need st_size and friends, leave off stat=True.

If we get better performance on Windows instead of Linux, that's okay.

scandir is going into os because it may not behave the same on every platform.  Heck, even some non-os modules 
(multiprocessing comes to mind) do not behave the same on every platform.


I think caching the attributes for DirEntry is fine, but let's do it as a snapshot of that moment in time, not name now, 
and attributes in 30 minutes when we finally get to you because we had a lot of processing/files ahead of you (you being 
a DirEntry ;) .


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 01:22 PM, Ethan Furman wrote:


I think caching the attributes for DirEntry is fine, but let's do it as a 
snapshot of that moment in time, not name now,
and attributes in 30 minutes when we finally get to you because we had a lot of 
processing/files ahead of you (you being
a DirEntry ;) .


This bit is wrong, I think, since scandir is a generator -- there wouldn't be much time passing between the direntry 
call and the stat call in any case.  Hopefully my other points still hold.


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 06:08 PM, Ben Hoyt wrote:


Just like an attribute does not imply a system call, having a
method named 'is_dir' /does/ imply a system call, and not
having one can be just as misleading.


Why does a method imply a system call? os.path.join() and str.lower()
don't make system calls. Isn't it just a matter of clear
documentation? Anyway -- less philosophical discussion below.


In this case because the names are exactly the same as the os versions which 
/do/ make a system call.



I presume you're suggesting that is_dir/is_file/is_symlink should be
regular attributes, and accessing them should never do a system call.
But what if the system doesn't support d_type (eg: Solaris) or the
d_type value is DT_UNKNOWN (can happen on Linux, OS X, BSD)? The
options are:


So if I'm finally understanding the root problem here:

  - listdir returns a list of strings, one for each filename and one for
each directory, and keeps no other O/S supplied info.

  - os.walk, which uses listdir, then needs to go back to the O/S and
refetch the thrown-away information

  - so it's slow.

The solution:

  - have scandir /not/ throw away the O/S supplied info

and the new problem:

  - not all O/Ses provide the same (or any) extra info about the
directory entries

Have I got that right?
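
To illustrate the summary above, a sketch contrasting the two approaches.  It uses os.scandir as it later shipped in the stdlib (entries with an is_dir() method), not any of the attribute-based variants being proposed in this thread; the helper names are made up.

```python
import os

def subdirs_with_listdir(path):
    """listdir keeps only names, so each entry needs its own stat call."""
    return sorted(
        name for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )

def subdirs_with_scandir(path):
    """scandir keeps the type info the O/S handed over while listing."""
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())
```

Both return the same names; the second usually avoids the per-entry system call when the O/S supplies the entry type for free.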

If so, I still like the attribute idea better (surprise!), we just need to revisit the 'ensure_lstat' (or whatever it's 
called) parameter:  instead of a true/false value, it could have a scale:


  - 0 = whatever the O/S gives us

  - 1 = at least the is_dir/is_file (whatever the other normal one is),
and if the O/S doesn't give it to us for free then call lstat

  - 2 = we want it all -- call lstat if necessary on this platform

After all, the programmer should know up front how much of the extra info will be needed for the work that is trying to 
be done.




We have a choice before us, a fork in the road. :-) We can choose one
of these options for the scandir API:

1) The current PEP 471 approach. This solves the issue with d_type
being missing or DT_UNKNOWN, it doesn't require onerror, and it's a
really tidy API that doesn't explode with AttributeErrors if you write
code on Windows (without thinking too hard) and then move to Linux. I
think all of these points are important -- the cross-platform one not
the least, because we want to make it easy, even *trivial*, for people
to write cross-platform code.


Yes, but we don't want a function that sucks equally on all platforms.  ;)



2) Nick Coghlan's model of only fetching the lstat value if
ensure_lstat=True, and including an onerror callback for error
handling when scandir calls lstat internally. However, as described,
we'd also need an ensure_type=True option, so that scandir() isn't way
slower than listdir() if you actually don't want the is_X values and
d_type is missing/unknown.


With the multi-level version of 'ensure_lstat' we do not need an extra 
'ensure_type'.

For reference, here's what get_tree_size() looks like with this approach, not 
including error handling with onerror:

  def get_tree_size(path):
      total = 0
      for entry in os.scandir(path, ensure_lstat=1):
          if entry.is_dir:
              total += get_tree_size(entry.full_name)
          else:
              total += entry.lstat_result.st_size
      return total

And if we added the onerror here it would be a line fragment, as opposed to the extra four lines (at least) for the 
try/except in the first example (which I cut).



Finally:

Thank you for writing scandir, and this PEP.  Excellent work.

Oh, and +1 for option 2, slightly modified.  :)

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 05:48 AM, Ben Hoyt wrote:


So how about tweaking option #2 a tiny bit more to this:

def scandir(path='.', info=None, onerror=None): ...

* if info is None (the default), only the .name and .full_name
attributes are present
* if info is 'type', scandir ensures the is_dir/is_file/is_symlink
attributes are present and either True or False
* if info is 'lstat', scandir additionally ensures a .lstat is present
and is a full stat_result object
* if info is 'os', scandir returns the attributes the OS provides
(everything on Windows, only is_X -- most of the time -- on POSIX)


I would rather have the default for info be 'os': cross-platform is good, but there is no reason to force it on some 
poor script that is meant to run on a local machine and will never leave it.




* if onerror is not None and errors occur during any internal lstat()
call, onerror(exc) is called with the OSError exception object


As Paul mentioned, 'onerror(exc, DirEntry)' would be better.



Further point -- because the is_dir/is_file/is_symlink attributes are
booleans, it would be very bad for them to be present but None if you
didn't ask for (or the OS didn't return) the type information. Because
then "if entry.is_dir:" would be None and your code would think it
wasn't a directory, when actually you don't know. For this reason, all
attributes should fail with AttributeError if not fetched.


Fair point, and agreed.
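
The hazard can be shown with two stand-in classes. Both NoneEntry and StrictEntry are hypothetical names invented for this sketch, not part of any scandir proposal: the first exposes an unfetched is_dir as None, the second leaves the attribute unset until the value is known.

```python
class NoneEntry:
    """Hypothetical entry whose unfetched type flag is present but None."""
    is_dir = None


class StrictEntry:
    """Hypothetical entry that has no is_dir attribute until it is fetched."""

    def __init__(self, is_dir=None):
        if is_dir is not None:
            self.is_dir = is_dir  # only set once the information is known


def classify(entry):
    """Return 'dir' or 'file' based on the entry's is_dir attribute."""
    return 'dir' if entry.is_dir else 'file'
```

With NoneEntry, classify() quietly reports a file even though the type is unknown; with an unfetched StrictEntry it raises AttributeError, so the mistake surfaces immediately.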

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 06:22 AM, Ben Hoyt wrote:


One issue with option #2 that I just realized -- does scandir yield the entry
at all if there's a stat error? It can't really, because the caller will expect
the .lstat attribute to be set (assuming he asked for type='lstat') but it
won't be. Is effectively removing these entries just because the stat failed a
problem? I kind of think it is. If so, is there a way to solve it with
option #2?


Leave it up to the onerror handler.  If it returns None, skip yielding the 
entry, otherwise yield whatever it returned
-- which also means the error handler should be able to set fields on the 
DirEntry:

  def log_err(exc, entry):
      logger.warn("Cannot stat {}".format(exc.filename))
      entry.lstat.st_size = 0
      return True

  def get_tree_size(path):
      total = 0
      for entry in os.scandir(path, info='lstat', onerror=log_err):
          if entry.is_dir:
              total += get_tree_size(entry.full_name)
          else:
              total += entry.lstat.st_size
      return total

This particular example doesn't benefit much from the addition, but this way we don't have to guess what the programmer 
wants or needs to do in the case of failure.


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 06:41 AM, Ethan Furman wrote:


Leave it up to the onerror handler.  If it returns None, skip yielding the 
entry, otherwise yield whatever it returned
-- which also means the error handler should be able to set fields on the 
DirEntry:

   def log_err(exc, entry):
       logger.warn("Cannot stat {}".format(exc.filename))
       entry.lstat.st_size = 0
       return True


Blah.  Okay, either return the DirEntry (possibly modified), or have the log_err return entry instead of True.  (Now 
where is that caffeine??)


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 08:35 AM, Ben Hoyt wrote:

One issue with option #2 that I just realized -- does scandir yield the
entry at all if there's a stat error? It can't really, because the caller
will expect the .lstat attribute to be set (assuming he asked for
type='lstat') but it won't be. Is effectively removing these entries just
because the stat failed a problem? I kind of think it is. If so, is there
a way to solve it with option #2?



Leave it up to the onerror handler.  If it returns None, skip yielding the
entry, otherwise yield whatever it returned
-- which also means the error handler should be able to set fields on the
DirEntry:

   def log_err(exc, entry):
       logger.warn("Cannot stat {}".format(exc.filename))
       entry.lstat.st_size = 0
       return True


This is an interesting idea, but it's just getting more and more
complex, and I'm guessing that being able to change the attributes of
DirEntry will make the C implementation more complex.

Also, I'm not sure it's very workable. For log_err above, you'd
actually have to do something like this, right?

def log_err(exc, entry):
    logger.warn("Cannot stat {}".format(exc.filename))
    entry.lstat = os.stat_result((0, 0, 0, 0, 0, 0, 0, 0, 0, 0))
    return entry


I would imagine we would provide a helper function:

  def stat_result(st_size=0, st_atime=0, st_mtime=0, ...):
      return os.stat_result((st_size, st_atime, st_mtime, ...))

and then in onerror

  entry.lstat = stat_result()
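
One way such a helper might look, as a rough sketch only: make_stat_result and _FIELDS are names made up here, and the field order follows the ten-item tuple that os.stat_result accepts (mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime). Anything not passed in defaults to zero.

```python
import os

# Order of the first ten fields accepted by os.stat_result.
_FIELDS = ('st_mode', 'st_ino', 'st_dev', 'st_nlink', 'st_uid',
           'st_gid', 'st_size', 'st_atime', 'st_mtime', 'st_ctime')


def make_stat_result(**fields):
    """Build an os.stat_result, defaulting every unspecified field to 0."""
    unknown = set(fields) - set(_FIELDS)
    if unknown:
        raise TypeError('unknown stat fields: %s' % ', '.join(sorted(unknown)))
    return os.stat_result(tuple(fields.get(name, 0) for name in _FIELDS))
```

An error handler could then write something like make_stat_result(st_size=0) instead of spelling out a ten-item tuple by hand.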



Unless there's another simple way around this issue, I'm back to
loving the simplicity of option #1, which avoids this whole question.


Too simple is just as bad as too complex, and properly handling errors is rarely a simple task.  Either we provide a 
clean way to deal with errors in the API, or we force every user everywhere to come up with their own system.


Also, just because we provide it doesn't force people to use it, but if we 
don't provide it then people cannot use it.

To summarize the choice I think we are looking at:

  1) We provide a very basic tool that many will have to write wrappers
 around to get the desired behavior (choice 1)

  2) We provide a more advanced tool that, in many cases, can be used
 as-is, and is also fairly easy to extend to handle odd situations
(choice 2)

More specifically, if we go with choice 1 (no built-in error handling, no mutable DirEntry), how would I implement 
choice 2?  Would I have to write my own CustomDirEntry object?


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 10:10 AM, Paul Moore wrote:

On 9 July 2014 14:22, Ben Hoyt  wrote:

One issue with option #2 that I just realized -- does scandir yield
the entry at all if there's a stat error? It can't really, because the
caller will expect the .lstat attribute to be set (assuming he asked
for type='lstat') but it won't be. Is effectively removing these
entries just because the stat failed a problem? I kind of think it is.
If so, is there a way to solve it with option #2?


So the issue is that you need to do a stat but it failed. You have
"whatever the OS gave you", but can't get anything more. This is only
an issue on POSIX, where the original OS call doesn't give you
everything, so it's fine, those POSIX people can just learn to cope
with their broken OS, right? :-)


LOL



More seriously, why not just return a DirEntry that says it's a file
with a stat entry that's all zeroes? That seems pretty harmless. And
the onerror function will be called, so if it is inappropriate the
application can do something. Maybe it's worth letting onerror return
a boolean that says whether to skip the entry, but that's as far as
I'd bother going.


I could live with this -- we could enhance it in the future fairly easily if we 
needed to.

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 11:04 AM, Paul Moore wrote:

On 9 July 2014 17:35, Ethan Furman  wrote:

More specifically, if we go with choice 1 (no built-in error handling, no
mutable DirEntry), how would I implement choice 2?  Would I have to write my
own CustomDirEntry object?


Having built-in error handling is, I think, a key point. That's where
#1 really falls down.

But a mutable DirEntry and/or letting onerror manipulate the result is
a lot more than just having a hook for being notified of errors. That
seems to me to be a step too far, in the current context.
Specifically, the tree size example doesn't need it.

Do you have a compelling use case that needs a mutable DirEntry? It
feels like YAGNI to me.


Not at this point.  As I indicated in my reply to your response, as long as we have the onerror machinery now we can 
tweak it later if real-world use shows it would be beneficial.


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 12:03 PM, Ben Hoyt wrote:


So here's the ways in which option #2 is now more complicated than option #1:

1) it has an additional "info" argument, the values of which have to
be documented ('os', 'type', 'lstat', and what each one means)
2) it has an additional "onerror" argument, the signature of which and
fairly complicated return value is non-obvious and has to be
documented
3) it requires user modification of the DirEntry object, which needs
documentation, and is potentially hard to implement
4) because the DirEntry object now allows modification, you need a
stat_result() helper function to help you build your own stat values

I'm afraid points 3 and 4 here add way too much complexity.


I'm okay with dropping 3 and 4, and having onerror simply return True to yield the entry, and False/None 
to skip it.  That should make implementation much easier, and documentation not too strenuous either.
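
A rough sketch of that simplified contract, layered on os.scandir() as it later shipped in Python 3.5. The names scandir_onerror and _lstat, the injectable stat parameter (there only so failures can be simulated), and the choice to yield (entry, stat_result) pairs with None for a failed stat are all assumptions of this sketch, not something proposed in the thread.

```python
import os


def _lstat(entry):
    """Default stat function: stat the entry itself, not a link target."""
    return entry.stat(follow_symlinks=False)


def scandir_onerror(path='.', onerror=None, stat=_lstat):
    """Yield (entry, stat_result) pairs, routing stat failures to onerror.

    If onerror(exc, entry) returns a true value the entry is yielded with
    a stat_result of None; a false value (or None) skips the entry.
    """
    for entry in os.scandir(path):
        try:
            result = stat(entry)
        except OSError as exc:
            if onerror is None:
                raise
            if not onerror(exc, entry):
                continue
            result = None
        yield entry, result
```

With onerror left as None the OSError propagates to the caller, so the wrapper stays as strict as plain scandir unless a handler is supplied.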


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 01:57 PM, Paul Moore wrote:

On 9 July 2014 21:24, Victor Stinner wrote:


Example where you may sometimes need is_dir(), but not always
---
for entry in os.scandir(path):
    if ignore_entry(entry.name):
        # this entry is not interesting, lstat_result is useless here
        continue
    if entry.is_dir():  # fetch required data if needed
        continue
    ...


That is an extremely good point, and articulates why I've always been
a bit uncomfortable with the whole ensure_stat idea.


On a system which did not supply is_dir automatically I would write that as:

  # info defaults to 'os', which is basically None in this case
  for entry in os.scandir(path):
      if ignore_entry(entry.name):
          continue
      if os.path.isdir(entry.full_name):
          pass  # do something interesting

Not hard to read or understand, no time wasted in unnecessary lstat calls.

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 01:24 PM, Victor Stinner wrote:


Sorry, I didn't follow the whole discussion. IMO DirEntry must use
methods and you should not expose nor document which infos are already
provided by the OS or not. DirEntry should be a best-effort black-box
object providing an API similar to pathlib.Path. is_dir() may be fast?
fine, but don't say it in the documentation because Python must remain
portable and you should not write code specific to one specific
platform.


Okay, so using that logic we should head over to the os module and remove:

ctermid, getenv, getegid, geteuid, getgid, getgrouplist, getgroups, getpgid, getpgrp, getpriority, PRIO_PROCESS, 
PRIO_PGRP, PRIO_USER, getresuid, getresgid, getuid, initgroups, putenv, setegid, seteuid, setgid, setgroups, 
setpriority, setregid, setresgid, setresuid, setreuid, getsid, setsid, setuid, unsetenv, fchmod, fchown, fdatasync, 
fpathconf, fstatvfs, ftruncate, lockf, F_LOCK, F_TLOCK, F_ULOCK, F_TEST, O_DSYNC, O_RSYNC, O_SYNC, O_NDELAY, O_NONBLOCK, 
O_NOCTTY, O_SHLOCK, O_EXLOCK, O_CLOEXEC, O_BINARY, O_NOINHERIT, O_SHORT_LIVED, O_TEMPORARY, O_RANDOM, O_SEQUENTIAL, 
O_TEXT, ...


Okay, I'm tired of typing, but that list is not even half-way through the os page, and those are all methods or 
attributes that are not available on either Windows or Unix or some flavors of Unix.


Oh, and all those upper-case attributes?  Yup, documented.  And when we don't document it ourselves we often refer 
readers to their system documentation because Python does not, in fact, return exactly the same results on all platforms 
-- particularly when calling into the OS.


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 02:42 PM, Ben Hoyt wrote:


Okay, so using that [no platform specific] logic we should head over to the os 
module and remove:

ctermid, getenv, getegid...

Okay, I'm tired of typing, but that list is not even half-way through the os
page, and those are all methods or attributes that are not available on
either Windows or Unix or some flavors of Unix.


True, but is this really the precedent we want to *aim for*? listdir() is
cross-platform,


and listdir has serious performance issues, which is why you developed scandir.


Oh, and all those [snipped] upper-case attributes?  Yup, documented.  And when
we don't document it ourselves we often refer readers to their system
documentation because Python does not, in fact, return exactly the same
results on all platforms -- particularly when calling into the OS.


But again, why a worse, less cross-platform API when a simple,
cross-platform one is a method call away?


For the same reason we don't adopt code that makes threaded behavior better but 
kills the single-threaded application.

If the programmer would rather have consistency on all platforms than performance on the one being used, 
`info='lstat'` is the option to use.


I like the 'onerror' API better primarily because it gives a single point to deal with the errors.  This has at least a 
couple advantages:


  - less duplication of code: in the tree_size example, the error
    handling appears twice

  - readability: with the error handling in a separate routine, one
    does not have to jump around the try/except blocks looking for
    what happens if there are no errors

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 02:38 PM, Victor Stinner wrote:

2014-07-09 22:44 GMT+02:00 Ethan Furman:

On 07/09/2014 01:24 PM, Victor Stinner wrote:


[...] Python must remain
portable and you should not write code specific to one specific
platform.



Okay, so using that logic we should head over to the os module and remove: (...)


My comment was specific to the PEP 471, design of the DirEntry class.


And my comment was to the point of there being methods/attributes/return values that /do/ vary by platform, and /are/ 
documented as such.  Even stat itself is not the same on Windows as posix.


--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 02:33 PM, Ben Hoyt wrote:


On a system which did not supply is_dir automatically I would write that as:

   for entry in os.scandir(path):
       if ignore_entry(entry.name):
           continue
       if os.path.isdir(entry.full_name):
           pass  # do something interesting

Not hard to read or understand, no time wasted in unnecessary lstat calls.


No, but how do you know whether you're on "a system which did not
supply is_dir automatically"? The above is not cross-platform, or at
least, not efficient cross-platform, which defeats the whole point of
scandir -- the above is no better than listdir().


Hit a directory with 100,000 entries and you'll change your mind.  ;)

Okay, so the issue is you /want/ to write an efficient, cross-platform 
routine...

hrmmm.

thinking

Okay, marry the two ideas together:

  scandir(path, info=None, onerror=None)
  """
  Return a generator that returns one directory entry at a time in a DirEntry object
  info:  None    --> DirEntries will have whatever attributes the O/S provides
         'type'  --> DirEntries will already have at least the file/dir distinction
         'stat'  --> DirEntries will also already have stat information
  """

  DirEntry.is_dir()
      Return True if this is a directory-type entry; may call os.lstat if the cache is empty.

  DirEntry.is_file()
      Return True if this is a file-type entry; may call os.lstat if the cache is empty.

  DirEntry.is_symlink()
      Return True if this is a symbolic link; may call os.lstat if the cache is empty.

  DirEntry.stat
      Return the stat info for this link; may call os.lstat if the cache is empty.


This way both paradigms are supported.
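
The combined design can be mocked up in pure Python. This is only a sketch under assumptions: CachingEntry and scan are hypothetical names, the injectable lstat parameter exists purely so calls can be counted, and because the mock-up builds on os.listdir() it never gets type information for free the way a real directory scan can.

```python
import os
import stat


class CachingEntry:
    """Hypothetical stand-in for DirEntry that caches one lstat() call."""

    def __init__(self, directory, name, lstat=os.lstat):
        self.name = name
        self.full_name = os.path.join(directory, name)
        self._lstat_func = lstat
        self._lstat = None  # filled in on first use

    def lstat(self):
        if self._lstat is None:
            self._lstat = self._lstat_func(self.full_name)
        return self._lstat

    def is_dir(self):
        return stat.S_ISDIR(self.lstat().st_mode)

    def is_file(self):
        return stat.S_ISREG(self.lstat().st_mode)

    def is_symlink(self):
        return stat.S_ISLNK(self.lstat().st_mode)


def scan(path='.', info=None, lstat=os.lstat):
    """Yield CachingEntry objects; info='type' or 'stat' prefetches lstat."""
    for name in os.listdir(path):
        entry = CachingEntry(path, name, lstat)
        if info in ('type', 'stat'):
            entry.lstat()  # pay the cost up front, as the caller asked
        yield entry
```

A caller who skips most entries leaves info as None and pays for lstat only on the entries it inspects; a caller who wants everything up front passes info='stat'.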

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 04:22 PM, MRAB wrote:

On 2014-07-09 23:50, Ethan Furman wrote:


Okay, marry the two ideas together:

scandir(path, info=None, onerror=None)
"""
Return a generator that returns one directory entry at a time in a DirEntry object


Should that be "that yields one directory entry at a time"?


Yes, thanks.


info:  None    --> DirEntries will have whatever attributes the O/S provides
       'type'  --> DirEntries will already have at least the file/dir distinction
       'stat'  --> DirEntries will also already have stat information
"""

DirEntry.is_dir()
    Return True if this is a directory-type entry; may call os.lstat if the cache is empty.

DirEntry.is_file()
    Return True if this is a file-type entry; may call os.lstat if the cache is empty.

DirEntry.is_symlink()
    Return True if this is a symbolic link; may call os.lstat if the cache is empty.

DirEntry.stat
    Return the stat info for this link; may call os.lstat if the cache is empty.


Why is "is_dir", et al, functions, but "stat" not a function?


Good point.  Make stat a function as well.

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-09 Thread Ethan Furman

On 07/09/2014 05:15 PM, Victor Stinner wrote:

2014-07-09 17:29 GMT+02:00 Ben Hoyt :

Would this not "break" the tree size script being discussed in the
other thread, as it would follow links and include linked directories
in the "size" of the tree?


The get_tree_size() function in the PEP would use: "if not
entry.is_symlink() and entry.is_dir():".

Note: First I wrote "if entry.is_dir() and not entry.is_symlink():",
but this syntax is slower on Linux because is_dir() has to call
lstat().


Wouldn't it only have to call lstat if the entry was, in fact, a link?



There are only a few cases where you want to handle symlinks
differently: archive (ex: tar), compute the size of a directory (ex:
du does not follow symlinks by default, du -L follows them), remove a
directory.


I agree with Victor here.  If the entry is a link I would want to know if it was a link to a directory or a link to a 
file.  If I care about not following sym links I can check is_symlink() (or whatever it's called).


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-10 Thread Ethan Furman

On 07/10/2014 06:58 AM, Nick Coghlan wrote:


The info we want for scandir is that of the *link itself*. That makes it
easy to implement things like the "followlinks" flag of os.walk. The
 *far end* of the link isn't relevant at this level.


This also mirrors listdir, correct?  scandir is simply* returning something 
smarter than a string.


The docs just need to be clear that DirEntry objects always match lstat(), 
never stat().


Agreed.

--
~Ethan~

* As well as being a less resource-intensive generator.  :)


Re: [Python-Dev] PEP 3121, 384 Refactoring Issues

2014-07-10 Thread Ethan Furman

On 07/10/2014 04:57 PM, Alexander Belopolsky wrote:

On Thu, Jul 10, 2014 at 2:59 PM, Mark Lawrence wrote:


I'm just curious as to why there are 54 open issues after both of
these PEPs have been accepted and 384 is listed as finished.  Did
 we hit some unforeseen technical problem which stalled development?


I tried to bring some sanity to that effort by opening a "meta issue":

http://bugs.python.org/issue15787

My enthusiasm, however, vanished after I reviewed the refactoring for the 
datetime module:

http://bugs.python.org/issue15390

My main objections are to following PEP 384 (Stable ABI) within stdlib
modules.  I see little benefit for the stdlib (which is shipped fresh with
every new version of Python) from following those guidelines.


If we aren't going to implement the changes (and I agree there's little value for the stdlib to do so), let's mark the 
issues as "won't fix" and close them.


And thanks, Mark, for bringing it up.

--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-10 Thread Ethan Furman

On 07/09/2014 09:02 PM, Nick Coghlan wrote:

On 9 Jul 2014 17:14, "Ethan Furman" wrote:


I like the 'onerror' API better primarily because it gives a single
point to deal with the errors. [...]


The "onerror" approach can also deal with readdir failing, which the
 PEP currently glosses over.


Do we want this, though?  I can see an error handler for individual entries, but if one of the *dir commands fails that 
would seem to be fairly catastrophic.



I'm somewhat inclined towards the current approach in the PEP, but I'd like to 
see an explanation of two aspects:

1. How a scandir variant with an 'onerror' option could be implemented given 
the version in the PEP


Here's a stab at it:

def scandir_error(path, info=None, onerror=None):
    for entry in scandir(path):
        if info == 'type':
            try:
                entry.is_dir()
            except OSError as exc:
                if onerror is None:
                    raise
                if not onerror(exc, entry):
                    continue
        elif info == 'lstat':
            try:
                entry.lstat()
            except OSError as exc:
                if onerror is None:
                    raise
                if not onerror(exc, entry):
                    continue
        yield entry

Here it is again with an attempt to deal with opendir/readdir/closedir 
exceptions:

def scandir_error(path, info=None, onerror=None):
    entries = scandir(path)
    try:
        entry = next(entries)
    except StopIteration:
        # pass it through
        raise
    except Exception as exc:
        if onerror is None:
            raise
        if not onerror(exc, 'what else here?'):
            # what do we do on False?
            # what do we do on True?
            pass
    else:
        for entry in scandir(path):
            if info == 'type':
                try:
                    entry.is_dir()
                except OSError as exc:
                    if onerror is None:
                        raise
                    if not onerror(exc, entry):
                        continue
            elif info == 'lstat':
                try:
                    entry.lstat()
                except OSError as exc:
                    if onerror is None:
                        raise
                    if not onerror(exc, entry):
                        continue
            yield entry



2. How the existing scandir module handles the 'onerror' parameter to its 
directory walking function


Here's the first third of it from the repo:

def walk(top, topdown=True, onerror=None, followlinks=False):
    """Like os.walk(), but faster, as it uses scandir() internally."""
    # Determine which are files and which are directories
    dirs = []
    nondirs = []
    try:
        for entry in scandir(top):
            if entry.is_dir():
                dirs.append(entry)
            else:
                nondirs.append(entry)
    except OSError as error:
        if onerror is not None:
            onerror(error)
        return
    ...
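
The standard library's os.walk() exposes the same hook: its onerror callable receives the OSError raised while listing a directory, and without it such errors are silently ignored. A small sketch, where walk_collecting_errors is a name invented for illustration:

```python
import os


def walk_collecting_errors(top):
    """Walk top with os.walk, returning (visited directories, errors)."""
    errors = []
    visited = [dirpath for dirpath, dirnames, filenames
               in os.walk(top, onerror=errors.append)]
    return visited, errors
```

Pointing this at a directory that does not exist gives an empty list of visited directories and a single OSError in the errors list, rather than an exception.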

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-11 Thread Ethan Furman

On 07/11/2014 07:04 AM, Andreas Maier wrote:

Am 09.07.2014 03:48, schrieb Raymond Hettinger:


Personally, I see no need to make the same mistake by removing
the identity-implies-equality rule from the built-in containers.
There's no need to upset the apple cart for nearly zero benefit.


Containers delegate the equal comparison on the container to their elements; 
they do not apply identity-based comparison
to their elements. At least that is the externally visible behavior.


If that were true, then [NaN] == [NaN] would be False, and it is not.

Here is the externally visible behavior:

Python 3.5.0a0 (default:34881ee3eec5, Jun 16 2014, 11:31:20)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
--> NaN = float('nan')
--> NaN == NaN
False
--> [NaN] == [NaN]
True
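
The difference comes from list comparison checking identity before falling back to ==. The sketch below approximates that rule in pure Python (list_eq is a hypothetical helper, not the actual C implementation) and contrasts one shared NaN object with two distinct ones.

```python
def list_eq(a, b):
    """Approximate list equality: identity is checked before ==."""
    return len(a) == len(b) and all(x is y or x == y for x, y in zip(a, b))


nan = float('nan')
same_object = [nan] == [nan]                          # identity shortcut applies
distinct_objects = [float('nan')] == [float('nan')]   # falls back to ==
```

Here same_object is True while distinct_objects is False, and list_eq gives the same answers for the same inputs.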

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x - list delegation to members?

2014-07-13 Thread Ethan Furman

On 07/13/2014 08:13 AM, Andreas Maier wrote:

Am 11.07.2014 22:54, schrieb Ethan Furman:


Here is the externally visible behavior:

Python 3.5.0a0 (default:34881ee3eec5, Jun 16 2014, 11:31:20)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
--> NaN = float('nan')
--> NaN == NaN
False
--> [NaN] == [NaN]
True


Ouch, that hurts ;-)


Yeah, I've been bitten enough times that now I try to always test code before I 
post.  ;)



Test #8: Same object of class C
(C.__eq__() implemented with equality of x,
 C.__ne__() returning NotImplemented):

   obj1: type=, str=C(256), id=39406504
   obj2: type=, str=C(256), id=39406504

   a) obj1 is obj2: True
C.__eq__(): self=39406504, other=39406504, returning True


This is interesting/weird/odd -- why is __eq__ being called for an 'is' test?

--- test_eq.py 
class TestEqTrue:
    def __eq__(self, other):
        print('Test.__eq__ returning True')
        return True

class TestEqFalse:
    def __eq__(self, other):
        print('Test.__eq__ returning False')
        return False

tet = TestEqTrue()
print(tet is tet)
print(tet in [tet])

tef = TestEqFalse()
print(tef is tef)
print(tef in [tef])
---
---

When I run this all I get is four Trues, never any messages about being in 
__eq__.

How did you get that result?
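
A variation on the script above that counts calls instead of printing makes the containment behavior explicit. CountingEq is a throwaway class for this sketch; its __eq__ always returns False so that any True result can only come from the identity check.

```python
class CountingEq:
    """Records how many times __eq__ is called; never reports equality."""
    calls = 0

    def __eq__(self, other):
        type(self).calls += 1
        return False

    __hash__ = object.__hash__


a = CountingEq()
found_self = a in [a]               # identity match, __eq__ is not needed
calls_after_self = CountingEq.calls
found_other = a in [CountingEq()]   # different object, __eq__ is consulted
calls_after_other = CountingEq.calls
```

The first membership test succeeds without touching __eq__; only the second, against a different object, calls it.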

--
~Ethan~


Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()

2014-07-13 Thread Ethan Furman

On 07/13/2014 05:33 PM, Ben Hoyt wrote:


On the recent python-dev thread, Victor especially made some well
thought out suggestions. It seems to me there's general agreement that
the basic API in PEP 471 is good (with Ethan not a fan at first, but
it seems he's on board after further discussion :-).


I would still like to have 'info' and 'onerror' added to the basic API, but I agree that having methods and caching on 
first lookup is good.




That said, I think there's basically one thing remaining to decide:
whether or not to have DirEntry.is_dir() and .is_file() follow
symlinks by default.


We should have a flag for that, and default it to False:

  scandir(path, *, followlinks=False, info=None, onerror=None)

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x - list delegation to members?

2014-07-13 Thread Ethan Furman

On 07/13/2014 10:33 PM, Andreas Maier wrote:

Am 14.07.2014 04:55, schrieb Ethan Furman:

On 07/13/2014 08:13 AM, Andreas Maier wrote:

Test #8: Same object of class C
(C.__eq__() implemented with equality of x,
 C.__ne__() returning NotImplemented):

   obj1: type=, str=C(256), id=39406504
   obj2: type=, str=C(256), id=39406504

   a) obj1 is obj2: True
C.__eq__(): self=39406504, other=39406504, returning True


This is interesting/weird/odd -- why is __eq__ being called for an 'is'
test?


The debug messages are printed before the result is printed. So this is the 
debug message for the next case, 8.b).


Ah, whew!  That's a relief.


Sorry for not explaining it.


Had I been reading more closely I would (hopefully) have noticed that, but I 
was headed out the door at the time.

--
~Ethan~


[Python-Dev] Python Job Board

2014-07-14 Thread Ethan Furman

has now been dead for five months.

--
~Ethan~


Re: [Python-Dev] Python Job Board

2014-07-14 Thread Ethan Furman

On 07/14/2014 10:43 AM, Skip Montanaro wrote:

On Mon, Jul 14, 2014 at 11:59 AM, Brett Cannon wrote:


This is the wrong place to ask about this. It falls under the purview of the
web site who you can email at webmaster@ or submit an issue at
https://github.com/python/pythondotorg . But I know from PSF status reports
that it's being actively rewritten and fixed to make it manageable for more
than one person to run easily.


Agree with that. I originally skipped this post because I'm pretty
sure MAL (who is heavily involved with the rewrite effort) still hangs
out here. I will modify Brett's admonition a bit though. A better
place to comment about the job board (and perhaps volunteer to help
with the current effort) is j...@python.org.


Mostly just hoping to raise awareness in case anybody here is able/willing to 
pitch in.

--
~Ethan~


Re: [Python-Dev] Python Job Board

2014-07-14 Thread Ethan Furman

On 07/14/2014 06:01 PM, Wes Turner wrote:

From http://www.reddit.com/r/Python/comments/17c69p/i_was_told_by_a_friend_that_learning_python_for/c84bswd :


* http://www.python.org/community/jobs/
* https://jobs.github.com/positions?description=python
* http://careers.joelonsoftware.com/jobs?searchTerm=python
* http://www.linkedin.com/jsearch?keywords=python
* http://www.indeed.com/q-Python-jobs.html
* http://www.simplyhired.com/a/jobs/list/q-python
* http://seeker.dice.com/jobsearch/servlet/JobSearch?op=300&FREE_TEXT=python
* http://careers.stackoverflow.com/jobs/tag/python
* http://www.pythonjobs.com/
* http://www.djangojobs.org/


Nice, thanks!

--
~Ethan~


Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()

2014-07-14 Thread Ethan Furman

On 07/14/2014 07:48 PM, Ben Hoyt wrote:


In any case, here's the modified proposal:

scandir(path='.') -> generator of DirEntry objects, which have:

* name: name as per listdir()
* full_name: full path name (not necessarily absolute), equivalent of
os.path.join(path, entry.name)
* is_dir(follow_symlinks=True): like os.path.isdir(entry.full_name),
but free in most cases; cached per entry
* is_file(follow_symlinks=True): like os.path.isfile(entry.full_name),
but free in most cases; cached per entry
* is_symlink(): like os.path.islink(), but free in most cases; cached per entry
* stat(follow_symlinks=True): like os.stat(entry.full_name,
follow_symlinks=follow_symlinks); cached per entry

The above may not be quite perfect, but it's good, and I think there's
been enough bike-shedding on the API. :-)


Looks doable.  Just make sure the cached entries reflect the 'follow_symlinks' setting -- so a symlink could end up with 
both an lstat cached entry and a stat cached entry.
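
A minimal sketch of that idea, assuming a hypothetical CachedEntry class (not the real DirEntry implementation) that keeps one cache slot per follow_symlinks value:

```python
import os


class CachedEntry:
    """Hypothetical stand-in for DirEntry; not the real implementation."""

    def __init__(self, directory, name):
        self.name = name
        self.full_name = os.path.join(directory, name)
        # One cache slot per follow_symlinks value (True -> stat, False -> lstat).
        self._stat_cache = {}

    def stat(self, follow_symlinks=True):
        # Only touch the file system the first time each variant is requested.
        if follow_symlinks not in self._stat_cache:
            self._stat_cache[follow_symlinks] = os.stat(
                self.full_name, follow_symlinks=follow_symlinks)
        return self._stat_cache[follow_symlinks]
```

For a symlink, the two slots can hold different results: the stat of the target and the lstat of the link itself.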


--
~Ethan~


Re: [Python-Dev] Remaining decisions on PEP 471 -- os.scandir()

2014-07-15 Thread Ethan Furman

On 07/14/2014 11:25 PM, Victor Stinner wrote:


Again: remove any guarantee about the cache in the definitions of methods,
instead copy the doc from os.path and os. Add a global remark saying that
 most methods don't need any syscall in general, except for symlinks (with
 follow_symlinks=True).


I don't understand what you're saying here.  The fact that DirEntry.is_xxx will use cached values *must* be documented, 
or our users will waste huge amounts of time trying to figure out why an unknowingly cached value is no longer matching 
the current status.
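
To illustrate the surprise, here is a small sketch that assumes the os.scandir() that later shipped in Python 3.5 (the context-manager form needs 3.6). The helper name cached_vs_live is made up for this example:

```python
import os
import tempfile


def cached_vs_live():
    """Return (cached answer, live answer) after the file is deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        with open(path, "w") as handle:
            handle.write("data")

        with os.scandir(tmp) as entries:
            entry = next(entries)
        entry.is_file()          # first call; the answer is remembered on the entry
        os.remove(path)          # the file is now gone

        # The entry still reports what it saw; os.path.isfile() asks again.
        return entry.is_file(), os.path.isfile(path)
```

The two values in the returned pair disagree, which is exactly the situation the documentation needs to warn about.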


~Ethan~


Re: [Python-Dev] PEP 471 "scandir" accepted

2014-07-23 Thread Ethan Furman

On 07/21/2014 03:26 PM, Victor Stinner wrote:


The PEP is accepted.


Thanks, Victor!

Congratulations, Ben!

--
~Ethan~


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Ethan Furman

On 08/04/2014 12:12 AM, Larry Hastings wrote:


It's my contention that "nullable" is the correct name.  But I've been asked to 
bring up the topic for discussion, to
see if a consensus forms around this or around some other name.

Let the bike-shedding begin,


I think the original name is okay, but 'allow_none' is definitely clearer.

--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-07 Thread Ethan Furman

On 08/07/2014 03:06 PM, Chris Barker wrote:

[snip timings, etc.]

I don't remember where, but I believe that CPython has an optimization built in for repeated string concatenation, which 
is probably why you aren't seeing big differences between the + and the sum().


A little testing shows how to defeat that optimization:

  blah = ''
  for string in ['booyah'] * 10:
      blah = string + blah

Note the reversed order of the addition.

--> timeit.Timer("for string in ['booya'] * 10: blah = blah + string", "blah = ''").repeat(3, 1)
[0.021117210388183594, 0.013692855834960938, 0.00768280029296875]

--> timeit.Timer("for string in ['booya'] * 10: blah = string + blah", "blah = ''").repeat(3, 1)
[15.301048994064331, 15.343288898468018, 15.268463850021362]

--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-07 Thread Ethan Furman

On 08/07/2014 04:01 PM, Ethan Furman wrote:

On 08/07/2014 03:06 PM, Chris Barker wrote:

 the + and the sum().


Yeah, that 'sum' should be 'join'  :/

--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-07 Thread Ethan Furman

On 08/07/2014 04:01 PM, Ethan Furman wrote:

On 08/07/2014 03:06 PM, Chris Barker wrote:

--> timeit.Timer("for string in ['booya'] * 10: blah = blah + string", "blah = ''").repeat(3, 1)
[0.021117210388183594, 0.013692855834960938, 0.00768280029296875]

--> timeit.Timer("for string in ['booya'] * 10: blah = string + blah", "blah = ''").repeat(3, 1)
[15.301048994064331, 15.343288898468018, 15.268463850021362]


Oh, and the join() timings:

--> timeit.Timer("blah = ''.join(['booya'] * 10)", "blah = ''").repeat(3, 1)
[0.0014629364013671875, 0.0014190673828125, 0.0011930465698242188]

So, + is three orders of magnitude slower than join.
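
For anyone wanting to repeat the comparison, a self-contained sketch follows. The piece count, the number of runs, and the names SETUP, STATEMENTS and time_all are assumptions chosen for this example, not the values used for the timings above, so absolute numbers will differ:

```python
import timeit

# Shared setup: an empty accumulator and a list of short strings.
SETUP = "blah = ''; pieces = ['booya'] * 1000"

STATEMENTS = {
    "append": "for s in pieces: blah = blah + s",
    "prepend": "for s in pieces: blah = s + blah",
    "join": "blah = ''.join(pieces)",
}


def time_all(number=10):
    """Return {label: best-of-3 seconds} for each statement."""
    return {
        label: min(timeit.repeat(stmt, SETUP, repeat=3, number=number))
        for label, stmt in STATEMENTS.items()
    }


if __name__ == "__main__":
    for label, seconds in time_all().items():
        print("%-8s %.6f" % (label, seconds))
```

As in the timings above, the accumulator is not reset between runs, so the prepend case keeps copying an ever longer string.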

--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Ethan Furman

On 08/08/2014 08:23 AM, Chris Barker wrote:


So my final question is this:

repeated string concatenation is not the "recommended" way to do this -- but
nevertheless, CPython has an optimization that makes it fast and efficient,
to the point that there is no practical performance reason to prefer
appending to a list and calling join() afterward.

So why not apply a similar optimization to sum() for strings?


That I cannot answer -- I find the current situation with sum highly irritating.

--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Ethan Furman

On 08/08/2014 05:34 PM, Raymond Hettinger wrote:


On Aug 8, 2014, at 11:09 AM, Ethan Furman <et...@stoneleaf.us> wrote:


So why not apply a similar optimization to sum() for strings?


That I cannot answer -- I find the current situation with sum highly irritating.



It is only irritating if you are misusing sum().


Actually, I have an advanced degree in irritability -- perhaps you've noticed 
in the past?

I don't use sum at all, or at least very rarely, and it still irritates me.  It feels like I'm being told I'm too dumb 
to figure out when I can safely use sum and when I can't.


--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-11 Thread Ethan Furman

On 08/11/2014 08:50 PM, Stephen J. Turnbull wrote:

Chris Barker - NOAA Federal writes:


It seems pretty pedantic to say: we could make this work well,
but we'd rather chide you for not knowing the "proper" way to do
it.


Nobody disagrees.  But backward compatibility gets in the way.


Something that currently doesn't work, starts to.  How is that a backward 
compatibility problem?
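
For context, this is the behaviour under discussion: sum() accepts numbers and even lists, but refuses a string start value outright. The helper try_sum exists only for this illustration:

```python
def try_sum(items, start):
    """Return sum(items, start), or the TypeError message if it is refused."""
    try:
        return sum(items, start)
    except TypeError as exc:
        return "TypeError: %s" % exc


numbers = try_sum([1, 2, 3], 0)            # ordinary numeric use
lists = try_sum([[1], [2], [3]], [])       # allowed, though quadratic
strings = try_sum(["a", "b", "c"], "")     # explicitly rejected by sum()
joined = "".join(["a", "b", "c"])          # the recommended spelling
```

Code that currently relies on the string case working does not exist, because that case raises today.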

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Documenting enum types

2014-08-14 Thread Ethan Furman

On 08/14/2014 08:51 AM, Ben Hoyt wrote:

The enemy must be documented and exported, since users will encounter them.


enum == enemy? Is that you, Raymond? ;-)


ROFL!  Thanks, I needed that!

:D

--
~Ethan~


Re: [Python-Dev] Multiline with statement line continuation

2014-08-15 Thread Ethan Furman

On 08/12/2014 08:38 PM, Steven D'Aprano wrote:


[1] Technically not, since it's the comma, not the ( ), which makes a
tuple, but a lot of people don't know that and treat it as if the
parens were compulsory.


It might as well be, because if there can be a non-tuple way to interpret the comma, that way takes precedence, and then 
the parens /are/ required to disambiguate and get the tuple you wanted.
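
A function call is the most familiar case. In this small sketch (count_args is a made-up helper) the comma separates arguments unless parentheses force a tuple:

```python
def count_args(*args):
    """Return how many positional arguments were received."""
    return len(args)


pair = 1, 2                      # bare comma: builds the tuple (1, 2)
two_args = count_args(1, 2)      # comma separates arguments: no tuple
one_arg = count_args((1, 2))     # parens needed to pass one tuple
```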


--
~Ethan~


Re: [Python-Dev] Multiline with statement line continuation

2014-08-15 Thread Ethan Furman

On 08/13/2014 10:32 AM, Steven D'Aprano wrote:


(2) Also note that *this is already the case*, since tuples are made by
the commas, not the parentheses. E.g. this succeeds:

# Not a tuple, actually two context managers.
with open("/tmp/foo"), open("/tmp/bar", "w"):
    pass


Thanks for proving my point!  A comma, and yet we did *not* get a tuple from it.

--
~Ethan~


Re: [Python-Dev] Multiline with statement line continuation

2014-08-15 Thread Ethan Furman

On 08/15/2014 08:08 PM, Steven D'Aprano wrote:

On Fri, Aug 15, 2014 at 02:08:42PM -0700, Ethan Furman wrote:

On 08/13/2014 10:32 AM, Steven D'Aprano wrote:


(2) Also note that *this is already the case*, since tuples are made by
the commas, not the parentheses. E.g. this succeeds:

# Not a tuple, actually two context managers.
with open("/tmp/foo"), open("/tmp/bar", "w"):
    pass


Thanks for proving my point!  A comma, and yet we did *not* get a tuple
from it.


Um, sorry, I don't quite get you. Are you agreeing or disagreeing with
me? I spent half of yesterday reading the static typing thread over on
Python-ideas and it's possible my brain has melted down *wink* but I'm
confused by your response.


My point is that commas don't always make a tuple, and your example above is a case in point:  we have a comma 
separating two context managers, but we do not have a tuple, and your comment even says so.



is a poor argument (that is, I'm disagreeing with it), since *single*
line parens-free with statements are already syntactically a tuple:

 with spam, eggs, cheese:  # Commas make a tuple, not parens.


This point I do not understand -- commas /can/ create a tuple, but don't /necessarily/ create a tuple.  So, 
semantically: no tuple.  Syntactically: I don't think there's a tuple there this way either.  I suppose one of us should 
look it up in the lexer.  ;)
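
One way to check without reading the grammar is to ask the ast module how the statement parses. This is only a sketch of that check, using the spam/eggs/cheese names from the quoted example:

```python
import ast

tree = ast.parse("with spam, eggs, cheese:\n    pass\n")
with_stmt = tree.body[0]

# Each comma-separated expression becomes its own 'withitem' node.
item_count = len(with_stmt.items)
has_tuple = any(isinstance(item.context_expr, ast.Tuple)
                for item in with_stmt.items)

# For contrast, the same names on the right of '=' do parse as a tuple.
assign = ast.parse("x = spam, eggs, cheese\n").body[0]
assign_is_tuple = isinstance(assign.value, ast.Tuple)
```

The with statement yields three separate items and no Tuple node, while the assignment yields a single Tuple.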


--
~Ethan~


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 10:16 AM, Donald Stufft wrote:


For the record I’ve had all of the problems that Nick states and I’m
+1 on this change.


I've had many of the problems Nick states and I'm also +1.

--
~Ethan~


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 02:19 PM, Raymond Hettinger wrote:

On Aug 17, 2014, at 11:33 AM, Ethan Furman wrote:


I've had many of the problems Nick states and I'm also +1.


There are two code snippets below which were taken from the standard library.


[...]

My issues are with 'bytes', not 'bytearray'.  'bytearray(10)' actually makes sense.  I certainly have no problem with 
bytearray and bytes not being exactly the same.


My primary issue with bytes is not being able to do b'abc'[2] == b'c', and not being able to do x = b'abc'[2]; y = 
bytes(x); assert y == b'c'.
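
Spelled out on Python 3, the behaviour in question looks like this (the variable names are only for illustration):

```python
data = b'abc'

item = data[2]                 # an int (99), not b'c'
sliced = data[2:3]             # slicing does give bytes: b'c'
zeros = bytes(3)               # bytes(int) means "that many NUL bytes"
padded = bytes(item)           # so this is 99 NUL bytes, not b'c'
from_list = bytes([item])      # wrapping the int in a list round-trips
```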


And because of the backwards compatibility issues I would deprecate, because we have a new 'better' way, but not remove, 
the current functionality.


I pretty much agree exactly with what Donald Stufft said about it.

--
~Ethan~


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 04:08 PM, Nick Coghlan wrote:


I'm fine with postponing the deprecation elements indefinitely (or just 
deprecating bytes(int) and leaving
bytearray(int) alone).


+1 on both pieces.

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bytes path support

2014-08-20 Thread Ethan Furman

On 08/20/2014 03:31 PM, Nick Coghlan wrote:


On 21 Aug 2014 08:19, "Greg Ewing" <greg.ew...@canterbury.ac.nz> wrote:


Antoine Pitrou wrote:


I think if you want low-level features (such as unconverted bytes paths under 
POSIX), it is reasonable to point you to low-level APIs.



The problem with scandir() in particular is that there is
currently *no* low-level API exposed that gives the same
functionality.

If scandir() is not to support bytes paths, I'd suggest
exposing the opendir() and readdir() system calls with
bytes path support.


scandir is low level (the entire os module is low level). In fact, aside from 
pathlib, I'd consider pretty much every
API we have that deals with paths to be low level - that's a large part of the 
reason we needed pathlib!


If scandir is low-level, and the low-level APIs are the ones that should support bytes paths, then scandir should 
support bytes paths.


Is that what you meant to say?

--
~Ethan~


Re: [Python-Dev] Bytes path support

2014-08-20 Thread Ethan Furman

On 08/20/2014 05:15 PM, Nick Coghlan wrote:

On 21 August 2014 09:33, Ethan Furman  wrote:

On 08/20/2014 03:31 PM, Nick Coghlan wrote:


scandir is low level (the entire os module is low level). In fact, aside
from pathlib, I'd consider pretty much every
API we have that deals with paths to be low level - that's a large part of
the reason we needed pathlib!


If scandir is low-level, and the low-level APIs are the ones that should
support bytes paths, then scandir should support bytes paths.

Is that what you meant to say?


Yes. The discussions around PEP 471 *deferred* discussions of bytes
and file descriptor support to their own RFEs (not needing a PEP),
they didn't decide definitively not to support them. So Serhiy's
thread is entirely pertinent to that question.


Thanks for clearing that up.  I hate feeling confused.  ;)



Note that adding bytes support still *should not* hold up the initial
PEP 471 implementation - it should be done as a follow on RFE.


Agreed.

--
~Ethan~


Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-08-29 Thread Ethan Furman

On 08/29/2014 01:00 PM, M.-A. Lemburg wrote:

On 29.08.2014 21:47, Alex Gaynor wrote:


I've just submitted PEP 476, on enabling certificate validation by default for
HTTPS clients in Python. Please have a look and let me know what you think.


Thanks for the PEP. I think this is generally a good idea,
but some important parts are missing from the PEP:

  * transition plan:

I think starting with warnings in Python 3.5 and going
for exceptions in 3.6 would make a good transition

Going straight for exceptions in 3.5 is not in line with
our normal procedures for backwards incompatible changes.

  * configuration:

It would be good to be able to switch this on or off
without having to change the code, e.g. via a command
line switch and environment variable; perhaps even
controlling whether or not to raise an exception or
warning.

  * choice of trusted certificate:

Instead of hard wiring using the system CA roots into
Python it would be good to just make this default and
permit the user to point Python to a different set of
CA roots.

This would enable using self signed certs more easily.
Since these are often used for tests, demos and education,
I think it's important to allow having more control of
the trusted certs.


+1 for PEP with above changes.

--
~Ethan~


Re: [Python-Dev] RFC: PEP 475, Retry system calls failing with EINTR

2014-08-31 Thread Ethan Furman

On 08/31/2014 02:19 PM, Marko Rauhamaa wrote:

Victor Stinner :


Sorry but I don't understand your remark. What is your problem with
retrying syscall on EINTR?


The application will often want the EINTR return (exception) instead of
having the function resume on its own.


Examples?

As an ignorant person in this area, I do not know why I would ever want to have EINTR raised instead of just getting the 
results of, say, my read() call.


--
~Ethan~


Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-09-03 Thread Ethan Furman

On 09/03/2014 08:58 AM, R. David Murray wrote:


I'm OK with letting go of this invalid-cert issue myself, given the lack
of negative feedback Twisted got.  I'll just keep my fingers crossed.


I apologize if I missed this point, but if we have the source code then it is possible to go in and directly modify the 
application/utility to be able to talk over https to a router with an invalid certificate?  This is an option when 
creating the ssl_context?


--
~Ethan~


Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-09-03 Thread Ethan Furman

On 09/03/2014 10:15 AM, Alex Gaynor wrote:

Ethan Furman writes:


I apologize if I missed this point, but if we have the source code then it is
possible to go in and directly modify the application/utility to be able to
talk over https to a router with an invalid certificate?  This is an option
when creating the ssl_context?


Yes, it's totally possible to create (and pass to ``http.client``) an
``SSLContext`` which doesn't verify various things. My proposal is only about
changing what happens when you don't explicitly pass a context.


Excellent.  Last question (I hope): is it possible to (easily) create an SSLContext that will verify against a 
self-signed certificate?
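
For reference, a hedged sketch of both kinds of context using the standard ssl module. The function names are made up for this example, and 'cafile' is a hypothetical path to a PEM file holding the self-signed certificate:

```python
import ssl


def unverified_context():
    """Context that skips certificate and hostname checks (insecure)."""
    context = ssl.create_default_context()
    context.check_hostname = False      # must be cleared before CERT_NONE
    context.verify_mode = ssl.CERT_NONE
    return context


def pinned_context(cafile):
    """Context that trusts only the certificate(s) found in 'cafile'."""
    # When cafile is given, the system CA roots are not loaded.
    return ssl.create_default_context(cafile=cafile)
```

Either result can then be passed explicitly, for example as the context argument of http.client.HTTPSConnection.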


--
~Ethan~


Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-09-03 Thread Ethan Furman

On 09/03/2014 12:10 PM, R. David Murray wrote:

On Wed, 03 Sep 2014 10:09:36 -0700, Ethan Furman  wrote:

On 09/03/2014 08:58 AM, R. David Murray wrote:


I'm OK with letting go of this invalid-cert issue myself, given the lack
of negative feedback Twisted got.  I'll just keep my fingers crossed.


I apologize if I missed this point, but if we have the source code then it is 
possible to go in and directly modify the
application/utility to be able to talk over https to a router with an invalid 
certificate?  This is an option when
creating the ssl_context?


The immediately preceding paragraph that you didn't quote said that the
context was 3rd party applications, not source code under your control.
Yes, you can (usually) still hack the source, but there are good reasons to
prefer to not do that, unfamiliarity with the codebase being just one of
them.


I appreciate that there is a distinction, yet in most cases we have the source code available (it is the nature of 
Python) and if push comes to shove (and a bunch of other colloquialisms) then modifying that source code can get you up 
and running again.


--
~Ethan~


Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-09-03 Thread Ethan Furman

On 09/03/2014 04:36 PM, Antoine Pitrou wrote:

On Thu, 4 Sep 2014 09:19:56 +1000
Nick Coghlan  wrote:


Python is routinely updated to bugfix releases by Linux distributions
and other distribution channels, you usually have no say over what's
shipped in those updates. This is not like changing the major version
used for executing the script, which is normally a manual change.


We can potentially deal with the more conservative part of the user base on
the redistributor side - so long as the PEP says it's OK for us to not
apply this particular change if we deem it appropriate to do so.


So people would believe python.org that they would get HTTPS cert
validation by default, but their upstream distributor would have
disabled it for them? That's even worse...


I agree.  If the vendors don't want to have validation by default, they should 
stick with 2.7.8.

--
~Ethan~


Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

2014-09-03 Thread Ethan Furman

On 09/03/2014 05:00 PM, Ethan Furman wrote:

On 09/03/2014 04:36 PM, Antoine Pitrou wrote:

On Thu, 4 Sep 2014 09:19:56 +1000
Nick Coghlan  wrote:


Python is routinely updated to bugfix releases by Linux distributions
and other distribution channels, you usually have no say over what's
shipped in those updates. This is not like changing the major version
used for executing the script, which is normally a manual change.


We can potentially deal with the more conservative part of the user base on
the redistributor side - so long as the PEP says it's OK for us to not
apply this particular change if we deem it appropriate to do so.


So people would believe python.org that they would get HTTPS cert
validation by default, but their upstream distributor would have
disabled it for them? That's even worse...


I agree.  If the vendors don't want to have validation by default, they should 
stick with 2.7.8.


If a good argument can be made for why we should make validation by default optional, then that point should be well made 
in Python's release notes, and there should be some easy programmatic way to tell if validation is on or off (which may just be more 
docs saying call SSLContext and examine the results:  xxx means you're validating, yyy means you are not).


--
~Ethan~


Re: [Python-Dev] List insert at index that is well out of range - behaves like append

2014-09-15 Thread Ethan Furman

On 09/15/2014 03:46 PM, Mark Lawrence wrote:

On 15/09/2014 23:29, Mark Shannon wrote:


I think this is an OK forum for this question.


It isn't.  ;)


If someone isn't sure if something is a bug or not, then why not ask
here before reporting it on the bug tracker?


The first stop should still be the main Python list, or Python Dev would be inundated with questions about why this or 
that doesn't work the same way as .  If the responses from Python list indicate that it is 
(or probably is) a bug, then possibly a post here to verify -- but a bug-tracker entry at that point is quite reasonable.



This does seem strange behaviour, and the documentation for list.insert
gives no clue as to why this behaviour was chosen.


I assume it's based on the concepts of slicing.  From the docs "s.insert(i, x) 
- inserts x into s at the index given by
i (same as s[i:i] = [x])".  Although shouldn't that read s[i:i+1] = [x] ?


No.  If it was `s[i:i+1]` then the ith element would be replaced by the 
inserted object.
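
The difference is easy to see side by side. In this sketch the two helper functions are made up for illustration and work on copies so the original list is untouched:

```python
def insert_via_slice(seq, i, x):
    """Copy seq and insert x using s[i:i] = [x]."""
    result = list(seq)
    result[i:i] = [x]
    return result


def replace_via_slice(seq, i, x):
    """Copy seq and assign to s[i:i+1], which overwrites element i."""
    result = list(seq)
    result[i:i + 1] = [x]
    return result


base = ['a', 'b', 'c']
inserted = insert_via_slice(base, 1, 'X')    # ['a', 'X', 'b', 'c']
replaced = replace_via_slice(base, 1, 'X')   # ['a', 'X', 'c']

far = list(base)
far.insert(100, 'X')                         # index past the end: appends
```

Slice bounds are clamped to the length of the list, which is why an out-of-range insert behaves like append.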

--
~Ethan~


Re: [Python-Dev] 3.5 release schedule PEP

2014-09-25 Thread Ethan Furman

On 09/24/2014 09:11 PM, Larry Hastings wrote:


Therefore: if VC14 doesn't ship by 3.5 RC1, currently set at August 5, 2015, I 
decree we have to ship 3.5 with the
previous version.

Reasonable?


Seems reasonable to me.

--
~Ethan~


[Python-Dev] Fixing 2.7.x

2014-10-06 Thread Ethan Furman

With the incredibly long life span of 2.7, which bugs should we *not* fix?

For example, in http://bugs.python.org/issue22297 I mentioned one reason to not fix that bug was that the fix was not in 
3.1-3.3, but 2.7 will outlive all those plus a couple more.


So, what are the current guidelines on what to fix?  Is it still security only, 
with the rest being carrots for switching?

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] mUTF-7 support?

2014-10-09 Thread Ethan Furman

On 10/09/2014 03:47 PM, Jesus Cea wrote:

[]  mUTF-7 support  [...]

What do you think? Could it be considered for Python 3.5?


+1

--
~Ethan~


Re: [Python-Dev] Status of C compilers for Python on Windows

2014-10-29 Thread Ethan Furman

On 10/29/2014 03:09 PM, Paul Moore wrote:

On 29 October 2014 20:26, Donald Stufft  wrote:

This sounds like something good for packaging.python.org


Yeah, I wondered about that. I'll work up a patch for that. But the
more I think about it, it really is trivial:


I am reminded of an interview question I was once asked which was prefaced with: 
"Here's an easy one..."

My reply was, if you know the answer, it's easy!



- For non-free MSVC, install the appropriate version, and everything just works.
- For Python 2.7 (32 or 64 bit), install the compiler for Python 2.7
package and everything just works as long as you're using setuptools.
- For 32 bit Python 3.2-3.4, install Visual Studio Express and
everything just works.
- For 64 bit Python 3.2-3.4, install the SDK, set some environment
variables, and everything just works.
- For Python 3.5+, install the current Visual Studio Express and
everything just works.


I would suggest
  - where to get these packages
  - where to get any dependencies
  - any options to [not] specify during install
  - what environment variables to set to what values
  - where one should be at when one starts the compile process

and, of course, a gotchas section for uncommon but frustrating things that 
might go wrong.

And thanks for doing this!

--
~Ethan~


Re: [Python-Dev] Status of C compilers for Python on Windows

2014-10-29 Thread Ethan Furman

On 10/29/2014 03:46 PM, Paul Moore wrote:

On 29 October 2014 22:19, Ethan Furman  wrote:


   - where one should be at when one starts the compile process


I don't understand this. It's just "pip wheel foo" to build a wheel
for foo (which will be downloaded), or "pip wheel ." or  "python
setup.py bdist_wheel" as you prefer for a local package.


Hmmm...  That looks like it's for installing/compiling somebody else's package.  Is that last command sufficient to 
prepare one's own wheel for uploading to PyPI, or is there something else to do?


--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] The role of NotImplemented: What is it for and when should it be used?

2014-11-03 Thread Ethan Furman

Just to be clear, this is about NotImplemented, not NotImplementedError.

tl;dr  When a binary operation fails, should an exception be raised or 
NotImplemented returned?


When a binary operation in Python is attempted, there are two possibilities:

  - it can work
  - it can't work

The main reason [1] that it can't work is that the two operands are of different types, and the first type does not know 
how to deal with the second type.


The question then becomes: how does the first type tell Python that it cannot perform the requested operation?  The most 
obvious answer is to raise an exception, and TypeError is a good candidate.  The problem with the exception raising 
approach is that once an exception is raised, Python doesn't try anything else to make the operation work.


What's wrong with that?  Well, the second type might know how to perform the operation, and in fact that is why we have 
the reflected special methods, such as __radd__ and __rmod__ -- but if the first type raises an exception the __rxxx__ 
methods will not be tried.


Okay, how can the first type tell Python that it cannot do what is requested, but to go ahead and check with the second 
type to see if it does?  That is where NotImplemented comes in -- if a special method (and only a special method) 
returns NotImplemented then Python will check to see if there is anything else it can do to make the operation succeed; 
if all attempts return NotImplemented, then Python itself will raise an appropriate exception [2].
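
A minimal sketch of that hand-off, using two hypothetical classes (Meters and Feet are invented for this example and are not part of the test script):

```python
class Meters:
    """Hypothetical type that only knows how to add other Meters."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented        # let Python try other.__radd__


class Feet:
    """Hypothetical second type that knows about Meters."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


# Meters.__add__ declines, so Python falls back to Feet.__radd__.
total = Meters(1) + Feet(10)
```

Had Meters.__add__ raised TypeError instead, Feet.__radd__ would never have been consulted. When every candidate returns NotImplemented, as with Meters(1) + "x", Python raises the TypeError itself.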


In an effort to see how often NotImplemented is currently being returned I crafted a test script [3] to test the types 
bytes, bytearray, str, dict, list, tuple, Enum, Counter, defaultdict, deque, and OrderedDict with the operations for 
__add__, __and__, __floordiv__, __iadd__, __iand__, __ifloordiv__, __ilshift__, __imod__, __imul__, __ior__, __ipow__, 
__irshift__, __isub__, __itruediv__, __ixor__, __lshift__, __mod__, __mul__, __or__, __pow__, __rshift__, __sub__, 
__truediv__, and __xor__.


Here are the results of the 275 tests:

testing control...

ipow -- Exception  raised
errors in Control -- misunderstanding or bug?

testing types against a foreign class

iadd(Counter()) -- Exception <'SomeOtherClass' object has no attribute 'items'> raised instead of TypeError
iand(Counter()) -- NotImplemented not returned, TypeError not raised
ior(Counter()) -- Exception <'SomeOtherClass' object has no attribute 'items'> raised instead of TypeError
isub(Counter()) -- Exception <'SomeOtherClass' object has no attribute 'items'> raised instead of TypeError


testing types against a subclass

mod(str()) -- NotImplemented not returned, TypeError not raised

iadd(Counter()) -- Exception <'subtype' object has no attribute 'items'> raised (should have worked)
iand(Counter()) -- NotImplemented not returned, TypeError not raised
ior(Counter()) -- Exception <'subtype' object has no attribute 'items'> raised (should have worked)
isub(Counter()) -- Exception <'subtype' object has no attribute 'items'> raised (should have worked)


Two observations:

  - __ipow__ doesn't seem to behave properly in the 3.x line (that error 
    doesn't show up when testing against 2.7)

  - Counter should be returning NotImplemented instead of raising an 
    AttributeError, for three reasons [4]:
      - a TypeError is more appropriate
      - subclasses /cannot/ work with the current implementation
      - __iand__ is currently a silent failure if the Counter is empty, and the 
        other operand should trigger a failure

Back to the main point...

So, if my understanding is correct:

  - NotImplemented is used to signal Python that the requested operation could 
not be performed
  - it should be used by the binary special methods to signal type mismatch 
failure, so any subclass gets a chance to work.

Is my understanding correct?  Is this already in the docs somewhere, and I just 
missed it?

--
~Ethan~

[1] at least, it's the main reason in my code
[2] usually a TypeError, stating either that the operation is not supported, or 
the types are unorderable
[3] test script at the end
[4] https://bugs.python.org/issue22766 [returning NotImplemented was rejected]

-- 8< 

from collections import Counter, defaultdict, deque, OrderedDict
from fractions import Fraction
from decimal import Decimal
from enum import Enum
import operator
import sys

py_ver = sys.version_info[:2]

types = (
    bytes, bytearray, str, dict, list, tuple,
    Enum, Counter, defaultdict, deque, OrderedDict,
    )
numeric_types = int, float, Decimal, Fraction

operators = (
    '__add__', '__and__', '__floordiv__',
    '__iadd__', '__iand__', '__ifloordiv__', '__ilshift__',
    '__imod__', '__imul__', '__ior__', '__ipow__',
    '__irshift__', '__isub__', '__itru

Re: [Python-Dev] The role of NotImplemented: What is it for and when should it be used?

2014-11-03 Thread Ethan Furman

On 11/03/2014 08:12 AM, R. David Murray wrote:

On Mon, 03 Nov 2014 15:05:31 +, Brett Cannon  wrote:

On Mon Nov 03 2014 at 5:31:21 AM Ethan Furman  wrote:


Just to be clear, this is about NotImplemented, not NotImplementedError.

tl;dr  When a binary operation fails, should an exception be raised or
NotImplemented returned?



The docs for NotImplemented suggest it's only for rich comparison methods
and not all binary operators:
https://docs.python.org/3/library/constants.html#NotImplemented . But then
had I not read that I would have said all binary operator methods should
return NotImplemented when the types are incompatible.


Ethan opened an issue and then changed those docs, but I now believe
that the docs should be changed back (see the discussion in issue
22766).


I was wondering myself, which is why I started this thread.

--
~Ethan~


Re: [Python-Dev] The role of NotImplemented: What is it for and when should it be used?

2014-11-03 Thread Ethan Furman

Summary:

NotImplemented _should_ be used by the normal and reflected binary methods 
(__lt__, __add__, __xor__, __rsub__, etc.)

NotImplemented _may_ be used by the in-place binary methods (__iadd__, __ixor__, etc.), but the in-place methods are 
also free to raise an exception.
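
A small sketch of the in-place case, assuming a made-up Bag class (hypothetical, for illustration only): when __iadd__ returns NotImplemented, Python falls back to the normal __add__ (and then the other operand's __radd__) and rebinds the name to the result.

```python
class Bag:
    """Hypothetical container used only for illustration."""

    def __init__(self, items=()):
        self.items = list(items)

    def __iadd__(self, other):
        if not isinstance(other, Bag):
            return NotImplemented   # decline; Python then tries __add__ / __radd__
        self.items.extend(other.items)
        return self

    def __add__(self, other):
        if isinstance(other, Bag):
            return Bag(self.items + other.items)
        if isinstance(other, list):
            return Bag(self.items + other)
        return NotImplemented


bag = Bag([1])
original = bag
bag += [2, 3]                  # __iadd__ declines, so bag = bag.__add__([2, 3])
rebound = bag is not original  # the name now refers to a new object

try:
    bag += 5                   # every candidate method declines
except TypeError as exc:
    message = str(exc)
```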


Correct?

--
~Ethan~


Re: [Python-Dev] Dinamically set __call__ method

2014-11-04 Thread Ethan Furman

This list is for the development _of_ Python, not development _with_ Python.

Try asking on Python List.

(forwarding...)

On 11/04/2014 08:52 AM, Roberto Martínez wrote:


I am trying to dynamically replace the __call__ method of an object using 
setattr.
Example:

$ cat testcall.py
class A:
    def __init__(self):
        setattr(self, '__call__', self.newcall)

    def __call__(self):
        print("OLD")

    def newcall(self):
        print("NEW")

a=A()
a()

I expect to get "NEW" instead of "OLD", but in Python 3.4 I get "OLD".

$ python2.7 testcall.py
NEW
$ python3.4 testcall.py
OLD

I have a few questions:

- Is this an expected behavior?
- Is it possible to replace __call__ dynamically in Python 3? How?


In 2.7 that would be a classic class, about which I know little.

In 3.x you have a new-style class, one which inherits from 'object'.  When you replace __call__ you need to replace it on the 
class, not on the instance:


  setattr(self.__class__, '__call__', self.newcall)
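
A short sketch of the difference, assuming Python 3 (the methods return strings instead of printing so the result can be inspected; otherwise this mirrors the class from the question):

```python
class A:
    def __call__(self):
        return "OLD"

    def newcall(self):
        return "NEW"


a = A()

# Setting __call__ on the instance: a() ignores it, because special methods
# are looked up on the type, not on the instance.
a.__call__ = a.newcall
instance_result = a()

# Setting __call__ on the class: a() now uses the replacement.
A.__call__ = A.newcall
class_result = a()
```

Note that replacing the method on the class changes the behavior of every instance of A, not just this one.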

--
~Ethan~


[Python-Dev] Real-world use of Counter

2014-11-05 Thread Ethan Furman
I'm looking for real-world uses of collections.Counter, specifically to see if anyone has been surprised by, or had to 
spend extra time debugging, issues with the in-place operators.


If sufficient and/or compelling use-cases are uncovered, the behavior of 
Counter may be slightly modified.

Background:

Most Python data types will cause a TypeError to be raised if unusable types 
are passed in:

--> {'a': 0}.update(5)
TypeError: 'int' object is not iterable

--> [1, 2, 3].extend(3.14)
TypeError: 'float' object is not iterable

--> from collections import Counter
--> Counter() + [1, 2, 3]
TypeError: unsupported operand type(s) for +: 'Counter' and 'list'

Most Counter in-place methods also behave this way:

--> c = Counter()
--> c /= [1, 2, 3]
TypeError: unsupported operand type(s) for /=: 'Counter' and 'list'

However, in the case of a handful of Counter in-place methods (+=, -=, &=, and 
|=), this is what happens instead:

--> c += [1, 2, 3]
AttributeError: 'list' object has no attribute 'items'

Even worse (in my opinion) is the case of an empty Counter `and`ed with an 
incompatible type:

--> c &= [1, 2, 3]
-->

No error is raised at all.

In order to avoid unnecessary code churn (the fix itself is quite simple), the maintainer of the collections module 
wants to know if anybody has actually been affected by these inconsistencies, and if so, whether it was a minor 
inconvenience, or a compelling use-case.


So, if this has bitten you, now is the time to speak up!  :)

--
~Ethan~


Re: [Python-Dev] Real-world use of Counter

2014-11-05 Thread Ethan Furman

On 11/05/2014 10:09 AM, MRAB wrote:

On 2014-11-05 16:33, Ethan Furman wrote:


Even worse (in my opinion) is the case of an empty Counter `and`ed with an 
incompatible type:

--> c &= [1, 2, 3]
-->

No error is raised at all.


The final example, however, is odd. I think that one should be fixed.


https://bugs.python.org/issue22801

--
~Ethan~


Re: [Python-Dev] Real-world use of Counter

2014-11-05 Thread Ethan Furman

On 11/05/2014 10:56 AM, Raymond Hettinger wrote:


Please stop using the mailing lists as a way to make an end-run around 
discussions on the tracker.
http://bugs.python.org/issue22766


You said that without compelling, real-world use cases you don't like to make 
changes.

The tracker has a very limited audience, while the mailing lists have a much greater chance of reaching the users who 
may actually have the use-cases you would consider.


You call it an end-run, I call it research.



Also, as asked the question is a bit loaded.  Effectively, it asks "has anyone 
ever been surprised by an exception
raised by a duck-typed function or method"?


Actually, it's asking, "Most other duck-typed methods will still raise a TypeError, but these few don't.  Has that ever 
been a problem for you?"




The in-place operations on counters are duck-typed.  They are intended (by 
design) to work with ANY type that has an
items() method.   The exception raised if it doesn't have one is an AttributeError 
saying that the operand needs to have an
items() method.


It would still be duck-typed with a `hasattr` call on the second operand checking for the necessary method; a TypeError 
could just as easily state that the problem is a missing `items()` method, and then those three [*] in-place methods would 
raise the same error as the 20(?) other Counter methods under similar conditions.
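
A rough sketch of what such a check could look like, written as a subclass so it can be run today (StrictCounter is a hypothetical name; this is not how collections.Counter is implemented, and the error wording is only an example):

```python
from collections import Counter


class StrictCounter(Counter):
    """Hypothetical subclass: still duck-typed, but raises TypeError."""

    def __iadd__(self, other):
        # Duck-typing check: anything with an items() method is accepted.
        if not hasattr(other, "items"):
            raise TypeError(
                "unsupported operand for +=: %r object has no items() method"
                % type(other).__name__
            )
        return super().__iadd__(other)


c = StrictCounter(a=1)
c += {"a": 2, "b": 1}          # a plain dict has items(), so this still works

try:
    c += [1, 2, 3]             # a list does not, so TypeError is raised
except TypeError as exc:
    message = str(exc)
```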




Please let this one die.  It seems to have become your pet project even after 
I've made a decision and explained my
rationale.


You indicated you might make a change with sufficient real-world use-cases, so I'm trying to find some.  If none show 
up, I'll let the matter drop.


--
~Ethan~

[*] Amusingly enough, there are four overridden in-place methods:  three raise an AttributeError, but one (&=) still 
raises a TypeError.



Re: [Python-Dev] Real-world use of Counter

2014-11-07 Thread Ethan Furman

Thank you everyone for the discussion, it has been, as always, most 
educational.  :)

--
~Ethan~


Re: [Python-Dev] Improvements for Pathlib

2014-11-08 Thread Ethan Furman

On 11/08/2014 10:46 AM, Xavier Morel wrote:

On 2014-11-08, at 16:46 , Ionel Cristian Mărieș wrote:


In the current incarnation Pathlib is missing some key features I need in
 my usecases. I want to contribute them but i'd like a bit of feedback on
 the new api before jumping to implementation.



#1. A context manager for temporary files and dirs (that cleans everything
 up on exit).


Why would pathlib need to provide this when tempfile already does?


Because tempfile doesn't accept pathlib.Path instances?
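
For what it's worth, a minimal sketch of the requested behavior built on what tempfile already provides (temporary_path is a hypothetical helper name, not an existing or proposed pathlib API):

```python
import contextlib
import pathlib
import tempfile


@contextlib.contextmanager
def temporary_path():
    """Yield a pathlib.Path for a temporary directory, removed on exit."""
    with tempfile.TemporaryDirectory() as name:
        yield pathlib.Path(name)


with temporary_path() as tmp:
    (tmp / "example.txt").write_text("hello")
    existed_inside = tmp.is_dir()
    kept = tmp

removed_after = not kept.exists()
```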

--
~Ethan~


Re: [Python-Dev] Who's using VS/Windows to work on Python?

2014-11-13 Thread Ethan Furman
On 11/13/2014 11:47 AM, Steve Dower wrote:
> 
> Just wondering who is regularly/occasionally using VS 2010 to work on Python?

Very occasionally.

In fact, my MSDN subscription expired and I missed the last call for renewals.  
:(

If it will help (and I can get a renewed subscription), I can build/test on my 
Win7 machine.

--
~Ethan~


Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-21 Thread Ethan Furman
On 11/21/2014 05:47 AM, Raymond Hettinger wrote:
> 
> Also, the proposal breaks a reasonably useful pattern of calling 
> next(subiterator)
> inside a generator and letting the generator terminate when the data stream  
> ends.
>
> Here is an example that I have taught for years:
> 
> def [...]
>     it1 = iter(iterable1)
>     it2 = iter(iterable2)
>     while True:
>         v1 = next(it1)
>         v2 = next(it2)
>         yield v1, v2

Stepping back a little and looking at this code, sans header, let's consider 
the possible desired behaviors:

  - have an exact match-up between the two iterators, error otherwise
  - stop when one is exhausted
  - pad shorter one to longer one

Two of those three possible options are going to require dealing with the 
StopIteration that shouldn't escape -- is the
trade of keeping one option short and simple worth the pain caused by the 
error-at-a-distance bugs caused when a
StopIteration does escape that shouldn't have?
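
To make the three options concrete, here is one possible sketch of each, written so that no StopIteration escapes the generator body (the function names and the sentinel are mine, chosen for illustration):

```python
_MISSING = object()   # sentinel marking an exhausted iterator


def pairs_strict(iterable1, iterable2):
    """Exact match-up between the two iterators, error otherwise."""
    it1, it2 = iter(iterable1), iter(iterable2)
    while True:
        v1, v2 = next(it1, _MISSING), next(it2, _MISSING)
        if v1 is _MISSING and v2 is _MISSING:
            return
        if v1 is _MISSING or v2 is _MISSING:
            raise ValueError("iterables have different lengths")
        yield v1, v2


def pairs_shortest(iterable1, iterable2):
    """Stop when either iterator is exhausted."""
    it1, it2 = iter(iterable1), iter(iterable2)
    while True:
        try:
            v1 = next(it1)
            v2 = next(it2)
        except StopIteration:
            return
        yield v1, v2


def pairs_padded(iterable1, iterable2, fill=None):
    """Pad the shorter iterator out to the length of the longer one."""
    it1, it2 = iter(iterable1), iter(iterable2)
    while True:
        v1, v2 = next(it1, _MISSING), next(it2, _MISSING)
        if v1 is _MISSING and v2 is _MISSING:
            return
        yield (fill if v1 is _MISSING else v1,
               fill if v2 is _MISSING else v2)
```

Only the stop-when-one-is-exhausted variant could lean on an escaping StopIteration; the other two have to handle exhaustion explicitly either way.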

--
~Ethan~





Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-22 Thread Ethan Furman
On 11/22/2014 06:31 AM, Nick Coghlan wrote:
> 
> A particularly relevant variant of the idiom is the approach of writing
> "__iter__" directly as a generator, rather than creating a separate custom
> iterator class. In that context, the similarities between the __iter__
> implementation and the corresponding explicit __next__ implementation is a
> beneficial feature.

https://docs.python.org/3/reference/datamodel.html?highlight=__iter__#object.__iter__
--
> This method is called when an iterator is required for a container.
> This method should return a new iterator object that can iterate
> over all the objects in the container. For mappings, it should
> iterate over the keys of the container, and should also be made
> available as the method keys().

> Iterator objects also need to implement this method; they are
> required to return themselves. For more information on iterator
> objects, see Iterator Types.

Unless the object is itself an iterator, the __iter__ method is allowed to 
return any iterator object; whether that
iterator is constructed by a separate class entirely, or by using the iter() 
function, or by writing a generator, should
have no bearing on how we write generators themselves.

--
~Ethan~





Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-22 Thread Ethan Furman
On 11/22/2014 05:11 PM, Raymond Hettinger wrote:
>> On Nov 22, 2014, at 2:45 PM, Chris Angelico wrote:
>>
>> Does your middleware_generator work with just a single element,
>> yielding either one output value or none?
> 
> I apologize if I didn't make the point clearly.  The middleware example was 
> just a simple outline of calling next(), doing some processing, and yielding a
> result while letting the StopIteration float through from the next() call.

[middleware example]

def middleware_generator(source_generator):
    it = source_generator()
    input_value = next(it)
    output_value = do_something_interesting(input_value)
    yield output_value

The point that Chris made that you should be refuting is this one:

>> What happens if do_something_interesting happens to raise
>> StopIteration? Will you be surprised that this appears identical to
>> the source generator yielding nothing?
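
A small sketch of that scenario (the buggy helper and the one-item source are invented for illustration; the outcome labelled as PEP 479 behavior assumes an interpreter where the PEP is in effect by default, i.e. Python 3.7 or later):

```python
def source_generator():
    yield 1


def do_something_interesting(value):
    # Hypothetical bug: an unguarded next() on an empty iterator
    # raises StopIteration from inside the helper.
    return next(iter([]))


def middleware_generator(source_generator):
    it = source_generator()
    input_value = next(it)
    output_value = do_something_interesting(input_value)
    yield output_value


try:
    results = list(middleware_generator(source_generator))
    # Without PEP 479 the stray StopIteration just ends the generator,
    # which looks exactly like an empty source.
    outcome = "silently empty"
except RuntimeError:
    # With PEP 479 the stray StopIteration is converted to RuntimeError.
    outcome = "RuntimeError"
```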

--
~Ethan~





Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-23 Thread Ethan Furman
On 11/22/2014 08:53 PM, Guido van Rossum wrote:
>
> In order to save everyone's breath, I am *accepting* the proposal of PEP
> 479.

Excellent.

Chris, thank you for your time, effort, and thoroughness in championing this 
PEP.

--
~Ethan~





Re: [Python-Dev] Move selected documentation repos to PSF BitBucket account?

2014-11-23 Thread Ethan Furman
On 11/22/2014 11:13 PM, Donald Stufft wrote:
>> On Nov 23, 2014, at 1:49 AM, Nick Coghlan wrote:
>>
>> I took the git knowledge I acquired by necessity at Red Hat and
>> figured out how to apply it to hg. All the same features are there in
>> hg, they're just switched off by default (mainly because the core
>> Mercurial devs are adamant that any potentially history destroying
>> operation in a version control system must be opt-in).

If you could find the time to write up something about that I'm sure it would 
be helpful.  :)


>> We already have lots of potential contributors (if we didn't, review
>> bandwidth wouldn't be the bottleneck the way it is today), and the
>> functional differences between GitHub and BitBucket from a barrier to
>> entry perspective are so low as to be trivial.
> 
> That’s not really true. It’s more than just “can I log in”, potential
> contributors are more likely to already know how to use Github too and
> are more likely to not want to deal with another site. I know personally
> if I see a project is on bitbucket my chances of contributing to that
> project drop drastically, even if they are using git on bitbucket,
> just because I know that I’m going to get frustrated to some degree.

I feel the same way, only in reverse.  I've learned hg, and to a lesser extent 
bitbucket, but have not learned git nor
github, and would rather not (available bandwidth and all that).


>> Moving from self-hosted Mercurial repos to externally hosted Mercurial
>> repos is a low risk change. It reduces maintenance overhead and lowers
>> barriers to external contribution, both without alienating existing
>> contributors by forcing them to change their workflows.
>>
>> Proposing to *also* switch from Mercurial to git significantly
>> increases the cost of the change, while providing minimal incremental
>> benefit.

Whatever our personal feelings of hg vs git, and bitbucket vs github, that 
makes sense.

--
~Ethan~





Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-23 Thread Ethan Furman
On 11/22/2014 12:30 PM, Raymond Hettinger wrote:


Pre-PEP 479:
---
> def middleware_generator(source_generator):
>     it = source_generator()
>     input_value = next(it)
>     output_value = do_something_interesting(input_value)
>     yield output_value

Post-PEP 479:

> def middleware_generator(source_generator):
>     it = source_generator()
>     try:
>         input_value = next(it)
>     except StopIteration:
>         return # This causes StopIteration to be reraised
>     output_value = do_something_interesting(input_value)
>     yield output_value

While I am in favor of PEP 479, I have to agree with Raymond that this 
isn't pretty.

Currently, next() accepts an argument of what to return if the iterator is 
empty.  Can we enhance that in some way so
that the overall previous behavior could be retained?

Something like:

    def middleware_generator(source_generator):
        it = source_generator()

        input_value = next(it, gen_exit=True)  # or exc_type=GeneratorExit ?

        output_value = do_something_interesting(input_value)
        yield output_value

Then, if the iterator is empty, instead of raising StopIteration, or returning 
some value that would then have to be
checked, it could raise some other exception that is understood to be normal 
generator termination.
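
For comparison, a sketch of what the existing two-argument next() already allows, i.e. the "returning some value that would then have to be checked" variant (the sentinel name and the body of do_something_interesting are placeholders of mine):

```python
_MISSING = object()   # sentinel; cannot collide with a real value


def do_something_interesting(value):
    # Placeholder for the real processing step.
    return value * 2


def middleware_generator(source_generator):
    it = source_generator()
    input_value = next(it, _MISSING)   # default returned instead of StopIteration
    if input_value is _MISSING:
        return                          # normal generator termination
    output_value = do_something_interesting(input_value)
    yield output_value
```

This avoids the try/except, but the explicit check is still one more step than the pre-PEP 479 version.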

--
~Ethan~





Re: [Python-Dev] Move selected documentation repos to PSF BitBucket account?

2014-11-23 Thread Ethan Furman
On 11/23/2014 08:55 AM, Brett Cannon wrote:
>
> Sure, but I would never compare our infrastructure needs to Red Hat. =) You
> also have to be conservative in order to minimize downtime and impact for
> cost reasons. As an open source project we don't have those kinds of worry;
> we just have to worry about keeping everyone happy.

Minimizing downtime and impact is important for us, too.  The Python job board 
has now been down for nine months --
that's hardly good PR.

--
~Ethan~





Re: [Python-Dev] Move selected documentation repos to PSF BitBucket account?

2014-11-23 Thread Ethan Furman
On 11/23/2014 08:55 AM, Brett Cannon wrote:
>
> Fourth, do any core developers feel strongly about not using GitHub?

Does GitHub support hg?  If not, I am strongly opposed.

--
~Ethan~





Re: [Python-Dev] Move selected documentation repos to PSF BitBucket account?

2014-11-23 Thread Ethan Furman
On 11/23/2014 10:14 AM, Brett Cannon wrote:
> On Sun Nov 23 2014 at 1:08:58 PM Ethan Furman  wrote:
>>
>> Does GitHub support hg?  If not, I am strongly opposed.
>>
> 
> Depends on what you mean by "support". If you mean natively, then no. If
> you mean "I want more of a hg CLI" then you can get that with
> http://hg-git.github.io/ .

Well, if somebody documents it, I suppose I can follow along.  ;)


> And can I just say this is all bringing back "wonderful" flashbacks of the
> SourceForge to our own infrastructure move as well as the svn to hg move. =/

My apologies.  Change can be hard.

My concern is that we will end up with multiple different workflows depending 
on which part of Python we're working on,
which will lead to more time spent learning more about how to do it instead of 
doing it, and more time spent recovering
from errors because of the differences.

--
~Ethan~




