Re: [Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-08 Thread Paul Moore
On 7 August 2014 02:55, Antoine Pitrou  wrote:
> pathlib is generally concerned with filesystem operations written in Python,
> not arbitrary third-party tools. Also it is probably easy to append the
> trailing slash in your command-line invocation, if so desired.

I had a use case where I wanted to allow a config file to contain
"path: foo" to create a file called foo, and "path: foo/" to create a
directory. It was a shortcut for specifying an explicit "directory:
true" parameter as well.

The fact that pathlib stripped the slash made coding this mildly
tricky (especially as I wanted to cater for Windows users writing
"foo\\"...) It's not a showstopper, but I agree that semantically,
being able to distinguish whether an input had a trailing slash is
sometimes useful.
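
One way to cope with this, sketched here under stated assumptions, is to inspect the raw string for a trailing separator before handing it to pathlib. The helper name parse_path_entry and its windows flag are hypothetical and not part of pathlib.

```python
from pathlib import PurePosixPath, PureWindowsPath


def parse_path_entry(raw, windows=False):
    """Return (path, is_directory) for a raw config value.

    Hypothetical helper: the trailing separator is checked on the raw
    string, because pathlib drops it when the path object is built.
    """
    separators = ("/", "\\") if windows else ("/",)
    is_directory = raw.endswith(separators)
    path_class = PureWindowsPath if windows else PurePosixPath
    return path_class(raw), is_directory
```

With this, "foo" and "foo/" yield the same path object but a different is_directory flag, so the config shortcut can still be honoured.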

Paul
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib handling of trailing slash (Issue #21039)

2014-08-08 Thread Alexander Belopolsky
On Fri, Aug 8, 2014 at 8:27 AM, Paul Moore  wrote:

> I had a use case where I wanted to allow a config file to contain
> "path: foo" to create a file called foo, and "path: foo/" to create a
> directory. It was a shortcut for specifying an explicit "directory:
> true" parameter as well.
>

Here is my use case: I have a database application that can save a table in
a variety of formats based on the supplied file name.  For example,
save('t.csv', t) saves in CSV text format while save('t', t)  saves in the
default binary format.  In addition, it supports "splayed" format where a
table is saved in multiple files across a directory - one file per column.
 The native database save function chooses this format when file name ends
with a slash: save('t/', t).   I would like to make the save() function in
Python that works like this, but takes pathlib.Path instances instead of
str, but in the current version, I cannot supply 't/' as a Path instance.
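
A minimal sketch of the dispatch described above, assuming the format is chosen purely from the file name. The function name choose_save_format and the format labels are hypothetical; the real database API is not shown in this thread.

```python
from pathlib import PurePosixPath


def choose_save_format(name):
    """Pick a save format from a raw str file name (hypothetical rules)."""
    if name.endswith("/"):
        return "splayed"   # one file per column, inside a directory
    if name.endswith(".csv"):
        return "csv"
    return "binary"        # default format


# The trailing slash does not survive conversion to a path object,
# so the same name round-tripped through pathlib selects another format.
round_tripped = str(PurePosixPath("t/"))
```

Here choose_save_format("t/") selects the splayed format, while choose_save_format(round_tripped) falls back to the default, which is the problem being reported.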


[Python-Dev] Summary of Python tracker Issues

2014-08-08 Thread Python tracker

ACTIVITY SUMMARY (2014-08-01 - 2014-08-08)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    4602 (+10)
  closed 29340 (+43)
  total  33942 (+53)

Open issues with patches: 2177 


Issues opened (39)
==================

#21039: pathlib strips trailing slash
http://bugs.python.org/issue21039  reopened by pitrou

#21591: "exec(a, b, c)" not the same as "exec a in b, c" in nested fun
http://bugs.python.org/issue21591  reopened by Arfrever

#22121: IDLE should start with HOME as the initial working directory
http://bugs.python.org/issue22121  opened by mark

#22123: Provide a direct function for types.SimpleNamespace()
http://bugs.python.org/issue22123  opened by mark

#22125: Cure signedness warnings introduced by #22003
http://bugs.python.org/issue22125  opened by dw

#22126: mc68881 fpcr inline asm breaks clang -flto build
http://bugs.python.org/issue22126  opened by ivank

#22128: patch: steer people away from codecs.open
http://bugs.python.org/issue22128  opened by Frank.van.Dijk

#22131: uuid.bytes optimization
http://bugs.python.org/issue22131  opened by kevinlondon

#22133: IDLE: Set correct WM_CLASS on X11
http://bugs.python.org/issue22133  opened by sahutd

#22135: allow to break into pdb with Ctrl-C for all the commands that 
http://bugs.python.org/issue22135  opened by xdegaye

#22137: Test imaplib API on all methods specified in RFC 3501
http://bugs.python.org/issue22137  opened by zvyn

#22138: patch.object doesn't restore function defaults
http://bugs.python.org/issue22138  opened by chepner

#22139: python windows 2.7.8 64-bit wrong binary version
http://bugs.python.org/issue22139  opened by Andreas.Richter

#22140: "python-config --includes" returns a wrong path (double prefix
http://bugs.python.org/issue22140  opened by Michael.Dussere

#22141: rlcompleter.Completer matches too much
http://bugs.python.org/issue22141  opened by donlorenzo

#22143: rlcompleter.Completer has duplicate matches
http://bugs.python.org/issue22143  opened by donlorenzo

#22144: ellipsis needs better display in lexer documentation
http://bugs.python.org/issue22144  opened by François-René.Rideau

#22145: <> in parser spec but not lexer spec
http://bugs.python.org/issue22145  opened by François-René.Rideau

#22147: PosixPath() constructor should not accept strings with embedde
http://bugs.python.org/issue22147  opened by ischwabacher

#22148: frozen.c should #include <importlib.h> instead of "importlib.h
http://bugs.python.org/issue22148  opened by jbeck

#22149: the frame of a suspended generator should not have a local tra
http://bugs.python.org/issue22149  opened by xdegaye

#22150: deprecated-removed directive is broken in Sphinx 1.2.2
http://bugs.python.org/issue22150  opened by berker.peksag

#22153: There is no standard TestCase.runTest implementation
http://bugs.python.org/issue22153  opened by vadmium

#22154: ZipFile.open context manager support
http://bugs.python.org/issue22154  opened by Ralph.Broenink

#22155: Out of date code example for tkinter's createfilehandler
http://bugs.python.org/issue22155  opened by vadmium

#22156: Fix compiler warnings
http://bugs.python.org/issue22156  opened by haypo

#22157: FAIL: test_with_pip (test.test_venv.EnsurePipTest)
http://bugs.python.org/issue22157  opened by snehal

#22158: RFC 6531 (SMTPUTF8) support in smtpd.PureProxy
http://bugs.python.org/issue22158  opened by zvyn

#22159: smtpd.PureProxy and smtpd.DebuggingServer do not work with dec
http://bugs.python.org/issue22159  opened by zvyn

#22160: Windows installers need to be updated following OpenSSL securi
http://bugs.python.org/issue22160  opened by alex

#22161: Remove unsupported code from ctypes
http://bugs.python.org/issue22161  opened by serhiy.storchaka

#22163: max_wbits set incorrectly to -zlib.MAX_WBITS in tarfile, shoul
http://bugs.python.org/issue22163  opened by edulix

#22164: cell object cleared too early?
http://bugs.python.org/issue22164  opened by pitrou

#22165: Empty response from http.server when directory listing contain
http://bugs.python.org/issue22165  opened by jleedev

#22166: test_codecs "leaking" references
http://bugs.python.org/issue22166  opened by zach.ware

#22167: iglob() has misleading documentation (does indeed store names 
http://bugs.python.org/issue22167  opened by roysmith

#22168: Turtle Graphics RawTurtle problem
http://bugs.python.org/issue22168  opened by Kent.D..Lee

#22171: stack smash when using ctypes/libffi to access union
http://bugs.python.org/issue22171  opened by wes.kerfoot

#22173: Update lib2to3.tests and test_lib2to3 to use test discovery
http://bugs.python.org/issue22173  opened by zach.ware



Most recent 15 issues with no replies (15)
==========================================

#22173: Update lib2to3.tests and test_lib2to3 to use test discovery
http://bugs.python.org/issue22173

#22171: stack smash when using ctypes/libffi t

Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Chris Barker
On Thu, Aug 7, 2014 at 4:01 PM, Ethan Furman  wrote:

> I don't remember where, but I believe that cPython has an optimization
> built in for repeated string concatenation, which is probably why you
> aren't seeing big differences between the + and the sum().
>

Indeed -- clearly so.

> A little testing shows how to defeat that optimization:
>
>   blah = ''
>   for string in ['booyah'] * 10:
>       blah = string + blah
>
> Note the reversed order of the addition.
>
>

thanks -- cool trick.

> Oh, and the join() timings:
> --> timeit.Timer("blah = ''.join(['booya'] * 10)", "blah = ''").repeat(3, 1)
> [0.0014629364013671875, 0.0014190673828125, 0.0011930465698242188]
> So, + is three orders of magnitude slower than join.


Only one if you use the optimized form of +, and not even that if you
need to build up the list first, which is the common use case.

So my final question is this:

Repeated string concatenation is not the "recommended" way to do this --
but nevertheless, cPython has an optimization that makes it fast and
efficient, to the point that there is no practical performance reason to
prefer appending to a list and calling join() afterward.

So why not apply a similar optimization to sum() for strings?
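
For anyone who wants to reproduce the comparison, here is a rough sketch of the three approaches discussed in this thread. The function names, the list size and the repeat count are arbitrary choices for illustration, and the timings will vary by machine and interpreter.

```python
import timeit

PARTS = ['booyah'] * 1000  # size chosen only for illustration


def concat_append(parts):
    """result = result + s: the order CPython may be able to optimize."""
    result = ''
    for s in parts:
        result = result + s
    return result


def concat_prepend(parts):
    """result = s + result: the reversed order mentioned above."""
    result = ''
    for s in parts:
        result = s + result
    return result


def concat_join(parts):
    """The usually recommended form."""
    return ''.join(parts)


if __name__ == "__main__":
    for fn in (concat_append, concat_prepend, concat_join):
        elapsed = timeit.timeit(lambda: fn(PARTS), number=100)
        print(f"{fn.__name__}: {elapsed:.4f}s")
```

All three build the same string here because every element is identical; only the cost differs.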

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Ethan Furman

On 08/08/2014 08:23 AM, Chris Barker wrote:


> So my final question is this:
>
> repeated string concatenation is not the "recommended" way to do this --
> but nevertheless, cPython has an optimization that makes it fast and
> efficient, to the point that there is no practical performance reason to
> prefer appending to a list and calling join() afterward.
>
> So why not apply a similar optimization to sum() for strings?


That I cannot answer -- I find the current situation with sum highly irritating.

--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Raymond Hettinger

On Aug 8, 2014, at 11:09 AM, Ethan Furman  wrote:

>> So why not apply a similar optimization to sum() for strings?
> 
> That I cannot answer -- I find the current situation with sum highly 
> irritating.
> 

It is only irritating if you are misusing sum().

The str.__add__ optimization was put in because
it was common for people to accidentally incur
the performance penalty.

With sum(), we don't seem to have that problem
(I don't see people using it to add lists except
just to show that it could be done).


Raymond




Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Ethan Furman

On 08/08/2014 05:34 PM, Raymond Hettinger wrote:


> On Aug 8, 2014, at 11:09 AM, Ethan Furman wrote:
>
>>> So why not apply a similar optimization to sum() for strings?
>>
>> That I cannot answer -- I find the current situation with sum highly
>> irritating.
>
> It is only irritating if you are misusing sum().


Actually, I have an advanced degree in irritability -- perhaps you've noticed
in the past?

I don't use sum at all, or at least very rarely, and it still irritates me.
It feels like I'm being told I'm too dumb to figure out when I can safely use
sum and when I can't.


--
~Ethan~


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Alexander Belopolsky
On Fri, Aug 8, 2014 at 8:56 PM, Ethan Furman  wrote:

> I don't use sum at all, or at least very rarely, and it still irritates me.


You are not alone.  When I see sum([a, b, c]), I think it is a + b + c, but
in Python it is 0 + a + b + c.  If we had a "join" operator for strings
that is different from + - then sure, I would not try to use sum to join
strings, but we don't.  I have always thought that sum(x) is just a
shorthand for reduce(operator.add, x), but again it is not so in Python.
While "sum should only be used for numbers," it turns out it is not a
good choice for floats - use math.fsum.  While "strings are blocked because
sum is slow," numpy arrays with millions of elements are not.  And try to
explain to someone that sum(x) is bad on a numpy array, but abs(x) is fine.
Why have builtin sum at all if its use comes with so many caveats?
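
The difference between a + b + c and 0 + a + b + c can be seen with a small sketch using lists. The variable names are illustrative only.

```python
import operator
from functools import reduce

a, b, c = [1], [2], [3]

# reduce without an initial value computes a + b + c.
reduced = reduce(operator.add, [a, b, c])

# sum() starts from 0 by default, so it attempts 0 + a and fails for lists.
try:
    sum([a, b, c])
except TypeError:
    default_start_failed = True
else:
    default_start_failed = False

# Passing an explicit start value makes sum() agree with the reduce call.
explicit_start = sum([a, b, c], [])
```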


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Steven D'Aprano
On Fri, Aug 08, 2014 at 10:20:37PM -0400, Alexander Belopolsky wrote:
> On Fri, Aug 8, 2014 at 8:56 PM, Ethan Furman  wrote:
> 
> > I don't use sum at all, or at least very rarely, and it still irritates me.
> 
> 
> You are not alone.  When I see sum([a, b, c]), I think it is a + b + c, but
> in Python it is 0 + a + b + c.  If we had a "join" operator for strings
> that is different from + - then sure, I would not try to use sum to join
> strings, but we don't.

I've long believed that + is the wrong operator for concatenating 
strings, and that & makes a much better operator. We wouldn't be having 
these interminable arguments about using sum() to concatenate strings 
(and lists, and tuples) if the & operator was used for concatenation and 
+ was only used for numeric addition.


> I have always thought that sum(x) is just a
> shorthand for reduce(operator.add, x), but again it is not so in Python.

The signature of reduce is:

    reduce(...)
        reduce(function, sequence[, initial]) -> value

so sum() is (at least conceptually) a shorthand for reduce:

    def sum(values, initial=0):
        return reduce(operator.add, values, initial)

but that's an implementation detail, not a language promise, and sum() 
is free to differ from that simple version. Indeed, even the public 
interface is different, since sum() prohibits using a string as the 
initial value and only promises to work with numbers. The fact that it 
happens to work with lists and tuples is somewhat of an accident of 
implementation.
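
To make the difference in public interface concrete, here is a small sketch. reduce_sum is a hypothetical name for the conceptual version, kept separate so that it does not shadow the builtin.

```python
import operator
from functools import reduce


def reduce_sum(values, initial=0):
    """Conceptual model of sum(); the builtin is free to differ."""
    return reduce(operator.add, values, initial)


def builtin_sum_rejects_str_start():
    """Return True if the builtin refuses a string as the initial value."""
    try:
        sum(["a", "b"], "")
    except TypeError:
        return True
    return False
```

The reduce-based version happily concatenates strings, whereas the builtin raises TypeError for a string start value even though it accepts a tuple or list start.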


> While "sum should only be used for numbers,"  it turns out it is not a
> good choice for floats - use math.fsum.

Correct. And if you (generic you, not you personally) do not understand 
why simple-minded addition of floats is troublesome, then you're going 
to have a world of trouble. Anyone who is disturbed by the question of 
"should I use sum or math.fsum?" probably shouldn't be writing serious 
floating point code at all. Floating point computations are hard, and 
there is simply no escaping this fact.
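
As a small illustration of the difference, here is a sketch comparing plain left-to-right addition with math.fsum. The loop is written out explicitly (naive_sum is a hypothetical name) because newer CPython releases use a compensated algorithm inside the builtin sum() for floats, so the builtin may not show the error.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating point addition."""
    total = 0.0
    for value in values:
        total += value
    return total


tenths = [0.1] * 10
naive_total = naive_sum(tenths)   # accumulates rounding error
exact_total = math.fsum(tenths)   # tracks the lost low-order bits
```

Ten additions of 0.1 already drift away from 1.0 in the naive loop, while math.fsum returns exactly 1.0.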


> While "strings are blocked because
> sum is slow," numpy arrays with millions of elements are not.

That's not a good example. Strings are potentially O(N**2), which means 
not just "slow" but *agonisingly* slow, as in taking a week -- no 
exaggeration -- to concat a million strings. If it takes a microsecond to 
concat two strings, then 1e6**2 such concatenations could take over 
eleven days. Slowness of such magnitude might as well be "the process 
has locked up".
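
The arithmetic behind an estimate of this kind is easy to check. The sketch below assumes the simple model of N**2 steps at a fixed cost per step; quadratic_cost_days is a hypothetical helper name. Under that model the eleven-day figure corresponds to roughly one microsecond per step, and one nanosecond per step would come to about seventeen minutes.

```python
SECONDS_PER_DAY = 86400


def quadratic_cost_days(n, seconds_per_step):
    """Days needed for n**2 steps at a fixed cost per step."""
    return n ** 2 * seconds_per_step / SECONDS_PER_DAY


days_at_one_microsecond = quadratic_cost_days(1_000_000, 1e-6)
days_at_one_nanosecond = quadratic_cost_days(1_000_000, 1e-9)
minutes_at_one_nanosecond = days_at_one_nanosecond * 24 * 60
```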

In comparison, summing a numpy array with a million entries is not 
really slow in that sense. The time taken is proportional to the number 
of entries, and differs from summing a list only by a constant factor.

Besides, in the case of strings it is quite simple to decide "is the 
initial value a string?", whereas with lists or numpy arrays it's quite 
hard to decide "is the list or array so huge that the user will consider 
this too slow?". What counts as "too slow" depends on the machine it is 
running on, what other processes are running, and the user's mood, and 
leads to the silly result that summing an array of N items succeeds but 
N+1 items doesn't. So in the case of strings, it is easy to make a
blanket prohibition, but in the case of lists or arrays, there is no 
reasonable place to draw the line.


> And try to
> explain to someone that sum(x) is bad on a numpy array, but abs(x) is fine.

I think that's because sum() has to box up each and every element in the 
array into an object, which is wasteful, while abs() can delegate to a 
specialist array.__abs__ method. Although that's not something beginners 
should be expected to understand, no serious Python programmer should be 
confused by this. As programmers, we should expect to have some 
understanding of our tools, how they work, their limitations, and when 
to use a different tool. That's why numpy has its own version of sum 
which is designed to work specifically on numpy arrays. Use a specialist 
tool for a specialist job:

py> with Stopwatch():
...     sum(carray)  # carray is a numpy array of 7500 floats.
...
11250.0
time taken: 52.659770 seconds
py> with Stopwatch():
...     numpy.sum(carray)
...
11250.0
time taken: 0.161263 seconds


>  Why have builtin sum at all if its use comes with so many caveats?

Because sum() is a perfectly reasonable general purpose tool for adding 
up small amounts of numbers where high floating point precision is not 
required. It has been included as a built-in because Python comes with 
"batteries included", and a basic function for adding up a few numbers 
is an obvious, simple battery. But serious programmers should be 
comfortable with the idea that you use the right tool for the right job.

If you visit a hardware store, you will find that even something as 
simple as the hammer exists in many specialist varieties. There are tack 
hammers, claw hammers, framing hammers, lump hammers, rubber and wooden 
mallets, "brass" non-sparking 

Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Greg Ewing

Steven D'Aprano wrote:
> I've long believed that + is the wrong operator for concatenating
> strings, and that & makes a much better operator.


Do you have a reason for preferring '&' in particular, or
do you just want something different from '+'?

Personally I can't see why "bitwise and" on strings should
be a better metaphor for concatenation than "addition". :-)

--
Greg



Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Antoine Pitrou

On 09/08/2014 01:08, Steven D'Aprano wrote:
> On Fri, Aug 08, 2014 at 10:20:37PM -0400, Alexander Belopolsky wrote:
>> On Fri, Aug 8, 2014 at 8:56 PM, Ethan Furman wrote:
>>
>>> I don't use sum at all, or at least very rarely, and it still irritates me.
>>
>> You are not alone.  When I see sum([a, b, c]), I think it is a + b + c, but
>> in Python it is 0 + a + b + c.  If we had a "join" operator for strings
>> that is different from + - then sure, I would not try to use sum to join
>> strings, but we don't.
>
> I've long believed that + is the wrong operator for concatenating
> strings, and that & makes a much better operator. We wouldn't be having
> these interminable arguments about using sum() to concatenate strings
> (and lists, and tuples) if the & operator was used for concatenation and
> + was only used for numeric addition.


Come on. These arguments are interminable because many people (including 
you) love feeding interminable arguments. No need to blame Python for that.


And for that matter, this interminable discussion should probably have 
taken place on python-ideas or even python-list.


Regards

Antoine.

