[Python-Dev] Use of objdump within ctypes _get_soname()

2018-10-09 Thread Ray Donnelly
Hi,

We ran into an issue on the Anaconda Distribution recently where we
added libarchive-c to conda-build (so we can un-compress more source
archive formats than tarfile supports) and everything was good for a few
hours, until it hit various CI systems where objdump is not installed.

I was a bit surprised by this dependency and wondered if there'd be
interest in a fallback path that inspects the elf with some fairly
simple python code to determine the soname instead? I have code that
works already - though it could do with a tidy up - and has been
tested on a variety of architectures. Would CPython be interested in
an attempt to upstream this?
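
For illustration, here is a minimal sketch of what such a fallback could
look like. It is not the code mentioned above and not what ctypes does;
the name read_soname is hypothetical. It assumes the file still carries
its section headers, finds the SHT_DYNAMIC section, and looks up the
DT_SONAME entry in the string table that section links to.

```python
import struct

SHT_DYNAMIC = 6   # section type of .dynamic
DT_NULL = 0       # terminates the dynamic array
DT_SONAME = 14    # d_val is an offset into the linked string table


def read_soname(data):
    """Return the DT_SONAME of an ELF image given as bytes, or None."""
    if data[:4] != b"\x7fELF":
        return None
    is64 = data[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is64:
        ehdr_fmt, shdr_fmt, dyn_fmt = "HHIQQQIHHHHHH", "IIQQQQIIQQ", "qQ"
    else:
        ehdr_fmt, shdr_fmt, dyn_fmt = "HHIIIIIHHHHHH", "IIIIIIIIII", "iI"

    # The fixed part of the ELF header starts right after the 16-byte e_ident.
    ehdr = struct.unpack_from(end + ehdr_fmt, data, 16)
    e_shoff, e_shentsize, e_shnum = ehdr[5], ehdr[10], ehdr[11]

    sections = [
        struct.unpack_from(end + shdr_fmt, data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    step = struct.calcsize(end + dyn_fmt)
    for sh in sections:
        if sh[1] != SHT_DYNAMIC:         # sh_type
            continue
        dyn_off, dyn_size, link = sh[4], sh[5], sh[6]
        str_off = sections[link][4]      # sh_offset of the string table
        for pos in range(dyn_off, dyn_off + dyn_size, step):
            tag, val = struct.unpack_from(end + dyn_fmt, data, pos)
            if tag == DT_NULL:
                break
            if tag == DT_SONAME:
                start = str_off + val
                return data[start:data.index(b"\0", start)].decode()
    return None
```

A library stripped of its section headers would need the program-header
route (PT_DYNAMIC plus an address-to-offset mapping) instead, which this
sketch leaves out.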

Is it documented anywhere that objdump is needed to load some
extension modules on Linux?


Best regards,

Ray Donnelly,
Anaconda Inc,
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use of objdump within ctypes _get_soname()

2018-10-09 Thread Gregory P. Smith
On Mon, Oct 8, 2018 at 11:59 PM Ray Donnelly 
wrote:

> Hi,
>
> We ran into an issue on the Anaconda Distribution recently where we
> added libarchive-c to conda-build (so we can un-compress more source
> archive formats than tarfile supports) and everything was good for a few
> hours, until it hit various CI systems where objdump is not installed.
>
> I was a bit surprised by this dependency and wondered if there'd be
> interest in a fallback path that inspects the elf with some fairly
> simple python code to determine the soname instead? I have code that
> works already - though it could do with a tidy up - and has been
> tested on a variety of architectures. Would CPython be interested in
> an attempt to upstream this?
>
> Is it documented anywhere that objdump is needed to load some
> extension modules on Linux?
>

Wow, that looks like gross code buried within ctypes.util (which
libarchive-c uses) that calls various platforms' versions of objdump or
equivalent.  File a bugs.python.org issue and submit a PR; native ELF
header reading code for this makes sense.
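
As a rough sketch of the failure mode (an approximation for
illustration, not the actual ctypes.util source), an objdump-based
lookup shells out to the tool and scrapes the SONAME line from its
output. The helper name soname_via_objdump and its tool parameter are
hypothetical; when the binary is not on PATH there is simply no answer.

```python
import re
import shutil
import subprocess


def soname_via_objdump(path, tool="objdump"):
    """Return the SONAME reported by objdump for path, or None."""
    exe = shutil.which(tool)
    if exe is None:
        # This is the situation on the CI systems described in the thread.
        return None
    proc = subprocess.run(
        [exe, "-p", "-j", ".dynamic", path],
        capture_output=True,
        text=True,
    )
    match = re.search(r"\sSONAME\s+(\S+)", proc.stdout)
    return match.group(1) if match else None
```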

-G


>
>
> Best regards,
>
> Ray Donnelly,
> Anaconda Inc,


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Jeff Hardy
On Sun, Oct 7, 2018 at 3:45 PM Terry Reedy  wrote:
>
> On 10/7/2018 1:34 PM, Chris Barker via Python-Dev wrote:
> > On Fri, Oct 5, 2018 at 3:01 PM Brett Cannon wrote:
> >
> > I'm also fine with saying that keys in **kwargs that are not proper
> > identifiers is an implementation detail.
> >
> >
> > It's not just **kwargs -- you can also use arbitrary names with
> > setattr() / getattr() :
> >
> > In [6]: setattr(foo, "4 not an identifier", "this works")
> >
> > In [7]: getattr(foo, "4 not an identifier")
> > Out[7]: 'this works'
>
> When this behavior of set/getattr was discussed a decade or so ago,
> Guido said not to disable it, but I believe he said it should not be
> considered a language feature.  There are other situations where CPython
> is 'looser' than the spec.

From an alternative implementation point of view, CPython's behaviour
*is* the spec. Practicality beats purity and all that.

- Jeff


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Guido van Rossum
My feeling is that limiting it to strings is fine, but checking those
strings for resembling identifiers is pointless and wasteful.
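
A minimal sketch of the distinction, as CPython behaves today (the
function name collect is made up for the example): keys passed through
** must be strings, but they are not checked for being identifiers.

```python
def collect(**kwargs):
    """Return the keyword arguments exactly as received."""
    return kwargs


# A string key that is not an identifier is accepted.
accepted = collect(**{"4 not an identifier": "this works"})

# A non-string key is rejected at call time.
try:
    collect(**{4: "not a string key"})
except TypeError as exc:
    rejected = exc
else:
    rejected = None
```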

On Tue, Oct 9, 2018 at 9:40 AM Jeff Hardy  wrote:

> On Sun, Oct 7, 2018 at 3:45 PM Terry Reedy  wrote:
> >
> > On 10/7/2018 1:34 PM, Chris Barker via Python-Dev wrote:
> > > On Fri, Oct 5, 2018 at 3:01 PM Brett Cannon wrote:
> > >
> > > I'm also fine with saying that keys in **kwargs that are not proper
> > > identifiers is an implementation detail.
> > >
> > >
> > > It's not just **kwargs -- you can also use arbitrary names with
> > > setattr() / getattr() :
> > >
> > > In [6]: setattr(foo, "4 not an identifier", "this works")
> > >
> > > In [7]: getattr(foo, "4 not an identifier")
> > > Out[7]: 'this works'
> >
> > When this behavior of set/getattr was discussed a decade or so ago,
> > Guido said not to disable it, but I believe he said it should not be
> > considered a language feature.  There are other situations where CPython
> > is 'looser' than the spec.
>
> From an alternative implementation point of view, CPython's behaviour
> *is* the spec. Practicality beats purity and all that.
>
> - Jeff
-- 
--Guido (mobile)


Re: [Python-Dev] Python startup time

2018-10-09 Thread Gregory Szorc
On 5/1/2018 8:26 PM, Gregory Szorc wrote:
> On 7/19/2017 12:15 PM, Larry Hastings wrote:
>>
>>
>> On 07/19/2017 05:59 AM, Victor Stinner wrote:
>>> Mercurial startup time is already 45.8x slower than Git whereas tested
>>> Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial
>>> developers, with a startup time 2x - 3x slower...
>>
>> When Matt Mackall spoke at the Python Language Summit some years back, I
>> recall that he specifically complained about Python startup time.  He
>> said Python 3 "didn't solve any problems for [them]"--they'd already
>> solved their Unicode hygiene problems--and that Python's slow startup
>> time was already a big problem for them.  Python 3 being /even slower/
>> to start was absolutely one of the reasons why they didn't want to upgrade.
>>
>> You might think "what's a few milliseconds matter".  But if you run
>> hundreds of commands in a shell script it adds up.  git's speed is one
>> of the few bright spots in its UX, and hg's comparative slowness here is
>> a palpable disadvantage.
>>
>>
>>> So please continue efforts to make Python startup even faster to beat
>>> all other programming languages, and finally convince Mercurial to
>>> upgrade ;-)
>>
>> I believe Mercurial is, finally, slowly porting to Python 3.
>>
>> https://www.mercurial-scm.org/wiki/Python3
>>
>> Nevertheless, I can't really be annoyed or upset at them moving slowly
>> to adopt Python 3, as Matt's objections were entirely legitimate.
> 
> I just now found this thread when searching the archive for
> threads about startup time. And I was searching for threads about
> startup time because Mercurial's startup time has been getting slower
> over the past few months and this is causing substantial pain.
> 
> As I posted back in 2014 [1], CPython's startup overhead was >10% of the
> total CPU time in Mercurial's test suite. And when you factor in the
> time to import modules that get Mercurial to a point where it can run
> commands, it was more like 30%!
> 
> Mercurial's full test suite currently runs `hg` ~25,000 times. Using
> Victor's startup time numbers of 6.4ms for 2.7 and 14.5ms for
> 3.7/master, Python startup overhead contributes ~160s on 2.7 and ~360s
> on 3.7/master. Even if you divide this by the number of available CPU
> cores, we're talking dozens of seconds of wall time just waiting for
> CPython to get to a place where Mercurial's first bytecode can execute.
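
The arithmetic behind those figures can be checked with a few lines.
The invocation count and per-start timings are the ones quoted above;
the core count of 8 is purely an assumption for the wall-time estimate,
and the function name is hypothetical.

```python
INVOCATIONS = 25_000                             # `hg` runs in the test suite
STARTUP_MS = {"2.7": 6.4, "3.7/master": 14.5}    # per-process startup time
ASSUMED_CORES = 8                                # assumption, not from the thread


def startup_overhead_seconds(invocations, startup_ms):
    """Total CPU seconds spent on interpreter startup alone."""
    return invocations * startup_ms / 1000.0


for version, ms in STARTUP_MS.items():
    cpu = startup_overhead_seconds(INVOCATIONS, ms)
    wall = cpu / ASSUMED_CORES
    print(f"{version}: {cpu:.1f}s CPU, about {wall:.1f}s wall on {ASSUMED_CORES} cores")
```

That gives 160 s and 362.5 s of CPU time, consistent with the rounded
~160s and ~360s above.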
> 
> And the problem is worse when you factor in the time it takes to import
> Mercurial's own modules.
> 
> As a concrete example, I recently landed a Mercurial patch [2] that
> stubs out zope.interface to prevent the import of 9 modules on every
> `hg` invocation. This "only" saved ~6.94ms for a typical `hg`
> invocation. But this decreased the CPU time required to run the test
> suite on my i7-6700K from ~4450s to ~3980s (~89.5% of original) - a
> reduction of almost 8 minutes of CPU time (and over 1 minute of wall time)!
> 
> By the time CPython gets Mercurial to a point where we can run useful
> code, we've already blown most of or past the time budget where humans
> perceive an action/command as instantaneous. If you ignore startup
> overhead, Mercurial's performance compares quite well to Git's for many
> operations. But the reality is that CPython startup overhead makes it
> look like Mercurial is non-instantaneous before Mercurial even has the
> opportunity to execute meaningful code!
> 
> Mercurial provides a `chg` program that essentially spins up a daemon
> `hg` process running a "command server" so the `chg` program [written in
> C - no startup overhead] can dispatch commands to an already-running
> Python/`hg` process and avoid paying the startup overhead cost. When you
> run Mercurial's test suite using `chg`, it completes *minutes* faster.
> `chg` exists mainly as a workaround for slow startup overhead.
> 
> Changing gears, my day job is maintaining Firefox's build system. We use
> Python heavily in the build system. And again, Python startup overhead
> is problematic. I don't have numbers offhand, but we invoke likely a few
> hundred Python processes as part of building Firefox. It should be
> several thousand. But, we've had to "hack" parts of the build system to
> "batch" certain build actions in single process invocations in order to
> avoid Python startup overhead. This undermines the ability of some build
> tools to formulate a reasonable understanding of the DAG and it causes a
> bit of pain for build system developers and makes it difficult to
> achieve "no-op" and fast incremental builds because we're always
> invoking certain Python processes because we've had to move DAG
> awareness out of the build backend and into Python. At some point, we'll
> likely replace Python code with Rust so the build system is more "pure"
> and easier to maintain and reason about.
> 
> I've seen posts in this thread and elsewhere in the CPython development
> universe that challenge whether milliseconds in 

Re: [Python-Dev] Python startup time

2018-10-09 Thread Antoine Pitrou


Hi,

On Tue, 9 Oct 2018 14:02:02 -0700
Gregory Szorc  wrote:
> 
> Python 3.7 doesn't exhibit as much of a problem. But it is still there.
> A brief audit of the importer code and call stacks confirms it is the
> same problem - just less prevalent. Wall time execution of the test
> harness from Python 2.7 to Python 3.7 drops from ~37:43s to ~20:39.
> Overall kernel CPU time drops from ~75% to ~19%. And that wall time
> improvement is despite Python 3's slower process startup. So locking in
> the kernel is really a killer on Python 2.7.

Thanks for the detailed feedback.

> I hope someone finds this information useful to further improving
> [startup] performance. (And given that Python 3.7 is substantially
> faster by avoiding excessive readdir(), I wouldn't be surprised if this
> problem is already known!)

The macOS problem wasn't known, but the general problem of filesystem
calls was (in relation with e.g. networked filesystems).

Significant work went into improving Python 3 in that regard after the
import mechanism was rewritten in pure Python.  Nowadays Python caches
the contents of all sys.path directories, so (once the cache is primed)
it's mostly a single stat() call per directory to check whether the
cache is up-to-date.  This is not entirely free, but massively better
than what Python 2 did, which was to stat() many filename patterns in
each sys.path directory.
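
A small sketch of that caching using the public
importlib.machinery.FileFinder API (the function name demo_finder_cache
and the module name late_module are made up for the example). The
finder lists the directory once and reuses the listing;
invalidate_caches() forces a re-read, which is why code that creates
modules on the fly is advised to call it.

```python
import importlib.machinery as machinery
import os
import tempfile


def demo_finder_cache():
    """Return the specs found before and after a module file appears."""
    with tempfile.TemporaryDirectory() as tmp:
        loader_details = (machinery.SourceFileLoader, machinery.SOURCE_SUFFIXES)
        finder = machinery.FileFinder(tmp, loader_details)

        # First lookup lists the (empty) directory and primes the cache.
        before = finder.find_spec("late_module")

        with open(os.path.join(tmp, "late_module.py"), "w") as fh:
            fh.write("VALUE = 1\n")

        # Without this, a coarse directory mtime could hide the new file.
        finder.invalidate_caches()
        after = finder.find_spec("late_module")
        return before, after


before, after = demo_finder_cache()
```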

(of course, the fact that Python 3 imports many more modules at startup
mitigates the end result a bit)


As a sidenote, I was always shocked by how the Mercurial test suite
was architected.  You're wasting so much time launching processes that
I wonder why you kept it that way for so long :-)

Regards

Antoine.




Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Steven D'Aprano
On Tue, Oct 09, 2018 at 09:37:48AM -0700, Jeff Hardy wrote:

> > When this behavior of set/getattr was discussed a decade or so ago,
> > Guido said not to disable it, but I believe he said it should not be
> > considered a language feature.  There are other situations where CPython
> > is 'looser' than the spec.
> 
> From an alternative implementation point of view, CPython's behaviour
> *is* the spec. Practicality beats purity and all that.

Are you speaking on behalf of all authors of alternate implementations, 
or even of some of them?

It certainly is not true that CPython's behaviour "is" the spec. PyPy 
keeps a list of CPython behaviour they don't match, either because they 
choose not to for other reasons, or because they believe that the 
CPython behaviour is buggy. I daresay IronPython and Jython have 
similar.

And this especially applies when CPython explicitly states that certain 
behaviour is implementation-dependent and could change in the future.


-- 
Steve


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Steven D'Aprano
On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
> My feeling is that limiting it to strings is fine, but checking those
> strings for resembling identifiers is pointless and wasteful.

Sure. The question is, do we have to support uses where people 
intentionally smuggle non-identifier strings as keys via **kwargs?

I'm not saying we need to guard against it, only asking if we need to 
officially support it. The discussion on Python-Ideas is (partly) about 
making this a language feature.


-- 
Steve


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Barry Warsaw
On Oct 9, 2018, at 16:21, Steven D'Aprano  wrote:
> 
> On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
>> My feeling is that limiting it to strings is fine, but checking those
>> strings for resembling identifiers is pointless and wasteful.
> 
> Sure. The question is, do we have to support uses where people
> intentionally smuggle non-identifier strings as keys via **kwargs?

I would not be in favor of that.  I think it doesn’t make sense to be able to 
smuggle those in via **kwargs when it’s not supported by Python’s 
grammar/syntax.

-Barry





Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Guido van Rossum
On Tue, Oct 9, 2018 at 5:17 PM Barry Warsaw  wrote:

> On Oct 9, 2018, at 16:21, Steven D'Aprano  wrote:
> >
> > On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
> >> My feeling is that limiting it to strings is fine, but checking those
> >> strings for resembling identifiers is pointless and wasteful.
> >
> > Sure. The question is, do we have to support uses where people
> > intentionally smuggle non-identifier strings as keys via **kwargs?
>
> I would not be in favor of that.  I think it doesn’t make sense to be able
> to smuggle those in via **kwargs when it’s not supported by Python’s
> grammar/syntax.
>

Well, it currently works in all Python implementations (definitely in
CPython, and presumably in PyPy and Jython because they tend to follow
CPython carefully). The less the spec leaves undefined the better, IMO, and
I fully expect we'll be breaking code that is doing this. So we might as
well make it the law.

For example, in some code bases it's a pretty common pattern to pass dicts
around using **kwds several levels deep, with no intention to unpack it
into individual keyword arguments -- the caller sends a dict, and the
receiver accepts a dict and does dict-y things to it. Sure, they probably
shouldn't be abusing **kwds, but they are, and I can't really blame them --
possibly this code evolved from a situation that did use keyword args.
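
A sketch of that pattern, with made-up function names: the dict is
relayed through several layers via **kwds and only the last one does
anything with it. Each call receives its own fresh dict, so the
receiver's changes do not leak back into the caller's mapping.

```python
def store(**kwds):
    """Final receiver: treats kwds as a plain dict."""
    kwds["seen"] = True
    return kwds


def relay(**kwds):
    """Middle layer: passes everything along untouched."""
    return store(**kwds)


def entry(**kwds):
    """Outer layer: same again, one level up."""
    return relay(**kwds)


headers = {"content-type": "text/plain", "x-request id": "42"}
result = entry(**headers)
```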

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Benjamin Peterson


On Tue, Oct 9, 2018, at 17:14, Barry Warsaw wrote:
> On Oct 9, 2018, at 16:21, Steven D'Aprano  wrote:
> > 
> > On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
> >> My feeling is that limiting it to strings is fine, but checking those
> >> strings for resembling identifiers is pointless and wasteful.
> > 
> > Sure. The question is, do we have to support uses where people
> > intentionally smuggle non-identifier strings as keys via **kwargs?
> 
> I would not be in favor of that.  I think it doesn’t make sense to be 
> able to smuggle those in via **kwargs when it’s not supported by 
> Python’s grammar/syntax.

Can anyone think of a situation where it would be advantageous for an 
implementation to reject non-identifier string kwargs? I can't.

I agree with Guido—banning it would be too much trouble for no benefit.


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Chris Jerdonek
On Tue, Oct 9, 2018 at 7:13 PM Benjamin Peterson  wrote:
> On Tue, Oct 9, 2018, at 17:14, Barry Warsaw wrote:
> > On Oct 9, 2018, at 16:21, Steven D'Aprano  wrote:
> > >
> > > On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
> > >> My feeling is that limiting it to strings is fine, but checking those
> > >> strings for resembling identifiers is pointless and wasteful.
> > >
> > > Sure. The question is, do we have to support uses where people
> > > intentionally smuggle non-identifier strings as keys via **kwargs?
> >
> > I would not be in favor of that.  I think it doesn’t make sense to be
> > able to smuggle those in via **kwargs when it’s not supported by
> > Python’s grammar/syntax.
>
> Can anyone think of a situation where it would be advantageous for an 
> implementation to reject non-identifier string kwargs? I can't.

One possibility is that it could foreclose certain security bugs from
happening. For example, if someone has an API that accepts **kwargs,
they might have the mistaken assumption that the keys are identifiers
without special characters like ";" etc, and so they could make the
mistake of thinking they don't need to escape / sanitize them.

--Chris

>
> I agree with Guido—banning it would be too much trouble for no benefit.


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Guido van Rossum
On Tue, Oct 9, 2018 at 7:49 PM Chris Jerdonek 
wrote:

> On Tue, Oct 9, 2018 at 7:13 PM Benjamin Peterson 
> wrote:
> > On Tue, Oct 9, 2018, at 17:14, Barry Warsaw wrote:
> > > On Oct 9, 2018, at 16:21, Steven D'Aprano  wrote:
> > > >
> > > > On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
> > > > >> My feeling is that limiting it to strings is fine, but checking those
> > > > >> strings for resembling identifiers is pointless and wasteful.
> > > >
> > > > Sure. The question is, do we have to support uses where people
> > > > intentionally smuggle non-identifier strings as keys via **kwargs?
> > >
> > > I would not be in favor of that.  I think it doesn’t make sense to be
> > > able to smuggle those in via **kwargs when it’s not supported by
> > > Python’s grammar/syntax.
> >
> > Can anyone think of a situation where it would be advantageous for an
> > implementation to reject non-identifier string kwargs? I can't.
>
> One possibility is that it could foreclose certain security bugs from
> happening. For example, if someone has an API that accepts **kwargs,
> they might have the mistaken assumption that the keys are identifiers
> without special characters like ";" etc, and so they could make the
> mistake of thinking they don't need to escape / sanitize them.
>

Hm, that's not an entirely unreasonable concern. How would an attacker get
such keys *into* the dict? One possible scenario would be something that
parses a traditional web query string into a dict, passes it down through
**kwds, and then turns it back into another query string without proper
quoting. But the most common (and easiest) way to turn a dict into a query
string is calling urlencode(), which quotes unsafe characters.
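
A sketch of that scenario using urllib.parse (the two forward_*
function names are hypothetical). A key containing "&" and "=" survives
the trip through **kwds; urlencode() quotes it on the way out, while a
hand-rolled join does not.

```python
from urllib.parse import parse_qsl, urlencode


def forward_quoted(**kwds):
    """Rebuild a query string with proper quoting."""
    return urlencode(kwds)


def forward_naive(**kwds):
    """Rebuild a query string by plain concatenation (the risky way)."""
    return "&".join(f"{key}={value}" for key, value in kwds.items())


# The first key decodes to 'a&b=c', which is not an identifier.
incoming = dict(parse_qsl("a%26b%3Dc=1&x=2"))

quoted = forward_quoted(**incoming)   # 'a%26b%3Dc=1&x=2'
naive = forward_naive(**incoming)     # 'a&b=c=1&x=2'
```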

I think we needn't rush this (and when in doubt, status quo wins, esp. when
there's no BDFL :-).

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Glenn Linderman

On 10/9/2018 7:46 PM, Chris Jerdonek wrote:
> On Tue, Oct 9, 2018 at 7:13 PM Benjamin Peterson  wrote:
>> On Tue, Oct 9, 2018, at 17:14, Barry Warsaw wrote:
>>> On Oct 9, 2018, at 16:21, Steven D'Aprano  wrote:
>>>> On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
>>>>> My feeling is that limiting it to strings is fine, but checking those
>>>>> strings for resembling identifiers is pointless and wasteful.
>>>>
>>>> Sure. The question is, do we have to support uses where people
>>>> intentionally smuggle non-identifier strings as keys via **kwargs?
>>>
>>> I would not be in favor of that.  I think it doesn’t make sense to be
>>> able to smuggle those in via **kwargs when it’s not supported by
>>> Python’s grammar/syntax.
>>
>> Can anyone think of a situation where it would be advantageous for an
>> implementation to reject non-identifier string kwargs? I can't.
>
> One possibility is that it could foreclose certain security bugs from
> happening. For example, if someone has an API that accepts **kwargs,
> they might have the mistaken assumption that the keys are identifiers
> without special characters like ";" etc, and so they could make the
> mistake of thinking they don't need to escape / sanitize them.
>
> --Chris

With that line of reasoning, one should make sure the data are
identifiers too :)


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Chris Jerdonek
On Tue, Oct 9, 2018 at 8:55 PM Guido van Rossum  wrote:
> On Tue, Oct 9, 2018 at 7:49 PM Chris Jerdonek  
> wrote:
>> On Tue, Oct 9, 2018 at 7:13 PM Benjamin Peterson  wrote:
>> > Can anyone think of a situation where it would be advantageous for an 
>> > implementation to reject non-identifier string kwargs? I can't.
>>
>> One possibility is that it could foreclose certain security bugs from
>> happening. For example, if someone has an API that accepts **kwargs,
>> they might have the mistaken assumption that the keys are identifiers
>> without special characters like ";" etc, and so they could make the
>> mistake of thinking they don't need to escape / sanitize them.
>
>
> Hm, that's not an entirely unreasonable concern. How would an attacker get 
> such keys *into* the dict?

I was just thinking json. It could be a config-file type situation, or
a web API that accepts json.

For example, there are JSON-RPC implementations in Python:
https://pypi.org/project/json-rpc/
that translate json dicts directly into **kwargs:
https://github.com/pavlov99/json-rpc/blob/f1b4e5e96661efd4026cb6143dc3acd75c6c4682/jsonrpc/manager.py#L112

On the server side, the application could be doing something like
assuming that the kwargs are e.g. column names paired with values to
construct a string in SQL or in some other language or format.
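
To make the concern concrete, here is a sketch of such a server-side
helper with the check it would need. The name build_update and the SQL
shape are hypothetical, and str.isidentifier() is only one possible
guard (it also accepts Python keywords and non-ASCII identifiers, so a
real application would more likely check against a list of known
columns).

```python
def build_update(table, **columns):
    """Build a parameterized UPDATE whose column names come from kwargs keys."""
    for name in (table, *columns):
        if not name.isidentifier():
            raise ValueError(f"unsafe SQL name: {name!r}")
    assignments = ", ".join(f"{name} = ?" for name in columns)
    # Values travel separately as bound parameters; only names are interpolated.
    return f"UPDATE {table} SET {assignments}", list(columns.values())


sql, params = build_update("users", name="alice", age=30)

# Keys arriving from decoded JSON need not be identifiers at all.
hostile = {"name = 'x'; DROP TABLE users; --": 1}
try:
    build_update("users", **hostile)
except ValueError as exc:
    blocked = exc
else:
    blocked = None
```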

--Chris



> One possible scenario would be something that parses a traditional web query 
> string into a dict, passes it down through **kwds, and then turns it back 
> into another query string without proper quoting. But the most common (and 
> easiest) way to turn a dict into a query string is calling urlencode(), which 
> quotes unsafe characters.
>
> I think we needn't rush this (and when in doubt, status quo wins, esp. when 
> there's no BDFL :-).
>
> --
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Arbitrary non-identifier string keys when using **kwargs

2018-10-09 Thread Serhiy Storchaka

10.10.18 05:12, Benjamin Peterson wrote:
> On Tue, Oct 9, 2018, at 17:14, Barry Warsaw wrote:
>> On Oct 9, 2018, at 16:21, Steven D'Aprano  wrote:
>>> On Tue, Oct 09, 2018 at 10:26:50AM -0700, Guido van Rossum wrote:
>>>> My feeling is that limiting it to strings is fine, but checking those
>>>> strings for resembling identifiers is pointless and wasteful.
>>>
>>> Sure. The question is, do we have to support uses where people
>>> intentionally smuggle non-identifier strings as keys via **kwargs?
>>
>> I would not be in favor of that.  I think it doesn’t make sense to be
>> able to smuggle those in via **kwargs when it’s not supported by
>> Python’s grammar/syntax.
>
> Can anyone think of a situation where it would be advantageous for an
> implementation to reject non-identifier string kwargs? I can't.


I can. The space of identifiers is smaller than the space of all 
strings. We need just 6 bits per character for ASCII identifiers and 16 
bits per character for Unicode identifiers. We could use a special kind 
of strings for more compact representation of identifiers. It may be 
even possible to encode all identifiers used in the stdlib and in the 
program as a tagged 64-bit pointer. Currently dict has specialized code 
for string keys, it could have specialization for identifiers (used only 
for keyword arguments, instance dicts, etc). Argument parsing code can 
also utilize the fact that a special hash for short identifiers doesn't 
have collisions and compare just hashes.
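
A toy sketch of the 6-bit idea in pure Python, only to show that the
numbers work out; the names and the layout are invented here, and a real
implementation would live in C. The 63 characters allowed in ASCII
identifiers get codes 1..63, with 0 left over as padding, so ten
characters fit in 60 bits and a 64-bit word has 4 bits to spare for a
tag.

```python
import string

ALPHABET = "_" + string.ascii_letters + string.digits   # 63 characters
CODE = {ch: i + 1 for i, ch in enumerate(ALPHABET)}     # 0 is reserved
MAX_CHARS = 10                                          # 10 * 6 = 60 bits


def pack_identifier(name):
    """Pack a short ASCII identifier into an int below 2**60, else None."""
    if not name.isidentifier() or len(name) > MAX_CHARS:
        return None
    if any(ch not in CODE for ch in name):
        return None                      # non-ASCII identifier
    word = 0
    for ch in name:
        word = (word << 6) | CODE[ch]
    return word


def unpack_identifier(word):
    """Invert pack_identifier()."""
    chars = []
    while word:
        chars.append(ALPHABET[(word & 0x3F) - 1])
        word >>= 6
    return "".join(reversed(chars))
```

Because every character code is non-zero, distinct identifiers always
pack to distinct words, so comparing the packed values is enough to
compare the names.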


All this looks fantastic, but I would not close doors for future 
optimizations.

