[Python-Dev] Binary Operator for New-Style String Formatting

2009-06-21 Thread Jerry Chen
Hello all,

For better or for worse, I have created a patch against the py3k trunk
which introduces a binary operator '@' as an alternative syntax for
the new string formatting system introduced by PEP 3101 ("Advanced
String Formatting"). [1]

For common cases, this syntax should be as simple and as elegant as
its deprecated [2] predecessor ('%'), while also ensuring that more
complex use cases do not suffer needlessly.

I would just like to know whether this idea will float before
submitting the patch on Roundup and going through the formal PEP
process.  This is my first foray into the internals of the Python
core, and with any luck, I did not overlook any BDFL proclamations
banning all new binary operators for string formatting. :-)

QUICK EXAMPLES

>>> "{} {} {}" @ (1, 2, 3)
'1 2 3'

>>> "foo {qux} baz" @ {"qux": "bar"}
'foo bar baz'

One of the main complaints about a binary operator in PEP 3101 was the
inability to mix named and unnamed arguments:

    The current practice is to use either a dictionary or a tuple as
    the second argument, but as many people have commented ... this
    lacks flexibility.

To address this, I introduce a convention: if the last element of the
tuple is a dictionary, it supplies the named arguments.

>>> "{} {qux} {}" @ (1, 3, {"qux": "bar"})
'1 bar 3'

Lastly, to print the repr() of a dictionary as an unnamed argument,
one would have to append an additional dictionary so there is no
ambiguity:

>>> "{}" @ {"foo": "bar"}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range

>>> "{}" @ ({"foo": "bar"}, {})
"{'foo': 'bar'}"

Admittedly, these workarounds are less than clean, but the '@' syntax
is meant only as an alternative, so one can always fall back to the
str.format() method or the format() function.
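
For comparison, here is a short sketch of how the same cases look with
the existing str.format() method and format() function. It uses only
standard behaviour; the variable names are illustrative, and the
auto-numbered "{}" fields assume Python 3.1 or later.

```python
# Mixing positional and named arguments needs no tuple-plus-dict convention.
mixed = "{} {qux} {}".format(1, 3, qux="bar")
print(mixed)  # 1 bar 3

# A dictionary passed positionally is formatted as a single value.
as_value = "{}".format({"foo": "bar"})
print(as_value)  # {'foo': 'bar'}

# The same kind of dictionary unpacked with ** supplies named fields instead.
as_names = "foo {qux} baz".format(**{"qux": "bar"})
print(as_names)  # foo bar baz

# The built-in format() formats one value with a format spec.
two_places = format(3.14159, ".2f")
print(two_places)  # 3.14
```

Because the caller chooses between a positional argument and ** unpacking,
the dictionary ambiguity shown earlier does not arise here.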

IMPLEMENTATION

Code-wise, the grammar was edited per PEP 306 [3], and a
function was introduced in unicodeobject.c as PyUnicode_FormatPrime
(in the mathematical sense of A and A' -- I didn't fully understand or
want to intrude upon the *_FormatAdvanced namespace).

The PyUnicode_FormatPrime function transforms the incoming arguments,
i.e. the operands of the binary '@', and makes the appropriate
do_string_format() call.  Thus, I have reused as much code as
possible.
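
The operand handling described above can be approximated in pure
Python. The sketch below is not the C patch: AtStr is a hypothetical
str subclass, the dispatch rules are inferred from the examples in this
message, and it borrows '@' through __matmul__, which exists only in
Python 3.5 and later (where '@' is the matrix multiplication operator).

```python
class AtStr(str):
    """Hypothetical str subclass emulating the proposed '@' formatting."""

    def __matmul__(self, operand):
        if isinstance(operand, dict):
            # A bare dictionary supplies named arguments only.
            args, kwargs = (), operand
        elif isinstance(operand, tuple):
            if operand and isinstance(operand[-1], dict):
                # Convention: a trailing dictionary holds the named arguments.
                args, kwargs = operand[:-1], operand[-1]
            else:
                args, kwargs = operand, {}
        else:
            # Any other object is treated as a single positional argument.
            args, kwargs = (operand,), {}
        # Delegate to the existing formatting machinery.
        return self.format(*args, **kwargs)


print(AtStr("{} {} {}") @ (1, 2, 3))                  # 1 2 3
print(AtStr("foo {qux} baz") @ {"qux": "bar"})        # foo bar baz
print(AtStr("{} {qux} {}") @ (1, 3, {"qux": "bar"}))  # 1 bar 3
print(AtStr("{}") @ ({"foo": "bar"}, {}))             # {'foo': 'bar'}
```

As in the examples above, AtStr("{}") @ {"foo": "bar"} raises IndexError
in this sketch, because the dictionary is consumed as named arguments
and no positional argument remains.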

I have done my development with git by using two branches: 'master'
and 'subversion', the latter of which can be used to run 'svn update'
and merge back into master.  This way my code changes and the official
ones going into the Subversion repository can stay separate, meanwhile
allowing 'svn diff' to produce an accurate patch at any given time.

The code is available at:

http://github.com/jcsalterego/py3k-atsign/

The SVN patch [4] and the related commit [5] are good starting points.

References:

[1] http://www.python.org/dev/peps/pep-3101
[2] http://docs.python.org/3.0/whatsnew/3.0.html
[3] http://www.python.org/dev/peps/pep-0306/
[4] http://github.com/jcsalterego/py3k-atsign/blob/master/py3k-atsign.diff
[5] http://github.com/jcsalterego/py3k-atsign/commit/5c8bdf72d9252cea78af2b7809613f6530e25db4

Thanks,
-- 
Jerry Chen
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Binary Operator for New-Style String Formatting

2009-06-21 Thread Jerry Chen
Ah, the people have spoken!

On Sun, Jun 21, 2009 at 2:12 PM, Terry Reedy wrote:
> The place to float trial balloons is the python-ideas list.

I'll put this one to rest, and as mentioned, will direct any future
suggestions to python-ideas instead of here.

Most of the arguments against my proposal state there is little to gain
and much to lose (in terms of clarity or an "obvious way" to go about
string formatting), and I agree.

> The only advantage of '@' over '.format' is fewer characters.
> I think it would be more useful to agitate to give 'format' a one char
> synonym such as 'f'.

str.f() would be a great idea.

> One disadvantage of using an actual tuple rather than an arg quasi-tuple is
> that people would have to remember the trailing comma when printing one
> thing. '{}' @ (1,) rather than '{}' @ (a) == '{}' @ a. [If you say, 'Oh,
> then accept the latter', then there is a problem when a is a tuple!]

My code transforms both '{}' @ (a) and '{}' @ a to '{}'.format(a), but
the problem you speak of is probably an edge case I haven't quite
wrapped my head around.
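
The edge case can be sketched as follows. The helper at_format is a
hypothetical stand-in for fmt @ operand, and it assumes (as the examples
in the original proposal suggest) that a tuple operand is unpacked into
positional arguments while any other object is wrapped as a single
argument.

```python
def at_format(fmt, operand):
    """Hypothetical stand-in for `fmt @ operand`."""
    # Tuples are unpacked; everything else becomes one positional argument.
    args = operand if isinstance(operand, tuple) else (operand,)
    return fmt.format(*args)


single = at_format("{}", 42)            # same as "{}".format(42)
pair = at_format("{}", (1, 2))          # same as "{}".format(1, 2)
explicit = at_format("{}", ((1, 2),))   # the tuple wrapped in a 1-tuple

print(single)    # 42
print(pair)      # 1
print(explicit)  # (1, 2)
```

When the value happens to be a tuple, the second call formats only its
first element and discards the rest without any error, since str.format()
ignores unused positional arguments. Printing the tuple itself requires
the trailing-comma form.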

For what it's worth, I spent a bit of time trying to work out the
syntactical quirks, including adapting the format tests in
Lib/test/test_unicode.py to this syntax and ensuring all the tests
passed.  In the end though, it seems to be an issue of usability and
clarity.

> Formatting is inherently an n-ary function whose args are one format and an
> indefinite number of objects to plug in. Packaging the remaining args into
> an object to convert the function to binary is problematical, especially in
> Python with its mix of positional and named args. Even without that, there
> is possible confusion between a package as an arg in itself and a package as
> a container of multiple args. The % formatting problem with tuple puns was
> one of the reasons to seek a replacement.
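
The tuple pun with '%' mentioned in the quote above can be reproduced
with standard Python; the variable names below are illustrative only.

```python
value = (1, 2)

# '%' treats a tuple on the right as the argument list, so a single %s
# cannot consume a two-element tuple.
try:
    "%s" % value
except TypeError as exc:
    error = str(exc)
    print(error)  # not all arguments converted during string formatting

# The workaround is to wrap the tuple in a one-element tuple.
wrapped = "%s" % (value,)
print(wrapped)  # (1, 2)

# With the method form, the tuple is unambiguously a single argument.
method = "{}".format(value)
print(method)  # (1, 2)
```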

Also (from R. David Murray):

> That said, I'm -1 on it.  The 'keywords as last item of tuple' reeks
> of code-smell to my nose, and I don't think you've addressed all of
> the reasons for why a method was chosen over an operator.  Python has a
> tradition of having "one obvious way" to do something, so introducing an
> "alternative" syntax that you admit is sub-optimal does not seem to me
> to have enough benefit to justify breaking that design guideline.

Well stated, as were everyone else's points.

Just one last note: I think my end goal here was to preserve the
visual clarity and separation between format string and format
parameters, as I much prefer:

"%s %s %s" % (1, 2, 3)

over

"{0} {1} {2}".format(1, 2, 3)

The former is a style I've grown accustomed to, and if % is indeed
slated for removal in Python 3.2, then I will miss it sorely
(or... just get over it).

Thanks to everyone who has provided constructive criticism and great
arguments.

Cheers,
-- 
Jerry Chen