[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2021-12-26 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

If argparse will not be developed further to fix this bug, then we should undo 
the deprecation of optparse in the documentation 
(https://bugs.python.org/issue37103), since the stated justification for that 
deprecation was that optparse will not be developed further.

The documentation should encourage programmers to use correct libraries, and 
optparse is correct here where argparse is not.  People who need the extra 
features of argparse and aren’t bothered by its incorrectness are welcome to 
decide to use it, but this is not the right default decision for the 
documentation to promote.

--

___
Python tracker 
<https://bugs.python.org/issue9334>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2022-01-12 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

> While optparse isn't being developed further, it will not be taken
> away.  IIRC the reason for this was that it too had become
> difficult to build out and that is what necessitated the creation of
> argparse -- there wasn't a clean way to add the desired features
> (subparsers, actions, etc).

My concern is not that optparse will be taken away.  My concern is that the 
documentation incorrectly discourages its use.

https://docs.python.org/3/library/optparse.html
“Deprecated since version 3.2: The optparse module is deprecated and will not 
be developed further; development will continue with the argparse module.”

Given that the apparent conclusion of this bug is that argparse has also become 
too difficult to fix, either argparse should be deprecated for exactly the same 
reason, or optparse should be un-deprecated.

Most programs don’t need the extra features of argparse, and optparse doesn’t 
have this bug, so optparse is a better default choice; the documentation should 
not be encouraging argparse over it.

--

___
Python tracker 
<https://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2011-12-28 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

James: That’s not related to this issue.  This issue is about options taking 
arguments beginning with dash (such as a2x --asciidoc-opts --safe, where --safe 
is the argument to --asciidoc-opts), not positional arguments beginning with 
dash.

Your observation isn’t a bug.  In all getopt-like parsers, -- is the only way 
to pass positional arguments beginning with -.  (Whether you shell-quoted the 
argument is irrelevant; the - is interpreted by the program, not the shell, 
after the shell has already stripped off the shell quoting.)
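
A minimal sketch of the -- convention described above, using argparse.  The parser, the -v flag, and the file name are made up for illustration; they are not taken from James's report.

```python
import argparse

# Hypothetical parser: one boolean flag plus any number of positional paths.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("-v", action="store_true")
parser.add_argument("paths", nargs="*")

# Everything after "--" is taken as positional, even if it begins with "-".
ns = parser.parse_args(["-v", "--", "-dash-file"])
print(ns.v, ns.paths)
```

The shell plays no part here: the list passed to parse_args() is what the program would see after any shell quoting has been removed.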

If your program doesn’t take any options and you’d like to parse positional 
arguments without requiring --, don’t use a getopt-like parser; use sys.argv 
directly.

If you still think your example is a bug, please file a separate report.

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue12844] Support more than 255 arguments

2011-08-25 Thread Anders Kaseorg

New submission from Anders Kaseorg :

This feels like an arbitrary restriction (obvious sequences have been replaced 
with ‘…’ to save space in this report):

>>> zip([0], [1], [2], …, [1999])
  File "<stdin>", line 1
SyntaxError: more than 255 arguments

especially when this works:

>>> zip(*[[0], [1], [2], …, [1999]])
[(0, 1, 2, …, 1999)]

Apparently that limit bites some people:
https://docs.djangoproject.com/en/1.3/topics/http/urls/#module-django.conf.urls.defaults

The bytecode format doesn’t support directly calling a function with more than 
255 arguments.  But, it should still be pretty easy to compile such function 
calls by desugaring
  f(arg0, …, arg999, k0=v0, …, k999=v999)
into
  f(*(arg0, …, arg999), **{'k0': v0, …, 'k999': v999})
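
A small runnable sketch of the splatted form, assuming a stand-in function f that only counts what it receives (f and the generated names k0 … k999 are illustrative):

```python
def f(*args, **kwargs):
    """Stand-in callee that just reports how much it received."""
    return len(args), len(kwargs)

# Build the positional and keyword arguments as ordinary objects ...
positional = tuple(range(1000))
keywords = {"k%d" % i: i for i in range(1000)}

# ... and splat them, which is the form the desugaring would produce.
counts = f(*positional, **keywords)
print(counts)

# The zip() example from the report, written with a single star argument.
columns = list(zip(*[[i] for i in range(2000)]))
print(len(columns), len(columns[0]))
```

Because the arguments travel inside one tuple and one dict, the call itself never has more than two argument slots, regardless of how many values are passed.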

--
components: Interpreter Core
messages: 142995
nosy: andersk
priority: normal
severity: normal
status: open
title: Support more than 255 arguments
type: feature request

___
Python tracker 
<http://bugs.python.org/issue12844>
___



[issue12844] Support more than 255 arguments

2011-08-25 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

I guess the desugaring is slightly more complicated in the case where the 
original function call already used *args or **kwargs:
  f(arg0, …, arg999, *args, k0=v0, …, k999=v999, **kwargs)
becomes something like
  f(*((arg0, …, arg999) + tuple(args)),
    **dict({'k0': v0, …, 'k999': v999}, **kwargs))
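
To check that the two spellings agree, here is a short sketch with two explicit arguments standing in for the thousand; f, args, and kwargs are illustrative values, not part of any proposal:

```python
def f(*args, **kwargs):
    """Stand-in callee that returns exactly what it was given."""
    return args, kwargs

args = [3, 4]            # caller's existing *args (any iterable)
kwargs = {"k2": "c"}     # caller's existing **kwargs

# Ordinary call mixing explicit arguments with *args and **kwargs.
direct = f(1, 2, *args, k0="a", k1="b", **kwargs)

# Desugared call: one tuple and one dict, nothing else.
desugared = f(*((1, 2) + tuple(args)),
              **dict({"k0": "a", "k1": "b"}, **kwargs))

print(direct == desugared)
```

One difference a real implementation would have to preserve: a keyword repeated in both places raises TypeError in the direct call, whereas dict() would silently let **kwargs win.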

--

___
Python tracker 
<http://bugs.python.org/issue12844>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2011-03-26 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

> @andersk: Would the restriction to only having flags with a fixed
> number of arguments be acceptable for your use case?

I think that’s fine.  Anyone coming from optparse won’t need options with 
optional arguments.

However, FWIW, GNU getopt_long() supports options with an optional argument 
under the restrictions that:
 • the option must be a long option,
 • the optional argument must be the only argument for the option, and
 • the argument, if present, must be supplied using the
   ‘--option=argument’ form, not the ‘--option argument’ form.
This avoids all parsing ambiguity.  It would be useful to have feature parity 
with getopt_long(), to facilitate writing Python wrapper scripts for C programs.
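
For comparison, a sketch of how an option with an optional argument can be declared in argparse today.  The option name --color and its values are assumptions chosen for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog="wrapper")
# nargs="?" makes the argument optional; const is used when the option
# appears without a value, default when it does not appear at all.
parser.add_argument("--color", nargs="?", const="auto", default="never")

with_value = parser.parse_args(["--color=always"]).color
bare = parser.parse_args(["--color"]).color
absent = parser.parse_args([]).color
print(with_value, bare, absent)
```

Unlike getopt_long(), argparse will also consume a following word in the ‘--color always’ form, so this is not full parity with the three restrictions listed above.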

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2011-02-06 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

There are some problems that ‘=’ can’t solve, such as options with nargs ≥ 2.  
optparse has no trouble with this:

>>> parser = optparse.OptionParser()
>>> parser.add_option('-a', nargs=2)
>>> parser.parse_args(['-a', '-first', '-second'])
(<Values at 0x…: {'a': ('-first', '-second')}>, [])

But inputting those arguments is _not possible_ with argparse.

>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('-a', nargs=2)
>>> parser.parse_args(['-a', '-first', '-second'])
usage: [-h] [-a A A]
: error: argument -a: expected 2 argument(s)
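
The same comparison as a standalone script, for anyone who wants to check it on their own interpreter.  It assumes the behavior shown in the two sessions above; argparse's error exit is caught so that the script keeps running.

```python
import argparse
import contextlib
import io
import optparse

argv = ["-a", "-first", "-second"]

# optparse: -a takes exactly two arguments, whatever they look like.
opt_parser = optparse.OptionParser()
opt_parser.add_option("-a", nargs=2)
opt_values, opt_rest = opt_parser.parse_args(argv)

# argparse: same declaration; record whether parsing is rejected.
arg_parser = argparse.ArgumentParser()
arg_parser.add_argument("-a", nargs=2)
try:
    with contextlib.redirect_stderr(io.StringIO()):
        arg_parser.parse_args(argv)
    argparse_rejected = False
except SystemExit:
    argparse_rejected = True

print(opt_values.a, opt_rest, argparse_rejected)
```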

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2011-02-06 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

That would be a good first step.

I continue to advocate making that mode the default, because it’s consistent 
with how every other command line program works[1], and backwards compatible 
with the current argparse behavior.

As far as documentation for older versions, would it be reasonable to 
un-deprecate optparse until argparse becomes a suitable replacement?  There are 
still lots of programmers working in Python 2.7.

[1] bethard’s msg128047 is confusing positional arguments with option 
arguments.  All UNIX commands that accept option arguments have no trouble 
accepting option arguments that begin with -.  For example, ‘grep -e -pattern 
file’ is commonly used to search for patterns beginning with -.

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2011-02-06 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

> (1) It's only deprecated in the documentation

Which is why I suggested un-deprecating it in the documentation.  (I want to 
avoid encouraging programmers to switch away from optparse until this bug is 
fixed.)

> # proposed behavior
> parser = ArgumentParser(error_on_unknown_options=False)

Perhaps you weren’t literally proposing “error_on_unknown_options=False” as the 
name of the new flag, but note that neither the current nor the proposed behavior 
has anything to do with whether arguments look like known or unknown options.  
Under the proposed behavior, anything in argument position (--asciidoc-opts 
___) is parsed as an argument, no matter what it looks like.

So a more accurate name might be “refuse_dashed_args=False”, or more generally 
(in case prefix_chars != '-'), “refuse_prefixed_args=False”?

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue8376] Tutorial offers dangerous advice about iterators: “__iter__() can just return self”

2010-04-11 Thread Anders Kaseorg

New submission from Anders Kaseorg :

The Python tutorial offers some dangerous advice about adding iterator behavior 
to a class:
http://docs.python.org/tutorial/classes.html#iterators
“By now you have probably noticed that most container objects can be looped 
over using a for statement:
…
Having seen the mechanics behind the iterator protocol, it is easy to add 
iterator behavior to your classes. Define a __iter__() method which returns an 
object with a next() method. If the class defines next(), then __iter__() can 
just return self:”

This is reasonable advice for writing an iterator class, but terrible advice 
for writing a container class, because it encourages you to associate a single 
iterator with the container, which breaks nested iteration and leads to 
hard-to-find bugs.  (One of those bugs recently made its way into the code 
handout for a problem set in MIT’s introductory CS course, 6.00.)

A container class’s __iter__() should return a generator or an instance of a 
separate iterator class, not self.  The tutorial should make this clearer.
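
A sketch of the nested-iteration bug, written for Python 3 (so the method is __next__() rather than the next() quoted from the tutorial).  Both class names are invented for this illustration.

```python
class SingleIteratorBag:
    """Container that (unwisely) acts as its own iterator."""
    def __init__(self, items):
        self.items = list(items)
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.index >= len(self.items):
            raise StopIteration
        self.index += 1
        return self.items[self.index - 1]

class Bag:
    """Container whose __iter__ is a generator: each loop gets its own state."""
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        for item in self.items:
            yield item

def pairs(container):
    # Nested iteration over the same container.
    return [(a, b) for a in container for b in container]

broken = pairs(SingleIteratorBag("ab"))
correct = pairs(Bag("ab"))
print(broken)
print(correct)
```

With the shared index, the inner loop uses up the outer loop's remaining items, so only one pair comes out instead of four, and no exception points at the cause.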

--
assignee: georg.brandl
components: Documentation
messages: 102918
nosy: andersk, georg.brandl
severity: normal
status: open
title: Tutorial offers dangerous advice about iterators: “__iter__() can just 
return self”

___
Python tracker 
<http://bugs.python.org/issue8376>
___



[issue8376] Tutorial offers dangerous advice about iterators: “__iter__() can just return self”

2010-04-12 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

As an experienced Python programmer I am obviously aware that the tutorial is 
trying to teach how to make an iterator class, not how to make a container 
class.

But the tutorial doesn’t make that *clear*.  It should be much more explicit 
about what it is explaining to avoid confusing those concepts in the minds of 
beginners.  (Or even the staff of the MIT introductory CS course.)

One way to fix this confusion would be to explain what one should do for both 
container classes and iterator classes:

"""
Having seen the mechanics behind the iterator protocol, it is easy to add 
iterator behavior to your container classes.  Define a :meth:`__iter__` method 
which returns an object of a separate iterator class.  The iterator class 
should have a :meth:`next` method and an :meth:`__iter__` method (the 
:meth:`__iter__` method of the iterator class should just return ``self``)::

   class ReverseList(object):
       "Container that lets you iterate over the items backwards"
       def __init__(self, data):
           self.data = data
       def __iter__(self):
           return ReverseIterator(self.data)

   class ReverseIterator(object):
       "Iterator for looping over a sequence backwards"
       def __init__(self, data):
           self.data = data
           self.index = len(data)
       def __iter__(self):
           return self
       def next(self):
           if self.index == 0:
               raise StopIteration
           self.index = self.index - 1
           return self.data[self.index]

   >>> for char in ReverseIterator('spam'):
   ...     print char
   ...
   m
   a
   p
   s
   >>> for char in ReverseList([1,2,3]):
   ...     print char
   ...
   3
   2
   1
"""

--

___
Python tracker 
<http://bugs.python.org/issue8376>
___



[issue43323] UnicodeEncodeError: surrogates not allowed when parsing invalid charset

2022-03-26 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

It could and does, as quoted in my original report.

Content-Type: text/plain; charset*=utf-8”''utf-8%E2%80%9D

That’s a U+201D right double quotation mark.

This is not a valid charset for the charset of course, but it seems like the 
code was intended to handle an invalid charset value without crashing, so it 
should also handle an invalid charset charset (despite the absurdity of the 
entire concept of a charset charset).

--

___
Python tracker 
<https://bugs.python.org/issue43323>
___



[issue44680] Reference cycles from a WeakKeyDictionary value to its key aren’t collected

2021-07-19 Thread Anders Kaseorg

New submission from Anders Kaseorg :

Because WeakKeyDictionary unconditionally maintains strong references to its 
values, the garbage collector fails to collect a reference cycle from a 
WeakKeyDictionary value to its key.  For example, the following program 
unexpectedly leaks memory:

from weakref import WeakKeyDictionary
class C: pass
d = WeakKeyDictionary()
while True:
    c = C()
    d[c] = [c]

I would expect a WeakKeyDictionary value to be marked live _if_ its key is 
marked live, not unconditionally.  This could be implemented with garbage 
collector support for ephemerons 
(https://www.researchgate.net/publication/221320677_Ephemerons_A_New_Finalization_Mechanism).
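
A bounded version of the same experiment, which checks the dictionary's length instead of looping forever.  This is a sketch that assumes CPython's reference counting and cycle collector; other implementations may need more gc.collect() calls.

```python
import gc
from weakref import WeakKeyDictionary

class C:
    pass

d = WeakKeyDictionary()

# Value does not refer to the key: the entry disappears with the key.
c = C()
d[c] = ["unrelated"]
del c
gc.collect()
after_plain = len(d)

# Value refers back to the key: the dictionary's strong reference to the
# value keeps the key alive, so the entry stays.
c = C()
d[c] = [c]
del c
gc.collect()
after_cycle = len(d)

print(after_plain, after_cycle)
```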

To motivate this issue, a typical use of WeakKeyDictionary is as a hygienic 
replacement for patching extra properties into third-party objects:

# before:
obj._extra_state = ExtraState(obj)
# after:
extra_states = WeakKeyDictionary()
extra_states[o] = ExtraState(obj)

However, such a conversion will introduce this memory leak if ExtraState(obj) 
has any transitive references to obj.
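
Until something like ephemerons exists, one possible workaround (not part of this report's proposal) is for the extra state to hold only a weak reference back to its owner.  Widget and this ExtraState are hypothetical stand-ins, and the sketch again assumes CPython.

```python
import gc
import weakref

class Widget:
    """Stands in for a third-party object we want to annotate."""

class ExtraState:
    def __init__(self, obj):
        # Weak back-reference: the value no longer keeps its key alive.
        self._owner = weakref.ref(obj)

    @property
    def owner(self):
        return self._owner()  # None once the owner has been collected

extra_states = weakref.WeakKeyDictionary()

obj = Widget()
extra_states[obj] = ExtraState(obj)
owner_alive = extra_states[obj].owner is obj

del obj
gc.collect()
remaining = len(extra_states)
print(owner_alive, remaining)
```

This only helps when every path from the value back to the key can be made weak, which is not something a library author can guarantee for arbitrary values; that is the case the report is about.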

This leak does not occur in JavaScript:

class C {}
const d = new WeakMap();
while (true) {
  const c = new C();
  d.set(c, [c]);
}

--
components: Library (Lib)
messages: 397841
nosy: andersk
priority: normal
severity: normal
status: open
title: Reference cycles from a WeakKeyDictionary value to its key aren’t 
collected
type: resource usage
versions: Python 3.9

___
Python tracker 
<https://bugs.python.org/issue44680>
___



[issue44680] Reference cycles from a WeakKeyDictionary value to its key aren’t collected

2021-07-19 Thread Anders Kaseorg


Anders Kaseorg  added the comment:

> extra_states[o] = ExtraState(obj)

(Typo for extra_states[obj] = ExtraState(obj), obviously.)

--

___
Python tracker 
<https://bugs.python.org/issue44680>
___



[issue43323] UnicodeEncodeError: surrogates not allowed when parsing invalid charset

2021-02-25 Thread Anders Kaseorg

New submission from Anders Kaseorg :

We ran into a UnicodeEncodeError exception using email.parser to parse this 
email 
<https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2021-February/msg00135.html>,
 with full headers available in the raw archive 
<https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2021-February.txt>.  The 
offending header is hilariously invalid:

Content-Type: text/plain; charset*=utf-8”''utf-8%E2%80%9D

but I’m filing an issue since the parser is intended to be robust against 
invalid input.  Minimal reproduction:

>>> import email, email.policy
>>> email.message_from_bytes(b"Content-Type: text/plain; charset*=utf-8\xE2\x80\x9D''utf-8%E2%80%9D", policy=email.policy.default)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/email/__init__.py", line 46, in message_from_bytes
    return BytesParser(*args, **kws).parsebytes(s)
  File "/usr/local/lib/python3.10/email/parser.py", line 123, in parsebytes
    return self.parser.parsestr(text, headersonly)
  File "/usr/local/lib/python3.10/email/parser.py", line 67, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "/usr/local/lib/python3.10/email/parser.py", line 57, in parse
    return feedparser.close()
  File "/usr/local/lib/python3.10/email/feedparser.py", line 187, in close
    self._call_parse()
  File "/usr/local/lib/python3.10/email/feedparser.py", line 180, in _call_parse
    self._parse()
  File "/usr/local/lib/python3.10/email/feedparser.py", line 256, in _parsegen
    if self._cur.get_content_type() == 'message/delivery-status':
  File "/usr/local/lib/python3.10/email/message.py", line 578, in get_content_type
    value = self.get('content-type', missing)
  File "/usr/local/lib/python3.10/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/usr/local/lib/python3.10/email/policy.py", line 163, in header_fetch_parse
    return self.header_factory(name, value)
  File "/usr/local/lib/python3.10/email/headerregistry.py", line 608, in __call__
    return self[name](name, value)
  File "/usr/local/lib/python3.10/email/headerregistry.py", line 196, in __new__
    cls.parse(value, kwds)
  File "/usr/local/lib/python3.10/email/headerregistry.py", line 453, in parse
    kwds['decoded'] = str(parse_tree)
  File "/usr/local/lib/python3.10/email/_header_value_parser.py", line 126, in __str__
    return ''.join(str(x) for x in self)
  File "/usr/local/lib/python3.10/email/_header_value_parser.py", line 126, in <genexpr>
    return ''.join(str(x) for x in self)
  File "/usr/local/lib/python3.10/email/_header_value_parser.py", line 798, in __str__
    for name, value in self.params:
  File "/usr/local/lib/python3.10/email/_header_value_parser.py", line 783, in params
    value = value.decode(charset, 'surrogateescape')
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 5-7: surrogates not allowed
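
For readers unfamiliar with the final message: the sketch below shows, outside the email package, how a string carrying surrogateescape'd bytes produces exactly this kind of error when it is later encoded strictly.  It is an illustration of the error class only, and does not claim to reproduce the code path in _header_value_parser.py.

```python
raw = b"utf-8\xe2\x80\x9d"

# Undecodable bytes become lone surrogates (U+DCxx) under surrogateescape.
text = raw.decode("ascii", "surrogateescape")

# A strict UTF-8 encode of that string then fails.
failure = None
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    failure = (exc.encoding, exc.start, exc.end, exc.reason)

# Encoding with the same error handler restores the original bytes.
restored = text.encode("ascii", "surrogateescape")
print(failure, restored == raw)
```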

--
components: email
messages: 387685
nosy: andersk, barry, r.david.murray
priority: normal
severity: normal
status: open
title: UnicodeEncodeError: surrogates not allowed when parsing invalid charset
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue43323>
___



[issue37103] Undo deprecation of optparse

2019-05-30 Thread Anders Kaseorg

New submission from Anders Kaseorg :

The optparse library is currently marked in the documentation as deprecated in 
favor of argparse.  However, argparse uses a nonstandard reinterpretation of 
Unix command line grammars that makes certain arguments impossible to express, 
and causes scripts to break when ported from optparse to argparse.  See the bug 
report I filed nine years ago:

https://bugs.python.org/issue9334
argparse does not accept options taking arguments beginning with dash 
(regression from optparse)

The conclusion of the core developers (e.g. msg309691) seems to have been that 
although it’s a valid bug, it can’t or won’t be fixed with the current argparse 
architecture.

I was asked by another core developer to file a bug report for the 
de-deprecation of optparse 
(https://discuss.python.org/t/pep-594-removing-dead-batteries-from-the-standard-library/1704/20),
 so here it is.

--
assignee: docs@python
components: Documentation, Library (Lib)
messages: 343997
nosy: andersk, docs@python
priority: normal
severity: normal
status: open
title: Undo deprecation of optparse
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue37103>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2018-12-12 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

porton: Please don’t steal someone else’s issue to report a different bug.  
Open a new issue instead.

--
title: argparse: add a full fledged parser as a subparser -> argparse does not 
accept options taking arguments beginning with dash (regression from optparse)

___
Python tracker 
<https://bugs.python.org/issue9334>
___



[issue32601] PosixPathTest.test_expanduser fails in NixOS build sandbox

2018-01-19 Thread Anders Kaseorg

New submission from Anders Kaseorg :

PosixPathTest.test_expanduser fails in the NixOS build sandbox, where every 
user has home directory /, so it falls off the end of the for pwdent in 
pwd.getpwall() loop.

nixbld:x:30001:3:Nix build user:/:/noshell
nobody:x:65534:65534:Nobody:/:/noshell

==
FAIL: test_expanduser (__main__.PosixPathTest)
--
Traceback (most recent call last):
  File "/nix/store/mdak9gcy16dc536ws08rshyakd1l7srj-test_pathlib.py", line 2162, in test_expanduser
    self.assertEqual(p3.expanduser(), P(otherhome) / 'Documents')
AssertionError: PosixPath('/Documents') != PosixPath('Documents')
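
A simulation of how the loop ends up with an empty otherhome.  It assumes the test's loop strips trailing slashes and requires a non-empty result, and it uses a hard-coded list in place of pwd.getpwall(); the variable names are approximations of the test's, not a copy of it.

```python
from pathlib import PurePosixPath

# Simulated (name, home) pairs taken from the two passwd lines above.
entries = [("nixbld", "/"), ("nobody", "/")]
username = "nixbld"   # assumed current user

otherhome = None
found = False
for othername, home in entries:
    otherhome = home.rstrip("/")      # "/" becomes ""
    if othername != username and otherhome:
        found = True
        break

# No entry qualified, so otherhome is still "" from the last iteration.
expected = PurePosixPath(otherhome) / "Documents"
print(found, repr(otherhome), expected)
```

The relative path built from the empty string matches the right-hand side of the assertion failure above.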

--
components: Tests
messages: 310282
nosy: andersk
priority: normal
pull_requests: 5091
severity: normal
status: open
title: PosixPathTest.test_expanduser fails in NixOS build sandbox
type: behavior
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue32601>
___



[issue15429] types.NoneType missing

2012-07-22 Thread Anders Kaseorg

Changes by Anders Kaseorg :


--
assignee: docs@python
components: Documentation
nosy: andersk, docs@python
priority: normal
severity: normal
status: open
title: types.NoneType missing
type: behavior
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue15429>
___



[issue15429] types.NoneType missing

2012-07-22 Thread Anders Kaseorg

New submission from Anders Kaseorg :

http://docs.python.org/py3k/library/constants.html#None says that None is the 
sole value of the type types.NoneType.  However, NoneType was removed from the 
types module in Python 3.
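
The type itself is still reachable through type(None).  A short sketch that works whether or not the types module exposes the name (types.NoneType returned in Python 3.10; the local name compat is just for this example):

```python
import types

# The type of None is always available this way.
NoneType = type(None)
print(NoneType.__name__, isinstance(None, NoneType))

# Use types.NoneType where it exists, falling back to type(None) elsewhere.
compat = getattr(types, "NoneType", NoneType)
print(compat is NoneType)
```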

--

___
Python tracker 
<http://bugs.python.org/issue15429>
___



[issue8376] Tutorial offers dangerous advice about iterators: “__iter__() can just return self”

2010-07-22 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

I don’t think that small change is good enough, if it is still the case that 
the only provided example is the dangerous one.

It would be easy to clarify the differences between the classes:

>>> rl = test.ReverseList('spam')
>>> [c for c in rl]
['m', 'a', 'p', 's']
>>> [c for c in rl]
['m', 'a', 'p', 's']
>>> ri = iter(rl)
>>> ri
<test.ReverseIterator object at 0x…>
>>> [c for c in ri]
['m', 'a', 'p', 's']
>>> [c for c in ri]
[]
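
The same demonstration as a self-contained Python 3 script.  The two classes are a port of the ReverseList and ReverseIterator proposed earlier in this issue, with next() renamed to __next__(); nothing else is assumed.

```python
class ReverseList:
    """Container that lets you iterate over the items backwards."""
    def __init__(self, data):
        self.data = data
    def __iter__(self):
        # A fresh iterator for every loop.
        return ReverseIterator(self.data)

class ReverseIterator:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index -= 1
        return self.data[self.index]

rl = ReverseList("spam")
first, second = list(rl), list(rl)    # container: repeatable

ri = iter(rl)
third, fourth = list(ri), list(ri)    # iterator: exhausted after one pass
print(first, second, third, fourth)
```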

--

___
Python tracker 
<http://bugs.python.org/issue8376>
___



[issue8376] Tutorial offers dangerous advice about iterators: “__iter__() can just return self”

2010-07-22 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

Antoine: That’s true.

Amaury: See my original bug description (“This is reasonable advice for writing 
an iterator class, but terrible advice for writing a container class…”), and my 
other comments.

There is nothing wrong with explaining how to write an iterator, but the 
explanation needs to make clear that this is _not_ how you write a container.  
Currently the section opens with a misleading motivation (“By now you have 
probably noticed that most container objects can be looped over using a for 
statement”), but it does not actually explain how to write a container at all.  
So I proposed some language and an example to fix that.

--

___
Python tracker 
<http://bugs.python.org/issue8376>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2010-07-22 Thread Anders Kaseorg

New submission from Anders Kaseorg :

Porting the a2x program to argparse from the now-deprecated optparse subtly 
breaks it when certain options are passed:

$ a2x --asciidoc-opts --safe gitcli.txt
$ ./a2x.argparse --asciidoc-opts --safe gitcli.txt
usage: a2x [-h] [--version] [-a ATTRIBUTE] [--asciidoc-opts ASCIIDOC_OPTS]
   [--copy] [--conf-file CONF_FILE] [-D PATH] [-d DOCTYPE]
   [--epubcheck] [-f FORMAT] [--icons] [--icons-dir PATH] [-k]
   [--lynx] [-L] [-n] [-r PATH] [-s] [--stylesheet STYLESHEET]
   [--safe] [--dblatex-opts DBLATEX_OPTS] [--fop]
   [--fop-opts FOP_OPTS] [--xsltproc-opts XSLTPROC_OPTS] [-v]
a2x: error: argument --asciidoc-opts: expected one argument

Apparently argparse uses a heuristic to try to guess whether an argument looks 
like an argument or an option, going so far as to check whether it looks like a 
negative number (!).  It should _never_ guess: the option was specified to take 
an argument, so the following argument should always be parsed as an argument.

Small test case:

>>> import optparse
>>> parser = optparse.OptionParser(prog='a2x')
>>> parser.add_option('--asciidoc-opts',
...     action='store', dest='asciidoc_opts', default='',
...     metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '--safe'])
(<Values at 0x…: {'asciidoc_opts': '--safe'}>, [])

>>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
...     action='store', dest='asciidoc_opts', default='',
...     metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '--safe'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: argument --asciidoc-opts: expected one argument

--
components: Library (Lib)
messages: 111221
nosy: andersk
priority: normal
severity: normal
status: open
title: argparse does not accept options taking arguments beginning with dash 
(regression from optparse)
versions: Python 2.7, Python 3.2, Python 3.3

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2010-07-22 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

> Though in general I find argparse's default behavior more useful.

I’m not sure I understand.  Why is it useful for an option parsing library to 
heuristically decide, by default, that I didn’t actually want to pass in the 
valid option that I passed in?  Shouldn’t that be up to the caller (or up to 
the program, if it explicitly decides to reject such arguments)?

Keep in mind that the caller might be another script instead of a user.

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2010-07-23 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

> Note that the negative number heuristic you're complaining about
> doesn't actually affect your code below.

Yes it does:

>>> import argparse
>>> parser = argparse.ArgumentParser(prog='a2x')
>>> parser.add_argument('--asciidoc-opts',
...     action='store', dest='asciidoc_opts', default='',
...     metavar='ASCIIDOC_OPTS', help='asciidoc options')
>>> parser.parse_args(['--asciidoc-opts', '-1'])
Namespace(asciidoc_opts='-1')
>>> parser.parse_args(['--asciidoc-opts', '-one'])
usage: a2x [-h] [--asciidoc-opts ASCIIDOC_OPTS]
a2x: error: argument --asciidoc-opts: expected one argument

> Your problem is that you want "--safe" to be treated as a positional
> argument even though you've declared it as an option.

No, it doesn’t matter whether --safe was declared as an option: argparse 
rejected it on the basis of beginning with a dash (as I demonstrated in my 
small test case, which did not declare --safe as an option, and again in the 
example above with -one).

> Either the user wants a conf file named "--safe", or the user
> accidentally forgot to type the name of the conf file.

But it’s not argparse’s job to decide that the valid option I passed was 
actually a typo for something invalid.  This would be like Python rejecting the 
valid call
  shell = "bash"
  p = subprocess.Popen(shell)
just because shell happens to also be a valid keyword argument for the Popen 
constructor and I might have forgotten to specify its value.

Including these special heuristics by default, that (1) are different from the 
standard behavior of all other option parsing libraries and (2) interfere with 
the ability to pass certain valid options, only leads to strange 
inconsistencies between command line programs written in different languages, 
and ultimately makes the command line harder to use for everyone.  The default 
behavior should be the standard one.

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2010-07-26 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

> I still disagree. You're giving the parser ambiguous input. If a
> parser sees "--foo --bar", and "--foo" is a valid option, but "--bar"
> is not, this is a legitimately ambiguous situation.

There is no ambiguity.  According to the way that every standard option parsing 
library has worked for decades, the parser knows that --foo takes an argument, 
so the string after --foo is in a different grammatical context than options 
are, and is automatically interpreted as an argument to --foo.  (It doesn’t 
matter whether that string begins with a dash, is a valid argument, might 
become a valid argument in some future version, looks like a negative number, 
or any other such condition.)

  arguments = *(positional-argument / option) [-- *(positional-argument)]
  positional-argument = string
  option = foo-option / bar-option
  foo-option = "--foo" string
  bar-option = "--bar"

This is just like how variable names in Python are in a different grammatical 
position than keyword argument names, so that Popen(shell) is not confused with 
Popen(shell=True).  This is not ambiguity; it simply follows from the standard 
definition of the grammar.
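
To make the grammar concrete, here is a toy parser for it.  This is a sketch for the two options in the grammar only, not a description of how optparse or getopt is implemented; the function name and the error messages are invented.  It also rejects unrecognized dash-prefixed strings before ‘--’, which is the usual convention.

```python
def parse(argv):
    """Parse argv per the grammar: --foo takes one string, --bar takes none."""
    options = {"foo": None, "bar": False}
    positionals = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--":
            # Everything after "--" is positional, dashes or not.
            positionals.extend(argv[i + 1:])
            break
        if arg == "--foo":
            if i + 1 >= len(argv):
                raise ValueError("--foo requires an argument")
            # The next string is the argument, whatever it looks like.
            options["foo"] = argv[i + 1]
            i += 2
        elif arg == "--bar":
            options["bar"] = True
            i += 1
        elif arg.startswith("-"):
            raise ValueError("unknown option: " + arg)
        else:
            positionals.append(arg)
            i += 1
    return options, positionals

print(parse(["--foo", "--bar"]))
print(parse(["--foo", "x", "--bar", "file", "--", "-p"]))
```

The parser never inspects the string following --foo, so there is nothing for it to guess about.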

argparse’s alternative interpretation of that string as another option does not 
make sense because it violates the requirement that --foo has been defined to 
take an argument.

The only justification for considering that input ambiguous is if you start 
assuming that argparse knows better than the user (“the user accidentally 
forgot to type the name of the conf file”) and try to guess what they meant.  
This violates the user’s expectations of how the command line should work.  It 
also creates subtle bugs in scripts that call argparse-based programs (think 
about call(["program", "--foo", foo_argument]) where foo_argument comes from 
some complex computation or even untrusted network input).
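
Scripts that must call an argparse-based program today can sidestep the heuristic for a single-argument long option by joining the option and its value with ‘=’.  A sketch, with a made-up parser standing in for the called program and a hypothetical helper foo_args():

```python
import argparse

# Stand-in for the argparse-based program being called.
parser = argparse.ArgumentParser(prog="program")
parser.add_argument("--foo")

def foo_args(value):
    """Build the argv fragment for --foo in the joined form."""
    return ["--foo=" + value]

# Values a caller might compute, including ones that begin with a dash.
tricky = ["plain", "-1", "--safe", "-one"]
results = [parser.parse_args(foo_args(v)).foo for v in tricky]
print(results)
```

This is a workaround on the caller's side, not a fix: the callers have to know which parser the program uses, and it does not extend to options that take more than one argument.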

> Changing the default behavior is really a non-starter unless you can
> propose a sensible transition strategy (as is always necessary for
> changing APIs in backwards incompatible ways).

This would not be a backwards incompatible change, since every option that 
previously parsed successfully would also parse in the same way after the fix.

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2010-07-26 Thread Anders Kaseorg

Anders Kaseorg  added the comment:

>   arguments = *(positional-argument / option) [-- *(positional-argument)]
>   positional-argument = string
>   option = foo-option / bar-option
>   foo-option = "--foo" string
>   bar-option = "--bar"

Er, obviously positional arguments before the first ‘--’ can’t begin with a 
dash (I don’t think there’s any confusion over how those should work).
  arguments = *(non-dash-positional-argument / option) ["--" *(positional-argument)]
  non-dash-positional-argument = 
  positional-argument = string

The point was just that the grammar unambiguously allows the argument of --foo 
to be any string.

--

___
Python tracker 
<http://bugs.python.org/issue9334>
___



[issue8376] Tutorial offers dangerous advice about iterators: “__iter__() can just return self”

2016-10-14 Thread Anders Kaseorg

Anders Kaseorg added the comment:

Usui, this is a tutorial intended for beginners.  Even if the change from 
“most” to “built-in” were a relevant one (and I don’t see how it is), beginners 
cannot possibly be expected to parse that level of meaning out of a single word.

The difference between iterators and containers deserves at least a couple of 
sentences and preferably an example that includes both, as I proposed in 
http://bugs.python.org/issue8376#msg102966.  Do you disapprove of that proposal?

--

___
Python tracker 
<http://bugs.python.org/issue8376>
___