[issue39165] Completeness and symmetry in RE, avoid `findall(...)[0]`

2019-12-30 Thread Juancarlo Añez

New submission from Juancarlo Añez :

The problematic `findall(...)[0]` is a common anti-pattern in Python programs. 
The reason is lack of symmetry and completeness in the `re` module.

The original proposal in `python-ideas` was to add `re.findfirst(pattern, 
string, flags=0, default=_mark)` with more or less the semantics of 
`next(findall(pattern, string, flags=flags), default=default)`. 

The referenced PR adds `findalliter(pattern, string, flags=0)` with the value 
semantics of `findall()` over a generator, implements `findall()` as `return 
list(findalliter(...))`, and implements `findfirst()`. 

Consistency and correctness are likely because all tests pass with the 
redefined `findall()`.
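
A minimal sketch of the three functions described above, assuming `re.finditer()` as the underlying primitive. This is not the code from the referenced PR: the `_MISSING` sentinel and the `IndexError` raised when no default is given are assumptions made for illustration, and the sketch only handles `str` patterns.

```python
import re

_MISSING = object()  # hypothetical sentinel standing in for `_mark`


def findalliter(pattern, string, flags=0):
    """Yield the values findall() would return, one match at a time."""
    for m in re.finditer(pattern, string, flags):
        groups = m.groups(default="")  # findall() reports unmatched groups as ''
        if not groups:
            yield m.group(0)   # no groups: the whole match
        elif len(groups) == 1:
            yield groups[0]    # one group: that group
        else:
            yield groups       # several groups: a tuple


def findall(pattern, string, flags=0):
    return list(findalliter(pattern, string, flags))


def findfirst(pattern, string, flags=0, default=_MISSING):
    for value in findalliter(pattern, string, flags):
        return value
    if default is _MISSING:
        # Assumption: mirror the IndexError of findall(...)[0]
        raise IndexError("no match")
    return default
```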

--
components: Library (Lib)
messages: 359039
nosy: apalala
priority: normal
pull_requests: 17191
severity: normal
status: open
title: Completeness and symmetry in RE, avoid `findall(...)[0]`
type: enhancement
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue39165>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39165] Completeness and symmetry in RE, avoid `findall(...)[0]`

2020-01-01 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

The discussion on python-ideas favored the inclusion of `findfirst()`. At any 
rate, not having a generator version of `findall()` is an important omission.

Another user's search of public GitHub repositories found that 
`findall(...)[0]` is prevalent. python-ideas agreed that the cause was the 
incompleteness/asymmetry in `re`.

--




[issue39165] Completeness and symmetry in RE, avoid `findall(...)[0]`

2020-01-04 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

There's no way to assert that `findall(...)[0]` is efficient enough in most 
cases. It is easy to see that it is risky in every case, as runtime may be 
exponential, and memory O(len(input)). A mistake in the regular expression may 
easily result in an out-of-memory error, which can only be debugged with a 
series of tests using `search()`.
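
To illustrate the memory point under a simple assumption (a long input with many matches), the sketch below contrasts the eager idiom with a lazy one built on `re.finditer()`, which exists today. The sample text and pattern are made up for the example.

```python
import re

text = "id=7 " * 100_000   # hypothetical input with 100,000 matches
pattern = r"id=(\d)"

# Eager: findall() materializes every match before [0] is applied.
all_values = re.findall(pattern, text)
eager_first = all_values[0]

# Lazy: finditer() only scans as far as the first match that is consumed.
match = next(re.finditer(pattern, text), None)
lazy_first = match.group(1) if match else None
```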

A problem with `re.search(...)` is that it doesn't have the return value 
semantics of `findall(...)[0]`, and those semantics seem to be what appeal to 
Python programmers. It takes several lines of code (the ones in 
`findalliter()`) to have the same result as `findall(...)[0]` when using 
`search()`. `findall()` is the sole, lonely function in `re` with its 
return-value semantics.

Also, this proposal embeds `first()` within the body of `findfirst(...)`, but 
given the implementation, one should consider whether `first()` should be part 
of `itertools`, perhaps with a different name, like `take_one()`.
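
For reference, a helper of that kind could look roughly like the sketch below. The name, the `_MISSING` sentinel, and the choice of `ValueError` for an empty iterable are assumptions for illustration, not part of the proposal.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default supplied"


def first(iterable, default=_MISSING):
    """Return the first item of iterable, or default if it is empty."""
    for item in iterable:
        return item
    if default is _MISSING:
        raise ValueError("first() called on an empty iterable with no default")
    return default
```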

One should also consider that although third-party extensions to `itertools` 
already provide the equivalent of `first()`, `findalliter()` and `findfirst()` 
do not belong there, and there are no mainstream third-party extensions to `re` 
where they would fit.

--




[issue39165] Completeness and symmetry in RE, avoid `findall(...)[0]`

2020-01-17 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

The underlying problem, as I see it, is that, historically, `re.search()` 
returns `None` when there is no match, instead of returning a `Match` object 
that is consistent with "no match" (evaluates to `False`, etc.).

The above seems too difficult to repair as so much existing code relies on 
those semantics (`if match is None` is the risky bit). 

Hence, `findall()`, `findalliter()`, and `findfirst()`.

--




[issue39165] Completeness and symmetry in RE, avoid `findall(...)[0]`

2020-01-17 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

The analysis done by Terry bypasses the fact that `search(...)` returns `None` 
when there is no match, so indexing its result or calling methods on it is not 
safe code. 

`findall()` returns an empty list when there is no match.

`findalliter()` returns an empty iterator when there is no match.

`findfirst()` may return a `default` value when there is no match.

If `search()` is proposed to replace `findall()[0]`, then the idiom has to be 
(depending on the case):

m[0] if (m := re.search(...)) else '0'
m.groups() if (m := re.search(...)) else '0'

In contrast, `findfirst()` returns a value that is the same as `findall()` when 
there is a match, or a `default` if there is no match.

m[0] if (m := re.findall(...)) else '0'

Compare with:

findfirst(..., default='0')
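
A runnable version of the two idioms that work today, as a sketch with a made-up pattern and inputs (the walrus operator needs Python 3.8+). Note that `search(...)[0]` is the whole match, so `groups()` is used here to get the `findall()` value for a pattern with several groups.

```python
import re

pattern = r"(\w+)=(\d+)"   # hypothetical pattern with two groups
hit, miss = "port=8080", "no settings here"


def via_search(s):
    # search() may return None, so the result is checked before use.
    return m.groups() if (m := re.search(pattern, s)) else "0"


def via_findall(s):
    # findall() returns an empty (falsy) list when there is no match.
    return m[0] if (m := re.findall(pattern, s)) else "0"
```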

--




[issue17343] Add a version of str.split which returns an iterator

2021-02-26 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

import re


def isplit(text, sep=None, maxsplit=-1):
    """
    A low-memory-footprint version of:

        iter(text.split(sep, maxsplit))

    Adapted from https://stackoverflow.com/a/9770397
    """

    if maxsplit == 0:
        yield text
    else:
        # Match either an explicit separator or a run of whitespace.
        rsep = re.escape(sep) if sep else r'\s+'
        regex = fr'(?:^|{rsep})((?:(?!{rsep}).)*)'

        for n, p in enumerate(re.finditer(regex, text)):
            if 0 <= maxsplit <= n:
                # Split limit reached: yield the rest of the text unsplit.
                yield p.string[p.start(1):]
                return
            yield p.group(1)
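
As a point of comparison, here is a sketch of a different technique: a lazy split built on `str.find()` instead of a regular expression. It only covers the explicit-separator case (no `sep=None` whitespace handling), and the name `isplit_find` is made up for this example.

```python
def isplit_find(text, sep, maxsplit=-1):
    """Lazily yield the pieces of text.split(sep, maxsplit) using str.find()."""
    if not sep:
        raise ValueError("empty separator")
    start = 0
    splits = 0
    while maxsplit < 0 or splits < maxsplit:
        idx = text.find(sep, start)
        if idx == -1:
            break
        yield text[start:idx]
        start = idx + len(sep)
        splits += 1
    # Whatever remains (possibly the empty string) is the last piece.
    yield text[start:]
```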

--
nosy: +apalala

___
Python tracker 
<https://bugs.python.org/issue17343>
___



[issue14902] test_logging failed

2012-07-01 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

My local timezone is (VET,VET) == time.tzname, and test_logging fails because 
time.timezone is off by 30 minutes. I couldn't find the cause for the problem 
with time.timezone, but logging is not to blame. I'm running the tests on 
Ubuntu 12.04 AMD64 which handles my time zone correctly throughout.

I'm submitting a patch that allows test_logging to pass by not relying on 
time.timezone.

--
keywords: +patch
nosy: +apalala
Added file: http://bugs.python.org/file26224/test_logging_wo_timezone.patch

___
Python tracker 
<http://bugs.python.org/issue14902>
___



[issue14902] test_logging failed

2012-07-03 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

@Vinay The test *is* broken in theory, because it uses today's time.timezone to 
make calculations over a datetime in the past (1993), even though official time 
zones have changed in recent years for Caracas, Moscow, and others: 
http://www.timeanddate.com/news/time/. As it is, the test will pass in some 
locations and fail in others, even if time.timezone is correct.

Whether time.timezone is wrong for certain locations is a separate issue that I 
will post as soon as I complete the unit test. I took a look at 
Modules/timemodule.c, and there seems to be nothing wrong there.

In short, the bug is: test_time() incorrectly uses the current time.timezone to 
make calculations over dates in the past.
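
A small sketch of the distinction, assuming a platform where `time.localtime()` exposes `tm_gmtoff` (all platforms since Python 3.6). The helper name and the sample date are made up for the example; the two offsets differ only if the local zone's rules were different on that date.

```python
import time
from datetime import datetime


def local_offset_at(naive_local_dt):
    """UTC offset in seconds (east positive) in effect at a local date."""
    timestamp = time.mktime(naive_local_dt.timetuple())
    return time.localtime(timestamp).tm_gmtoff


past = datetime(1993, 4, 21, 8, 3)        # hypothetical date in 1993
offset_then = local_offset_at(past)       # offset valid on that date
offset_today_standard = -time.timezone    # today's non-DST offset only
```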

--




[issue14902] test_logging failed

2012-07-03 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

> And datetime.datetime.now().tzinfo is always None.

I can reproduce that.

--




[issue14902] test_logging failed

2012-07-03 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

I did extensive testing on time.timezone, and it is correct as far as the 
current date is concerned. The problem, as mentioned before, is that 
test_logging is using time.timezone for dates in the past for which the time 
zone may have been different from the current one on the current location.

The attached patch shows that time calculations involving time.timezone may not 
be valid for dates different from the current one, as not even 
daylight-savings/summer times are taken into account, so the test may also fail 
depending on the time of the year it is run on.

--
Added file: http://bugs.python.org/file26246/test_timezones.patch




[issue14902] test_logging failed

2012-07-03 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

@Vinay No reason. datetime.astimezone(None) is documented in 3.3. You may even 
use:

r.created = time.mktime(dt.astimezone().timetuple())
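
Spelled out as a self-contained sketch (Python 3.3+; the sample datetime is an assumption for the example), the conversion goes through the local zone as it was on that date rather than through today's time.timezone:

```python
import time
from datetime import datetime, timezone

dt = datetime(1993, 4, 21, 8, 3, tzinfo=timezone.utc)  # hypothetical aware datetime

# astimezone() with no argument converts to the system local zone, using the
# UTC offset that applied at that moment.
local_dt = dt.astimezone()
created = time.mktime(local_dt.timetuple())
```

Outside of ambiguous local times (such as the hour repeated when DST ends), `created` should agree with `dt.timestamp()`.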

--




[issue14902] test_logging failed

2012-07-03 Thread Juancarlo Añez

Changes by Juancarlo Añez :


--
type: compile error -> behavior




[issue15247] io.open() is inconsistent re os.open()

2012-07-03 Thread Juancarlo Añez

New submission from Juancarlo Añez :

>>> import io
>>> d = io.open('.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IsADirectoryError: [Errno 21] Is a directory: '.'
>>> 

>>> import os
>>> d = io.open(os.open('.',0))
>>> d
<_io.TextIOWrapper name=3 mode='r' encoding='UTF-8'>
>>>

--
components: Library (Lib)
messages: 164633
nosy: apalala
priority: normal
severity: normal
status: open
title: io.open() is inconsistent re os.open()
type: behavior
versions: Python 2.7, Python 3.3

___
Python tracker 
<http://bugs.python.org/issue15247>
___



[issue15247] io.open() is inconsistent re os.open()

2012-07-03 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

io.open() clearly doesn't care about opening directories as long as they are 
passed as os.open() file descriptors. Quite unexpected!
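
Until the two paths behave the same, a caller could guard against this with an explicit check. The following is only a sketch for Python 3.3+ on a POSIX system; `open_fd_checked` is a made-up helper name, not an existing API.

```python
import errno
import io
import os
import stat


def open_fd_checked(fd, *args, **kwargs):
    """Wrap fd with io.open(), rejecting directory descriptors up front."""
    if stat.S_ISDIR(os.fstat(fd).st_mode):
        raise IsADirectoryError(errno.EISDIR, os.strerror(errno.EISDIR))
    return io.open(fd, *args, **kwargs)
```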

--




[issue15247] io.open() is inconsistent re os.open()

2012-07-04 Thread Juancarlo Añez

Juancarlo Añez  added the comment:

Note that attempting subsequent operations on the returned object does raise 
IsADirectoryError.


>>> import io
>>> import os
>>> d = io.open(os.open('.',0))
>>> d.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IsADirectoryError: [Errno 21] Is a directory
>>>

--




[issue15558] webbrowser output to console

2012-08-04 Thread Juancarlo Añez

New submission from Juancarlo Añez:

Under Ubuntu Linux 11.10 and 12.04, webbrowser.open() will output the following 
message to the console:

Created new window in existing browser session.

The behavior is both unexpected and troublesome.
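
One possible caller-side workaround, sketched under the assumption that the message is written by the launched browser process to the standard output or error it inherits. The `silence_fds` helper is made up for this example and is not part of the webbrowser module.

```python
import os
from contextlib import contextmanager


@contextmanager
def silence_fds(*fds):
    """Temporarily point the given file descriptors at os.devnull."""
    saved = [(fd, os.dup(fd)) for fd in fds]
    devnull = os.open(os.devnull, os.O_WRONLY)
    try:
        for fd in fds:
            os.dup2(devnull, fd)
        yield
    finally:
        # Restore the original descriptors and release the temporaries.
        for fd, copy in saved:
            os.dup2(copy, fd)
            os.close(copy)
        os.close(devnull)


# Usage (not executed here); flush sys.stdout first if it has pending output:
#     with silence_fds(1, 2):
#         webbrowser.open("http://www.python.org")
```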

--
components: Library (Lib)
messages: 167443
nosy: apalala
priority: normal
severity: normal
status: open
title: webbrowser output to console
type: behavior
versions: Python 2.7, Python 3.2

___
Python tracker 
<http://bugs.python.org/issue15558>
___



[issue15618] turtle.pencolor() chokes on unicode

2012-08-10 Thread Juancarlo Añez

New submission from Juancarlo Añez:

>>> t.pencolor(u'red')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 1, in pencolor
  File "/usr/lib/python2.7/lib-tk/turtle.py", line 2166, in pencolor
    color = self._colorstr(args)
  File "/usr/lib/python2.7/lib-tk/turtle.py", line 2600, in _colorstr
    return self.screen._colorstr(args)
  File "/usr/lib/python2.7/lib-tk/turtle.py", line , in _colorstr
    r, g, b = [round(255.0*x) for x in (r, g, b)]
TypeError: can't multiply sequence by non-int of type 'float'

--
components: Library (Lib)
messages: 167883
nosy: apalala
priority: normal
severity: normal
status: open
title: turtle.pencolor() chokes on unicode
type: behavior
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue15618>
___



[issue15618] turtle.pencolor() chokes on unicode

2012-08-10 Thread Juancarlo Añez

Juancarlo Añez added the comment:

This patch solves the problem by making turtle check for strings against 
basestring instead of str.
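
The kind of check involved can be sketched as follows. This is not the patch itself: `is_color_string` is a made-up name, and the `try`/`except` is only there so the snippet also runs on Python 3, where `basestring` does not exist.

```python
try:
    string_types = basestring   # Python 2: covers both str and unicode
except NameError:
    string_types = str          # Python 3: str is already Unicode


def is_color_string(arg):
    """True for color names such as 'red' or u'red', False for RGB tuples."""
    return isinstance(arg, string_types)
```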

--
keywords: +patch
Added file: http://bugs.python.org/file26758/turtle_unicode.patch




[issue15618] turtle.pencolor() chokes on unicode

2012-08-10 Thread Juancarlo Añez

Juancarlo Añez added the comment:

The bug showed up in a script that used:

from __future__ import unicode_literals

--




[issue15620] readline.clear_history() missing in test_readline.py

2012-08-10 Thread Juancarlo Añez

New submission from Juancarlo Añez:

$ lsb_release -a
LSB Version:    core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 12.04.1 LTS
Release:        12.04
Codename:       precise

$ hg branch
2.7

$ ./python Lib/test/test_readline.py
testHistoryUpdates (__main__.TestHistoryManipulation) ... ERROR

======================================================================
ERROR: testHistoryUpdates (__main__.TestHistoryManipulation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_readline.py", line 16, in testHistoryUpdates
    readline.clear_history()
AttributeError: 'module' object has no attribute 'clear_history'

----------------------------------------------------------------------
Ran 1 test in 0.003s

FAILED (errors=1)
Traceback (most recent call last):
  File "Lib/test/test_readline.py", line 43, in <module>
    test_main()
  File "Lib/test/test_readline.py", line 40, in test_main
    run_unittest(TestHistoryManipulation)
  File "/art/python/cpython/Lib/test/test_support.py", line 1125, in run_unittest
    _run_suite(suite)
  File "/art/python/cpython/Lib/test/test_support.py", line 1108, in _run_suite
    raise TestFailed(err)
test.test_support.TestFailed: Traceback (most recent call last):
  File "Lib/test/test_readline.py", line 16, in testHistoryUpdates
    readline.clear_history()
AttributeError: 'module' object has no attribute 'clear_history'

--
components: Tests
messages: 167919
nosy: apalala
priority: normal
severity: normal
status: open
title: readline.clear_history() missing in test_readline.py
type: behavior
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue15620>
___



[issue15620] readline.clear_history() missing in test_readline.py

2012-08-10 Thread Juancarlo Añez

Juancarlo Añez added the comment:

$ dpkg -l | grep readline
ii  libreadline-dev   6.2-8   GNU readline and history libraries, development files
ii  libreadline5      5.2-11  GNU readline and history libraries, run-time libraries
ii  libreadline6      6.2-8   GNU readline and history libraries, run-time libraries
ii  libreadline6-dev  6.2-8   GNU readline and history libraries, development files
ii  readline-common   6.2-8   GNU readline and history libraries, common files

--




[issue15620] readline.clear_history() missing in test_readline.py

2012-08-10 Thread Juancarlo Añez

Juancarlo Añez added the comment:

Check if clear_history() is available before calling it.
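
The attached patch is not reproduced in this message. As a rough sketch of the idea only, a test can be skipped when the function is missing. The test body and the `run_sketch` helper below are assumptions for illustration, not the contents of the patch.

```python
import unittest

try:
    import readline
except ImportError:   # readline is not available on every platform
    readline = None

# hasattr(None, ...) is False, so this also covers a missing module.
HAS_CLEAR_HISTORY = hasattr(readline, "clear_history")


class TestHistoryManipulation(unittest.TestCase):
    @unittest.skipUnless(HAS_CLEAR_HISTORY,
                         "readline.clear_history() not available")
    def test_clear_history(self):
        readline.add_history("first line")
        readline.clear_history()
        self.assertEqual(readline.get_current_history_length(), 0)


def run_sketch():
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestHistoryManipulation)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```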

--
keywords: +patch
Added file: 
http://bugs.python.org/file26761/readline_clear_history_available.patch




[issue15620] readline.clear_history() missing in test_readline.py

2012-08-10 Thread Juancarlo Añez

Changes by Juancarlo Añez :


Removed file: 
http://bugs.python.org/file26761/readline_clear_history_available.patch




[issue15620] readline.clear_history() missing in test_readline.py

2012-08-10 Thread Juancarlo Añez

Juancarlo Añez added the comment:

Check if clear_history() is available before calling it.

--
Added file: 
http://bugs.python.org/file26762/readline_clear_history_available.patch

___
Python tracker 
<http://bugs.python.org/issue15620>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com