Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Thursday 05 May 2016 16:46, Stephen Hansen wrote:

> On Wed, May 4, 2016, at 11:04 PM, Steven D'Aprano wrote:
>> Start by writing a function or a regex that will distinguish strings that
>> match your conditions from those that don't. A regex might be faster, but
>> here's a function version.
>> ... snip ...
> 
> Yikes. I'm all for the idea that one shouldn't go to regex when Python's
> powerful string type can answer the problem more clearly, but this seems
> to go out of its way to do otherwise.
> 
> I don't even care about faster: Its overly complicated. Sometimes a
> regular expression really is the clearest way to solve a problem.

You're probably right, but I find it easier to reason about matching in 
Python rather than the overly terse, cryptic regular expression mini-
language.

I haven't tested my function version, but I'm 95% sure that it is correct. 
The trickiest part of it is the logic about splitting around ampersands. And 
I'll cheerfully admit that it isn't easy to extend to (say) "ampersand, or 
at signs". But your regex solution:

r"^[A-Z\s&]+$"

is much smaller and more compact, but *wrong*. For instance, your regex 
wrongly accepts both "&" and "  " as valid strings, and wrongly 
rejects "ΔΣΘΛ". Your Greek customers will be sad...

Oh, I just realised, I should have looked more closely at the examples 
given, because the specification given by DFS does not match the examples. 
DFS says that only uppercase letters and ampersands are allowed, but their 
examples include strings with spaces, e.g. 'FITNESS CENTERS' despite the 
lack of ampersands. (I read the spec literally as spaces only allowed if 
they surround an ampersand.) Oops, mea culpa. That makes the check function 
much simpler and easier to extend:


def check(string):
    string = string.replace("&", "").replace(" ", "")
    return string.isalpha() and string.isupper()


and now I'm 95% confident it is correct without testing, this time for sure!

;-)
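For what it's worth, here is the simplified check reproduced standalone and exercised against the cases discussed above (Python 3, where str methods are Unicode-aware):

```python
def check(string):
    # Strip ampersands and spaces, then require that whatever remains
    # is entirely alphabetic and entirely upper case.
    string = string.replace("&", "").replace(" ", "")
    return string.isalpha() and string.isupper()

assert check("FITNESS CENTERS")   # spaces without ampersands now pass
assert check("AAA & BBB")
assert check("ΔΣΘΛ")              # isalpha()/isupper() are Unicode-aware
assert not check("&")             # nothing left after stripping
assert not check("   ")
assert not check("Mixed Case")
```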


-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Stephen Hansen
On Thu, May 5, 2016, at 12:04 AM, Steven D'Aprano wrote:
> On Thursday 05 May 2016 16:46, Stephen Hansen wrote:
> > > On Wed, May 4, 2016, at 11:04 PM, Steven D'Aprano wrote:
> >> Start by writing a function or a regex that will distinguish strings that
> >> match your conditions from those that don't. A regex might be faster, but
> >> here's a function version.
> >> ... snip ...
> > 
> > Yikes. I'm all for the idea that one shouldn't go to regex when Python's
> > powerful string type can answer the problem more clearly, but this seems
> > to go out of its way to do otherwise.
> > 
> > I don't even care about faster: Its overly complicated. Sometimes a
> > regular expression really is the clearest way to solve a problem.
> 
> You're probably right, but I find it easier to reason about matching in 
> Python rather than the overly terse, cryptic regular expression mini-
> language.
> 
> I haven't tested my function version, but I'm 95% sure that it is
> correct. 
> The trickiest part of it is the logic about splitting around ampersands.
> And 
> I'll cheerfully admit that it isn't easy to extend to (say) "ampersand,
> or 
> at signs". But your regex solution:
> 
> r"^[A-Z\s&]+$"
> 
> is much smaller and more compact, but *wrong*. For instance, your regex 
> wrongly accepts both "&" and "  " as valid strings, and wrongly 
> rejects "ΔΣΘΛ". Your Greek customers will be sad...

Meh. You have a pedantic definition of wrong. Given the inputs, it
produced the right output. Very often that's enough. Perfect is the enemy
of good, as it's said. 

There's no situation where "&" and " " will exist in the given
dataset, and recognizing that is important. You don't have to account
for every bit of nonsense. 

If the OP needs a Unicode-aware solution, "A-Z" can perhaps be redefined
as "\w" plus an isupper call. It's still far simpler than what you're
suggesting.

-- 
Stephen Hansen
  m e @ i x o k a i . i o


Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
Oh, a further thought...


On Thursday 05 May 2016 16:46, Stephen Hansen wrote:

> On Wed, May 4, 2016, at 11:04 PM, Steven D'Aprano wrote:
>> Start by writing a function or a regex that will distinguish strings that
>> match your conditions from those that don't. A regex might be faster, but
>> here's a function version.
>> ... snip ...
> 
> Yikes. I'm all for the idea that one shouldn't go to regex when Python's
> powerful string type can answer the problem more clearly, but this seems
> to go out of its way to do otherwise.
> 
> I don't even care about faster: Its overly complicated. Sometimes a
> regular expression really is the clearest way to solve a problem.

Putting non-ASCII letters aside for the moment, how would you match these 
specs as a regular expression?

- All uppercase ASCII letters (A to Z only), optionally separated into words 
by either a bare ampersand (e.g. "AAA&AAA") or an ampersand with leading and 
trailing spaces (spaces only, not arbitrary whitespace): "AAA   & AAA".

- The number of spaces on either side of the ampersands need not be the 
same: "AAA&   BBB &   CCC" should match.

- Leading or trailing spaces, or spaces not surrounding an ampersand, must 
not match: "AAA BBB" must be rejected.

- Leading or trailing ampersands must also be rejected. This includes the 
case where the string is nothing but ampersands.

- Consecutive ampersands "AAA&&&BBB" and the empty string must be rejected.


I get something like this:

r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"


but it fails on strings like "AA   &  A &  A". What am I doing wrong?


For the record, here's my brief test suite:


import re

def test(pat):
    for s in ("", " ", "&", "A A", "A&", "&A", "A&&A", "A& &A"):
        assert re.match(pat, s) is None
    for s in ("A", "A & A", "AA&A", "AA   &  A &  A"):
        assert re.match(pat, s)




-- 
Steve



Re: Whittle it on down

2016-05-05 Thread Peter Otten
Steven D'Aprano wrote:

> Oh, a further thought...
> 
> 
> On Thursday 05 May 2016 16:46, Stephen Hansen wrote:
> 
>> On Wed, May 4, 2016, at 11:04 PM, Steven D'Aprano wrote:
>>> Start by writing a function or a regex that will distinguish strings
>>> that match your conditions from those that don't. A regex might be
>>> faster, but here's a function version.
>>> ... snip ...
>> 
>> Yikes. I'm all for the idea that one shouldn't go to regex when Python's
>> powerful string type can answer the problem more clearly, but this seems
>> to go out of its way to do otherwise.
>> 
>> I don't even care about faster: Its overly complicated. Sometimes a
>> regular expression really is the clearest way to solve a problem.
> 
> Putting non-ASCII letters aside for the moment, how would you match these
> specs as a regular expression?
> 
> - All uppercase ASCII letters (A to Z only), optionally separated into
> words by either a bare ampersand (e.g. "AAA&AAA") or an ampersand with
> leading and
> trailing spaces (spaces only, not arbitrary whitespace): "AAA   & AAA".
> 
> - The number of spaces on either side of the ampersands need not be the
> same: "AAA&   BBB &   CCC" should match.
> 
> - Leading or trailing spaces, or spaces not surrounding an ampersand, must
> not match: "AAA BBB" must be rejected.
> 
> - Leading or trailing ampersands must also be rejected. This includes the
> case where the string is nothing but ampersands.
> 
> - Consecutive ampersands "AAA&&&BBB" and the empty string must be
> rejected.
> 
> 
> I get something like this:
> 
> r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"
> 
> 
> but it fails on strings like "AA   &  A &  A". What am I doing wrong?
> 
> 
> For the record, here's my brief test suite:
> 
> 
> def test(pat):
>     for s in ("", " ", "&", "A A", "A&", "&A", "A&&A", "A& &A"):
>         assert re.match(pat, s) is None
>     for s in ("A", "A & A", "AA&A", "AA   &  A &  A"):
>         assert re.match(pat, s)

>>> import re
>>> def test(pat):
...     for s in ("", " ", "&", "A A", "A&", "&A", "A&&A", "A& &A"):
...         assert re.match(pat, s) is None
...     for s in ("A", "A & A", "AA&A", "AA   &  A &  A"):
...         assert re.match(pat, s)
... 
>>> test("^A+( *& *A+)*$")
>>> 
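The same shape generalizes from the literal A's of the test data to the full upper-case range (a sketch, untested against the OP's real data):

```python
import re

# Peter's pattern shape with [A-Z] in place of the single letter A.
pattern = r"^[A-Z]+( *& *[A-Z]+)*$"

assert re.match(pattern, "AAA & BBB")
assert re.match(pattern, "AAA&BBB &   CCC")
assert re.match(pattern, "AA   &  A &  A")
assert re.match(pattern, "AAA BBB") is None   # space without an ampersand
assert re.match(pattern, "&") is None
```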




Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Thursday 05 May 2016 17:34, Stephen Hansen wrote:


> Meh. You have a pedantic definition of wrong. Given the inputs, it
> produced right output. Very often that's enough. Perfect is the enemy of
> good, it's said.

And this is a *perfect* example of why we have things like this:

http://www.bbc.com/future/story/20160325-the-names-that-break-computer-systems

"Nobody will ever be called Null."

"Nobody has quotation marks in their name."

"Nobody will have a + sign in their email address."

"Nobody has a legal gender other than Male or Female."

"Nobody will lean on the keyboard and enter gobbledygook into our form."

"Nobody will try to write more data than the space they allocated for it."


> There's no situation where "&" and " " will exist in the given
> dataset, and recognizing that is important. You don't have to account
> for every bit of nonsense.

Whenever a programmer says "This case will never happen", ten thousand 
computers crash.

http://www.kr41.net/2016/05-03-shit_driven_development.html


-- 
Steven D'Aprano



Re: No SQLite newsgroup, so I'll ask here about SQLite, python and MS Access

2016-05-05 Thread cl
There's a gmane 'newsgroup from a mailing list' for sqlite:-

gmane.comp.db.sqlite.general

It's quite active and helpful too.   (Also 'announce' and others)

-- 
Chris Green


smtplib not working when python run under windows service via Local System account

2016-05-05 Thread loial
I have a python 2.7.10 script which is being run under a windows service on
a Windows 2012 server.
The python script uses smtplib to send an email.

It works fine when the windows service is run as a local user, but not when the 
windows service is configured to run as Local System account. I get no 
exception from smtplib, but the email fails to arrive.

Any ideas?
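One hedged debugging step (host and addresses below are hypothetical): turn on smtplib's protocol logging so the service captures the full SMTP dialogue on stderr. Under Local System the outbound connection often behaves differently (machine-account relay restrictions, different DNS or proxy settings), and the transcript usually shows where the message goes astray:

```python
import smtplib

def send_report(body, host="mailhub.example.com"):
    # Hypothetical host and addresses, for illustration only.
    smtp = smtplib.SMTP(host, 25, timeout=30)
    try:
        smtp.set_debuglevel(1)   # log the full SMTP dialogue to stderr
        smtp.sendmail("[email protected]",
                      ["[email protected]"], body)
    finally:
        smtp.quit()
```

If the dialogue shows a clean 250 response, the mail server accepted the message and is dropping it later, which points at the relay configuration rather than the script.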



Re: Interacting with Subprocesses

2016-05-05 Thread eryk sun
On Wed, May 4, 2016 at 4:04 PM, Akira Li <[email protected]> wrote:
>
> Pass stdin=PIPE, stdout=PIPE and use p.stdin, p.stdout file objects to
> write input, read output from the child process.
>
> Beware, there could be buffering issues or the child process may change
> its behavior some other way when the standard input/output streams are
> redirected. See
> http://pexpect.readthedocs.io/en/stable/FAQ.html#whynotpipe

On Linux, you may be able to use stdbuf [1] to modify standard I/O
buffering. stdbuf sets the LD_PRELOAD [2] environment variable to load
libstdbuf.so [3]. For example, the following shows the environment
variables created by "stdbuf -oL":

$ stdbuf -oL python -c 'import os;print os.environ["LD_PRELOAD"]'
/usr/lib/coreutils/libstdbuf.so
$ stdbuf -oL python -c 'import os;print os.environ["_STDBUF_O"]'
L

[1]: 
http://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html
[2]: http://www.linuxjournal.com/article/7795
[3]: 
http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/libstdbuf.c?id=v8.21

On Windows, if you can modify the program, then you can check for a
command-line option or an environment variable, like Python's -u and
PYTHONUNBUFFERED.

If you can't modify the source, I think you might be able to hack
something similar to the Linux LD_PRELOAD environment variable by
creating a stdbuf.exe launcher that debugs the process and injects a
DLL after the loader's first-chance breakpoint. The injected
stdbuf.dll would need to be able to get the standard streams for
common CRTs, such as by calling __acrt_iob_func for ucrtbase.dll.
Also, unlike the Linux command, stdbuf.exe would have to wait on the
child, since the Windows API doesn't have fork/exec.
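On the Linux side, a minimal sketch of driving a child through stdbuf from Python (GNU coreutils assumed; the command passed in is up to the caller):

```python
import subprocess

def popen_line_buffered(argv):
    # Prefix the child's command line with "stdbuf -oL" so libstdbuf.so
    # is preloaded and the child's stdout becomes line-buffered even
    # though it is connected to a pipe (Linux + GNU coreutils only).
    return subprocess.Popen(["stdbuf", "-oL"] + list(argv),
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
```

Iterating over the returned process's stdout then yields lines as the child flushes them, rather than only after a multi-kilobyte stdio block fills.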


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 2:04 AM, Steven D'Aprano wrote:

On Thursday 05 May 2016 14:58, DFS wrote:


Want to whittle a list like this:

[...]

Want to keep all elements containing only upper case letters or upper
case letters and ampersand (where ampersand is surrounded by spaces)



Start by writing a function or a regex that will distinguish strings that
match your conditions from those that don't. A regex might be faster, but
here's a function version.

def isupperalpha(string):
    return string.isalpha() and string.isupper()

def check(string):
    if isupperalpha(string):
        return True
    parts = string.split("&")
    if len(parts) < 2:
        return False
    # Don't strip leading spaces from the start of the string.
    parts[0] = parts[0].rstrip(" ")
    # Or trailing spaces from the end of the string.
    parts[-1] = parts[-1].lstrip(" ")
    # But strip leading and trailing spaces from the middle parts
    # (if any).
    for i in range(1, len(parts)-1):
        parts[i] = parts[i].strip(" ")
    return all(isupperalpha(part) for part in parts)


Now you have two ways of filtering this. The obvious way is to extract
elements which meet the condition. Here are two ways:

# List comprehension.
newlist = [item for item in oldlist if check(item)]

# Filter, Python 2 version
newlist = filter(check, oldlist)

# Filter, Python 3 version
newlist = list(filter(check, oldlist))


In practice, this is the best (fastest, simplest) way. But if you fear that
you will run out of memory dealing with absolutely humongous lists with
hundreds of millions or billions of strings, you can remove items in place:


def remove(func, alist):
    for i in range(len(alist)-1, -1, -1):
        if not func(alist[i]):
            del alist[i]


Note the magic incantation to iterate from the end of the list towards the
front. If you do it the other way, Bad Things happen. Note that this will
use less memory than extracting the items, but it will be much slower.
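Reproduced standalone, the reversed loop behaves like this (a forward loop over the same list would skip the element that slides into each just-deleted slot):

```python
def remove(func, alist):
    # Walk from the last index down to 0 so that deleting alist[i]
    # never shifts the positions of elements not yet examined.
    for i in range(len(alist) - 1, -1, -1):
        if not func(alist[i]):
            del alist[i]

data = ["a", "B", "c", "D"]
remove(str.isupper, data)
assert data == ["B", "D"]
```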

You can combine the best of both words. Here is a version that uses a
temporary list to modify the original in place:

# works in both Python 2 and 3
def remove(func, alist):
    # Modify list in place, the fast way.
    alist[:] = filter(func, alist)



You are out of your mind.







Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:39 AM, Stephen Hansen wrote:




pattern = re.compile(r"^[A-Z\s&]+$")



output = [x for x in list if pattern.match(x)]




Holy Sh*t!  r"^[A-Z\s&]+$"  One line of parsing!

I was figuring a few list comprehensions would do it - this is better.

(note: the reason I specified 'spaces around ampersand' is so it would
remove 'Q&A' if that ever came up - but some people write 'Q & A', so
I'll live with that exception, or try to tweak it myself.)

You're the man, man.

Thank you!






Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:53 AM, Jussi Piitulainen wrote:



Either way is easy to approximate with a regex:

import re
upper = re.compile(r'[A-Z &]+')
lower = re.compile(r'[^A-Z &]')
print([datum for datum in data if upper.fullmatch(datum)])
print([datum for datum in data if not lower.search(datum)])


This is similar to Hansen's solution.




I've skipped testing that the ampersand is between spaces, and I've
skipped the period. Adjust.


Will do.



This considers only ASCII upper case letters. You can add individual
letters that matter to you, or you can reach for the documentation to
find if there is some generic notation for all upper case letters.

The newer regex package on PyPI supports POSIX character classes like
[:upper:], I think, and there may or may not be notation for Unicode
character categories in re or regex - LU would be Letter, Uppercase.


Thanks.



Re: Whittle it on down

2016-05-05 Thread Random832


On Thu, May 5, 2016, at 04:41, Steven D'Aprano wrote:
> > There's no situation where "&" and " " will exist in the given
> > dataset, and recognizing that is important. You don't have to account
> > for every bit of nonsense.
> 
> Whenever a programmer says "This case will never happen", ten thousand 
> computers crash.

What crash can including such an entry in the output list cause?

Should the regex also ensure that the data only includes *english words*
separated by space-ampersand-space?


Re: Whittle it on down

2016-05-05 Thread Random832
On Thu, May 5, 2016, at 03:36, Steven D'Aprano wrote:
> Putting non-ASCII letters aside for the moment, how would you match these 
> specs as a regular expression?

Well, obviously *your* language (not the OP's), given the cases you
reject, is "one or more sequences of letters separated by
space*-ampersand-space*", and that is actually one of the easiest kinds
of regex to write: "[A-Z]+( *& *[A-Z]+)*".

However, your spec is wrong:

> - Leading or trailing spaces, or spaces not surrounding an ampersand,
> must not match: "AAA BBB" must be rejected.

The *very first* item in OP's list of good outputs is 'PHYSICAL FITNESS
CONSULTANTS & TRAINERS'.

If you want something that's extremely conservative (except for the
*very odd in context* choice of allowing arbitrary numbers of spaces -
why would you allow this but reject leading or trailing space?) and
accepts all of OP's input:

[A-Z]+(( *& *| +)[A-Z]+)*


Re: Whittle it on down

2016-05-05 Thread Stephen Hansen
On Thu, May 5, 2016, at 12:36 AM, Steven D'Aprano wrote:
> Oh, a further thought...
> 
> On Thursday 05 May 2016 16:46, Stephen Hansen wrote:
> > I don't even care about faster: Its overly complicated. Sometimes a
> > regular expression really is the clearest way to solve a problem.
> 
> Putting non-ASCII letters aside for the moment, how would you match these 
> specs as a regular expression?

I don't know, but mostly because I wouldn't even try. The requirements
are over-specified. If you look at the OP's data (and based on previous
conversation), he's doing web scraping and trying to pull out good data.
There's no absolutely perfect way to do that because the system he's
scraping isn't meant for data processing. The data isn't cleanly
articulated.

Instead, he wants a heuristic to pull out what look like section titles. 

The OP looked at the data and came up with a simple set of rules that
identify these section titles:

>> Want to keep all elements containing only upper case letters or upper 
>> case letters and ampersand (where ampersand is surrounded by spaces)

This translates naturally into a simple regular expression: an uppercase
string with spaces and &'s. Now, that expression doesn't 100% encode
every detail of that rule-- it allows both Q&A and Q & A-- but on my own
looking at the data, I suspect it's good enough. The titles are clearly
separate from the other data scraped by their being upper cased. We just
need to expand our allowed character range into spaces and &'s.

Nothing in the OP's request demands the kind of rigorous matching that
your scenario does. It's a practical problem with a simple, practical
answer.

-- 
Stephen Hansen
  m e @ i x o k a i . i o


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 9:32 AM, Stephen Hansen wrote:

On Thu, May 5, 2016, at 12:36 AM, Steven D'Aprano wrote:

Oh, a further thought...

On Thursday 05 May 2016 16:46, Stephen Hansen wrote:

I don't even care about faster: Its overly complicated. Sometimes a
regular expression really is the clearest way to solve a problem.


Putting non-ASCII letters aside for the moment, how would you match these
specs as a regular expression?


I don't know, but mostly because I wouldn't even try. The requirements
are over-specified. If you look at the OP's data (and based on previous
conversation), he's doing web scraping and trying to pull out good data.
There's no absolutely perfect way to do that because the system he's
scraping isn't meant for data processing. The data isn't cleanly
articulated.

Instead, he wants a heuristic to pull out what look like section titles.



Assigned by a company named localeze, apparently.

http://www.usdirectory.com/cat/g0

https://www.neustarlocaleze.biz/welcome/




The OP looked at the data and came up with a simple set of rules that
identify these section titles:


Want to keep all elements containing only upper case letters or upper
case letters and ampersand (where ampersand is surrounded by spaces)

This translates naturally into a simple regular expression: an uppercase
string with spaces and &'s. Now, that expression doesn't 100% encode
every detail of that rule-- it allows both Q&A and Q & A-- but on my own
looking at the data, I suspect its good enough. The titles are clearly
separate from the other data scraped by their being upper cased. We just
need to expand our allowed character range into spaces and &'s.

Nothing in the OP's request demands the kind of rigorous matching that
your scenario does. Its a practical problem with a simple, practical
answer.



Yes.  And simplicity + practicality = successfulality.

And I do a sanity check before using the data anyway: after parse and 
cleanup and regex matching, I make sure all lists have the same number 
of elements:


lenData = [len(title), len(names), len(addr), len(street),
           len(city), len(state), len(zip)]


if len(set(lenData)) != 1:  alert the media




Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Thu, 5 May 2016 06:17 pm, Peter Otten wrote:

>> I get something like this:
>> 
>> r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"
>> 
>> 
>> but it fails on strings like "AA   &  A &  A". What am I doing wrong?

> test("^A+( *& *A+)*$")

Thanks Peter, that's nice!


-- 
Steven



Ctypes c_void_p overflow

2016-05-05 Thread Joseph L. Casale
I have a CDLL function I use to get a pointer; several other functions
happily accept this pointer, which is really a long when passed to
ctypes.c_void_p. However, only one function with the same type definition
in the prototype overflows. The docs suggest c_void_p takes an int, but
that is not what the first call returns, nor what all but one function
happily accept.

Anyone familiar enough with ctypes that can shed some light?

Thanks,
jlc


Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Thu, 5 May 2016 11:13 pm, Random832 wrote:

> On Thu, May 5, 2016, at 04:41, Steven D'Aprano wrote:
>> > There's no situation where "&" and " " will exist in the given
>> > dataset, and recognizing that is important. You don't have to account
>> > for every bit of nonsense.
>> 
>> Whenever a programmer says "This case will never happen", ten thousand
>> computers crash.
> 
> What crash can including such an entry in the output list cause?

How do I know? It depends what you do with that list.

But if you assume that your list contains alphabetical strings, and pass it
on to code that expects alphabetical strings, why is it so hard to believe
that it might choke when it receives a non-alphabetical string?


> Should the regex also ensure that the data only includes *english words*
> separated by space-ampersand-space?

That wasn't part of the specification. But for some applications, yes, you
should ensure the data includes only English words.



-- 
Steven



Re: Ctypes c_void_p overflow

2016-05-05 Thread Steven D'Aprano
On Fri, 6 May 2016 01:42 am, Joseph L. Casale wrote:

> I have a CDLL function I use to get a pointer; several other functions
> happily accept this pointer, which is really a long when passed to
> ctypes.c_void_p. However, only one function with the same type definition
> in the prototype overflows. The docs suggest c_void_p takes an int, but
> that is not what the first call returns, nor what all but one function
> happily accept.
> 
> Anyone familiar enough with ctypes that can shed some light?

I'm not a ctypes expert, but you might get better responses if you show the
code you're using, the expected result, and the result you actually get.
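That said, one common cause worth ruling out (an assumption, not a diagnosis of code we haven't seen): ctypes treats every foreign function as returning a C int unless told otherwise, so on 64-bit Windows a pointer-sized return value can overflow before it ever reaches c_void_p. A minimal sketch:

```python
import ctypes

# A full 64-bit value survives a round trip through c_void_p itself...
p = ctypes.c_void_p(0x7FFF12345678)
assert p.value == 0x7FFF12345678

# ...so the usual culprit is an undeclared return type. Hypothetical
# library and function names, for illustration only:
#
#   lib = ctypes.CDLL("example.dll")
#   lib.get_handle.restype = ctypes.c_void_p   # default restype is c_int
#   lib.use_handle.argtypes = [ctypes.c_void_p]
```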



-- 
Steven



Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Thu, 5 May 2016 11:32 pm, Stephen Hansen wrote:

> On Thu, May 5, 2016, at 12:36 AM, Steven D'Aprano wrote:
>> Oh, a further thought...
>> 
>> On Thursday 05 May 2016 16:46, Stephen Hansen wrote:
>> > I don't even care about faster: Its overly complicated. Sometimes a
>> > regular expression really is the clearest way to solve a problem.
>> 
>> Putting non-ASCII letters aside for the moment, how would you match these
>> specs as a regular expression?
> 
> I don't know, but mostly because I wouldn't even try. 

Really? Peter Otten seems to have found a solution, and Random832 almost
found it too.


> The requirements 
> are over-specified. If you look at the OP's data (and based on previous
> conversation), he's doing web scraping and trying to pull out good data.

I'm not talking about the OP's data. I'm talking about *my* requirements.

I thought that this was a friendly discussion about regexes, but perhaps I
was mistaken. Because I sure am feeling a lot of hostility to the ideas
that regexes are not necessarily the only way to solve this, and that data
validation is a good thing.


> There's no absolutely perfect way to do that because the system he's
> scraping isn't meant for data processing. The data isn't cleanly
> articulated.

Right. Which makes it *more*, not less, important to be sure that your regex
doesn't match too much, because your data is likely to be contaminated by
junk strings that don't belong in the data and shouldn't be accepted. I've
done enough web scraping to realise just how easy it is to start grabbing
data from the wrong part of the file.


> Instead, he wants a heuristic to pull out what look like section titles.

Good for him. I asked a different question. Does my question not count?


> The OP looked at the data and came up with a simple set of rules that
> identify these section titles:
> 
>>> Want to keep all elements containing only upper case letters or upper
>>> case letters and ampersand (where ampersand is surrounded by spaces)

That simple rule doesn't match his examples, as I know too well because I
made the silly mistake of writing to the written spec as written without
reading the examples as well. As I already admitted. That was a silly
mistake because I know very well that people are really bad at writing
detailed specs that neither match too much nor too little.

But you know, I was more focused on the rest of his question, namely whether
it was better to extract the matching strings into a new list, or delete the
non-matches from the existing list, and just got carried away writing the
match function. I didn't actually expect anyone to use it. It was untested,
and I hinted that a regex would probably be better.

I was trying to teach DFS a generic programming technique, not solve his
stupid web scraping problem for him. What happens next time when he's
trying to filter a list of floats, or Widgets? Should he convert them to
strings so he can use a regex to match them, or should he learn about
general filtering techniques?


> This translates naturally into a simple regular expression: an uppercase
> string with spaces and &'s. Now, that expression doesn't 100% encode
> every detail of that rule-- it allows both Q&A and Q & A-- but on my own
> looking at the data, I suspect its good enough. The titles are clearly
> separate from the other data scraped by their being upper cased. We just
> need to expand our allowed character range into spaces and &'s.
> 
> Nothing in the OP's request demands the kind of rigorous matching that
> your scenario does. Its a practical problem with a simple, practical
> answer.

Yes, and that practical answer needs to reject:

- the empty string, because it is easy to mistakenly get empty strings when
scraping data, especially if you post-process the data;

- strings that are all spaces, because "   " cannot possibly be a title;

- strings that are all ampersands, because "&" is not a title, and it
almost surely indicates that your scraping has gone wrong and you're
reading junk from somewhere;

- even leading and trailing spaces are suspect: "  FOO  " doesn't match any
of the examples given, and it seems unlikely to be a title. Presumably the
strings have already been filtered or post-processed to have leading and
trailing spaces removed, in which case "  FOO  " reveals a bug.
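Encoded as assertions against an anchored pattern of the shape discussed in this thread (a sketch; adapt the pattern to taste), those cases are cheap to keep as a permanent test:

```python
import re

pat = re.compile(r"^[A-Z]+( *& *[A-Z]+)*$")

# Junk that scraping bugs commonly produce must be rejected...
for junk in ("", "   ", "&", "&&&", "  FOO  "):
    assert pat.match(junk) is None

# ...while plausible titles still pass.
for title in ("FOO", "FOO & BAR", "FOO&BAR"):
    assert pat.match(title)
```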

 

-- 
Steven



Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Thu, 5 May 2016 10:31 pm, DFS wrote:

> You are out of your mind.

That's twice you've tried to put me down, first by dismissing my comments
about text processing with "Linguist much", and now an outright insult. The
first time I laughed it off and made a joke about it. I won't do that
again.

You asked whether it was better to extract the matching strings into a new
list, or remove them in place in the existing list. I not only showed you
how to do both, but I tried to give you the mental tools to understand when
you should pick one answer over the other. And your response is to insult
me and question my sanity.

Well, DFS, I might be crazy, but I'm not stupid. If that's really how you
feel about my answers, I won't make the mistake of wasting my time
answering your questions in the future.

Over to you now.


-- 
Steven



Re: Whittle it on down

2016-05-05 Thread Jussi Piitulainen
Steven D'Aprano writes:

> I get something like this:
>
> r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"
>
>
> but it fails on strings like "AA   &  A &  A". What am I doing wrong?

It cannot split the string as (LETTERS & LETTERS)(LETTERS & LETTERS)
when the middle part is just one LETTER. That's something of a
misanalysis anyway. I notice that the correct pattern has already been
posted at least thrice and you have acknowledged one of them.

But I think you are also trying to do too much with a single regex. A
more promising start is to think of the whole string as "parts" joined
with "glue", then split with a glue pattern and test the parts:

import re
glue = re.compile(" *& *| +")
keep, drop = [], []
for datum in data:
    items = glue.split(datum)
    if all(map(str.isupper, items)):
        keep.append(datum)
    else:
        drop.append(datum)

That will cope with Greek, by the way.

It's annoying that the order of the branches of the glue pattern above
matters. One _does_ have problems when one uses the usual regex engines.
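The order sensitivity is easy to demonstrate: with the ampersand branch first, " & " is consumed as a single separator, but with the branches swapped the space branch wins first and an empty item leaks into the output:

```python
import re

# Ampersand branch first: " & " is one separator.
assert re.split(" *& *| +", "A & B") == ["A", "B"]

# Space branch first: the space before "&" matches on its own, then
# "& " matches, leaving an empty item between the two separators.
assert re.split(" +| *& *", "A & B") == ["A", "", "B"]
```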

Capturing groups in the glue pattern would produce glue items in the
split output. Either avoid them or deal with them: one could split with
the underspecific "([ &]+)" and then check that each glue item contains
at most one ampersand. One could also allow other punctuation, and then
check afterwards.

One can use _another_ regex to test individual parts. Code above used
str.isupper to test a part. The improved regex package (from PyPI, to
cope with Greek) can do the same:

import regex
part = regex.compile("[[:upper:]]+")
glue = regex.compile(" *& *| +")

keep, drop = [], []
for datum in data:
    items = glue.split(datum)
    if all(map(part.fullmatch, items)):
        keep.append(datum)
    else:
        drop.append(datum)

Just "[A-Z]+" suffices for ASCII letters, and "[A-ZÄÖ]+" copes with most
of Finnish; the [:upper:] class is nicer and there's much more that is
nicer in the newer regex package.

The point of using a regex for this is that the part pattern can then be
generalized to allow some punctuation or digits in a part, for example.
Anything that the glue pattern doesn't consume. (Nothing wrong with
using other techniques for this, either; str.isupper worked nicely
above.)

It's also possible to swap the roles of the patterns. Split with a part
pattern. Then check that the text between such parts is glue:

keep, drop = [], []
for datum in data:
    items = part.split(datum)
    if all(map(glue.fullmatch, items)):
        keep.append(datum)
    else:
        drop.append(datum)

The point is to keep the patterns simple by making them more local, or
more relaxed, followed by a further test. This way they can be made to
do more, but not more than they reasonably can.

Note also the use of re.fullmatch instead of re.match (let alone
re.search) when a full match is required! This gets rid of all anchors
in the pattern, which may in turn allow fewer parentheses inside the
pattern.
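A quick illustration of the difference (requires Python 3.4+, where re.fullmatch was added):

```python
import re

part = re.compile("[A-Z]+")

# re.match anchors only at the start; fullmatch requires the whole
# string to match, with no ^...$ needed in the pattern itself.
assert part.match("Azzz")             # matches just the leading "A"
assert part.fullmatch("Azzz") is None # rejected: trailing "zzz"
assert part.fullmatch("ABC")          # the whole string matches
```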

The usual regex engines are not perfect, but parts of them are
fantastic.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Thu, 5 May 2016 11:21 pm, Random832 wrote:

> On Thu, May 5, 2016, at 03:36, Steven D'Aprano wrote:
>> Putting non-ASCII letters aside for the moment, how would you match these
>> specs as a regular expression?
> 
> Well, obviously *your* language (not the OP's), given the cases you
> reject, is "one or more sequences of letters separated by
> space*-ampersand-space*", and that is actually one of the easiest kinds
> of regex to write: "[A-Z]+( *& *[A-Z]+)*".

One of the easiest kind of regex to write incorrectly:

py> re.match("[A-Z]+( *& *[A-Z]+)*", "A")
<_sre.SRE_Match object at 0xb7bf4aa0>


It doesn't even get the "all uppercase" part of the specification:

py> re.match("[A-Z]+( *& *[A-Z]+)*", "Azzz")
<_sre.SRE_Match object at 0xb7bf4aa0>

You failed to anchor the pattern at the beginning and end of the string, an
easy mistake to make, but that's the point. It's easy to make mistakes with
regexes because the syntax is so overly terse and unforgiving.

But I think I just learned something important today. I learned that it's
not actually regexes that I dislike, it's regex culture that I dislike.
What I learned from this thread:


- Nobody could possibly want to support non-ASCII text. (Apart from the
approximately 6.5 billion people in the world that don't speak English of
course, an utterly insignificant majority.)

- Data validity doesn't matter, because there's no possible way that you
might accidentally scrape data from the wrong part of a HTML file and end
up with junk input.

- Even if you do somehow end up with junk, there couldn't possibly be any
real consequences to that.

- It doesn't matter if you match too much, or too little; that just means
the specs are too pedantic.


Hence the famous quote:

Some people, when confronted with a problem, think 
"I know, I'll use regular expressions." Now they 
have two problems.


It's not really regexes that are the problem.


> However, your spec is wrong:

How can you say that? It's *my* spec, I can specify anything I want.


>> - Leading or trailing spaces, or spaces not surrounding an ampersand,
>> must not match: "AAA BBB" must be rejected.
> 
> The *very first* item in OP's list of good outputs is 'PHYSICAL FITNESS
> CONSULTANTS & TRAINERS'.

That's very nice, but irrelevant. I'm not talking about the OP's outputs.
I'm giving my own.




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Fri, 6 May 2016 03:49 am, Jussi Piitulainen wrote:

> Steven D'Aprano writes:
> 
>> I get something like this:
>>
>> r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"
>>
>>
>> but it fails on strings like "AA   &  A &  A". What am I doing wrong?
> 
> It cannot split the string as (LETTERS & LETTERS)(LETTERS & LETTERS)
> when the middle part is just one LETTER. That's something of a
> misanalysis anyway. I notice that the correct pattern has already been
> posted at least thrice and you have acknowledged one of them.

Thrice? I've seen Peter's response (he made the trivial and obvious
simplification of just using A instead of [A-Z], but that was easy to
understand), and Random832 almost got it, missing only that you need to
match the entire string, not just a substring. If there was a third
response, I missed it.


> But I think you are also trying to do too much with a single regex. A
> more promising start is to think of the whole string as "parts" joined
> with "glue", then split with a glue pattern and test the parts:
> 
> import re
> glue = re.compile(" *& *| +")
> keep, drop = [], []
> for datum in data:
> items = glue.split(datum)
> if all(map(str.isupper, items)):
> keep.append(datum)
> else:
> drop.append(datum)

Ah, the penny drops! For a while I thought you were suggesting using this to
assemble a regex, and it just wasn't making sense to me. Then I realised
you were using this as a matcher: feed in the list of strings, and it
splits it into strings to keep and strings to discard. Nicely done, that is
a good technique to remember.

Thanks for the analysis!



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Jussi Piitulainen
Steven D'Aprano writes:

> On Fri, 6 May 2016 03:49 am, Jussi Piitulainen wrote:
>
>> Steven D'Aprano writes:
>> 
>>> I get something like this:
>>>
>>> r"(^[A-Z]+$)|(^([A-Z]+[ ]*\&[ ]*[A-Z]+)+$)"
>>>
>>>
>>> but it fails on strings like "AA   &  A &  A". What am I doing wrong?
>> 
>> It cannot split the string as (LETTERS & LETTERS)(LETTERS & LETTERS)
>> when the middle part is just one LETTER. That's something of a
>> misanalysis anyway. I notice that the correct pattern has already been
>> posted at least thrice and you have acknowledged one of them.
>
> Thrice? I've seen Peter's response (he made the trivial and obvious
> simplification of just using A instead of [A-Z], but that was easy to
> understand), and Random832 almost got it, missing only that you need to
> match the entire string, not just a substring. If there was a third
> response, I missed it.

I think I saw another. I may be mistaken.

Random832's pattern is fine. You need to use re.fullmatch with it.

-- 
https://mail.python.org/mailman/listinfo/python-list


PyDev 5.0.0 Released

2016-05-05 Thread Fabio Zadrozny
PyDev 5.0.0 Released

Release Highlights:
---

* **Important** PyDev now requires Java 8.

* PyDev 4.5.5 is the last release supporting Java 7.
* See: http://www.pydev.org/update_sites/index.html for the update site of
older versions of PyDev.
* See: the **PyDev does not appear after install** section on
http://www.pydev.org/download.html for help on using a Java 8 vm in Eclipse.

* PyUnit view now persists its state across restarts.

* Fixed issue in super() code completion.

* PyDev.Debugger updated to the latest version.

* No longer showing an unneeded shell on Linux on startup when showing the
donation dialog.

* Fixed pyedit_wrap_expression to avoid halt of the IDE on Ctrl+1 -> Wrap
expression.

What is PyDev?
---

PyDev is an open-source Python IDE on top of Eclipse for Python, Jython and
IronPython development.

It comes with goodies such as code completion, syntax highlighting, syntax
analysis, code analysis, refactor, debug, interactive console, etc.

Details on PyDev: http://pydev.org
Details on its development: http://pydev.blogspot.com


What is LiClipse?
---

LiClipse is a PyDev standalone with goodies such as support for Multiple
cursors, theming, TextMate bundles and a number of other languages such as
Django Templates, Jinja2, Kivy Language, Mako Templates, Html, Javascript,
etc.

It's also a commercial counterpart which helps support the development
of PyDev.

Details on LiClipse: http://www.liclipse.com/



Cheers,

--
Fabio Zadrozny
--
Software Developer

LiClipse
http://www.liclipse.com

PyDev - Python Development Environment for Eclipse
http://pydev.org
http://pydev.blogspot.com

PyVmMonitor - Python Profiler
http://www.pyvmmonitor.com/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Random832
On Thu, May 5, 2016, at 14:03, Steven D'Aprano wrote:
> You failed to anchor the string at the beginning and end of the string,
> an easy mistake to make, but that's the point.

I don't think anchoring is properly a concern of the regex itself -
.match is anchored implicitly at the beginning, and one could easily
imagine an API that implicitly anchors at the end - or you can simply
check that the match length == the string length.
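A rough sketch of that length check (my illustration, not from the post; note it can reject strings where the engine stopped at a shorter match even though backtracking could have covered the whole string, which is why re.fullmatch is preferable where available):

```python
import re

def matches_fully(pattern, string):
    # Anchor "by hand": require that the match consumed the whole string.
    # Caveat: a match that stops early is simply rejected here, whereas
    # re.fullmatch (Python 3.4+) backtracks until the full string matches.
    m = re.match(pattern, string)
    return m is not None and m.end() == len(string)
```

For Random832's pattern, `matches_fully(r"[A-Z]+( *& *[A-Z]+)*", "AA & B")` holds, while `"Azzz"` and `"AAA BBB"` are rejected.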

> - Data validity doesn't matter, because there's no possible way that you
> might accidentally scrape data from the wrong part of a HTML file and end
> up with junk input.

If you've scraped data from the wrong part of the file, then nothing you
do to your regex can prevent the junk input from coincidentally matching
the input format.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Random832
On Thu, May 5, 2016, at 14:27, Jussi Piitulainen wrote:
> Random832's pattern is fine. You need to use re.fullmatch with it.

Heh, in my previous post I said "and one could easily imagine an API
that implicitly anchors at the end". So easy to imagine, it turns out,
that someone already did. Batteries included indeed.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Stephen Hansen
On Thu, May 5, 2016, at 10:43 AM, Steven D'Aprano wrote:
> On Thu, 5 May 2016 11:32 pm, Stephen Hansen wrote:
> 
> > On Thu, May 5, 2016, at 12:36 AM, Steven D'Aprano wrote:
> >> Oh, a further thought...
> >> 
> >> On Thursday 05 May 2016 16:46, Stephen Hansen wrote:
> >> > I don't even care about faster: Its overly complicated. Sometimes a
> >> > regular expression really is the clearest way to solve a problem.
> >> 
> >> Putting non-ASCII letters aside for the moment, how would you match these
> >> specs as a regular expression?
> > 
> > I don't know, but mostly because I wouldn't even try. 
> 
> Really? Peter Otten seems to have found a solution, and Random832 almost
> found it too.
> 
> 
> > The requirements 
> > are over-specified. If you look at the OP's data (and based on previous
> > conversation), he's doing web scraping and trying to pull out good data.
> 
> I'm not talking about the OP's data. I'm talking about *my* requirements.
> 
> I thought that this was a friendly discussion about regexes, but perhaps
> I
> was mistaken. Because I sure am feeling a lot of hostility to the ideas
> that regexes are not necessarily the only way to solve this, and that
> data
> validation is a good thing.

Umm, what? Hostility? I have no idea where you're getting that.

I didn't say that regexes are the only way to solve problems; in fact
they're something I avoid using in most cases. In the OP's case, though,
I did say I thought one was a natural fit. Usually, I'd go for
startswith/endswith, "in", slicing and such string primitives before I
go for a regular expression.

"Find all upper cased phrases that may have &'s in them" is something
just specific enough that the built in string primitives are awkward
tools.

In my experience, most of the problems with regexes arise because people
think they're a hammer and every problem is a nail: and then they get into
ever more convoluted expressions that become brittle. More specificity in
a regular expression is not, necessarily, a virtue. In fact it's exactly
the opposite a lot of times.

> > There's no absolutely perfect way to do that because the system he's
> > scraping isn't meant for data processing. The data isn't cleanly
> > articulated.
> 
> Right. Which makes it *more*, not less, important to be sure that your
> regex
> doesn't match too much, because your data is likely to be contaminated by
> junk strings that don't belong in the data and shouldn't be accepted.
> I've
> done enough web scraping to realise just how easy it is to start grabbing
> data from the wrong part of the file.

I have nothing against data validation: I don't think it belongs in
regular expressions, though. That can be a step done afterwards.

> > Instead, he wants a heuristic to pull out what look like section titles.
> 
> Good for him. I asked a different question. Does my question not count?

Sure it counts, but I don't want to engage in your theoretical exercise.
That's not being hostile, that's me not wanting to think about a complex
set of constraints for a regular expression for purely intellectual
reasons.

> I was trying to teach DFS a generic programming technique, not solve his
> stupid web scraping problem for him. What happens next time when he's
> trying to filter a list of floats, or Widgets? Should he convert them to
> strings so he can use a regex to match them, or should he learn about
> general filtering techniques?

Come on. This is a bit presumptuous, don't you think?

> > This translates naturally into a simple regular expression: an uppercase
> > string with spaces and &'s. Now, that expression doesn't 100% encode
> > every detail of that rule-- it allows both Q&A and Q & A-- but on my own
> > looking at the data, I suspect its good enough. The titles are clearly
> > separate from the other data scraped by their being upper cased. We just
> > need to expand our allowed character range into spaces and &'s.
> > 
> > Nothing in the OP's request demands the kind of rigorous matching that
> > your scenario does. Its a practical problem with a simple, practical
> > answer.
> 
> Yes, and that practical answer needs to reject:
> 
> - the empty string, because it is easy to mistakenly get empty strings
> when
> scraping data, especially if you post-process the data;
> 
> - strings that are all spaces, because "   " cannot possibly be a
> title;
> 
> - strings that are all ampersands, because "&" is not a title, and it
> almost surely indicates that your scraping has gone wrong and you're
> reading junk from somewhere;
> 
> - even leading and trailing spaces are suspect: "  FOO  " doesn't match
> any
> of the examples given, and it seems unlikely to be a title. Presumably
> the
> strings have already been filtered or post-processed to have leading and
> trailing spaces removed, in which case "  FOO  " reveals a bug.

We're going to have to agree to disagree. I find all of that
unnecessary. Any validation can easily be done before or after
matching; you don't need to over-complicate it.

Re: Whittle it on down

2016-05-05 Thread Stephen Hansen
On Thu, May 5, 2016, at 05:31 AM, DFS wrote:
> You are out of your mind.

Whoa, now. I might disagree with Steven D'Aprano about how to approach
this problem, but there's no need to be rude. Everyone's trying to help
you, after all.

-- 
Stephen Hansen
  m e @ i x o k a i . i o
-- 
https://mail.python.org/mailman/listinfo/python-list


Python is an Equal Opportunity Programming Language

2016-05-05 Thread Terry Reedy

https://motherboard.vice.com/blog/python-is-an-equal-opportunity-programming-language

from an 'Intel® Software Evangelist'
--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Stephen Hansen
On Thu, May 5, 2016, at 11:03 AM, Steven D'Aprano wrote:
> - Nobody could possibly want to support non-ASCII text. (Apart from the
> approximately 6.5 billion people in the world that don't speak English of
> course, an utterly insignificant majority.)

Oh, I'd absolutely want to support non-ASCII text. If I have unicode
input, though, I unfortunately have to rely on
https://pypi.python.org/pypi/regex as 're' doesn't support matching on
character properties. 

I keep hoping it'll replace "re", then we could do:

pattern = regex.compile(r"^[\p{Lu}\s&]+$")

where \p{property} matches against character properties in the unicode
database.
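With only the stdlib re, the contrast is easy to demonstrate: str.isupper is Unicode-aware, while an ASCII character class is not (the \p{Lu} property class above assumes the third-party regex package):

```python
import re

greek = "ΔΣΘΛ & ΑΒΓ"

# An ASCII-only character class rejects perfectly good uppercase Greek...
assert re.match(r"^[A-Z\s&]+$", greek) is None

# ...while the Unicode-aware string methods accept it.
parts = re.split(r" *& *| +", greek)
assert all(p.isupper() for p in parts)
```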

> - Data validity doesn't matter, because there's no possible way that you
> might accidentally scrape data from the wrong part of a HTML file and end
> up with junk input.

Um, no one said that. I was arguing that the *regular expression*
doesn't need to be responsible for validation.

> - Even if you do somehow end up with junk, there couldn't possibly be any
> real consequences to that.

No one said that either...

> - It doesn't matter if you match too much, or to little, that just means
> the
> specs are too pedantic.

Or that...

-- 
Stephen Hansen
  m e @ i x o k a i . i o
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

2016-05-05 Thread Ethan Furman

On 04/24/2016 08:20 AM, Ian Kelly wrote:

On Sun, Apr 24, 2016 at 1:20 AM, Ethan Furman wrote:



What fun things can Java enums do?


Everything that Python enums can do, plus:


--> Planet.EARTH.value
(5.976e+24, 6378140.0)
--> Planet.EARTH.surface_gravity
9.802652743337129

This is incredibly useful, but it has a flaw: the value of each member
of the enum is just the tuple of its arguments. Suppose we added a
value for COUNTER_EARTH describing a hypothetical planet with the same
mass and radius existing on the other side of the sun. [1] Then:

--> Planet.EARTH is Planet.COUNTER_EARTH
True


If using Python 3 and aenum 1.4.1+, you can do

--> class Planet(Enum, settings=NoAlias, init='mass radius'):
...     MERCURY = (3.303e+23, 2.4397e6)
...     VENUS   = (4.869e+24, 6.0518e6)
...     EARTH   = (5.976e+24, 6.37814e6)
...     COUNTER_EARTH = EARTH
...     @property
...     def surface_gravity(self):
...         # universal gravitational constant  (m3 kg-1 s-2)
...         G = 6.67300E-11
...         return G * self.mass / (self.radius * self.radius)
...
--> Planet.EARTH.value
(5.976e+24, 6378140.0)
--> Planet.EARTH.surface_gravity
9.802652743337129
--> Planet.COUNTER_EARTH.value
(5.976e+24, 6378140.0)
--> Planet.COUNTER_EARTH.surface_gravity
9.802652743337129

--> Planet.EARTH is Planet.COUNTER_EARTH
False



* Speaking of AutoNumber, since Java enums don't have the
instance/value distinction, they effectively do this implicitly, only
without generating a bunch of ints that are entirely irrelevant to
your enum type. With Python enums you have to follow a somewhat arcane
recipe to avoid specifying values, which just generates some values
and then hides them away. And it also breaks the Enum alias feature:

--> class Color(AutoNumber):
...     red = default = ()  # not an alias!
...     blue = ()
...


Another thing you could do here:

--> class Color(Enum, settings=AutoNumber):
...     red
...     default = red
...     blue
...
--> list(Color)
[<Color.red: 1>, <Color.blue: 2>]
--> Color.default is Color.red
True

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Ctypes c_void_p overflow

2016-05-05 Thread eryk sun
On Thu, May 5, 2016 at 10:42 AM, Joseph L. Casale
 wrote:
> I have CDLL function I use to get a pointer, several other functions happily 
> accept this
> pointer which is really a long when passed to ctypes.c_void_p. However, only 
> one with
> same type def in the prototype overflows. Docs suggest c_void_p takes an int 
> but that
> is not what the first call returns, nor what all but one function happily 
> accept?

What you're describing isn't clear to me, so I'll describe the general
case of handling pointers with ctypes functions. If a function returns
a pointer, you must set the function's restype to a pointer type since
the default c_int restype truncates the upper half of a 64-bit
pointer. Generally you also have to do the same for pointer parameters
in argtypes. Otherwise integer arguments are converted to C int
values.
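The truncation is easy to see without calling into a real library: ctypes integer types do no overflow checking (per the ctypes docs), so a 64-bit pointer value squeezed through c_int silently loses its top half. A small illustration with a made-up address:

```python
import ctypes

# A made-up 64-bit heap address with bits set above 2**32.
addr = 0x00007FF012345678

# The default restype is c_int; no overflow checking is done,
# so only the low 32 bits survive the round trip.
truncated = ctypes.c_int(addr).value
assert truncated == 0x12345678 and truncated != addr

# c_void_p preserves the full value on a 64-bit build.
if ctypes.sizeof(ctypes.c_void_p) == 8:
    assert ctypes.c_void_p(addr).value == addr
```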

Note that when restype is set to c_void_p, the result gets converted
to a Python integer (or None for a NULL result). If you pass this
result back as a ctypes function argument, the function must have the
parameter set to c_void_p in argtypes. If argtypes isn't set, the
default integer conversion may truncate the pointer value. This
problem won't occur on a 32-bit platform, so there's a lot of
carelessly written ctypes code that makes this mistake.

Simple types are also automatically converted when accessed as a field
of a struct or union or as an index of an array or pointer. To avoid
this, you can use a subclass of the type, since ctypes won't
automatically convert subclasses of simple types.

I generally avoid c_void_p because its lenient from_param method
(called to convert arguments) doesn't provide much type safety. If a
bug causes an incorrect argument to be passed, I prefer getting an
immediate ctypes.ArgumentError rather than a segfault or data
corruption. For example, when a C API returns a void pointer as a
handle for an opaque structure or object, I prefer to handle it as a
pointer to an empty Structure subclass, as follows:

class _ContosoHandle(ctypes.Structure):
pass

ContosoHandle = ctypes.POINTER(_ContosoHandle)

lib.CreateContoso.restype = ContosoHandle
lib.DestroyContoso.argtypes = (ContosoHandle,)

ctypes will raise an ArgumentError if DestroyContoso is called with
arguments such as 123456789 or "Crash Me".
-- 
https://mail.python.org/mailman/listinfo/python-list


python, ctypes and GetIconInfo issue

2016-05-05 Thread mymyxin
Hello,

I try to make the GetIconInfo function work, but I can't figure out
what I'm doing wrong.

From the MSDN documentation the function is

https://msdn.microsoft.com/en-us/library/windows/desktop/ms648070%28v=vs.85%29.aspx

# BOOL WINAPI GetIconInfo(
# _In_  HICON hIcon,
# _Out_ PICONINFO piconinfo
# );

which I defined as

GetIconInfo = windll.user32.GetIconInfo
GetIconInfo.argtypes   = [HICON, POINTER(ICONINFO)]
GetIconInfo.restype= BOOL
GetIconInfo.errcheck   = ErrorIfZero


The structure piconinfo is described as
https://msdn.microsoft.com/en-us/library/windows/desktop/ms648052%28v=vs.85%29.aspx

# typedef struct _ICONINFO {
# BOOL    fIcon;
# DWORD   xHotspot;
# DWORD   yHotspot;
# HBITMAP hbmMask;
# HBITMAP hbmColor;
# } ICONINFO, *PICONINFO;

my implementation is

class ICONINFO(Structure):
__fields__ = [
  ('fIcon', BOOL),
  ('xHotspot',  DWORD),
  ('yHotspot',  DWORD),
  ('hbmMask',   HBITMAP),
  ('hbmColor',  HBITMAP),
 ]
 
# not part of the problem but needed to get the icon handle
hicon = ImageList_GetIcon(def_il_handle, 1, ILD_NORMAL)
print hicon


As the documentation states, the function succeeds if the return code is
nonzero. Well, I get 1 returned, but as soon as I try to access a class
member the program crashes.

iconinfo = ICONINFO()
lres =  GetIconInfo(hicon, pointer(iconinfo))
print lres
print '{0}'.format(sizeof(iconinfo))   # <- crash

If I comment out the sizeof print, the program keeps running, but if I call
the same code a second time it crashes at GetIconInfo(hicon, ...)

So it looks like I'm doing something terribly wrong but don't see it.

Can someone shed some light on it?

Thank you
Hubert
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Ctypes c_void_p overflow

2016-05-05 Thread Joseph L. Casale
> I generally avoid c_void_p because its lenient from_param method
> (called to convert arguments) doesn't provide much type safety. If a
> bug causes an incorrect argument to be passed, I prefer getting an
> immediate ctypes.ArgumentError rather than a segfault or data
> corruption. For example, when a C API returns a void pointer as a
> handle for an opaque structure or object, I prefer to handle it as a
> pointer to an empty Structure subclass, as follows:
>
> class _ContosoHandle(ctypes.Structure):
> pass
>
> ContosoHandle = ctypes.POINTER(_ContosoHandle)
>
> lib.CreateContoso.restype = ContosoHandle
> lib.DestroyContoso.argtypes = (ContosoHandle,)
>
> ctypes will raise an ArgumentError if DestroyContoso is called with
> arguments such as 123456789 or "Crash Me".

After typing up a response with all the detail, your reply helped me see
the error.

Thank you so much for all that detail, it was very much appreciated!
jlc
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:54 PM, Steven D'Aprano wrote:

On Thu, 5 May 2016 10:31 pm, DFS wrote:


You are out of your mind.


That's twice you've tried to put me down, first by dismissing my comments
about text processing with "Linguist much", and now an outright insult. The
first time I laughed it off and made a joke about it. I won't do that
again.

>

You asked whether it was better to extract the matching strings into a new
list, or remove them in place in the existing list. I not only showed you
how to do both, but I tried to give you the mental tools to understand when
you should pick one answer over the other. And your response is to insult
me and question my sanity.

Well, DFS, I might be crazy, but I'm not stupid. If that's really how you
feel about my answers, I won't make the mistake of wasting my time
answering your questions in the future.

Over to you now.



heh!  Relax, pal.

I was just trying to be funny - no insult intended either time, of 
course.  Look for similar responses from me in the future.  Usenet 
brings out the smart-aleck in me.


Actually, you should've accepted the 'Linguist much?' as a compliment, 
because I seriously thought you were.


But you ARE out of your mind if you prefer that convoluted "function" 
method over a simple 1-line regex method (as per S. Hansen).


def isupperalpha(string):
    return string.isalpha() and string.isupper()

def check(string):
    if isupperalpha(string):
        return True
    parts = string.split("&")
    if len(parts) < 2:
        return False
    parts[0] = parts[0].rstrip(" ")
    parts[-1] = parts[-1].lstrip(" ")
    for i in range(1, len(parts)-1):
        parts[i] = parts[i].strip(" ")
    return all(isupperalpha(part) for part in parts)


I'm sure it does the job well, but that style brings back [bad] memories 
of the VBA I used to write.  I expected something very concise and 
'pythonic' (which I'm learning is everyone's favorite mantra here in 
python-land).


Anyway, I appreciate ALL replies to my queries.  So thank you for taking 
the time.


Whenever I'm able, I'll try to contribute to clp as well.




--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 2:56 PM, Stephen Hansen wrote:

On Thu, May 5, 2016, at 05:31 AM, DFS wrote:

You are out of your mind.


Whoa, now. I might disagree with Steven D'Aprano about how to approach
this problem, but there's no need to be rude.


Seriously not trying to be rude - more smart-alecky than anything.

Hope D'Aprano doesn't stay butthurt...




Everyone's trying to help you, after all.


Yes, and I do appreciate it.

I've only been working with python for about a month, but I feel like 
I'm making good progress.  clp is a great resource, and I'll be hanging 
around for a long time, and will contribute when possible.


Thanks for your help.
--
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread DFS

On 5/5/2016 1:39 AM, Stephen Hansen wrote:


Given:


input = [u'Espa\xf1ol', 'Health & Fitness Clubs (36)', 'Health Clubs & Gymnasiums (42)', 'Health Fitness Clubs', 
'Name', 'Atlanta city guide', 'edit address', 'Tweet', 'PHYSICAL FITNESS CONSULTANTS & TRAINERS', 'HEALTH CLUBS & 
GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS', 'www.custombuiltpt.com/', 'RACQUETBALL COURTS PRIVATE', 'www.lafitness.com', 
'GYMNASIUMS', 'HEALTH & FITNESS CLUBS', 'www.lafitness.com', 'HEALTH & FITNESS CLUBS', 'www.lafitness.com', 
'PERSONAL FITNESS TRAINERS', 'HEALTH CLUBS & GYMNASIUMS', 'EXERCISE & PHYSICAL FITNESS PROGRAMS', 'FITNESS 
CENTERS', 'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS', 'PERSONAL FITNESS TRAINERS', '5', '4', '3', 
'2', '1', 'Yellow Pages', 'About Us', 'Contact Us', 'Support', 'Terms of Use', 'Privacy Policy', 'Advertise With Us', 
'Add/Update Listing', 'Business Profile Login', 'F.A.Q.']


Then:


pattern = re.compile(r"^[A-Z\s&]+$")
output = [x for x in input if pattern.match(x)]
output



['PHYSICAL FITNESS CONSULTANTS & TRAINERS', 'HEALTH CLUBS & GYMNASIUMS',
'HEALTH CLUBS & GYMNASIUMS', 'RACQUETBALL COURTS PRIVATE', 'GYMNASIUMS',
'HEALTH & FITNESS CLUBS', 'HEALTH & FITNESS CLUBS', 'PERSONAL FITNESS
TRAINERS', 'HEALTH CLUBS & GYMNASIUMS', 'EXERCISE & PHYSICAL FITNESS
PROGRAMS', 'FITNESS CENTERS', 'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS
& GYMNASIUMS', 'PERSONAL FITNESS TRAINERS']



Should've looked earlier.  Their master list of categories 
http://www.usdirectory.com/cat/g0 shows a few commas, a bunch of dashes, 
and the ampersands we talked about.


"OFFICE SERVICES, SUPPLIES & EQUIPMENT" gets removed because of the comma.

"AUTOMOBILE - DEALERS" gets removed because of the dash.

I updated your regex and it seems to have fixed it.

orig: (r"^[A-Z\s&]+$")
new : (r"^[A-Z\s&,-]+$")
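For what it's worth, the widened class can be sanity-checked quickly (keeping the hyphen last in the class so it reads as a literal, not a range):

```python
import re

# Widened class: uppercase letters, whitespace, ampersand, comma, hyphen.
pattern = re.compile(r"^[A-Z\s&,-]+$")

assert pattern.match("OFFICE SERVICES, SUPPLIES & EQUIPMENT")
assert pattern.match("AUTOMOBILE - DEALERS")
assert not pattern.match("Health & Fitness Clubs (36)")  # lowercase, digits
```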


Thanks again.


--
https://mail.python.org/mailman/listinfo/python-list


Re: python, ctypes and GetIconInfo issue

2016-05-05 Thread eryk sun
On Thu, May 5, 2016 at 3:47 PM,   wrote:
>
> I try to make the GetIconInfo function work, but I can't figure out
> what I'm doing wrong.
>
> From the MSDN documentation the function is
>
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms648070%28v=vs.85%29.aspx
>
> # BOOL WINAPI GetIconInfo(
> # _In_  HICON hIcon,
> # _Out_ PICONINFO piconinfo
> # );
>
> which I defined as
>
> GetIconInfo = windll.user32.GetIconInfo
> GetIconInfo.argtypes   = [HICON, POINTER(ICONINFO)]
> GetIconInfo.restype= BOOL
> GetIconInfo.errcheck   = ErrorIfZero

Please avoid windll. It caches the loaded library, which in turn
caches function pointers. So all packages that use windll.user32 are
potentially stepping on each others' toes with mutually incompatible
function prototypes. It also doesn't allow configuring
use_last_error=True to enable ctypes.get_last_error() for WinAPI
function calls.

> The structure piconinfo is described as
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms648052%28v=vs.85%29.aspx
>
> # typedef struct _ICONINFO {
> # BOOL    fIcon;
> # DWORD   xHotspot;
> # DWORD   yHotspot;
> # HBITMAP hbmMask;
> # HBITMAP hbmColor;
> # } ICONINFO, *PICONINFO;
>
> my implementation is
>
> class ICONINFO(Structure):
> __fields__ = [
>   ('fIcon', BOOL),
>   ('xHotspot',  DWORD),
>   ('yHotspot',  DWORD),
>   ('hbmMask',   HBITMAP),
>   ('hbmColor',  HBITMAP),
>  ]

The attribute name is "_fields_", not "__fields__", so you haven't
actually defined any fields and sizeof(ICONINFO) is 0. When you pass
this empty struct to GetIconInfo, it potentially overwrites and
corrupts existing data on the heap that can lead to a crash later on.

Here's the setup I created to test GetIconInfo and GetIconInfoEx.
Maybe you can reuse some of this code, but if you're using XP this
won't work as written because GetIconInfoEx was added in Vista.

Note the use of a __del__ finalizer to call DeleteObject on the
bitmaps. Otherwise, in a real application, calling GetIconInfo would
leak memory. Using __del__ is convenient, but note that you can't
reuse an instance without manually calling DeleteObject on the
bitmaps.

import ctypes
from ctypes import wintypes

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
user32 = ctypes.WinDLL('user32', use_last_error=True)
gdi32 = ctypes.WinDLL('gdi32')

MAX_PATH = 260
IMAGE_ICON = 1

class ICONINFO_BASE(ctypes.Structure):
    def __del__(self, gdi32=gdi32):
        if self.hbmMask:
            gdi32.DeleteObject(self.hbmMask)
            self.hbmMask = None
        if self.hbmColor:
            gdi32.DeleteObject(self.hbmColor)
            self.hbmColor = None

class ICONINFO(ICONINFO_BASE):
    _fields_ = (('fIcon',    wintypes.BOOL),
                ('xHotspot', wintypes.DWORD),
                ('yHotspot', wintypes.DWORD),
                ('hbmMask',  wintypes.HBITMAP),
                ('hbmColor', wintypes.HBITMAP))

class ICONINFOEX(ICONINFO_BASE):
    _fields_ = (('cbSize',    wintypes.DWORD),
                ('fIcon',     wintypes.BOOL),
                ('xHotspot',  wintypes.DWORD),
                ('yHotspot',  wintypes.DWORD),
                ('hbmMask',   wintypes.HBITMAP),
                ('hbmColor',  wintypes.HBITMAP),
                ('wResID',    wintypes.WORD),
                ('szModName', wintypes.WCHAR * MAX_PATH),
                ('szResName', wintypes.WCHAR * MAX_PATH))

    def __init__(self, *args, **kwds):
        super(ICONINFOEX, self).__init__(*args, **kwds)
        self.cbSize = ctypes.sizeof(self)

PICONINFO = ctypes.POINTER(ICONINFO)
PICONINFOEX = ctypes.POINTER(ICONINFOEX)

def check_bool(result, func, args):
    if not result:
        raise ctypes.WinError(ctypes.get_last_error())
    return args

kernel32.GetModuleHandleW.errcheck = check_bool
kernel32.GetModuleHandleW.restype = wintypes.HMODULE
kernel32.GetModuleHandleW.argtypes = (
    wintypes.LPCWSTR,)  # _In_opt_ lpModuleName

# DeleteObject doesn't call SetLastError
gdi32.DeleteObject.restype = wintypes.BOOL
gdi32.DeleteObject.argtypes = (
    wintypes.HGDIOBJ,)  # _In_ hObject

user32.LoadImageW.errcheck = check_bool
user32.LoadImageW.restype = wintypes.HANDLE
user32.LoadImageW.argtypes = (
    wintypes.HINSTANCE,  # _In_opt_ hinst
    wintypes.LPCWSTR,    # _In_ lpszName
    wintypes.UINT,       # _In_ uType
    ctypes.c_int,        # _In_ cxDesired
    ctypes.c_int,        # _In_ cyDesired
    wintypes.UINT,)      # _In_ fuLoad

user32.DestroyIcon.errcheck = check_bool
user32.DestroyIcon.restype = wintypes.BOOL
user32.DestroyIcon.argtypes = (
    wintypes.HICON,)  # _In_ hIcon

Re: Whittle it on down

2016-05-05 Thread Steven D'Aprano
On Fri, 6 May 2016 04:27 am, Jussi Piitulainen wrote:

> Random832's pattern is fine. You need to use re.fullmatch with it.

py> re.fullmatch
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'module' object has no attribute 'fullmatch'



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Pylint prefers list comprehension over filter...

2016-05-05 Thread Christopher Reimer

Greetings,

Below is the code that I mentioned in an earlier thread.

string = "Whiskey Tango Foxtrot"
''.join(list(filter(str.isupper, string)))

'WTF'

That works fine and dandy. Except Pylint doesn't like it. According to 
this link, list comprehensions have replaced filters and the Pylint 
warning can be disabled.


http://stackoverflow.com/questions/3569134/why-doesnt-pylint-like-built-in-functions

Here's the replacement code using list comprehension:

''.join([x for x in string if x.isupper()])

Which one is correct (Pythonic)? Or does it matter?

Thank you,

Chris R.
--
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint prefers list comprehension over filter...

2016-05-05 Thread Chris Angelico
On Fri, May 6, 2016 at 11:26 AM, Christopher Reimer wrote:
> Below is the code that I mentioned in an earlier thread.
>
> string = "Whiskey Tango Foxtrot"
> ''.join(list(filter(str.isupper, string)))
>
> 'WTF'
>
> That works fine and dandy. Except Pylint doesn't like it. According to this
> link, list comprehensions have replaced filters and the Pylint warning can
> be disabled.
>
> http://stackoverflow.com/questions/3569134/why-doesnt-pylint-like-built-in-functions
>
> Here's the replacement code using list comprehension:
>
> ''.join([x for x in string if x.isupper()])
>
> Which one is correct (Pythonic)? Or does it matter?

Nothing wrong with filter. Since join() is going to iterate over its
argument anyway, you don't need the list() call; you can drop it
without switching to a comprehension:

''.join(filter(str.isupper, string))

Rule of thumb: If the function already exists, use filter or map. If
you would be using filter/map with a lambda function, reach for a
comprehension instead.

In this case, str.isupper exists, so use it!
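A side-by-side sketch of that rule of thumb (the second condition is
made up for illustration):

```python
string = "Whiskey Tango Foxtrot"

# An existing function: filter reads cleanly, no lambda needed
upper_only = ''.join(filter(str.isupper, string))
print(upper_only)  # WTF

# An ad-hoc condition: a comprehension avoids writing a lambda
no_vowels = ''.join(c for c in string if c.lower() not in 'aeiou')
print(no_vowels)   # Whsky Tng Fxtrt
```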

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint prefers list comprehension over filter...

2016-05-05 Thread Stephen Hansen
On Thu, May 5, 2016, at 06:26 PM, Christopher Reimer wrote:
> Which one is correct (Pythonic)? Or does it matter?

First, pylint is somewhat opinionated, and its default options shouldn't
be taken as gospel. There's no single correct answer: filter is fine.

That said, the general consensus is, I believe, that list comprehensions
are good, and using them is great.

In your case, though, I would not use a list comprehension. I'd use a
generator expression. It looks almost identical:

''.join(x for x in string if x.isupper())

The difference is, both filter and your list comprehension *build a
list*, which is not needed and wasteful. The above skips building a
list, instead returning a generator, and join pulls items out of it one
at a time as it uses them. No needlessly creating a list only to use it
and discard it.
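A rough way to see the difference (object sizes are CPython
implementation details, so treat the printed numbers as illustrative):

```python
import sys

string = "Whiskey Tango Foxtrot" * 1000

as_list = [x for x in string if x.isupper()]
as_gen = (x for x in string if x.isupper())

# The generator object stays small regardless of input size;
# the list grows with the number of matches.
small, big = sys.getsizeof(as_gen), sys.getsizeof(as_list)
print(small, '<', big)

# Both produce the same joined result
joined = ''.join(as_gen)
print(joined == ''.join(as_list))  # True
```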

-- 
Stephen Hansen
  m e @ i x o k a i  . i o
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: After a year using Node.js, the prodigal son returns

2016-05-05 Thread Michael Torrie
On 05/04/2016 02:59 AM, Steven D'Aprano wrote:
> A year ago, Gavin Vickery decided to move away from Python and give 
> Javascript with Node.js a try. Twelve months later, he has written about his 
> experiences:
> 
> 
> http://geekforbrains.com/post/after-a-year-of-nodejs-in-production

Very interesting.  Frankly Javascript sounds awful.  Even on the front end.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint prefers list comprehension over filter...

2016-05-05 Thread Dan Sommers
On Thu, 05 May 2016 18:37:11 -0700, Stephen Hansen wrote:

> ''.join(x for x in string if x.isupper())

> The difference is, both filter and your list comprehension *build a
> list* which is not needed, and wasteful. The above skips building a
> list, instead returning a generator ...

filter used to build a list, but now it doesn't (where "used to" means
Python 2.7 and "now" means Python 3.5; I'm too lazy to track down the
exact point(s) at which it changed):

Python 2.7.11+ (default, Apr 17 2016, 14:00:29) 
[GCC 5.3.1 20160409] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> filter(lambda x:x+1, [1, 2, 3, 4])
[1, 2, 3, 4]

Python 3.5.1+ (default, Apr 17 2016, 16:14:06) 
[GCC 5.3.1 20160409] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> filter(lambda x:x+1, [1, 2, 3, 4])
<filter object at 0x...>
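The Python 3 behaviour is easy to check directly (using str.isupper as
the predicate here, to keep the example self-explanatory): filter now
returns a one-shot lazy iterator.

```python
f = filter(str.isupper, "Whiskey Tango Foxtrot")

# No longer a list...
print(isinstance(f, list))   # False

# ...iterating yields the filtered items
print(list(f))               # ['W', 'T', 'F']

# ...and like any iterator, it is exhausted after one pass
print(list(f))               # []
```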
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: After a year using Node.js, the prodigal son returns

2016-05-05 Thread Chris Angelico
On Fri, May 6, 2016 at 12:49 PM, Michael Torrie  wrote:
> On 05/04/2016 02:59 AM, Steven D'Aprano wrote:
>> A year ago, Gavin Vickery decided to move away from Python and give
>> Javascript with Node.js a try. Twelve months later, he has written about his
>> experiences:
>>
>>
>> http://geekforbrains.com/post/after-a-year-of-nodejs-in-production
>
> Very interesting.  Frankly Javascript sounds awful.  Even on the front end.

https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript

JavaScript is terrible. Really, really bad. And because of that, it
has the potential to sweep the world.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint prefers list comprehension over filter...

2016-05-05 Thread Chris Angelico
On Fri, May 6, 2016 at 12:46 PM, Dan Sommers  wrote:
> filter used to build a list, but now it doesn't (where "used to" means
> Python 2.7 and "now" means Python 3.5; I'm too lazy to track down the
> exact point(s) at which it changed):
>
> Python 2.7.11+ (default, Apr 17 2016, 14:00:29)
> [GCC 5.3.1 20160409] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> filter(lambda x:x+1, [1, 2, 3, 4])
> [1, 2, 3, 4]
>
> Python 3.5.1+ (default, Apr 17 2016, 16:14:06)
> [GCC 5.3.1 20160409] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> filter(lambda x:x+1, [1, 2, 3, 4])
> <filter object at 0x...>

Most of these kinds of changes happened in 3.0, where
backward-incompatible changes were accepted. A whole bunch of things
stopped returning lists and started returning lazy iterables - range,
filter/map, dict.keys(), etc - because most of the time, they're
iterated over once and then dropped.
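A few quick illustrations of that shift (all Python 3):

```python
# range is a lazy sequence: constant memory, O(1) membership tests
r = range(10**12)
print(999999999 in r)        # True, computed without iterating

# map and filter return one-shot iterators
m = map(abs, [-1, -2, -3])
print(list(m))               # [1, 2, 3]

# dict.keys() is a live view, not a snapshot
d = {'a': 1}
keys = d.keys()
d['b'] = 2
print(sorted(keys))          # ['a', 'b'] - the view sees the new key
```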

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint prefers list comprehension over filter...

2016-05-05 Thread Stephen Hansen
On Thu, May 5, 2016, at 07:46 PM, Dan Sommers wrote:
> On Thu, 05 May 2016 18:37:11 -0700, Stephen Hansen wrote:
> 
> > ''.join(x for x in string if x.isupper())
> 
> > The difference is, both filter and your list comprehension *build a
> > list* which is not needed, and wasteful. The above skips building a
> > list, instead returning a generator ...
> 
> filter used to build a list, but now it doesn't (where "used to" means
> Python 2.7 and "now" means Python 3.5; I'm too lazy to track down the
> exact point(s) at which it changed):

Oh, didn't know that. Then again the OP was converting the output of
filter *into* a list, which wasted a list either way.

-- 
Stephen Hansen
  m e @ i x o k a i . i o
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to become more motivated to learn Python

2016-05-05 Thread jladasky
The best way to increase your motivation to learn Python is: 

1. Select a non-trivial problem that you need to solve with programming.
2. Try to write the program you need in any other language (that you
   don't already know well).
3. Write the program you need in Python.
4. Gaze in astonishment at the time that you could have saved by
   skipping step 2.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint prefers list comprehension over filter...

2016-05-05 Thread Dan Sommers
On Fri, 06 May 2016 02:46:22 +, Dan Sommers wrote:

> Python 2.7.11+ (default, Apr 17 2016, 14:00:29) 
> [GCC 5.3.1 20160409] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> filter(lambda x:x+1, [1, 2, 3, 4])
> [1, 2, 3, 4]
> 
> Python 3.5.1+ (default, Apr 17 2016, 16:14:06) 
> [GCC 5.3.1 20160409] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> filter(lambda x:x+1, [1, 2, 3, 4])
> <filter object at 0x...>

Muphry's Law strikes again.  That lambda function is obviously a
leftover from a call to *map* rather than a call to *filter*, but thanks
everyone for not laughing and pointing.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint prefers list comprehension over filter...

2016-05-05 Thread Chris Angelico
On Fri, May 6, 2016 at 1:07 PM, Dan Sommers  wrote:
> On Fri, 06 May 2016 02:46:22 +, Dan Sommers wrote:
>
>> Python 2.7.11+ (default, Apr 17 2016, 14:00:29)
>> [GCC 5.3.1 20160409] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> filter(lambda x:x+1, [1, 2, 3, 4])
>> [1, 2, 3, 4]
>>
>> Python 3.5.1+ (default, Apr 17 2016, 16:14:06)
>> [GCC 5.3.1 20160409] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> filter(lambda x:x+1, [1, 2, 3, 4])
>> <filter object at 0x...>
>
> Muphry's Law strikes again.  That lambda function is obviously a
> leftover from a call to *map* rather than a call to *filter*, but thanks
> everyone for not laughing and pointing.

Hey, maybe you wanted to filter out all the -1 results. Maybe you have
a search function that returns zero-based offsets, or -1 for "not
found". Seems reasonable! And "x+1" is way shorter than "x!=-1", which
means by definition that it's better.
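The joke holds up: x + 1 is falsy exactly when x == -1, so as a filter
predicate it really does drop the "not found" sentinels (the offsets
below are hypothetical search results):

```python
offsets = [0, 4, -1, 9, -1]

# x + 1 == 0 only for x == -1, so -1 entries are filtered out;
# a legitimate offset of 0 survives, since 0 + 1 is truthy
found = list(filter(lambda x: x + 1, offsets))
print(found)  # [0, 4, 9]
```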

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Whittle it on down

2016-05-05 Thread Jussi Piitulainen
Steven D'Aprano writes:

> On Fri, 6 May 2016 04:27 am, Jussi Piitulainen wrote:
>
>> Random832's pattern is fine. You need to use re.fullmatch with it.
>
> py> re.fullmatch
> Traceback (most recent call last):
>   File "", line 1, in 
> AttributeError: 'module' object has no attribute 'fullmatch'

It's new in version 3.4 (of Python).
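For anyone on 3.4+, the difference matters for exactly the validation
problem in this thread (pattern borrowed from the earlier discussion):

```python
import re

pat = re.compile(r"[A-Z\s&]+")

# match() only anchors at the start, so a lowercase tail slips through
print(bool(pat.match("HEALTH & fitness")))      # True

# fullmatch() (new in Python 3.4) must consume the whole string
print(bool(pat.fullmatch("HEALTH & fitness")))  # False
print(bool(pat.fullmatch("HEALTH & FITNESS")))  # True
```

On older Pythons, the usual workaround is to end the pattern with \Z
and use match() instead.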
-- 
https://mail.python.org/mailman/listinfo/python-list