Cannot step through asynchronous iterator manually
Hi all

To loop through an iterator one usually uses a higher-level construct
such as a 'for' loop. However, if you want to step through it manually
you can do so with next(iter).

I expected the same functionality with the new 'asynchronous iterator'
in Python 3.5, but I cannot find it.

I can achieve the desired result by calling 'await aiter.__anext__()',
but this is clunky.

Am I missing something?

Frank Millman
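For illustration, a minimal sketch of the asymmetry described above (it
uses asyncio.run and an async generator, both of which arrived after
Python 3.5, so this is a latter-day illustration rather than 3.5 code):

import asyncio

async def arange(n):
    # Stand-in async iterator, purely for demonstration.
    for i in range(n):
        yield i

async def main():
    # Stepping a synchronous iterator manually is easy:
    it = iter(range(3))
    print(next(it))               # -> 0

    # The asynchronous equivalent described as clunky above:
    ait = arange(3)
    print(await ait.__anext__())  # -> 0

asyncio.run(main())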
Re: Mimick tac with python.
Christian Gollwitzer wrote:
> On 30.01.16 at 05:58, Random832 wrote:
>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>> perl -e 'print reverse<>' input_file
>>
>> Well, both of those read the whole file into memory - tac is sometimes
>> smarter than that, but that makes for a more complex program.
>
> Now I'm curious. How is it possible to output the first line last
> if not by remembering it from the very beginning? How could tac
> be implemented other than sucking up everything into memory?
If the input file is seekable you can do blockwise reads:
import os
import sys

def tac(f, blocksize=1024):
    buf = b""
    f.seek(0, os.SEEK_END)
    size = f.tell()
    # Walk the file backwards, one block at a time.
    for start in reversed(range(0, size, blocksize)):
        f.seek(start)
        buf = f.read(blocksize) + buf
        lines = buf.splitlines(True)
        # The first line may continue into the preceding block,
        # so keep it in the buffer for the next iteration.
        buf = lines.pop(0)
        yield from reversed(lines)
    yield buf

if __name__ == "__main__":
    for filename in sys.argv[1:]:
        with open(filename, "rb") as infile:
            sys.stdout.buffer.writelines(tac(infile))
This way you need to keep one block plus one line in memory.
Re: Cannot step through asynchronous iterator manually
On Sat, Jan 30, 2016 at 7:22 PM, Frank Millman wrote:
> We had a recent discussion about the best way to do this, and ChrisA
> suggested the following, which I liked -
>
>    cur.execute('SELECT ...')
>    try:
>        row = next(cur)
>    except StopIteration:
>        # row does not exist
>    else:
>        try:
>            next_row = next(cur)
>        except StopIteration:
>            # row does exist
>        else:
>            # raise exception
>
> Now that I have gone async, I want to do the same with an asynchronous
> iterator.
Here's a crazy option. (Assuming that a row can't be None. If not, use
a unique sentinel object.)
cur.execute(whatever)
have_row = None
async for row in cur:
    if have_row is not None:
        raise TooManyRows
    have_row = row
if have_row is None:
    raise NoRowFound
It's kinda abusing the loop construct, but it'd work. Alternatively,
you could call the dunder method directly, but that feels dirty.
Dunders are for defining, not calling.
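For reference, the sentinel variant mentioned above might be fleshed out
like this (a sketch only; the helper and exception names are
illustrative, not from any library):

class NoRowFound(Exception):
    pass

class TooManyRows(Exception):
    pass

_MISSING = object()  # unique sentinel: a row that is None stays distinguishable

async def fetch_exactly_one(cur):
    # Iterate the async cursor, insisting on exactly one row.
    have_row = _MISSING
    async for row in cur:
        if have_row is not _MISSING:
            raise TooManyRows
        have_row = row
    if have_row is _MISSING:
        raise NoRowFound
    return have_row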
ChrisA
Re: psss...I want to move from Perl to Python
On 01/28/2016 04:01 PM, Fillmore wrote:
> I taught myself Perl as a scripting language over two decades ago. All
> through this time, I would revert to it from time to time whenever I
> needed some text manipulation or data analysis script.
>
> My problem? Maybe I am stupid, but each time I have to go back and
> re-learn the syntax, the gotchas, the references and the dereferencing,
> the different syntax between Perl 4 and Perl 5, that messy CPAN in
> which every author seems to have a different idea of how things should
> be done... I get this feeling I am wasting a lot of time restudying the
> wheel each time.
>
> I look at Python and it looks so much cleaner. Add to that that it is
> the language of choice of data miners... add to that that iNotebook
> looks powerful.
>
> Does Python have regexps? How was the Python 2.7 vs Python 3.X question
> solved? Which version should I go for? Do you think that switching to
> Python from Perl is a good idea at 45? Where do I get started moving
> from Perl to Python? Which gotchas need I be aware of?
>
> Thank you

Check out this link: http://www.linuxjournal.com/article/3882

It is an account of ESR's[1] first experiences going to Python from Perl.
It's somewhat old (2000), but a very interesting read. And probably
relevant to your questions.

-=- Larry -=-

[1] Eric S. Raymond
Re: Mimick tac with python.
On 30.01.16 at 08:56, Jussi Piitulainen wrote:
> Christian Gollwitzer writes:
>> On 30.01.16 at 05:58, Random832 wrote:
>>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>>> perl -e 'print reverse<>' input_file
>>> Well, both of those read the whole file into memory - tac is sometimes
>>> smarter than that, but that makes for a more complex program.
>> Now I'm curious. How is it possible to output the first line last
>> if not by remembering it from the very beginning? How could tac
>> be implemented other than sucking up everything into memory?
> It may be possible to map the data into virtual memory so that the
> program sees it as an array of bytes. The data is paged in when
> accessed. The program just scans the array backwards, looking for
> end-of-line characters. I believe they can be identified reliably, as
> bytes, even in a backward scan of UTF-8-encoded data.
> The data needs to be in a file.

If it's in a file, then I agree. I was thinking about the case where tac
is used in a pipe - obviously here you can't reverse the file in
constant memory.
Christian
Re: Mimick tac with python.
Christian Gollwitzer writes:
> On 30.01.16 at 05:58, Random832 wrote:
>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>> perl -e 'print reverse<>' input_file
>>
>> Well, both of those read the whole file into memory - tac is sometimes
>> smarter than that, but that makes for a more complex program.
>
> Now I'm curious. How is it possible to output the first line last
> if not by remembering it from the very beginning? How could tac
> be implemented other than sucking up everything into memory?
It may be possible to map the data into virtual memory so that the
program sees it as an array of bytes. The data is paged in when
accessed. The program just scans the array backwards, looking for
end-of-line characters. I believe they can be identified reliably, as
bytes, even in a backward scan of UTF-8-encoded data.
The data needs to be in a file. The keywords are something like "memory
mapping" and "mmap". I've only experimented with this briefly once in
Julia, so I don't really know more.
Oh. There's https://docs.python.org/3/library/mmap.html in Python.
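A rough sketch of that idea (illustrative only; it assumes a seekable
on-disk file, and the backward byte scan is safe for UTF-8 because
multi-byte sequences never contain the newline byte 0x0A):

import mmap
import os

def tac_mmap(path):
    # Yield the lines of a file in reverse order via a memory map.
    if os.path.getsize(path) == 0:
        return  # mmap cannot map an empty file
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            end = len(mm)
            # Find the newline that terminates the line before this one.
            pos = mm.rfind(b"\n", 0, end - 1)
            while pos != -1:
                yield mm[pos + 1:end]
                end = pos + 1
                pos = mm.rfind(b"\n", 0, end - 1)
            yield mm[0:end]  # the first line of the file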
Re: Cannot step through asynchronous iterator manually
On Jan 29, 2016 11:04 PM, "Frank Millman" wrote:
>
> Hi all
>
> To loop through an iterator one usually uses a higher-level construct
> such as a 'for' loop. However, if you want to step through it manually
> you can do so with next(iter).
>
> I expected the same functionality with the new 'asynchronous iterator'
> in Python 3.5, but I cannot find it.
>
> I can achieve the desired result by calling 'await aiter.__anext__()',
> but this is clunky.
>
> Am I missing something?

async for x in aiter:
    pass

Can only be used inside a coroutine, of course.
Re: Cannot step through asynchronous iterator manually
On Sat, Jan 30, 2016 at 7:02 PM, Ian Kelly wrote:
> On Jan 29, 2016 11:04 PM, "Frank Millman" wrote:
>>
>> Hi all
>>
>> To loop through an iterator one usually uses a higher-level construct
>> such as a 'for' loop. However, if you want to step through it manually
>> you can do so with next(iter).
>>
>> I expected the same functionality with the new 'asynchronous iterator'
>> in Python 3.5, but I cannot find it.
>>
>> I can achieve the desired result by calling 'await aiter.__anext__()',
>> but this is clunky.
>>
>> Am I missing something?
>
> async for x in aiter:
>     pass

Yeah, he wants to single-step it. A regular for loop is equivalent to
calling next() lots of times, and you can manually call next(). Common
usage: Skip a header row before iterating over the rest of a file.

So how do you do the same thing with an async iterator? I'm not sure
there's a way, currently. That's the question.

Of course, you can always do this:

async for x in aiter:
    break

as an equivalent to "x = next(aiter)", but that's just stupid :)

ChrisA
Re: Cannot step through asynchronous iterator manually
"Ian Kelly" wrote in message
news:CALwzid=ssdsm8hdan+orj54a_jeu9wc8103iqgkaah8mrj-...@mail.gmail.com...
On Jan 29, 2016 11:04 PM, "Frank Millman" wrote:
>
> Hi all
>
> To loop through an iterator one usually uses a higher-level construct
> such as a 'for' loop. However, if you want to step through it manually
> you can do so with next(iter).
>
> I expected the same functionality with the new 'asynchronous iterator'
> in Python 3.5, but I cannot find it.
>
> I can achieve the desired result by calling 'await aiter.__anext__()',
> but this is clunky.
>
> Am I missing something?

async for x in aiter:
    pass
Can only be used inside a coroutine, of course.
I know that you can use this to loop through the entire iterator, but I have
a special case.
There are times when I want to execute a SELECT statement, and test for
three possibilities -
- if no rows are returned, the object does not exist
- if one row is returned, the object does exist
- if more than one row is returned, raise an exception
We had a recent discussion about the best way to do this, and ChrisA
suggested the following, which I liked -
cur.execute('SELECT ...')
try:
    row = next(cur)
except StopIteration:
    # row does not exist
else:
    try:
        next_row = next(cur)
    except StopIteration:
        # row does exist
    else:
        # raise exception
Now that I have gone async, I want to do the same with an asynchronous
iterator.
Frank
Re: psss...I want to move from Perl to Python
On Sat, 30 Jan 2016 09:47 am, Ben Finney wrote:
> Steven D'Aprano writes:
>> You should have started with the official tutorial:
>> https://docs.python.org/2/tutorial/
>
> And these days the default recommendation should be to start with the
> official tutorial for the current stable version of the Python
> language, Python 3 <https://docs.python.org/3/tutorial/>.

Python 2.7 is still a current stable version, and will be maintained
until at least 2020, after which you'll need commercial (paid) support
from vendors such as Red Hat if you want security updates.

I value your enthusiasm towards encouraging people to use Python 3, but
Python 2.7 is, for the time being, equally current and stable, and will
be for a few more years.

I can't remember why I gave the 2 tutorial instead of the 3 tutorial, but
I had a reason at the time... *wink*

-- 
Steven
Re: Mimick tac with python.
On 1/30/2016 1:03 AM, Christian Gollwitzer wrote:
> On 30.01.16 at 05:58, Random832 wrote:
>> On Fri, Jan 29, 2016, at 23:46, Hongyi Zhao wrote:
>>> awk '{a[NR]=$0} END {while (NR) print a[NR--]}' input_file
>>> perl -e 'print reverse<>' input_file
>> Well, both of those read the whole file into memory - tac is sometimes
>> smarter than that, but that makes for a more complex program.
> Now I'm curious. How is it possible to output the first line last
> if not by remembering it from the very beginning? How could tac
> be implemented other than sucking up everything into memory?
One could read the file by lines and make a list of start-of-line
positions. Reverse the list. Read each line. Details omitted.
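The omitted details might look roughly like this (a sketch, assuming a
seekable file opened in binary mode):

import sys

def tac_offsets(f):
    # Record where each line starts; only the offsets stay in memory.
    offsets = []
    pos = 0
    for line in f:
        offsets.append(pos)
        pos += len(line)  # binary mode, so len() is the byte count
    # Seek back to each start position, last line first.
    for start in reversed(offsets):
        f.seek(start)
        yield f.readline()

if __name__ == "__main__":
    with open(sys.argv[1], "rb") as infile:
        sys.stdout.buffer.writelines(tac_offsets(infile))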
--
Terry Jan Reedy
Re: Cannot step through asynchronous iterator manually
2016-01-30 11:51 GMT+01:00 Frank Millman :
> "Chris Angelico" wrote in message
> news:CAPTjJmoAmVNTCKq7QYaDRNQ67Gcg9TxSXYXCrY==s9djjna...@mail.gmail.com...
>
>
>> On Sat, Jan 30, 2016 at 7:22 PM, Frank Millman
>> wrote:
>> > We had a recent discussion about the best way to do this, and ChrisA
>> > suggested the following, which I liked -
>> >
>> >    cur.execute('SELECT ...')
>> >    try:
>> >        row = next(cur)
>> >    except StopIteration:
>> >        # row does not exist
>> >    else:
>> >        try:
>> >            next_row = next(cur)
>> >        except StopIteration:
>> >            # row does exist
>> >        else:
>> >            # raise exception
>> >
>> > Now that I have gone async, I want to do the same with an asynchronous
>> > iterator.
>>
>
>
I might be a bit off-topic, but why don't you simply use cursor.rowcount?
For a pure iterator-based solution, I would do something like this
(admittedly a bit cryptic, but iterator-based solutions often are :-) :
async def get_unique(ait):
    async for row in ait:
        break
    else:
        raise NotEnoughRows()
    async for _ in ait:
        raise TooManyRows()
    return row
Re: Python Scraps, Napkins & Nuggets
On Friday, 29 January 2016 17:32:58 UTC+11, Sayth Renshaw wrote:
> Hi
>
> This may seem an odd request, however I thought I would ask: do you
> have any diagrams, scribbles, presentations you have done when coaching
> someone at work that just seem to work for others consistently?
>
> In coaching non-programming topics at work I have noticed the variance
> in learning styles, and many times my off-hand scribbles and comments
> are what gets their attention and understanding.
>
> I would love to see what you have created. If it's Python, awesome and
> in keeping with list topic; if not I won't tell.
>
> Thanks
>
> Sayth

Oh dear, all a bit shy, that's ok.

Sayth
Re: The computer that mastered Go
On Fri, Jan 29, 2016 at 1:38 PM, mm0fmf via Python-list wrote:
> On 29/01/2016 19:46, Seymore4Head wrote:
>>
>> https://www.youtube.com/watch?v=g-dKXOlsf98
>>
>
> Is it written in Python?

Given the game, and the fact that it's Google, I would be very
disappointed if it's not written in Go.
[RELEASE] ‘python-daemon’ version 2.1.1 released
Howdy all,

I am pleased to announce the release of version 2.1.1 of the
‘python-daemon’ library.

The current release is always available at
<https://pypi.python.org/pypi/python-daemon/>.

Significant changes since the previous version
==============================================

Version 2.1.0 - I omitted sending a release announcement for version 2.1.0.

* Add a DaemonContext option, ‘initgroups’, which specifies whether to
  set the daemon process's supplementary groups.

* Set the process groups using ‘os.initgroups’. Thanks to Malcolm Purvis
  for contributing an implementation of this feature.

Version 2.1.1 - This is a bug fix release, addressing this bug:

* Default the ‘initgroups’ option to False. Using ‘os.initgroups’
  requires permission to set the process GID, so this now needs to be
  explicitly requested.

What is the ‘python-daemon’ library?
====================================

‘python-daemon’ is a Python library to implement a well-behaved Unix
daemon process.

-- 
“There are no significant bugs in our released software that any
significant number of users want fixed.” —Bill Gates, 1995-10-23

Ben Finney
Re: Cannot step through asynchronous iterator manually
To address the original question, I don't believe a next() equivalent for
async iterables has been added to the standard library yet. Here's an
implementation from one of my projects that I use to manually get the next
value: https://bpaste.net/show/e4bd209fc067. It exposes the same interface
as the synchronous next(). Usage:
await anext(some_async_iterator)
Ultimately, it's a fancy wrapper around the original snippet of 'await
iterator.__anext__()'.
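The paste linked above is not reproduced here, but a wrapper with that
interface might look roughly like this (a sketch, not necessarily the
linked code; Python 3.10 eventually added a builtin anext() along these
lines):

_SENTINEL = object()

async def anext(iterator, default=_SENTINEL):
    # Await the next item, mirroring the one- and two-argument
    # forms of the builtin next().
    try:
        return await iterator.__anext__()
    except StopAsyncIteration:
        if default is _SENTINEL:
            raise
        return default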
On Sat, Jan 30, 2016 at 6:07 AM Maxime S wrote:
> 2016-01-30 11:51 GMT+01:00 Frank Millman :
>
> > "Chris Angelico" wrote in message
> > news:CAPTjJmoAmVNTCKq7QYaDRNQ67Gcg9TxSXYXCrY==s9djjna...@mail.gmail.com.
> ..
> >
> >
> >> On Sat, Jan 30, 2016 at 7:22 PM, Frank Millman
> >> wrote:
> >> > We had a recent discussion about the best way to do this, and ChrisA
> >> > suggested the following, which I liked -
> >> >
> >> >    cur.execute('SELECT ...')
> >> >    try:
> >> >        row = next(cur)
> >> >    except StopIteration:
> >> >        # row does not exist
> >> >    else:
> >> >        try:
> >> >            next_row = next(cur)
> >> >        except StopIteration:
> >> >            # row does exist
> >> >        else:
> >> >            # raise exception
> >> >
> >> > Now that I have gone async, I want to do the same with an asynchronous
> >> > iterator.
> >>
> >
> >
> I might be a bit off-topic, but why don't you simply use cursor.rowcount?
>
> For a pure iterator-based solution, I would do something like this
> (admittedly a bit cryptic, but iterator-based solutions often are :-) :
>
> async def get_unique(ait):
>     async for row in ait:
>         break
>     else:
>         raise NotEnoughRows()
>     async for _ in ait:
>         raise TooManyRows()
>     return row
Re: Cannot step through asynchronous iterator manually
"Maxime S" wrote in message news:CAGqiJR8yUdd1u7j0YHS-He_v4uUT-ui=PpiX=n_G=ntt8zn...@mail.gmail.com... I might be a bit off-topic, but why don't you simply use cursor.rowcount? I just tried that on sqlite3 and pyodbc, and they both return -1. I think that it only works with insert/update/delete, but not with select. For a pure iterator-based solution, I would do something like this (admitly a bit cryptic, but iterator-based solutions often are :-) : async def get_uniqu(ait): async for row in ait: break else: raise NotEnoughtRows() async for _ in ait: raise TooManyRows() return row Also nice - thanks. I now have a few to choose from without needing an 'anext()'. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: psss...I want to move from Perl to Python
On 29.01.2016 23:49, Ben Finney wrote:
> "Sven R. Kunze" writes:
>> On 29.01.2016 01:01, Fillmore wrote:
>>> How was the Python 2.7 vs Python 3.X question solved? Which version
>>> should I go for?
>> Python 3 is the new and better one.
> More importantly: Python 2 will never improve; Python 3 is the only one
> that is actively developed.

Exactly. The following story also confirms that: always use up-to-date
software (not only for security reasons).

We recently upgraded from Django 1.3 to 1.4 to 1.5 to 1.6 to 1.7 and now
to 1.8. It was amazing how much code (and workarounds) we could remove by
simply using standard Django tools (small things but used hundreds of
times).

Thus: Python 3.

Best,
Sven
Re: Cannot step through asynchronous iterator manually
"Chris Angelico" wrote in message
news:CAPTjJmoAmVNTCKq7QYaDRNQ67Gcg9TxSXYXCrY==s9djjna...@mail.gmail.com...
On Sat, Jan 30, 2016 at 7:22 PM, Frank Millman wrote:
> We had a recent discussion about the best way to do this, and ChrisA
> suggested the following, which I liked -
>
>    cur.execute('SELECT ...')
>    try:
>        row = next(cur)
>    except StopIteration:
>        # row does not exist
>    else:
>        try:
>            next_row = next(cur)
>        except StopIteration:
>            # row does exist
>        else:
>            # raise exception
>
> Now that I have gone async, I want to do the same with an asynchronous
> iterator.
Here's a crazy option. (Assuming that a row can't be None. If not, use
a unique sentinel object.)
cur.execute(whatever)
have_row = None
async for row in cur:
    if have_row is not None:
        raise TooManyRows
    have_row = row
if have_row is None:
    raise NoRowFound
Not so crazy :-) If Ian doesn’t come up with a better idea I will run with
it.
Here is a slight variation - just variable name changes really, but I think
it is slightly easier to read -
found = False
async for row in cur:
    if found:
        raise TooManyRows
    found = True
if found:
    # process row
else:
    # no rows found
Frank
Re: Cannot step through asynchronous iterator manually
> Any particular reason not to use the classic sentinel object model?

None that I can remember. I would use the sentinel pattern if I were
writing it again today.

> Also curious is that you raise a new StopAsyncIteration from the
> original one, rather than just reraising the original. I assume
> there's a reason for that, but it doesn't have a comment.

I think I was just playing around with the new syntax. I also just
noticed that there is an inconsistent use of the terms iterator and
iterable in the docstring and variable names. The function looks much
improved after updates: https://bpaste.net/show/14292d2b4070. Thanks for
calling that out.

Note to self: Review old code before copy/pasta into the mail list.

On Sat, Jan 30, 2016 at 6:57 AM Chris Angelico wrote:
> On Sat, Jan 30, 2016 at 11:35 PM, Kevin Conway wrote:
>> To address the original question, I don't believe a next() equivalent
>> for async iterables has been added to the standard library yet. Here's
>> an implementation from one of my projects that I use to manually get
>> the next value: https://bpaste.net/show/e4bd209fc067. It exposes the
>> same interface as the synchronous next(). Usage:
>>
>> await anext(some_async_iterator)
>>
>> Ultimately, it's a fancy wrapper around the original snippet of 'await
>> iterator.__anext__()'.
>
> Curious idiom for the one-or-two-arg situation. Any particular reason
> not to use the classic sentinel object model?
>
> _SENTINEL = object()
> async def anext(iterable, default=_SENTINEL):
>     ...
>     if default is not _SENTINEL:
>         return default
>
> Or if you want to avoid that, at least take iterable as a fixed arg:
>
> async def anext(iterable, *default):
>     if len(default) > 1: TypeError
>     ...
>     if default: return default[0]
>
> Also curious is that you raise a new StopAsyncIteration from the
> original one, rather than just reraising the original. I assume
> there's a reason for that, but it doesn't have a comment.
>
> ChrisA
Re: Cannot step through asynchronous iterator manually
On Sat, Jan 30, 2016 at 11:35 PM, Kevin Conway wrote:
> To address the original question, I don't believe a next() equivalent
> for async iterables has been added to the standard library yet. Here's
> an implementation from one of my projects that I use to manually get
> the next value: https://bpaste.net/show/e4bd209fc067. It exposes the
> same interface as the synchronous next(). Usage:
>
> await anext(some_async_iterator)
>
> Ultimately, it's a fancy wrapper around the original snippet of 'await
> iterator.__anext__()'.

Curious idiom for the one-or-two-arg situation. Any particular reason
not to use the classic sentinel object model?

_SENTINEL = object()
async def anext(iterable, default=_SENTINEL):
    ...
    if default is not _SENTINEL:
        return default

Or if you want to avoid that, at least take iterable as a fixed arg:

async def anext(iterable, *default):
    if len(default) > 1: TypeError
    ...
    if default: return default[0]

Also curious is that you raise a new StopAsyncIteration from the
original one, rather than just reraising the original. I assume
there's a reason for that, but it doesn't have a comment.

ChrisA
Re: Cannot step through asynchronous iterator manually
On 30 January 2016 at 08:22, Frank Millman wrote:
> There are times when I want to execute a SELECT statement, and test for
> three possibilities -
> - if no rows are returned, the object does not exist
> - if one row is returned, the object does exist
> - if more than one row is returned, raise an exception
>
> We had a recent discussion about the best way to do this, and ChrisA
> suggested the following, which I liked -
>
>    cur.execute('SELECT ...')
>    try:
>        row = next(cur)
>    except StopIteration:
>        # row does not exist
>    else:
>        try:
>            next_row = next(cur)
>        except StopIteration:
>            # row does exist
>        else:
>            # raise exception
The simplest thing would just be to call list(cur) but I realise that
you don't want to consume more than 2 rows from the database so just
use islice:
from itertools import islice

rows = list(islice(cur, 2))  # pull at most 2 rows
if not rows:
    # no rows
elif len(rows) > 1:
    # too many rows
row = rows[0]
Depending on your application if you just want to raise any error when
there's not exactly one row then you could just do:
(row,) = list(islice(cur, 2))
--
Oscar
Re: Cannot step through asynchronous iterator manually
"Oscar Benjamin" wrote in message news:cahvvxxsa0yq4voyy6qycgxxvpl5zzgm8muui+1vmezd8crg...@mail.gmail.com... The simplest thing would just be to call list(cur) but I realise that you don't want to consume more than 2 rows from the database so just use islice: rows = list(islice(cur, 2)) # pull at most 2 rows if not rows: # no rows elif len(rows) > 1: # too many rows row = rows[0] I like the idea, but I don't think it would work with an asychronous iterable. OTOH it should not be difficult to roll your own using the example in the itertools docs as a base. Except that the example uses next(it) internally, and this thread started with the fact that there is no asychronous equivalent, so I might be back to square one. But these are all variations on a similar theme, so I don't think it matters which one I choose. I will go through them at my leisure and pick the most readable one. Thanks Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: Cannot step through asynchronous iterator manually
On 30 January 2016 at 13:45, Frank Millman wrote:
> "Oscar Benjamin" wrote in message
> news:cahvvxxsa0yq4voyy6qycgxxvpl5zzgm8muui+1vmezd8crg...@mail.gmail.com...
>>
>> The simplest thing would just be to call list(cur) but I realise that
>> you don't want to consume more than 2 rows from the database so just
>> use islice:
>>
>> rows = list(islice(cur, 2))  # pull at most 2 rows
>> if not rows:
>>     # no rows
>> elif len(rows) > 1:
>>     # too many rows
>> row = rows[0]
>
> I like the idea, but I don't think it would work with an asynchronous
> iterable.

That was intended as an improvement over the code that you posted for
normal iterators.

> OTOH it should not be difficult to roll your own using the example
> in the itertools docs as a base. Except that the example uses next(it)
> internally, and this thread started with the fact that there is no
> asynchronous equivalent, so I might be back to square one.

I haven't used PEP 492 yet but what about:

async def aslice(asynciterator, end):
    if end == 0:
        return []
    items = []
    async for item in asynciterator:
        items.append(item)
        if len(items) == end:
            break
    return items

rows = await aslice(cur, 2)

AFAICT there's no generator-function-style syntax for writing an async
iterator so you'd have to make a class with the appropriate methods if
you wanted to be able to loop over aslice with async for.

-- 
Oscar
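Such a class might look like this (a sketch; ASlice is an illustrative
name, and it expects an async iterator, i.e. an object with __anext__):

class ASlice:
    # Async iterator yielding at most `stop` items from `ait`.
    def __init__(self, ait, stop):
        self._ait = ait
        self._remaining = stop

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self._remaining <= 0:
            raise StopAsyncIteration
        self._remaining -= 1
        return await self._ait.__anext__()

Usage, inside a coroutine:

rows = []
async for row in ASlice(cur, 2):
    rows.append(row)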
Re: Cannot step through asynchronous iterator manually
On 30 January 2016 at 16:42, Ian Kelly wrote:
>> AFAICT there's no generator-function-style syntax for writing an async
>> iterator so you'd have to make a class with the appropriate methods if
>> you wanted to be able to loop over aslice with async for.
>
> Before you go any further with this, be sure to check out the
> aitertools third-party module. I haven't done anything with it myself,
> but it already claims to provide aiter and anext as well as async
> versions of everything in the standard itertools module.

Right you are. There is aslice and it is implemented as a class with
__anext__ etc. methods:

https://github.com/asyncdef/aitertools/blob/master/aitertools/__init__.py#L747

My original suggestion just becomes:

from aitertools import alist, islice

rows = await alist(islice(cur, 2))  # pull at most 2 rows

-- 
Oscar
Re: Cannot step through asynchronous iterator manually
On Jan 30, 2016 7:13 AM, "Oscar Benjamin" wrote:
>
> I haven't used PEP 492 yet but what about:
>
> async def aslice(asynciterator, end):
>     if end == 0:
>         return []
>     items = []
>     async for item in asynciterator:
>         items.append(item)
>         if len(items) == end:
>             break
>     return items
>
> rows = await aslice(cur, 2)
>
> AFAICT there's no generator-function-style syntax for writing an async
> iterator so you'd have to make a class with the appropriate methods if
> you wanted to be able to loop over aslice with async for.

Before you go any further with this, be sure to check out the aitertools
third-party module. I haven't done anything with it myself, but it
already claims to provide aiter and anext as well as async versions of
everything in the standard itertools module.
Re: Cannot step through asynchronous iterator manually
On 01/30/2016 01:22 AM, Frank Millman wrote:
> There are times when I want to execute a SELECT statement, and test for
> three possibilities -
> - if no rows are returned, the object does not exist
> - if one row is returned, the object does exist
> - if more than one row is returned, raise an exception

Is there a reason you cannot get SQL to answer this question for you?
Something like:

SELECT count(some_field) WHERE condition

That will always return one row, with one field that will either be 0, 1,
or more than 1.
Re: Python Unit test
On 27/01/2016 04:57, [email protected] wrote:
[...]
> Here is what the python code looks like and I am to make a Unittest
> for it

Just a few ideas: first of all, you have to make your code more modular,
for it to be unit-testable. There are a few areas in your code that need
testing:

- command line option usage/parsing/validation:
  - test that the parameters passed are syntactically correct
  - test that the parameters passed are semantically correct
  - test that the command line usage shown to the user is correct
    (based on the available options)

This might require the creation of a Usage class that encapsulates the
usage of getopt. The class could validate the input command line, and
return a string for the usage. As long as you have a Usage class, you can
test its behaviours.

- filtering/decoding data:
  - test that a filter is correctly set
  - test that the data are correctly decoded

This might require the creation of Filter and Decoder classes, the former
encapsulating the behaviour of your 'pc'; the latter encapsulating your
'decode'. The dependency between these two classes can be mocked, for
example. You can mock the data returned by pc.datalink() in your decoding
code and make sure they are correctly decoded.

And so on...

-- 
4ndre4
"The use of COBOL cripples the mind; its teaching should, therefore, be
regarded as a criminal offense." (E. Dijkstra)
Re: show instant data on webpage
> So, the python plotting packages mentioned can create files that hold
> your graphs. As to whether you want to display them on the web or send
> them around as attachments to your coworkers, or build some local
> application to display them is your choice

Sorry for my confusion :D I'm still not sure what to do. Anyway I would
like to plot a graph with my data first of all, then I could think about
sharing them on the network and then online.
Re: Cannot step through asynchronous iterator manually
On Sun, Jan 31, 2016 at 6:42 AM, Michael Torrie wrote:
> On 01/30/2016 01:22 AM, Frank Millman wrote:
>> There are times when I want to execute a SELECT statement, and test for
>> three possibilities -
>> - if no rows are returned, the object does not exist
>> - if one row is returned, the object does exist
>> - if more than one row is returned, raise an exception
>
> Is there a reason you cannot get SQL to answer this question for you?
> Something like:
>
> SELECT count(some_field) WHERE condition
>
> That will always return one row, with one field that will either be 0,
> 1, or more than 1.
Efficiency. That's a fine way of counting actual rows in an actual
table. However, it's massive overkill to perform an additional
pre-query for something that's fundamentally an assertion (this is a
single-row-fetch API like "select into", and it's an error to fetch
anything other than a single row - but normal usage will never hit
that error), and also, there's no guarantee that the query is looking
at a single table. Plus, SQL's count function ignores NULLs, so you
could get a false result. Using count(*) might be better, but the only
way I can think of to be certain would be something like:
select count(*) from (...)
where the ... is the full original query. In other words, the whole
query has to be run twice - once to assert that there's exactly one
result, and then a second time to get that result. The existing
algorithm ("try to fetch a row - if it fails error; then try to fetch
another - if it succeeds, error") doesn't need to fetch more than two
results, no matter how big the query result is.
ChrisA
Re: Cannot step through asynchronous iterator manually
On 01/30/2016 02:19 PM, Chris Angelico wrote:
> Efficiency. That's a fine way of counting actual rows in an actual
> table. However, it's massive overkill to perform an additional
> pre-query for something that's fundamentally an assertion (this is a
> single-row-fetch API like "select into", and it's an error to fetch
> anything other than a single row - but normal usage will never hit
> that error), and also, there's no guarantee that the query is looking
> at a single table. Plus, SQL's count function ignores NULLs, so you
> could get a false result. Using count(*) might be better, but the only
> way I can think of to be certain would be something like:
>
> select count(*) from (...)
True. The id field is usually the best, or some other indexed field.
> where the ... is the full original query. In other words, the whole
> query has to be run twice - once to assert that there's exactly one
> result, and then a second time to get that result. The existing
> algorithm ("try to fetch a row - if it fails error; then try to fetch
> another - if it succeeds, error") doesn't need to fetch more than two
> results, no matter how big the query result is.
That is true, but that's what a database engine is designed for. Granted
he's just using SQLite here so many optimizations don't exist. Just
seems a bit odd to me to implement something in Python that the DB
engine is already good at. Guess ever since ORM was invented the debate
has raged over what the DB's job actually is. Personally I trust a DB
engine to be fast and efficient much more than my Python code will be
playing with the results.
Re: Cannot step through asynchronous iterator manually
On Sun, Jan 31, 2016 at 8:52 AM, Michael Torrie wrote:
> On 01/30/2016 02:19 PM, Chris Angelico wrote:
>> Efficiency. That's a fine way of counting actual rows in an actual
>> table. However, it's massive overkill to perform an additional
>> pre-query for something that's fundamentally an assertion (this is a
>> single-row-fetch API like "select into", and it's an error to fetch
>> anything other than a single row - but normal usage will never hit
>> that error), and also, there's no guarantee that the query is looking
>> at a single table. Plus, SQL's count function ignores NULLs, so you
>> could get a false result. Using count(*) might be better, but the only
>> way I can think of to be certain would be something like:
>>
>> select count(*) from (...)
>
> True. The id field is usually the best, or some other indexed field.
Yeah, a primary key is always non-nullable. But to do that, you have
to know that you're selecting from exactly one table - even what looks
like a primary key can have duplicates and/or NULLs if it's coming
from an outer join. It's not something you can do in a general way in
a library.
>> where the ... is the full original query. In other words, the whole
>> query has to be run twice - once to assert that there's exactly one
>> result, and then a second time to get that result. The existing
>> algorithm ("try to fetch a row - if it fails error; then try to fetch
>> another - if it succeeds, error") doesn't need to fetch more than two
>> results, no matter how big the query result is.
>
> That is true, but that's what a database engine is designed for. Granted
> he's just using SQLite here so many optimizations don't exist. Just
> seems a bit odd to me to implement something in Python that the DB
> engine is already good at. Guess ever since ORM was invented the debate
> has raged over what the DB's job actually is. Personally I trust a DB
> engine to be fast and efficient much more than my Python code will be
> playing with the results.
Again, the simple case is fine - I would be fairly confident that
running the same query twice would be cheaper than twice the cost of
running it once (although not all queries are perfectly stable, and
some forms of query will defeat optimizations in surprising ways). But
even if the query itself is fully optimized, just the action of
running a query has a cost - you have to send something off to the
server and wait for a response. A "fetchone" call is likely to be
being used for queries that are already coming from the cache, so the
main cost is just "hello server, I need this" - which is going to be
doubled.
ChrisA
Re: Cannot step through asynchronous iterator manually
On 01/30/2016 02:57 PM, Michael Torrie wrote:
> SELECT count(some_id_field),field1,field2,field3 FROM wherever WHERE
> conditions
>
> If the first column (or whatever you decide to alias it as) contains a
> count, and the rest of the information is still there. If count is 1,
> then the row is what you want and you can do whatever you wish with it.
> If not, throw your exception.

I'm not sure how SQLite handles it, or even what the SQL spec says, but I
know in MySQL you could do something like this:

SELECT count(id) as row_count,`tablename`.* FROM `tablename` WHERE condition

and get the same thing as SELECT * would have, with the addition of a
"row_count" field. Note that because of the count() part, the query will
always only return 1 row. The fields will be NULL if the count was zero,
or they will contain the fields from the last row the query found. In
other words, if there is more than one row that matches the query, it
will only give you data from the last match.

Now if Frank is hoping to do work on the first row and then throw an
exception if there's an additional row, then this of course won't work
for him.
Re: Cannot step through asynchronous iterator manually
On Sun, Jan 31, 2016 at 8:57 AM, Michael Torrie wrote:
> On 01/30/2016 02:19 PM, Chris Angelico wrote:
>> where the ... is the full original query. In other words, the whole
>> query has to be run twice - once to assert that there's exactly one
>> result, and then a second time to get that result. The existing
>> algorithm ("try to fetch a row - if it fails error; then try to fetch
>> another - if it succeeds, error") doesn't need to fetch more than two
>> results, no matter how big the query result is.
>
> Actually it occurs to me this doesn't have to be true. The same
> information he needs to know can be done with one query and only 1 result.
>
> SELECT count(some_id_field),field1,field2,field3 FROM wherever WHERE
> conditions
>
> If the first column (or whatever you decide to alias it as) contains a
> count, and the rest of the information is still there. If count is 1,
> then the row is what you want and you can do whatever you wish with it.
> If not, throw your exception.
That actually violates the SQL spec. Some servers will accept it,
others won't. (You're not supposed to mix column functions and
non-column functions.) It also can't cope with 'group by' queries, as
it'll count the underlying rows, not the groups. I also suspect it
can't handle join queries.
The original approach is still the most general, and IMO the best.
ChrisA
Re: Cannot step through asynchronous iterator manually
On 01/30/2016 03:06 PM, Chris Angelico wrote:
> That actually violates the SQL spec. Some servers will accept it,
> others won't. (You're not supposed to mix column functions and
> non-column functions.)

Are you sure? Wikipedia is not always the most accurate place, but they
have several clear examples on the SQL page of combining table fields
with count() listed. This is straight SQL we're talking about here, not a
particular implementation or dialect. Maybe there're some subtleties at
play here.

> It also can't cope with 'group by' queries, as
> it'll count the underlying rows, not the groups. I also suspect it
> can't handle join queries.

The Wikipedia entry on SQL, which seems to be based in some grounding of
the spec, shows that count(), joins, and group by are all compatible with
each other. So I dunno!

> The original approach is still the most general, and IMO the best.

Could be. On the other hand, letting the DB do it all solves his problem
without mucking about with async iterators.
Re: psss...I want to move from Perl to Python
Rustom Mody wrote:
> 1. One can use string-re's instead of compiled re's

And I gather that string REs are compiled on first use and cached, so you
don't lose much by using them most of the time.

-- 
Greg
Re: Cannot step through asynchronous iterator manually
On Sun, Jan 31, 2016 at 9:19 AM, Michael Torrie wrote:
> On 01/30/2016 03:06 PM, Chris Angelico wrote:
>> That actually violates the SQL spec. Some servers will accept it,
>> others won't. (You're not supposed to mix column functions and
>> non-column functions.)
>
> Are you sure? Wikipedia is not always the most accurate place, but they
> have several clear examples on the SQL page of combining table fields
> with count() listed. This is straight SQL we're talking about here, not
> a particular implementation or dialect. Maybe there're some subtleties
> at play here.

Here's some info:
http://stackoverflow.com/questions/5920070/why-cant-you-mix-aggregate-values-and-non-aggregate-values-in-a-single-select

I don't have a good spec handy, but dig around a bit with a few different
engines and you'll find that some fully-compliant engines disallow this.

>> It also can't cope with 'group by' queries, as
>> it'll count the underlying rows, not the groups. I also suspect it
>> can't handle join queries.
>
> The Wikipedia entry on SQL, which seems to be based in some grounding
> of the spec, shows that count(), joins, and group by are all compatible
> with each other. So I dunno!

Yes, they are - but not with the semantics you're looking for. If you say
something like this:

select count(*), foo from table group by foo

then you get one row for each unique foo, with its own count. It won't
tell you how many rows are in the result.

>> The original approach is still the most general, and IMO the best.
>
> Could be. On the other hand, letting the DB do it all solves his
> problem without mucking about with async iterators.

Except that it means mucking about with other things :)

ChrisA
Re: Cannot step through asynchronous iterator manually
Michael Torrie wrote:
> I'm not sure how SQLite handles it, or even what the SQL spec says, but
> I know in MySQL you could do something like this:
>
> SELECT count(id) as row_count,`tablename`.* FROM `tablename` WHERE condition

I don't think that's strictly valid SQL. I know of at least one SQL
implementation that complains if you have fields in an aggregate query
that aren't either in an aggregate function or listed in the GROUP BY
clause.

To make it valid you would have to wrap LAST() around all of the other
fields. (Probably individually -- I doubt whether LAST(tablename.*) would
be accepted.) Which seems like a lot of trouble to go to just to tell
whether you have a unique result.

Also it's asking the DB to perform more work than you really need. It has
to run the whole query before returning any results, whereas doing it
yourself you can give up after reading the second result if there is one.

-- 
Greg
Re: show instant data on webpage
On 30/01/2016 20:50, mustang wrote:
>> So, the python plotting packages mentioned can create files that hold
>> your graphs. As to whether you want to display them on the web or send
>> them around as attachments to your coworkers, or build some local
>> application to display them is your choice
>
> Sorry for my confusion :D I'm still not sure what to do. Anyway I would
> like to plot a graph with my data first of all, then I could think
> about sharing them on the network and then online.

How about https://plot.ly/python/ ?

-- 
My fellow Pythonistas, ask not what our language can do for you, ask what
you can do for our language.

Mark Lawrence
Re: Cannot step through asynchronous iterator manually
On 01/30/2016 02:19 PM, Chris Angelico wrote:
> where the ... is the full original query. In other words, the whole
> query has to be run twice - once to assert that there's exactly one
> result, and then a second time to get that result. The existing
> algorithm ("try to fetch a row - if it fails error; then try to fetch
> another - if it succeeds, error") doesn't need to fetch more than two
> results, no matter how big the query result is.
Actually it occurs to me this doesn't have to be true. The same
information he needs to know can be done with one query and only 1 result.
SELECT count(some_id_field),field1,field2,field3 FROM wherever WHERE
conditions
If the first column (or whatever you decide to alias it as) contains a
count, and the rest of the information is still there. If count is 1,
then the row is what you want and you can do whatever you wish with it.
If not, throw your exception.
Re: Cannot step through asynchronous iterator manually
On Sun, Jan 31, 2016 at 9:05 AM, Michael Torrie wrote:
> On 01/30/2016 02:57 PM, Michael Torrie wrote:
>> SELECT count(some_id_field),field1,field2,field3 FROM wherever WHERE
>> conditions
>>
>> If the first column (or whatever you decide to alias it as) contains a
>> count, and the rest of the information is still there. If count is 1,
>> then the row is what you want and you can do whatever you wish with
>> it. If not, throw your exception.
>
> I'm not sure how SQLite handles it, or even what the SQL spec says, but
> I know in MySQL you could do something like this:
>
> SELECT count(id) as row_count,`tablename`.* FROM `tablename` WHERE condition
>
> and get the same thing as SELECT * would have, with the addition of a
> "row_count" field. Note that because of the count() part, the query
> will always only return 1 row. The fields will be NULL if the count was
> zero or they will contain the fields from the last row the query found.
> In other words if there is more than one row that matches the query, it
> will only give you data from the last match.

Huh. Thank you, MySQL, for violating the spec in a different way from
what other servers do. The spec says you can't mix count(id) and
non-aggregated columns; other DBMSes permit this by cloning the ID down
all the rows, not by limiting the result to one row.

(Fortunately, nobody would ever run that on any other DBMS, as you use
the MySQL-specific backticks. Why that non-standard quoting convention
became the normal way to do things in MySQL, I don't know. It's really
annoying when I'm porting someone else's code to PostgreSQL.)

> Now if Frank is hoping to do work on the first row and then throw an
> exception if there's an additional row, then this of course won't work
> for him.

Exactly. If all he wants is to ignore additional rows, it's half the
work - and half the assertion protection.

ChrisA
Re: psss...I want to move from Perl to Python
On Friday, January 29, 2016 at 9:38:23 PM UTC-6, Rustom Mody wrote:
> JustForTheRecord[1]: Rick is someone who I sometimes agree with...

Thanks for reacting in a rational, cool-headed manner.

Many folks, especially the new members of this group, may not understand
*WHY* I react so passionately when someone readily admits that they are
unwilling to learn a specific skill, in this case: the programming
paradigm known as Object Oriented Programming (or OOP). YOU'D THINK IT
WAS *GIN-GI-VI-TIOUS* OR SOMETHING??? And they also may be unaware of the
open hostilities that exist against the OOP paradigm in this very group
-- which is odd, because why should anyone *FEAR* a paradigm?

And although some may raise legitimate questions regarding my unorthodox
"stylistic mannerisms", my intention is never to "rant for the sake of
ranting", or to "rant for the sake of disrupting", no, but to rant for
the sake of *AWAKING*; to rant for the sake of *JOLTING* my peers out of
their zombie-like religious trances!

The natural instinct of humans is for us to locate a social group that we
fit neatly into; become a member; adopt a religious attitude of "my group
is the best", and then fight ferociously against those who are perceived
as a threat to "our little group". There is no doubt that on a brisk cold
morning, none of us would enjoy having our "warm and cozy blanket" yanked
violently from our bodies so that we become exposed to the "chilling
*BITE* of conscious reality", but sometimes, this "shock therapy" is the
only method that will motivate us to "rise from our intellectual
slumbers". SOMETIMES WE ALL NEED A GOOD SMACK ON THE FACE!

And whilst this "motivational philosophy" may indeed be easier to "preach
than to practice", my job is to deliver these "metaphorical smacks"
without pride or prejudice, and I freely admit that I myself have
benefited personally from being a "shock therapy" patient. In fact, a
very prominent member of this fine community might even consider his
"smacking therapy" to be the impetus of the personality otherwise known
as Rick (rantingrick) Johnson! THOSE WHO SAY DON'T KNOW, AND THOSE WHO
KNOW CAN'T SAY! BUT [CENSORED] KNOWS WHO [CENSORED] IS!!!

Therefore, I have become convinced that we can never allow ourselves to
become closed-minded to the many diverse methods of solving problems,
because, as programmers, or more specifically, *AS ENGINEERS*, our
primary and first directive is to *SOLVE PROBLEMS*! INDEED! And since
every problem presents its own unique challenges, we must learn to wield
the many unique perspectives, unique techniques, and unique tools that
are available to us to *SOLVE* these *PROBLEMS* in the most *EFFICIENT*
manner. A manner that will *ENSURE* that we produce a high quality
product that requires very little maintenance, or at minimum, is easy to
maintain.

In conclusion, I implore the fine members of this group to never allow
themselves to become a "paradigm ideologue", or *ANY* form of ideologue
for that matter. We must *CONSTANTLY* resist our natural instincts to
religiously identify ourselves with a "single selfish perspective", and
instead discover the bits of truth, no matter how trivial, that exist
within each unique perspective. By adopting this "objective philosophy",
we will not only become better engineers, we will become better people.

This is the creed of those who are "liberated seekers of knowledge", and
who use that vast trove of knowledge to enhance the collective, *NOT*
slaves mired in the emotional divisions that are an unfortunate aspect of
our selfishly subjective individual perspectives.

[1]: Acronym unpacking at no additional charge. ;-)
Heap Implementation
Hi again,

as the topic of the old thread actually was fully discussed, I dare to
open a new one.

I finally managed to finish my heap implementation. You can find it at
https://pypi.python.org/pypi/xheap + https://github.com/srkunze/xheap.
I described my motivations and design decisions at
http://srkunze.blogspot.com/2016/01/fast-object-oriented-heap-implementation.html

@Cem
You've been worried about a C implementation. I can assure you that I did
not intend to rewrite the incredibly fast and well-tested heapq
implementation. I just re-used it. I would really be grateful for your
feedback as you have first-hand experience with heaps.

@srinivas
You might want to have a look at the removal implementation. Do you think
it would be wiser/faster to switch to the sweeping approach? I plan to
publish some benchmarks to compare heapq and xheap.

@all
What's the best/standardized tool in Python to perform benchmarking?
Right now, I use a self-made combo of unittest.TestCase and time.time +
proper formatting.

Best,
Sven

PS: fixing some weird typos and added missing part.
Re: Heap Implementation
On Sunday 31 January 2016 09:47, Sven R. Kunze wrote:
> @all
> What's the best/standardized tool in Python to perform benchmarking?

timeit

-- 
Steve
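For example (an illustrative sketch; the statement and numbers are
arbitrary):

import timeit

setup = "from heapq import heappush, heappop; h = list(range(100))"
stmt = "heappush(h, 50); heappop(h)"

# Total seconds for 100,000 executions of stmt, after running setup once:
print(timeit.timeit(stmt, setup=setup, number=100000))

The same measurement is available from the command line via
'python -m timeit -s "<setup>" "<stmt>"'.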
Re: psss...I want to move from Perl to Python
On Sunday 31 January 2016 09:18, Gregory Ewing wrote:
> Rustom Mody wrote:
>> 1. One can use string-re's instead of compiled re's
>
> And I gather that string REs are compiled on first use and cached, so
> you don't lose much by using them most of the time.

Correct. The re module keeps a cache of the last N regexes used, for some
value of N (possibly 10?) so for casual use there's no real point to
pre-compiling other than fussiness.

But if you have an application that makes heavy-duty use of regexes, e.g.
some sort of parser with dozens of distinct regexes, you might not want
to rely on the cache.

-- 
Steve
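For example (an illustrative sketch):

import re

# Module-level functions compile the pattern on first use and cache it:
re.search(r"\d+", "abc 123")   # compiled here and cached
re.search(r"\d+", "def 456")   # cache hit, no recompilation

# Pre-compiling skips the cache lookup and cannot be evicted:
DIGITS = re.compile(r"\d+")
DIGITS.search("abc 123")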
Re: psss...I want to move from Perl to Python
On Sunday, January 31, 2016 at 7:27:06 AM UTC+5:30, Steven D'Aprano wrote:
> On Sunday 31 January 2016 09:18, Gregory Ewing wrote:
>> Rustom Mody wrote:
>>> 1. One can use string-re's instead of compiled re's
>>
>> And I gather that string REs are compiled on first use and cached, so
>> you don't lose much by using them most of the time.
>
> Correct. The re module keeps a cache of the last N regexes used, for
> some value of N (possibly 10?) so for casual use there's no real point
> to pre-compiling other than fussiness.
>
> But if you have an application that makes heavy-duty use of regexes,
> e.g. some sort of parser with dozens of distinct regexes, you might not
> want to rely on the cache.

Python 3.4.3+ (default, Oct 14 2015, 16:03:50)
[GCC 5.2.1 20151010] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re._MAXCACHE
512

Have you ever seen a program that uses 512 REs? I haven't :-)
Re: psss...I want to move from Perl to Python
On Sunday, January 31, 2016 at 9:18:31 AM UTC+5:30, Cameron Simpson wrote:
> On 30Jan2016 19:22, rusi wrote:
>> Python 3.4.3+ (default, Oct 14 2015, 16:03:50)
>> [GCC 5.2.1 20151010] on linux
>> >>> import re
>> >>> re._MAXCACHE
>> 512
>>
>> Have you ever seen a program that uses 512 REs? I haven't :-)
>
> I have. I've got one right here. It happens to be in perl, but it has
> been in need of a recode in Python for a long time. It has about 3000
> regexps.
>
> Of course they will be explicitly compiled in the recode.

I would guess it needs more recoding than explicit compilation!
Maybe something like http://www.colm.net/open-source/ragel/
Unfortunately no Python binding so far :-(
x=something, y=somethinelse and z=crud all likely to fail - how do i wrap them up
I'm parsing html and I'm doing:

x = root.find_class(...
y = root.find_class(..
z = root.find_class(..

all 3 are likely to fail, so typically I'd have to stick each in a try.
This is a huge pain for obvious reasons:

try:
    ...
except something:
    x = 'default_1'

(repeat 3 times)

Is there some other nice way to wrap this stuff up? I can't do:

try:
    x =
    y =
    z =
except:

because here if x fails, y and z might have succeeded. Pass the statement
as a string to a try function? Any other way?
Re: x=something, y=somethinelse and z=crud all likely to fail - how do i wrap them up
On Sun, Jan 31, 2016 at 3:58 PM, Veek. M wrote:
> I'm parsing html and I'm doing:
>
> x = root.find_class(...
> y = root.find_class(..
> z = root.find_class(..
>
> all 3 are likely to fail so typically I'd have to stick it in a try. This
> is a huge pain for obvious reasons.
>
> try:
>     x = root.find_class(...)
> except something:
>     x = 'default_1'
> (repeat 3 times)
>
> Is there some other nice way to wrap this stuff up?

I'm not sure what you're using to parse HTML here (there are several
libraries for doing that), but the first thing I'd look for is an option to
have it return a default if it doesn't find something - even if that
default has to be (say) None.

But failing that, you can always write your own wrapper:

def find_class(root, ...):
    try:
        return root.find_class(...)
    except something:
        return 'default_1'

Or have the default as a parameter, if it's different for the different
ones.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
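With lxml in particular, find_class() returns a (possibly empty) list, so a concrete version of that wrapper might be the sketch below; the IndexError and the default parameter are assumptions about the failure mode:

def find_first(root, class_name, default=None):
    """First element with the given CSS class, or `default` if none match."""
    try:
        return root.find_class(class_name)[0]
    except IndexError:   # find_class returned an empty list
        return default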
Re: psss...I want to move from Perl to Python
On 2016-01-31 01:56:51, "Steven D'Aprano" wrote:
> On Sunday 31 January 2016 09:18, Gregory Ewing wrote:
>> Rustom Mody wrote:
>>> 1. One can use string-re's instead of compiled re's
>>
>> And I gather that string REs are compiled on first use and
>> cached, so you don't lose much by using them most of the
>> time.
>
> Correct. The re module keeps a cache of the last N regexes used, for some
> value of N (possibly 10?) so for casual use there's no real point to
> pre-compiling other than fussiness.

In Python 3.5, it's 512.

> But if you have an application that makes heavy-duty use of regexes, e.g.
> some sort of parser with dozens of distinct regexes, you might not want to
> rely on the cache.

It's slightly faster to use a pre-compiled regex because it won't have to
look it up in the cache, although most of the time it probably won't matter.

--
https://mail.python.org/mailman/listinfo/python-list
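Measuring that lookup overhead is straightforward with timeit; the toy pattern and string here are illustrative, and the absolute numbers will vary by machine:

import timeit

setup = r"import re; pat = re.compile(r'\d+'); s = 'abc 123 def'"

print(timeit.timeit(r"re.search(r'\d+', s)", setup=setup))  # cache lookup per call
print(timeit.timeit("pat.search(s)", setup=setup))          # pre-compiled, no lookup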
Re: psss...I want to move from Perl to Python
On 30Jan2016 19:22, rusi wrote:
> Python 3.4.3+ (default, Oct 14 2015, 16:03:50)
> [GCC 5.2.1 20151010] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> re._MAXCACHE
> 512
>
> Have you ever seen a program that uses 512 re's?
> I haven't :-)

I have. I've got one right here. It happens to be in perl, but it has been
in need of a recode in Python for a long time. It has about 3000 regexps.

Of course they will be explicitly compiled in the recode.

Cheers,
Cameron Simpson
--
https://mail.python.org/mailman/listinfo/python-list
Re: x=something, y=somethinelse and z=crud all likely to fail - how do i wrap them up
Chris Angelico wrote:
> On Sun, Jan 31, 2016 at 3:58 PM, Veek. M wrote:
>> I'm parsing html and I'm doing:
>>
>> x = root.find_class(...
>> y = root.find_class(..
>> z = root.find_class(..
>>
>> all 3 are likely to fail so typically I'd have to stick it in a try. This
>> is a huge pain for obvious reasons.
>>
>> try:
>>     x = root.find_class(...)
>> except something:
>>     x = 'default_1'
>> (repeat 3 times)
>>
>> Is there some other nice way to wrap this stuff up?
>
> I'm not sure what you're using to parse HTML here (there are several
> libraries for doing that), but the first thing I'd look for is an
> option to have it return a default if it doesn't find something - even
> if that default has to be (say) None.
>
> But failing that, you can always write your own wrapper:
>
> def find_class(root, ...):
>     try:
>         return root.find_class(...)
>     except something:
>         return 'default_1'
>
> Or have the default as a parameter, if it's different for the different
> ones.
>
> ChrisA
I'm using lxml.html
def parse_page(self, root):
    for li_item in root.xpath('//li[re:test(@id, "^item[a-z0-9]+$")]',
            namespaces={'re': "http://exslt.org/regular-expressions"}):
        description = li_item.find_class('vip')[0].text_content()
        link = li_item.find_class('vip')[0].get('href')
        price_dollar = li_item.find_class('lvprice prc')[0].xpath('span')[0].text
        bids = li_item.find_class('lvformat')[0].xpath('span')[0].text

        tme_time = li_item.find_class('tme')[0].xpath('span')[0].get('timems')
        if tme_time:
            time_hrs = int(tme_time)/1000 - time.time()
        else:
            time_hrs = 'No time found'

        shipping = li_item.find_class('lvshipping')[0].xpath('span/span/span')[0].text_content()

        print('{} {} {} {} {}'.format(link, price_dollar, time_hrs,
                                      shipping, bids))
        print('-')
--
https://mail.python.org/mailman/listinfo/python-list
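One way to flatten the repeated find_class(...)[0].xpath(...)[0] chains above is a pair of small helpers; the names and defaults here are invented for illustration:

def first(seq, default=None):
    """First element of `seq`, or `default` if the lookup found nothing."""
    return seq[0] if seq else default

def class_text(elem, class_name, default='not found'):
    """Text content of the first descendant with CSS class `class_name`."""
    node = first(elem.find_class(class_name))
    return node.text_content() if node is not None else default

# e.g.  description = class_text(li_item, 'vip')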
Re: x=something, y=somethinelse and z=crud all likely to fail - how do i wrap them up
Veek. M wrote:
> Chris Angelico wrote:
>
>> On Sun, Jan 31, 2016 at 3:58 PM, Veek. M wrote:
>>> I'm parsing html and I'm doing:
>>>
>>> x = root.find_class(...
>>> y = root.find_class(..
>>> z = root.find_class(..
>>>
>>> all 3 are likely to fail so typically I'd have to stick it in a try.
>>> This is a huge pain for obvious reasons.
>>>
>>> try:
>>>     x = root.find_class(...)
>>> except something:
>>>     x = 'default_1'
>>> (repeat 3 times)
>>>
>>> Is there some other nice way to wrap this stuff up?
>>
>> I'm not sure what you're using to parse HTML here (there are several
>> libraries for doing that), but the first thing I'd look for is an
>> option to have it return a default if it doesn't find something - even
>> if that default has to be (say) None.
>>
>> But failing that, you can always write your own wrapper:
>>
>> def find_class(root, ...):
>>     try:
>>         return root.find_class(...)
>>     except something:
>>         return 'default_1'
>>
>> Or have the default as a parameter, if it's different for the different
>> ones.
>>
>> ChrisA
>
> I'm using lxml.html
>
> def parse_page(self, root):
>     for li_item in root.xpath('//li[re:test(@id, "^item[a-z0-9]+$")]',
>             namespaces={'re': "http://exslt.org/regular-expressions"}):
>         description = li_item.find_class('vip')[0].text_content()
>         link = li_item.find_class('vip')[0].get('href')
>         price_dollar = li_item.find_class('lvprice prc')[0].xpath('span')[0].text
>         bids = li_item.find_class('lvformat')[0].xpath('span')[0].text
>
>         tme_time = li_item.find_class('tme')[0].xpath('span')[0].get('timems')
>         if tme_time:
>             time_hrs = int(tme_time)/1000 - time.time()
>         else:
>             time_hrs = 'No time found'
>
>         shipping = li_item.find_class('lvshipping')[0].xpath('span/span/span')[0].text_content()
>
>         print('{} {} {} {} {}'.format(link, price_dollar, time_hrs,
>                                       shipping, bids))
>
>         print('-')
Someone suggested I refactor the find_class/xpath calls into wrapper
functions, but I tried it and it didn't look all that great.
Just give me a general idea of how to deal with messy crud like this.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Heap Implementation
@Sven actually you are not sweeping at all; as I remember from my last
post, what I meant by sweeping is periodically deleting the elements which
were marked as popped items.

Kudos on that __setitem__ technique: instead of using references to the
items like in HeapDict, it is brilliant of you to simply use __setitem__.

On Sun, Jan 31, 2016 at 4:17 AM, Sven R. Kunze wrote:
> Hi again,
>
> as the topic of the old thread actually was fully discussed, I dare to
> open a new one.
>
> I finally managed to finish my heap implementation. You can find it at
> https://pypi.python.org/pypi/xheap + https://github.com/srkunze/xheap.
>
> I described my motivations and design decisions at
> http://srkunze.blogspot.com/2016/01/fast-object-oriented-heap-implementation.html
>
> @Cem
> You've been worried about a C implementation. I can assure you that I did
> not intend to rewrite the incredibly fast and well-tested heapq
> implementation. I just re-used it.
>
> I would really be grateful for your feedback as you have first-hand
> experience with heaps.
>
> @srinivas
> You might want to have a look at the removal implementation. Do you think
> it would be wiser/faster to switch to the sweeping approach?
>
> I plan to publish some benchmarks to compare heapq and xheap.
>
> @all
> What's the best/standardized tool in Python to perform benchmarking?
> Right now, I use a self-made combo of unittest.TestCase and time.time +
> proper formatting.
>
> Best,
> Sven
>
> PS: fixing some weird typos and added missing part.

--
https://mail.python.org/mailman/listinfo/python-list
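For readers following along, the "sweeping" being discussed is essentially lazy deletion on top of heapq. This is a minimal sketch of the idea, not xheap's actual code, and the sweep policy is left to the caller:

import heapq

class LazyHeap:
    """Heap with O(1) removal by marking; marked items are purged lazily."""

    def __init__(self):
        self._heap = []
        self._removed = set()

    def push(self, item):
        heapq.heappush(self._heap, item)

    def remove(self, item):
        # Just mark the item; it is physically dropped on pop or sweep.
        self._removed.add(item)

    def pop(self):
        while self._heap:
            item = heapq.heappop(self._heap)
            if item in self._removed:
                self._removed.discard(item)
            else:
                return item
        raise IndexError('pop from empty heap')

    def sweep(self):
        # Periodically rebuild to purge marked items in bulk.
        self._heap = [x for x in self._heap if x not in self._removed]
        self._removed.clear()
        heapq.heapify(self._heap)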
Re: x=something, y=somethinelse and z=crud all likely to fail - how do i wrap them up
On Sun, 31 Jan 2016 03:58 pm, Veek. M wrote:
> Is there some other nice way to wrap this stuff up?

The answer to "how do I wrap this stuff up?" is nearly always:

- refactor your code so you don't need to;
- subclass and extend the method;
- write a function;
- write a delegate class.

Pick whichever is most relevant to your specific situation.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list
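Applied to this thread's lxml example, the delegate-class option might look like the sketch below; SafeRoot and its default handling are invented for illustration:

class SafeRoot:
    """Delegate that wraps an lxml element and softens failed lookups."""

    def __init__(self, elem, default=None):
        self._elem = elem
        self._default = default

    def find_class(self, class_name):
        # Return the first match instead of a list, or the default.
        matches = self._elem.find_class(class_name)
        return matches[0] if matches else self._default

    def __getattr__(self, name):
        # Anything not overridden passes through to the wrapped element.
        return getattr(self._elem, name)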
