Re: Installing PyGame?

2014-04-26 Thread Andrea D'Amore

On 2014-04-25 23:57:21 +, Gregory Ewing said:


> I don't know what you're doing to hose your system that badly.
> I've never had a problem that couldn't be fixed by deleting
> whatever the last thing was I added that caused it.


The actual problem with the "native MacOSX way" is that there's no
official way to uninstall a package once it's installed.


> Also the problems I had with one of the third-party package
> managers was because it *didn't* keep its own stuff properly
> separated. It installed libraries on my regular library path
> so that they got picked up by things that they weren't
> appropriate for.


This most likely was not MacPorts; its default install path is not
searched by dyld by default.


> But I use a wide
> variety of libraries, not all of them available that way,
> and many of them installed from source, and I find it's
> less hassle overall to do everything the native MacOSX way
> wherever possible.


Well, the "native" MacOSX way would probably be registering a package
via installer(8), not compiling from source.

As long as you're comfortable with your system then it's good for you.
In my experience, the more libraries/software I install, the more useful
a package manager becomes at avoiding stray files left behind when
upgrading or uninstalling.


I use a mix of MacPorts to provide the base tools and virtualenv for
project-specific pypi libraries.


--
Andrea

--
https://mail.python.org/mailman/listinfo/python-list


Re: Installing PyGame?

2014-04-26 Thread Andrea D'Amore

On 2014-04-25 23:42:33 +, Gregory Ewing said:


> That's fine if it works, but the OP said he'd already tried
> various things like that and they *didn't* work for him.


From the "original" message (an empty reply that fully quotes a message
from ten months earlier) I couldn't figure out what the OP actually did
(he says "just about every way possible") or what his "an error" actually is.

Most likely all of those methods are sound; I'd rather fix one of them,
given further info, than switch to yet another one looking for a magical
solution.


--
Andrea



Re: Unicode in Python

2014-04-26 Thread wxjmfauth
==

I wrote once 90 % of Python 2 apps (a generic term) supposed to
process text, strings are not working.

In Python 3, that's 100 %. It is somehow only by chance, apps may
give the illusion they are properly working.

jmf



Re: possible bug in re expression?

2014-04-26 Thread Steven D'Aprano
On Fri, 25 Apr 2014 14:32:30 -0400, Terry Reedy wrote:

> On 4/25/2014 12:30 PM, Robin Becker wrote:

[...]
>> should
>>
>> re.compile('.{1,+3}')
>>
>> raise an error? It doesn't on python 2.7 or 3.3.
> 
> And it should not because it is not an error. '+' means 'match 1 or more
> occurrences of the preceding re' and the preceding re is ','.

Actually, no. Braces have special meaning, and are used to specify a 
number of matches. R{m,n} matches from m to n repetitions of the 
preceding regex R:


py> re.search('(spam){2,4}', 'spam-spamspamspam-spam').group()
'spamspamspam'


This surprises me:

>  >>> re.match('a{1,+3}', 'a{1,,,3}').group()
> 'a{1,,,3}'


I would have expected that either +3 would have been interpreted as just 
"3", or that it would have been an invalid regex. It appears that what is 
happening is that if the braces cannot be interpreted as a repetition 
group, they are interpreted as regular characters. That sort of silent 
error is why I hate programming in regexes.
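
A short sketch of the behaviour described above, using only the standard re module: a well-formed {m,n} is a repetition count, while a brace expression that cannot be parsed as one falls back to literal characters. The variable names are only for illustration.

```python
import re

# Well-formed braces: R{m,n} repeats the preceding regex m to n times.
counted = re.search('(spam){2,4}', 'spam-spamspamspam-spam').group()

# 'a{1,+3}' is not a valid repetition, so '{', '1', '3' and '}' are taken
# literally and ',+' means "one or more commas".
literal = re.match('a{1,+3}', 'a{1,,,3}').group()

# Consequently the malformed pattern does not match a plain run of a's,
# whereas the well-formed 'a{1,3}' does.
no_match = re.match('a{1,+3}', 'aaa')
plain = re.match('a{1,3}', 'aaa').group()

print(counted, literal, no_match, plain)
```

No exception is raised at any point, which is exactly the silent fallback being complained about.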

> I suppose that one could argue that '{' alone should be treated as
> special immediately, and not just when a matching '}' is found, and
> should disable other special meanings. I wonder what JS does if there is
> no matching '}'?

Probably silently do the wrong thing :-)


-- 
Steven D'Aprano
http://import-that.dreamwidth.org/


Re: Unicode in Python

2014-04-26 Thread Frank Millman

 wrote in message 
news:[email protected]...
> ==
>
> I wrote once 90 % of Python 2 apps (a generic term) supposed to
> process text, strings are not working.
>
> In Python 3, that's 100 %. It is somehow only by chance, apps may
> give the illusion they are properly working.
>

It is quite frustrating when you make these statements without explaining 
what you mean by 'not working'.

It would be really useful if you could spell out -

1. what you did
2. what you expected to happen
3. what actually happened

Frank Millman





Re: Unicode in Python

2014-04-26 Thread Ben Finney
"Frank Millman"  writes:

>  wrote […]

> It is quite frustrating when you make these statements without
> explaining what you mean by 'not working'.

Please do not engage “wxjmfauth” on this topic; he is an
amply-demonstrated troll with nothing tangible to back up his incessant
complaints about Unicode in Python. He is best ignored, IMO.

-- 
 \  “As the evening sky faded from a salmon color to a sort of |
  `\   flint gray, I thought back to the salmon I caught that morning, |
_o__)and how gray he was, and how I named him Flint.” —Jack Handey |
Ben Finney



Re: Help with changes in traceback stack from Python 2.7 to Python 3.x

2014-04-26 Thread Ned Batchelder

On 4/26/14 1:50 AM, Andrew Konstantaras wrote:

I wrote the following code that works in Python 2.7 that takes the
variables passed to the function into a dictionary.  The following call:

  strA = 'a'
  intA = 1
  dctA = makeDict(strA, intA)

produces the following dictionary:

   {'strA':'a', 'intA':1}

To access the names passed into the function, I had to find the
information by parsing through the stack.  The code that used to work is:

from traceback import extract_stack

def makeDict(*args):
    strAllStack = str(extract_stack())
    intNumLevels = len(extract_stack())
    intLevel = 0
    blnFinished = False
    while not blnFinished:
        strStack = str(extract_stack()[intLevel])
        if strStack.find("makeDict(")>0:
            blnFinished = True
        intLevel += 1
        if intLevel >= intNumLevels:
            blnFinished = True
    strStartText = "= makeDict("
    intLen = len(strStartText)
    intOpenParenLoc = strStack.find(strStartText)
    intCloseParenLoc = strStack.find(")", intOpenParenLoc)
    strArgs = strStack[ intOpenParenLoc+intLen : intCloseParenLoc ].strip()
    lstVarNames = strArgs.split(",")
    lstVarNames = [ s.strip() for s in lstVarNames ]
    if len(lstVarNames) == len(args):
        tplArgs = map(None, lstVarNames, args)
        newDict = dict(tplArgs)
        return newDict
    else:
        return "Error, argument name-value mismatch in function 'makeDict'. lstVarNames: " + str(lstVarNames) + "\n args: " + str(args), strAllStack

The same code does not work in Python 3.3.4.  I have tried parsing
through the stack information and frames and I can't find any reference
to the names of the arguments passed to the function.  I have tried
inspecting the function and other functions in the standard modules, but
I can't seem to find anything that will provide this information.


1) This is a very strange function. Instead of going to all this 
trouble, why not use:


dctA = dict(strA=strA, intA=intA)

Yes, you have to repeat the names, but you'd be done by now, it works on 
both versions of Python, and people reading your code would understand 
what it does.


2) Why is your code examining the entire stack?  The frame you want is 
the one just above you.  Why are you stringifying the tuple produced by 
extract_stack when the source line you want is the fourth element?  Why 
are you using a while loop and a counter to iterate over elements of a list?


3) In the error case you return a tuple when the caller is expecting a 
dict? Why not raise an exception?


4) Your code only works if makeDict is assigned to a name.  When I first 
tried it, I used "print(makeDict(...))", and it failed completely.


5) You haven't mentioned what goes wrong on Python 3, but when I tried 
it, I got:


Traceback (most recent call last):
  File "foo3.py", line 34, in <module>
    d = makeDict(a, b)
  File "foo3.py", line 27, in makeDict
    newDict = dict(list(tplArgs))
TypeError: 'NoneType' object is not callable

Looking at your code, I see:

tplArgs = map(None, lstVarNames, args)

I didn't realize map accepted a callable of None (TIL!), but it no 
longer does in Python 3.  You'll have to do this a different way.


But seriously: just use dict() and be done with it.  There's a lot here 
that can be much simpler if you learn to use Python as it was meant to 
be used.  You are swimming upstream, there are easier ways.




Can anyone point me in the direction to find this information?  Any help
is appreciated.

---Andrew






--
Ned Batchelder, http://nedbatchelder.com



Re: Unicode in Python

2014-04-26 Thread Ian Kelly
On Apr 26, 2014 3:46 AM, "Frank Millman"  wrote:
>
>
>  wrote in message
> news:[email protected]...
> > ==
> >
> > I wrote once 90 % of Python 2 apps (a generic term) supposed to
> > process text, strings are not working.
> >
> > In Python 3, that's 100 %. It is somehow only by chance, apps may
> > give the illusion they are properly working.
> >
>
> It is quite frustrating when you make these statements without explaining
> what you mean by 'not working'.

As far as anybody has been able to determine, what jmf means by "not
working" is  that strings containing the € character are handled less
efficiently than strings that do not contain it in certain contrived test
cases.


Re: Help with changes in traceback stack from Python 2.7 to Python 3.x

2014-04-26 Thread Ian Kelly
On Apr 26, 2014 8:12 AM, "Ned Batchelder"  wrote:
> Looking at your code, I see:
>
>
> tplArgs = map(None, lstVarNames, args)
>
> I didn't realize map accepted a callable of None (TIL!), but it no longer
> does in Python 3.  You'll have to do this a different way.

The Python 3 replacement for map(None, ...) is itertools.zip_longest.
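
A minimal sketch of that replacement, reusing the names from the makeDict example (the sample values are made up). Because makeDict already checks that both sequences have the same length, plain zip() would presumably do there as well; zip_longest only matters when the lengths differ.

```python
from itertools import zip_longest

names = ['strA', 'intA']
values = ('a', 1)

# Equal lengths: zip_longest pairs the items up just as map(None, ...) did.
pairs = dict(zip_longest(names, values))

# Unequal lengths: the shorter input is padded with None (the default
# fillvalue), which is what Python 2's map(None, ...) did as well.
padded = list(zip_longest('ab', 'xyz'))

print(pairs)
print(padded)
```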


Re: how to split this kind of text into sections

2014-04-26 Thread oyster
First of all, thank you all for your answers. I received python
mail-list in a daily digest, so it is not easy for me to quote your
mail separately.

I will try to explain my situation to my best, but English is not my
native language, I don't know whether I can make it clear at last.

Every SECTION starts with 2 special lines; these 2 lines is special
because they have some same characters (the length is not const for
different section) at the beginning; these same characters is called
the KEY for this section. For every 2 neighbor sections, they have
different KEYs.

After these 2 special lines, some paragraph is followed. Paragraph
does not have any KEYs.

So, a section = 2 special lines with KEYs at the beginning + some
paragraph without KEYs

However there maybe some paragraph before the first section, which I
do not need and want to drop it

I need a method to split the whole text into SECTIONs and to know all the KEYs

I have tried to solve this problem via re module, but failed. Maybe I
can make myself clearer by showing the regular expression object
reobj = re.compile(
    r"(?P<bookname>[^\r\n]*?)[^\r\n]*?\r\n(?P=bookname)[^\r\n]*?\r\n.*?",
    re.DOTALL)
which can get the first 2 lines of a section, but fails to get the rest
of the section, which does not have any KEYs at the beginning. The hard
part for me is to express "paragraph does not have KEYs".

Even if I can get the first 2 lines, I think a regular expression is
expensive for my text.

That is all. I hope to get some more suggestions. Thanks.

[demo text starts]
a line we do not need
I am section axax
I am section bbb
(and here goes many other text)...

let's continue to
let's continue, yeah
.(and here goes many other text)...

I am using python
I am using perl
.(and here goes many other text)...

Programming is hard
Programming is easy
How do you thing?
I do’t know
[demo text ends]

The above text should be split into a LIST with 4 items, and I also
need to know the KEYs for the LIST, which are ['I am section ',
"let's continue", 'I am using ', 'Programming is ']:
lst=[
'''a line we do not need
I am section axax
I am section bbb
(and here goes many other text)... ''',

'''let's continue to
let's continue, yeah
.(and here goes many other text)... ''',

'''I am using python
I am using perl
.(and here goes many other text)... ''',

'''Programming is hard
Programming is easy
How do you thing?
I do’t know'''
]


Re: feedparser error

2014-04-26 Thread Kushal Kumaran
tad na  writes:

> python 2.7.2
>
> The following code has an error and I can not figure out why:
>
> import feedparser
> d = feedparser.parse('http://bl.ocks.org/mbostock.rss')
> numb = len(d['entries'])
> for post in d.entries:
>     print post.pubDate+"\n"
>
> ---
> the error is :
>
> print post.pubDate+"\n"
>   File "build\bdist.win32\egg\feedparser.py", line 416, in __getattr__
> raise AttributeError, "object has no attribute '%s'" % key
> AttributeError: object has no attribute 'pubDate'
>
> ---
>
> The only thing I can think of is feedparser does not like 
> uppercase(pubDate)?
> I can not change someone else's rss.   What can I do here?

You want post.published, or post.published_parsed.  See the feedparser
documentation here:
https://pythonhosted.org/feedparser/reference-entry-published.html

-- 
regards,
kushal




Re: feedparser error

2014-04-26 Thread MRAB

On 2014-04-26 03:16, tad na wrote:

python 2.7.2

The following code has an error and I can not figure out why:

import feedparser
d = feedparser.parse('http://bl.ocks.org/mbostock.rss')
numb = len(d['entries'])
for post in d.entries:
 print post.pubDate+"\n"

---
the error is :

 print post.pubDate+"\n"
   File "build\bdist.win32\egg\feedparser.py", line 416, in __getattr__
 raise AttributeError, "object has no attribute '%s'" % key
AttributeError: object has no attribute 'pubDate'

---

The only thing I can think of is feedparser does not like 
uppercase(pubDate)?
I can not change someone else's rss.   What can I do here?


Print dir(post) to see what attributes it has.


Re: how to split this kind of text into sections

2014-04-26 Thread Tim Chase
On 2014-04-26 23:53, oyster wrote:
> I will try to explain my situation to my best, but English is not my
> native language, I don't know whether I can make it clear at last.

Your follow-up reply made much more sense and your written English is
far better than many native speakers'. :-)

> Every SECTION starts with 2 special lines; these 2 lines is special
> because they have some same characters (the length is not const for
> different section) at the beginning; these same characters is called
> the KEY for this section. For every 2 neighbor sections, they have
> different KEYs.

I suspect you have a minimum number of characters (or words) to
consider, otherwise a single character duplicated at the beginning of
the line would delimit a section, such as

 abcd
 afgh

because they share the commonality of an "a".  The code I provided
earlier should give you what you describe.  I've tweaked and tested,
and provided it below.  Note that I require a minimum overlap of 6
characters (MIN_LEN).  It also gathers the initial stuff (that you
want to discard) under the empty key, so you can either delete that,
or ignore it.

> I need a method to split the whole text into SECTIONs and to know
> all the KEYs
> 
> I have tried to solve this problem via re module

I don't think the re module will be as much help here.

-tkc


from collections import defaultdict
import itertools as it
MIN_LEN = 6
def overlap(s1, s2):
    "Given 2 strings, return the initial overlap between them"
    return ''.join(
        c1
        for c1, c2
        in it.takewhile(
            lambda pair: pair[0] == pair[1],
            it.izip(s1, s2)
            )
        )
prevline = "" # the initial key under which preamble gets stored
output = defaultdict(list)
key = None
with open("data.txt") as f:
    for line in f:
        if len(line) >= MIN_LEN and prevline[:MIN_LEN] == line[:MIN_LEN]:
            key = overlap(prevline, line)
        output[key].append(line)
        prevline = line
for k,v in output.items():
    print str(k).center(60,'=')
    print ''.join(v)
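
The listing above targets Python 2 (it.izip and print statements). For anyone on Python 3, here is a sketch of the same overlap() helper; the only change is that the built-in zip() is used, since it is already lazy there. The sample lines are taken from the demo text.

```python
import itertools as it

def overlap(s1, s2):
    "Given 2 strings, return the initial overlap between them"
    return ''.join(
        c1
        for c1, c2
        in it.takewhile(
            lambda pair: pair[0] == pair[1],
            zip(s1, s2)  # Python 3: zip is lazy, as izip was in Python 2
        )
    )

key = overlap("I am section axax\n", "I am section bbb\n")
print(repr(key))
```

The two print statements at the end of the listing would likewise need parentheses on Python 3.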










Re: feedparser error

2014-04-26 Thread tad na


You guys are good. thanks.



===
On Saturday, April 26, 2014 11:55:35 AM UTC-5, MRAB wrote:
On 2014-04-26 03:16, tad na wrote:

 python 2.7.2
 The following code has an error and I can not figure out why:

 import feedparser
 d = feedparser.parse('http://bl.ocks.org/mbostock.rss')
 numb = len(d['entries'])
 for post in d.entries:
  print post.pubDate+"\n"
 ---
 the error is :
  print post.pubDate+"\n"
File "build\bdist.win32\egg\feedparser.py", line 416, in __getattr__
  raise AttributeError, "object has no attribute '%s'" % key
 AttributeError: object has no attribute 'pubDate'

 ---

The only thing I can think of is feedparser does not like 
uppercase(pubDate)?
 I can not change someone else's rss.   What can I do here?
Print dir(post) to see what attributes it has.



Re: Proper deletion of selected items during map iteration in for loop: Thanks to all

2014-04-26 Thread Charles Hixson

On 04/25/2014 10:53 AM, Charles Hixson wrote:
What is the proper way to delete selected items during iteration of a 
map?  What I want to do is:


for (k, v) in m.items():
   if f(k):
  #  do some processing of v and save result elsewhere
  del m[k]

But this gives (as should be expected):
RuntimeError: dictionary changed size during iteration
In the past I've accumulated the keys to be deleted in a separate 
list, but this time there are likely to be a large number of them, so 
is there some better way?




Going over the various responses, it looks like saving the "to be 
deleted" keys to a list, and then iterating over that to delete is the 
best answer.  I expect that I'll be deleting around 1/3 during each 
iteration of the process...and then adding new ones back in. There 
shouldn't be a really huge number of deletions on any particular pass, 
but it will be looped through many times...so if there were any better 
way to do this, it would speed things up considerably...but it's not 
really surprising that there doesn't appear to be.  So now it translates 
into (approximately, not yet tested):


toDel = []
for (k, v) in m.items():
   if f(k):
  #  do some processing of v and save result elsewhere
  toDel.append(k)
   else:
      pass  # do something else
for k in toDel:
   del m[k]
toDel = None


--
Charles Hixson



Re: Proper deletion of selected items during map iteration in for loop: Thanks to all

2014-04-26 Thread Tim Chase
On 2014-04-26 12:25, Charles Hixson wrote:
> I expect that I'll be deleting around 1/3 during
> each iteration of the process...and then adding new ones back in.
> There shouldn't be a really huge number of deletions on any
> particular pass, but it will be looped through many times...

If you have further details on what triggers the "adding new ones
back in", and what changes or remains the same between various
passes, there might be room for optimization elsewhere.

-tkc






Re: Proper deletion of selected items during map iteration in for loop

2014-04-26 Thread Peter Otten
Charles Hixson wrote:

> What is the proper way to delete selected items during iteration of a
> map?  What I want to do is:
> 
> for (k, v) in m.items():
>     if f(k):
>         #  do some processing of v and save result elsewhere
>         del m[k]
> 
> But this gives (as should be expected):
>  RuntimeError: dictionary changed size during iteration
> In the past I've accumulated the keys to be deleted in a separate list,
> but this time there are likely to be a large number of them, so is there
> some better way?

It just struck me that you can store the keys to be deleted as values in the 
same dict. That way you need no extra memory:

def delete_items(d, keys):
    keys = iter(keys)
    try:
        first = prev = next(keys)
    except StopIteration:
        return

    for key in keys:
        d[prev] = prev = key
    d[prev] = first

    key = first
    while True:
        key = d.pop(key)
        if key is first:
            break

if __name__ == "__main__":
    import string
    data = dict(zip(range(10), string.ascii_lowercase))
    print("original data:", data)
    print("removing odd items...")
    delete_items(data, keys=(k for k in data if k % 2))
    print("modified data:", data)

    print("delete no items...")
    delete_items(data, [])
    print("modified data:", data)

    print("delete a single item (6)...")
    delete_items(data, [6])
    print("modified data:", data)

    print("delete all items...")
    delete_items(data, data)
    print("modified data:", data)

While I think I am a genius*, in practice this approach will probably not be 
the most effective.

(*) (Un)fortunately that feeling never lasts longer than a few minutes ;)



Re: Proper deletion of selected items during map iteration in for loop: Thanks to all

2014-04-26 Thread Steven D'Aprano
On Sat, 26 Apr 2014 12:25:27 -0700, Charles Hixson wrote:

> On 04/25/2014 10:53 AM, Charles Hixson wrote:
>> What is the proper way to delete selected items during iteration of a
>> map?  What I want to do is:
>>
>> for (k, v) in m.items():
>>    if f(k):
>>       #  do some processing of v and save result elsewhere
>>       del m[k]
>>
>> But this gives (as should be expected):
>> RuntimeError: dictionary changed size during iteration
>> In the past I've accumulated the keys to be deleted in a separate list,
>> but this time there are likely to be a large number of them, so is
>> there some better way?
>>
>>
> Going over the various responses, it looks like saving the "to be
> deleted" keys to a list, and then iterating over that to delete is the
> best answer.  


I don't think that there is any one "best" answer that always applies. 
"My dict has three items, and I'll be deleting most of them" is likely to 
have different performance characteristics from "My dict has three 
billion items, and I'll be deleting two [or two billion] of them".

So how much time do you want to spend tuning this for optimum performance 
(or memory use, or programmer time)? Or is "good enough" good enough?

I think the two obviously good enough approaches are:

- save a "to be deleted" list, then delete those keys;

- copy the "not to be deleted" items into a new dict

My intuition is that the second will probably be faster, unless your dict 
is truly monstrously big. Not millions, but tens or hundreds or thousands 
of millions of items, depending on how much memory your computer has.

You've already got a code snippet using the "to be deleted" list, here's 
how I would do the other way:

new = {}
for k, v in m.items():
    if f(k):
        process(v)
    else:
        new[k] = v
m.clear()
m.update(new)
del new


If you don't care about modifying the existing dict, but can afford to 
write in a more functional style, you don't even need to bother doing 
that m.clear(), m.update(new). Just return the new dict, stop using the 
old one (it will be garbage collected), and use the new one.

Oh, in case you're wondering, this will be more efficient than it may 
seem, because the actual data in the dict isn't copied. The only things 
being copied are references to the keys and values, not the keys and 
values themselves.
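
A small sketch to illustrate the point about references, assuming a made-up dict m and a hypothetical predicate f(): after rebuilding the dict, the surviving values are the very same objects as before, not copies.

```python
m = {'spam': [1, 2, 3], 'eggs': [4, 5], 'ham': [6]}

def f(k):
    # Hypothetical predicate: delete keys starting with 's'.
    return k.startswith('s')

kept_value = m['eggs']  # remember the original list object

new = {}
for k, v in m.items():
    if not f(k):
        new[k] = v  # stores a reference; the list itself is not copied
m.clear()
m.update(new)

same_object = m['eggs'] is kept_value
print(sorted(m), same_object)
```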



-- 
Steven D'Aprano
http://import-that.dreamwidth.org/


Re: Proper deletion of selected items during map iteration in for loop: Thanks to all

2014-04-26 Thread Chris Angelico
On Sun, Apr 27, 2014 at 12:14 PM, Steven D'Aprano
 wrote:
> I think the two obviously good enough approaches are:
>
> - save a "to be deleted" list, then delete those keys;
>
> - copy the "not to be deleted" items into a new dict

For a small enough dict that the performance question doesn't matter,
I'd go with the other option: iterate over a snapshot of the keys.
Compare:

# Naive approach:
for k in d:
    if f(k): del d[k]

# Snapshot of keys:
for k in list(d):
    if f(k): del d[k]

No extra loop at the end, no switching out and in of contents, just
one little change in the loop header. Obviously you don't want to do
this when you're deleting two out of three billion, but for smallish
dicts, that won't make a visible change.
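
To make the comparison concrete, here is a runnable sketch of both loops on a toy dict; f() is a hypothetical predicate that selects the odd keys. The naive loop fails as soon as the iterator notices the size change, while the snapshot loop runs to completion.

```python
def f(k):
    # Hypothetical predicate: delete the odd keys.
    return k % 2 == 1

# Naive approach: deleting while iterating over the dict itself.
d = dict.fromkeys(range(6))
try:
    for k in d:
        if f(k):
            del d[k]
    naive_error = None
except RuntimeError as exc:
    naive_error = str(exc)

# Snapshot of keys: list(d) is built once, before any deletion happens.
d = dict.fromkeys(range(6))
for k in list(d):
    if f(k):
        del d[k]
remaining = sorted(d)

print(naive_error)
print(remaining)
```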

ChrisA


Re: how to split this kind of text into sections

2014-04-26 Thread Steven D'Aprano
On Sat, 26 Apr 2014 23:53:14 +0800, oyster wrote:

> Every SECTION starts with 2 special lines; these 2 lines is special
> because they have some same characters (the length is not const for
> different section) at the beginning; these same characters is called the
> KEY for this section. For every 2 neighbor sections, they have different
> KEYs.
> 
> After these 2 special lines, some paragraph is followed. Paragraph does
> not have any KEYs.
> 
> So, a section = 2 special lines with KEYs at the beginning + some
> paragraph without KEYs
> 
> However there maybe some paragraph before the first section, which I do
> not need and want to drop it
> 
> I need a method to split the whole text into SECTIONs and to know all
> the KEYs

Let me try to describe how I would solve this, in English.

I would look at each pair of lines (1st + 2nd, 2nd + 3rd, 3rd + 4th, 
etc.) looking for a pair of lines with matching prefixes. E.g.:

"This line matches the next"
"This line matches the previous"

do match, because they both start with "This line matches the ".

Question: how many characters in common counts as a match?

"This line matches the next"
"That previous line matches this line"

have a common prefix of "Th", two characters. Is that a match?

So let me start with a function to extract the matching prefix, if there 
is one. It returns '' if there is no match, and the prefix (the KEY) if 
there is one:

def extract_key(line1, line2):
    """Return the key from two matching lines, or '' if not matching."""
    # Assume they need five characters in common.
    if line1[:5] == line2[:5]:
        return line1[:5]
    return ''


I'm pretty much guessing that this is how you decide there's a match. I 
don't know if five characters is too many or too few, or if you need a 
more complicated test. It seems that you want to match as many characters 
as possible. I'll leave you to adjust this function to work exactly as 
needed.

Now we iterate over the text in pairs of lines. We need somewhere to hold 
the lines in each section, so I'm going to use a dict of lists of 
lines. As a bonus, I'm going to collect the ignored lines using a key of 
None. However, I do assume that all keys are unique. It should be easy 
enough to adjust the following to handle non-unique keys. (Use a list of 
lists, rather than a dict, and save the keys in a separate list.)

Lastly, the way it handles lines at the beginning of a section is not 
exactly the way you want it. This puts the *first* line of the section as 
the *last* line of the previous section. I will leave you to sort out 
that problem.


from collections import OrderedDict
section = []
sections = OrderedDict()
sections[None] = section
lines = iter(text.split('\n'))
prev_line = ''
for line in lines:
    key = extract_key(prev_line, line)
    if key == '':
        # No match, so we're still in the same section as before.
        section.append(line)
    else:
        # Match, so we start a new section.
        section = [line]
        sections[key] = section
    prev_line = line
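
For what it's worth, here is one possible way to sort out the first-line problem, as a sketch only: look ahead at the *next* line rather than back at the previous one, so both header lines open the new section together. The names common_key, split_sections and MIN_LEN are made up for this example, the five-character threshold is the same guess as above, and a list of (key, lines) pairs is used so that repeated keys are not a problem.

```python
import os.path

MIN_LEN = 5  # assumed threshold, as in extract_key above

def common_key(line1, line2, min_len=MIN_LEN):
    """Return the longest shared prefix, or '' if shorter than min_len."""
    prefix = os.path.commonprefix([line1, line2])
    return prefix if len(prefix) >= min_len else ''

def split_sections(text, min_len=MIN_LEN):
    """Return a list of (key, lines) pairs; the preamble has key None."""
    sections = [(None, [])]
    lines = text.split('\n')
    i = 0
    while i < len(lines):
        key = ''
        if i + 1 < len(lines):
            key = common_key(lines[i], lines[i + 1], min_len)
        if key:
            # Both header lines open a new section together.
            sections.append((key, [lines[i], lines[i + 1]]))
            i += 2
        else:
            # Ordinary line: it belongs to the current section.
            sections[-1][1].append(lines[i])
            i += 1
    return sections

demo = "\n".join([
    "a line we do not need",
    "I am section axax",
    "I am section bbb",
    "(and here goes many other text)...",
    "",
    "I am using python",
    "I am using perl",
    "more text",
])
result = split_sections(demo)
keys = [key for key, _ in result]
print(keys)
```

Note that matching as many characters as possible gives 'I am using p' for the python/perl pair, because both words start with "p"; if the key should stop at a word boundary, common_key would need to trim the prefix. Two ordinary paragraph lines that happen to share a long enough prefix would also be taken for a section header.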



-- 
Steven D'Aprano
http://import-that.dreamwidth.org/


Re: Proper deletion of selected items during map iteration in for loop: Thanks to all

2014-04-26 Thread Roy Smith
In article <[email protected]>,
 Steven D'Aprano  wrote:

> I think the two obviously good enough approaches are:
> 
> - save a "to be deleted" list, then delete those keys;
> 
> - copy the "not to be deleted" items into a new dict

There is a third possibility:

Iterate over the dict, storing keys to be deleted in a list, but break 
out after the list reaches N items.  Delete those keys from the dict.  
Repeat the whole process until you reach the end of the dictionary 
before you reach N saved keys.

It's going to take multiple (perhaps many) passes over the dict, but it 
will limit the amount of extra memory used.  In the extreme case, if 
N=1, with k keys in the dict, it will turn a process which is O(k) in 
time and O(k) in extra memory into one which is O(k^2) in time and O(1) 
in extra memory.  Is that a good tradeoff?  Only your hairdresser knows 
for sure.
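
A sketch of that third possibility, under the assumption that the deletion test depends only on the key; delete_in_batches, should_delete and batch_size are names invented for this example (batch_size plays the role of N and must be at least 1).

```python
def delete_in_batches(d, should_delete, batch_size):
    """Delete matching keys from d, holding at most batch_size keys at once."""
    while True:
        batch = []
        for k in d:
            if should_delete(k):
                batch.append(k)
                if len(batch) >= batch_size:
                    break  # stop scanning; delete what we have so far
        # The iteration above has ended, so deleting is safe now.
        for k in batch:
            del d[k]
        if len(batch) < batch_size:
            # We reached the end of the dict before filling the batch.
            return

data = dict(enumerate('abcdefghij'))
delete_in_batches(data, lambda k: k % 2 == 1, batch_size=3)
print(sorted(data))
```

Each pass restarts from the beginning of the dict, which is where the extra time goes when batch_size is small.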