Re: [Tutor] Optional groups in RE's

2009-04-11 Thread Moos Heintzen
Mark Tolonen wrote:
> Your data looks like XML.  If it is actually well-formed XML, have you tried
> ElementTree?

It is XML. I used minidom from xml.dom, and it worked fine, except it was ~16 times slower. I'm parsing a ~70 MB file, and the difference is 3 minutes versus 10 seconds with re's. I used

[Tutor] Optional groups in RE's

2009-04-11 Thread Moos Heintzen
Hello Tutors! I was trying to make some groups optional in a regular expression, but I couldn't do it. For example, I have the string: >>> data = "42 sdlfks d f60 sdf sdf >>> Title" and the pattern: >>> pattern = "(.*?).*?(.*?).*?(.*?)" This works when all the groups are present. >>> re.sea
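The markup in the preview above was lost, so the original string and pattern cannot be recovered. As a rough sketch of the general technique only: a group becomes optional when it is wrapped, together with the filler that follows it, in a non-capturing group marked with `?`. The tag names `price`, `ship`, and `title`, and the helper `parse_item`, are invented for illustration and are not taken from the post.

```python
import re

# Hypothetical tags: <price> and <ship> are optional, <title> is required.
# Each optional part is wrapped in (?: ... )? so the match still succeeds
# when that part is absent; its named group is then None.
PATTERN = re.compile(
    r"(?:<price>(?P<price>.*?)</price>.*?)?"
    r"(?:<ship>(?P<ship>.*?)</ship>.*?)?"
    r"<title>(?P<title>.*?)</title>"
)

def parse_item(text):
    """Return a dict of the captured fields, or None if there is no title."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None

full = "<price>42</price> sdlfks d f<ship>60</ship> sdf sdf <title>Title</title>"
no_ship = "<price>42</price> sdlfks <title>Title</title>"
```

Missing fields come back as `None`, so the caller can tell "absent" apart from "present but empty".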

Re: [Tutor] problem in replacing regex

2009-04-08 Thread Moos Heintzen
Hi, You can do the substitution in many ways. You can first search for bare account numbers and substitute them with urls. Then substitute urls into tags. To substitute account numbers that aren't in urls, you simply substitute account numbers if they don't start with a "/", as you have been t
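One way to express "only if it doesn't start with a /" is a negative lookbehind. The sketch below is an assumption-heavy illustration: the post does not show the account-number format or the URL prefix, so the six-digit pattern, `BASE_URL`, and `link_bare_accounts` are all hypothetical.

```python
import re

# Hypothetical values: the real URL prefix and account-number format
# are not shown in the post.
BASE_URL = "http://example.com/accounts/"

# (?<![/\d]) rejects numbers preceded by "/" (already part of a URL) or by
# another digit; (?!\d) rejects matches that stop inside a longer number.
ACCOUNT = re.compile(r"(?<![/\d])(\d{6})(?!\d)")

def link_bare_accounts(text):
    """Replace bare six-digit account numbers with their URL."""
    return ACCOUNT.sub(lambda m: BASE_URL + m.group(1), text)
```

Because numbers that already follow a "/" are skipped, running the function a second time on its own output changes nothing.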

Re: [Tutor] Issues Parsing XML

2009-03-12 Thread Moos Heintzen
I'm a little bored, so I wrote a function that gets elements and puts them in a dictionary. Missing elements are just an empty string.

http://gist.github.com/78385

Usage:

>>> d = process_finding(findings[0])
>>> ", ".join(map(lambda e: d[e], elements))
u'V0006310, NF, , , GD, 2.0.8.8, TRUE, DTBI
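The code in the linked gist is not reproduced here. The following is only a guess at the general shape of such a function, written for Python 3 with `xml.dom.minidom`; the sample document, the child element names `ID`, `STATUS`, and `NOTE`, and the helper `finding_to_dict` are made up for illustration.

```python
from xml.dom import minidom

# Made-up sample; only the FINDING tag name appears in the thread.
SAMPLE = """<ROOT>
  <FINDING><ID>V0006310</ID><STATUS>NF</STATUS><NOTE></NOTE></FINDING>
  <FINDING><ID>V0006311</ID></FINDING>
</ROOT>"""

def finding_to_dict(finding, names):
    """Map each element name to its text, or '' if missing or empty."""
    row = {}
    for name in names:
        nodes = finding.getElementsByTagName(name)
        first = nodes[0].firstChild if nodes else None
        if first is not None and first.nodeType == first.TEXT_NODE:
            row[name] = first.data
        else:
            row[name] = ""
    return row

dom = minidom.parseString(SAMPLE)
names = ["ID", "STATUS", "NOTE"]
rows = [finding_to_dict(f, names) for f in dom.getElementsByTagName("FINDING")]
lines = [", ".join(row[n] for n in names) for row in rows]
```

Checking `firstChild` before reading `.data` avoids the `AttributeError` that an empty element would otherwise raise.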

Re: [Tutor] Issues Parsing XML

2009-03-12 Thread Moos Heintzen
So you want one line for each element? Easy:

# Get elements
findings = domDatasource.getElementsByTagName('FINDING')

# Get the text of all direct child nodes in each element
# That's assuming every child has a TEXT_NODE node.
lines = []
for finding in findings:
    lines.append([f.firstChild.d

Re: [Tutor] memory error files over 100MB

2009-03-10 Thread Moos Heintzen
On Tue, Mar 10, 2009 at 8:45 AM, Harris, Sarah L wrote:
> That looks better, thank you.
> However I still have a memory error when I try to run it on three or more
> files that are over 100 MB?

How big are the files in the zip file? It seems that in this line

newFile.write(zf.read(zfilename))

the
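`zf.read(name)` returns the whole decompressed member as one bytes object, which is a plausible source of the memory error with large members. A possible alternative, sketched here for Python 3, is to copy the member in chunks with `ZipFile.open` and `shutil.copyfileobj`. The function name `extract_member_streaming` and the 1 MiB chunk size are assumptions, not something from the thread.

```python
import shutil
import zipfile

def extract_member_streaming(zip_path, member, dest_path, chunk_size=1024 * 1024):
    """Copy one archive member to dest_path without loading it all at once."""
    with zipfile.ZipFile(zip_path) as zf:
        # zf.open() returns a file-like object that decompresses on demand.
        with zf.open(member) as src, open(dest_path, "wb") as dst:
            shutil.copyfileobj(src, dst, chunk_size)
```

Peak memory then depends on the chunk size rather than on the size of the member.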

Re: [Tutor] memory error

2009-03-09 Thread Moos Heintzen
On Fri, Mar 6, 2009 at 5:03 PM, Harris, Sarah L wrote:
> fname=filter(isfile, glob.glob('*.zip'))
> for fname in fname:
>     zipnames=filter(isfile, glob.glob('*.zip'))
>     for zipname in zipnames:
>         ...

It looks like you're using an unnecessary extra loop. Aren't the contents of fname sim

Re: [Tutor] Binary to Decimal conversion

2009-03-09 Thread Moos Heintzen
You're making it more complicated than it needs to be. Also, you first used binnum then binum, and you didn't define binsum. It could easily be done like this:

binnum = raw_input("Please enter a binary number: ")
decnum = 0
rank = 1
for i in reversed(binnum):
    decnum += rank * int(i)
    rank *
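The preview cuts off before the loop ends. A self-contained Python 3 sketch of the same idea follows, assuming the truncated last line doubles `rank` on each step; the function name `bin_to_dec` and the input check are additions for illustration.

```python
def bin_to_dec(binnum):
    """Convert a string of 0s and 1s to an integer, least significant digit first."""
    decnum = 0
    rank = 1  # place value of the current digit: 1, 2, 4, 8, ...
    for digit in reversed(binnum):
        if digit not in ("0", "1"):
            raise ValueError("not a binary digit: %r" % digit)
        decnum += rank * int(digit)
        rank *= 2  # assumed completion of the truncated line
    return decnum
```

The built-in `int(binnum, 2)` performs the same conversion and is the usual choice outside of an exercise.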

Re: [Tutor] how to instantiate a class

2009-02-25 Thread Moos Heintzen
Here is a good book if you are already familiar with other languages. http://diveintopython.org/object_oriented_framework/instantiating_classes.html ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] extracting a column from many files

2009-02-23 Thread Moos Heintzen
Here's some simple repositioning code, given that you already have the fields extracted. All files have to have equal dimensions.

file1 = [["1a1", "1b1", "1c1",], ["2a1", "2b1", "2c1"],]
file2 = [["1a2", "1b2", "1c2",], ["2a2", "2b2", "2c2"],]
files = [file1, file2]
out_lines = []
for column in range
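The loop in the preview is cut off, so its exact output layout is unknown. The sketch below assumes the goal is one output table per column, where each row holds that column's value from every input file; `regroup_by_column` is a hypothetical name, and the layout is a guess based on the thread's description.

```python
def regroup_by_column(files):
    """Return one table per column; row r of table c is [f[r][c] for each file]."""
    n_rows = len(files[0])
    n_cols = len(files[0][0])
    return [
        [[f[r][c] for f in files] for r in range(n_rows)]
        for c in range(n_cols)
    ]

file1 = [["1a1", "1b1", "1c1"], ["2a1", "2b1", "2c1"]]
file2 = [["1a2", "1b2", "1c2"], ["2a2", "2b2", "2c2"]]
tables = regroup_by_column([file1, file2])
# Each table can then be written to its own output file, one row per line.
out_lines = [" ".join(row) for row in tables[0]]
```

As in the original post, this relies on every file having the same number of rows and columns.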

Re: [Tutor] extracting a column from many files

2009-02-22 Thread Moos Heintzen
For example, if you have input files:

file1:
1a1 1b1 1c1 1d1 1e1 1f1
2a1 2b1 2c1 2d1 2e1 2f1
3a1 3b1 3c1 3d1 3e1 3f1

file2:
1a2 1b2 1c2 1d2 1e2 1f2
2a2 2b2 2c2 2d2 2e2 2f2
3a2 3b2 3c2 3

Re: [Tutor] extracting a column from many files

2009-02-22 Thread Moos Heintzen
On Thu, Feb 19, 2009 at 2:41 AM, Bala subramanian wrote:
> Dear friends,
>
> I want to extract certain 6 different columns from a many files and write it
> to 6 separate output files. I took some help from the following link
>
> http://mail.python.org/pipermail/tutor/2004-November/033475.html
>
>

[Tutor] Default list arguments in __init__

2009-02-21 Thread Moos Heintzen
Hi, This behavior was totally unexpected. I only caught it because it was the only thing I changed.

>>> class foo:
...     def __init__(self, lst=[]):
...         self.items = lst
...
>>> f1 = foo()
>>> f1.items
[]
>>> f1.items.append(1)
>>> f2 = foo()
>>> f2.items
[1]

Huh? lst is a refer
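Default argument values are evaluated once, when the `def` statement runs, so every call that omits `lst` shares the same list object. The usual workaround, sketched here as one option, is to default to `None` and build a fresh list inside `__init__`. The class name `Foo` is just the post's `foo` capitalized.

```python
class Foo:
    def __init__(self, lst=None):
        # A new list is created on every call that does not pass one,
        # so instances no longer share state through the default.
        self.items = [] if lst is None else lst

f1 = Foo()
f1.items.append(1)
f2 = Foo()
```

Here `f2.items` stays empty after `f1.items` is modified, unlike in the session above.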

Re: [Tutor] ConfigParser re-read fails

2009-02-15 Thread Moos Heintzen
It looks way too simplified. I have no idea where the problem is. Would you mind showing the script? gist.github.com is good for posting code.

[Tutor] List weirdness

2009-02-13 Thread Moos Heintzen
Hi, I was wondering why this happens. I was trying to create a list of lists.

>>> d = [[]]
>>> d[0][0]=1
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list assignment index out of range
>>> d
[[]]

What's wrong with that? However:

>>> d[0].append(1)
>>> d
[[1]]

I guess I
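Index assignment can only replace an element that already exists; it never grows a list. The inner list in `[[]]` is empty, so there is no slot 0 to assign to. Two common ways around this are sketched below; the name `grid` and its 2 x 3 size are arbitrary choices for illustration.

```python
# Option 1: grow the inner list with append().
d = [[]]
d[0].append(1)

# Option 2: pre-allocate the slots, then assign by index.
# The comprehension builds a separate inner list for each row.
grid = [[None] * 3 for _ in range(2)]
grid[0][0] = 1
```

Writing `[[None] * 3] * 2` instead would repeat the same inner list twice, so an assignment to one row would show up in both.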

[Tutor] Splitting strings and undefined variables

2009-02-09 Thread Moos Heintzen
Hello all, I was looking at this:
http://www.debian.org/doc/manuals/reference/ch-program.en.html#s-python

I have a question about the line of code that uses split(). With the Python version, the line below only works if there are three fields in line.

(first, last, passwd) = line.split()

Also,
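Tuple unpacking raises `ValueError` when the number of fields does not match the number of names. One possible way to handle short or long lines, sketched here, is to check the field count before unpacking. The helper name `parse_fields` and the choice to return `None` for malformed lines are assumptions, not part of the linked Debian example.

```python
def parse_fields(line):
    """Return (first, last, passwd), or None if the line does not have three fields."""
    parts = line.split()
    if len(parts) != 3:
        return None
    first, last, passwd = parts
    return first, last, passwd
```

Wrapping the original one-line unpacking in `try`/`except ValueError` would work as well; the explicit length check just makes the expected field count visible.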