[Tutor] List weirdness

2009-02-13 Thread Moos Heintzen

Hi,
I was wondering why this happens. I was trying to create a list of lists.

>>> d = [[]]
>>> d[0][0]=1
Traceback (most recent call last):
 File "", line 1, in ?
IndexError: list assignment index out of range
>>> d
[[]]

What's wrong with that?

However:
>>> d[0].append(1)
>>> d
[[1]]

I guess I can't reference [0] on an empty list. (I come from a C 
background.)
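
A minimal sketch of what is going on, written in Python 3 syntax (the session above is Python 2, but lists behave the same way here). Index assignment only replaces an element that already exists, so an empty list has to be grown with append() or built at the right size first. The names grid, rows and cols are made up for illustration.

```python
# Index assignment replaces an existing element; it never grows the list.
d = [[]]
try:
    d[0][0] = 1          # d[0] is empty, so there is no slot 0 to replace
except IndexError as exc:
    error = str(exc)

d[0].append(1)           # append() is what grows a list
# d is now [[1]]

# To get C-style pre-sized rows, build each row explicitly.
rows, cols = 2, 3
grid = [[0] * cols for _ in range(rows)]
grid[1][2] = 7           # fine: the slot already exists
```

Building the rows with a comprehension matters: [[0] * cols] * rows would repeat one row object, so a write to one row would show up in all of them.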

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] ConfigParser re-read fails

2009-02-15 Thread Moos Heintzen
It looks way too simplified. I have no idea where the problem is.
Would you mind showing the script?

gist.github.com is good for posting code.


[Tutor] Default list arguments in __init__

2009-02-21 Thread Moos Heintzen

Hi,

This behavior was totally unexpected. I only caught it because it was 
the only thing I changed.


>>> class foo:
...     def __init__(self, lst=[]):
...         self.items = lst
...
>>> f1 = foo()
>>> f1.items
[]
>>> f1.items.append(1)
>>> f2 = foo()
>>> f2.items
[1]

Huh? Is lst a reference to the *same list* in every instance?

I guess I have to do it like this. It seems to work. (i.e. every foo 
instance with default lst now has a unique new list.)


def __init__(self, lst=None):
    self.items = lst or []


This is on python 2.4.4c1
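
A sketch of the usual idiom, in Python 3 syntax and with a hypothetical class name Foo. A default value is evaluated once, when the def statement runs, which is why every instance saw the same list. Testing for None explicitly, rather than using "lst or []", is an assumption about what is wanted: it keeps a list the caller passed in even when that list is empty.

```python
class Foo:
    def __init__(self, lst=None):
        # A default value is evaluated once, when the def statement runs,
        # so a fresh list has to be created inside the body instead.
        if lst is None:
            lst = []
        self.items = lst


f1 = Foo()
f1.items.append(1)
f2 = Foo()               # gets its own empty list

shared = []
f3 = Foo(shared)         # an explicitly passed list is kept, even if empty
f3.items.append(2)
```

With "lst or []" the last case would behave differently: an empty list passed by the caller is falsy, so the instance would quietly get a different list object.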


Re: [Tutor] extracting a column from many files

2009-02-22 Thread Moos Heintzen
On Thu, Feb 19, 2009 at 2:41 AM, Bala subramanian wrote:
> Dear friends,
>
> I want to extract certain 6 different columns from a many files and write it
> to 6 separate output files. I took some help from the following link
>
> http://mail.python.org/pipermail/tutor/2004-November/033475.html
>
> to write one column from many input files to a particular output file.

Let me see if I understand what you want to do. You have file1.txt,
file2.txt, file3.txt ...
and you want to read n columns from those files?
It gets confusing. How many columns do you want to read from each
file? How many columns does each output file have?

Also, it would be very helpful if you give us the format of the input
and output files.


Re: [Tutor] extracting a column from many files

2009-02-22 Thread Moos Heintzen
For example, if you have input files:

file1:
1a1 1b1 1c1 1d1 1e1 1f1
2a1 2b1 2c1 2d1 2e1 2f1
3a1 3b1 3c1 3d1 3e1 3f1

file2:
1a2 1b2 1c2 1d2 1e2 1f2
2a2 2b2 2c2 2d2 2e2 2f2
3a2 3b2 3c2 3d2 3e2 3f2

What do you want the output files to look like?


Re: [Tutor] extracting a column from many files

2009-02-23 Thread Moos Heintzen
Here's some simple repositioning code, given that you already have the
fields extracted.
All files have to have equal dimensions.

file1 = [["1a1", "1b1", "1c1",], ["2a1", "2b1", "2c1"],]
file2 = [["1a2", "1b2", "1c2",], ["2a2", "2b2", "2c2"],]
files = [file1, file2]
out_lines = []

for column in range(len(files[0][0])):
    for fileno in range(len(files)):
        out_lines.append([])
        for row in range(len(files[0])):
            out_lines[-1].append(files[fileno][row][column])
    # write out_lines to file "file%s" % column
    print out_lines
    out_lines = []

No offense, but your extracting code looks a bit inflexible. It has a
lot of magic numbers, and most of it is hardcoded.
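
The same regrouping can also be sketched with zip(), which transposes rows into columns. This is only an alternative under the same assumption of equal dimensions, written for Python 3; the names columns_per_file and outputs are made up for illustration.

```python
file1 = [["1a1", "1b1", "1c1"], ["2a1", "2b1", "2c1"]]
file2 = [["1a2", "1b2", "1c2"], ["2a2", "2b2", "2c2"]]
files = [file1, file2]

# zip(*rows) transposes one file: rows of fields become columns of fields.
columns_per_file = [list(zip(*rows)) for rows in files]

# Regroup so that outputs[c] holds column c of every input file.
outputs = [[list(cols) for cols in same_column]
           for same_column in zip(*columns_per_file)]
```

Here outputs[0] would be what gets written to the first output file, outputs[1] to the second, and so on.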


Re: [Tutor] how to instantiate a class

2009-02-25 Thread Moos Heintzen
Here is a good book if you are already familiar with other languages.

http://diveintopython.org/object_oriented_framework/instantiating_classes.html


Re: [Tutor] Binary to Decimal conversion

2009-03-09 Thread Moos Heintzen
You're making it more complicated than it needs to be.
Also, you first used binnum then binum, and you didn't define binsum.

It could easily be done like this:

binnum = raw_input("Please enter a binary number:  ")
decnum = 0
rank = 1

for i in reversed(binnum):
    decnum += rank * int(i)
    rank *= 2
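
For comparison, a short Python 3 sketch that runs the same loop on a fixed string and checks it against the built-in int(), which accepts a base as its second argument. The sample value "1011" stands in for the user's input and is only an example.

```python
binnum = "1011"          # stands in for the value read from the user

# Manual conversion, same idea as the loop above.
decnum = 0
rank = 1
for digit in reversed(binnum):
    decnum += rank * int(digit)
    rank *= 2

# Built-in conversion: int() accepts a base as its second argument.
builtin = int(binnum, 2)
```

Unlike the loop, int(binnum, 2) raises ValueError for digits other than 0 and 1, which may or may not be what the exercise wants.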

Moos


Re: [Tutor] memory error

2009-03-09 Thread Moos Heintzen
On Fri, Mar 6, 2009 at 5:03 PM, Harris, Sarah L wrote:

> fname=filter(isfile, glob.glob('*.zip'))
> for fname in fname:
>     zipnames=filter(isfile, glob.glob('*.zip'))
>     for zipname in zipnames:
>         ...

It looks like you're using an unnecessary extra loop.
Aren't the contents of fname similar to zipnames?

I tried it with one loop (for zipname in zipnames:) and it worked.

P.S. You're at JPL? That's awesome! I was looking at the internships
they have a few days ago.


Re: [Tutor] memory error files over 100MB

2009-03-10 Thread Moos Heintzen
On Tue, Mar 10, 2009 at 8:45 AM, Harris, Sarah L wrote:
> That looks better, thank you.
> However I still have a memory error when I try to run it on three or more 
> files that are over 100 MB?

How big are the files in the zip file?

It seems that in this line

newFile.write(zf.read(zfilename))

the compressed file is unzipped to memory first, then written to the new file.

You can read and write in smaller chunks using the file objects
returned by zf.open(), whose read() method takes a size parameter.
(Maybe it wouldn't work since the file is going to get extracted to
memory anyway.)

However, the open() method is new in Python 2.6, and Python 2.6 also
adds the extractall() method

http://docs.python.org/library/zipfile.html#zipfile.ZipFile.extractall

which does what you're doing. I'm not sure if it will still cause a
memory error.

Also, are files in your zip files not in directories, since you're not
creating directories?

Since this is a learning experience, it might also help to create
functions to minimize clutter, or to spend some time familiarizing
yourself with the language.
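
A rough sketch of the chunked approach, assuming Python 3 (or at least a version where ZipFile.open() exists). The function name extract_in_chunks and the 64 KB chunk size are made up for illustration, and member names are not sanitized here, so this is only suitable for archives you trust.

```python
import os
import shutil
import zipfile


def extract_in_chunks(zip_path, dest_dir, chunk_size=64 * 1024):
    """Extract every member of zip_path into dest_dir, chunk_size bytes at a time."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            target = os.path.join(dest_dir, info.filename)
            if info.filename.endswith("/"):
                os.makedirs(target, exist_ok=True)   # directory entry
                continue
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            # zf.open() returns a file-like object, so the data can be
            # copied in pieces instead of one big zf.read() call.
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst, chunk_size)
```

It would be called as extract_in_chunks("archive.zip", "output_dir"). Creating the parent directories first also covers the case where the files inside the zip live in subdirectories.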

Moos


Re: [Tutor] Issues Parsing XML

2009-03-12 Thread Moos Heintzen
So you want one line for each <FINDING> element? Easy:

# Get <FINDING> elements
findings = domDatasource.getElementsByTagName('FINDING')

# Get the text of all direct child nodes in each element
# That's assuming every <FINDING> child has a TEXT_NODE node.
lines = []
for finding in findings:
    lines.append([f.firstChild.data for f in finding.childNodes])

# print
for line in lines:
    print ", ".join(line)

Not sure how you want to deal with newlines. You can escape them to \n
in the output, or you might find something in the CSV module. (I
haven't looked at it.)

Now this doesn't deal with missing elements. I found some have 7, and
others have 9. You might be able to insert two empty elements in lines
with length 7.

Or, if you want to have more control, you can make a dictionary with
keys of all available tag names, and for each element found in
<FINDING>, insert it in the dictionary (if it's a valid tag name).

Then you have a list of dictionaries, and you can print the elements
in any order you want. Missing elements will have null strings as
values.
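
A minimal sketch of the dictionary idea, in Python 3. The sample XML, the child tag names ID, STATUS and TITLE, and the helper name finding_to_dict are all made up for illustration; only the FINDING tag comes from the code above. It also assumes that the first child of each element, if there is one, is a text node. The csv module is used for output because it quotes fields that contain newlines.

```python
import csv
import io
from xml.dom import minidom

# Hypothetical sample data and tag names, for illustration only.
SAMPLE = """<FINDINGS>
<FINDING><ID>V1</ID><STATUS>NF</STATUS><TITLE>First
title</TITLE></FINDING>
<FINDING><ID>V2</ID><TITLE>Second</TITLE></FINDING>
</FINDINGS>"""
FIELDS = ["ID", "STATUS", "TITLE"]


def finding_to_dict(finding):
    """Map each known child tag to its text; missing tags stay ''."""
    record = dict.fromkeys(FIELDS, "")
    for child in finding.childNodes:
        if child.nodeType != child.ELEMENT_NODE or child.tagName not in record:
            continue                      # skip whitespace and unknown tags
        if child.firstChild is not None:  # empty elements have no text node
            record[child.tagName] = child.firstChild.data
    return record


dom = minidom.parseString(SAMPLE)
records = [finding_to_dict(f) for f in dom.getElementsByTagName("FINDING")]

# csv.DictWriter quotes fields that contain newlines or commas.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writerows(records)
```

Changing the order of FIELDS changes the column order of the output without touching the parsing code.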

Moos


Re: [Tutor] Issues Parsing XML

2009-03-12 Thread Moos Heintzen
I'm a little bored, so I wrote a function that gets <FINDING> elements
and puts them in a dictionary. Missing elements are just an empty
string.

http://gist.github.com/78385

Usage:
>>> d = process_finding(findings[0])
>>> ", ".join(map(lambda e: d[e], elements))
u'V0006310, NF, , , GD, 2.0.8.8, TRUE, DTBI135-Scripting\nof Java
applets -\nRestricted, 2'

Now for a <FINDING> of 9 elements:
>>> d = process_finding(findings[1])
>>> ", ".join(map(lambda e: d[e], elements))
u'V0006311, O, The
value:\nSoftware\\Policies\\Microsoft\\Windows\\CurrentVersion\\Internet\nSettings\\Zones\\4\\1A00
does not exist.\n\n, The
value:\nSoftware\\Policies\\Microsoft\\Windows\\CurrentVersion\\Internet\nSettings\\Zones\\4\\1A00
does not exist.\n\n, GD, 2.0.8.8, TRUE, DTBI136-User\nAuthentication -
Logon -\nRestricted, 2'

The map() function just looks up each name from the elements list in
the dictionary. You can reorder them any way you want.

You're welcome :)

Moos


Re: [Tutor] problem in replacing regex

2009-04-08 Thread Moos Heintzen
Hi,

You can do the substitution in many ways.

You can first search for bare account numbers and substitute them with
urls. Then substitute urls into <a> tags.

To substitute account numbers that aren't in urls, you simply
substitute account numbers if they aren't preceded by a "/", as you
have been trying to do.

re.sub() can accept a function instead of a string. The function
receives the match object and returns a replacement. This way you can
do extra processing to matches.

import re

text = """https://hello.com/accid/12345-12

12345-12

http://sadfsdf.com/asdf/asdf/asdf/12345-12

start12345-12end

this won't be replaced
start/123-45end
"""

def sub_num(m):
    if m.group(1) == '/':
        return m.group(0)
    else:
        # put url here
        return m.group(1) + 'http://example.com/' + m.group(2)

>>> print re.sub(r'(\D)(\d+-\d+)', sub_num , text)
https://hello.com/accid/12345-12

http://example.com/12345-12

http://sadfsdf.com/asdf/asdf/asdf/12345-12

starthttp://example.com/12345-12end

this won't be replaced
start/123-45end

>>> _

This is assuming there aren't any <a> tags in the input, so you should
do this before substituting urls into <a> tags.
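
As an alternative sketch (Python 3, not the approach above), the "not preceded by a slash" condition can be put into the pattern itself with a negative lookbehind, so a plain replacement string is enough. The sample text is shortened from the one above, and http://example.com/ is still a placeholder url.

```python
import re

text = """https://hello.com/accid/12345-12

12345-12

start12345-12end

start/123-45end
"""

# (?<![/\d]) rejects a match that is preceded by "/" or by another digit,
# so numbers inside URLs are skipped and a match cannot start mid-number.
pattern = re.compile(r'(?<![/\d])(\d+-\d+)')
result = pattern.sub(r'http://example.com/\1', text)
```

Because a lookbehind does not consume a character, this version would also catch an account number at the very start of the text, which the (\D) group cannot match.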


I have super cow powers!

Moos


[Tutor] Optional groups in RE's

2009-04-11 Thread Moos Heintzen
Hello Tutors!

I was trying to make some groups optional in a regular expression, but
I couldn't do it.

For example, I have the string:

>>> data = "42 sdlfks d f60 sdf sdf  Title"

and the pattern:
>>> pattern = "(.*?).*?(.*?).*?(.*?)"

This works when all the groups are present.

>>> re.search(pattern, data).groups()
('42', '60', 'Title')

However, I don't know how to make an re to deal with possibly missing groups.
For example, with:
>>> data = "42 sdlfks d f60 sdf sdf"

I tried
>>> pattern = "(.*?).*?(.*?).*?(?:(.*?))?"
>>> re.search(pattern, data).groups()
('42', '60', None)

but it doesn't work when  _is_ present.

>>> data = "42 sdlfks d f60 sdf sdf  Title"
>>> re.search(pattern, data).groups()
('42', '60', None)

I tried something like (?:pattern)+ and (?:pattern)* but I couldn't
get what I wanted.
(.*?)? doesn't seem to be a valid re either.

I know (?:pattern) is a non-capturing group.
I just read that | has very low precedence, so I used parentheses
liberally to "or" pattern and a null string.

>>> pattern = "(.*?).*?(.*?).*?(?:(?:(.*?))|)"
>>> re.search(pattern, data).groups()
('42', '60', None)

(?:(?:pattern)|(?:.*)) didn't work either.

I want to be able to have some groups as optional, so when that group
isn't matched, it returns None. When it's matched, it should return
what is matched.

Is that possible with one re?

I could probably do it with more than one re (and did it) but with one
re the solution is much more elegant.
(i.e. I could have named groups, then pass the resultant dictionary to
a processing function)

I also tried matching optional groups before, and I'm curious about the solution.
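
One way this can be made to work with a single re, sketched in Python 3 under assumptions: the tag names a, b and title are placeholders, and so are the sample strings. The idea is to move the lazy filler ".*?" inside the optional group, so the group either reaches the title or is skipped as a whole. Named groups are used so the result can be passed on as a dictionary.

```python
import re

# Hypothetical markup; the tag names are placeholders.
with_title = "<a>42</a> sdlfks d f<b>60</b> sdf sdf\n<title>Title</title>"
without_title = "<a>42</a> sdlfks d f<b>60</b> sdf sdf"

# The filler ".*?" sits inside the optional group. The "?" quantifier is
# greedy, so the engine first tries to reach <title> and only gives up
# (leaving the group as None) if no <title> follows.
pattern = re.compile(
    r"<a>(?P<a>.*?)</a>.*?<b>(?P<b>.*?)</b>(?:.*?<title>(?P<title>.*?)</title>)?",
    re.DOTALL,
)

found = pattern.search(with_title).groupdict()
missing = pattern.search(without_title).groupdict()
```

re.DOTALL lets "." match newlines, so the sketch also covers multi-line input. When the filler stays outside the optional group, it can match nothing at all, and the optional group then fails right where it stands and is left as None.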

Moos


Re: [Tutor] Optional groups in RE's

2009-04-11 Thread Moos Heintzen
Mark Tolonen wrote:
> Your data looks like XML.  If it is actually well-formed XML, have you tried
> ElementTree?

It is XML. I used minidom from xml.dom, and it worked fine, except it
was ~16 times slower. I'm parsing a ~70 MB file, and the difference is
3 minutes versus 10 seconds with re's.

I used separate re's for each field I wanted, and it worked nicely.
(1-1 between DOM calls and re.search and re.finditer)

This problem arose when I tried to do the match in one re.

I guess instead of minidom I could try lxml, which uses libxml2, which
is written in C.
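
For reference, a small sketch of the ElementTree route using iterparse(), which handles one element at a time instead of building the whole tree first. It is written for Python 3; the document structure and the tag names root, item, a and b are assumptions, and no timing is claimed for it.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document; the tag names are placeholders.
xml_bytes = b"<root><item><a>42</a><b>60</b></item><item><a>7</a></item></root>"

rows = []
# iterparse() yields each element as soon as its end tag has been read.
for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
    if elem.tag == "item":
        # findtext() returns None when the child is missing.
        rows.append((elem.findtext("a"), elem.findtext("b")))
        elem.clear()     # drop the children that are no longer needed
```

A missing child simply comes back as None here, which is the behavior that was wanted from the optional groups.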

Kent Johnson wrote:
> This re doesn't have to match anything after  so it doesn't.
> You can force it to match to the end by adding $ at the end but that
> is not enough, you have to make the ".*?" *not* match .
> One way to do that is to use [^<]*? instead of .*?:

Ah. Thanks.
Unfortunately, the input string is multi-line, and doesn't end in 


Moos

P.S.

I'm still relatively new to RE's, or IRE's. sed, awk, grep, and perl
have different formats for re's. grep alone has four different versions
of RE's!

Since the only form of re I'm using is "start(.*?)end" I was thinking
about writing a C program to do that.


[Tutor] Splitting strings and undefined variables

2009-02-09 Thread Moos Heintzen
Hello all,

I was looking at this:
http://www.debian.org/doc/manuals/reference/ch-program.en.html#s-python

I have a question about the line of code that uses split()

With the python version, the line below only works if there are three
fields in line.

(first, last, passwd) = line.split()

Also, since the variables are used like this:

lineout = "%s:%s:%d:%d:%s %s,,/home/%s:/bin/bash\n" %  \
 (user, passwd, uid, gid, first, last, user)

I can't use ":".join(line.split())
But maybe a dictionary could be used for string substitution.

In the perl version (above the python version in the link), the script
works with the input line having one to three fields. Like "fname
lname pw" or "fname lname"

($n1, $n2, $n3) = split / /;

Is there a better way to extract the fields from line in a more
flexible way, so that the number of fields could vary?
I guess we could use conditionals to check each field, but is there a
more elegant (or pythonic!) way to do it?
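
One possible approach, sketched in Python 3: pad the split result to a fixed length before unpacking, so one to three fields all work. The helper name split_fields and the empty-string default are assumptions made for illustration. The second half shows the dictionary idea mentioned above, using named %(key)s placeholders.

```python
def split_fields(line, count=3, default=""):
    """Split line on whitespace and pad or trim the result to count fields."""
    fields = line.split()
    return (fields + [default] * count)[:count]


first, last, passwd = split_fields("fname lname")

# A dictionary lets the format string name the fields it needs.
record = dict(zip(("first", "last", "passwd"), split_fields("fname lname pw")))
full_name = "%(first)s %(last)s" % record
```

Extra fields are silently dropped by the slice; whether that or an error is better depends on how strict the input format should be.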

Moos

P.S. I'm not a Perl user, I was just reading the examples. I've been
using C and awk for a few years, and Python for a few months. Also, I
know blank passwords aren't very practical, but I'm just asking this to
explore possibilities :)