Data Manipulation - Rows to Columns

2008-02-05 Thread Tess
Hello All,

I have a text file with marked up data that I need to convert into a
text tab separated file.

The structure of the input file is listed below (see file 1) and the
desired output file is below as well (see file 2).

I am a complete novice with python and would appreciate any tips you
may be able to provide.

Best,

Tess



file 1:
TABLE
black
blue
red
CHAIR
yellow
black
red
TABLE
white
gray
pink


file 2 (tab separated):
TABLE   black   blue    red
CHAIR   yellow  black   red
TABLE   white   gray    pink
-- 
http://mail.python.org/mailman/listinfo/python-list
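
One possible approach to the question above, offered as a minimal sketch rather than a definitive answer. It assumes that a line written entirely in uppercase (TABLE, CHAIR) starts a new row and that every other non-blank line belongs to the most recent row. That rule is inferred from the sample in file 1, not stated in the post. The function names rows_from_lines and to_tsv are made up for illustration.

```python
def rows_from_lines(lines):
    """Group lines into rows; an all-uppercase line starts a new row."""
    rows = []
    for raw in lines:
        item = raw.strip()
        if not item:
            continue  # skip blank lines
        if item.isupper() or not rows:
            rows.append([item])    # header line: begin a new row
        else:
            rows[-1].append(item)  # value line: extend the current row
    return rows


def to_tsv(lines):
    """Join each row with tabs, one row per output line."""
    return "\n".join("\t".join(row) for row in rows_from_lines(lines))


# Sample data copied from file 1 in the post.
sample = """TABLE
black
blue
red
CHAIR
yellow
black
red
TABLE
white
gray
pink
"""

print(to_tsv(sample.splitlines()))
```

To work on real files, the same function can be fed an open file object, since iterating over a file yields its lines. If the real data has mixed-case headers, the isupper() test would need to be replaced by whatever actually distinguishes a header line.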


Beautiful Soup Looping Extraction Question

2008-03-24 Thread Tess
Hello All,

I have a Beautiful Soup question and I'd appreciate any guidance the
forum can provide.

Let's say I have a file that looks like file.html pasted below.

My goal is to extract all elements where the following is true:  and .

The lines should be ordered in the same order as they appear in the
file - therefore the output file would look like output.txt below.

I experimented with something similar to this code:
for i in soup.findAll('p', align="left"):
    print i
for i in soup.findAll('p', align="center"):
    print i

I get something like this:
P4
P3
P1
div4b
div3b
div2b
div2a

Any guidance would be greatly appreciated.

Best,

Ira











##begin: file.html



P1

P2

div2a
div2b

P3
div3a
div3b
div3c

P4
div4a
div4b





##end: file.html


===begin: output.txt===
P1
div2a
div2b
P3
div3b
P4
div4b
===end: output.txt===
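
Running two separate loops yields all matches of one kind followed by all matches of the other, so document order is lost. A single pass over the document that tests each element against every accepted combination keeps the original order. The sketch below illustrates that idea with the standard library's html.parser instead of Beautiful Soup, so it runs without extra packages. The selection criteria in WANTED and the markup in sample_html are assumptions: the tags in the post did not survive archiving, so both were chosen only so that the result matches output.txt. The class name AlignedTextCollector is hypothetical.

```python
from html.parser import HTMLParser

# Assumed criteria: (tag name, align value) pairs to keep.
WANTED = {("p", "left"), ("div", "center")}


class AlignedTextCollector(HTMLParser):
    """Collect the text of wanted elements in document order.

    Simplification: wanted elements nested inside one another are not handled.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.results = []
        self._open_tag = None  # name of the wanted element being read
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        key = (tag, dict(attrs).get("align"))
        if self._open_tag is None and key in self.wanted:
            self._open_tag = tag
            self._chunks = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            self.results.append("".join(self._chunks).strip())
            self._open_tag = None


# Hypothetical markup standing in for file.html.
sample_html = """
<html><body>
<p align="left">P1</p>
<p align="right">P2</p>
<div align="center">div2a</div>
<div align="center">div2b</div>
<p align="left">P3</p>
<div align="left">div3a</div>
<div align="center">div3b</div>
<div align="right">div3c</div>
<p align="left">P4</p>
<div align="left">div4a</div>
<div align="center">div4b</div>
</body></html>
"""

collector = AlignedTextCollector(WANTED)
collector.feed(sample_html)
print("\n".join(collector.results))
```

The follow-up message in this thread applies the same single-pass idea inside Beautiful Soup by making one findAll call that accepts several tag names and align values at once.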



Re: Beautiful Soup Looping Extraction Question

2008-03-24 Thread Tess
Paul - thanks for the input, it's interesting to see how pyparsing
handles it.

Anyhow, a simple regex took care of the issue in BS:


for i in soup.findAll(re.compile('^p|^div'), align=re.compile('^center|^left')):
    print i


Thanks again!

T
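
A note on the pattern above, offered as a sketch: '^p|^div' only anchors the start of the name, so it also accepts any tag whose name merely begins with "p" or "div", such as pre or param. Whether this is the problem Paul pointed out is not known, since his reply is not included here. The stricter pattern below assumes the intent was to match exactly "p" or "div"; the list of names is made up for the demonstration.

```python
import re

loose = re.compile('^p|^div')      # pattern from the message above
strict = re.compile(r'^(p|div)$')  # assumed intent: exactly "p" or "div"

# Hypothetical tag names to test both patterns against.
names = ['p', 'div', 'pre', 'param', 'span']

loose_hits = [n for n in names if loose.search(n)]
strict_hits = [n for n in names if strict.search(n)]

print(loose_hits)   # ['p', 'div', 'pre', 'param']
print(strict_hits)  # ['p', 'div']
```

The same reasoning applies to the align pattern: '^center|^left' accepts any value that starts with "center" or "left", and r'^(center|left)$' would restrict it to those two values.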


Re: Beautiful Soup Looping Extraction Question

2008-03-25 Thread Tess
Paul - you are very right. I am back to the drawing board.

Tess