On Mon, Jul 2, 2012 at 10:03 AM, Flynn, Stephen (L & P - IT)
<steve.fl...@capita.co.uk> wrote:
> Tutors,
>
> Whilst having a play around with reading in text files and reformatting them, I
> tried to write a Python 3.2 script to read a CSV file, looking for any
> records which were short (indicating that the data may well contain an
> embedded CR/LF). I've attached a small sample file with a "split record" at
> line 3, and my code.
>
> Call the code with
>
> python pipesmoker.py MyFile.txt ,
>
> (first parameter is the file being read, second parameter is the field 
> separator... a comma in this case)
>
> I can read the file in, I can determine that I'm looking for records which
> have 13 delimiters (14 fields) and I can find a record which is too short
> (line 3).
>
> What I can't do is read the line that follows a short line in order to
> append it to the end of the short line before writing the entire amended
> line out. I'm still thinking about how to persuade the fileinput module to
> skip the following line so it doesn't get processed again.
>
> When I run the code as it stands, I get a traceback as I'm obviously not 
> using fileinput.FileInput.readline() correctly.
>
> value of file is C:\myfile.txt
> value of the delimiter is ,
> I'm looking for  13 , in each currentLine...
> "1","0000000688      ","ABCD","930020854","34","0","1"," ","930020854 ","  
>         ","0","0","0","0"
>
> "2","0000000688      ","ABCD","930020854","99","0","1"," ","930020854 ","     
>      ","0","0","0","0"
>
> short line found at line 3
> Traceback (most recent call last):
>   File "C:\Documents and Settings\flynns\workspace\PipeSmoker\src\pipesmoker\pipesmoker.py", line 35, in <module>
>     nextLine = fileinput.FileInput.readline(args.file)
>   File "C:\Python32\lib\fileinput.py", line 301, in readline
>     line = self._buffer[self._bufindex]
> AttributeError: 'str' object has no attribute '_buffer'
>
>
> Can someone explain to me how I am supposed to make use of readline() to grab
> the next line of a text file, please? It may be that I should be using some
> other module, but I chose fileinput as I was hoping to make the little routine
> as generic as possible: able to spot short lines in tab separated, comma
> separated, pipe separated, ^~~^ separated and anything else which my clients
> feel like sending me.
>
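On the traceback itself: readline() is an instance method, so calling
fileinput.FileInput.readline(args.file) hands the filename string in as
"self", which is why the error complains that a 'str' object has no
attribute '_buffer'. A minimal sketch of one way around it follows, assuming
it is acceptable to count raw delimiters per line without regard to quoting.
The function name join_short_lines and its parameters are hypothetical, not
taken from the attached script.

```python
import fileinput


def join_short_lines(path, delimiter, expected_delims):
    """Return the records in path, gluing each short line to what follows it.

    Hypothetical helper: it counts raw delimiters and ignores quoting.
    """
    records = []
    # fileinput.input() returns a FileInput *instance*; readline() is then
    # called on that instance rather than on the FileInput class.
    with fileinput.input(files=[path]) as reader:
        for line in reader:
            record = line.rstrip("\r\n")
            # Keep pulling successor lines until the record is long enough.
            # Lines consumed here are not handed to the for loop again.
            while record.count(delimiter) < expected_delims:
                successor = reader.readline()
                if not successor:  # end of input reached
                    break
                record += successor.rstrip("\r\n")
            records.append(record)
    return records
```

For the sample in this thread the call would be something like
join_short_lines(args.file, ",", 13). Note that the embedded line break is
dropped when the two halves are joined.
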
Take a look at csv.reader:
http://docs.python.org/library/csv.html#csv.reader.  It comes with
Python and, according to the documentation near that link, it will handle
the situation where EOL characters are contained in quoted fields.  Will
that help you?
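
As a rough sketch of what that could look like, assuming the stray CR/LF
always sits inside a quoted field (as it appears to in your sample): the
function name find_multiline_records and the sample data below are made up
for illustration. It uses the reader's line_num attribute to spot records
that span more than one physical line.

```python
import csv
import io


def find_multiline_records(stream, delimiter=","):
    """Return (starting_line, row) for each record spanning several lines."""
    reader = csv.reader(stream, delimiter=delimiter)
    previous_line = 0
    found = []
    for row in reader:
        # line_num counts physical lines read so far, so a jump of more
        # than one means this record contained an embedded line break.
        if reader.line_num - previous_line > 1:
            found.append((previous_line + 1, row))
        previous_line = reader.line_num
    return found


# Made-up sample: the third record is split across lines 3 and 4.
sample = '"1","ABCD","0"\r\n"2","ABCD","0"\r\n"3","AB\r\nCD","0"\r\n'
split_records = find_multiline_records(io.StringIO(sample, newline=""))
```

With a real file you would open it with newline="" (as the csv docs
recommend) and pass the file object in place of the StringIO. The second
command-line parameter can be passed straight through as delimiter, though
csv.reader only accepts a single-character delimiter, so something like
^~~^ would need different handling.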


-- 
Joel Goldstick
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
