Python editing .txt file
I am trying to write a program in Python that will edit .txt log files that
contain regression output from R. Any thoughts or suggestions would be
greatly appreciated.
To give an idea of what I am trying to do: I include fixed effects in the R
regressions, which results in hundreds of extra lines per regression that I
am not interested in right now. Basically, I want to save a shortened version
of the .txt files in which each block of fixed effects coefficients is
replaced by a single line that says it includes fixed effects for whatever
variable it is.
All the lines that are to be deleted start with 'factor(xyz)', where xyz is
the variable name, so their first seven characters are always 'factor('. My
idea is to have Python copy each line to a new file only if its first seven
characters do not match 'factor('.
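
A minimal sketch of that copy-and-skip step, assuming the coefficient names begin at the very start of each line. The function name strip_factor_lines and the sample lines are made up for illustration and are not real regression output.

```python
def strip_factor_lines(lines):
    """Return only the lines that do not start with 'factor('."""
    return [line for line in lines if not line.startswith("factor(")]


# Made-up sample resembling R regression output (not real results).
sample = [
    "(Intercept)      1.00\n",
    "factor(state)AL  0.10\n",
    "factor(state)AK  0.20\n",
    "x                0.50\n",
]

kept = strip_factor_lines(sample)
print("".join(kept), end="")
```

Using startswith() avoids having to count characters by hand, which is easy to get wrong.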
That part I at least know how to approach. However, I am not sure how to
add the line that says, "includes fixed effects for xyz." The two problems I
am having are:
1. In the resulting file, I will be skipping blocks of anywhere from 10 to
500 or so lines and inserting one line per block, so whether the line is
inserted needs to depend on whether the current line is the first line of a
block or one of the remaining 499.
2. The xyz variable name has a different length depending on which variable
it is. For example, one block might be 'state' and another block might be
'yr'. Maybe I can use the fact that the variable name starts after the first
'(' and ends at the first ')' in the line? I think I can use the re module
for this?
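
Both points can be sketched together, as one possible approach rather than the only one. The re module captures whatever sits between 'factor(' and the first ')', and remembering the name of the block currently being skipped means the summary line is written only on the first line of each block. The names FACTOR_RE and collapse_factor_blocks are hypothetical, and the sketch assumes every line of a block starts with 'factor(xyz)' at column 0.

```python
import re

# Anchored at the start of the line by re.match(); captures the text between
# 'factor(' and the first ')'.
FACTOR_RE = re.compile(r"factor\(([^)]*)\)")


def collapse_factor_blocks(lines):
    """Yield lines, replacing each run of factor(xyz) lines with one summary line."""
    current = None  # variable name of the block being skipped, if any
    for line in lines:
        match = FACTOR_RE.match(line)
        if match is None:
            # Ordinary line: copy it and note that no block is active.
            current = None
            yield line
            continue
        variable = match.group(1)
        if variable != current:
            # First line of a new block: emit the summary exactly once.
            current = variable
            yield "includes fixed effects for %s\n" % variable
        # Remaining lines of the same block are dropped silently.
```

Because the function only iterates, it can be fed an open file directly, for example output_file.writelines(collapse_factor_blocks(input_file)).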
Any suggestions on any aspect of this, but especially the latter part, would
be greatly appreciated. Thank you.
--
http://mail.python.org/mailman/listinfo/python-list
Re: Python editing .txt file
MRAB mrabarnett.plus.com> writes:
> input_file = open(input_path)
> output_file = open(output_path, "w")
> for line in input_file:
>     if line.startswith("factor("):
>         open_paren = line.find("(")
>         close_paren = line.find(")")
>         variable = line[open_paren + 1 : close_paren]
>         output_file.write("*** Factors for %s ***\n" % variable)
>         prefix = line[ : close_paren + 1]
>         while line.startswith(prefix):
>             line = input_file.readline()
>     output_file.write(line)
> input_file.close()
> output_file.close()
This code is very helpful. Thank you. I have been working with it, but I
encounter an error that I thought I should mention. The line
"line = input_file.readline()" raises:
ValueError: Mixing iteration and read methods would lose data
Re: Python editing .txt file
> From: MRAB
> To: [email protected]
> Date: Wed, 16 Jun 2010 03:06:58 +0100
> Subject: Re: Python editing .txt file
>
> [email protected] wrote:
>
>> I am trying to write a program in Python that will edit .txt log files
>> that contain regression output from R. Any thoughts or suggestions would be
>> greatly appreciated.
>
> How's this:
>
> input_file = open(input_path)
> output_file = open(output_path, "w")
> for line in input_file:
>     if line.startswith("factor("):
>         open_paren = line.find("(")
>         close_paren = line.find(")")
>         variable = line[open_paren + 1 : close_paren]
>         output_file.write("*** Factors for %s ***\n" % variable)
>         prefix = line[ : close_paren + 1]
>         while line.startswith(prefix):
>             line = input_file.readline()
>     output_file.write(line)
> input_file.close()
> output_file.close()

Thank you very much for your reply. This code works perfectly, except that
the "line = input_file.readline()" part causes an error: "ValueError: Mixing
iteration and read methods would lose data." I've been working on figuring
out a workaround. Do you have any ideas?
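
One possible workaround, offered as a sketch rather than the definitive fix: in Python 2, "for line in input_file" reads ahead into a hidden buffer, so calling readline() inside that loop raises this ValueError. Driving the whole loop with readline() avoids mixing the two. The wrapper function shorten_log is a hypothetical name, and the code assumes each block's lines all start with the same 'factor(xyz)' prefix. It also re-checks the line that ends a block, so that two factor blocks in a row each get their own summary line.

```python
def shorten_log(input_path, output_path):
    """Copy input_path to output_path, collapsing each factor(xyz) block."""
    with open(input_path) as input_file, open(output_path, "w") as output_file:
        line = input_file.readline()
        while line:  # readline() returns "" at end of file
            # Require a ')' so that an empty prefix can never match forever.
            if line.startswith("factor(") and ")" in line:
                close_paren = line.find(")")
                variable = line[len("factor("):close_paren]
                output_file.write("*** Factors for %s ***\n" % variable)
                prefix = line[:close_paren + 1]
                # Skip the rest of this block using readline() only.
                while line.startswith(prefix):
                    line = input_file.readline()
                # Re-check the new line: it may start another factor block.
                continue
            output_file.write(line)
            line = input_file.readline()
```

A call would look like shorten_log(input_path, output_path), with both paths defined as in the quoted code.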
