On Wed, Aug 17, 2011 at 5:59 PM, ERIC KRAUSE <[email protected]> wrote:
> The problem for me is the line endings I think. When I open the
> file and read in one line, I get the whole file. I think the
> line endings are ^p (MS paragraph markers), but I can't open
> the file to view them. The files are huge, 150M or bigger. MS
> Word chokes on them.
*snip*
> Is there a way for me to search the entire 150M single line and
> get the metrics I'm looking for, or is it possible to open the
> file, search for the 30 spaces and replace with \n?

150M single line? Do you mean a single line is 150 megabytes or
did you mean something else?

Assuming sensible line lengths, you could start by opening the
file in binary mode and reading a fixed amount of data from the
start (a reasonable length, like a few kilobytes or megabytes).
Write that to a new file and examine it with a text editor, a hex
editor, or whatever application you prefer. Once you know the
line/record separator character(s), you should be able to process
the file line by line or record by record without loading the
whole thing at once.
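
To make that concrete, here is a rough sketch of the idea. It is
written in Python purely for illustration; Perl's binmode and read
cover the same ground. The function names, the 64 KB sample size
and the 1 MB chunk size are all my own assumptions, not anything
from your data, so adjust them to taste.

```python
def sample_head(src_path, dst_path, size=64 * 1024):
    """Copy the first `size` bytes of src_path into dst_path and return them.

    The file is opened in binary mode so no newline translation happens.
    """
    with open(src_path, "rb") as src:
        head = src.read(size)
    with open(dst_path, "wb") as dst:
        dst.write(head)
    return head


def count_line_endings(data):
    """Count CRLF pairs, lone LF bytes and lone CR bytes in a bytes object."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,  # LF not preceded by CR
        "CR": data.count(b"\r") - crlf,  # CR not followed by LF
    }


def iter_records(path, sep, chunk_size=1024 * 1024):
    """Yield records split on the byte string `sep`, reading a chunk at a time.

    Only the current chunk plus one unfinished record is held in memory,
    provided the separator actually occurs in the data.
    """
    buf = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf += chunk
            # Everything before the last separator is complete; the tail
            # may be a partial record (or a partial separator), so keep it.
            *records, buf = buf.split(sep)
            yield from records
    if buf:
        yield buf  # final record with no trailing separator
```

You would call sample_head() on the big file, look at the small
output file in a hex editor (or print count_line_endings() on the
returned bytes), and then pass whatever separator you find to
iter_records(). If it turns out there is no separator at all and
the records are fixed width, reading fixed-size blocks would be
the simpler route.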


-- 
Brandon McCaig <http://www.bamccaig.com/> <[email protected]>
V zrna gur orfg jvgu jung V fnl. Vg qbrfa'g nyjnlf fbhaq gung jnl.
Castopulence Software <http://www.castopulence.org/> <[email protected]>


