On Wed, Aug 17, 2011 at 5:59 PM, ERIC KRAUSE <[email protected]> wrote:
> The problem for me is the line endings I think. When I open the
> file and read in one line, I get the whole file. I think the
> line endings are ^p (MS paragraph markers), but I can't open
> the file to view them. The files are huge, 150M or bigger. MS
> Word chokes on them.

*snip*

> Is there a way for me to search the entire 150M single line and
> get the metrics I'm looking for, or is it possible to open the
> file, search for the 30 spaces and replace with \n?
150M single line? Do you mean that a single line is 150 megabytes,
or did you mean something else?

Assuming the real lines are of a sensible length, you could start
by opening the file as a binary file and reading a fixed amount of
data (a reasonable length, like a few kilobytes or megabytes).
Write that to a new file and examine it with a text editor, a hex
editor, or whatever application you prefer. Once you know the
line/record separator character(s), you should be able to easily
process the file line by line or record by record.

--
Brandon McCaig <http://www.bamccaig.com/> <[email protected]>
V zrna gur orfg jvgu jung V fnl. Vg qbrfa'g nyjnlf fbhaq gung jnl.
Castopulence Software <http://www.castopulence.org/> <[email protected]>
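
For illustration, here is a minimal sketch of the sampling step
described above, followed by record-by-record reading once the
separator is known. It is written in Python only to keep it short;
the same steps map onto Perl's open, binmode and read. The function
names, the 4096-byte sample size and the candidate separators are
all assumptions made for this example, not anything taken from the
original files.

```python
def dump_sample(src_path, dst_path, size=4096):
    """Copy the first `size` bytes of src_path to dst_path and return them."""
    with open(src_path, "rb") as src:   # binary mode: no newline translation
        chunk = src.read(size)
    with open(dst_path, "wb") as dst:
        dst.write(chunk)
    return chunk


def count_candidates(chunk):
    """Count some plausible separators in a sample (the list is a guess)."""
    candidates = {
        "CRLF": b"\r\n",
        "CR": b"\r",            # note: also counts the CR in each CRLF
        "LF": b"\n",
        "30 spaces": b" " * 30,
    }
    return {name: chunk.count(seq) for name, seq in candidates.items()}


def records(path, sep, bufsize=1 << 20):
    """Yield records split on `sep`, reading the file in fixed-size blocks."""
    buf = b""
    with open(path, "rb") as f:
        while True:
            block = f.read(bufsize)
            if not block:
                break
            buf += block
            parts = buf.split(sep)
            buf = parts.pop()   # last piece may be incomplete; keep it
            yield from parts
    if buf:
        yield buf               # trailing record without a final separator
```

With a hypothetical file named huge.dat, you would call
dump_sample("huge.dat", "sample.bin"), look at sample.bin in a hex
editor (or at the counts from count_candidates), and then loop over
records("huge.dat", b"\r") or whichever separator the sample
reveals. Because only one block is held in memory at a time, the
150M size should not matter.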
