Thanks for the help, guys. Below is a portion of my sed script. It just goes
through a bunch of html files and replaces outdated URLs with the correct
locations. All substitutions follow the same format:
s/\/new\.gif/\/icons\/new.gif/g
s/\/http\:\/\/.*\.pppl\.gov\/nstxhome\/nstx\/controls/nstx.pppl.gov\/local\/controls/g
s/\/http\:\/\/.*\.pppl\.gov\/nstxhome\/nstx\/software/nstx.pppl.gov\/local\/software/g
s/\/http\:\/\/.*\.pppl\.gov\/nstxhome\/nstxhome/nstx.pppl.gov/g
s/\/http\:\/\/.*\.pppl\.gov\/nstxhome\/nstx/nstx.pppl.gov\/local/g
s/\/iterhome\/iter/\/iter\/local_share/g
s/\/iterhome/\/iter/g
s/\/iter/\/iter/g
s/\/database/\/iter\/database/g
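
As an aside, the escaped slashes are only needed because "/" is also the
delimiter of the s command. sed accepts other delimiter characters, so here
is a rough sketch of two of the rules above rewritten with "|". The function
name fix_urls and the sample input are made up for illustration; the rules
themselves are unchanged.

```shell
# Sketch: two of the substitutions above, using '|' as the s-command
# delimiter so the slashes in the paths need no backslashes.
# fix_urls is a hypothetical name, not part of the original script.
fix_urls() {
    sed -e 's|/new\.gif|/icons/new.gif|g' \
        -e 's|/iterhome/iter|/iter/local_share|g'
}

# Made-up sample input, one line per rule.
printf '<img src="/new.gif">\n<a href="/iterhome/iter/x.html">x</a>\n' | fix_urls
```

The same idea applies inside a script file read with sed -f: each line can
use "|" in place of the three unescaped slashes, provided "|" does not occur
in the pattern or the replacement.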
I think I may have solved my problem. Most of these html files were created on
Macs or Windows PCs. Using tr, I replace the returns w/ a newline and then pipe
it into sed:
tr "[\r]" "[\n]" < ${1} | $sed > ${1}.new 2>/dev/null
The "2>/dev/null" is to get rid of the sed errors about the missing newline on
the last line of the file. This appears to be working with one exception. On
the Macs and most PC editors, the files still appear normally after being
massaged by tr and sed. The exception is Windows notepad - when the altered
files are opened w/ notepad, they become one long line with some junk characters
appearing in lieu of newlines. Wordpad, Word and Netscape composer don't have
any problems. I figure if anyone uses notepad and complains, I can just tell them
to use a better editor (which would be just about anything else).
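
For reference, here is a minimal sketch of that convert-then-substitute step
with the variables quoted and without the brackets around \r and \n (tr treats
them as literal characters to translate, so they are not needed). The helper
name cr_to_lf_sed, the temporary directory and the sample file are all
invented for the demo; the one-rule script stands in for substitutions.sed.

```shell
# Sketch only: cr_to_lf_sed is a hypothetical helper name.
# $1 = sed script file, $2 = input file; the result is written to "$2.new".
cr_to_lf_sed() {
    tr '\r' '\n' < "$2" | sed -f "$1" > "$2.new"
}

# Demo on a throwaway Mac-style file (every line ends in CR only).
tmp=$(mktemp -d)
printf 's|/iterhome|/iter|g\n' > "$tmp/rules.sed"
printf '<a href="/iterhome/a.html">a</a>\r<p>done</p>\r' > "$tmp/page.html"
cr_to_lf_sed "$tmp/rules.sed" "$tmp/page.html"
cat "$tmp/page.html.new"
```

One caveat with this approach: a file saved on Windows ends each line with CR
followed by LF, so translating every CR to LF leaves an empty line after each
real one. Deleting the CRs with tr -d '\r' instead avoids that for such files.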
Ideally, I would like to return the files to the original condition regarding
newlines and returns, but putting
tr "[\n]" "[\r]"
on the end of the above pipe didn't do it. Any ideas?
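
One possible approach, offered as a sketch rather than a tested fix for these
particular files: check which line-ending style a file uses before touching
it, and undo exactly that conversion afterwards. It assumes each file uses a
single style throughout. The name fix_keep_eol, the rules file and the sample
pages are hypothetical.

```shell
# Sketch under assumptions: one line-ending style per file.
# fix_keep_eol is a hypothetical name. $1 = sed script, $2 = input file;
# the result is written to "$2.new" with the original style restored.
fix_keep_eol() {
    rules=$1 file=$2
    cr=$(tr -cd '\r' < "$file" | wc -c)   # number of carriage returns
    lf=$(tr -cd '\n' < "$file" | wc -c)   # number of newlines
    if [ "$cr" -gt 0 ] && [ "$lf" -gt 0 ]; then
        # DOS/Windows (CR LF): drop the CRs, edit, then put them back.
        tr -d '\r' < "$file" | sed -f "$rules" | awk '{ printf "%s\r\n", $0 }'
    elif [ "$cr" -gt 0 ]; then
        # Classic Mac (CR only): swap CR for LF, edit, then swap back.
        tr '\r' '\n' < "$file" | sed -f "$rules" | tr '\n' '\r'
    else
        # Unix (LF only): nothing to convert.
        sed -f "$rules" "$file"
    fi > "$file.new"
}

# Demo with one Mac-style and one Windows-style throwaway file.
tmp=$(mktemp -d)
printf 's|/iterhome|/iter|g\n' > "$tmp/rules.sed"
printf '<a href="/iterhome/a.html">a</a>\r<p>done</p>\r' > "$tmp/mac.html"
printf '<a href="/iterhome/b.html">b</a>\r\n<p>done</p>\r\n' > "$tmp/dos.html"
fix_keep_eol "$tmp/rules.sed" "$tmp/mac.html"
fix_keep_eol "$tmp/rules.sed" "$tmp/dos.html"
od -c "$tmp/mac.html.new"
od -c "$tmp/dos.html.new"
```

The od -c output shows the \r and \r \n endings explicitly, which makes it
easy to confirm that each .new file ends its lines the same way as the
original did.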
On Tue, 16 May 2000, Pete Peterson wrote:
> The RedHat digest form of the redhat-list is severely hosed. Your message
> just came today, though you sent it almost a week ago. The same thing
> happened with a message *I* sent a week ago: it just appeared today,
> although later messages have appeared previously. Perhaps you've
> already received some suggestions, but I offer these anyhow:
>
> sed doesn't join lines unless you try very hard to make it do so (or it's
> broken).
>
> It normally reads the input, one line at a time and runs all the commands
> in sequence on that line. You *can* append multiple lines to the buffer
> and delete the embedded newlines, but it doesn't happen gratuitously.
>
> Were you working, on Linux, with a file that was generated on a Unix
> machine or some "foreign" file format such as DOS, Windoze, RSX-11, VMS,
> CP/M, :-) ...? Perhaps there's some disagreement between sed and the
> file on what constitutes a "line". Try "cat -v -e sm.html | less".
> That will show up any strange characters, and each line should end with
> "$" which is how that 'cat' command shows newline characters.
>
> It would be helpful if you showed us the content of 'substitutions.sed'
> or at least a representative sample.
>
> Many Unix editors which, like me, are of the "do what I tell you, not
> what you THINK I meant" persuasion, don't insist on putting a Newline at
> the end of the file and some, like Emacs, allow you to specify whether
> you want it to add one, ask you, or quietly accept what you gave it.
> Some LINE-ORIENTED utilities, like sed, will give you a complaint if the
> last line doesn't have a newline and will not consider it to be a "line"
> if it doesn't. Just edit the file with vi, Emacs, or whatever your
> favorite editor might be, insert the newline and that warning will go
> away.
>
>
> pete
>
>
>
> pete peterson
> GenRad, Inc.
> 7 Technology Park Drive
> Westford, MA 01886-0033
>
> [EMAIL PROTECTED] or [EMAIL PROTECTED]
> +1-978-589-7478 (GenRad); +1-978-256-5829 (Home: Chelmsford, MA)
> +1-978-589-2088 (Closest FAX); +1-978-589-7007 (Main GenRad FAX)
>
>
>
>
> > Date: Wed, 10 May 2000 17:24:07 -0400
> > From: Prentice <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Subject: OT: help w/ sed
> >
> >
> > I rearranged the layout of a webserver, and I need to fix URLs that are no
> > longer correct in the html files. I wrote a sed script to do this, and it makes
> > the substitutions correctly, but it strips the newlines off the ends of the
> > lines so the output is one long line. Sed also issues a warning about no
> > newline at the end of the file:
> >
> > $sed -f substitutions.sed sm.html
> > sed: Missing newline at end of file somefile.html.
> > <a bunch of output from sed that's one single line.... >
> >
> > How do I keep sed from removing the newlines at the end of each line? What is
> > causing that error message?
> >
> > Prentice
> > [EMAIL PROTECTED]
> > Princeton Plasma Physics Lab
> > http://www.pppl.gov
--
Prentice
[EMAIL PROTECTED]
Princeton Plasma Physics Lab
http://www.pppl.gov