drieux, et al --
...and then drieux said...
%
% On Saturday, June 8, 2002, at 08:13 , David T-G wrote:
% >drieux, et al --
% >...and then drieux said...
% >% On Saturday, June 8, 2002, at 04:47 , David T-G wrote:
...
% >
% >Tell me about the standard... Should perl happily chomp either a UNIX or
% >a DOS (or even a MAC) line? Or do I turn around and explain it below,
% >answering myself?
%
% the canon is:
%
% EOL - end of line is denoted as
%
% mac: <CR> : chr(13)
% dos: <CR><NL> : chr(13)chr(10)
% nix: <NL> : chr(10)
OK, so it *shouldn't* somehow ever be \n\r and so it is extremely
unlikely that that's why chomp was failing.
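A minimal sketch of why chomp can still appear to fail on a DOS-style line, assuming the default $/ of "\n" (which is LF on Unix and Cygwin). The helper name strip_eol and the sample string are made up for illustration:

```perl
use strict;
use warnings;

# Strip any trailing run of CR/LF characters, whatever platform wrote the line.
# (strip_eol is a hypothetical helper name, not something from this thread.)
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\015\012]+\z//;
    return $line;
}

my $dos = "line\015\012";    # DOS-style ending: CR LF

# chomp removes only a trailing $/ (default "\n"), so the CR is left behind.
chomp(my $chomped = $dos);

printf "after chomp:     %d chars\n", length $chomped;
printf "after strip_eol: %d chars\n", length strip_eol($dos);
```

The leftover CR is invisible on most terminals, which is why a string comparison can fail even though the line looks right when printed.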
%
% note what happens:
%
% vladimir: 64:] echo line> file
% vladimir: 65:] unix2dos file file.dox
% could not open /dev/kbd to get keyboard type US keyboard assumed
% could not get keyboard type US keyboard assumed
% vladimir: 66:] od -c !$
% od -c file.dox
% 0000000 l i n e \r \n
% 0000006
% vladimir: 67:]
OK. I'd probably see about the same when taking a look at my Cygwin
find output.
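Where od is not handy, the same check can be sketched in Perl. The helper names show_eol and eol_type and the sample strings are hypothetical, chosen only to mirror the mac/dos/nix table quoted above:

```perl
use strict;
use warnings;

# Return a printable form of a string with CR and LF spelled out,
# roughly what "od -c" shows.  (show_eol is a hypothetical helper name.)
sub show_eol {
    my ($s) = @_;
    $s =~ s/\015/\\r/g;
    $s =~ s/\012/\\n/g;
    return $s;
}

# Name the line-ending convention used by a single line.
sub eol_type {
    my ($line) = @_;
    return 'dos'  if $line =~ /\015\012\z/;
    return 'unix' if $line =~ /\012\z/;
    return 'mac'  if $line =~ /\015\z/;
    return 'none';
}

for my $sample ("line\015\012", "line\012", "line\015") {
    printf "%-10s %s\n", show_eol($sample), eol_type($sample);
}
```

When the lines come from a file rather than from literals, calling binmode on the filehandle first keeps any I/O layer from translating the line endings before the check sees them.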
%
% if you check the stty man pages you will find our friend onlcr
% that does the mapping of NL to CR-NL - we still have the old
% cross over problem here that what unix folks use as \n is the
Right.
% "new line" token - but which by way of stty goes out to their
% 'terminal type' as if it were CR - or "\r" - return the carriage
% head to the beginning of the line and then shift the roller up one.
%
% otherwise if you have merely the new line
% you start typing here.
The famous stair-step printer problem. Oh, how many sheets of paper have
been wasted because of that mess.
%
% If you have merely the CR - you would start writing over the line.
Tougher to demonstrate in a readable post :-)
%
% Hence to have "\n\r" would mean having implemented the standard
% for the EOL token to the file 'underappropriately' - although
Right; I get it.
% 'technically literally' and it would 'still work' in the case of
% those systems that know how to parse them correctly. Since it
% really does not matter to a teletype which order the commands
% are generated - they will read them off the wire as commands
% and execute them...
Yep, and a screen can function in the same way so the user might never
know, but code will care.
%
% { note you should send three BEL tokens for the start and stop
% of any message - but that has fallen out of habit.... and no one
% seems to worry about taking them out of the data stream, or remembering
% to put them in either... }
*grin*
%
% [..]
% >(you know, it can be a real challenge to write a one-liner!) and found
% >that I have either RL or L for all files, and no \n\r as I had thought,
% [..]
%
% the problem here is that chomp is defined on the host you are on,
% not on the host where you once were.....
Actually, it's all on the same host, but there has been a Cygwin upgrade
in the meantime. What I don't get is why kazin-1 and kazin-3 are not the
same as kazin-2 and kazin-4, and yet all were made around the same time,
definitely without any upgrades or code changes. That says to me that I
ran the find in a different way or some such, and that's possible because
I could have still been doing it manually, but I still don't know what it
would take to generate different output.
%
% it's a reasonable compromise in that case...
Yeah.
%
% where you have to get your poop in a group on this point is as you
% move into 'network layer plays' - such as HTTP - unless you are
% using the appropriate modules to do this stuff for you - and you
Well, that's the plan when possible; I'd much rather use than roll :-)
% find that the RFC for http defines the separator for the head from
% the body as <CR><LF> - cf:
% http://www.w3.org/Protocols/rfc2068/rfc2068
% section 2.2 to be specific - where they call out the decimal
% values for them in the ASCII table....
Yeah.
%
% { may I recommend that you use the CPAN modules - hand cranking this
% stuff from the IO::Socket layer - while what some of us did, is not
You betcha!
% what I would recommend now.... but yes, the original code I ripped
% had the sort of 'oh look, we have that <CR><LF> hence we are out of
% header and the rest is body....' sort of coding...}
Hmmm... Perhaps good to use.
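For reference, a rough sketch of that header/body split, assuming the whole message is already in one string. Each header line ends in CR LF and the head ends at the first empty line, so the split point is CR LF CR LF. The sub name split_http_message and the sample response are made up for illustration, and a CPAN module remains the better choice for real work:

```perl
use strict;
use warnings;

# Split a raw HTTP message into head and body at the first empty line.
# (split_http_message and the sample response are made up for illustration.)
sub split_http_message {
    my ($raw) = @_;
    my ($head, $body) = split /\015\012\015\012/, $raw, 2;
    return ($head, defined $body ? $body : '');
}

# Build a tiny response; the empty string yields the blank separator line.
my $raw = join "\015\012",
    'HTTP/1.1 200 OK',
    'Content-Type: text/plain',
    '',
    "hello\n";

my ($head, $body) = split_http_message($raw);
my @header_lines = split /\015\012/, $head;

print scalar(@header_lines), " header lines, body is ",
    length($body), " bytes\n";
```

The octal escapes \015\012 are used instead of \r\n because the meaning of "\n" depends on the platform, while the wire format does not.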
%
% [..]
% >this would have really screwed me as I got way down into my lists :-)
%
% yes... not that I would wish to impose some 'puritanical morality'
*grin*
% on how you relate to yourself..... but in the coding space, I would
% wish to impose a sense of
%
% THAT WILL HURT YOU!
Oh, indeed. I knew it was a bad way to do it, but it was the only way
that I could see. I'm all better now :-)
%
% >So now I should be able to put
...
% > s/($cr|$lf)+//;
...
%
% test that - but I do not think it will do what you are expecting,
% since I think the tradition is
%
% ([$cr|$lf]+)
Actually, I ended up doing it in octal instead of wasting $cr and $lf,
so I stole your brackets, and the code now works as
s/[\012|\015]+//;
and also works the other way around.
Now that I'm pondering it, if anything, I'd think it would be
($cr|$lf)+
or perhaps
[$cr$lf]+
and so I'll have to check some more.
%
% where the [ ] block off the sequence of characters, the "|"
% here is the expected 'or me' - the "+" denoting one or more of these.
Yeah, I imagine I need to get rid of the | and have again :-) gotten
lucky because there are no |s in my pathnames to be caught.
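A quick sketch of the difference, using a made-up path that does contain a |. Inside square brackets the | is an ordinary character rather than "or", and without an anchor the substitution removes the first matching run wherever it occurs:

```perl
use strict;
use warnings;

my $path = "dir|name/file\015\012";    # made-up path containing a literal |

# Inside [...] the | is just another character, so the first run that
# matches is the pipe itself and the line ending is left alone.
(my $with_pipe = $path) =~ s/[\012|\015]+//;

# Without the |, and anchored at the end of the string, only the line
# ending is removed.
(my $anchored = $path) =~ s/[\015\012]+\z//;

# Alternation in a group does the same job as the two-character class.
(my $grouped = $path) =~ s/(?:\015|\012)+\z//;

for my $result ($with_pipe, $anchored, $grouped) {
    (my $shown = $result) =~ s/\015/\\r/g;   # make CR visible
    $shown =~ s/\012/\\n/g;                  # make LF visible
    print "$shown\n";
}
```

The \z anchor is the design choice that matters most here: it ties the match to the end of the string, so stray CR or LF characters elsewhere are not what gets removed.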
% { in this case you want that, helps the compiler not worry about
% looking for the cases of 'not me' and we nest all of that in the
% round braces to denote the 'yes, this pattern, do something with it!'
Hmmm... Tell me more what you mean here; you've lost me.
%
% [..]
%
% ciao
% drieux
TIA & HAND
:-D
--
David T-G * It's easier to fight for one's principles
(play) [EMAIL PROTECTED] * than to live up to them. -- fortune cookie
(work) [EMAIL PROTECTED]
http://www.justpickone.org/davidtg/ Shpx gur Pbzzhavpngvbaf Qrprapl Npg!
