Hi Clive,

Thanks for the answer.

> No, it's not a bug. Regular expressions match substrings, not entire lines, unless
> constrained by anchors (^ or $) [1].

Of course. But as grep and sed only operate on lines (although sed can work on multiple lines with some tweaking), the $ should play no role here.

So

    grep '/Name/[^/][^/]*'

should mean:

    find a line with /Name/ anywhere,
    then find at least one character other than '/' (the first [^/]),
    then find zero or more further characters other than '/',
    but once you find a '/', the mission is over and you have to drop that line.

So it should match

    blaba/Name/aaa

    blaba/Name/     : possible hit : state valid
    blaba/Name/a    : no '/' : state still valid : let's go on
    blaba/Name/aa   : no '/' : state still valid : let's go on
    blaba/Name/aaa  : no '/' : state still valid : let's go on
    {} / EOL        : no more input, state valid, let's report a match

The same should apply to

    blaba/Name/bb
    blaba/Name/Cc&DD

With the anchor ('$') it should include one more step:

    blaba/Name/     : possible hit : state valid
    blaba/Name/a    : no '/' : state still valid : let's go on
    blaba/Name/aa   : no '/' : state still valid : let's go on
    blaba/Name/aaa  : no '/' : state still valid : let's go on
    EOL             : no '/', EOL found : maximal amount of input allowed read : state valid : stop : let's report a match
Same, of course, for

    blaba/Name/bb
    blaba/Name/Cc&DD

Now for the false matches:

    blaba/Name/aaa/1

    blaba/Name/     : possible hit : state valid
    blaba/Name/a    : no '/' : state still valid : let's go on
    blaba/Name/aa   : no '/' : state still valid : let's go on
    blaba/Name/aaa  : no '/' : state still valid : let's go on
    blaba/Name/aaa/ : '/' read : state invalid : let's get out of here : report error

For grep / sed, an error means "don't tell", so the output should be blank. Thus

    blaba/Name/aaa/1
    blaba/name/aaa/2
    blaba/Name/bb/3
    blaba/Name/Ccccc/5
    blaba/Name/Cc&DD/2

must not be reported, no matter whether there is an anchor or not. The moment the programs encounter a '/', the DFA (deterministic finite automaton) should switch to an invalid state, and that's it for that line. Dump it, let's not talk about it, forget it, you're out, history, dead. (Of course, for the program there might be a chance that a little farther down the line it encounters a second /Name/, so it should go on, realize there is no such thing, and quietly give up.)

So why do grep / sed report them, although they violate the limits of the regex, that is: no '/' after /Name/, period?

> Without the $ anchor, the [^/]* matches as many non-/ characters as it can, and no
> more. The next / and any subsequent characters are ignored.

Yes, but after reading a '/' after /Name/ they cannot read more characters, so they have to bail out.

> However when followed by the $ anchor, the [^/]* must match non-/ characters all the
> way to the end of the line. This looks like a correct solution.

Sure, but once they have found a character violating the condition (no '/' after /Name/), they are expected to give up.

> Does it look any less strange now?

Honest answer, no.
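
For what it's worth, here is a minimal sketch to reproduce what I am describing. The sample lines are the ones from this thread; the variable names are made up for illustration, and the -o option is an assumption about the grep in use (GNU and BSD grep have it, POSIX does not require it).

```shell
#!/bin/sh
# Sample lines taken from this thread: three "good" ones, three with a trailing /N.
sample='blaba/Name/aaa
blaba/Name/bb
blaba/Name/Cc&DD
blaba/Name/aaa/1
blaba/Name/bb/3
blaba/Name/Cc&DD/2'

# Unanchored: the pattern only has to match somewhere inside the line.
unanchored=$(printf '%s\n' "$sample" | grep '/Name/[^/][^/]*')

# Anchored: the run of non-'/' characters has to reach the end of the line.
anchored=$(printf '%s\n' "$sample" | grep '/Name/[^/][^/]*$')

# -o prints only the part of each line that the pattern actually matched
# (assumed available; it is a GNU/BSD extension, not POSIX).
matched_part=$(printf '%s\n' "$sample" | grep -o '/Name/[^/][^/]*')

printf 'unanchored:\n%s\n\n' "$unanchored"
printf 'anchored:\n%s\n\n' "$anchored"
printf 'matched part:\n%s\n' "$matched_part"
```

With GNU grep, the unanchored command reports all six lines and the anchored one only the first three. For blaba/Name/aaa/1 the -o output is /Name/aaa, that is, the part of the line up to but not including the second '/'.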

There should only be one possible explanation: the preceding .* may go haywire and read up to the end of the line before thinking of matching anything else. But although greedy, sed and the like should only be greedy up to a point; that is, .*Anything will be interpreted as "read as much as you like, but once you meet an Anything, you'll stop". But a .* prefix does not change anything (quite correctly).

So the question remains: why don't they stop once they meet the first '/' after /Name/?

Axel Schlicht