Hey Craig.
On Tue, 2022-11-15 at 08:01 +1100, Craig Small wrote:
> It can, but what we think the string is is not what the string
> actually is, I suspect.
> Each one of those 0x0 are delimiters, so if there was two of them at
> the end we would have:
> argv[0] DELIM argv[1] DELIM argv[2]
> Where DELIM is the " " delimiter ps uses and argv[2]=""

AFAIU the cmdline is defined to be the fields, separated by 0x0, with
the last one terminated by 0x0 (so in other words, all are
NUL-terminated strings).

For the matching, we need to make the fields one string, right?
Because BRE/ERE are not really defined for multiline/field matches,
and I guess even if one makes something up, like one RE per field, it
wouldn't be of much use in practice.

How does one reasonably make the multiple fields one true line without
any 0x0?

a) We simply concatenate, separated by space (which I guess is what's
   done now and causes the issue?).
   And this is already ambiguous if one wants perfect matching,
   because: was "foo  " actually:
     foo "" ""
   or was it rather:
     foo " "
   ?

b) We try to escape/quote the fields, so if e.g. the 2nd one were
   <space><tab>, we'd concatenate:
     <space>'<space><tab>'
   (i.e. shell-style quoting).
   But that's obviously more complex, since the field may also contain
   quote characters.

c) Since (a) already loses the information about what the fields
   actually looked like, we could also just say that any whitespace in
   fields is effectively ignored, which again means that one cannot
   specifically match e.g. a 2nd field <space><tab>.
   In return, we could apply the following rule:
   Only if a field in cmdline is non-empty do we actually append it,
   and a separating whitespace, to the string that we match against.

   That leaves open the question of whether we strip any surrounding
   whitespace from a field, i.e. does
     foo " bar baz "
   result in
     "foo bar baz "
   (including a field-separating space) or in
     "foo bar baz"

> The arglist is doesn't end with "[mux]" but "[mux] ".
> While looking odd, this is what the argument list actually is.

Well, I think it's difficult to say what it "actually" is, since we
cannot really express it in ONE string without a separator character
like 0x0 that is otherwise not allowed, or without losing information.
Right?

> Another strange thing, the proc(5) manpage says:
>   /proc/[pid]/cmdline
>     This read-only file holds the complete command line for
>     the process, unless the process is a zombie. In the latter case,
>     there is nothing in this file: that is, a read on this file will
>     return 0 characters. The command-line arguments appear in this
>     file as a set of strings separated by null bytes ('\0'), with a
>     further null byte after the last string.
>
> But neither your or my example has that, its a space. That's the
> kernel doing something odd.

That part I don't understand. The (field-)separating (single) <space>
that you presumably add as a convenience does not "really" exist in
the command line. It's just something that *might* have been used in
the shell that caused the exec of the command, but the kernel never
sees those. The separator could just as well have always been two
<space> or a <tab>.

Cheers,
Chris.
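
For illustration only, here is a rough Python sketch of joining strategies (a) and (c) discussed above. The names split_cmdline, join_simple and join_nonempty, and the strip flag, are made up for this sketch and are not procps code. It also assumes the separator goes only between fields rather than after each one, which is just one possible reading of (c).

```python
def split_cmdline(raw: bytes) -> list:
    """Split raw /proc/[pid]/cmdline content into its NUL-terminated fields."""
    if not raw:
        # Nothing to read, e.g. for a zombie process.
        return []
    if raw.endswith(b"\0"):
        # Drop the terminator of the last field so split() does not
        # produce one extra empty field at the end.
        raw = raw[:-1]
    return raw.split(b"\0")


def join_simple(fields: list) -> bytes:
    """Strategy (a): concatenate all fields, separated by a single space."""
    return b" ".join(fields)


def join_nonempty(fields: list, strip: bool = False) -> bytes:
    """Strategy (c): only non-empty fields contribute to the match string.

    With strip=True, surrounding whitespace is removed from each field
    first, so whitespace-only fields are dropped as well.
    """
    if strip:
        fields = [f.strip() for f in fields]
    return b" ".join(f for f in fields if f)


if __name__ == "__main__":
    # Both argument lists collapse to the same string under (a).
    print(join_simple([b"foo", b"", b""]))
    print(join_simple([b"foo", b" "]))
    # Under (c), empty trailing fields leave no trace.
    print(join_nonempty(split_cmdline(b"foo\0\0\0")))
```

Under (a), the two different argument lists give the same string, which is the ambiguity described above. Under (c), trailing empty fields no longer add trailing spaces, at the cost of not being able to match them at all.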