On Mon 20 Feb 2023 at 10:39:21 (+0100), Andreas Leha wrote:
> Greg Wooledge <g...@wooledge.org> writes:
> > On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote:
> >> But even that's not enough
> >> because the field width is somewhat variable: try ps -eo '%c | %z | %a'
> >> (We can still use | to make the problem somewhat more obvious.)
> >
> > Oh wow. Yeah, OK, that's not really solvable.
> >
> > For those who don't want to try to reverse engineer David's conclusion,
> > or who don't just happen to stumble upon it with their current process
> > list, here's what I'm seeing:
> >
> > COMMAND         |    VSZ | COMMAND
> > systemd         | 164140 | /sbin/init
> > kthreadd        |      0 | [kthreadd]
> > rcu_gp          |      0 | [rcu_gp]
> > rcu_par_gp      |      0 | [rcu_par_gp]
> > [...]
> > steamwebhelper  | 4631064 | /home/greg/.steam/debian-installation/[...]
> > [...]
> > chrome_crashpad | 33567792 | /opt/google/chrome/chrome_crashpad_handler[...]
> > [...]
> > kworker/3:0-eve |      0 | [kworker/3:0-events]
> >
> > ps appears to guess an initial maximum width for the VSZ field, but
> > when a value comes along that exceeds the guessed maximum, it simply
> > shoves the field barrier over. It doesn't even become the new maximum,
> > with all of the fields aligning after that. It's just a one-time shove,
> > breaking the current line only.
> >
> > Therefore, parsing the header line cannot give us enough information to
> > insert field separators correctly in body lines after the fact.
> >
> Dear all,
>
> Thanks for chiming in. The example was indeed simplified and I am using
> %a which can contain internal whitespace.
>
> This is the command I was using previously:
>
> ps -eo '%p|%c|%C' -o "%mem" -o '|%a' --sort=-%cpu
>
> I now replaced it with
>
> ps -eo '%p %c %C' -o "%mem" -o ' %a' --sort=-%cpu | sed -E 's/([0-9]+) (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/'
>
> This works, but is of course cumbersome to maintain.
>
> Again, thanks for all the comments!

I think there are a few too many assumptions in there; in particular,
numbers in %a will match patterns designed to match cpu and mem,
because you can't prevent sed from being greedy (except with the
[^ … … ]+ construction, to restrict what it matches).

This version makes a few assumptions as well:

. that the new format matches the old one (mine) if the delimiters
  given are a single space (like '%p %c %C'), or stripped (like
  "%mem" and '%a', but not ' %a').
. the short command is always 15 chars wide even if all the commands
  in the table are shorter, eg with ps -o.
. I don't have any of those new-fangled extra-long PIDs yet today.

It might well break if a CPU or MEM is running at 100%. That's not
easily tested here.

I've reordered the columns on the first pass, so that the numeric ones
(with their limited character set) come first, which means I can use
an auxiliary character for correcting the spacing. (The spaces between
the columns get comingled with the leading spaces of numbers.) The
second pass sorts that out and processes the heading.

$ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu | sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;' | sed -E 's/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' | less
$

This is the same, except I deliberately chose _ for the auxiliary
character, knowing that short commands are stuffed with underscores:

$ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu | sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1_\3_\2\4/;' | sed -E 's/([^_]+)_ ([^_]+)_(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' | less
$

Examples:

  PID|COMMAND        |%CPU %MEM|COMMAND
 9798|firefox-esr    | 2.5  5.8|firefox-esr
16143|Isolated Web Co| 1.8  2.2|/usr/lib/firefox-esr/firefox-esr -contentproc -childID 11 -isForBrowser -prefsLen 47676 -prefMapSize 232307 -jsInitLen 277276 -parentBuildID 20230214011352 -appDir /usr/lib/firefox-esr/browser 9798 true tab
 1242|Xorg           | 1.0  1.4|/usr/lib/xorg/Xorg -nolisten tcp :0 vt1 -keeptty -auth /tmp/serverauth.FxvBp8B7Qn
[ … ]
    8|mm_percpu_wq   | 0.0  0.0|[mm_percpu_wq]
    9|rcu_tasks_rude_| 0.0  0.0|[rcu_tasks_rude_]
   10|rcu_tasks_trace| 0.0  0.0|[rcu_tasks_trace]

An incestuous one, with -o rather than -eo:

  PID|COMMAND        |%CPU %MEM|COMMAND
 1694|bash           | 0.0  0.1|bash
23486|ps             | 0.0  0.0|ps -o %p %c %C -o %mem -o %a --sort=-%cpu
23487|sed            | 0.0  0.0|sed -E s/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;
23488|sed            | 0.0  0.0|sed -E s/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM|COMMAND/%MEM|COMMAND/;
23489|less           | 0.0  0.0|less

Cheers,
David.
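
For anyone who wants to try the two passes without depending on their
live process list, here is a minimal sketch that feeds canned lines
through the same substitutions. The function names ps_to_pipes and
sample are made up for illustration, and the sample rows assume a
5-wide PID column, a 15-wide short command and 4-wide %CPU and %MEM
columns; real ps output may well differ. The only change to the sed
scripts is writing the header padding as " {8}" rather than eight
literal spaces, which should match the same text.

```shell
#!/bin/sh
# Sketch only: the two sed passes, wrapped in a function so that they
# can be fed canned input. Assumes a sed that supports -E, and that '~'
# does not occur in the first three columns.
ps_to_pipes() {
    # Pass 1: move the numeric columns in front of the 15-char short
    # command, marking the joins with the auxiliary character '~'.
    sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/' |
    # Pass 2: restore the column order with '|' separators, then fix
    # the header line, which pass 1 leaves alone (it has no digits).
    sed -E 's/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND {8}) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/'
}

# Hypothetical sample laid out like ps output. printf reuses the format
# for each group of five arguments, so the column widths are exact.
sample() {
    printf '%5s %-15s %4s %4s %s\n' \
        PID  COMMAND      %CPU %MEM COMMAND \
        1242 Xorg         1.0  1.4  '/usr/lib/xorg/Xorg -nolisten tcp :0 vt1' \
        8    mm_percpu_wq 0.0  0.0  '[mm_percpu_wq]'
}

sample | ps_to_pipes
```

Each output line should carry exactly three | separators, with the
digits inside the Xorg arguments (:0 and vt1) left untouched because
the numeric columns are matched by position rather than by a greedy
(.+).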