On Mon 20 Feb 2023 at 10:39:21 (+0100), Andreas Leha wrote:
> Greg Wooledge <g...@wooledge.org> writes:
> > On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote:
> >> But even that's not enough
> >> because the field width is somewhat variable: try   ps -eo '%c  |  %z  |  %a'
> >> (We can still use | to make the problem somewhat more obvious.)
> >
> > Oh wow.  Yeah, OK, that's not really solvable.
> >
> > For those who don't want to try to reverse engineer David's conclusion,
> > or who don't just happen to stumble upon it with their current process
> > list, here's what I'm seeing:
> >
> > COMMAND          |     VSZ  |  COMMAND
> > systemd          |  164140  |  /sbin/init
> > kthreadd         |       0  |  [kthreadd]
> > rcu_gp           |       0  |  [rcu_gp]
> > rcu_par_gp       |       0  |  [rcu_par_gp]
> > [...]
> > steamwebhelper   |  4631064  |  /home/greg/.steam/debian-installation/[...]
> > [...]
> > chrome_crashpad  |  33567792  |  /opt/google/chrome/chrome_crashpad_handler[...]
> > [...]
> > kworker/3:0-eve  |       0  |  [kworker/3:0-events]
> >
> > ps appears to guess an initial maximum width for the VSZ field, but
> > when a value comes along that exceeds the guessed maximum, it simply
> > shoves the field barrier over.  It doesn't even become the new maximum,
> > with all of the fields aligning after that.  It's just a one-time shove,
> > breaking the current line only.
> >
> > Therefore, parsing the header line cannot give us enough information to
> > insert field separators correctly in body lines after the fact.
> 
> 
> Dear all,
> 
> Thanks for chiming in.  The example was indeed simplified and I am using
> %a which can contain internal whitespace.
> 
> This is the command I was using previously:
> 
>   ps -eo '%p|%c|%C' -o "%mem" -o '|%a' --sort=-%cpu
> 
> I now replaced it with
> 
>   ps -eo '%p %c %C' -o "%mem" -o ' %a' --sort=-%cpu |
>     sed -E 's/([0-9]+) (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/'
>  
> This works, but is of course cumbersome to maintain.
> 
> Again, thanks for all the comments!

I think there are a few too many assumptions in there;
in particular, numbers in %a will match patterns designed
to match cpu and mem, because you can't prevent sed from
being greedy (except with the [^ … … ]+ construction, to
restrict what it matches).
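
To make the greediness concrete, here is a minimal sketch on a made-up
line. The PID and the seq arguments are invented for illustration, and
the fields are separated by single spaces, which real ps output doesn't
guarantee. The first sed uses the pattern from the quoted message; the
second swaps in [^ ]+ for the fields that can't contain a space.

```shell
# Hypothetical line: PID, short command, %CPU, %MEM, then args with numbers.
line='4242 seq 0.0 0.1 seq 1 5 100'

# Pattern from the quoted message: (.+) is greedy, so the command field
# runs on until the last pair of numbers that still lets the rest match.
greedy=$(printf '%s\n' "$line" |
  sed -E 's/([0-9]+) (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/')

# Negated classes stop each of the first four fields at the next space.
bounded=$(printf '%s\n' "$line" |
  sed -E 's/^([0-9]+) ([^ ]+) ([^ ]+) ([^ ]+) (.*)$/\1|\2|\3|\4|\5/')

printf '%s\n%s\n' "$greedy" "$bounded"
```

With the greedy pattern the separators land inside the argument list;
the bounded one splits after the fourth field. Note that [^ ]+ is only
safe for the command field here because "seq" has no embedded space,
which a short command like "Isolated Web Co" shows is not generally true.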

This version makes a few assumptions as well:
. that the new format matches the old one (mine) if the
  delimiters given are a single space (like '%p %c %C'),
  or stripped (like "%mem" and '%a', but not ' %a').
. the short command is always 15 chars wide even if all
  the commands in the table are shorter, eg with ps -o.
. I don't have any of those new-fangled extra-long PIDs
  yet today.

It might well break if a CPU or MEM is running at 100%.
That's not easily tested here.

I've reordered the columns on the first pass, so that the
numeric ones (with their limited character set) come first,
which means I can use an auxiliary character for
correcting the spacing. (The spaces between the columns get
commingled with the leading spaces of the numbers.) The second
pass sorts that out and processes the heading.

$ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu |
    sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;' |
    sed -E 's/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' |
    less
$ 
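
To see what each pass does without depending on a live process list,
here is a rough sketch that feeds the same two sed stages a pair of
hand-built lines. The layout (7-wide PID, 15-wide short command, 4-wide
%CPU and %MEM) is my assumption about what ps prints here, not captured
output, and I've written the header's padding as " {8}" instead of eight
literal spaces, which should be equivalent.

```shell
# Hand-built header and body line in the assumed ps layout.
sample=$(
  printf '%7s %-15s %4s %4s %s\n' PID COMMAND %CPU %MEM COMMAND
  printf '%7s %-15s %4s %4s %s\n' 16143 'Isolated Web Co' 1.8 2.2 \
    '/usr/lib/firefox-esr/firefox-esr -childID 11 -prefsLen 47676 tab'
)

# Pass 1: move the two numeric columns in front of the short command and
# mark the joins with ~ (the header has no digits, so it passes through).
pass1=$(printf '%s\n' "$sample" |
  sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;')

# Pass 2: drop the stray separator space after the first ~, restore the
# column order with | between fields, then fix up the header line.
pass2=$(printf '%s\n' "$pass1" |
  sed -E 's/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND {8}) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;')

printf '%s\n\n%s\n' "$pass1" "$pass2"
```

The first block printed is the intermediate form, with the numbers
sitting between the two ~ marks; the second is the final table.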

This is the same, except I deliberately chose _ for the auxiliary
character, knowing that short commands are stuffed with underscores:

$ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu |
    sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1_\3_\2\4/;' |
    sed -E 's/([^_]+)_ ([^_]+)_(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' |
    less
$ 

Examples:

    PID|COMMAND        |%CPU %MEM|COMMAND
   9798|firefox-esr    | 2.5  5.8|firefox-esr
  16143|Isolated Web Co| 1.8  2.2|/usr/lib/firefox-esr/firefox-esr -contentproc -childID 11 -isForBrowser -prefsLen 47676 -prefMapSize 232307 -jsInitLen 277276 -parentBuildID 20230214011352 -appDir /usr/lib/firefox-esr/browser 9798 true tab
   1242|Xorg           | 1.0  1.4|/usr/lib/xorg/Xorg -nolisten tcp :0 vt1 -keeptty -auth /tmp/serverauth.FxvBp8B7Qn
[ … ]
      8|mm_percpu_wq   | 0.0  0.0|[mm_percpu_wq]
      9|rcu_tasks_rude_| 0.0  0.0|[rcu_tasks_rude_]
     10|rcu_tasks_trace| 0.0  0.0|[rcu_tasks_trace]

An incestuous one, with -o rather than -eo:

    PID|COMMAND        |%CPU %MEM|COMMAND
   1694|bash           | 0.0  0.1|bash
  23486|ps             | 0.0  0.0|ps -o %p %c %C -o %mem -o %a --sort=-%cpu
  23487|sed            | 0.0  0.0|sed -E s/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;
  23488|sed            | 0.0  0.0|sed -E s/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM|COMMAND/%MEM|COMMAND/;
  23489|less           | 0.0  0.0|less

Cheers,
David.
