David Wright <deb...@lionunicorn.co.uk> writes:

> On Mon 20 Feb 2023 at 10:39:21 (+0100), Andreas Leha wrote:
>> Greg Wooledge <g...@wooledge.org> writes:
>> > On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote:
>> >> But even that's not enough
>> >> because the field width is somewhat variable: try   ps -eo '%c  |  %z  |  %a'
>> >> (We can still use | to make the problem somewhat more obvious.)
>> >
>> > Oh wow.  Yeah, OK, that's not really solvable.
>> >
>> > For those who don't want to try to reverse engineer David's conclusion,
>> > or who don't just happen to stumble upon it with their current process
>> > list, here's what I'm seeing:
>> >
>> > COMMAND          |     VSZ  |  COMMAND
>> > systemd          |  164140  |  /sbin/init
>> > kthreadd         |       0  |  [kthreadd]
>> > rcu_gp           |       0  |  [rcu_gp]
>> > rcu_par_gp       |       0  |  [rcu_par_gp]
>> > [...]
>> > steamwebhelper   |  4631064  |  /home/greg/.steam/debian-installation/[...]
>> > [...]
>> > chrome_crashpad  |  33567792  |  /opt/google/chrome/chrome_crashpad_handler[...]
>> > [...]
>> > kworker/3:0-eve  |       0  |  [kworker/3:0-events]
>> >
>> > ps appears to guess an initial maximum width for the VSZ field, but
>> > when a value comes along that exceeds the guessed maximum, it simply
>> > shoves the field barrier over.  It doesn't even become the new maximum,
>> > with all of the fields aligning after that.  It's just a one-time shove,
>> > breaking the current line only.
>> >
>> > Therefore, parsing the header line cannot give us enough information to
>> > insert field separators correctly in body lines after the fact.
>> 
>> 
>> Dear all,
>> 
>> Thanks for chiming in.  The example was indeed simplified and I am using
>> %a which can contain internal whitespace.
>> 
>> This is the command I was using previously:
>> 
>>   ps -eo '%p|%c|%C' -o "%mem" -o '|%a' --sort=-%cpu
>> 
>> I now replaced it with
>> 
>>   ps -eo '%p %c %C' -o "%mem" -o ' %a' --sort=-%cpu |
>>     sed -E 's/([0-9]+) (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/'
>>  
>> This works, but is of course cumbersome to maintain.
>> 
>> Again, thanks for all the comments!
>
> I think there are a few too many assumptions in there;
> in particular, numbers in %a will match patterns designed
> to match cpu and mem, because you can't prevent sed from
> being greedy (except with the [^ … … ]+ construction, to
> restrict what it matches).
>
> This version makes a few assumptions as well:
> . that the new format matches the old one (mine) if the
>   delimiters given are a single space (like '%p %c %C'),
>   or stripped (like "%mem" and '%a', but not ' %a').
> . the short command is always 15 chars wide even if all
>   the commands in the table are shorter, eg with ps -o.
> . I don't have any of those new-fangled extra-long PIDs
>   yet today.
>
> It might well break if a CPU or MEM is running at 100%.
> That's not easily tested here.
>
> I've reordered the columns on the first pass, so that the
> numeric ones (with their limited character set) come first,
> which means I can use an auxiliary character for
> correcting the spacing. (The spaces between the columns get
> commingled with the leading spaces of numbers.) The second
> pass sorts that out and processes the heading.
>
> $ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu |
>     sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;' |
>     sed -E 's/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' |
>     less
> $ 
>
> This is the same, except I deliberately chose _ for the auxiliary
> character, knowing that short commands are stuffed with underscores:
>
> $ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu |
>     sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1_\3_\2\4/;' |
>     sed -E 's/([^_]+)_ ([^_]+)_(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' |
>     less
> $ 
>
> Examples:
>
>     PID|COMMAND        |%CPU %MEM|COMMAND
>    9798|firefox-esr    | 2.5  5.8|firefox-esr
>   16143|Isolated Web Co| 1.8  2.2|/usr/lib/firefox-esr/firefox-esr -contentproc -childID 11 -isForBrowser -prefsLen 47676 -prefMapSize 232307 -jsInitLen 277276 -parentBuildID 20230214011352 -appDir /usr/lib/firefox-esr/browser 9798 true tab
>    1242|Xorg           | 1.0  1.4|/usr/lib/xorg/Xorg -nolisten tcp :0 vt1 -keeptty -auth /tmp/serverauth.FxvBp8B7Qn
> [ … ]
>       8|mm_percpu_wq   | 0.0  0.0|[mm_percpu_wq]
>       9|rcu_tasks_rude_| 0.0  0.0|[rcu_tasks_rude_]
>      10|rcu_tasks_trace| 0.0  0.0|[rcu_tasks_trace]
>
> An incestuous one, with -o rather than -eo:
>
>     PID|COMMAND        |%CPU %MEM|COMMAND
>    1694|bash           | 0.0  0.1|bash
>   23486|ps             | 0.0  0.0|ps -o %p %c %C -o %mem -o %a --sort=-%cpu
>   23487|sed            | 0.0  0.0|sed -E s/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;
>   23488|sed            | 0.0  0.0|sed -E s/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND        ) /\1|\2|/;s/%MEM|COMMAND/%MEM|COMMAND/;
>   23489|less           | 0.0  0.0|less
>
> Cheers,
> David.

Thanks a lot for this!  Really appreciated.

I am missing the delimiter between %CPU and
%MEM, though...
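
One possible way to get it, as a minimal sketch only: add a third sed
stage after the two quoted above.  The function name split_cpu_mem is
hypothetical and not part of David's pipeline.  It assumes the input
already looks like the example output above, that is a numeric PID, a
15-character short command, and "%CPU %MEM" separated by one space
plus padding, with sed -E available.

```shell
# split_cpu_mem: hypothetical extra stage, not from the original pipeline.
# Reads lines shaped like "  PID|COMMAND (15 chars)|%CPU %MEM|ARGS" on stdin
# and inserts a | between the %CPU and %MEM columns, keeping the padding.
split_cpu_mem() {
    sed -E \
        -e 's/^( *[0-9]+\|.{15}\| *[0-9.]+) ( *[0-9.]+\|)/\1|\2/' \
        -e 's/^( *PID\|.{15}\|%CPU) (%MEM\|)/\1|\2/'
}

# Example: feed it two lines copied from the sample output above.
printf '%s\n' \
    '    PID|COMMAND        |%CPU %MEM|COMMAND' \
    '   9798|firefox-esr    | 2.5  5.8|firefox-esr' |
    split_cpu_mem
```

Both expressions are anchored at the start of the line and step over the
short command with .{15}, so numbers or | characters inside the args
column should not be touched.  In the full pipeline it would go just
before "| less".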

Best,
Andreas
