Samuel Thibault, on Wed 20 Apr 2016 01:07:09 +0200, wrote:
> Samuel Thibault, on Tue 19 Apr 2016 12:18:01 +0200, wrote:
> > Svante Signell, on Tue 19 Apr 2016 12:08:07 +0200, wrote:
> > > > Looking at ps -feMj, it seems that it's ext2fs which consumes much more
> > > > CPU time, thus increasing overall wallclock time. It'd probably be
> > > > interesting to profile ext2fs, to see what takes longer. Perhaps it's
> > > > the hash table which is less efficient since with the new page cache
> > > > policy there are much more cached files?
> > >
> > > Are you preparing ext2fs for profiling? .../pkg-glibc/glibc: hurd-i386:
> > > Fix
> > > recording profiling from ext2fs
> >
> > Yes. I just got the results, attached here.
>
> I have now pushed a wiki page for profiling the kernel, which is more
> involved:
> http://darnassus.sceen.net/~hurd-web/microkernel/mach/gnumach/profiling/
> The glibc build is going on, results available later :)
Here is the profiling for the new policy.
So half of the time is spent in vm_map_enter.part.0, and looking more
closely at the addresses, it's exactly the while(TRUE) loop between
lines 777 and 830.
This indeed loops over all vm_map_entries of the process, which are
plenty when we have a lot of cache (a mere find /usr on my system brings
11400 cached objects, and vminfo on ext2fs shows 3240 entries).
Most of the time, these are contiguous, so it's kind of stupid to look
again at each and every of them to find room. I guess we can introduce
a much better algorithm, by maintaining pointers to the free areas.
Samuel
77m22.980s
USER PID PPID PGRP SESS LCOLL STAT TIME COMMAND
root 1 1 1 1 1 Sslo 0:00.07 init [2]
- 2 1 1 1 1 Sp 0:00.00 /hurd/startup root=device:
- 3 1 1 1 1 D<p 0:17.62 root=device:hd1 console=co
- 4 2 1 1 1 R<mo 1:15.91 /hurd/proc
- 5 2 1 1 1 S<o 7:05.29 ext2fs --readonly --multib
- 6 5 1 1 1 S<o 0:50.76 /hurd/exec
- 7 2 1 1 1 S<o 0:06.08 /hurd/auth
root 8 5 1 1 1 S<o 0:00.06 /hurd/term /dev/console de
root 9 5 1 1 1 S<o 0:17.82 /hurd/pflocal
root 11 1 11 11 1 Ssfo 0:00.06 /hurd/mach-defpager
root 15 5 1 1 1 S<o 0:00.00 /hurd/fifo
- 19 5 1 1 1 So 0:17.03 /hurd/null
root 23 5 1 1 1 S<o 0:00.00 /hurd/storeio hd1
root 24 5 1 1 1 S<o 0:00.00 /hurd/storeio hd0s1
root 25 5 1 1 1 S<o 0:00.00 /hurd/storeio hd3s1
root 26 5 1 1 1 S<o 0:00.01 /hurd/storeio hd2
root 27 1 14 14 1 S<o 0:00.01 /hurd/procfs --compatible
root 37 5 1 1 1 S<o 0:00.00 /hurd/proxy-defpager
root 85 1 14 14 1 S<o 0:00.02 /hurd/tmpfs --nosuid --noe
root 86 5 1 1 1 S<o 0:00.00 /hurd/storeio --no-cache t
root 100 1 14 14 1 S<o 0:00.00 /hurd/tmpfs --noexec --nos
root 169 1 14 14 1 S<o 0:00.00 /hurd/tmpfs --nosuid --noe
- 205 27 14 14 1 So 0:00.00 /hurd/mtab /
root 412 1 14 14 1 S<o 0:00.06 /hurd/ext2fs /dev/hd0s1
root 511 5 1 1 1 So 0:00.17 /hurd/magic tty
root 522 5 1 1 1 S<o 0:00.52 /hurd/random --seed-file /
root 542 5 1 1 1 S<o 0:07.14 /hurd/pfinet -i /dev/eth0
root 543 5 1 1 1 So 0:00.00 /hurd/devnode -M /dev/netd
root 544 5 1 1 1 So 0:00.00 devnode -n eth0 eth0
root 637 85 14 14 1 S<o 0:00.01 /hurd/fifo
root 650 5 1 1 1 S<o 0:00.00 /hurd/streamio kmsg
root 691 1 691 691 1 Ssfo 0:00.00 /usr/sbin/cron
messagebus 715 1 715 715 1 Ssfo 0:00.00 /usr/bin/dbus-daemon --sys
root 720 5 1 1 1 S<o 0:00.00 /hurd/term /dev/tty1 hurdi
root 721 5 1 1 1 S<o 0:00.00 /hurd/console
root 726 1 725 725 1 S<fo 0:00.18 /bin/console --daemonize -
root 727 5 1 1 1 S<o 0:00.00 /hurd/storeio --no-cache m
root 734 1 732 732 1 Sfo 0:00.00 /usr/sbin/inetutils-inetd
root 764 1 764 764 1 Ssfo 0:00.00 /usr/sbin/sshd
- 785 1 785 785 785 Sslo 0:00.00 /bin/bash -noprofile /bin/
- 786 1 786 786 786 Sslo 0:00.02 /bin/bash -noprofile /bin/
- 787 1 787 787 787 Sslo 0:00.00 /bin/bash -noprofile /bin/
- 788 1 788 788 788 Sslo 0:00.00 /bin/bash -noprofile /bin/
- 789 1 789 789 789 Sslo 0:00.00 /bin/bash -noprofile /bin/
- 790 1 790 790 790 Sslo 0:00.00 /bin/bash -noprofile /bin/
- 791 1 791 791 791 Sslo 0:00.00 /bin/bash -noprofile /bin/
root 792 5 1 1 1 S<o 0:00.00 /hurd/term /dev/tty3 hurdi
root 793 5 1 1 1 S<o 0:00.00 /hurd/term /dev/tty2 hurdi
root 794 5 1 1 1 S<o 0:00.00 /hurd/term /dev/tty5 hurdi
root 795 5 1 1 1 S<o 0:00.01 /hurd/term /dev/tty4 hurdi
root 796 5 1 1 1 S<o 0:00.00 /hurd/term /dev/tty6 hurdi
root 797 5 1 1 1 S<o 0:00.01 /hurd/password
root 798 734 732 732 1 So 0:00.06 telnetd
root 799 5 1 1 1 S<o 0:00.09 /hurd/term /dev/ptyp0 pty-
root 800 798 800 800 800 Sslo 0:00.06 -bash
root 815 1 814 800 1 S<o 25:52.77 /hurd/ext2fs.prof /dev/hd3
root 816 734 732 732 1 So 0:00.79 telnetd
root 817 5 1 1 1 S<o 0:18.74 /hurd/term /dev/ptyp1 pty-
samy 818 816 818 818 818 Sslow 0:00.10 -bash
root 3159 818 3159 818 818 S 0:00.00 ps -feMj
root 19593 5 1 1 1 S<o 0:00.00 /hurd/fifo
Flat profile:
Each sample counts as 0.0001 seconds.
% cumulative self self total
time seconds seconds calls Ts/call Ts/call name
55.05 2.82 2.82 vm_map_enter.part.0
6.54 3.15 0.34 spl0
3.23 3.32 0.17 syscall_entry_2
3.20 3.48 0.16 ide_do_request
2.99 3.64 0.15 rdxtree_lookup_common
2.27 3.75 0.12 triton_dmaproc
1.46 3.83 0.07 ide_wait_stat
1.28 3.89 0.07 t_page_fault
1.19 3.96 0.06 vm_page_lookup
1.09 4.01 0.06 lock_done
1.08 4.07 0.06 memset
0.98 4.12 0.05 kmem_cache_alloc
0.96 4.17 0.05 mach_msg_trap
0.93 4.21 0.05 vm_map_copyout
0.83 4.26 0.04 lock_write
0.82 4.30 0.04 copyoutmsg
0.53 4.33 0.03 kmem_cache_free
0.51 4.35 0.03 syscall_native
0.50 4.38 0.03 ipc_kmsg_copyin_header
0.48 4.40 0.02 spl7
0.43 4.42 0.02 ipc_right_lookup_write
0.42 4.44 0.02 ipc_kmsg_queue_next
0.38 4.46 0.02 copyinmsg
0.37 4.48 0.02 _kret_popl_fs
0.37 4.50 0.02 _kret_popl_gs
0.35 4.52 0.02 ipc_kobject_server
0.33 4.54 0.02 pmap_remove_range
0.32 4.55 0.02 ipc_right_copyin
0.31 4.57 0.02 pmap_enter
0.30 4.58 0.02 ipc_mqueue_send
0.29 4.60 0.02 ipc_kmsg_copyin_body
0.29 4.61 0.01 port_name_to_space
0.29 4.63 0.01 lock_read
0.28 4.64 0.01 ipc_mqueue_copyin
0.26 4.66 0.01 syscall_entry_3
0.25 4.67 0.01 _kret_popl_ds
0.24 4.68 0.01 pmap_remove
0.24 4.69 0.01 t_nofpu
0.22 4.71 0.01 __delay
0.22 4.72 0.01 fpnoextflt
0.20 4.73 0.01 vm_object_enter
0.19 4.74 0.01 vm_fault
0.19 4.75 0.01 vm_map
0.18 4.75 0.01 ipc_mqueue_move
0.18 4.76 0.01 vm_map_lookup_entry
0.17 4.77 0.01 vm_fault_page
0.17 4.78 0.01 copyout
0.17 4.79 0.01 syscall_vm_map
0.16 4.80 0.01 ipc_mqueue_receive
0.16 4.81 0.01 ipc_kmsg_get
0.16 4.81 0.01 pmap_pte
0.15 4.82 0.01 _kret_popl_es
0.15 4.83 0.01 mach_call_call
0.15 4.84 0.01 ipc_kmsg_copyout_body
0.14 4.84 0.01 user_trap
0.13 4.85 0.01 memcpy
0.13 4.86 0.01 vm_object_deallocate
0.12 4.86 0.01 ll_rw_block
0.12 4.87 0.01 fp_load
0.12 4.88 0.01 port_name_to_map
0.11 4.88 0.01 vm_object_cache_remove
0.10 4.89 0.01 mach_msg_send
0.10 4.89 0.01 vm_map_delete
0.10 4.90 0.00 vm_map_lookup
0.09 4.90 0.00 ipc_port_release_send
0.09 4.91 0.00 mach_msg_receive
0.09 4.91 0.00 _kret_iret
0.09 4.92 0.00 trap_push_segs
0.09 4.92 0.00 ipc_object_copyin
0.09 4.93 0.00 ipc_right_dealloc
0.09 4.93 0.00 rbtree_insert_rebalance
0.08 4.93 0.00 copyinmsg_ret
0.08 4.94 0.00 ipc_entry_alloc
0.08 4.94 0.00 phys_attribute_test
0.08 4.95 0.00 ipc_kmsg_put
0.08 4.95 0.00 rdxtree_remove
0.08 4.96 0.00 kfree
0.08 4.96 0.00 rbtree_remove
0.07 4.96 0.00 convert_port_to_space
0.07 4.97 0.00
_Xmach_port_set_protected_payload
0.07 4.97 0.00 ipc_kmsg_copyout_header
0.07 4.97 0.00 ipc_right_delta
0.07 4.98 0.00 pmap_page_protect
0.07 4.98 0.00 vm_object_page_remove
0.06 4.98 0.00 _Xmach_port_mod_refs
0.06 4.99 0.00 ipc_right_check
0.06 4.99 0.00 phys_attribute_clear
0.06 4.99 0.00 rdxtree_insert_common
0.06 5.00 0.00
ipc_kmsg_copyout_object.part.6
0.06 5.00 0.00 ipc_port_destroy
0.06 5.00 0.00 kalloc
0.05 5.01 0.00 copyin
0.05 5.01 0.00 vm_map_enter
0.05 5.01 0.00 ipc_object_copyin_type
0.05 5.01 0.00 vm_map_entry_delete
0.05 5.02 0.00 _Xmach_port_move_member
0.05 5.02 0.00 copyoutmsg_ret
0.05 5.02 0.00 ipc_object_copyout_name
0.05 5.02 0.00 thread_block
0.05 5.03 0.00 ipc_kmsg_copyin
0.04 5.03 0.00 mach_port_move_member
0.04 5.03 0.00 ipc_object_alloc
0.04 5.03 0.00 ipc_right_copyout
0.04 5.03 0.00 ipc_right_reverse
0.04 5.04 0.00 mach_msg_receive_continue
0.04 5.04 0.00 pmap_valid_page
0.04 5.04 0.00
syscall_mach_port_deallocate
0.04 5.04 0.00
thread_will_wait_with_timeout
0.04 5.04 0.00
syscall_mach_port_insert_right
0.03 5.05 0.00 ipc_entry_dealloc
0.03 5.05 0.00 rdxtree_remove_bm_set
0.03 5.05 0.00 copyout_ret
0.03 5.05 0.00 syscall_vm_deallocate
0.03 5.05 0.00 ipc_object_copyout
0.03 5.05 0.00 vm_fault_cleanup
0.03 5.06 0.00 mach_port_deallocate
0.03 5.06 0.00 rbtree_rotate
0.03 5.06 0.00 vm_page_deactivate
0.03 5.06 0.00 _return_from_trap
0.03 5.06 0.00 ipc_port_alloc
0.03 5.06 0.00 ipc_pset_remove
0.03 5.06 0.00 mach_port_mod_refs
0.03 5.07 0.00 syscall_mach_port_allocate
0.03 5.07 0.00 user_page_fault_continue
0.03 5.07 0.00 vm_object_copy_delayed
0.02 5.07 0.00 _vm_map_entry_create
0.02 5.07 0.00 copyin_ret
0.02 5.07 0.00 ipc_pset_move
0.02 5.07 0.00 vm_object_lookup
0.02 5.07 0.00 vm_page_free
0.02 5.07 0.00 ipc_object_translate
0.02 5.08 0.00 ipc_thread_dequeue
0.02 5.08 0.00 thread_syscall_return
0.02 5.08 0.00 vm_map_copyin.part.2
0.02 5.08 0.00 vm_page_alloc_pa
0.02 5.08 0.00 ipc_entry_alloc_name
0.02 5.08 0.00 memory_object_lock_request
0.02 5.08 0.00 rdwr_full
0.02 5.08 0.00 vm_page_insert
0.02 5.08 0.00 vm_page_mem_free
0.02 5.09 0.00
vm_page_seg_alloc_from_buddy
0.02 5.09 0.00 _vm_map_clip_start
0.02 5.09 0.00 ipc_port_clear_receiver
0.02 5.09 0.00 space_deallocate
0.02 5.09 0.00 thread_go
0.02 5.09 0.00
_Xmach_port_request_notification
0.02 5.09 0.00 _vm_map_clip_end
0.02 5.09 0.00 ipc_mqueue_changed
0.02 5.09 0.00 mach_port_insert_right
0.02 5.09 0.00 thread_handoff
0.02 5.09 0.00 vm_map_deallocate
0.02 5.09 0.00 vm_map_remove
0.02 5.10 0.00 vm_map_verify
0.01 5.10 0.00 vm_map_copy_page_discard
0.01 5.10 0.00
vm_map_copyin_page_list.part.3
0.01 5.10 0.00 vm_page_remove
0.01 5.10 0.00 _Xdevice_write
0.01 5.10 0.00 __wait_on_buffer
0.01 5.10 0.00 ipc_kmsg_dequeue
0.01 5.10 0.00 ipc_object_release
0.01 5.10 0.00 ipc_target_init
0.01 5.10 0.00
mach_port_set_protected_payload
0.01 5.10 0.00 memory_object_lock_page
0.01 5.10 0.00 thread_bootstrap_return
0.01 5.10 0.00 thread_will_wait
0.01 5.10 0.00 vm_page_free_pa
0.01 5.10 0.00 dev_port_lookup
0.01 5.10 0.00 device_read
0.01 5.10 0.00 device_write
0.01 5.11 0.00 ipc_kmsg_copyout
0.01 5.11 0.00 ipc_kmsg_destroy
0.01 5.11 0.00 ipc_mqueue_init
0.01 5.11 0.00 trap_set_segs
0.01 5.11 0.00 check_limit
0.01 5.11 0.00
ipc_object_copyin_from_kernel
0.01 5.11 0.00 ipc_port_nsrequest
0.01 5.11 0.00
ipc_port_set_protected_payload
0.01 5.11 0.00 mach_port_allocate
0.01 5.11 0.00 rdxtree_cleanup
0.01 5.11 0.00 thread_invoke
0.01 5.11 0.00 vm_map_copy_discard
0.01 5.11 0.00 vm_object_collapse.part.6
0.01 5.11 0.00 vm_page_convert
0.01 5.11 0.00 vm_page_grab_fictitious
0.01 5.11 0.00 vm_page_seg_free_to_buddy
0.01 5.11 0.00
_Xmemory_object_data_supply
0.01 5.11 0.00 call_continuation
0.01 5.11 0.00 get_block_size
0.01 5.11 0.00
ipc_kmsg_copyin_from_kernel
0.01 5.11 0.00 ipc_notify_no_senders
0.01 5.11 0.00 ipc_port_lock_mqueue
0.01 5.11 0.00 mach_msg_continue
0.01 5.11 0.00 pmap_clear_modify
0.01 5.11 0.00 trap_from_user
0.01 5.11 0.00 vm_deallocate
0.01 5.11 0.00 vm_page_activate
0.01 5.12 0.00 vm_page_zero_fill
0.01 5.12 0.00 vm_pageout_setup
0.00 5.12 0.00
_Xmemory_object_lock_request
0.00 5.12 0.00 __const_udelay
0.00 5.12 0.00 _vm_object_allocate
0.00 5.12 0.00 block_write
0.00 5.12 0.00 ds_device_write
0.00 5.12 0.00 ds_device_write_reply
0.00 5.12 0.00 ipc_kmsg_clean
0.00 5.12 0.00 ipc_kmsg_get_from_kernel
0.00 5.12 0.00 ipc_port_init
0.00 5.12 0.00 ipc_pset_add
0.00 5.12 0.00 ipc_target_terminate
0.00 5.12 0.00 mach_msg_send_from_kernel
0.00 5.12 0.00
mach_port_request_notification
0.00 5.12 0.00 memory_object_data_supply
0.00 5.12 0.00 pmap_is_modified
0.00 5.12 0.00 pmap_zero_page
0.00 5.12 0.00 rdxtree_node_create
0.00 5.12 0.00 retrieve_thread_self_fast
0.00 5.12 0.00 schedule
0.00 5.12 0.00 vm_map_copyin
0.00 5.12 0.00 vm_map_copyin_page_list
0.00 5.12 0.00 vm_object_remove
0.00 5.12 0.00 vm_object_terminate
0.00 5.12 0.00 vm_page_release_fictitious
0.00 5.12 0.00
_Xmemory_object_change_attributes
0.00 5.12 0.00 _Xvm_copy
0.00 5.12 0.00 _take_trap
0.00 5.12 0.00 block_read
0.00 5.12 0.00 copyoutmsg_fail
0.00 5.12 0.00 do_rdwr
0.00 5.12 0.00 ipc_kobject_set
0.00 5.12 0.00 memory_object_data_request
0.00 5.12 0.00 memory_object_data_unlock
0.00 5.12 0.00 pmap_clear_reference
0.00 5.12 0.00
rdxtree_insert_alloc_common
0.00 5.12 0.00 vm_external_destroy
0.00 5.12 0.00 vm_map_copyin_object
0.00 5.12 0.00 vm_object_coalesce
0.00 5.12 0.00 vm_object_reference
0.00 5.12 0.00 vm_page_lookup_pa
0.00 5.12 0.00 copyout_fail
% the percentage of the total running time of the
time program used by this function.
cumulative a running sum of the number of seconds accounted
seconds for by this function and those listed above it.
self the number of seconds accounted for by this
seconds function alone. This is the major sort for this
listing.
calls the number of times this function was invoked, if
this function is profiled, else blank.
self the average number of milliseconds spent in this
ms/call function per call, if this function is profiled,
else blank.
total the average number of milliseconds spent in this
ms/call function and its descendents per call, if this
function is profiled, else blank.
name the name of the function. This is the minor sort
for this listing. The index shows the location of
the function in the gprof listing. If the index is
in parenthesis it shows where it would appear in
the gprof listing if it were to be printed.
Copyright (C) 2012-2015 Free Software Foundation, Inc.
Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.