On Thu, Mar 09, 2017 at 10:13:10AM +0800, Ye Xiaolong wrote:
On 03/02, Borislav Petkov wrote:
Hi,
On Thu, Mar 02, 2017 at 09:09:34AM +0800, kernel test robot wrote:
FYI, we noticed the following commit:
commit: ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f ("x86: Optimize clear_page()")
url:
https://github.com/0day-ci/linux/commits/Borislav-Petkov/x86-Optimize-clear_page/20170215-193441
in testcase: will-it-scale
with following parameters:
test: poll2
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to
n parallel copies to see if the testcase will scale. It builds both a process
and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Thanks for the report, I was able to reproduce it.
BUT(!) this report is misleading because it talks about will-it-scale,
while your splat happens when you kexec the kernel:
[ 336.340747] LKP: kexec loading...
[ 336.340852]
[ 336.343323] kexec --noefi -l
/tmp/cache/pkg/linux/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/vmlinuz-4.9.0-rc6-00134-ged3ce2a
--initrd=/tmp/cache/initrd-concatenated
[ 336.343758]
[ 337.893471] --append=ip=::::lkp-ivb-d01::dhcp root=/dev/ram0 user=lkp
job=/lkp/scheduled/lkp-ivb-d01/will-it-scale-poll2-performance-debian-x86_64-2016-08-31.cgz-ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f-20170301-28072-1dqjyhl-11.yaml
ARCH=x86_64 kconfig=x86_64-rhel-7.2 branch=linux-devel/devel-hourly-2017022612
commit=ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f
BOOT_IMAGE=/pkg/linux/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/vmlinuz-4.9.0-rc6-00134-ged3ce2a
max_uptime=1500
RESULT_ROOT=/result/will-it-scale/poll2-performance/lkp-ivb-d01/debian-x86_64-2016-08-31.cgz/x86_64-rhel-7.2/gcc-6/ed3ce2a9172457ef7dbaa9f964e63dfde2bdcb5f/11
LKP_SERVER=inn debug apic=debug sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1
softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2
prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_
[ 337.895521]
[ 339.467661] BUG: unable to handle kernel paging request at ffff8803cf2e2008
[ 339.468000] IP: [<ffffffff81061e71>] native_set_pmd+0x1/0x10
...
Maybe Fengguang has an idea what to do here. Perhaps something like adding
markers to the log to denote where the test environment is prepared and
where the actual test starts, then grepping for those and generating the
report based on that...
Thanks for the suggestions. We'll keep improving the reports so that they
are not confusing or misleading.
One possible improvement is to provide "lkp qemu" reproduction steps for
kernel oopses -- they would be much more convenient and safer to follow
than "lkp run", since the latter risks hanging the physical machine.
As for the test description, the dmesg carries markers for the user
space test start/stop points, so the robot can easily tell whether the
oops happens during the test or before/after it. An oops before or after
the test may well (though not always) indicate that it is related not to
the testcase but to the regular kernel boot/reboot/kexec process.
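
A rough sketch of how the robot could classify an oops relative to those
markers. The marker strings and the function name below are made up for
illustration and are not the actual LKP ones; the oops patterns are only a
small sample of what a real dmesg might contain.

```python
import re

# Hypothetical marker strings; the real LKP dmesg markers may differ.
TEST_START = "LKP: test start"
TEST_END = "LKP: test end"

# A few common oops signatures (not an exhaustive list).
OOPS_RE = re.compile(r"BUG: unable to handle kernel|Oops:|Kernel panic")


def classify_oops(dmesg_lines, start_marker=TEST_START, end_marker=TEST_END):
    """Return 'before', 'during' or 'after' for the first oops found,
    relative to the test start/end markers, or None if there is no oops."""
    phase = "before"
    for line in dmesg_lines:
        if start_marker in line:
            phase = "during"
        elif end_marker in line:
            phase = "after"
        elif OOPS_RE.search(line):
            return phase
    return None


if __name__ == "__main__":
    log = [
        "[ 100.000000] LKP: test start",   # hypothetical marker line
        "[ 330.000000] LKP: test end",     # hypothetical marker line
        "[ 336.340747] LKP: kexec loading...",
        "[ 339.467661] BUG: unable to handle kernel paging request at ffff8803cf2e2008",
    ]
    print(classify_oops(log))
```

With a log shaped like the one in this report, the oops lands in the
"after" phase, so the report could say it happened during kexec rather
than attribute it to will-it-scale.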
Thanks,
Fengguang