You have been subscribed to a public bug:

== Comment: #0 - PAVITHRA R. PRAKASH <pavra...@in.ibm.com> - 2017-03-07 
05:00:29 ==
---Problem Description---

Ubuntu 17.04: dump is not captured in remote host when kdump over ssh is
configured on firestone.

---Steps to Reproduce---

1. Configure kdump.
2. Check whether kdump is operational using "# kdump-config show".
3. Install "kernel-debuginfo" and "kernel-debuginfo-common" rpms.
4. Set up a passwordless ssh connection; generate an RSA key.
# ssh-keygen -t rsa
5. Verify that id_rsa and id_rsa.pub are created under /root/.ssh/
6. Edit /etc/default/kdump-tools and add the entries below.
SSH="ubuntu@9.114.15.239"
SSH_KEY=/root/.ssh/id_rsa
7. Propagate RSA key.
# kdump-config propagate
8. Restart kdump service.
# kdump-config load
9. Trigger a crash using the commands below.
# echo "1" > /proc/sys/kernel/sysrq
# echo "c" > /proc/sysrq-trigger
10. Verify dump is available in remote server in configured path.
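
A pre-flight check between steps 8 and 9 can rule out key problems before the crash is triggered. The following is a minimal sketch, not part of kdump-tools: check_remote_login is a hypothetical helper name, and it assumes the SSH and SSH_KEY values from step 6. It relies only on the standard ssh options -i, BatchMode and ConnectTimeout.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of kdump-tools.
# Confirms that a non-interactive (key-only) login works.

check_remote_login() {
    # $1: remote target (the SSH value), $2: private key path (SSH_KEY)
    remote="$1"
    key="$2"
    if [ ! -r "$key" ]; then
        echo "key not readable: $key"
        return 1
    fi
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ssh -i "$key" -o BatchMode=yes -o ConnectTimeout=10 "$remote" true; then
        echo "passwordless login OK: $remote"
    else
        echo "passwordless login FAILED: $remote"
        return 1
    fi
}
```

Possible usage, assuming the settings file from step 6 is sourced first:
# . /etc/default/kdump-tools
# check_remote_login "$SSH" "$SSH_KEY"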

Machine details
===========

$ ipmitool -I lanplus -H  9.47.70.3 -U ADMIN -P admin sol activate

$ ssh ubuntu@9.47.70.29

PW: shriya101


Attaching logs

== Comment: #1 - PAVITHRA R. PRAKASH <pavra...@in.ibm.com> - 2017-03-07
05:01:42 ==


== Comment: #5 - PAVITHRA R. PRAKASH <pavra...@in.ibm.com> - 2017-03-07 
23:19:46 ==
Hi, 

Attaching the logs.

Network info:

root@ltc-firep3:~# hwinfo --network
36: None 00.0: 10700 Loopback                                   
  [Created at net.126]
  Unique ID: ZsBS.GQNx7L4uPNA
  SysFS ID: /class/net/lo
  Hardware Class: network interface
  Model: "Loopback network interface"
  Device File: lo
  Link detected: yes
  Config Status: cfg=new, avail=yes, need=no, active=unknown

37: None 00.0: 10701 Ethernet
  [Created at net.126]
  Unique ID: 2lHw.ndpeucax6V1
  Parent ID: mIXc.aXC4wIvegH8
  SysFS ID: /class/net/enP33p3s0f2
  SysFS Device Link: 
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.2
  Hardware Class: network interface
  Model: "Ethernet network interface"
  Driver: "tg3"
  Driver Modules: "tg3"
  Device File: enP33p3s0f2
  HW Address: 98:be:94:03:18:4a
  Permanent HW Address: 98:be:94:03:18:4a
  Link detected: no
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #15 (Ethernet controller)

38: None 00.0: 10701 Ethernet
  [Created at net.126]
  Unique ID: 7Onn.ndpeucax6V1
  Parent ID: sx0U.aXC4wIvegH8
  SysFS ID: /class/net/enP33p3s0f0
  SysFS Device Link: 
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.0
  Hardware Class: network interface
  Model: "Ethernet network interface"
  Driver: "tg3"
  Driver Modules: "tg3"
  Device File: enP33p3s0f0
  HW Address: 98:be:94:03:18:48
  Permanent HW Address: 98:be:94:03:18:48
  Link detected: yes
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #16 (Ethernet controller)

39: None 00.0: 10701 Ethernet
  [Created at net.126]
  Unique ID: VwX_.ndpeucax6V1
  Parent ID: DUng.aXC4wIvegH8
  SysFS ID: /class/net/enP33p3s0f3
  SysFS Device Link: 
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.3
  Hardware Class: network interface
  Model: "Ethernet network interface"
  Driver: "tg3"
  Driver Modules: "tg3"
  Device File: enP33p3s0f3
  HW Address: 98:be:94:03:18:4b
  Permanent HW Address: 98:be:94:03:18:4b
  Link detected: no
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #25 (Ethernet controller)

40: None 00.0: 10701 Ethernet
  [Created at net.126]
  Unique ID: bZ1s.ndpeucax6V1
  Parent ID: J7HY.aXC4wIvegH8
  SysFS ID: /class/net/enP33p3s0f1
  SysFS Device Link: 
/devices/pci0021:00/0021:00:00.0/0021:01:00.0/0021:02:01.0/0021:03:00.1
  Hardware Class: network interface
  Model: "Ethernet network interface"
  Driver: "tg3"
  Driver Modules: "tg3"
  Device File: enP33p3s0f1
  HW Address: 98:be:94:03:18:49
  Permanent HW Address: 98:be:94:03:18:49
  Link detected: no
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #4 (Ethernet controller)
root@ltc-firep3:~# 


Thanks,
Pavithra

== Comment: #6 - PAVITHRA R. PRAKASH <pavra...@in.ibm.com> - 2017-03-07
23:20:47 ==


== Comment: #7 - PAVITHRA R. PRAKASH <pavra...@in.ibm.com> - 2017-03-07 
23:21:27 ==


== Comment: #8 - Urvashi Jawere <urjaw...@in.ibm.com> - 2017-03-08 02:48:15 ==
I can see some errors in syslog:

auxiliary
Mar  7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed for 
question 114.15.239:/home/ubuntu/test IN SOA: failed-auxiliary
Mar  7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed for 
question 9.114.15.239:/home/ubuntu/test IN DS: failed-auxiliary
Mar  7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed for 
question 9.114.15.239:/home/ubuntu/test IN SOA: failed-auxiliary
Mar  7 04:57:44 ltc-firep3 systemd-resolved[3486]: DNSSEC validation failed for 
question 9.114.15.239:/home/ubuntu/test IN A: failed-auxiliary
Mar  7 04:57:44 ltc-firep3 systemd-resolved[3486]: Server 9.12.16.2 does not 
support DNSSEC, downgrading to non-DNSSEC mode.
Mar  7 04:57:44 ltc-firep3 kdump-config: /root/.ssh/id_rsa failed to be sent to 
ubuntu@9.114.15.239:/home/ubuntu/test
Mar  7 04:58:04 ltc-firep3 systemd[1]: Reloading.
Mar  7 04:59:15 ltc-firep3 systemd[1]: Reloading.
Mar  7 04:59:16 ltc-firep3 kdump-config: propagated ssh key /root/.ssh/id_rsa 
to server ubuntu@9.114.15.239
.
.
.

Mar  7 05:06:55 ltc-firep3 systemd[1]: Started Accounts Service.
Mar  7 05:06:56 ltc-firep3 kdump-tools[3498]: Starting kdump-tools: Modified 
cmdline:root=UUID=1e76cfd5-988c-46f4-bdc4-39fe1ed01152 ro quiet splash irqpoll 
nr_cpus=1 nousb systemd.unit=kdump-tools.service ata_piix.prefer_ms_hyperv=0 
elfcorehdr=155136K
Mar  7 05:06:57 ltc-firep3 kdump-tools[3498]:  * loaded kdump kernel
Mar  7 05:06:57 ltc-firep3 kdump-tools: /sbin/kexec -p 
--command-line="root=UUID=1e76cfd5-988c-46f4-bdc4-39fe1ed01152 ro quiet splash 
irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service 
ata_piix.prefer_ms_hyperv=0" --initrd=/var/lib/kdump/initrd.img 
/var/lib/kdump/vmlinuz
Mar  7 05:06:57 ltc-firep3 kdump-tools: loaded kdump kernel
Mar  7 05:06:57 ltc-firep3 systemd[1]: Started Kernel crash dump capture 
service.
Mar  7 05:06:57 ltc-firep3 apport[3584]: ERROR: Cannot create report: [Errno 
17] File exists: '/var/crash/linux-image-4.10.0-9-generic-201703060521.crash'
Mar  7 05:06:57 ltc-firep3 apport[3584]:    ...done.

== Comment: #18 - Hari Krishna Bathini <hbath...@in.ibm.com> - 2017-03-28 
06:55:20 ==
It looks like the tg3 module was not needed after all. The interesting thing,
though, is that even after enP34p1s0f0 is up (ifup) and network-online.target
is reached, the network was not really active. It took about 30 seconds after
reaching network-online.target for the network to become active, even on a
normal boot. Adding this wait time to the kdump script, before saving the dump,
ensured that the vmcore is captured successfully. Attaching the log for the same.

Not sure why enP34p1s0f0 is taking that long to configure/initialize. Even so,
this delay should be part of ifup/network-online.target if it is inevitable,
so that the network is pingable once network-online.target is reached.
 
Thanks
Hari

== Comment: #19 - Hari Krishna Bathini <hbath...@in.ibm.com> - 2017-03-28 
07:01:52 ==
The workaround snippet adding a delay to the kdump script:


--- kdump-config.orig   2017-03-28 03:35:17.753542107 -0500
+++ kdump-config        2017-03-28 06:59:22.887576623 -0500
@@ -761,6 +761,7 @@
        KDUMP_DMESGFILE="$KDUMP_STAMPDIR/dmesg.$KDUMP_STAMP"
        ERROR=0
 
+       sleep 30
        ssh -i $KDUMP_SSH_KEY $KDUMP_REMOTE_HOST mkdir -p $KDUMP_STAMPDIR
        ERROR=$?
        # If remote connections fails, no need to continue

---

Thanks
Hari
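
A fixed 30-second sleep costs the full delay even when the network comes up sooner, and may be too short on slower links. As a rough alternative sketch (not the fix adopted in kdump-config), a polling loop could retry a probe command until it succeeds or a timeout expires. wait_for_remote is a hypothetical helper name; the probe shown in the usage note reuses KDUMP_SSH_KEY and KDUMP_REMOTE_HOST from the diff above.

```shell
#!/bin/sh
# Hypothetical helper; wait_for_remote is not part of kdump-config.
# Retries a probe command once per second until it succeeds or the
# timeout runs out. The timeout counts the one-second sleeps only,
# not the time spent inside the probe itself.

wait_for_remote() {
    # $1: timeout in seconds; remaining arguments: probe command
    timeout="$1"
    shift
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

In place of the "sleep 30" line, this might be called as:
wait_for_remote 60 ssh -i "$KDUMP_SSH_KEY" -o BatchMode=yes -o ConnectTimeout=5 "$KDUMP_REMOTE_HOST" true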

== Comment: #20 - PAVITHRA R. PRAKASH <pavra...@in.ibm.com> - 2017-03-30 
01:33:56 ==
(In reply to comment #19)
> The workaround snippet adding delay in kdump script:
> 
> 
> --- kdump-config.orig 2017-03-28 03:35:17.753542107 -0500
> +++ kdump-config      2017-03-28 06:59:22.887576623 -0500
> @@ -761,6 +761,7 @@
>       KDUMP_DMESGFILE="$KDUMP_STAMPDIR/dmesg.$KDUMP_STAMP"
>       ERROR=0
>  
> +     sleep 30
>       ssh -i $KDUMP_SSH_KEY $KDUMP_REMOTE_HOST mkdir -p $KDUMP_STAMPDIR
>       ERROR=$?
>       # If remote connections fails, no need to continue
> 
> ---
> 
> Thanks
> Hari

With the above workaround, the dump was captured successfully on the remote host.

Thanks,
Pavithra

== Comment: #22 - Hari Krishna Bathini <hbath...@in.ibm.com> - 2017-04-10 
22:14:27 ==
(In reply to comment #18)
> Created attachment 117088 [details]
> Console log of successful dump capture after adding a time delay of 'sleep
> 30'
> 
> Looks like tg3 module was not needed after all. Interesting thing though is
> even after enP34p1s0f0 is up (ifup) and network.online target is reached,
> network was not really active. It took about 30 seconds, after reaching 
> network.online target, for the network to be active, even on a normal boot.
> Adding this wait time in kdump script, before saving dump, ensured that
> vmcore is captured successful. Attaching the log for the same..
> 
> Not sure why enP34p1s0f0 is taking that long to configure/initialize. Even
> so,
> this delay should be part of ifup/network-online.target if it is inevitable,
> so that network is pingable after network-online.target

Hi Canonical,

Since this falls outside the realm of kdump, should we add a NET_WAIT_TIME field
to the /etc/default/kdump-tools file that defaults to 0 but can be changed when
the user sees timing troubles?

Thanks
Hari
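
To make the proposal concrete, here is a minimal sketch of how such a setting might be consumed. NET_WAIT_TIME is only the field proposed above and is not an existing kdump-tools option; net_wait is a hypothetical function name. The sketch assumes the value is a whole number of seconds and treats an unset value as 0.

```shell
#!/bin/sh
# Hypothetical sketch of the proposal; NET_WAIT_TIME is not an existing
# kdump-tools setting, and net_wait is an illustrative name.

net_wait() {
    # Fall back to 0 (no wait) when NET_WAIT_TIME is unset or empty.
    wait_time="${NET_WAIT_TIME:-0}"
    # Reject anything that is not a plain non-negative integer.
    case "$wait_time" in
        *[!0-9]*)
            echo "invalid NET_WAIT_TIME: $wait_time" >&2
            return 1
            ;;
    esac
    if [ "$wait_time" -gt 0 ]; then
        echo "waiting ${wait_time}s for the network"
        sleep "$wait_time"
    fi
}
```

With NET_WAIT_TIME=30 in /etc/default/kdump-tools, calling net_wait before the first ssh command would behave like the "sleep 30" workaround, while the default of 0 leaves current behaviour unchanged.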

** Affects: makedumpfile (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-152306 severity-high 
targetmilestone-inin1704
-- 
Ubuntu 17.04: dump is not captured in remote host when kdump over ssh is 
configured on firestone.
https://bugs.launchpad.net/bugs/1681909
You received this bug notification because you are a member of Kernel Packages, 
which is subscribed to makedumpfile in Ubuntu.

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
