------- Comment From ma...@de.ibm.com 2016-11-21 05:20 EDT-------
y...@cn.ibm.com,

(In reply to comment #12)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > (In reply to comment #2)
> > > > (In reply to comment #1)

> > > > PV Volume information:
> > > > physical_volumes {
> > > >
> > > >                pv0 {
> > > >                        device = "/dev/sdb5"        # Hint only
> > >
> > > >                pv1 {
> > > >                        device = "/dev/sda"        # Hint only
> > >
> > > This does not look very good, having single path scsi disk devices mentioned
> > > by LVM. With zfcp-attached SCSI disks, LVM must be on top of multipathing.
> > > Could you please double check if your installation with LVM and multipathing
> > > does the correct layering? If not, this would be an independent bug. See
> > > also [1, slide 28 "Multipathing for Disks - LVM on Top"].
>
> ping
>
> maybe this is part of the root causes for the sudden failure

> However, I still don't understand why it had worked before and now no longer.
> What has changed meanwhile to break it?
> Have you used zfcp auto lun scan before but now no longer?
> Or the LVM on multipathing is broken (see above)?

> Actually, we're now in a lot of guessing and desperately need debug data
> from the broken system. Typically we need the output of dbginfo.sh (Ubuntu
> may prefer the output of sosreport).
> Since the system does not boot that's a bit tricky, but maybe the method
> described in the previous paragraph works and you can run dbginfo.sh in
> chroot of the broken root-fs; that would at least give us the persistent
> config on disk (though of course not the dynamic config).

debug data?

What does the output of the following command look like?:
$ sudo dmsetup ls --tree -o device,blkdevname,active,open,rw,uuid

Otherwise, it seems to me we can close this as notabug.
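
As a rough way to spot the layering problem mentioned above, the PV device
hints in the LVM metadata can be classified by name. This is only a sketch:
the function name check_pv_hints is made up for illustration, and it assumes
input formatted like the physical_volumes excerpt in this bug. A hint is just
the device name LVM recorded when the metadata was written, so an sdX result
is a reason to look closer, not proof of wrong layering.

```shell
#!/bin/sh
# Sketch (hypothetical helper): read LVM metadata text on stdin, print each
# PV device hint and a coarse classification based on the device name only.
check_pv_hints() {
    # Extract the quoted value of every 'device = "..."' line.
    sed -n 's/^[[:space:]]*device = "\([^"]*\)".*/\1/p' |
    while read -r dev; do
        case "$dev" in
            /dev/sd*)                echo "$dev single-path-hint" ;;
            /dev/mapper/*|/dev/dm-*) echo "$dev device-mapper" ;;
            *)                       echo "$dev other" ;;
        esac
    done
}
```

For example, in a chroot of the broken root-fs, and assuming the default LVM
backup location, the input could come from
"check_pv_hints < /etc/lvm/backup/ub01-vg".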

> > The "update-initramfs -u" command was never explicitly run after the system
> > was built.
> > The second PV volume was added to VG on 10/26/2016.  However,  it was not
> > until early November that the root FS was extended.
> >
> > Between 10/16/2016 and the date the root fs was extended,  the second PV was
> > always online and active in a VG and LV display after every Reboot.
>
> I don't understand how it would have ever worked without having run
> "update-initramfs -u" after the addition of another PV to the root-fs
> dependencies. Maybe chzdev did some magic; what was its exact output when
> you made the actively added paths persistent with "chzdev zfcp-lun -e
> --online"?

ping

> While zipl does support some cases of device-mapper targets under certain
> circumstances for the "zipl target" (/boot/ with Ubuntu 16.04), it's still
> dangerous to have a multi-PV root-fs _and_ the zipl target being a part of
> the root-fs, i.e. the zipl target not being its own mount point withOUT LVM.
> [1, slide 25 "Multipathing for Disks - Persistent Configuration"]
> http://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.ludd/ludd_c_zipl_lboot.html
> http://www.mail-archive.com/linux-390%40vm.marist.edu/msg62492.html
> (root-fs on LVM in general:
> http://www.mail-archive.com/linux-390@vm.marist.edu/msg69553.html)

NB

> > > REFERENCE
> > >
> > > [1]
> > > http://www-05.ibm.com/de/events/linux-on-z/pdf/day2/4_Steffen_Maier_zfcp-best-practices-2015.pdf

https://bugs.launchpad.net/bugs/1641078

Title:
  System cannot be booted up when root filesystem is on an LVM on two
  disks

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  A root file system on LVM that spans multiple disks cannot be booted.
    
  ---uname output---
  Linux ntc170 4.4.0-38-generic #57-Ubuntu SMP Tue Sep 6 15:47:15 UTC 2016 s390x s390x s390x GNU/Linux
   
  ---Patches Installed---
  n/a
   
  Machine Type = z13 
   
  ---System Hang---
   cannot boot up the system after shutdown or reboot
   
  ---Debugger---
  A debugger is not configured
   
  ---Steps to Reproduce---
   Created the root file system on an LVM logical volume that spans two disks.
   After a shutdown or reboot, the system cannot boot.
   
  Stack trace output:
   no
   
  Oops output:
   no
   
  System Dump Info:
    The system is not configured to capture a system dump.
   
  Device driver error code:
   Begin: Mounting root file system ... Begin: Running /scripts/local-top ...
   lvmetad is not active yet, using direct activation during sysinit
   Couldn't find device with uuid 7PC3sg-i5Dc-iSqq-AvU1-XYv2-M90B-M0kO8V.
   
  - Attach sysctl -a output to the bug.

  More detailed installation description:

  The installation was on FCP SCSI SAN volumes, each with two active
  paths.  Multipath was involved.  The system IPLed fine up to the point
  that we expanded the root filesystem to span volumes.  At boot time,
  the system was unable to locate the second segment of the root
  filesystem.  The error message indicated this was due to lvmetad not
  being active.

  Error message:   
         Begin: Running /scripts/local-block ...
         lvmetad is not active yet, using direct activation during sysinit
         Couldn't find device with uuid 7PC3sg-i5Dc-iSqq-AvU1-XYv2-M90B-M0kO8V
         Failed to find logical volume "ub01-vg/root"
          
  PV Volume information: 
  physical_volumes { 

                 pv0 { 
                         id = "L2qixM-SKkF-rQsp-ddao-gagl-LwKV-7Bw1Dz" 
                         device = "/dev/sdb5"        # Hint only 

                         status = ["ALLOCATABLE"] 
                         flags = [] 
                         dev_size = 208713728        # 99.5225 Gigabytes 
                         pe_start = 2048 
                         pe_count = 25477        # 99.5195 Gigabytes 
                 } 

                 pv1 { 
                         id = "7PC3sg-i5Dc-iSqq-AvU1-XYv2-M90B-M0kO8V" 
                         device = "/dev/sda"        # Hint only 

                         status = ["ALLOCATABLE"] 
                         flags = [] 
                         dev_size = 209715200        # 100 Gigabytes 
                         pe_start = 2048 
                         pe_count = 25599        # 99.9961 Gigabytes 

  
  LV Volume Information: 
  logical_volumes { 

                 root { 
                         id = "qWuZeJ-Libv-DrEs-9b1a-p0QF-2Fj0-qgGsL8" 
                         status = ["READ", "WRITE", "VISIBLE"] 
                         flags = [] 
                         creation_host = "ub01" 
                         creation_time = 1477515033        # 2016-10-26 16:50:33 -0400
                         segment_count = 2 

                         segment1 { 
                                 start_extent = 0 
                                 extent_count = 921        # 3.59766 Gigabytes 

                                 type = "striped" 
                                 stripe_count = 1        # linear 

                                 stripes = [ 
                                         "pv0", 0 
                                 ] 
                         } 
                         segment2 { 
                                 start_extent = 921 
                                 extent_count = 25344        # 99 Gigabytes 

                                 type = "striped" 
                                 stripe_count = 1        # linear 

                                 stripes = [ 
                                         "pv1", 0 
                                 ] 
                         } 
                 } 

  
  Additional testing has been done with CKD volumes and we see the same
  behavior.  Only the UUID of the first volume in the VG can be located at boot,
  and the same messages are displayed for CKD disks, just with a different UUID:
    lvmetad is not active yet, using direct activation during sysinit
    Couldn't find device with uuid xxxxxxxxxxxxxxxxx
  If the root file system has only one segment, on the first volume (CKD or
  SCSI), the system will IPL.  Because of this behavior, I do not believe the
  problem is related to SAN disk or multipath.  I think it is due to the system
  not being able to read the UUID on any PV in the VG other than the IPL disk.
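
Which PVs the root LV depends on can be read off the "stripes" entries in the
LV metadata excerpt above. A minimal sketch, assuming input in the same
excerpt format (lv_segment_pvs is a hypothetical helper name, not an LVM
tool):

```shell
#!/bin/sh
# Sketch (hypothetical helper): read LVM metadata text on stdin and print
# one "segmentN pvM" line for each stripe entry of each LV segment.
lv_segment_pvs() {
    awk '
        # Remember the current segment name, e.g. "segment1 {".
        $1 ~ /^segment[0-9]+$/ && $2 == "{" { seg = $1 }
        # A stripe entry looks like:  "pv0", 0
        $1 ~ /^"pv[0-9]+",$/ {
            pv = $1
            gsub(/[",]/, "", pv)   # strip the quotes and trailing comma
            print seg, pv
        }
    '
}
```

For the excerpt in this report this should list segment1 on pv0 and segment2
on pv1, so both PVs have to be visible at boot before the root LV can be
activated.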
