Hi,

On 06/17/2014 07:24 AM, Ritesh Raj Sarraf wrote:

> Okay!! Let me check in the lab. I will try to reproduce it. In case you
> do not hear back from me, please feel free to ping back.
>
> Meanwhile, can you run sg_inq on the SCSI device ??

Sure:

# sg_inq /dev/sg6
standard INQUIRY:
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
  [AERC=0]  [TrmTsk=0]  NormACA=1  HiSUP=1  Resp_data_format=2
  SCCS=0  ACC=0  TPGS=0  3PC=1  Protect=0  BQue=0
  EncServ=0  MultiP=1 (VS=0)  [MChngr=0]  [ACKREQQ=0]  Addr16=0
  [RelAdr=0]  WBus16=0  Sync=0  Linked=0  [TranDis=0]  CmdQue=1
  [SPI: Clocking=0x0  QAS=0  IUS=0]
    length=117 (0x75)   Peripheral device type: disk
 Vendor identification: NETAPP
 Product identification: LUN
 Product revision level: 811a
 Unit serial number: BWr95?E2aJc9

Besides this, I'm trying to isolate the smallest possible test case that reproduces this behaviour, using an emptied server that only has access to a small test LUN.

Test case 1:

No multipath or anything else on top; operate directly on the LUN (well, on one of its four paths). It's a small 10GB LUN.

/mnt 0-# mkfs.ext4 -E nodiscard /dev/sdf
[...]
/mnt 0-# mkdir discard
/mnt 0-# mount /dev/sdf discard/
/mnt 0-# cd discard/
/mnt/discard 0-# fstrim -v -o 0 -l 128MB ./
./: 0 bytes were trimmed

Ok, that went fine. There's no data yet, so this was expected. Also, no errors in dmesg. Let's create some random data on the LUN, remove the file and fstrim again:

/mnt/discard 0-# dd if=/dev/urandom of=bla bs=1028476 count=128
128+0 records in
128+0 records out
131644928 bytes (132 MB) copied, 14.7496 s, 8.9 MB/s

/mnt/discard 0-# fstrim -v -o 0 -l 256MB ./
./: 117063680 bytes were trimmed

So far so good.

The next test cases will work towards a situation identical to the one in which the issue occurred yesterday: a striped LVM logical volume on top of encryption and multipath. I hope that somewhere in between it will break in a way that gives a clear pointer to where the issue seen yesterday originates.
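
While stacking the layers back up, it may help to compare the discard limits the kernel reports for each block device (sdX, dm-N), to see at which layer discard support disappears. A minimal sketch, assuming the standard sysfs queue attributes discard_max_bytes and discard_granularity; the function name report_discard and its optional root argument are made up for illustration:

```shell
#!/bin/sh
# Print the discard limits of every block device found under a sysfs tree.
# The optional first argument is a hypothetical override of the sysfs root
# (default: /sys), which makes the function easy to try on a fake tree.
report_discard() {
    root="${1:-/sys}"
    for q in "$root"/block/*/queue; do
        # Skip devices (or an unmatched glob) without the attribute.
        [ -r "$q/discard_max_bytes" ] || continue
        dev=$(basename "$(dirname "$q")")
        max=$(cat "$q/discard_max_bytes")
        gran=$(cat "$q/discard_granularity" 2>/dev/null || echo '?')
        # A discard_max_bytes of 0 means the device does not accept discards.
        if [ "$max" -eq 0 ]; then
            state="no discard"
        else
            state="discard ok"
        fi
        printf '%s\tgranularity=%s\tmax_bytes=%s\t%s\n' \
            "$dev" "$gran" "$max" "$state"
    done
}

# Usage: report_discard            (reads /sys)
#        report_discard /some/dir  (reads /some/dir/block/*/queue)
```

A layer that shows "no discard" while the device below it shows "discard ok" would be the first place to look.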

The way we use NetApp with Linux might sound a bit unusual, but it works great in practice:

               xvda in a domU
                     |
                    xen
                     |
             lv (striped, -i 2)
                     |
              lvm volume group
             /                \
            /                  \
       lvm pv                 lvm pv
      dm-crypt               dm-crypt
       mpath1                 mpath2
        ||||                   ||||
      a,b  c,d               e,f  g,h
      ||    ||               ||    ||
   --xxxx---||--------------xxxx---||--- switch
     |  |   ||              |  |   ||
   --|--|--xxxx-------------|--|--xxxx----- switch
     |  |  |  |             |  |  |  |
     |  |  |  |             |  |  |  |
     |  |  |  |             |  |  |  |
     a  b  c  d             e  f  g  h
  NetApp controller 1   NetApp controller 2

Because we don't care about snapshots and other fancy NetApp functionality (sorry 'bout that :) ), we create two multipath devices (each to a LUN on a different one of the two disk controllers in a NetApp device), put encryption on each of them, create an LVM PV on each, add both PVs to one volume group, and carve striped logical volumes out of that. It's even usable on multiple attached servers, as long as you get your locking on metadata operations done right.
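
For anyone wanting to rebuild a comparable stack for testing, the steps above could be sketched roughly as follows. This is not our actual provisioning script: the mapping, VG and LV names and the 5G size are invented, LUKS is an assumption (any dm-crypt setup would do), and --allow-discards is shown only because discards are the topic here. The commands are only printed unless RUN=1 is set, since they destroy data on the devices.

```shell
#!/bin/sh
# Sketch of the multipath -> dm-crypt -> LVM (striped) stack.
# All names below (mpath1/2, crypt1/2, vg_test, lv_test) are hypothetical.

# Print the command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

build_stack() {
    for n in 1 2; do
        # Assumption: LUKS on top of each multipath device.
        run cryptsetup luksFormat "/dev/mapper/mpath$n"
        # --allow-discards lets discard requests pass through dm-crypt.
        run cryptsetup open --allow-discards "/dev/mapper/mpath$n" "crypt$n"
        run pvcreate "/dev/mapper/crypt$n"
    done
    run vgcreate vg_test /dev/mapper/crypt1 /dev/mapper/crypt2
    # -i 2 stripes the logical volume across both physical volumes.
    run lvcreate -i 2 -L 5G -n lv_test vg_test
}

build_stack
```

Without RUN=1 this just lists the eight commands in order, which is handy for reviewing them before touching a real LUN.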

But I have to leave now, will continue later.


--
Hans van Kranenburg - System / Network Engineer
+31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
