Bug#605357: Patch for #605357

Jeroen Dekkers Wed, 15 Dec 2010 12:57:17 -0800

tags 605357 patch
thanks

I can reproduce the segfault in a virtual machine by creating multiple
RAID array members with the same number. This shouldn't happen
normally, so I still wonder how it's possible to hit this bug, but the
backtrace is exactly the same:


#0  0x0000000000407dcb in grub_disk_adjust_range (disk=0x0, 
sector=0x7fffffffdea0, offset=0x7fffffffde98, size=4096) at 
../../kern/disk.c:364
        part = 0x1010
#1  0x0000000000407f1f in grub_disk_read (disk=0x0, sector=0, offset=0, 
size=4096, buf=0x687660) at ../../kern/disk.c:397
        tmp_buf = 0x0
        real_offset = 0
#2  0x000000000042c9cd in grub_raid5_recover (array=0x66d4c0, disknr=0, 
buf=0x67e5d0 "", sector=0, size=4096) at ../../disk/raid5_recover.c:48
        err = 64
        buf2 = 0x687660 ""
        i = 1
#3  0x000000000042c014 in grub_raid_read (disk=0x66c460, sector=0, size=8, 
buf=0x67e5d0 "") at ../../disk/raid.c:400
        read_size = 8
        next_level = 0
        read_sector = 0
        e = 0
        b = 0
        p = 2
        n = 1
        disknr = 0
        array = 0x66d4c0
        err = GRUB_ERR_READ_ERROR
#4  0x000000000040809f in grub_disk_read (disk=0x66c460, sector=0, offset=0, 
size=512, buf=0x7fffffffe190) at ../../kern/disk.c:443
        data = 0x0
        start_sector = 0
        len = 512
        pos = 0
        tmp_buf = 0x67e5d0 ""
        real_offset = 0
#5  0x000000000042e16d in grub_lvm_scan_device (name=0x66d710 "md/0") at 
../../disk/lvm.c:284
        err = GRUB_ERR_NONE
        disk = 0x66c460
        da_offset = 140737488348144
        da_size = 4202832
        mda_offset = 140737488348912
        mda_size = 0
        buf = '\000' <repeats 232 times>, "o\003C", '\000' <repeats 17 times>, 
"#\n\000\000\300\342\377\377\377\177\000\000\027\247B", '\000' <repeats 13 
times>, "j\003C", '\000' <repeats 17 times>, 
"\b\000\000\000\360\342\377\377\377\177\000\000\273\251B", '\000' <repeats 13 
times>, "j\003C", '\000' <repeats 21 times>"\360, 
\343\377\377\377\177\000\000;\...@\000\000\000\000\000\261\002c\000\000\000\000\000q\002c\000\000\000\000\000\000\000\000\000n\001\000\000v\002c",
 '\000' <repeats 69 times>"\260, 
\304f\000\000\000\000\000\020\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000P\266\377\367\377\177\000\000\000\000\000\000\000\000\000\000\320\343\377\377\377\177\000"
        vg_id = "\f\344\377\377\377\177\000\000`\304f", '\000' <repeats 27 
times>
        pv_id = "\340\343\377\377\377\177\000\000\177\215B", '\000' <repeats 13 
times>"\377, \000\000\000\000\000\000\000\240\341\377\377\377\177"
        metadatabuf = 0x0
        p = 0x0
        q = 0x7ffff7bb8e40 ""
        vgname = 0x7fffffffe420 "@\344\377\377\377\177"
        lh = 0x7fffffffe190
        pvh = 0x7fffffffe6f0
        dlocn = 0x0
        mdah = 0x0
        rlocn = 0x7ffff78d284c
        i = 0
        j = 4223242
        vgname_len = 0
        vg = 0xffffe400ba490040
        pv = 0x66c440
#6  0x00000000004072d9 in iterate_disk (disk_name=0x66d710 "md/0") at 
../../kern/device.c:96
        dev = 0x0
        hook = 0x42e0f0 <grub_lvm_scan_device>
        ents = 0x41007fff00e3ff49
#7  0x000000000042b5ba in grub_raid_iterate (hook=0x7fffffffe540) at 
../../disk/raid.c:84
        array = 0x66d4c0
#8  0x00000000004079a8 in grub_disk_dev_iterate (hook=0x7fffffffe540) at 
../../kern/disk.c:212
        p = 0x63afc0
#9  0x0000000000407462 in grub_device_iterate (hook=0x42e0f0 
<grub_lvm_scan_device>) at ../../kern/device.c:168
        ents = 0x63b010
#10 0x000000000042ef82 in grub_mod_init (mod=0x0) at ../../disk/lvm.c:679
No locals.
#11 0x000000000042ef6a in grub_lvm_init () at ../../disk/lvm.c:677
No locals.
#12 0x000000000042f072 in grub_init_all () at grub_probe_init.c:58
No locals.
#13 0x0000000000402e10 in main (argc=2, argv=0x7fffffffe6f8) at 
../../util/grub-probe.c:443
        dev_map = 0x0
        argument = 0x7fffffffe93c "/"

In insert_array() in disk/raid.c, we first check whether we already
have all the devices of the array. After that we check whether the
specific member that we want to add already exists. In both cases we
just print a debug message instead of returning an error. What happens
then is that array->device[new_array->index] gets overwritten and
nr_devs gets incremented, so nr_devs goes up without a new disk being
added. When we try to read the RAID array later on, some disk pointers
are still NULL and we get a segfault when we dereference one of them.
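
To make the failure mode concrete, here is a minimal sketch of the
bookkeeping described above. It is not the GRUB code: struct toy_array,
MAX_DEVS and the toy_* functions are hypothetical names invented for
illustration, and only the three fields discussed above are modelled.
The unchecked insert mimics the old "just log it" behaviour, the checked
one mimics returning an error as the patch does.

```c
#include <stddef.h>

#define MAX_DEVS 4   /* hypothetical fixed size, for illustration only */

/* Hypothetical, heavily simplified stand-in for the array bookkeeping
   in disk/raid.c.  */
struct toy_array
{
  unsigned total_devs;           /* member count per the superblock */
  unsigned nr_devs;              /* members registered so far */
  const char *device[MAX_DEVS];  /* NULL means "slot not filled" */
};

/* Old behaviour: a duplicate is not rejected, so the slot is
   overwritten and the counter still goes up.  The caller must pass
   index < MAX_DEVS.  */
static void
toy_insert_unchecked (struct toy_array *a, unsigned index, const char *disk)
{
  a->device[index] = disk;
  a->nr_devs++;
}

/* Patched behaviour: refuse the member instead of corrupting the
   bookkeeping.  Returns 0 on success, -1 on error.  The caller must
   pass index < MAX_DEVS.  */
static int
toy_insert_checked (struct toy_array *a, unsigned index, const char *disk)
{
  if (a->nr_devs == a->total_devs)
    return -1;                   /* more members than the superblock says */
  if (a->device[index] != NULL)
    return -1;                   /* two members with the same number */
  a->device[index] = disk;
  a->nr_devs++;
  return 0;
}

/* Count the slots a later read would find still NULL.  */
static unsigned
toy_missing_slots (const struct toy_array *a)
{
  unsigned i, missing = 0;

  for (i = 0; i < a->total_devs; i++)
    if (a->device[i] == NULL)
      missing++;
  return missing;
}
```

With a two-member array, inserting two disks that both claim number 1
through toy_insert_unchecked() leaves nr_devs == total_devs while slot 0
is still NULL, which is the state that leads to the disk=0x0 frames in
the backtrace. With toy_insert_checked() the second insert is rejected
and nr_devs stays at 1.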

The attached patch returns an error in both cases, so we at least
don't segfault (which I tested on the virtual machine). I talked with
Julien on IRC, but his segfaults disappeared, so we will probably
never know for sure what really happened.

Regards,

Jeroen Dekkers

--- grub2-1.98+20100804/disk/raid.c~    2010-12-15 18:36:32.000000000 +0100
+++ grub2-1.98+20100804/disk/raid.c     2010-12-15 19:58:53.000000000 +0100
@@ -496,17 +496,18 @@
            the same.  */
 
         if (array->total_devs == array->nr_devs)
-          /* We found more members of the array than the array
-             actually has according to its superblock.  This shouldn't
-             happen normally.  */
-          grub_dprintf ("raid", "array->nr_devs > array->total_devs (%d)?!?",
-                       array->total_devs);
-
+         /* We found more members of the array than the array
+            actually has according to its superblock.  This shouldn't
+            happen normally.  */
+         return grub_error(GRUB_ERR_BAD_DEVICE,
+                           "Found more RAID array members than the superblock says there are");
+           
         if (array->device[new_array->index] != NULL)
           /* We found multiple devices with the same number. Again,
              this shouldn't happen.  */
-          grub_dprintf ("raid", "Found two disks with the number %d?!?",
-                       new_array->number);
+         return grub_error(GRUB_ERR_BAD_DEVICE,
+                           "Found two RAID array members with the same number %d",
+                           new_array->number);
 
         if (new_array->disk_size < array->disk_size)
           array->disk_size = new_array->disk_size;
