Your message dated Mon, 26 May 2014 20:54:28 +1000
with message-id <20140526105428.ga1...@master.debian.org>
and subject line Re: Bug#713994: XFS repair
has caused the Debian Bug report #713994,
regarding xfsprogs: Failure to allocate new space, potentially after an
xfs_grow, to existing files
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)
--
713994: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=713994
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: xfsprogs
Version: 3.1.4
Justification: breaks unrelated software
Severity: critical
Tags: upstream
*** Please type your report below this line ***
As part of my standard virtual machine template for MySQL, I created an LVM
setup as follows:
pvcreate /dev/sdb
vgcreate vg0 /dev/sdb
lvcreate -n mysql -L 10G vg0
lvcreate -n mysqlbinlog -L 5G vg0
lvcreate -n mysqllog -L 5G vg0
Then, after I deploy the template, I resize the LVs and xfs_growfs them,
apt-get upgrade, and restart.
I also frequently (multiple times per week) extend MySQL data volumes on
the fly using
echo 1 > /sys/block/sdb/device/rescane
pvresize /dev/sdb
lvresize -l +100%FREE /dev/vg0/mysql
xfs_growfs /mysql
This has worked hundreds of times. Twice, though, on Debian 6.x (I don't
have enough 7.x boxes in production yet) I've had something happen with
XFS. Once, after an extend, it was unmountable. I assumed this was an
error on my part, created the FS, and moved on.
Today, I got an SQL error on a MySQL replication box:
Error 'The table 'XXXXX' is full' on query. Default database: XXXXXYYYYYY
Normally this would be a configuration issue in MySQL related to a setting
called "innodb_data_file_path". Mine is configured correctly; otherwise
all of my templates would have died years ago. On this particular host, I
extended /mysql (the data volume) to 120G before reboot. Today, when it
hit what appears to be the *original* size:
/dev/mapper/vg0-mysql
125814784 10473148 115341636 9% /mysql
XFS threw allocation errors.
[4051547.986869] ffff880638219200: 58 41 47 46 00 00 00 01 00 00 00 0c 00
05 00 00 XAGF............
[4051547.987636] Filesystem "dm-2": XFS internal error xfs_alloc_read_agf
at line 2293 of file
/build/buildd-linux-2.6_2.6.32-48squeeze1-amd64-qu4MIV/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/xfs/xfs_alloc.c.
Caller 0xffffffffa021fcb0
[4051547.987639]
[4051547.988226] Pid: 3367, comm: mysqld Not tainted 2.6.32-5-amd64 #1
[4051547.988228] Call Trace:
[4051547.988287] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4051547.988298] [<ffffffffa021fc73>] ? xfs_read_agf+0x12e/0x149 [xfs]
[4051547.988307] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4051547.988317] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4051547.988331] [<ffffffffa021fd45>] ? xfs_alloc_pagf_init+0x13/0x34 [xfs]
[4051547.988342] [<ffffffffa022c3f1>] ? xfs_inode_is_filestream+0x24/0x51
[xfs]
[4051547.988353] [<ffffffffa022c67d>] ? xfs_bmap_btalloc+0x25f/0x732 [xfs]
[4051547.988368] [<ffffffffa0228b70>] ? xfs_bmap_search_extents+0x57/0xd1
[xfs]
[4051547.988379] [<ffffffffa022ee83>] ? xfs_bmapi+0x876/0x104d [xfs]
[4051547.988386] [<ffffffff8118eec3>] ? _atomic_dec_and_lock+0x33/0x50
[4051547.988406] [<ffffffffa0257217>] ? xfs_trans_reserve+0xc8/0x19d [xfs]
[4051547.988419] [<ffffffffa024a51f>] ? xfs_iomap_write_direct+0x305/0x3f5
[xfs]
[4051547.988432] [<ffffffffa024a7bc>] ? xfs_iomap+0x1ad/0x285 [xfs]
[4051547.988444] [<ffffffffa025db53>] ? __xfs_get_blocks+0x6c/0x15c [xfs]
[4051547.988450] [<ffffffff81062311>] ? queue_work_on+0x45/0x4f
[4051547.988455] [<ffffffff8111442a>] ? __blockdev_direct_IO+0x575/0xa60
[4051547.988461] [<ffffffff810482ef>] ? finish_task_switch+0x3a/0xaf
[4051547.988472] [<ffffffffa025d983>] ? xfs_vm_direct_IO+0xd4/0x102 [xfs]
[4051547.988483] [<ffffffffa025dc43>] ? xfs_get_blocks_direct+0x0/0x11
[xfs]
[4051547.988494] [<ffffffffa025d791>] ? xfs_end_io_direct+0x0/0x5f [xfs]
[4051547.988499] [<ffffffff810b55d6>] ?
generic_file_direct_write+0xdd/0x152
[4051547.988510] [<ffffffffa0264532>] ? xfs_write+0x44f/0x6ea [xfs]
[4051547.988523] [<ffffffffa02442ca>] ? xfs_iunlock+0x31/0x7c [xfs]
[4051547.988528] [<ffffffff810ef21e>] ? do_sync_write+0xce/0x113
[4051547.988532] [<ffffffff810652b2>] ? autoremove_wake_function+0x0/0x2e
[4051547.988535] [<ffffffff810482ef>] ? finish_task_switch+0x3a/0xaf
[4051547.988538] [<ffffffff810efb70>] ? vfs_write+0xa9/0x102
[4051547.988541] [<ffffffff810efc20>] ? sys_pwrite64+0x57/0x77
[4051547.988545] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[4053278.621482] ffff880638219200: 58 41 47 46 00 00 00 01 00 00 00 0c 00
05 00 00 XAGF............
[4053278.621761] Filesystem "dm-2": XFS internal error xfs_alloc_read_agf
at line 2293 of file
/build/buildd-linux-2.6_2.6.32-48squeeze1-amd64-qu4MIV/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/xfs/xfs_alloc.c.
Caller 0xffffffffa021fcb0
[4053278.621764]
[4053278.622352] Pid: 3367, comm: mysqld Not tainted 2.6.32-5-amd64 #1
[4053278.622354] Call Trace:
[4053278.622381] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4053278.622392] [<ffffffffa021fc73>] ? xfs_read_agf+0x12e/0x149 [xfs]
[4053278.622401] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4053278.622411] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4053278.622421] [<ffffffffa021fd45>] ? xfs_alloc_pagf_init+0x13/0x34 [xfs]
[4053278.622432] [<ffffffffa022c3f1>] ? xfs_inode_is_filestream+0x24/0x51
[xfs]
[4053278.622442] [<ffffffffa022c67d>] ? xfs_bmap_btalloc+0x25f/0x732 [xfs]
[4053278.622453] [<ffffffffa0228b70>] ? xfs_bmap_search_extents+0x57/0xd1
[xfs]
[4053278.622465] [<ffffffffa022ee83>] ? xfs_bmapi+0x876/0x104d [xfs]
[4053278.622486] [<ffffffff8118eec3>] ? _atomic_dec_and_lock+0x33/0x50
[4053278.622501] [<ffffffffa0257217>] ? xfs_trans_reserve+0xc8/0x19d [xfs]
[4053278.622514] [<ffffffffa024a51f>] ? xfs_iomap_write_direct+0x305/0x3f5
[xfs]
[4053278.622527] [<ffffffffa024a7bc>] ? xfs_iomap+0x1ad/0x285 [xfs]
[4053278.622538] [<ffffffffa025db53>] ? __xfs_get_blocks+0x6c/0x15c [xfs]
[4053278.622551] [<ffffffff8111442a>] ? __blockdev_direct_IO+0x575/0xa60
[4053278.622563] [<ffffffffa025d983>] ? xfs_vm_direct_IO+0xd4/0x102 [xfs]
[4053278.622574] [<ffffffffa025dc43>] ? xfs_get_blocks_direct+0x0/0x11
[xfs]
[4053278.622585] [<ffffffffa025d791>] ? xfs_end_io_direct+0x0/0x5f [xfs]
[4053278.622593] [<ffffffff810b55d6>] ?
generic_file_direct_write+0xdd/0x152
[4053278.622605] [<ffffffffa0264532>] ? xfs_write+0x44f/0x6ea [xfs]
[4053278.622618] [<ffffffffa02442ca>] ? xfs_iunlock+0x31/0x7c [xfs]
[4053278.622625] [<ffffffff810ef21e>] ? do_sync_write+0xce/0x113
[4053278.622637] [<ffffffff810652b2>] ? autoremove_wake_function+0x0/0x2e
[4053278.622646] [<ffffffff810482ef>] ? finish_task_switch+0x3a/0xaf
[4053278.622650] [<ffffffff810efb70>] ? vfs_write+0xa9/0x102
[4053278.622652] [<ffffffff810efc20>] ? sys_pwrite64+0x57/0x77
[4053278.622660] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[4055487.227132] ffff880638219200: 58 41 47 46 00 00 00 01 00 00 00 0c 00
05 00 00 XAGF............
[4055487.227678] Filesystem "dm-2": XFS internal error xfs_alloc_read_agf
at line 2293 of file
/build/buildd-linux-2.6_2.6.32-48squeeze1-amd64-qu4MIV/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/xfs/xfs_alloc.c.
Caller 0xffffffffa021fcb0
[4055487.227681]
[4055487.228267] Pid: 32572, comm: mysqld Not tainted 2.6.32-5-amd64 #1
[4055487.228269] Call Trace:
[4055487.228292] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4055487.228302] [<ffffffffa021fc73>] ? xfs_read_agf+0x12e/0x149 [xfs]
[4055487.228311] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4055487.228321] [<ffffffffa021fcb0>] ? xfs_alloc_read_agf+0x22/0xa4 [xfs]
[4055487.228330] [<ffffffffa021fd45>] ? xfs_alloc_pagf_init+0x13/0x34 [xfs]
[4055487.228341] [<ffffffffa022c3f1>] ? xfs_inode_is_filestream+0x24/0x51
[xfs]
[4055487.228352] [<ffffffffa022c67d>] ? xfs_bmap_btalloc+0x25f/0x732 [xfs]
[4055487.228363] [<ffffffffa0228b70>] ? xfs_bmap_search_extents+0x57/0xd1
[xfs]
[4055487.228374] [<ffffffffa022ee83>] ? xfs_bmapi+0x876/0x104d [xfs]
[4055487.228388] [<ffffffffa0257217>] ? xfs_trans_reserve+0xc8/0x19d [xfs]
[4055487.228400] [<ffffffffa024a51f>] ? xfs_iomap_write_direct+0x305/0x3f5
[xfs]
[4055487.228413] [<ffffffffa024a7bc>] ? xfs_iomap+0x1ad/0x285 [xfs]
[4055487.228425] [<ffffffffa025db53>] ? __xfs_get_blocks+0x6c/0x15c [xfs]
[4055487.228431] [<ffffffff8111442a>] ? __blockdev_direct_IO+0x575/0xa60
[4055487.228442] [<ffffffffa025d983>] ? xfs_vm_direct_IO+0xd4/0x102 [xfs]
[4055487.228453] [<ffffffffa025dc43>] ? xfs_get_blocks_direct+0x0/0x11
[xfs]
[4055487.228464] [<ffffffffa025d791>] ? xfs_end_io_direct+0x0/0x5f [xfs]
[4055487.228468] [<ffffffff810b55d6>] ?
generic_file_direct_write+0xdd/0x152
[4055487.228479] [<ffffffffa0264532>] ? xfs_write+0x44f/0x6ea [xfs]
[4055487.228492] [<ffffffffa02442ca>] ? xfs_iunlock+0x31/0x7c [xfs]
[4055487.228496] [<ffffffff810ef21e>] ? do_sync_write+0xce/0x113
[4055487.228500] [<ffffffff810652b2>] ? autoremove_wake_function+0x0/0x2e
[4055487.228503] [<ffffffff810efb70>] ? vfs_write+0xa9/0x102
[4055487.228505] [<ffffffff810efc20>] ? sys_pwrite64+0x57/0x77
[4055487.228509] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
Throwing a dd at the FS works fine:
dd if=/dev/urandom of=/mysql/data/somefile.out bs=1M count=512
Two things might be going on here that are special
1) I'm using a pretty customized MySQL configuration, including a setting
called O_DIRECT, which turns on direct IO for InnoDB flushes. Therefore,
the code path for some InnoDB writes is going to be different from what dd
does.
2) This is SAN-backed storage; it shouldn't matter, but it does introduce
IO latency.
I've been trying to find a way to fix this "hot"; these are production
systems. Luckily it was just a read-only replica. If this had happened to
the master, there would probably have been data corruption (hence the sev 1
ticket). Couldn't find a hot fix, so I'm going to unmount/repair and then
reboot. I'll keep track of what happens so I can attach it to this report
later if possible.
Thanks!
Wes
-- System Information:
Debian Release: 6.0.7
APT prefers oldstable-updates
APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.32-5-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages xfsprogs depends on:
ii libblkid1 2.17.2-9 block device id library
ii libc6 2.11.3-4 Embedded GNU C Library: Shared
lib
ii libreadline5 5.2-7 GNU readline and history
libraries
ii libuuid1 2.17.2-9 Universally Unique ID library
xfsprogs recommends no packages.
Versions of packages xfsprogs suggests:
pn acl <none> (no description available)
pn attr <none> (no description available)
pn quota <none> (no description available)
pn xfsdump <none> (no description available)
-- no debconf information
--- End Message ---
--- Begin Message ---
Version: 3.1.4
On Tue, 2013-06-25 08:47:53 +1000, Dave Chinner wrote:
> On Mon, Jun 24, 2013 at 12:43:55PM -0400, Wes Deviers wrote:
>> Update:
>>
>> xfs_repair has fixed the problem:
>>
>> root@ashdbtrk-s008-02:/home/wdeviers# xfs_check /dev/vg0/mysql
>> agf_freeblks 327676, counted 655100 in ag 4
>> agf_longest 327676, counted 655100 in ag 4
>> block 12/1 expected type unknown got btbno
>> block 12/4 expected type unknown got free2
>> ...
>> block 12/45268 expected type unknown got free2
>> block 12/45269 expected type unknown got free2
>> block 12/45270 expected type unknown got free2
>> block 12/45271 expected type unknown got free2
>> block 12/45272 expected type unknown got free2
>> block 12/45273 expected type unknown got free2
>> block 12/45274 expected type unknown got free2
>> block 12/45275 expected type unknown got free2
>>
>> root@ashdbtrk-s008-02:/home/wdeviers# xfs_repair /dev/vg0/mysql
>> Phase 1 - find and verify superblock...
>> Phase 2 - using internal log
>> - zero log...
>> - scan filesystem freespace and inode maps...
>> primary/secondary superblock 4 conflict - AG superblock geometry info
>> conflicts with filesystem geometry
>> bad sequence # 12 for agf 4
>> bad length 327680 for agf 4, should be 655104
>> bad sequence # 12 for agi 4
>> bad length # 327680 for agi 4, should be 655104
>> reset bad sb for ag 4
>> reset bad agf for ag 4
>> reset bad agi for ag 4
>> sb_fdblocks 28836636, counted 28509212
>
> The AG headers for AG #4 have been overwritten with header from a
> different AG# (12 instead of 4) that has a different length (327680
> instead of 655104). The sequence number mismatch is why the runtime
> errors are being thrown
>
> However, the superblock for AG 4 has the wrong geometry and that's
> indicative of it belonging to some other filesystem. i.e. a
> misdirected write. I'd be taking a careful look at your storage,
> because misdirected writes are evil and *extremely hard* to diagnose
> without specialised equipment...
>
> Anyway, you're running an old, old kernel (2.6.32-5-amd64) and
> xfsprogs, so unless you can reproduce this reliably on a couple of
> different types of hardware with a more recent kernel and xfsprogs,
> we aren't really going to be able to do anything about your problem.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com
I'm closing this bug report after 11 months since Dave's response.
Please feel free to reopen when you have the information requested by
Dave.
signature.asc
Description: Digital signature
--- End Message ---