Tuesday, April 7, 2015

Btrfs is still not very production-ready

We have a few servers running not-that-important virtual machines on Btrfs. We decided to use it to get the most out of our not-that-robust hard drives of different sizes in a JBOD.

We do not care about losing those VMs, but we don't want to lose them all at the same time. So the idea was to use Btrfs over LVM on those drives, so that if one drive failed, Btrfs would continue to operate. In my initial testing, Btrfs was the only filesystem that survived and kept operating on a volume with missing parts in the middle (of course it would be a disaster: RO mode, and garbage in place of the disks of the affected VMs, but at least some of them would survive to be migrated to other hosts).

Anyway, one host hit an OOM (yeah, save on everything!), and a bad one (it wiped out OVS, neutron, libvirt and ssh). After the reboot it got stuck during boot, at mount time, with these errors:

btrfs: device fsid 0520e52d-7681-4156-9061-388e374c4e16 devid 1 transid 407769 /dev/mapper/host-nova
parent transid verify failed on 471304036352 wanted 407770 found 407769
parent transid verify failed on 471304036352 wanted 407770 found 407769
btrfs: failed to read log tree
btrfs: open_ctree failed

btrfsck complains:
parent transid verify failed on 471304036352 wanted 407770 found 407769
parent transid verify failed on 471304036352 wanted 407770 found 407769
parent transid verify failed on 471304036352 wanted 407770 found 407769
btrfsck: disk-io.c:439: find_and_setup_log_root: Assertion `!(!log_root->node)' failed.
Aborted
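
As far as I understand the message, each "parent transid verify failed" line names a block address, the transaction id (generation) the parent node expected, and the generation actually found on disk. The sketch below pulls those three numbers out of a saved log and counts the repeats. The function name and the regular expression are illustrative assumptions written against the lines quoted above, not an official log format.

```python
import re
from collections import Counter

# Pattern derived from the log lines quoted in this post (an assumption,
# not a documented btrfs message format).
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def summarize_transid_errors(log_text):
    """Return {(block, wanted, found): count} for every matching log line."""
    counts = Counter()
    for match in TRANSID_RE.finditer(log_text):
        block, wanted, found = (int(group) for group in match.groups())
        counts[(block, wanted, found)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = (
        "parent transid verify failed on 471304036352 wanted 407770 found 407769\n"
        "parent transid verify failed on 471304036352 wanted 407770 found 407769\n"
        "btrfs: failed to read log tree\n"
    )
    for (block, wanted, found), seen in summarize_transid_errors(sample).items():
        # wanted - found is how many generations the on-disk block lags behind.
        print(f"block {block}: lags by {wanted - found} generation(s), seen {seen}x")
```

Here every line points at the same block and the same one-generation gap, which suggests a single stale block rather than widespread damage.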

NNNice!

But I quickly found an article about this message, and it mentions a 'new version of userspace tools'. We were running Ubuntu 12.04, so I upgraded to the btrfs-tools package from trusty. The new version of btrfsck can do more, but while running it consumed 18 GB of memory, just for a 2 TB volume. Hard to imagine how that would go on a machine with 4-8 GB...


btrfsck --repair /dev/mapper/host-nova
enabling repair mode
parent transid verify failed on 471304036352 wanted 407770 found 407769
parent transid verify failed on 471304036352 wanted 407770 found 407769
parent transid verify failed on 471304036352 wanted 407770 found 407769
parent transid verify failed on 471304036352 wanted 407770 found 407769
Ignoring transid failure
Checking filesystem on /dev/mapper/host-nova
UUID: 0520e52d-7681-4156-9061-388e374c4e16
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 5 inode 407 errors 80, file extent overlap
found 214360836421 bytes used err is 1
total csum bytes: 0
total tree bytes: 10665472000
total fs tree bytes: 5452877824
total extent tree bytes: 5212286976
btree space waste bytes: 843886520
file data blocks allocated: 892593057792
 referenced 890681024512
Btrfs v3.12
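
The raw byte counts in that summary are hard to read at a glance. Below is a small sketch that converts them to binary units. The regular expression is an assumption based only on the output pasted above, and the helper names are made up for illustration.

```python
import re

# Matches the "<label>: <number>" lines of the summary pasted above.
# This shape is an assumption taken from that output, not a documented format.
BYTES_RE = re.compile(
    r"^(total [a-z ]+ bytes|btree space waste bytes|file data blocks allocated): (\d+)$",
    re.MULTILINE,
)


def human(n):
    """Format a byte count using binary units (KiB, MiB, GiB, TiB)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024


def parse_summary(report):
    """Return {label: bytes} for every counter line found in the report."""
    return {m.group(1): int(m.group(2)) for m in BYTES_RE.finditer(report)}


if __name__ == "__main__":
    report = (
        "total csum bytes: 0\n"
        "total tree bytes: 10665472000\n"
        "total fs tree bytes: 5452877824\n"
        "total extent tree bytes: 5212286976\n"
        "btree space waste bytes: 843886520\n"
        "file data blocks allocated: 892593057792\n"
    )
    for label, value in parse_summary(report).items():
        print(f"{label}: {human(value)}")
```

Read this way, "total tree bytes" comes to roughly 9.9 GiB of metadata trees.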

More work done, but no help: it still would not mount, with the same complaint.


As a last resort I tried btrfs-zero-log, as the article suggests. And it helped! The volume mounted and has been operating normally ever since.
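
The last-resort sequence can be sketched as a small dry-run script. This is only an illustration under assumptions: taking a btrfs-image metadata dump before zeroing the log is a precaution added here, not the order actually used above, and the image path and mount point are hypothetical. Zeroing the log throws away whatever the log tree held (recent fsync'ed writes not yet committed), so it really is a last resort.

```python
import shlex
import subprocess


def zero_log_plan(device, image_path, mount_point):
    """Build the command sequence as argument lists; nothing is executed here."""
    return [
        # Save a metadata image first so the broken state can still be reported.
        ["btrfs-image", device, image_path],
        # Discard the corrupted log tree (standalone tool in btrfs-progs 3.12;
        # newer releases provide the same action as "btrfs rescue zero-log").
        ["btrfs-zero-log", device],
        # Try mounting again.
        ["mount", device, mount_point],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print("+ " + " ".join(shlex.quote(arg) for arg in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical image path and mount point, for illustration only.
    plan = zero_log_plan("/dev/mapper/host-nova", "/root/host-nova.img", "/mnt")
    run_plan(plan)  # dry run by default: prints the commands, changes nothing
```

Keeping the default as a dry run means the commands can be reviewed before anything touches the device.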

But I am still worried about this behavior. Why was the log corrupted? Can a mere OOM really cause this much trouble? And why is the error reporting so cryptic?

Meanwhile I was able to save an image (btrfs-image, just 800 MB of metadata for a 2 TB filesystem), and I will report this to the kernel bug tracker.