
As the title suggests, mdadm keeps marking a drive as "Removed" (as shown by mdadm --detail), and I was hoping to get suggestions as to why that might happen.

I wanted to fsck the drives, but I got the following error:

$ fsck /dev/sda1
fsck from util-linux 2.20.1
fsck: fsck.linux_raid_member: not found
fsck: error 2 while executing fsck.linux_raid_member for /dev/sda1

I've since learned that an internal bitmap would save me from having to --add the third drive back and would avoid the lengthy resync. However, I assume the third disk has to be added back first for the bitmap to be of any use. Any other suggestions on how to avoid a costly resync would be appreciated. This RAID is used for media serving, so it is a high-read, low-write application.


Update: At the request of MadHatter, here's the output from /proc/mdstat (the RAID is in the process of rebuilding).

Personalities : [raid6] [raid5] [raid4]
md1 : active raid5 sdc1[3] sda1[2] sdb1[1]
  3907023872 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
  [=====>...............]  recovery = 25.2% (493990636/1953511936) finish=1893.9min speed=12843K/sec

unused devices: <none>
kierans

1 Answer


The drive is being removed because md thinks it is bad. You should investigate why. It could be the drive is (intermittently) bad.
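
One way to start investigating is to confirm which array md currently considers degraded. The following is a minimal sketch, assuming a POSIX shell with awk; degraded_arrays is a hypothetical helper name, not part of mdadm. It prints every array in /proc/mdstat (or in a file passed as the first argument) whose member-status field, such as [_UU], contains an underscore.

```shell
#!/bin/sh
# degraded_arrays (hypothetical helper): list md arrays with a missing member.
# Reads /proc/mdstat by default, or the file given as the first argument.
degraded_arrays() {
    awk '
        # An array stanza starts with a line such as "md1 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status field looks like [UU_]; an underscore marks a missing
        # or failed slot. Print the array name once and reset it.
        name != "" && /\[[U_]*_[U_]*\]/ { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads the live /proc/mdstat. For any array it lists, mdadm --detail shows which slot is missing. The kernel log (dmesg) from around the time of the removal, together with smartctl -a /dev/sdX for the affected drive, is a reasonable place to look for I/O errors.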

You should never fsck a partition that is a member of an md device. fsck sees it as type linux_raid_member, which is why it failed with the error you got.

A write intent bitmap wouldn't have helped. Once the disk is removed from the md device an entire sync is needed. A write intent bitmap only helps when the members of the device are in sync and the server crashes.

Mark Wagner
  • "Once the disk is removed from the md device an entire sync is needed" is not entirely true. From the kernel.org wiki: "Therefore a write-intent bitmap reduces rebuild/recovery (md sync) time if... one spindle is disconnected, then reconnected." https://raid.wiki.kernel.org/index.php/Write-intent_bitmap – R. S. Feb 19 '13 at 00:15
  • That depends on why. If the drive returns an error, then the entire drive is considered suspect. If it merely disappears and reappears, then the bitmap applies. – longneck Feb 19 '13 at 01:04
  • I think smartctl -a /dev/sdX deserves a mention, no? That'd be regarding the second sentence in your answer. – 0xC0000022L Feb 19 '13 at 01:15
  • What's the best way to figure out if the disk is (partly) bad? – kierans Feb 19 '13 at 01:32