How does mdadm decide which device is the right one to sync the others with?

Imagine you have an mdadm RAID1 array with 2 disks.

Then you turn off the server, move each disk to a different machine, and boot both machines, so the disks become slightly inconsistent.

Now you turn off these 2 boxes, put the disks back into the original machine, and boot it.

How does mdadm figure out which of these 2 disks is the correct one, and which is the "wrong" one that should be synced from the correct one?

Is it even possible to specify this somehow? What would actually happen if you did this: would the disks sync automatically, or would the array just break?

Asked By: Petr

||

In this scenario, each disk would claim that the other disk failed.

The result depends on how exactly you're assembling the disks, but essentially it will go with one disk and ignore the other. It might also assemble the other disk as a separate array, which gives you a split brain.

I’ve done an experiment with loop devices, where I first changed loop1 and then changed loop2, so loop2 was the more “recent” one, yet it gets ignored:

# mdadm --assemble /dev/md42 /dev/loop1 /dev/loop2
mdadm: ignoring /dev/loop2 as it reports /dev/loop1 as failed
mdadm: /dev/md42 has been started with 1 drive (out of 2).

And if you list the devices the other way around, it ignores loop1 instead:

# mdadm --assemble /dev/md42 /dev/loop2 /dev/loop1
mdadm: ignoring /dev/loop1 as it reports /dev/loop2 as failed
mdadm: /dev/md42 has been started with 1 drive (out of 2).

This leads me to believe that it simply sticks with whichever disk it finds first. So it's entirely possible that after a reboot, you suddenly see the other side of the RAID. This is really bad.

It does not sync automatically; it ignores the other disk entirely:

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md42 : active raid1 loop1[0]
      102272 blocks super 1.2 [2/1] [U_]

You do not ever want to be in this situation, so you should avoid it (not provoke it on purpose by moving disks the way you described). Conflicts such as these must be resolved manually.
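
If you do end up here, one way to start the manual resolution is to compare what each member's superblock says before touching anything. The following is only a sketch: events_of and newer_of are helper names I made up, and it assumes the "Events : N" line that mdadm --examine prints for version 1.x superblocks. A higher event count only tells you which half has seen more metadata updates, not which half holds the data you want to keep, so treat the result as a hint.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; device names follow the
# loop-device experiment above.

# Print the event counter found in `mdadm --examine` output read on stdin.
# Assumes the "Events : N" line of a version-1.x superblock.
events_of() {
    awk -F' : ' '/^[[:space:]]*Events :/ { print $2; exit }'
}

# Given "device count device count", print the device with the higher
# counter (the first device wins a tie).
newer_of() {
    if [ "$2" -ge "$4" ]; then echo "$1"; else echo "$3"; fi
}

# Typical use (needs root and real member devices, so left commented out):
#   e1=$(mdadm --examine /dev/loop1 | events_of)
#   e2=$(mdadm --examine /dev/loop2 | events_of)
#   newer_of /dev/loop1 "$e1" /dev/loop2 "$e2"
```

Once you have picked the disk to keep, you can assemble the array from that disk alone and then add the other disk back so that it gets overwritten by a resync. Depending on the mdadm version, you may have to wipe the stale member's superblock with mdadm --zero-superblock before --add accepts it.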

Answered By: frostschutz