Synology DS1815+ and double-disk failure on a SHR volume

Yesterday my DS1815+ sent an email alert telling me that one of the volumes on my NAS had entered a degraded state. Minutes later, I saw in the log that one of the 3TB drives had registered 8 bad-block errors, at which point the DS put it in the Crashed state.

I quickly checked the warranty on it; of course, it had expired.

I then ordered a new WD Red 6TB NAS drive with same-day delivery through Amazon Prime.

Of course, as soon as the disk arrived, I realized it wouldn’t solve my problem in the best way, because the failed volume was RAID10: the mirror partner of the failed disk was still a 3TB drive and chugging along. My options were to fork out another $270 for a second Red or to reshuffle a couple of disks to survive a while longer. I chose not to entertain the option of wasting the remaining 3TB, which would be the case if I simply inserted the new disk in place of the failed one.
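The waste here is simple arithmetic: a mirrored pair can only offer as much space as its smaller member. A minimal sketch in Python, assuming that rule applies to each RAID10 pair and ignoring filesystem overhead (the function name is just for illustration):

```python
def mirror_pair_capacity_tb(disk_a_tb, disk_b_tb):
    """Return (usable, unused) capacity of a two-disk mirror, in TB.

    Assumption: a mirror can only use as much of each disk as the
    smaller disk provides; the rest of the larger disk sits idle.
    """
    usable = min(disk_a_tb, disk_b_tb)
    unused = max(disk_a_tb, disk_b_tb) - usable
    return usable, unused


# Surviving 3TB disk paired with the new 6TB replacement.
usable, unused = mirror_pair_capacity_tb(3, 6)
print(f"usable: {usable}TB, unused on the larger disk: {unused}TB")
```

With these numbers the pair still offers 3TB, and the other 3TB of the new disk goes unused.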

The RAID10 volume I have uses 4 disks on the 8-bay DS. I was prepared for this day; had I used all 8 bays to form the RAID10, I’d have had a lot more difficulty shuffling things around.

The remaining 4 disks were experimental, a combination of back-of-the-drawer disks: 2x2TB + 2x1TB. They formed an SHR volume with single-disk protection. I decided to replace one of the 1TB disks in the SHR volume with the 6TB that had just arrived, then move all of the content on the RAID10 volume over to the SHR, then rebuild the RAID10 using 2x4TB + 2x2TB that I already had.
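To get a feel for what the swap buys, here is a rough capacity estimate in Python. It assumes a simplified model of single-disk-protected SHR: the disks are sliced into layers at each distinct disk size, and every layer that spans at least two disks gives up one disk’s worth of space to redundancy. That model is a simplification, not Synology’s published algorithm, and it ignores system partitions and filesystem overhead; the function name is made up for illustration.

```python
def shr1_usable_tb(disk_sizes_tb):
    """Estimate usable TB for SHR with single-disk protection.

    Simplified model (an assumption, not Synology's published algorithm):
    slice the disks into layers at each distinct size; a layer spanning
    n >= 2 disks contributes (n - 1) * layer height, and a layer that
    lives on a single disk cannot be protected, so it contributes nothing.
    """
    usable = 0
    previous_size = 0
    sizes = sorted(disk_sizes_tb)
    for index, size in enumerate(sizes):
        disks_in_layer = len(sizes) - index
        layer_height = size - previous_size
        if disks_in_layer >= 2:
            usable += (disks_in_layer - 1) * layer_height
        previous_size = size
    return usable


print(shr1_usable_tb([2, 2, 1, 1]))  # original volume
print(shr1_usable_tb([6, 2, 2, 1]))  # after swapping one 1TB for the 6TB
```

Under this model the volume goes from roughly 4TB to roughly 5TB, and most of the 6TB disk stays unused until a second large disk joins it.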

To do this, I yanked a 1TB disk out of the SHR, which is not a problem in itself. Although the DS beeped a bit, the volume was only degraded. It would be fine once I inserted the 6TB and it finished repairing the volume.

Well… I started that process and went out for a family dinner.

Sure enough, 2 hours later I got another email alert — the volume (not disk) had crashed :(.

When I got back home, I found that one of the 2TB disks on the SHR volume had died before the repair could complete. And this is the reason I wrote this post: even though Synology reported the volume as Crashed, it was actually still online. I could access the network share and still reach the files.

On Windows this would not happen: if a volume goes down because of a double-disk failure, it immediately dismounts and you can’t do anything with it.

With a smile on my face, I proceeded to copy some 700GB of content off the SHR volume while it was still showing Crashed, then took care of the failed disk.
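For a salvage copy like this, it helps if one unreadable file doesn’t abort the whole run. A minimal Python sketch, assuming the share is mounted on the machine doing the copy; the paths in the usage comment are placeholders, and the function name is hypothetical rather than anything DSM provides:

```python
import os
import shutil


def salvage_copy(source_root, dest_root):
    """Copy a directory tree, skipping files that fail to copy.

    Returns a list of (path, error message) pairs for files that could
    not be copied, so they can be retried or written off later.
    """
    failures = []
    for folder, _subdirs, filenames in os.walk(source_root):
        relative = os.path.relpath(folder, source_root)
        target_folder = os.path.normpath(os.path.join(dest_root, relative))
        os.makedirs(target_folder, exist_ok=True)
        for name in filenames:
            source_path = os.path.join(folder, name)
            try:
                shutil.copy2(source_path, os.path.join(target_folder, name))
            except OSError as error:
                # Record the failure and keep going with the next file.
                failures.append((source_path, str(error)))
    return failures


# Hypothetical usage; both paths are placeholders:
# failed = salvage_copy("/mnt/shr_share", "/mnt/rescue")
```

Anything that ends up in the returned list can be retried later or restored from another backup.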

So Synology DSM appears to do “what it can” to keep the data available even after a double-disk failure on an SHR volume with single-disk protection. Good job, Synology.
