From 890b34bcc1a6b4073d1e512b1386634f7bc5ea52 Mon Sep 17 00:00:00 2001
From: "Adam T. Carpenter"
Date: Wed, 21 Apr 2021 22:57:39 -0400
Subject: unified posts dir, until I can figure out makefile sub-subdirs.
 makefile auto-generates index
---
 ...n-zfs-a-zpool-of-mirror-vdevs-the-easy-way.html | 375 +++++++++++++++++++++
 1 file changed, 375 insertions(+)
 create mode 100644 posts/2021-01-15-root-on-zfs-a-zpool-of-mirror-vdevs-the-easy-way.html
(limited to 'posts/2021-01-15-root-on-zfs-a-zpool-of-mirror-vdevs-the-easy-way.html')

diff --git a/posts/2021-01-15-root-on-zfs-a-zpool-of-mirror-vdevs-the-easy-way.html b/posts/2021-01-15-root-on-zfs-a-zpool-of-mirror-vdevs-the-easy-way.html
new file mode 100644
index 0000000..6f515f3
--- /dev/null
+++ b/posts/2021-01-15-root-on-zfs-a-zpool-of-mirror-vdevs-the-easy-way.html
@@ -0,0 +1,375 @@
+ 53hornet ➙ Root on ZFS: A ZPool of Mirror VDEVs The Easy Way
+

Root on ZFS: A ZPool of Mirror VDEVs The Easy Way

+ +

+ I wanted/needed to make a root on ZFS pool out of multiple mirror VDEVs, + and since I'm not a ZFS expert, I took a little shortcut. +

+ +

+ I recently got a new-to-me server (yay!) and I wanted to do a root-on-ZFS setup on it. I've really enjoyed using ZFS for my data storage pools for a long time. I've also enjoyed the extra functionality that comes with having a bootable system installed on ZFS on my laptop, and I decided with this upgrade it was time to do the same on my server. Historically I've used RAIDZ for my storage pools. RAIDZ works a lot like a parity RAID (think RAID5), but at the ZFS level: it gives you parity so that a certain number of disks in your pool can die and you won't lose any data. It does have a few tradeoffs however*, and as a matter of personal preference I've decided that going forward I would like to have a single ZPool over top of multiple mirror VDEVs. In other words, my main root+storage pool will be made up of two-disk mirrors and can be expanded to include any number of new mirrors I can fit into the machine.

+ +
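+
+ To make that concrete: on six already-partitioned disks, the shape I'm after could be created from scratch in one command, something like the sketch below (hypothetical pool and device names, and not what I actually ran; the whole point of this post is the shortcut that follows):
+
+
+# hypothetical names for illustration only
+root@macon:~ # zpool create tank \
+	mirror da0p3 da1p3 \
+	mirror da2p3 da3p3 \
+	mirror da4p3 da5p3
+
+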

+ This did present some complications. First of all, bsdinstall won't set this up for you automatically (and sure enough, in the handbook it mentions that the guided root-on-ZFS tool will only create a single, top-level VDEV unless it's a stripe). It will happily let you use RAIDZ for your ZROOT, but not the more custom approach I'm taking. I did, however, use bsdinstall as a shortcut so I wouldn't have to do all of the partitioning and pool setup manually, and that's what I'm going to document below, because I'm totally going to forget how this works the next time I have to do it.

+ +

+ In my scenario I have an eight-slot, hot-swappable backplane behind a PERC H310 controller that's configured for AHCI passthrough. In other words, all FreeBSD sees is as many disks as I have plugged into the backplane. I'm going to fill it with 6x2TB hard disks which, as I said before, I want to act as three mirrors (two disks each) in a single, bootable, growable ZPool. For starters, I shoved the FreeBSD installer on a flash drive and booted from it. I followed all of the regular steps (setting hostname, getting online, etc.) until I got to the guided root-on-ZFS disk partitioning setup.

+ +

+ Now here's where I take the first step of my shortcut. Since there is no option to create a pool of arbitrary mirrors, I'm just going to create a pool from a single mirror VDEV of two disks. Later I will expand the pool to include the other two mirrors I intended. My selections were as follows:

+ Pool Type/Disks: mirror (mfisyspd0 mfisyspd1)
+ Partition Scheme: GPT (UEFI)
+ Swap Size: 4g

+ Everything else was left as a default. Then I followed the installer to + completion. At the end, when it asked if I wanted to drop into a shell + to do more to the installation, I did. +

+ +

+ The installer created the following disk layout for the two disks that I + selected. +

+ +
+
+atc@macon:~ % gpart show
+=>        40  3907029088  mfisyspd0  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+=>        40  3907029088  mfisyspd1  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+
+ +

+ The installer also created the following ZPool from my single mirror + VDEV. +

+ +
+
+atc@macon:~ % zpool status
+  pool: zroot
+ state: ONLINE
+  scan: none requested
+config:
+
+	NAME             STATE     READ WRITE CKSUM
+	zroot            ONLINE       0     0     0
+	  mirror-0       ONLINE       0     0     0
+	    mfisyspd0p3  ONLINE       0     0     0
+	    mfisyspd1p3  ONLINE       0     0     0
+
+errors: No known data errors
+
+
+ +

+ There are a couple of things to take note of here. First of all, both disks in the bootable ZPool have an EFI boot partition, which means either disk on its own is capable of booting the pool. Second, they both have some swap space. Finally, they both have a third partition dedicated to ZFS data, and that partition is what got added to my mirror VDEV.

+ +
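+
+ If you're curious, you can even mount one of those EFI system partitions and poke around; it's just a small FAT filesystem holding the boot loader (a quick sketch; /mnt is my arbitrary choice of mount point):
+
+
+# mfisyspd0p1 is the efi partition the installer created
+root@macon:~ # mount_msdosfs /dev/mfisyspd0p1 /mnt
+root@macon:~ # ls /mnt
+root@macon:~ # umount /mnt
+
+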

+ So where do I go from here? I was tempted to just zpool add zroot mirror ... ... with the remaining whole disks (actually, I did do this, and it rendered the pool unbootable for a very important reason: whole-disk mirror VDEVs don't get those all-important boot partitions). Instead, I need to manually go back and re-partition the four remaining disks exactly like the first two. Or, since all I want is two more of what's already been done, I can just clone the partition tables using gpart backup and restore! Easy! Here's what I did, starting with the first of the four remaining disks:

+ +
+
+root@macon:~ # gpart backup mfisyspd0 | gpart restore -F mfisyspd2
+
+
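+
+ That covered the first of the four; I ran the same thing for the remaining three disks. A small sh loop is equivalent (a sketch of what I did by hand, using the same disk names as the layout below):
+
+
+root@macon:~ # sh -c 'for d in mfisyspd3 mfisyspd4 mfisyspd5; do gpart backup mfisyspd0 | gpart restore -F $d; done'
+
+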
+ +

+ Full disclosure: I didn't even think of this as a possibility until I read this Stack Exchange post. The restores gave me a disk layout like this:

+ +
+
+atc@macon:~ % gpart show
+=>        40  3907029088  mfisyspd0  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+=>        40  3907029088  mfisyspd1  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+=>        40  3907029088  mfisyspd2  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+=>        40  3907029088  mfisyspd3  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+=>        40  3907029088  mfisyspd4  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+=>        40  3907029088  mfisyspd5  GPT  (1.8T)
+          40      409600          1  efi  (200M)
+      409640        2008             - free -  (1.0M)
+      411648     8388608          2  freebsd-swap  (4.0G)
+     8800256  3898228736          3  freebsd-zfs  (1.8T)
+  3907028992         136             - free -  (68K)
+
+
+ +

+ And this makes a lot of logical sense: you don't want a six-disk pool that only two of its disks can boot, or you're defeating some of the purpose of the redundancy. So now I can extend my ZPool to include those last four disks.

+ +

+ This next step may or may not be strictly necessary. I wanted to overwrite wherever any old ZFS/ZPool metadata might be on my four new disks. This may well be for nothing, I admit, but I've run into trouble in the past where a ZPool wasn't properly exported/destroyed before its drives were re-purposed, and future zpool imports would show both the new pool and the old, failed one. And in the previous step I cloned an old ZFS partition four times! So I did a small dd over the start of each new ZFS partition (the partition, not the whole disk, since zeroing the disk would clobber the GPT I just cloned) to help me sleep at night:

+ +
+
+root@macon:~ # dd if=/dev/zero of=/dev/mfisyspd2p3 bs=1M count=100
+
+
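+
+ For what it's worth, ZFS also ships a purpose-built command for exactly this job: zpool labelclear. Running it against each cloned ZFS partition would have been a more surgical way to zap stale labels (a sketch; -f forces the clear even if the label looks in-use):
+
+
+root@macon:~ # zpool labelclear -f /dev/mfisyspd2p3
+
+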
+ +

+ One final, precautionary step is to write the EFI boot loader to the new disks. The zpool admin section of the handbook mentions you should do this any time you replace a zroot device, so I'll do it just for good measure on all four additional disks:

+ +
+
+root@macon:~ # gpart bootcode -p /boot/boot1.efifat -i 1 mfisyspd2
+
+
+ +

+ Don't forget that the command is different for UEFI and a traditional BIOS; for a BIOS-booting GPT disk, the equivalent would look something like the sketch below (purely illustrative here, since it assumes a freebsd-boot partition, which my disks don't have):
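+
+# illustrative only: assumes a freebsd-boot partition at index 1 on a disk ada0
+root@macon:~ # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
+
+
+ And finally, I can add my new VDEVs: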

+ +
+
+root@macon:~ # zpool add zroot mirror mfisyspd2p3 mfisyspd3p3
+root@macon:~ # zpool add zroot mirror mfisyspd4p3 mfisyspd5p3
+
+
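+
+ As an aside, zpool add also accepts -n, which prints the configuration the pool would end up with without actually changing anything. Since adding a top-level VDEV is not easily undone, a dry run first is cheap insurance (a sketch using the same devices):
+
+
+root@macon:~ # zpool add -n zroot mirror mfisyspd2p3 mfisyspd3p3
+
+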
+ +

And now my pool looks like this:

+ +
+
+atc@macon:~ % zpool status
+  pool: zroot
+ state: ONLINE
+  scan: none requested
+config:
+
+	NAME             STATE     READ WRITE CKSUM
+	zroot            ONLINE       0     0     0
+	  mirror-0       ONLINE       0     0     0
+	    mfisyspd0p3  ONLINE       0     0     0
+	    mfisyspd1p3  ONLINE       0     0     0
+	  mirror-1       ONLINE       0     0     0
+	    mfisyspd2p3  ONLINE       0     0     0
+	    mfisyspd3p3  ONLINE       0     0     0
+	  mirror-2       ONLINE       0     0     0
+	    mfisyspd4p3  ONLINE       0     0     0
+	    mfisyspd5p3  ONLINE       0     0     0
+
+errors: No known data errors
+
+
+ +

+ Boom. A growable, bootable zroot ZPool. Is it easier than just configuring the partitions and root on ZFS by hand? Probably not for a BSD veteran, but since I'm a BSD layman, this is something I can live with pretty easily. At least until this becomes an option in bsdinstall, maybe? Now I can add as many more mirrors as I can fit into my system, and it's just as easy to replace them. This is better for me than my previous RAIDZ, where I would have had to destroy and re-create the pool in order to add more disks to the VDEV. Now I just create another little mirror, grow the pool, and all of my filesystems see more storage. And of course, having ZFS for all of my data makes it super easy to create filesystems on the fly, compress or quota them, take snapshots (including of the live ZROOT!), and send those snapshots over the network. Pretty awesome.

+ +
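+
+ For example, a recursive snapshot of the whole pool and a send to another machine takes only a couple of commands (a sketch; the snapshot name, backuphost, and the receiving dataset are all made up for illustration):
+
+
+root@macon:~ # zfs snapshot -r zroot@migrate
+root@macon:~ # zfs send -R zroot@migrate | ssh backuphost zfs receive -du backup/macon
+
+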

+ * I'm not going to explain why here, but + this is a pretty well thought out article + that should give you an idea about the pros and cons of RAIDZ versus + mirror VDEVs so you can draw your own conclusions. +

+
+ + -- cgit v1.2.3