I did it, I finally effed up my FreeBSD install.
It all started with trying to build a custom kernel. I used Git to check out the latest stable source code, but realized I had used the wrong branch name. So, I figured I'd interrupt the clone and just remove the /usr/src directory (the standard place to put FreeBSD's source code) and start over. So I changed into /usr/src and ran rm -r *. Only I wasn't in /usr/src, I was in /usr.
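In hindsight, a guard around that rm would have been cheap. Here's a minimal sketch, assuming a hypothetical helper called clean_dir (my own name, not a FreeBSD tool). It takes the target as an absolute path instead of trusting whatever directory the shell happens to be in, and it refuses to touch the filesystem root.

```shell
# clean_dir DIR: empty DIR without relying on the shell's current directory.
# Hypothetical helper for illustration; not part of FreeBSD.
clean_dir() {
    dir=$1
    # Only accept absolute paths, so the current directory never matters.
    case $dir in
        /*) ;;
        *) echo "clean_dir: need an absolute path" >&2; return 1 ;;
    esac
    # Resolve symlinks and confirm the directory really exists.
    resolved=$(cd "$dir" 2>/dev/null && pwd -P) || {
        echo "clean_dir: cannot enter $dir" >&2
        return 1
    }
    # Never empty the filesystem root.
    [ "$resolved" != / ] || { echo "clean_dir: refusing to empty /" >&2; return 1; }
    # -mindepth 1 keeps the directory itself; -delete also removes dotfiles such as .git.
    find "$resolved" -mindepth 1 -delete
}
```

With that in place, the clean-out would be clean_dir /usr/src, and a stray clean_dir src from the wrong directory fails instead of deleting anything.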
By the time I realized what was happening and cancelled the operation, I had already wiped out all of /usr/bin and an unknown number of other files and directories. My window manager (i3) fell on its face, and the status bar threw the "I can't run any of these commands" message. Turns out there's a lot of useful stuff I expected to live in /bin that actually lives in /usr/bin! ssh(1) was gone, tar(1) was gone. Even doas(1)! I really, truly, thoroughly effed up my install.
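A quick way to take stock of that kind of damage is to ask the shell which commands it can still find. This is a rough sketch, assuming a hypothetical missing_cmds helper and a checklist of my own choosing. command -v is a shell builtin, so it keeps working even when /usr/bin is empty.

```shell
# missing_cmds NAME...: print each NAME that the shell cannot find in PATH.
# Hypothetical helper; the checklist passed to it is up to you.
missing_cmds() {
    for cmd in "$@"; do
        # command -v is a shell builtin, so no external binary is needed.
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Example: the tools this post found missing, plus a few likely victims.
missing_cmds ssh scp tar doas make git
```

Anything it prints is a command you'll need to get back before it can help with the recovery.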
Oh God, what about my home directory? Turns out it was still there, along with most (if not all) of my files. I'm not enough of an idiot to not have a home snapshot, so I logged in as root (doas was gone), and it was zfs rollback to the rescue. Now my primary concern was getting that home directory out of there, because I didn't feel like restoring it from my cloud backup if the rest of my recovery went bad. But I didn't have ssh(1) or scp(1) to clone it to my server. I did have ZFS' send and receive functionality, but I figured I'd take the easy way out and use my unscathed rclone to SCP it to freedom. Pretty sure my data shamed me and gloated as it reached its lifeboat.
So now I could start to fearlessly think about un-effing my install. This is where most people (previously including myself) would suck it up, start from scratch with a USB installer, try to remember all of the customization steps they took to bring the system back to its current working state, and restore user data. But I'm not the man I once was. I've been playing this game long enough. I don't go crawling back to the dusty install media when something as trivial as critical system files goes missing or gets corrupted. I know what I did wrong and I can think of several creative ways to fix it. Say it with me:
First of all, my entire system (sans /usr/bin) is still somewhat operational. I have access to a root shell (and my X session with a browser) so I'm in pretty good shape. I am lacking some very basic core utilities but I might be able to get them back without even rebooting. I don't have any system-wide snapshots to restore from but I do have another running FreeBSD 13.0-RELEASE system on my network: my server. rclone worked to move data over there in an emergency, so I'll use that to copy my core utilities back where they belong. And it worked.
Now came the hard part: un-effing everything I didn't know was missing or broken. Who knows what else got removed during that operation? I can reinstall my entire package tree pretty easily, so I'm not that worried about anything missing from /usr/local. Maybe I have one or two config files in /usr/local/etc that I can live without. I know /usr/home is safe and restored. So all that's left is stuff like lib, sbin, include, lib32, share and a few others that aren't very unique to my system (packages notwithstanding).
So the plan is to make buildworld and upgrade/replace my system in place to track STABLE. The install/upgrade/switching process is well-documented, and there are already difftools/mergetools responsible for making sure that all of the new artifacts go exactly where they're supposed to (over top of your old or broken existing ones).
https://docs.freebsd.org/en/books/handbook/cutting-edge/#makeworld
So let's follow the Handbook, shall we? I'll start where I left off when everything imploded. Clean out /usr/src and clone the source tree.
# rm -r /usr/src/*
# git clone -b stable/13 https://git.FreeBSD.org/src.git /usr/src
And then start by compiling the world (userland) and the kernel. I'm going to use the GENERIC kernel for now so I can just get back up and running.
# make -j 4 buildworld buildkernel
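The -j 4 above is hard-coded. Here's a small sketch that derives the job count from the machine instead, assuming a hypothetical ncpu helper. sysctl -n hw.ncpu is how FreeBSD reports the CPU count. The getconf fallback is only there so the snippet also runs on other systems.

```shell
# ncpu: print the number of CPUs (hypothetical helper).
ncpu() {
    if n=$(sysctl -n hw.ncpu 2>/dev/null) && [ -n "$n" ]; then
        :   # FreeBSD exposes the CPU count as the hw.ncpu sysctl
    elif n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) && [ -n "$n" ]; then
        :   # fallback for systems without that sysctl (for example Linux)
    else
        n=1 # last resort: build with a single job
    fi
    echo "$n"
}

# Print the build command rather than running it; drop the echo to start the build.
echo "cd /usr/src && make -j $(ncpu) buildworld buildkernel"
```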
It took literally hours. Poor little dual-core i5. It was already well on its way to completing when I realized I could have done this on my 8-core/16-thread server. "Oh, so you're reckless and stupid." Don't worry, I redeem myself later. Now I can actually install the new kernel and reboot into it. This is required before installing the world (userland).
# make installkernel
...
# reboot
After rebooting I can check out my new version.
# freebsd-version
13.0-RELEASE
Hmm, probably can't use freebsd-version here because it's tied to the userland and freebsd-update; with no flags it reports the userland version, which hasn't been reinstalled yet. Let's try uname:
# uname -r
13.0-STABLE
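For the record, freebsd-version can report the kernel and the userland separately: -k is the installed kernel, -r is the running kernel, and -u is the userland. Here's a small sketch, assuming a hypothetical report_versions wrapper, that prints all three side by side. On a machine without freebsd-version it falls back to uname -r.

```shell
# report_versions: show kernel and userland versions side by side.
# Hypothetical wrapper around freebsd-version(1).
report_versions() {
    if command -v freebsd-version >/dev/null 2>&1; then
        echo "installed kernel: $(freebsd-version -k)"
        echo "running kernel:   $(freebsd-version -r)"
        echo "userland:         $(freebsd-version -u)"
    else
        # Not a FreeBSD system (or the tool is missing): report what uname knows.
        echo "freebsd-version not found; running kernel release: $(uname -r)"
    fi
}

report_versions
```

Between installkernel and installworld, I'd expect the kernel lines and the userland line to disagree, which is exactly what the two commands above showed.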
Success. Kernel is rebuilt, reinstalled, and tracking STABLE. Now it's time to install/upgrade everything else. Side note: this is one of the cool things about FreeBSD. It's a complete operating system, not just a kernel or a userland. All of the pieces were made to fit together instead of being glued together into a distro.
# make installworld
This took a little bit longer, but from the output I could see that all of my important, potentially-missing binaries were being installed. Libraries, core utilities, applications, daemons, and man pages all got put back in their proper place. My system is totally back to life and I'm confident that I'm running an un-maimed FreeBSD 13.0-STABLE.
Now comes the sanity checking part of this job. I'm actually running a newer system now than I was before the upgrade. One of the components that was upgraded was ZFS, and with every major ZFS upgrade, I'm going to reinstall my ZROOT bootloader:
# gpart bootcode -p /boot/gptzfsboot -i 1 ada0
...
# reboot
This may be unnecessary, but it ensures that my root-on-ZFS will load correctly after a reboot with the new ZFS. And to be fair, it is mentioned in the UPDATING file in the source tree.
After a reboot, I've got one more sanity check. FreeBSD comes with etcupdate(8), which you can use to manage merging upgrades with local changes to your /etc system config. If you run etcupdate diff you can see a diff of all of your customized system config. This is so good, I can't believe something like this doesn't exist on your typical Linux distro. Maybe it does and I just never realized it, but I'm betting they're all just different enough to not be able to share something like this. Anyway, after reviewing the diff, I applied any changes/merges by running etcupdate.
Now for one last bit of housekeeping, and this comes straight from the Handbook. After an upgrade, the world installation leaves behind old libraries and files that the new system doesn't need but old applications or ports built against an older target might still require. To get rid of them, you can use the Makefile targets in /usr/src:
# make check-old check-old-libs
After reviewing the list and ensuring you don't need those files, you can clean them up with
# make BATCH_DELETE_OLD_FILES=yes delete-old delete-old-libs
As for my packages, I'm a pkg leaf | xargs pkg install -f away from completely restoring those.
Redemption. I went from attempting to customize my kernel to annihilating /usr to restoring my entire system by building from FreeBSD's source tree via git and make. And I got an upgrade in the process too! Moving forward, I'm running fairly frequent automatic full-system snapshots. It should make it a lot easier to recover from accidental deletions of system files. I'm also going to take the time to learn more about the rescue disk process using the FreeBSD installer image. All told, not too bad for something that could have gone a lot worse.
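For the snapshots, something along these lines would do as a starting point. It is only a sketch: the pool name zroot and the auto prefix are assumptions, snap_name is a hypothetical helper, and the zfs command is printed rather than run so you can look it over first. zfs snapshot -r takes a recursive snapshot of a dataset and all of its descendants.

```shell
# snap_name PREFIX: build a timestamped snapshot label, PREFIX-YYYYMMDD-HHMMSS.
# Hypothetical helper for illustration.
snap_name() {
    echo "$1-$(date +%Y%m%d-%H%M%S)"
}

POOL=zroot   # assumption: the root pool is called zroot; adjust to your system

# Printed, not executed. Remove the echo (and run as root, e.g. from cron) to take the snapshot.
echo zfs snapshot -r "$POOL@$(snap_name auto)"
```

Timestamped names sort chronologically in zfs list -t snapshot, which makes it easy to pick the last good one to roll back to.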