I did it, I finally effed up my FreeBSD install.
It all started with trying to build a custom kernel. I used Git to check out the latest stable source code, but realized I had used the wrong branch name. So I figured I'd interrupt the clone, remove the /usr/src directory (the standard place to put FreeBSD's source code), and start over. So I changed into /usr/src and ran rm -r *. Only I wasn't in /usr/src, I was in /usr.
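In hindsight, a tiny guard would have stopped this. What follows is only a sketch, not something I had at the time: clean_dir is a made-up helper name, and it assumes a find(1) that supports -mindepth and -delete (FreeBSD's does). It empties a directory by absolute path, so the result never depends on where the shell happens to be sitting.

```sh
# clean_dir (hypothetical helper): remove everything inside a directory,
# addressed by absolute path, leaving the directory itself in place.
clean_dir() {
    target=$1
    case $target in
        /)  echo "clean_dir: refusing to empty /" >&2; return 1 ;;
        /*) ;;  # absolute path, carry on
        *)  echo "clean_dir: '$target' is not an absolute path" >&2; return 1 ;;
    esac
    if [ ! -d "$target" ]; then
        echo "clean_dir: '$target' is not a directory" >&2
        return 1
    fi
    # -mindepth 1 skips the directory itself; -delete removes depth-first,
    # and unlike a bare * glob it also catches dotfiles such as .git.
    find "$target" -mindepth 1 -delete
}
```

Calling clean_dir /usr/src from anywhere only ever touches /usr/src. The guard is deliberately simple: it rejects a literal / and relative paths, nothing cleverer than that.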
By the time I realized what was happening and cancelled the operation, I had already wiped out all of /usr/bin and an unknown number of other files and directories. My window manager (i3) fell on its face, and the status bar threw the "I can't run any of these commands" message. Turns out there's a lot of useful stuff I expected to live in /bin that actually lives in /usr/bin! ssh(1) was gone, tar(1) was gone. Even doas(1)! I really, truly, thoroughly effed up my install.
Oh God, what about my home directory? Turns out it was still there, along with most (if not all) of my files. I'm not enough of an idiot to not have a home snapshot, so I log in as root (doas is gone), and zfs rollback to the rescue. Now my primary concern was getting that home directory out of there, because I didn't feel like restoring it from my cloud backup if the rest of my recovery went bad. But I didn't have ssh(1) or scp(1) to clone it to my server. I did have ZFS's send and receive functionality, but I figured I'd take the easy way out and use my unscathed rclone to SCP it to freedom. Pretty sure my data shamed and gloated at me as it reached its lifeboat.
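Roughly what that looked like, as a sketch. The dataset zroot/usr/home, the snapshot name, the rclone remote server: and the destination path are all placeholders I've made up for illustration; substitute your own. The run wrapper only prints each command unless DRY_RUN=0 is set, so nothing gets rolled back by accident.

```sh
# Print commands by default; execute them only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# rescue_home DATASET SNAPSHOT SRC_DIR REMOTE_DEST
# (hypothetical helper; every argument is a placeholder)
rescue_home() {
    dataset=$1 snapshot=$2 src=$3 dest=$4
    # Roll the home dataset back to its most recent snapshot.
    run zfs rollback "$dataset@$snapshot" || return 1
    # Copy the restored files to a remote set up beforehand with `rclone config`.
    run rclone copy "$src" "$dest"
}
```

Something like rescue_home zroot/usr/home daily /usr/home/me server:rescue/home prints the two commands for review; prefixing it with DRY_RUN=0 runs them for real.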
So now I could start to fearlessly think about un-effing my install. This is where most people (previously including myself) would suck it up and pull out the install media, try to remember all of the customization steps they took to bring the system back to its current working state, and restore user data. But I'm not the man I once was. I've been playing this game long enough. I don't go crawling back to the dusty install media when something as trivial as critical system files going missing or getting corrupted happens. I know what I did wrong and I can think of several creative ways to fix it. Say it with me:
First of all, my entire system (sans /usr/bin) is still somewhat operational. I have access to a root shell (and my X session with a browser) so I'm in pretty good shape. I am lacking some very basic core utilities but I might be able to get them back without even rebooting. I don't have any system-wide snapshots to restore from but I do have another running FreeBSD 13.0-RELEASE system on my network: my server. rclone worked to move data over there in an emergency, so I'll use that to copy my coreutils back where they belong. And it worked. [2]
Now came the hard part. Un-effing everything I didn't know was missing or broken. Who knows what else got removed during that operation. I can reinstall my entire package tree pretty easily, so I'm not that worried about anything missing from /usr/local. Maybe I have one or two config files in /usr/local/etc that I can live without. I know /usr/home is safe and restored. So all that's left is stuff like lib, sbin, include, lib32, share and a few others that aren't very unique to my system (packages notwithstanding).
Here's where the YOLO part begins. I was already in the middle of building my system from source to track 13.0-STABLE, instead of RELEASE. So instead of using a rescue CD to copy just the right files back or completely reinstalling my system, I'll just upgrade my system in-place to track STABLE. The install/upgrade/switching process is well-documented, and there are already mergetools responsible for making sure that all of the new artifacts go exactly where they're supposed to (over top of your old or broken existing ones).
Time for the Handbook. I'll start where I left off when everything imploded. Clean out /usr/src and clone the source tree, but with an absolute path this time.
# rm -r /usr/src/*
# git clone -b stable/13 https://git.FreeBSD.org/src.git /usr/src
And then start compiling the world (userland, services, utilities) and the kernel. I'm going to use the GENERIC kernel for now so I can just get back up and running. This part takes a really really long time.
# make -j 4 buildworld buildkernel
It took literally half the day. Poor little dual-core i5. It was already well on its way to completing when I realized I could have done this on my 8-core/16-thread server[1]. "Oh, so you're reckless and stupid." Don't worry, I redeem myself later. Now I can actually install the new kernel and reboot into it. This is required before installing the world (userland).
# make installkernel
...
# reboot
After rebooting I can check out my new version.
# freebsd-version
13.0-RELEASE
Hmm, probably can't use freebsd-version(1) because it's tied to the userland and freebsd-update(8). Let's try uname(1):
# uname -r
13.0-STABLE
Success. Kernel is rebuilt, reinstalled, and tracking STABLE. Now it's time to install/upgrade everything else. Side note: this is one of the cool things about FreeBSD. It's a complete operating system, not just a kernel or just a userland with a compiler and some utilities. All of the pieces were made to fit together instead of being glued together into a distro.
# make installworld
This took a little bit longer, but from the output I could see that all of my important, potentially-missing files were being restored. Libraries, core utilities, applications, daemons, config, and man pages all got put back in their proper place. My system is totally back to life and I'm confident that I'm running an un-maimed FreeBSD 13.0-STABLE:
% freebsd-version
13.0-STABLE
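As a sketch of that check done in one pass: freebsd-version(1) can report the installed kernel (-k) and the userland (-u) separately, so comparing the two shows whether installworld has caught up with installkernel. compare_versions is a name I made up; it only compares two strings, and the FreeBSD-specific call is left in a comment so the snippet runs anywhere. A mismatch isn't always a problem (patch levels can legitimately differ), so treat it as a hint rather than a verdict.

```sh
# compare_versions KERNEL USERLAND (hypothetical helper)
# Succeeds when both version strings are identical, fails otherwise.
compare_versions() {
    kernel=$1 userland=$2
    if [ "$kernel" = "$userland" ]; then
        echo "in sync: $kernel"
    else
        echo "mismatch: kernel=$kernel userland=$userland"
        return 1
    fi
}

# On a FreeBSD box:
#   compare_versions "$(freebsd-version -k)" "$(freebsd-version -u)"
```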
Now comes the sanity checking part of this job. I'm actually running a newer system now than I was before the upgrade. One of the components that was upgraded was ZFS, and with every major ZFS upgrade, I'm going to reinstall my root-on-ZFS bootloader before I reboot:
# gpart bootcode -p /boot/gptzfsboot -i 1 ada0
...
# reboot
This may be unnecessary, but it ensures that my root-on-ZFS will load correctly after a reboot with the new ZFS. And to be fair, it is mentioned in the UPDATING guide in the source.
After a reboot, I've got one more sanity check. FreeBSD comes with etcupdate(8), which you can use to manage merging upgrades with local changes to your /etc, /usr/local/etc system config. If you run etcupdate diff you can see a diff of all of your customizations. This is so good, I can't believe something like this doesn't exist on your typical Linux distro. Maybe it does and I just never realized it, but I'm betting they're all just different enough to not be able to share something like this. Anyway, after reviewing the diff, I applied any changes/merges by running etcupdate.
Now for one last bit of housekeeping, and this comes straight from the handbook. After an upgrade, the world installation leaves behind old libraries and files that the new system doesn't need but old applications or ports built against an older target might still require. To get rid of them, you can use the Makefile targets in /usr/src:
# make check-old check-old-libs
After reviewing the list and ensuring you don't need those files, you can clean them up with
# make BATCH_DELETE_OLD_FILES=yes delete-old delete-old-libs
And finally, I'll force install my entire package tree to make sure any third-party missing files are reinstalled:
# pkg leaf | xargs pkg install -f
Redemption. I went from attempting to customize my kernel to annihilating /usr to restoring my entire system by building from FreeBSD's source tree via git(1) and make(1). And I got a free upgrade out of it! Moving forward, I'm running fairly frequent automatic full-system snapshots. It should make it a lot easier to recover from accidental deletions of system files. I'm also going to take the time to learn more about the rescue disk process using the FreeBSD installer image. All told, not too bad for a disaster-turned-learning-experience.
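A minimal sketch of what those snapshots could look like. The pool name zroot and the auto prefix are assumptions, not my actual setup; snapshot_name is a hypothetical helper that just builds a timestamped name, and the zfs call itself is left commented out so the snippet is harmless to run.

```sh
# snapshot_name POOL PREFIX (hypothetical helper)
# Prints a name like zroot@auto-YYYYMMDD-HHMM using the current time.
snapshot_name() {
    pool=$1 prefix=$2
    printf '%s@%s-%s\n' "$pool" "$prefix" "$(date +%Y%m%d-%H%M)"
}

# From cron, as root (-r snapshots every dataset in the pool recursively):
#   zfs snapshot -r "$(snapshot_name zroot auto)"
```

This leaves pruning old snapshots as a separate chore; without it they pile up.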