Forcing fsck to repair a corrupted CentOS root for systemd-based hosts
WARNING: This is just a pile of text that probably isn't super useful to anyone, including myself. I've already forgotten how I got some of the data off the disk. In the end, I destroyed the VM's disk and rebuilt it from scratch.
This last weekend I had an unfortunate incident with one of my remote VMs. The underlying host crashed, hard, taking my VM down with it in an unclean manner. And lucky me, that left a bunch of docker containers with corrupted filesystems lying around!
May 27 22:38:40 blimp.aether.earth dockerd[9767]: Error starting daemon: error initializing graphdriver: lstat /var/lib/docker/overlay2/280979306bc8a5a038282a1216fc887c443d1b9756a74641c2e15f3259e6aeda: input/output error
ls: cannot access /var/lib/docker/overlay2/280979306bc8a5a038282a1216fc887c443d1b9756a74641c2e15f3259e6aeda: Input/output error
rm: cannot remove ‘/var/lib/docker/overlay2/280979306bc8a5a038282a1216fc887c443d1b9756a74641c2e15f3259e6aeda’: Input/output error
Referring to the documentation of systemd-fsck: https://www.freedesktop.org/software/systemd/man/systemd-fsck@.service.html
Edit grub config:
vim /etc/default/grub
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
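The notes above don't record what was actually changed in /etc/default/grub. A minimal sketch, assuming the goal was to add the fsck.mode=force and fsck.repair=yes kernel parameters described in the systemd-fsck documentation to GRUB_CMDLINE_LINUX. The helper name add_fsck_params is hypothetical, and it edits whatever file it is given so it can be tried on a copy first:

```shell
#!/bin/sh
# Hypothetical helper (not from the original notes): append
# fsck.mode=force and fsck.repair=yes to the GRUB_CMDLINE_LINUX="..."
# line of the given file, e.g. a copy of /etc/default/grub.
# Assumes the value is wrapped in double quotes.
add_fsck_params() {
    grub_file=$1
    # Do nothing if the parameters are already present.
    if grep -q '^GRUB_CMDLINE_LINUX=.*fsck\.mode=force' "$grub_file"; then
        return 0
    fi
    tmp_file=$(mktemp) || return 1
    # Insert the parameters just before the closing quote, then write the
    # result back in place so the file keeps its owner and permissions.
    sed 's/^\(GRUB_CMDLINE_LINUX=".*\)"[[:space:]]*$/\1 fsck.mode=force fsck.repair=yes"/' \
        "$grub_file" > "$tmp_file" && cat "$tmp_file" > "$grub_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

Run as root against /etc/default/grub, this would be followed by the same grub2-mkconfig and reboot steps. Once the filesystem checks out clean, the two parameters are probably worth removing again so fsck isn't forced on every boot.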
Recovering dotfiles
This is the easiest one to fix, since it's all stored in my public GitLab repo:
rm -rf ~/dotfiles
git clone
rake
Recovering yum
One of the configured repositories failed (Unknown),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
file is encrypted or is not a database
Running the file command on the failed repo's database returns just 'data' rather than recognizing it as an SQLite3 file, indicating it's been corrupted.
file /var/cache/yum/x86_64/7/jdoss-wireguard/gen/primary_db.sqlite
/var/cache/yum/x86_64/7/jdoss-wireguard/gen/primary_db.sqlite: data
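The same check can be done for every cached repo database at once instead of one file at a time. A rough sketch, relying on the fact that a valid SQLite 3 file begins with the string "SQLite format 3"; the function names is_sqlite3 and list_corrupt_sqlite are made up for illustration:

```shell
#!/bin/sh
# Returns 0 if the file starts with the SQLite 3 magic string
# "SQLite format 3" (15 characters), non-zero otherwise.
is_sqlite3() {
    # tr strips NUL bytes so the shell never sees them in the substitution.
    [ "$(head -c 15 "$1" 2>/dev/null | tr -d '\000')" = "SQLite format 3" ]
}

# Hypothetical helper: print every *.sqlite file under a directory whose
# header does not look like SQLite 3.
list_corrupt_sqlite() {
    find "$1" -type f -name '*.sqlite' | while IFS= read -r db; do
        is_sqlite3 "$db" || printf '%s\n' "$db"
    done
}
```

Calling list_corrupt_sqlite /var/cache/yum/x86_64/7 should list the damaged databases. This only inspects the header, so a file that passes can still be corrupted further in.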
Deleting that file fixed some things but wasn't the complete solution.
Next, delete all of the downloaded SQLite package databases:
sudo rm /var/cache/yum/x86_64/7/*/*primary.sqlite.bz2
That got a step closer: yum's error message changed to 'database disk image is malformed'.
yum clean all
yum makecache
Seemed to work, but now I'm getting a bunch of rpm errors:
error: rpmdbNextIterator: skipping h# 500 region trailer: BAD, tag 962398765 type 542860144 offset -1969648229 count 1901928553
And yum update is freezing and locking a CPU at 100%. Rebuild the RPMDB:
rm -f /var/lib/rpm/__db*
rpm --rebuilddb
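Since rm -f is not reversible, it may be worth archiving the database directory before touching it. A small sketch of that idea; backup_and_clear_rpmdb is a hypothetical name, and the backup path is whatever the caller chooses (it should live outside the database directory):

```shell
#!/bin/sh
# Hypothetical helper: archive an rpm database directory, then remove only
# the __db* environment files, leaving Packages and the index files alone.
# Prints the path of the backup archive on success.
backup_and_clear_rpmdb() {
    rpmdb_dir=$1
    backup=$2
    # Bail out without deleting anything if the backup could not be written.
    tar -czf "$backup" -C "$rpmdb_dir" . || return 1
    rm -f "$rpmdb_dir"/__db*
    printf '%s\n' "$backup"
}
```

As root, something like backup_and_clear_rpmdb /var/lib/rpm /root/rpmdb-backup.tar.gz followed by rpm --rebuilddb would then stand in for the two commands above.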
That worked, but now yum check reports a bunch of errors.
In particular,
/sbin/ldconfig: /lib64/libseaudit.so.4 is not an ELF file - it has the wrong magic bytes at the start.
/sbin/ldconfig: /lib64/libseaudit.so.4 is not a symbolic link
Figured out that libseaudit.so.4 is owned by the setools-libs package.
So reinstalling it fixes that error: sudo yum reinstall setools-libs
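If ldconfig complains about more than one library, the damaged paths can be pulled out of its output in one go. A sketch that assumes the message format matches the two lines quoted above; corrupt_libs_from_ldconfig is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read ldconfig output on stdin and print each unique
# path reported as "not an ELF file", one per line.
corrupt_libs_from_ldconfig() {
    sed -n 's|^.*ldconfig: \(/[^ ]*\) is not an ELF file.*$|\1|p' | sort -u
}
```

Each path can then be mapped to its owning package with rpm -qf, for example: ldconfig 2>&1 | corrupt_libs_from_ldconfig | xargs -r rpm -qf. The resulting package names are the candidates for yum reinstall.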
Then I ran yum check obsoleted duplicates dependencies again.