The House is on Fire!

Crisis Management in Linux

We’re halfway through the week and well into the series on Crisis Management in Linux. Changelogs and scheduled backups are both useful extinguishers for the “fire” that will eventually strike your Linux system. Sometimes, however, the solution to a show-stopping problem doesn’t require reverting to an earlier state. Linux has a built-in program for checking and correcting filesystem errors that may put out the fire before you need to rebuild your kitchen.

One of the most common crises in Linux is file or filesystem corruption, especially on the ext2 filesystem, which doesn’t handle “dirty” shutdowns well. We’ve talked about the reasons why in earlier Penguin Shell issues. In short, Linux caches disk writes. If the system crashes in the middle of a disk write, you’re likely to lose a file or, worse yet, a filesystem. The latter happens most often when the system is writing the metadata, the data about the data. The ext3 and ReiserFS filesystems overcome this hazard by journalling all file actions and checking at boot for a “completed” flag for each “pending” action. That’s an oversimplification, but you get the idea.
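To make the “pending” and “completed” idea concrete, here is a toy sketch in shell. It is not how ext3 or ReiserFS actually store their journals (those are binary, on-disk structures); the text format, the file, and the find_incomplete function are all made up for illustration. It only shows the boot-time question a journalling filesystem asks: which actions were started but never finished?

```shell
#!/bin/sh
# Toy model only: the journal format and names here are hypothetical.
# Each journal line is "<state> <action-id>", where state is
# "pending" or "completed".

# Print the ids of actions that were logged as pending but never completed.
find_incomplete() {
    awk '
        $1 == "pending"   { pending[$2] = 1 }
        $1 == "completed" { delete pending[$2] }
        END { for (id in pending) print id }
    ' "$1" | sort
}

# Hypothetical journal left behind by a crash in the middle of action 3.
journal=$(mktemp)
printf '%s\n' 'pending 1' 'completed 1' \
              'pending 2' 'completed 2' \
              'pending 3' > "$journal"

find_incomplete "$journal"   # prints 3
rm -f "$journal"
```

Only the unfinished action needs attention after the crash, which is why a journalled filesystem can usually recover quickly instead of scanning the whole disk.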

So, what’s the crisis remedy when an ext2 filesystem is corrupted? fsck will find and fix the corruption, sort of a Linux Serpico. fsck is not terribly hard to use, though it may take a while to check a large filesystem. In many ways it is the Linux equivalent of Windows’ ScanDisk. If your filesystems are corrupted by a power failure, for example, fsck is the tool to use to check and correct the corruption.

    fsck -t ext2 /dev/hda5

This command runs fsck, telling it to check a filesystem of [-t]ype ext2 on /dev/hda5. You’ll most commonly run it in response to a prompt at boot saying that a check is necessary. The output will look something like this:

/dev/hda5 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

Free blocks count wrong for group #4 (4488, counted=4516).   FIXED
Free blocks count wrong for group #5 (2866, counted=3154).   FIXED
Inode bitmap differences: -4145.   FIXED
Free inodes count wrong for group #3 (2166, counted=2168).   FIXED
Free inodes count wrong (21685, counted=21688).   FIXED

/dev/hda5 ***** FILE SYSTEM WAS MODIFIED *****
/dev/hda5 16921/51486 files, 120154/191216 blocks

In other words, fsck shows you the corrections it has made and lets you know when the process is complete.
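Besides the on-screen report, fsck also returns an exit status that a script can inspect. According to the fsck man page, that status is a sum of flag values (1 for errors corrected, 4 for errors left uncorrected, and so on). The sketch below decodes it; the function name describe_fsck_status is my own invention, not part of fsck, and the wording of each message is paraphrased.

```shell
#!/bin/sh
# Decode an fsck exit status (a sum of flag values, per fsck(8))
# into readable text, one line per condition.
# describe_fsck_status is a hypothetical helper, not part of fsck.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for entry in \
        "1:errors corrected" \
        "2:system should be rebooted" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user request" \
        "128:shared-library error"
    do
        bit=${entry%%:*}    # number before the colon
        text=${entry#*:}    # message after the colon
        if [ $((status & bit)) -ne 0 ]; then
            echo "$text"
        fi
    done
}

# Typical use, as root and with the filesystem unmounted:
#   fsck -t ext2 /dev/hda5; describe_fsck_status $?
describe_fsck_status 1   # prints: errors corrected
```

A status of 5, for example, would print both “errors corrected” and “errors left uncorrected”, since 5 is 1 plus 4.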

fsck has some advanced options and more condition-specific uses. For a list of these options and circumstances, the fsck man page is a great reference. Its scope is far too broad to cover in Penguin Shell.

Remember, filesystem corruption may be correctable using fsck. It should be the first weapon in your boot-time arsenal for managing a crisis in Linux.