|Execute code in ROM(BIOS)||Loads the sector from sector 0, cylinder 0 of the boot drive|
|Execute code in sector 0||Boot loader code such as LILO/GRUB loads the kernel (or the start of an operating system kernel). The loaded kernel initialises device drivers and its internal data structures|
|Kernel runs||Consults the ramdisk word, which tells the kernel how and where to find its root filesystem. The root filesystem can be loaded into a ramdisk: the kernel can load a compressed filesystem from the floppy and uncompress it onto the ramdisk|
|Execute init program||(in /bin or /sbin) init reads its configuration file /etc/inittab, looks for a line designated sysinit and executes the named script|
|Run sysinit script||/etc/rc or /etc/init.d/boot sets up basic system services e.g. runs fsck, loads kernel modules, initialises swapping and the network, and mounts disks mentioned in /etc/fstab. Invokes other scripts to do modular initialization.|
|Run /etc/rc.d/ scripts||Control returns to init, which enters the default runlevel and runs getty; getty invokes the login program to handle login validation and to set up user sessions.|
- Main programs run during the boot process
- LILO fails to boot
a common cause
A common LILO problem is forgetting to rerun 'lilo' after changes to lilo.conf.
'lilo' must be run after any change to /etc/lilo.conf and whenever a new kernel is installed, because it records the absolute location of the kernel on disk in the MBR or partition boot record.
'lilo' needs to access the partition to locate the image. If the configuration file includes Linux images from multiple partitions, these must all be mounted before 'lilo' runs.
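The "forgot to rerun lilo" case can be spotted by comparing timestamps. A minimal sketch, assuming the map file that 'lilo' rewrites on each run lives at /boot/map (the usual default, but check the map= line in your lilo.conf); the function name is made up for illustration.

```shell
# Sketch: report whether lilo.conf is newer than the LILO map file.
# The default paths are assumptions; pass your own as $1 and $2.
lilo_needs_rerun() {
    conf=${1:-/etc/lilo.conf}
    map=${2:-/boot/map}
    if [ ! -e "$conf" ]; then
        echo "no-config"          # nothing to compare against
    elif [ ! -e "$map" ] || [ "$conf" -nt "$map" ]; then
        echo "rerun"              # config changed since 'lilo' last ran
    else
        echo "ok"
    fi
}
```

Calling it with no arguments checks the default paths. It only compares timestamps, so it cannot detect a kernel image that was replaced without touching lilo.conf.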
Display lilo.conf file settings - lilo
# lilo -qv
- LILO prompt is a diagnostic tool
The LILO prompt provides status codes. The number of letters of the prompt that appear shows how far LILO managed to get in its boot process.
LILO boot diagnostic codes
|Code||Meaning|
|L||First stage boot loader is loaded and started|
|LI||Second stage boot loader is loaded|
|LIL||Second stage boot loader is started|
|LIL?||Second stage boot loader was loaded at an incorrect address|
|LIL-||Descriptor table is corrupt|
|LILO||All of LILO has loaded correctly|
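The table above as a lookup, handy on a rescue system where these notes are not to hand. A small sketch only; the function name is hypothetical and the messages are copied from the table.

```shell
# Sketch: translate a partial LILO prompt into its meaning.
lilo_prompt_meaning() {
    case "$1" in
        L)      echo "First stage boot loader is loaded and started" ;;
        LI)     echo "Second stage boot loader is loaded" ;;
        LIL)    echo "Second stage boot loader is started" ;;
        # '?' must be quoted, otherwise it is a wildcard for any character
        'LIL?') echo "Second stage boot loader was loaded at an incorrect address" ;;
        LIL-)   echo "Descriptor table is corrupt" ;;
        LILO)   echo "All of LILO has loaded correctly" ;;
        *)      echo "Unknown prompt: $1"; return 1 ;;
    esac
}
```

For example, `lilo_prompt_meaning LI` reports that the second stage was loaded, so the failure lies in starting it.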
- GRUB (v1)
Stage 1 codes
|Message||Description|
|Hard Disk Error||Attempt to determine the size and geometry of the hard disk failed.|
|Floppy Error||Attempt to determine the size and geometry of the floppy disk failed.|
|Read Error||Disk read error happened while trying to read the stage2 or stage1.5.|
|Geom Error||Location of the stage2 or stage1.5 is not in the portion of the disk supported directly by the BIOS read calls. This could occur because the BIOS translated geometry has been changed by the user, the disk was moved to another machine or controller after installation, or GRUB was not installed using itself.|
Stage 2 codes
|Message||Description|
|Filename must be either an absolute filename or blocklist||A file name is requested which doesn't fit the syntax/rules listed in the Filesystem.|
|Bad file or directory type||A file requested is not a regular file, but something like a symbolic link, directory, or FIFO.|
|Bad or corrupt data while decompressing file||The run-length decompression code gets an internal error. This is usually from a corrupt file.|
|Bad or incompatible header in compressed file||The file header for a supposedly compressed file is bad.|
|Partition table invalid or corrupt||The sanity checks on the integrity of the partition table fail. This is a bad sign.|
|Mismatched or corrupt version of stage1/stage2||The install command points to incompatible or corrupt versions of the stage1 or stage2.|
|Loading below 1MB is not supported||The lowest address in a kernel is below the 1MB boundary. The Linux zImage format is a special case and can be handled.|
|Kernel must be loaded before booting||GRUB is told to execute the boot sequence without having a kernel to start.|
|Unknown boot failure||The boot attempt did not succeed for reasons which are unknown.|
|Unsupported Multiboot features requested||The Multiboot features word in the Multiboot header requires a feature that is not recognised.|
|Unrecognised device string||A device string was expected, and the string encountered didn't fit the syntax/rules listed in the Filesystem.|
|Invalid device requested||A device string is recognizable but does not fall under the other device errors.|
|Invalid or unsupported executable format||The kernel image being loaded is not recognised as Multiboot or one of the supported native formats.|
|Filesystem compatibility error, cannot read whole file||Some of the filesystem reading code in GRUB has limits on the length of the files it can read.|
|File not found||The specified file name cannot be found, but everything else (like the disk/partition info) is OK.|
|Inconsistent filesystem structure||Returned by the filesystem code to denote an internal error caused by the sanity checks of the filesystem structure on disk not matching what it expects. This is usually caused by a corrupt filesystem or bugs in the code handling it in GRUB.|
|Cannot mount selected partition||The partition requested exists, but the filesystem type cannot be recognised by GRUB.|
|Selected cylinder exceeds maximum supported by BIOS||A read is attempted at a linear block address beyond the end of the BIOS translated area. This generally happens if your disk is larger than the BIOS can handle (512MB for (E)IDE disks on older machines or larger than 8GB in general).|
|Linux kernel must be loaded before initrd||The initrd command is used before loading a Linux kernel.|
|Multiboot kernel must be loaded before modules||The module load command is used before loading a Multiboot kernel.|
|Selected disk does not exist||The device part of a device- or full file name refers to a disk or BIOS device that is not present or not recognised by the BIOS.|
|No such partition||A partition is requested in the device part of a device- or full file name which isn't on the selected disk.|
|Error while parsing number||GRUB was expecting to read a number and encountered bad data.|
|Attempt to access block outside partition||A linear block address is outside of the disk partition. Usually caused by a corrupt filesystem on the disk or a bug in the code handling it in GRUB.|
|Disk read error||There is a disk read error when trying to probe or read data from a particular disk.|
|Too many symbolic links||The link count is beyond the maximum (currently 5); possibly the symbolic links are looped.|
|Unrecognised command||An unrecognised command is entered on the command-line or in a boot sequence section of a configuration file.|
|Selected item cannot fit into memory||A kernel, module, or raw file load command is either trying to load its data such that it won't fit into memory or it is too big.|
|Disk write error||There is a disk write error when trying to write to a particular disk. Generally occurs during an install or set active partition command.|
|Invalid argument||An argument specified to a command is invalid.|
|File is not sector aligned||Occurs only when you access a ReiserFS partition by block-lists (e.g. the command install). In this case, you should mount the partition with the '-o notail' option.|
|Must be authenticated||You try to run a locked entry. You should enter a correct password before running such an entry.|
|Serial device not configured||You try to change your terminal to a serial one before initializing any serial device.|
|No spare sectors on the disk||A disk doesn't have enough spare space. This happens when you try to embed Stage 1.5 into the unused sectors after the MBR, but the first partition starts right after the MBR or they are used by EZ-BIOS.|
- GRUB (v1) boot
command line sequence
If there is no /boot/grub/menu.lst file, the kernel is incorrectly specified, or there is a problem accessing partitions, GRUB drops to a prompt and expects a sequence of commands.
Boot command sequence
grub> root (hd0,4)    (Location of root filesystem)
grub> kernel /vmlinuz-2.6.24-23 root=/dev/sda5 ro vga=0x317    (kernel name: kernel location, kernel params) (root device: location of /boot)
grub> initrd /initrd.img-2.6.24-23    (initrd name: name and location of ramdisk image, if being used)
grub> boot
- GRUB2 diagnostic codes
GRUB2 does not provide any diagnostic codes; instead it drops to a rescue shell whenever there is a problem. Getting a rescue shell usually means that GRUB failed to load the 'normal' module.
The first thing 'grub2' does is read the 'prefix'. If this is wrong (i.e. it refers to the wrong device or the relative path to /boot/grub/ is wrong) it will be unable to find what it needs. This can be corrected via the rescue shell.
Load normal module and continue boot process
# Inspect the current prefix (and other preset variables):
grub rescue> set
# Find out which devices are available:
grub rescue> ls
# Set to the correct value, which might be something like this:
grub rescue> set prefix=(hd0,1)/grub
grub rescue> set root=(hd0,1)
grub rescue> insmod normal
grub rescue> normal
Generally though, if you get a rescue shell it probably means that GRUB was not installed correctly and needs to be re-installed via 'grub-install [device]'
- Boot failure
No operating system found
Occurs after the BIOS has run but before the OS is loaded. Could be that:
- boot sector is trashed and needs to be reinstalled
- active partition has no boot sector
- wrong partition designated as active - misconfiguration
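Two of the causes above can be checked from a rescue environment by reading the first sector directly. A rough sketch, assuming a classic MBR layout (boot signature 55 aa at bytes 510-511, four 16-byte partition entries starting at byte 446, flag byte 80 marking the active one); it does not understand GPT disks, and the function names are invented here.

```shell
# Sketch: inspect a disk (or a saved copy of its first sector).
# $1 is a device such as /dev/sda or an image file.

# Print the two signature bytes as hex; a valid boot sector gives "55aa".
boot_signature() {
    dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Print the number (1-4) of every primary partition flagged active.
active_partitions() {
    n=1
    for off in 446 462 478 494; do
        flag=$(dd if="$1" bs=1 skip="$off" count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
        [ "$flag" = "80" ] && echo "$n"
        n=$((n + 1))
    done
    return 0
}
```

No output from `active_partitions` means no partition is marked active; more than one line, or the wrong number, points at the misconfiguration case.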
- Boot failure
Several possible causes:
- bad hard drive sector
- cannot find a kernel module
- cannot find root device
Use on-screen information to try and determine where the problem lies.
- Boot failure
fails to run 'init' and enter correct runlevel
Usually down to configuration errors in /etc/inittab or whatever the distribution uses
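A quick sanity check of the default runlevel. A sketch only, assuming a SysV-style /etc/inittab whose lines have the form id:runlevels:action:process; distributions that use something else need a different check. The function names are made up.

```shell
# Sketch: print the runlevel named on the initdefault line of an inittab.
default_runlevel() {
    awk -F: '!/^[[:space:]]*#/ && $3 == "initdefault" { print $2; exit }' \
        "${1:-/etc/inittab}"
}

# Flag the classic mistakes: no initdefault line, or a default of 0 (halt)
# or 6 (reboot).
check_default_runlevel() {
    rl=$(default_runlevel "$1")
    case "$rl" in
        "")   echo "no initdefault line found"; return 1 ;;
        0|6)  echo "default runlevel $rl would halt or reboot"; return 1 ;;
        *)    echo "default runlevel is $rl" ;;
    esac
}
```

From a rescue environment, point it at the damaged system's file, e.g. `check_default_runlevel /mnt/sysimage/etc/inittab`.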
- Boot failure
fails to start one or more processes in a run-level
Generally application level faults e.g.
- 'autofs' will not start if no maps are defined
- 'httpd' may fail if httpd.conf or apache2.conf is misconfigured
- dependent software may not be installed or configured correctly
- network applications may have a problem with name resolution
- Boot failure
fails to present a login prompt
Down to how the program used for this function (getty, gdm, xdm, kdm ...) is configured; for some reason it cannot run.
- Boot failure
Possible solutions include:
- (a) Boot from floppy (we all keep one handy!!) and re-install LILO or GRUB.
- (b) If no boot floppy, boot from install media (or rescue/live cd) and select recovery mode.
The root filesystem is usually mounted under /mnt/sysimage. If it is not then mount the / filesystem (and /boot if separate filesystem) manually.
Change root to the mount point
# chroot /mnt/sysimage
and re-install LILO or GRUB
- Boot failure
partition number has changed
If this is the case, booting from a floppy will not work, as it is the Stage 2 part of the boot loader that has moved. Boot from install media and re-install LILO or GRUB as described above. To avoid this type of problem, use filesystem labels (or UUIDs) whenever possible.
Use labels with LILO or GRUB
LILO: /etc/lilo.conf
append="root=LABEL=boot_partition_label"
GRUB: /boot/grub/menu.lst
kernel /vmlinuz-2.6.24-23 root=LABEL=boot_partition_label ro vga=0x317
Remember to rerun 'lilo' after modifying its configuration file.
Also consider having '/boot' as a separate filesystem in a partition that is unlikely to be changed or affected by changes to other partitions - especially if '/' is in a Logical Volume.
- Boot failure
system init failure
When there is a problem with 'init' you need to bypass it and work within a shell instead.
(1) Bypass 'init' at either the lilo or grub prompt
The resulting shell provides no job control, so 'ctrl-C' and 'ctrl-Z' do not work; there is also no mouse, no VTs, etc.
- A hung or long running program can only be stopped by a re-boot. To prevent having to re-boot, put it in the background.
- Shift-Page up/down to scroll or page the console display.
- Can use 'dd' to view, copy, edit files - not very friendly mind.
- 'cat' can be used to list files or as a hard-core editor.
Using 'dd' as unwieldy editor
$ more fred.txt
1
2
3
4
.....
$
$ dd ibs=1 skip=2 count=3 if=fred.txt of=bert.txt
3+0 records in
0+1 records out
3 bytes (3 B) copied, 0.000817361 s, 3.7 kB/s
$
$ more bert.txt
2
3
$
'ibs' sets the input block size (here 1 byte), 'skip' skips the first 2 bytes, and 'count' reads the next 3 bytes, which are written to the 'of' file bert.txt.
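The same byte-range trick wrapped in a function, so the offsets do not have to be retyped each time. A minimal sketch; the function name is made up, and it writes to standard output rather than to an 'of' file.

```shell
# Sketch: copy COUNT bytes of FILE, starting after the first SKIP bytes,
# to standard output.
# usage: extract_bytes FILE SKIP COUNT
extract_bytes() {
    dd ibs=1 skip="$2" count="$3" if="$1" 2>/dev/null
}
```

`extract_bytes fred.txt 2 3 > bert.txt` reproduces the example above. With a 1-byte block size dd is slow on large files, which hardly matters for the small configuration files this is meant for.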
Using 'cat' as a hardcore editor
# cat > /etc/fstab
.......
(end input with ctrl-D)
As with 'dd', it may be better to boot from a rescue disk!! 'busybox' may be available, giving access to more commands.
(2) Mount '/' - if unsure if the root filesystem is okay then check it first
# fsck -f /
# mount -o remount,rw /
(3) Mount '/' - if '/etc/fstab' is damaged may need to use device names
# fsck -f /dev/sda?     (? = / partition number)
# mount -o remount,rw /dev/sda?
(4) Mount other filesystems - if '/etc/fstab' is intact
# fsck -A -a
# mount -av
Determine the 'init' fault by running and observing each startup script manually. Fix the problem, then boot into single-user mode for a friendlier environment.
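To run the startup scripts by hand they first have to be found in the order init would use. A sketch, assuming SysV-style runlevel directories such as /etc/rc3.d with S-prefixed start scripts (the directory name is an assumption and varies by distribution); the function name is hypothetical.

```shell
# Sketch: list the executable start scripts of a runlevel directory in
# the order they would run (the shell expands S* in sorted order).
list_start_scripts() {
    dir=${1:-/etc/rc3.d}
    for s in "$dir"/S*; do
        [ -x "$s" ] && echo "$s"
    done
    return 0
}

# Typical manual use, one script at a time with tracing switched on:
#   for s in $(list_start_scripts /etc/rc3.d); do
#       echo "== $s"; sh -x "$s" start
#   done
```

Remember that this shell has no job control, so a script that hangs in the loop can only be stopped by a re-boot; running the scripts one by one is slower but safer.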
- Restoring the bootstrap
reinstall GRUB or LILO
Mount the root filesystem and reinstall the boot loader
# chroot /mnt /bin/bash     (Assume / has been mounted on /mnt)
# mount /boot
# grub-install /dev/sda
(or)
# lilo -v
- Get more commands to work on a damaged system
Easier if /etc/mtab (linked to /proc/mounts) is correct - usually is but not so in the chroot environment.
# chroot /mnt /bin/bash
# mount /proc
# cat /proc/mounts > /etc/mtab
# fsck -A -a
# mount -av
.....
# df -k     (Will now work)
- Loss of key files
Missing binaries and shared libraries, for example, require reinstallation. Many files in /etc can be copied from other machines or written by hand.
Use a rescue CD - go into the chroot environment - get network up by running networking initialisation scripts - use 'ftp', 'scp', 'wget' or a package update command such as 'apt-get' to get required files from another machine.
If the system is so badly damaged that running programs is hard or impossible, 'dpkg' and 'rpm' support '--root=/mnt' and '--root /mnt' respectively, allowing packages to be installed without doing a chroot first. The easiest way is to get a tar file of the damaged/missing files, copy it to the damaged system and unpack it on top of the damaged filesystem.
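The tar approach can be sketched as below. This assumes the damaged root is mounted at /mnt and that the archive was created with paths relative to / (for example with 'tar -cf files.tar -C / etc/fstab'); the function name and the archive name are illustrative only.

```shell
# Sketch: unpack an archive of replacement files on top of a damaged
# root filesystem. $1 = tar file, $2 = mount point of the damaged root.
restore_from_tar() {
    archive=$1
    root=${2:-/mnt}
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    [ -d "$root" ]    || { echo "no such directory: $root" >&2; return 1; }
    # Listing the archive first avoids half-unpacking a corrupt file.
    tar -tf "$archive" >/dev/null || return 1
    # -p keeps the permissions recorded in the archive.
    tar -xpf "$archive" -C "$root"
}
```

Files already present under the mount point are overwritten only if the archive contains a file of the same name; everything else is left alone.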
- Checking a filesystem
For recovery purposes, the ability to control mount points allows for forensic analysis on partitions using 'fsck' or other tools without risk of further damage to a damaged filesystem.
May also custom mount a filesystem using various options, the most important being read-only using either of the synonyms '-r' or '-o ro'.
See the Filesystem section for more details on mounting and maintaining.
- Boot diskettes
Mainly used to install a distribution when the system cannot boot from CD-rom (netbooting is an alternative), or when the boot disk has a problem.
Creating boot floppies is dependent upon the distribution. They can require the creation of one or more boot diskettes - a boot diskette and maybe a root diskette.
This usually involves copying a boot image to the floppy, the boot image may be called anything - boot.img, rootdisk.img ...
|Diskette type||Description|
|boot||Contains a kernel which can be booted. The kernel on a bootdisk usually needs to be told where to find its root filesystem. Can be set up to load a hard disk's root filesystem instead.|
|root||Contains a filesystem with files required to run a Linux system. Can be used to run the system independently of any other disks, once the kernel has been booted. Usually the root disk is automatically copied to a ramdisk (disk accesses are much faster, and it frees up the disk drive for a utility disk).|
|boot/root||Contains both the kernel and a root fs, i.e. everything necessary to boot and run a Linux system without a hard disk.|
|utility||Contains a filesystem not intended to be mounted as a root file system. Additional data disk.|
- Create a (generic) boot floppy
Mount the installation CD-rom and insert a blank floppy (leave the floppy unmounted, as the image is written to the raw device)
# mount -t iso9660 /dev/cdrom /media/cdrom
Locate 'boot.img' file on CD-rom (usually 'images/boot.img').
Copy boot.img to floppy
# dd if=/media/cdrom/images/boot.img of=/dev/fd0
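A raw image copy can be checked by reading the same number of bytes back and comparing them with the image. A sketch only; the function name is made up, and the target is whatever was given to 'of='.

```shell
# Sketch: write an image to a target and verify what was written.
# $1 = image file, $2 = target (a device, or a plain file for a dry run).
write_image() {
    img=$1
    target=$2
    # conv=notrunc only matters when the target is a plain file;
    # it is harmless on a device.
    dd if="$img" of="$target" bs=512 conv=notrunc 2>/dev/null || return 1
    size=$(( $(wc -c < "$img") ))
    # Compare only the first $size bytes: the target may be larger.
    head -c "$size" "$target" | cmp -s - "$img"
}
```

The function returns non-zero if the copy fails or the data read back differs, which on a floppy usually means bad media.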
If using a windows machine (CD-rom on d:)
c:\rawrite.exe
Enter source file name: d:\images\boot.img
Enter destination drive: A
- Create debian and redhat boot floppies
Debian: 'mkboot' makes a bootdisk. By default the bootdisk will use the kernel /vmlinuz and the current root partition.
mkboot [-r rootpartition] [-i] [-d device] [kernel]
$ sudo mkboot
Insert a floppy diskette into your boot drive, and press <Return>.
Creating a lilo bootdisk...
Red Hat: 'mkbootdisk' creates a boot floppy appropriate for the running system. The boot disk is entirely self-contained and includes an initial ramdisk image which loads any necessary SCSI modules for the system.
The created boot disk looks for the root filesystem on the device suggested by /etc/fstab. The only required argument is the kernel version to put onto the boot floppy.
mkbootdisk [options] kernel