



Software RAID troubleshoot

After upgrading my desktop Slackware64 from 14.1 to 14.2 (including a kernel upgrade),
I ended up with a system that, after the LILO menu,
printed “loading kernel ………………………..”
and then stopped completely, with nothing more.

The initial configuration was:
Intel DG965SS motherboard, Core 2 Duo E4500 2.2 GHz CPU, 8 GB RAM,
2 x 1000 GB Seagate SATA HDDs (ST1000DM003-1CH1, as sda and sdb),
DVD writer on the SATA4 port

Both Seagate disks have an MBR partition table with four partitions of type FD (Linux RAID autodetect):

  • 100 GB root (md1)
  • 2 GB swap (md2)
  • 350 GB /home (md3)
  • 550 GB /Second (md4)

cat /proc/mdstat :

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md1 : active raid1 sda1[0] sdb1[1]
      104857536 blocks [2/2] [UU]
 
md2 : active raid1 sda2[0] sdb2[1]
      2097088 blocks [2/2] [UU]
 
md3 : active raid1 sda3[0] sdb3[1]
      367001536 blocks [2/2] [UU]
 
md4 : active raid1 sda4[0] sdb4[1]
      502805120 blocks [2/2] [UU]
 
unused devices: <none>

mdadm -Es :

ARRAY /dev/md1 UUID=7cc47bea:832f8260:208cdb8d:9e23b04b
ARRAY /dev/md2 UUID=cce81d3a:78965aa5:208cdb8d:9e23b04b
ARRAY /dev/md3 UUID=f0bc71fc:8467ef54:208cdb8d:9e23b04b
ARRAY /dev/md4 UUID=3f4daae2:cbf37a2a:208cdb8d:9e23b04b

# for p in 1 2 3 4; do mdadm --create /dev/md$p --name=$p --level=1 --raid-devices 2 /dev/sda$p /dev/sdb$p --metadata=0.90; done

My "fall" and "success" story

I had a Slackware64 14.1 system with RAID1 across two disks.
There was no mdadm.conf and no initrd; I used the “huge” kernel, and everything just worked.
Then I did a massive system update with slackpkg upgrade-all, which included a kernel update.
I checked lilo.conf and rebooted; everything looked OK. Then I decided to upgrade the system to 14.2, again with slackpkg
(it seems life on macOS is too boring and too predictable: everything works, and so on… :D). As always, I checked lilo.conf and checked that the new kernel was named correctly. I had no old kernel as a backup,
only one entry in LILO (which was not a good thing at all!), and I rebooted.
Then all the interesting things started!!! :)

I got the LILO menu and the kernel started loading. It showed lots of “…”, but then everything stopped and nothing more happened. I supposed this indicated a problem with updating LILO, so I wanted to boot the system and re-run lilo -v. To do that, I booted from AlienBob's Slackware64 Live CD (http://bear.alienbase.nl/mirrors/slackware-live/) and tried to mount my root partition so that I could run lilo again.

But there was a big problem!

After booting from the Slackware Live CD, there were no /dev/md1, /dev/md2, /dev/md3 or /dev/md4 devices!

And this was my lilo.conf:

# LILO configuration file
# generated by 'liloconfig'
#
# Start LILO global section
# Append any additional kernel parameters:
append=" vt.default_utf8=1"
boot = /dev/sda

#compact        # faster, but won't work on all systems.

# Boot BMP Image.
# Bitmap in BMP format: 640x480x8
  bitmap = /boot/slack.bmp
# Menu colors (foreground, background, shadow, highlighted
# foreground, highlighted background, highlighted shadow):
  bmp-colors = 255,0,255,0,255,0
# Location of the option table: location x, location y, number of
# columns, lines per column (max 15), "spill" (this is how many
# entries must be in the first column before the next begins to
# be used.  We don't specify it here, as there's just one column.
  bmp-table = 60,6,1,16
# Timer location x, timer location y, foreground color,
# background color, shadow color.
  bmp-timer = 65,27,0,255

# Standard menu.
# Or, you can comment out the bitmap menu above and 
# use a boot message with the standard menu:
#message = /boot/boot_message.txt

# Wait until the timeout to boot (if commented out, boot the
# first entry immediately):
prompt
# Timeout before the first entry boots.
# This is given in tenths of a second, so 600 for every minute:
timeout = 1200
# Override dangerous defaults that rewrite the partition table:
change-rules
  reset
# Normal VGA console
vga = normal
# Ask for video mode at boot (time out to normal in 30s)
#vga = ask
# VESA framebuffer console @ 1024x768x64k
#vga=791
# VESA framebuffer console @ 1024x768x32k
#vga=790
# VESA framebuffer console @ 1024x768x256
#vga=773
# VESA framebuffer console @ 800x600x64k
#vga=788
# VESA framebuffer console @ 800x600x32k
#vga=787
# VESA framebuffer console @ 800x600x256
#vga=771
# VESA framebuffer console @ 640x480x64k
#vga=785
# VESA framebuffer console @ 640x480x32k
#vga=784
# VESA framebuffer console @ 640x480x256
#vga=769
# End LILO global section
# Linux bootable partition config begins
image = /boot/vmlinuz
  root = /dev/md1
  label = Linux
  read-only
# Linux bootable partition config ends

So I did some research and ran dmesg | grep md

From the array sizes I worked out which device was my root partition (it was 100 GB in size). It was md126.
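Picking the root array out by its size can be made a little easier with a small helper. This is only a sketch: the function name md_sizes is made up for this page, and it assumes the two-line /proc/mdstat layout shown earlier (an "mdN :" line followed by a "… blocks" line, with blocks counted in KiB).

```shell
# md_sizes: read /proc/mdstat-style text on stdin and print each
# array with its size in GiB (mdstat reports sizes in 1 KiB blocks).
# Usage on a running system: md_sizes < /proc/mdstat
md_sizes() {
    awk '
        # Remember the array name from a line such as "md126 : active raid1 ..."
        /^md[0-9]+ :/ { dev = $1; next }
        # The following line starts with "<number> blocks": convert and print
        dev != "" && $2 == "blocks" {
            printf "%s %.1f GiB\n", dev, $1 / 1048576
            dev = ""
        }
    '
}
```

On a layout like mine, the array reported at about 100 GiB is the root partition, whatever md number it has been given.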

I mounted it: mount /dev/md126 /mnt/hd

Then I ran mc and checked that I had really mounted my root partition there.

Then, in the console, I ran chroot /mnt/hd /sbin/lilo -v 3

But the lilo command ended with an error: it could not find the root partition, /dev/md1.

That was the problem: for whatever reason, /dev/md1 had now become /dev/md126. I then did some reading about RAID subsystems, forums and so on, and made some mistakes and experiments, which resulted in these shortcuts:

Useful commands in this case

mdadm -Es
mdadm -As

mdadm -D /dev/md127
mdadm -Db /dev/md127

lsscsi
cat /proc/mdstat
ls /dev/disk/by-uuid/

dmesg | grep md

chroot /mnt/hd /sbin/lilo -v 3

mdadm --stop /dev/md127
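The stop command above can be combined with an explicit assemble to bring an array back under the name that lilo.conf expects. The following is a sketch of one possible approach, not necessarily the exact steps I took. The helper name reassemble_as and the RUN variable are made up for this page, and the device names in the usage line are the ones from my system. It only prints the commands unless RUN=1 is set, so you can review them first.

```shell
# reassemble_as CURRENT WANTED MEMBER...
# Stop an array that came up under an unexpected name and assemble it
# again under the wanted one. Dry run by default: the commands are only
# printed. Set RUN=1 to execute them (needs root, and the array must
# not be mounted, otherwise mdadm --stop will refuse).
reassemble_as() {
    current=$1   # name picked by the live system, e.g. /dev/md126
    wanted=$2    # name lilo.conf expects, e.g. /dev/md1
    shift 2      # the remaining arguments are the member partitions
    for cmd in "mdadm --stop $current" "mdadm --assemble $wanted $*"; do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}
```

For example, reassemble_as /dev/md126 /dev/md1 /dev/sda1 /dev/sdb1 prints the two mdadm commands for the root array. After running them for real, cat /proc/mdstat should list md1 instead of md126.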

kernel options:

Useful Links:

Sources

Originally written by — John Ciemgals 2013/02/07 04:50

Rewritten using material from “Links” and the LinuxQuestions.org Slackware forum, with particular help from user bassmadrigal and from bormant of linux.org.ru — John Ciemgals 2014/03/19 01:15
