Differences

This shows you the differences between two versions of the page.

howtos:misc:software_raid_troubleshoot_howto [2016/11/26 17:01 (UTC)] wisedraco
howtos:misc:software_raid_troubleshoot_howto [2016/11/27 12:41 (UTC)] – [My fall and sucess story] wisedraco
Line 1: Line 1:
 +<!-- Add your text below. We strongly advise to start with a Headline (see button bar above). -->
 +
 +====== Software RAID troubleshoot ======
 +
 +Once, after upgrading my desktop Slackware64 from 14.1 to 14.2 (including a kernel upgrade),\\ 
 +I ended up with a system that, after the LILO menu,\\
 +printed "loading kernel ............................." \\
 +\\
 +and then stopped completely - nothing more.\\
 +\\
 +The initial configuration was:\\
 +Intel DG965SS motherboard, Core 2 Duo E4500 2.2 GHz CPU, 8 GB RAM, \\
 +2 x 1000 GB Seagate SATA HDDs (as sda and sdb),\\
 +model ST1000DM003-1CH1,\\
 +DVD writer on the SATA4 port
 +
 +Both Seagate disks have an MBR partition table with 4 partitions of type FD (Linux RAID autodetect):
 +  * 100 Gb root (md1)
 +  * 2 Gb swap (md2)
 +  * 350 Gb /home (md3)
 +  * 550 Gb /Second (md4)
 +\\
 +\\
 +Output of cat /proc/mdstat:
 +
 +<code bash>
 +Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
 +md1 : active raid1 sda1[0] sdb1[1]
 +      104857536 blocks [2/2] [UU]
 +      
 +md2 : active raid1 sda2[0] sdb2[1]
 +      2097088 blocks [2/2] [UU]
 +      
 +md3 : active raid1 sda3[0] sdb3[1]
 +      367001536 blocks [2/2] [UU]
 +      
 +md4 : active raid1 sda4[0] sdb4[1]
 +      502805120 blocks [2/2] [UU]
 +      
 +unused devices: <none>
 +</code>
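
A quick health check can be scripted on top of that output. The following is a minimal sketch, not part of the original recovery: the function name md_health is made up for illustration, and it assumes the mdstat layout shown above (an array header line followed by a line ending in a status string such as [UU]).

```shell
#!/bin/sh
# md_health (hypothetical helper): read mdstat-formatted text on stdin and
# print one line per array: name, status string, and "ok" or "DEGRADED".
md_health() {
    awk '
        # Array header line, for example "md1 : active raid1 sda1[0] sdb1[1]"
        /^md[0-9]+ :/ { name = $1 }
        # Status line, for example "104857536 blocks [2/2] [UU]";
        # an underscore in the status string marks a missing member.
        name != "" && match($0, /\[[U_]+\]/) {
            status = substr($0, RSTART, RLENGTH)
            state = (index(status, "_") > 0) ? "DEGRADED" : "ok"
            print name, status, state
            name = ""
        }
    '
}

# Typical use on a live system:
#   md_health < /proc/mdstat
```

Because the function only reads standard input, it can also be run against a saved copy of /proc/mdstat taken from a rescue environment.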
 +\\
 +\\
 +Output of mdadm -Es:
 +
 +<code bash>
 +
 +ARRAY /dev/md1 UUID=7cc47bea:832f8260:208cdb8d:9e23b04b
 +ARRAY /dev/md2 UUID=cce81d3a:78965aa5:208cdb8d:9e23b04b
 +ARRAY /dev/md3 UUID=f0bc71fc:8467ef54:208cdb8d:9e23b04b
 +ARRAY /dev/md4 UUID=3f4daae2:cbf37a2a:208cdb8d:9e23b04b
 +
 +</code>
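
Since the array UUID stays the same even when the kernel renames md1 to md126, the scan output can serve as a lookup table. The sketch below is only an illustration: md_for_uuid is a hypothetical helper name, and it assumes the simple "ARRAY device UUID=..." line format shown above.

```shell
#!/bin/sh
# md_for_uuid (hypothetical helper): given an array UUID as $1 and
# "mdadm -Es" style text on stdin, print the device recorded for that UUID.
md_for_uuid() {
    awk -v want="UUID=$1" '
        $1 == "ARRAY" {
            # Fields after the device name are key=value pairs; look for the UUID.
            for (i = 3; i <= NF; i++)
                if ($i == want) print $2
        }
    '
}

# Typical use on a live system (requires root for mdadm):
#   mdadm -Es | md_for_uuid 7cc47bea:832f8260:208cdb8d:9e23b04b
```

The same scan output is what is commonly saved as /etc/mdadm.conf so that arrays are assembled under fixed names.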
 +\\
 +\\
 +<code>
 +# for p in 1 2 3 4; do mdadm --create /dev/md$p --name=$p --level=1 --raid-devices 2 /dev/sda$p /dev/sdb$p --metadata=0.90; done
 +</code>
 +\\
 +
  ===== My "fall" and "sucess" story =====  ===== My "fall" and "sucess" story =====
  
Line 5: Line 65:
 Then I did a massive system update via slackpkg upgrade-all, including a kernel update.\\ Then I did a massive system update via slackpkg upgrade-all, including a kernel update.\\
 I checked lilo.conf and restarted - all looked OK. Then I decided to upgrade the system to 14.2 via the same slackpkg \\ I checked lilo.conf and restarted - all looked OK. Then I decided to upgrade the system to 14.2 via the same slackpkg \\
-(it seems life on macOS is too boring, too predictable - everything works, and so on... :D)+(it seems life on macOS is too boring, too predictable - everything works, and so on... :D)\\
 As always, I checked lilo.conf, checked that the new kernel was named correctly, kept no old kernel as a backup - \\ As always, I checked lilo.conf, checked that the new kernel was named correctly, kept no old kernel as a backup - \\
 only one entry in LILO (which was not a good thing at all!) - and rebooted. only one entry in LILO (which was not a good thing at all!) - and rebooted.
Line 96: Line 156:
 # Linux bootable partition config ends # Linux bootable partition config ends
 </code> </code>
-\\+\\ 
 So I did some research and ran So I did some research and ran
 <code> <code>
 dmesg |grep md dmesg |grep md
 </code> </code>
-\\+\\ 
 and from the array sizes I worked out which array held my root partition (it was 100 GB in size): it was md126. and from the array sizes I worked out which array held my root partition (it was 100 GB in size): it was md126.
-\\+\\ 
 I mounted it: I mounted it:
 <code> <code>
 mount /dev/md126 /mnt/hd mount /dev/md126 /mnt/hd
 </code> </code>
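
Identifying a renamed array by its size can also be done from /proc/mdstat rather than dmesg. This is a rough sketch under that assumption, which is not what the story above used; md_by_blocks is a hypothetical helper, and the block count 104857536 is the root array size from the mdstat listing earlier on this page.

```shell
#!/bin/sh
# md_by_blocks (hypothetical helper): read mdstat-formatted text on stdin and
# print the name of every array whose size in blocks equals $1.
md_by_blocks() {
    awk -v want="$1" '
        # Remember the most recent array header, e.g. "md126 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The size line looks like "104857536 blocks [2/2] [UU]"
        $2 == "blocks" && $1 == want { print name }
    '
}

# Typical use on a live system:
#   md_by_blocks 104857536 < /proc/mdstat
```

If two arrays happen to have the same size, both names are printed, so the result still needs a look at the mounted contents.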
-\\+\\  
  
 Then I ran mc and checked that I had really mounted my root partition there. Then I ran mc and checked that I had really mounted my root partition there.
-\\+\\ 
  
 Then I ran in the console: Then I ran in the console:
-\\+\\ 
 <code> <code>
 chroot /mnt/hd /sbin/lilo -v 3 chroot /mnt/hd /sbin/lilo -v 3
 </code> </code>
-\\+\\ 
 But the lilo command ended with an error - it could not find the root partition, /dev/md1. But the lilo command ended with an error - it could not find the root partition, /dev/md1.
  
-That was the problem: /dev/md1 had now become /dev/md126 for whatever reason.//+That was the problem: /dev/md1 had now become /dev/md126 for whatever reason.\\
 Then I did some reading about RAID subsystems, forums and so on, and made some mistakes and experiments, which resulted in these shortcuts: Then I did some reading about RAID subsystems, forums and so on, and made some mistakes and experiments, which resulted in these shortcuts:
 \\ \\
Line 262: Line 322:
 That was, for the most part, the whole story. That was, for the most part, the whole story.
 Still, there are some other workarounds for that situation. Still, there are some other workarounds for that situation.
 +\\
 +\\
 +
 + ===== Workarounds for incorrect RAID device naming =====
 +
 +
 +  - 1. Using UUIDs in lilo (I have not checked this) and in fstab for mounting partitions.
 +
 +Run
 +
 +<code>
 +ls /dev/disk/by-uuid/
 +</code>
 +
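 +or better, go to that location with Midnight Commander, and you will see "files" named as numbers - 
 +these are the UUIDs of the RAID array devices, and each one is a symlink to /dev/mdX. The full path can then be used in /etc/fstab in place of the device name, as in the /home line here: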
 +
 +<code>
 +/dev/md2         swap             swap        defaults           0
 +/dev/md1         /                ext4        defaults           1
 +##/dev/md3         /home            ext4        defaults           2
 +/dev/disk/by-uuid/ef92814a-2db1-4d47-8d70-4c5a8d56e287   /home            ext4        defaults           2
 +/dev/md4         /Second          ext4        defaults           2
 +#/dev/cdrom      /mnt/cdrom       auto        noauto,owner,ro,comment=x-gvfs-show 0   0
 +/dev/fd0         /mnt/floppy      auto        noauto,owner       0
 +devpts           /dev/pts         devpts      gid=5,mode=620     0
 +proc             /proc            proc        defaults           0
 +tmpfs            /dev/shm         tmpfs       defaults           0
 +
 +</code>
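
Writing such a line by hand invites typos in the long UUID, so it can be generated instead. A minimal sketch follows: fstab_line_by_uuid is a hypothetical helper, and the trailing "1" dump field and the "defaults" options are assumptions chosen to match a typical Slackware fstab, not values confirmed by the listing above.

```shell
#!/bin/sh
# fstab_line_by_uuid (hypothetical helper): print an fstab entry that mounts
# a file system through its /dev/disk/by-uuid/ path.
#   $1 = file system UUID, $2 = mount point, $3 = fs type, $4 = fsck pass number
fstab_line_by_uuid() {
    printf '/dev/disk/by-uuid/%s   %s   %s   defaults   1   %s\n' "$1" "$2" "$3" "$4"
}

# Typical use on a live system; blkid prints the file system UUID of the array:
#   fstab_line_by_uuid "$(blkid -s UUID -o value /dev/md3)" /home ext4 2
```

The output is only printed, not appended to /etc/fstab, so it can be reviewed before being pasted in.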
 +
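 +<note important>Take note!
 +The UUID shown in
 +
 +/dev/disk/by-uuid/
 +
 +and the one you get via
 +
 +<code>
 +mdadm -D
 +mdadm -Db
 +mdadm -Es
 +</code>
 +
 +differ; they are not the same! mdadm reports the UUID of the array itself (from the RAID superblock), while /dev/disk/by-uuid/ lists the UUID of the file system on the array.
 + In fstab (lilo too?) you must use the UUID from /dev/disk/by-uuid/ !</note>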
 +
 +
 +
 +
 +  - 2. Using an initrd.
 +
 +<code>
 +#
 +# mkinitrd_command_generator.sh revision 1.45
 +#
 +# This script will now make a recommendation about the command to use
 +# in case you require an initrd image to boot a kernel that does not
 +# have support for your storage or root filesystem built in
 +# (such as the Slackware 'generic' kernels').
 +# A suitable 'mkinitrd' command will be:
 +
 +#/usr/share/mkinitrd/
 +
 +mkinitrd -c -k 3.2.29 -f ext4 -r /dev/md1 -m mbcache:jbd2:ext4 -R -u -o /boot/initrd.gz
 +
 +</code>
 +\\
 +\\
 +A correctly edited mdadm.conf must then be copied into /boot/tree??? before you run this mkinitrd command.\\
 +
 +After you run mkinitrd, you must update lilo.
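
The order of operations can be sketched as a dry-run helper that only prints the commands. This is a hedged illustration: initrd_rebuild_plan is a made-up name, and it assumes that mdadm.conf lives in /etc and that mkinitrd with -R picks it up from there, which the text above does not confirm.

```shell
#!/bin/sh
# initrd_rebuild_plan (hypothetical helper): print, without running them,
# the commands for rebuilding the initrd with RAID support.
#   $1 = kernel version, $2 = root array device
initrd_rebuild_plan() {
    # Assumption: mkinitrd -R reads the array definitions from /etc/mdadm.conf
    echo "mdadm -Es > /etc/mdadm.conf"
    # Same options as the command recommended by mkinitrd_command_generator.sh
    echo "mkinitrd -c -k $1 -f ext4 -r $2 -m mbcache:jbd2:ext4 -R -u -o /boot/initrd.gz"
    # lilo must be rerun so that it records the new initrd
    echo "lilo -v"
}

# Example: review the plan for the kernel and root device used on this page
#   initrd_rebuild_plan 3.2.29 /dev/md1
```

Printing the plan first makes it easy to check the kernel version and root device before anything under /boot is touched; the lilo.conf entry also needs an initrd line pointing at /boot/initrd.gz.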
 +
 +====== Useful commands in this case ======
 +<code>
 +mdadm -Es
 +</code>
 +
 +<code>
 +mdadm -As
 +</code>
 +
 +<code>
 +mdadm -D /dev/md127
 +</code>
 +
 +<code>
 +mdadm -Db /dev/md127
 +</code>
 +
 +<code>
 +lsscsi
 +</code>
 +
 +<code>
 +cat /proc/mdstat
 +</code>
 +
 +<code>
 +ls /dev/disk/by-uuid/
 +</code>
 +
 +<code>
 +dmesg |grep md
 +</code>
 +
 +<code>
 +chroot /mnt/hd /sbin/lilo -v 3
 +</code>
 +
 +<code>
 +mdadm --stop /dev/md127
 +</code>
 +
 +Kernel options (no space is allowed inside the md= list):
 +
 +<code>
 +$kernelname raid=noautodetect md=1,/dev/sda1,/dev/sdb1
 +</code>
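
For several arrays the md= options follow the same pattern, so the string can be generated. The sketch below is an illustration only: md_boot_params is a hypothetical helper, and it assumes the layout used on this page, where partition N of every disk belongs to array mdN.

```shell
#!/bin/sh
# md_boot_params (hypothetical helper): build the kernel options that disable
# RAID autodetection and assemble md1..mdN from matching partitions.
#   $1 = number of arrays; remaining arguments = member disks
md_boot_params() {
    count=$1
    shift
    params="raid=noautodetect"
    n=1
    while [ "$n" -le "$count" ]; do
        members=""
        # Partition n of each disk is a member of array n
        for disk in "$@"; do
            members="$members,$disk$n"
        done
        params="$params md=$n$members"
        n=$((n + 1))
    done
    echo "$params"
}

# Example for the four arrays on this page:
#   md_boot_params 4 /dev/sda /dev/sdb
```

In lilo.conf the resulting string would go into the append= line of the kernel entry, followed by rerunning lilo.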
 +
 +
 +====== Useful Links: ======
 +  * [[http://www.linuxquestions.org/questions/slackware-14/repair-lilo-on-software-raid1-4175593663/]]
 +  * [[https://bugzilla.redhat.com/show_bug.cgi?id=606481]]
 +  * [[https://www.linux.org.ru/forum/admin/13033496?lastmod=1479927790872]] (in Russian)
 +  * [[https://raid.wiki.kernel.org/index.php/Linux_Raid]]
 +  * [[https://www.kernel.org/doc/Documentation/md.txt]]
 +  
 +
 +
 +
 +====== Sources ======
 +Originally written by  --- //[[wiki:user:wisedraco|John Ciemgals]] 2016/11/28 04:50//
 +
 +Rewritten using material from "Links" and the LinuxQuestions.org Slackware forum, especially from user bassmadrigal, and with help from bormant at linux.org.ru --- //[[wiki:user:wisedraco|John Ciemgals]] 2016/11/28 09:15//
 +
 +<!-- Please do not modify anything below, except adding new tags.-->
 +{{tag>software raid raid1 /dev/md127 enumeration broken linux slackware author_wisedraco}}