Differences

This shows you the differences between two versions of the page.

howtos:misc:software_raid_troubleshoot_howto [2016/11/26 16:55 (UTC)] – [My fall and sucess story] wisedraco
howtos:misc:software_raid_troubleshoot_howto [2017/06/26 15:08 (UTC)] (current) – [Workarounds for incorrect raid devices naming] wisedraco
Line 1: Line 1:
 <!-- Add your text below. We strongly advise to start with a Headline (see button bar above). -->
+
 ====== Software RAID troubleshoot ======
- 
- 
-########### IT WAS STILL IN WRITING  / EDITING PHASE! NOT FINISHED! TRY FINISH IN A FEW DAYS, THEN REMOVE THIS LINE! 
-##################################################################################################################### 
- 
- 
  
 Once, after upgrading my desktop slackware64 from 14.1 to 14.2 (including a kernel upgrade),\\
 I ended up with a system which, after the LILO menu,\\
 wrote "loading kernel ............................." \\
+\\
 and then stopped completely - nothing more.\\
+\\
 The initial configuration was:\\
 Intel DG965SS motherboard, Core 2 Duo E4500 2.2 GHz CPU, 8 GB RAM, \\
 2 x 1000 GB Seagate SATA HDD (as sda and sdb)\\
 ST1000DM003-1CH1\\
-dvd-writer on sata4 port
+dvd-writer on sata4 port
  
 Both Seagate disks are partitioned with an MBR partition table into 4 partitions of type FD (Linux RAID autodetect):
-   * 100 GB root (md1)
-  *  2 GB swap (md2)
-  *  350 GB /home (md3)
-  *  550 GB /Second (md4)
+  * 100 GB root (md1)
+  * 2 GB swap (md2)
+  * 350 GB /home (md3)
+  * 550 GB /Second (md4)
+\\
+\\
 cat /proc/mdstat :
  
Line 43: Line 40:
 unused devices: <none>
 </code>
+\\
+\\
 mdadm -Es :
  
Line 54: Line 52:
  
 </code>
-//
-//
+\\
+\\
+<code>
 # for p in 1 2 3 4; do mdadm --create /dev/md$p --name=$p --level=1 --raid-devices 2 /dev/sda$p /dev/sdb$p --metadata=0.90; done
+</code>
+\\
 
 ===== My "fall" and "success" story =====
Line 64: Line 65:
 Then I did a massive system update via slackpkg upgrade-all, including a kernel update.\\
 I checked lilo.conf and restarted - all looked OK. Then I decided to upgrade the system to 14.2 via the same slackpkg \\
-( looks like life on macOS is too boring, too predictable - everything works, and so on...:D )
+( looks like life on macOS is too boring, too predictable - everything works, and so on...:D )\\
 As always, I checked lilo.conf and checked that the new kernel was named correctly. I had no old kernel as a backup - \\
 only one entry in LILO (which was not a good thing at all!) - and rebooted.
Line 155: Line 156:
 # Linux bootable partition config ends
 </code>
-\\
+\\
 So I did some research, running
 <code>
 dmesg |grep md
 </code>
-//
+\\
 and from the array sizes I found out which array was my root partition (it was 100 GB in size). It was md126.
-//
+\\
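The array sizes can also be read from /proc/mdstat instead of dmesg. This is only a minimal sketch: the function name ''md_sizes'' is made up for illustration, and it assumes the usual mdstat layout, where the line following each ''mdN :'' line starts with the size in 1 KiB blocks.

```shell
#!/bin/sh
# md_sizes: hypothetical helper (not part of mdadm). Reads /proc/mdstat-style
# text on stdin and prints "<array> <size in GiB>" for every md array found.
md_sizes() {
    awk '
        /^md[0-9]+ :/ { name = $1; next }          # remember the array name
        name != "" && $2 == "blocks" {             # the next line carries the block count
            printf "%s %d\n", name, $1 / 1048576   # 1 KiB blocks -> GiB, truncated
            name = ""
        }
    '
}

# Typical use on a live system (not run here):
#   md_sizes < /proc/mdstat
```

With a 100 GB root array, the line showing 100 is the one to mount.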
 I mounted it:
 <code>
 mount /dev/md126 /mnt/hd
 </code>
-//
+\\
 
 Then I ran mc and checked that I had really mounted my root partition there.
-//
+\\
 
 Then I ran in the console:
-//
+\\
 <code>
 chroot /mnt/hd /sbin/lilo -v 3
 </code>
-//
+\\
 but the lilo command ended with an error: it could not find the root partition, /dev/md1.
 
-That was the problem, because /dev/md1 had now become /dev/md126 for whatever reason.//
+That was the problem, because /dev/md1 had now become /dev/md126 for whatever reason.\\
 Then I did some reading about RAID subsystems, forums and so on, and made some mistakes and experiments, which resulted in these shortcuts:
-//
+\\
 
 I generated the array assembly lines in mdadm.conf by running these commands in a terminal:
Line 193: Line 194:
 mdadm -Db /dev/md124 >> /mnt/hd/etc/mdadm.conf
 </code>
-//
-//
+\\
+\\
 As a result I got something like this at the end of mdadm.conf:
-//
+\\
 <code>
 ARRAY /dev/md125 metadata=0.90 UUID=7cc47bea:832f8260:208cdb8d:9e23b04b  #this one is 100 gb partition: \ (md1)
Line 205: Line 206:
  
 Then, based on:
-//
+\\
 <code>
 dmesg | grep md
 </code>
-//
+\\
 and /mnt/hd/etc/fstab, I found out which md12x must be md1, md2, md3 and md4, and wrote that after the # as shown above.
-//
-//
+\\
+\\
 Then I edited the lines to make them correct:
-//
+\\
 <code>
 ARRAY /dev/md1 metadata=0.90 UUID=7cc47bea:832f8260:208cdb8d:9e23b04b
Line 221: Line 222:
 ARRAY /dev/md4 metadata=0.90 UUID=3f4daae2:cbf37a2a:208cdb8d:9e23b04b
 </code>
-//
+\\
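The renaming of the ARRAY lines can be done by hand in an editor, as described. As a rough sketch of the same edit in script form (the helper name ''rename_array'' is hypothetical, and it assumes one ARRAY entry per line with a ''UUID='' field, as in the output of ''mdadm -Db''):

```shell
#!/bin/sh
# rename_array: hypothetical helper. Reads mdadm.conf-style text on stdin and,
# on the ARRAY line whose UUID matches $1, replaces the device name with $2.
# All other lines pass through unchanged.
rename_array() {
    awk -v uuid="$1" -v newdev="$2" '
        $1 == "ARRAY" && index($0, "UUID=" uuid) > 0 { $2 = newdev }
        { print }
    '
}

# Example (writes to a new file so the result can be reviewed first):
#   rename_array 7cc47bea:832f8260:208cdb8d:9e23b04b /dev/md1 \
#       < /mnt/hd/etc/mdadm.conf > /tmp/mdadm.conf.new
```

Writing to a separate file rather than editing in place keeps the original mdadm.conf intact until the result has been checked.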
 I also added this line to mdadm.conf, just to be sure that the hostname does not affect RAID naming:
-//
+\\
 <code>
 HOMEHOST <ignore>
 </code>
-//
+\\
 Then I made the mdadm_stop_127.scr script:
-//
+\\
 <code bash>
 #!/bin/sh
Line 249: Line 250:
 </code>
 
-//
-//
+\\
+\\
 I copied it to the live system's root, also copied my edited mdadm.conf from /mnt/hd/etc/mdadm.conf
 to the live system's /etc, and unmounted /dev/md125!
-//
-//
+\\
+\\
 
 And then I ran it - my mdadm_stop_127.scr
-//
-//
+\\
+\\
 I saw that all those arrays were stopped, and
 then I ran:
-//
+\\
 <code>
 mdadm -As
 </code>
-//
+\\
 Then I looked at
 
Line 271: Line 272:
 cat /proc/mdstat
 </code>
-//
+\\
 I saw that all my RAID arrays had become what they must be - /dev/md1, md2, md3 and md4!
-//
-//
+\\
+\\
 Then I mounted my root array again:
-//
+\\
 <code>
 mount /dev/md1  /mnt/hd
 </code>
-//
+\\
 and ran:
  
Line 286: Line 287:
 chroot /mnt/hd /sbin/lilo -v 3
 </code>
-//
+\\
 
 All looked good.
Line 295: Line 296:
  
 (or press Ctrl+Alt+Del)
-//
-//
+\\
+\\
 After the restart I saw that the system started loading and got past the previously dead point. I got to the
 login screen, logged in as root, and saw that there were md1 (root) and md2 (swap), but no md3 and md4 (instead, the /home and /Second arrays showed up as md125 and md124).
Line 304: Line 305:
  
 I restarted the machine, pressed Tab at the LILO prompt, and used these kernel parameters:
-//
+\\
 <code>
 Linux raid=noautodetect md=1,/dev/sda1,/dev/sdb1
 </code>
-//
+\\
 This tells the kernel not to autodetect RAID arrays, but to assemble the md1 array (md=1) from the /dev/sda1 and /dev/sdb1 partitions,
 because without the root array the system cannot start.
-//
+\\
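The parameter string above follows the pattern ''md=<number>,<member>,<member>'' with no spaces between the parts. A small sketch that builds it for any array (the function name ''md_boot_params'' is hypothetical and exists only for this illustration):

```shell
#!/bin/sh
# md_boot_params: hypothetical helper that prints the kernel parameters for
# assembling one array at boot. $1 = md number, remaining args = member partitions.
md_boot_params() {
    num=$1
    shift
    members=$(printf '%s,' "$@")      # join the members with commas
    # ${members%,} drops the trailing comma left by the join
    printf 'raid=noautodetect md=%s,%s\n' "$num" "${members%,}"
}

# Example:
#   md_boot_params 1 /dev/sda1 /dev/sdb1
```

The resulting string can be typed after the image label at the LILO prompt, or, as an assumption about how one might make it permanent, placed in an ''append='' line in lilo.conf.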
 When I booted the system that way, all looked right - there were /dev/md1, md2, md3 and md4.
-//
+\\
 Then I just restarted the system without any kernel parameters, and everything was right again - md1 through md4.
-//
+\\
 It looks like the system writes something into the RAID array superblock, or metadata, or something like that, about the previously given md name, because otherwise after the restart I should have got the same situation as before - with md1, md2, md125 and md124...
-//
-//
+\\
+\\
 That was, for the most part, the whole story.
 Yet, there are some other workarounds for this situation.
+\\
+\\
  
 ===== Workarounds for incorrect raid devices naming =====
Line 329: Line 331:
  
 Run
+
+<code>
 ls /dev/disk/by-uuid/
+</code>
 
 or better, go to that location with Midnight Commander, and you'll see that there are "files" named with long hexadecimal strings -
 each name is a UUID (of the filesystem on a disk or RAID array) - and each is a symlink to /dev/mdX.
+
+To feed this information into fstab automatically, you can do this:
+
+<code>
+
+cd /dev/disk/by-uuid
+
+ls -d -l $PWD/* >> /etc/fstab
+</code>
+
+After that you must immediately edit /etc/fstab and put those lines into the correct format, otherwise you may have problems with mounting at the next boot...
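A more cautious variant of the same idea is to print commented-out fstab stubs and paste in only the ones that are needed, instead of appending raw ''ls'' output to /etc/fstab. This is only a sketch: the function name ''uuid_fstab_candidates'' and the placeholder fields ''<mountpoint>'' and ''<fstype>'' are made up for illustration and must be filled in by hand.

```shell
#!/bin/sh
# uuid_fstab_candidates: hypothetical helper. For every symlink in the given
# directory (default /dev/disk/by-uuid) it prints a commented-out fstab stub
# with the UUID, followed by the device name the link points to.
uuid_fstab_candidates() {
    dir=${1:-/dev/disk/by-uuid}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue            # skip anything that is not a symlink
        target=$(readlink "$link")            # e.g. ../../md1
        printf '# UUID=%s  <mountpoint>  <fstype>  defaults  0  0   # %s\n' \
            "${link##*/}" "${target##*/}"
    done
}

# Example (output goes to the terminal, /etc/fstab is not touched):
#   uuid_fstab_candidates
```

Because every line starts with ''#'', even an accidental append to /etc/fstab would not change what gets mounted at the next boot.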
 +
  
 <code>
Line 348: Line 365:
 </code>
 
-take a note!
+<note important>Take note!
 The disk UUID shown by
 
Line 354: Line 371:
 
 and the one that you get via
+
+<code>
 mdadm -D
 mdadm -Db
 mdadm -Es
+</code>
 
-differ! in fstab ( lilo too?) you must use UID from /dev/disk/by-uuid/  !
+differ, they are not the same!!!
+In fstab (LILO too?) you must use the UUID from /dev/disk/by-uuid/ !</note>
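To see the two kinds of UUID side by side, the array UUID can be pulled out of the ''mdadm -D'' output and compared with the names under /dev/disk/by-uuid/. A minimal sketch, assuming the detail output contains a line of the form ''UUID : <value>''; the helper name ''array_uuid'' is hypothetical:

```shell
#!/bin/sh
# array_uuid: hypothetical helper, extracts the array UUID from
# "mdadm -D /dev/mdX" output supplied on stdin.
array_uuid() {
    awk '$1 == "UUID" && $2 == ":" { print $3; exit }'
}

# Typical use on a live system (not run here):
#   mdadm -D /dev/md1 | array_uuid     # array UUID, the form used in mdadm.conf
#   ls -l /dev/disk/by-uuid/           # the UUIDs that fstab expects
```

The first value belongs in the ARRAY lines of mdadm.conf, the second kind in fstab.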
  
  
Line 365: Line 386:
   - 2. Using an initrd.
 
+<code>
 #
 # mkinitrd_command_generator.sh revision 1.45
Line 378: Line 400:
 mkinitrd -c -k 3.2.29 -f ext4 -r /dev/md1 -m mbcache:jbd2:ext4 -R -u -o /boot/initrd.gz
 
-rightly edited mdadm.conf then must be copied in/boot/tree??? before you run this mkinitrd conf.
+</code>
+\\
+\\
+The correctly edited mdadm.conf must then be copied into /boot/tree??? before you run this mkinitrd command.\\
 
 After you run that mkinitrd, you must update LILO.
Line 397: Line 422:
 ====== Useful commands in this case ======
 
+Show RAID array info:
+
+<code>
 mdadm -Es
+</code>
+
+Assemble RAID arrays based on mdadm.conf:
+<code>
 mdadm -As
+</code>
 
+Show detailed info about an array:
+<code>
 mdadm -D /dev/md127
+</code>
+
+Show brief info about the given array:
+<code>
 mdadm -Db /dev/md127
+</code>
 
+Show SCSI device info:
+<code>
 lsscsi
+</code>
+
+Show the status of assembled RAID arrays:
+<code>
 cat /proc/mdstat
+</code>
+
+Show UUID info about disks (or RAID arrays) in the system:
+<code>
 ls /dev/disk/by-uuid/
+</code>
 
+<code>
 dmesg |grep md
+</code>
 
+Re-run lilo when booted from another source:
+<code>
 chroot /mnt/hd /sbin/lilo -v 3
+</code>
 
-mdadm --stop /dev/md127
 
+Stop the named RAID array:
+<code>
+mdadm --stop /dev/md127
+</code>
 
 Kernel options:
+
+<code>
+$kernelname raid=noautodetect md=1,/dev/sda1,/dev/sdb1
+</code>
+
+This turns off RAID array autodetection and defines the array /dev/md1 from two partitions (members?).
  
  
Line 430: Line 494:
  
 ====== Sources ======
-Originally written by  --- //[[wiki:user:wisedraco|John Ciemgals]] 2016/11/26 04:50//
+Originally written by  --- //[[wiki:user:wisedraco|John Ciemgals]] 2016/11/28 04:50//
 
-Rewrited with used materials from "Links" and LinuxQuestions.org Slackware forum, especially user bassmadrigal and bormant from linux.org.ru help --- //[[wiki:user:wisedraco|John Ciemgals]] 2016/11/26 09:15//
+Rewritten using materials from "Links" and the LinuxQuestions.org Slackware forum, especially with help from user bassmadrigal, and from bormant of linux.org.ru --- //[[wiki:user:wisedraco|John Ciemgals]] 2016/11/28 09:15//
 
 <!-- Please do not modify anything below, except adding new tags.-->
 {{tag>software raid raid1 /dev/md127 enumeration broken linux slackware author_wisedraco}}
 +
 +
 +
 +
  