Differences
This shows you the differences between two versions of the page.
howtos:misc:software_raid_troubleshoot_howto [2016/11/24 20:41 (UTC)] – [Software RAID troubleshoot] wisedraco | howtos:misc:software_raid_troubleshoot_howto [2017/06/26 15:08 (UTC)] (current) – [Workarounds for incorrect raid devices naming] wisedraco | ||
---|---|---|---|
Line 1: | Line 1: | ||
<!-- Add your text below. We strongly advise to start with a Headline (see button bar above). --> | <!-- Add your text below. We strongly advise to start with a Headline (see button bar above). --> | ||
+ | |||
====== Software RAID troubleshoot ====== | ====== Software RAID troubleshoot ====== | ||
- | |||
- | |||
- | ########### IT WAS STILL IN WRITING | ||
- | ##################################################################################################################### | ||
- | |||
- | |||
Once, after upgrading my desktop Slackware64 from 14.1 to 14.2 (including a kernel upgrade), | Once, after upgrading my desktop Slackware64 from 14.1 to 14.2 (including a kernel upgrade), | ||
I ended up with a system which, after the LILO menu,\\ | I ended up with a system which, after the LILO menu,\\ | ||
wrote " | wrote " | ||
+ | \\ | ||
and then stopped completely - nothing more.\\ | and then stopped completely - nothing more.\\ | ||
+ | \\ | ||
Initial configuration was:\\ | Initial configuration was:\\ | ||
Intel DG965SS motherboard, | Intel DG965SS motherboard, | ||
2 x 1000 GB Seagate SATA HDD (as sda and sdb)\\ | 2 x 1000 GB Seagate SATA HDD (as sda and sdb)\\ | ||
ST1000DM003-1CH1\\ | ST1000DM003-1CH1\\ | ||
- | , dvd-writer on sata4 port | + | dvd-writer on sata4 port |
Both Seagate disks are partitioned (MBR type) into 4 partitions of type FD (Linux raid autodetect): | Both Seagate disks are partitioned (MBR type) into 4 partitions of type FD (Linux raid autodetect): | ||
- | * 100 GB root (md1) | + | * 100 GB root (md1) | ||
- | * 2 GB swap (md2) | + | * 2 GB swap (md2) | ||
- | * 350 GB /home (md3) | + | * 350 GB /home (md3) | ||
- | * 550 GB /Second (md4) | + | * 550 GB /Second (md4) | ||
+ | \\ | ||
+ | \\ | ||
cat / | cat / | ||
Line 43: | Line 40: | ||
unused devices: < | unused devices: < | ||
</ | </ | ||
+ | \\ | ||
+ | \\ | ||
mdadm -Es : | mdadm -Es : | ||
Line 54: | Line 52: | ||
</ | </ | ||
- | // | + | \\ |
- | // | + | \\ |
+ | < | ||
# for p in 1 2 3 4; do mdadm --create /dev/md$p --name=$p --level=1 --raid-devices 2 /dev/sda$p /dev/sdb$p --metadata=0.90; | # for p in 1 2 3 4; do mdadm --create /dev/md$p --name=$p --level=1 --raid-devices 2 /dev/sda$p /dev/sdb$p --metadata=0.90; | ||
+ | </ | ||
+ | \\ | ||
===== My " | ===== My " | ||
Line 64: | Line 65: | ||
Then I did a massive system update via slackpkg upgrade-all, including a kernel update.\\ | Then I did a massive system update via slackpkg upgrade-all, including a kernel update.\\ | ||
I checked lilo.conf and restarted - all looked OK. Then I decided to upgrade the system to 14.2 via the same slackpkg \\ | I checked lilo.conf and restarted - all looked OK. Then I decided to upgrade the system to 14.2 via the same slackpkg \\ | ||
- | (it looks like life on macOS is too boring, too predictable - everything just works, and so on... :D) | + | (it looks like life on macOS is too boring, too predictable - everything just works, and so on... :D)\\ | ||
As always, I checked lilo.conf and checked that the new kernel was named correctly, but kept no old kernel as a backup - \\ | As always, I checked lilo.conf and checked that the new kernel was named correctly, but kept no old kernel as a backup - \\ | ||
only one entry in LILO (which was not a good thing at all!) - and rebooted. | only one entry in LILO (which was not a good thing at all!) - and rebooted. | ||
\\ | \\ | ||
Then all the interesting things started!!! :) | Then all the interesting things started!!! :) | ||
- | + | \\ | ||
- | I got the LILO menu, and the kernel started loading. | + | \\ | ||
- | It showed lots of " | + | I got the LILO menu, and the kernel started loading... | ||
- | That indicated some problem with the LILO update, I suppose. | + | \\ | ||
- | I wanted to boot and re-run lilo -v | + | It showed lots of " | ||
+ | That indicated some problem with the LILO update, I suppose.\\ | ||
+ | I wanted to boot and re-run lilo -v \\ | ||
So I booted from the Slackware64 Live CD [[http:// | So I booted from the Slackware64 Live CD [[http:// | ||
+ | \\ | ||
+ | \\ | ||
But - there was a big problem! | But - there was a big problem! | ||
+ | \\ | ||
My /dev/md1, /dev/md2, /dev/md3 and /dev/md4 were not there after booting from the Slackware Live CD! | My /dev/md1, /dev/md2, /dev/md3 and /dev/md4 were not there after booting from the Slackware Live CD! | ||
+ | \\ | ||
And my lilo.conf was this: | And my lilo.conf was this: | ||
+ | \\ | ||
< | < | ||
# LILO configuration file | # LILO configuration file | ||
Line 152: | Line 156: | ||
# Linux bootable partition config ends | # Linux bootable partition config ends | ||
</ | </ | ||
+ | \\ | ||
So I did some research and ran | So I did some research and ran | ||
+ | < | ||
dmesg |grep md | dmesg |grep md | ||
+ | </ | ||
+ | \\ | ||
and from the array sizes worked out which number my root partition had (it was 100 GB in size). It was md126. | and from the array sizes worked out which number my root partition had (it was 100 GB in size). It was md126. | ||
+ | \\ | ||
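Matching arrays to their sizes by eye in the dmesg output is easy to get wrong, so here is a small sketch that prints every array with its size. It is not part of my original procedure: the function name md_sizes is made up for this example, and it assumes the usual /proc/mdstat layout (an "mdN : ..." line followed by a line starting with the size in 1 KiB blocks).

```shell
# md_sizes: read /proc/mdstat-style text on stdin and print "<array> <GiB>".
# Hypothetical helper; assumes the size line follows each "mdN :" line
# and is given in 1 KiB blocks.
md_sizes() {
    LC_ALL=C awk '
        /^md[0-9]+ :/ { name = $1; next }       # remember the array name
        name != "" && $2 == "blocks" {          # size line of that array
            printf "%s %.1f\n", name, $1 / 1048576
            name = ""
        }
    '
}

# Typical use on the live system:
#   md_sizes < /proc/mdstat
```

The array whose size is about 100 GiB is then the root array, whatever number it got this time.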
I mounted it: | I mounted it: | ||
+ | < | ||
mount /dev/md126 /mnt/hd | mount /dev/md126 /mnt/hd | ||
+ | </ | ||
+ | \\ | ||
+ | Then I ran mc and checked that I had really mounted my root partition there. | ||
+ | \\ | ||
- | then I ran mc and checked that I had really mounted my root partition there. | + | Then I ran in the console | ||
- | + | \\ | ||
- | then I ran in the console | + | < | ||
chroot /mnt/hd /sbin/lilo -v 3 | chroot /mnt/hd /sbin/lilo -v 3 | ||
+ | </ | ||
+ | \\ | ||
But the lilo command ended with an error - it could not find the root partition, /dev/md1. | But the lilo command ended with an error - it could not find the root partition, /dev/md1. | ||
- | That was the problem: /dev/md1 had now become /dev/md126 for whatever reason. | + | That was the problem: /dev/md1 had now become /dev/md126 for whatever reason.\\ | ||
Then I did some reading about RAID subsystems, forums and so on, made some mistakes and experiments, | Then I did some reading about RAID subsystems, forums and so on, made some mistakes and experiments, | ||
+ | \\ | ||
- | I generated the array assembly lines for mdadm.conf: | + | I generated the array assembly lines for mdadm.conf | ||
+ | < | ||
mdadm -Db /dev/md127 >> / | mdadm -Db /dev/md127 >> / | ||
Line 179: | Line 193: | ||
mdadm -Db /dev/md125 >> / | mdadm -Db /dev/md125 >> / | ||
mdadm -Db /dev/md124 >> / | mdadm -Db /dev/md124 >> / | ||
+ | </ | ||
+ | \\ | ||
+ | \\ | ||
+ | As a result I got something like this at the end of mdadm.conf: | ||
+ | \\ | ||
+ | < | ||
+ | ARRAY /dev/md125 metadata=0.90 UUID=7cc47bea: | ||
+ | ARRAY /dev/md124 metadata=0.90 UUID=f0bc71fc: | ||
+ | ARRAY /dev/md126 metadata=0.90 UUID=3f4daae2: | ||
+ | ARRAY /dev/md127 metadata=0.90 UUID=cce81d3a: | ||
+ | </ | ||
- | get something like that in end of mdadm.conf: | + | then, based on : |
- | + | \\ | |
- | #ARRAY /dev/md125 metadata=0.90 UUID=7cc47bea: | + | < |
- | #ARRAY /dev/md124 metadata=0.90 UUID=f0bc71fc: | + | |
- | #ARRAY /dev/md126 metadata=0.90 UUID=3f4daae2: | + | |
- | #ARRAY /dev/md127 metadata=0.90 UUID=cce81d3a: | + | |
- | + | ||
- | + | ||
- | then, based on | + | |
dmesg | grep md | dmesg | grep md | ||
- | and / | + | </ |
- | + | \\ | |
- | then I edited it into the right form: | + | and / | ||
+ | \\ | ||
+ | \\ | ||
+ | Then I edited it into the right form: | ||
+ | \\ | ||
+ | < | ||
ARRAY /dev/md1 metadata=0.90 UUID=7cc47bea: | ARRAY /dev/md1 metadata=0.90 UUID=7cc47bea: | ||
ARRAY /dev/md2 metadata=0.90 UUID=cce81d3a: | ARRAY /dev/md2 metadata=0.90 UUID=cce81d3a: | ||
ARRAY /dev/md3 metadata=0.90 UUID=f0bc71fc: | ARRAY /dev/md3 metadata=0.90 UUID=f0bc71fc: | ||
ARRAY /dev/md4 metadata=0.90 UUID=3f4daae2: | ARRAY /dev/md4 metadata=0.90 UUID=3f4daae2: | ||
+ | </ | ||
+ | \\ | ||
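Editing the device names by hand is where a typo can easily slip in, so the renaming can also be scripted. This is only a sketch: rename_array is a made-up helper, and the UUID in the usage line is a placeholder, not one of my real ones. It changes the device name only on the ARRAY line whose UUID= field matches exactly and passes every other line through untouched.

```shell
# rename_array UUID NEWDEV: hypothetical helper. Reads mdadm.conf text on
# stdin and rewrites the device name on the ARRAY line with that exact UUID.
rename_array() {
    awk -v uuid="UUID=$1" -v dev="$2" '
        $1 == "ARRAY" {
            for (i = 3; i <= NF; i++)
                if ($i == uuid) { $2 = dev; break }
        }
        { print }
    '
}

# Example with a placeholder UUID (write to a new file, then review it):
#   rename_array 00000000:11111111:22222222:33333333 /dev/md1 \
#       < mdadm.conf > mdadm.conf.new
```

Writing to a new file first keeps the original mdadm.conf around in case the result is not what you expected.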
I also added this line to mdadm.conf, just to be sure the hostname does not affect RAID naming: | I also added this line to mdadm.conf, just to be sure the hostname does not affect RAID naming: | ||
+ | \\ | ||
+ | < | ||
HOMEHOST < | HOMEHOST < | ||
+ | </ | ||
+ | \\ | ||
Then I made the mdadm_stop_127.scr script: | Then I made the mdadm_stop_127.scr script: | ||
+ | \\ | ||
<code bash> | <code bash> | ||
#!/bin/sh | #!/bin/sh | ||
Line 224: | Line 250: | ||
</ | </ | ||
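If you are not sure which numbers your wrongly named arrays received, a dry run can help before stopping anything. The following is not my mdadm_stop_127.scr, just a sketch with a made-up function name (stop_cmds): it reads /proc/mdstat-style text and only prints an mdadm --stop command for every array numbered at or above a given minimum.

```shell
# stop_cmds MIN: hypothetical helper. Prints (does not run) a stop command
# for each array in the mdstat text on stdin whose number is >= MIN.
stop_cmds() {
    awk -v min="$1" '
        /^md[0-9]+ :/ {
            n = substr($1, 3) + 0               # "md127" -> 127
            if (n >= min) print "mdadm --stop /dev/" $1
        }
    '
}

# Review the list first:
#   stop_cmds 124 < /proc/mdstat
# Then, as root and with the arrays unmounted, run it:
#   stop_cmds 124 < /proc/mdstat | sh
```

Arrays that are still mounted cannot be stopped, so unmount them first.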
- | copied it to the live system's root, copied my edited mdadm.conf from / | + | \\ | ||
- | to the live system's /etc, and unmounted /dev/md125 | + | \\ | ||
- | + | I copied it to the live system's root, also copied my edited mdadm.conf from / | ||
- | and then I ran my mdadm_stop_127.scr | + | to the live system's /etc, and unmounted / | ||
+ | \\ | ||
+ | \\ | ||
- | I saw that all those arrays were stopped, and | + | And then | ||
+ | \\ | ||
+ | \\ | ||
+ | I saw that all those arrays were stopped, and | ||
then I ran: | then I ran: | ||
+ | \\ | ||
+ | < | ||
mdadm -As | mdadm -As | ||
+ | </ | ||
+ | \\ | ||
Then I checked | Then I checked | ||
+ | < | ||
cat / | cat / | ||
- | + | </ | |
- | + | \\ | |
- | I saw that all my RAID arrays were named as they should be - /dev/md1, md2, md3, and md4! | + | I saw that all my RAID arrays were named as they should be - /dev/md1, md2, md3, and md4! | ||
- | + | \\ | ||
- | + | \\ | ||
- | then I | + | Then I mounted my root disk again: | ||
- | + | \\ | ||
- | mounted my root disk again: | + | < | ||
mount / | mount / | ||
+ | </ | ||
+ | \\ | ||
+ | and ran: | ||
- | and ran | + | < | ||
chroot /mnt/hd /sbin/lilo -v 3 | chroot /mnt/hd /sbin/lilo -v 3 | ||
+ | </ | ||
+ | \\ | ||
Everything looked good. | Everything looked good. | ||
- | I restarted | + | I restarted: | ||
+ | < | ||
shutdown -r now | shutdown -r now | ||
+ | </ | ||
(or press Ctrl+Alt+Del) | (or press Ctrl+Alt+Del) | ||
+ | \\ | ||
+ | \\ | ||
After the restart the system started loading and got past the previous dead point. I got to the | After the restart the system started loading and got past the previous dead point. I got to the | ||
login screen, logged in as root, and saw that md1 (root) and md2 (swap) were there, but not md3 and md4 (instead, the /home and /Second arrays showed up as md125 and md124). | login screen, logged in as root, and saw that md1 (root) and md2 (swap) were there, but not md3 and md4 (instead, the /home and /Second arrays showed up as md125 and md124). | ||
Line 266: | Line 305: | ||
I restarted the machine, pressed Tab at the LILO prompt, and used these kernel parameters: | I restarted the machine, pressed Tab at the LILO prompt, and used these kernel parameters: | ||
+ | \\ | ||
+ | < | ||
Linux raid=noautodetect md=1,/ | Linux raid=noautodetect md=1,/ | ||
- | + | </ | |
+ | \\ | ||
This tells the kernel not to autodetect RAID arrays, but to assemble the md1 array (md=1) from the /dev/sda1 and /dev/sdb1 partitions, | This tells the kernel not to autodetect RAID arrays, but to assemble the md1 array (md=1) from the /dev/sda1 and /dev/sdb1 partitions, | ||
because without the root filesystem the system cannot start. | because without the root filesystem the system cannot start. | ||
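The option string is easy to mistype at the LILO prompt, so here is a small sketch that builds it from the array number and the member partitions. The function name md_boot_params is made up for this example; the resulting string follows the kernel's md=<number>,<dev0>,<dev1>,... form described above.

```shell
# md_boot_params N DEV...: hypothetical helper. Prints the kernel options
# that disable RAID autodetection and assemble /dev/mdN from the members.
md_boot_params() {
    n=$1; shift
    members=$(IFS=,; printf '%s' "$*")   # join the remaining args with commas
    printf 'raid=noautodetect md=%s,%s\n' "$n" "$members"
}

# Example for the root array from this story:
#   md_boot_params 1 /dev/sda1 /dev/sdb1
```

Type the printed string after the kernel label at the LILO prompt.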
+ | \\ | ||
When I booted the system that way, everything looked right - /dev/md1, md2, md3 and md4 were all there. | When I booted the system that way, everything looked right - /dev/md1, md2, md3 and md4 were all there. | ||
+ | \\ | ||
Then I simply restarted the system without any kernel parameters, and everything was right again - md1 through md4. | Then I simply restarted the system without any kernel parameters, and everything was right again - md1 through md4. | ||
+ | \\ | ||
It looks like the system writes something about the previously given md name into the RAID array superblock, or metadata, or something like that; otherwise, after the restart I would have got the same situation as before - md1, md2, md125 and md124... | It looks like the system writes something about the previously given md name into the RAID array superblock, or metadata, or something like that; otherwise, after the restart I would have got the same situation as before - md1, md2, md125 and md124... | ||
- | + | \\ | ||
+ | \\ | ||
That was, for the most part, the whole story. | That was, for the most part, the whole story. | ||
Still, there are some other workarounds for this situation. | Still, there are some other workarounds for this situation. | ||
+ | \\ | ||
+ | \\ | ||
===== Workarounds for incorrect raid devices naming ===== | ===== Workarounds for incorrect raid devices naming ===== | ||
Line 289: | Line 331: | ||
Run | Run | ||
+ | |||
+ | < | ||
ls / | ls / | ||
+ | </ | ||
or better, go to that location with Midnight Commander, and you'll see there is a " | or better, go to that location with Midnight Commander, and you'll see there is a " | ||
That is the RAID array's disk UUID - a symlink to /dev/mdX. | That is the RAID array's disk UUID - a symlink to /dev/mdX. | ||
+ | |||
+ | To feed this information into fstab automatically, you can do it this way: | ||
+ | |||
+ | < | ||
+ | |||
+ | cd / | ||
+ | |||
+ | ls -d -l $PWD/* >> /etc/fstab | ||
+ | </ | ||
+ | |||
+ | After that you must immediately edit /etc/fstab and put those lines into the correct form; otherwise you may have problems with mounting at the next boot... | ||
+ | |||
< | < | ||
Line 308: | Line 365: | ||
</ | </ | ||
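Appending raw ls output to /etc/fstab leaves invalid lines there until you fix them, so a gentler variation is to generate commented-out stubs and review them first. This is only a sketch: fstab_stubs is a made-up name, and MOUNTPOINT and FSTYPE are placeholders you must fill in by hand. It writes nothing to /etc/fstab on its own.

```shell
# fstab_stubs [DIR]: hypothetical helper. For every symlink in DIR
# (default /dev/disk/by-uuid) print a commented-out fstab stub line,
# with the device the link currently points to as a trailing comment.
fstab_stubs() {
    dir=${1:-/dev/disk/by-uuid}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        printf '#UUID=%s  MOUNTPOINT  FSTYPE  defaults  1  2   # %s\n' \
            "${link##*/}" "$(basename "$(readlink "$link")")"
    done
}

# Review the output, then append it and edit the placeholders:
#   fstab_stubs
#   fstab_stubs >> /etc/fstab
```

Because every generated line starts with #, a forgotten stub will not break mounting at the next boot.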
- | take a note! | + | <note important> |
disk UUID by | disk UUID by | ||
Line 314: | Line 371: | ||
and the one you get via | and the one you get via | ||
+ | |||
+ | < | ||
mdadm -D | mdadm -D | ||
mdadm -Db | mdadm -Db | ||
mdadm -Es | mdadm -Es | ||
+ | </ | ||
- | differ! In fstab (LILO too?) you must use the UUID from / | + | differ - they are not the same!!! | ||
+ | In fstab (LILO too?) you must use the UUID from / | ||
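One way to tell the two kinds apart is by their shape: as far as I know, mdadm prints array UUIDs as four colon-separated groups of eight hex digits, while most Linux filesystems appear under /dev/disk/by-uuid in the dashed 8-4-4-4-12 form. A sketch that checks only the shape of a string (uuid_kind is a made-up name, and it does not look at any device):

```shell
# uuid_kind STRING: hypothetical helper. Classifies a UUID by its format only.
uuid_kind() {
    h='[0-9a-fA-F]'
    h4="$h$h$h$h"; h8="$h4$h4"
    case $1 in
        $h8:$h8:$h8:$h8)         echo mdadm ;;       # style printed by mdadm
        $h8-$h4-$h4-$h4-$h8$h4)  echo filesystem ;;  # dashed 8-4-4-4-12 style
        *)                       echo other ;;
    esac
}

# Example with placeholder values:
#   uuid_kind 00000000:11111111:22222222:33333333   # prints "mdadm"
```

If a UUID you are about to paste into fstab comes out as "mdadm", it is the wrong one for that file.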
Line 325: | Line 386: | ||
- 2. Using an initrd. | - 2. Using an initrd. | ||
+ | < | ||
# | # | ||
# mkinitrd_command_generator.sh revision 1.45 | # mkinitrd_command_generator.sh revision 1.45 | ||
Line 338: | Line 400: | ||
mkinitrd -c -k 3.2.29 -f ext4 -r /dev/md1 -m mbcache: | mkinitrd -c -k 3.2.29 -f ext4 -r /dev/md1 -m mbcache: | ||
- | The correctly edited mdadm.conf must then be copied to / | + | </ | ||
+ | \\ | ||
+ | \\ | ||
+ | The correctly edited mdadm.conf must then be copied to / | ||
After you run that mkinitrd command, you must update LILO. | After you run that mkinitrd command, you must update LILO. | ||
Line 357: | Line 422: | ||
====== Useful commands in this case ====== | ====== Useful commands in this case ====== | ||
+ | Scan the member devices and show the RAID arrays found in their superblocks: | ||
+ | |||
+ | < | ||
mdadm -Es | mdadm -Es | ||
+ | </ | ||
+ | |||
+ | Assemble the RAID arrays listed in mdadm.conf: | ||
+ | < | ||
mdadm -As | mdadm -As | ||
+ | </ | ||
+ | Show detailed info for one array: | ||
+ | < | ||
mdadm -D /dev/md127 | mdadm -D /dev/md127 | ||
+ | </ | ||
+ | |||
+ | Show brief (one-line, mdadm.conf style) info for one array: | ||
+ | < | ||
mdadm -Db /dev/md127 | mdadm -Db /dev/md127 | ||
+ | </ | ||
+ | show scsi devices info: | ||
+ | < | ||
lsscsi | lsscsi | ||
+ | </ | ||
+ | |||
+ | show assembled raid arrays status: | ||
+ | < | ||
cat / | cat / | ||
+ | </ | ||
+ | |||
+ | Show UUID info about the disks (or RAID arrays) in the system: | ||
+ | < | ||
ls / | ls / | ||
+ | </ | ||
+ | < | ||
dmesg |grep md | dmesg |grep md | ||
+ | </ | ||
+ | Re-run LILO when booted from another source: | ||
+ | < | ||
chroot /mnt/hd /sbin/lilo -v 3 | chroot /mnt/hd /sbin/lilo -v 3 | ||
+ | </ | ||
- | mdadm --stop /dev/md127 | ||
+ | Stop the named RAID array: | ||
+ | < | ||
+ | mdadm --stop /dev/md127 | ||
+ | </ | ||
kernel options: | kernel options: | ||
+ | |||
+ | < | ||
+ | $kernelname raid=noautodetect md=1, / | ||
+ | </ | ||
+ | |||
+ | Disable RAID autodetection and define the array /dev/md1 from two member partitions. | ||
Line 390: | Line 494: | ||
====== Sources ====== | ====== Sources ====== | ||
- | Originally written by --- // | + | Originally written by --- // |
- | Rewritten using materials from " | + | Rewritten using materials from " | ||
<!-- Please do not modify anything below, except adding new tags.--> | <!-- Please do not modify anything below, except adding new tags.--> | ||
{{tag> | {{tag> | ||
+ | |||
+ | |||
+ | |||
+ | |||