Re-ordered for clarity -- David.
On 1/12/24 18:42, gene heskett wrote:
> I just found an mbox file in my home directory, containing about 90 days
> worth of undelivered msgs from smartctl running as root.
Do you know how the mbox file got there?
> smartctl says my raid10 is dying, ...
Please post a console session with a command that displays the message.
On 1/12/24 20:57, gene heskett wrote:
> ... there are 4 1t drives as a raid10, and the
> various messages in that mbox file name 3 of the individual drives.
Please post a representative sample of the messages.
> Then I find the linux has played 52 pickup with the device names.
/dev/sd* device node names are unpredictable. The traditional solution
is UUID's. Linux added /dev/disk/by-id/* a while ago and I am starting
to use them as much as possible. Make sure you look very carefully at
the serial numbers when you have several drives of the same make and model.
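For example (the link name below is illustrative; yours will show your
drives' actual model and serial numbers):

# ls -l /dev/disk/by-id/ | grep -v part
lrwxrwxrwx 1 root root 9 Jan 13 09:00 ata-Samsung_SSD_870_EVO_1TB_S5Y1NX0R000000 -> ../../sdc

Those symlinks follow the drive no matter which /dev/sd* node the kernel
assigns at boot.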
> There are in actual fact 3 SATA controllers in this machine: the
> motherboard's 6 ports, 6 more on an inexpensive SATA controller, 4 of
> which hold the raid10 Samsung 870 1T drives, and 4 more on a more
> expensive 16 port card which has a quartet of 2T Gigastone SSD's on it,
> but the drives are not found in the order of the controllers. That
> raid10 was composed w/o the third controller.
So:
* /home is on a RAID 10 of 4 @ 1 TB Samsung 870 SSD (2 mirrored pairs)?
* 4 @ 2 TB Gigastone SSD for a new RAID 10?
What drives are connected to which ports?
What is on the other 20 ports?
> blkid does not sort them in order either. And of course does not list
> what's unmounted, forcing me to identify the drive with gparted in order
> to get its device name. From that I might be able to construct another
> raid from the 8T of 4 2T drives, but it's confusing as hell when the
> first of those 2T drives is assigned /dev/sde and the next 4 on the new
> controller are /dev/sdi, j, k, & l.
> So it appears I have 5 of those gigastones, and sde is the odd one
I am confused -- do you have 4 or 5 Gigastone 2 TB SSD?
> So that one could be formatted ext4 and serve as a backup of the raid10.
> How do I make an image of that
> raid10 to /dev/sde and get every byte? That seems like the first step
> to me.
Please get a USB 3.x HDD, do a full backup of your entire computer, put
it off-site, get another USB 3.x HDD, do another full backup, and keep
it nearby.
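A minimal sketch of such a backup, assuming the USB HDD's first
partition appears under /dev/disk/by-id/ (the usb-* name is a
placeholder) and /mnt/backup exists:

# mount /dev/disk/by-id/usb-YOUR_HDD_SERIAL-part1 /mnt/backup
# rsync -aHAXx / /mnt/backup/
# umount /mnt/backup

rsync's -x stays on the root file system, so repeat the rsync for /home
and any other mount points you want in the backup.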
> But since I can't copy a locked file,
What file is locked? Please post a console session that demonstrates it.
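If "locked" means "open by some process", lsof(8) (from the lsof
package) can show that; for example:

# lsof +D /home | head

The output can be long on a big tree; this is just one way to look.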
> /dev/sde1 has been formatted and mounted, what cmd line will copy every
> byte including locked files in that raid10 to it?
See above for locked. Otherwise, I suggest rsync(1).
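For example, with the RAID 10 mounted at /home and the target at
/mnt/homesde1 (the trailing slash on the source matters; see rsync(1)):

# rsync -aHAX /home/ /mnt/homesde1/

Note that rsync(1) copies through the file system; it does not produce a
byte-for-byte image of the block device.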
On 1/13/24 09:02, gene heskett wrote:
> ... I've done this this morning:
> used gparted to format to ext4 a single gpt partition on that /dev/sde
> with a LABEL=homesde1, but forgot the 1 when editing /etc/fstab to
> remount it on a reboot to /mnt/homesde1, which resulted in a failed
> boot; looked up the root pw and finally got in to fix /etc/fstab for
> the missing 1 in the label name.
>
> but first mounted a 2t gigastone ssd to /mnt/homesde1 which is where it
> showed up in an lsblk -f report.
> Spent 2+ hours rsync'ing with:
> sudo rsync -av /home/ /mnt/homesde1
> which worked entirely within the same 6 port controller as this raid10
> is running on.
>
> reboot failed, moved the data cable to the motherboard port 5 or 6 (or
> maybe 1 or 2, 6 ports, nfi which is 0 and which is 5) but it's on the
> mobo ports now, should be easily found at boot time.
>
> Finally look up root pw, get in to fix /etc/fstab and get booted.
> Talk about portable device names, that drive is now /dev/sdk1 !!! And
> empty of a LABEL name but now has the 360 gigs of data I just rsync'd
> to it.
> But on reboot, it's now /dev/sdb1 and empty.
>
> from a df:
> gene@coyote:~$ df
> Filesystem 1K-blocks Used Available Use% Mounted on
> udev 16327704 0 16327704 0% /dev
> tmpfs 3272684 1888 3270796 1% /run
> /dev/sda1 863983352 376505108 443516596 46% /
> tmpfs 16363420 1244 16362176 1% /dev/shm
> tmpfs 5120 8 5112 1% /run/lock
> /dev/sda3 47749868 132 45291728 1% /tmp
> /dev/md0p1 1796382580 334985304 1370072300 20% /home
> /dev/sdb1 1967892164 28 1867855888 1% /mnt/homesde1
> tmpfs 3272684 2544 3270140 1% /run/user/1000
> gene@coyote:~$
> and gparted now says that indeed, /dev/sdb is the drive with the label
> "homesde1" on it.
> And showing 31GiB used. What for, unless that's ext4
> overhead? All I can see on /mnt/homesde1 is lost+found, which is empty.
> So at this point I still have a home raid10, and have NDI
What does "NDI" mean?
> where the he!!
> the rsync line actually copied 360 Gb of stuff from home to.
Please post:
# ls -a /mnt
# ls -a /mnt/homesde1
> gene@coyote:~$ sudo smartctl -a /dev/sdb
Please use /dev/disk/by-id/* paths.
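For example (the exact link name is a guess built from the serial number
shown below; confirm with ls -l /dev/disk/by-id/):

# smartctl -a /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146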
> ...
> Device Model: Gigastone SSD <- the device's name
> Serial Number: GST02TBG221146
> ...
> User Capacity: 2,048,408,248,320 bytes [2.04 TB]
> Sector Size: 512 bytes logical/physical
> ...
> Form Factor: 2.5 inches
> TRIM Command: Available
> ...
> SMART overall-health self-assessment test result: PASSED
Okay.
> SMART Attributes Data Structure revision number: 1
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age  Always       -       0
>   5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age  Always       -       0
>   9 Power_On_Hours          0x0032   100   100   050    Old_age  Always       -       884
> ...
Attribute values of 100 for all VALUE and WORST make sense for a brand
new drive, which contradicts "Old_age" and "Pre-fail" (?).
How long has that drive been in your computer? How many hours per week
has it been on?
> SMART Error Log Version: 1
> No Errors Logged
Okay.
> SMART Self-test log structure revision number 1
> No self-tests have been logged. [To run self-tests, use: smartctl -t]
Please run:
# smartctl -t short /dev/disk/by-id/*GST02TBG221146*
Wait a few minutes. Then run:
# smartctl -x /dev/disk/by-id/*GST02TBG221146*
We do not need to see the whole output, but please post the "SMART
Self-test log structure revision number 1" section. It should show the
short test.
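Alternatively, smartctl can print just the self-test log:

# smartctl -l selftest /dev/disk/by-id/*GST02TBG221146*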
> please $deity, deliver me from linux's vaporous disk naming scheme that
> changes faster than the weather. Even device LABEL= does not work. I
> mounted that drive by its label to /mnt/homesde1 and rsync'd /home/ to
> it, but that 360 Gb of data went someplace else.
Use script(1) to capture your console sessions. Save the files it
generates. Use cat(1) to display their contents.
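For example:

$ script ~/session.txt
... run your commands ...
$ exit
$ cat ~/session.txt

(The file name is just a suggestion.)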
> Since the data, according to what I see in gparted, actually went to
> /dev/sdk1, which is another of the 2T gigastones I intend to make a
> raid6 out of, no harm to my data is done. My raid10 was not destroyed.
> But I'm burned out and frustrated. This is hardware, not a roll of the
> dice per boot.
>
> I can easily erase and restart that drive for a raid with gparted, but
> howinhell do I get a stable drive detection system so I know what I am
> doing??
>
> Besides that, I'm running low on hair too.
Please use /dev/disk/by-id/* paths.
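For example, an /etc/fstab line using a by-id path (the exact link name
is an assumption based on the serial number above; verify with
ls -l /dev/disk/by-id/):

/dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1 /mnt/homesde1 ext4 defaults 0 2

LABEL= works too, but only if the label survives your re-partitioning.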
On 1/13/24 09:23, gene heskett wrote:
> One last question before I embark on a replay of what I just did, and
> failed at. This time by using device labels.
>
> Does making a raid erase the drive's label field in a gpt partition
> scheme?
It has been several years since I used mdadm(8), but I suspect the
answer depends upon whether you give whole disks as arguments or give
disk partitions as arguments. If you give whole disks, it would not
surprise me if mdadm(8) overwrote any and all sectors as it pleased,
including MBR and/or GPT partition tables. If you give partitions, I
would expect mdadm(8) to not write outside those partitions, so GPT
labels should be untouched. Please also see mdadm(8) "CREATE MODE" and
the discussion about partition type and version-1.x metadata.
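A minimal sketch of creating the new RAID 6 from partitions rather than
whole disks (the by-id names are assumptions; substitute your drives'
actual links):

# mdadm --create /dev/md1 --level=6 --raid-devices=4 \
      /dev/disk/by-id/ata-Gigastone_SSD_GST02TBG221146-part1 \
      ... three more -part1 paths ...

Giving partitions keeps mdadm(8) inside them, so the GPT and its labels
should stay put.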
On 1/13/24 15:21, gene heskett wrote:
> I'm looking for a solution to a broken install, all caused by the
> installer finding a plugged-in FTDI usb-serial adapter, so it
> automatically assumed I was blind and installed brltty and
> orca, which are not removable once installed without ripping the system
> apart, rendering it unbootable. If orca is disabled, the system will
> _NOT_ reboot. And I catch hell for discriminating against the blind when
> I complained at the time.
>
> That took me 20+ installs to get this far, because if I removed the exec
> bits on orca, disabling it = no reboot = yet another re-install, going
> thru the same thing with orca yelling at me for every keystroke entered,
> till someone took pity on me and wrote to unplug the usb stuff, which
> looks like a weeping willow tree here, nothing more or less.
Do you have a USB drive with an installation of Debian? If not, build
one. I used SanDisk Ultra Fit USB 3.0 16 GB drives for many years. Now
I use Samsung UM410 16 GB SSD's and a StarTech USB to SATA adapter cable.
Then disconnect all the drives except the 4 @ 1 TB SSD's for the RAID10
/home, boot your USB Debian drive, assemble the RAID10, mount the file
system read-only, and test for the 30 second delay.
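A sketch of that rescue session (/dev/md0p1 and the mount point are
taken from your df output above; adjust to what mdadm actually finds):

# mdadm --assemble --scan
# mount -o ro /dev/md0p1 /mnt
# ls /mnt

If it assembles and mounts read-only, your data is reachable and you can
take your time from there.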
David