Re-ordered for clarity -- David.

On 1/12/24 18:42, gene heskett wrote:
I just found an mbox file in my home directory, containing about 90 days worth of undelivered msgs from smartctl running as root.


Do you know how the mbox file got there?


smartctl says my raid10 is dying, ...


Please post a console session with a command that displays the message.


On 1/12/24 20:57, gene heskett wrote:
> ... there are 4 1t drives as a raid10, and the
> various messages in that mbox file name 3 of the individual drives.


Please post a representative sample of the messages.


> Then I find the linux has played 52 pickup with the device names.


/dev/sd* device node names are unpredictable.  The traditional solution is UUIDs.  Linux added /dev/disk/by-id/* a while ago, and I am starting to use those names as much as possible.  Make sure you look very carefully at the serial numbers when you have several drives of the same make and model.
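For example, to see which kernel name each stable name currently points at (a sketch; it assumes udev has populated /dev/disk/by-id/*, as it does on stock Debian):

```shell
#!/bin/sh
# Print each stable by-id name and the kernel device it points at.
# Assumes udev has populated /dev/disk/by-id/ (true on stock Debian);
# on a machine with no such links the loop simply prints nothing.
for link in /dev/disk/by-id/*; do
    [ -e "$link" ] || continue          # glob matched nothing
    printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
done
```

The serial number is embedded in the by-id name, so the mapping survives reboots and cable moves.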


> There are in actual fact 3 sata controller is this machine, the
> motherboards 6 ports, 6 more on an inexpensive sata contrller that are
> actually the 4 raid10 amsung 870 1T drives, and 4 more on a more
> sxpensive 16 port card which has a quartet of 2T gigastone SSD's on it,
> but the drives are not found in the order of the controllers. That
> raid10 was composed w/o the third controller.


So:

* /home is on a RAID 10 built from 4 @ 1 TB Samsung 870 SSD (2 mirrored pairs of 2)?

* 4 @ 2 TB Gigastone SSD for a new RAID 10?


What drives are connected to which ports?


What is on the other 20 ports?


> blkid does not sort them in order either. And of coarse does not list
> whats unmounted, forcing me to ident the drive by gparted in order to
> get its device name. From that I might be able to construct another raid
> from the 8T of 4 2T drives but its confusing as hell when the first of
> those 2T drives is assigned /dev/sde and the next 4 on the new
> controller are /dev/sdi, j, k, & l.
> So it appears I have 5 of those gigastones, and sde is the odd one


I am confused -- do you have 4 or 5 Gigastone 2 TB SSD?


> So that one could be formatted ext4 and serve as a backup of the raid10.

> how do I make an image of that
> raid10  to /dev/sde and get every byte?  That seems like the first step
> to me.


Please get a USB 3.x HDD, do a full backup of your entire computer, put it off-site, get another USB 3.x HDD, do another full backup, and keep it nearby.


>   But since I can't copy a locked file,


What file is locked?  Please post a console session that demonstrates the problem.


> /dev/sde1 has been formatted and mounted, what cmd line will copy every
> byte including locked files in that that raid10 to it?


See above regarding the locked file.  Otherwise, I suggest rsync(1).


On 1/13/24 09:02, gene heskett wrote:
> ... I've done this this morning:
> used gparted to format to ext4 a single gpt partition on that /dev/sde
> with a LABEL=homesde1 but forgot the 1 when editing /etc/fstab to
> remount it on a reboot to /mnt/homesde1, which resulted in a failed
> boot, look up the root pw and finally get in to fix /etc/fstab for the
> missing 1 in the labelname.
>
> but first mounted a 2t gigastone ssd to /mnt/homesde1 which is where it
> showed up in an lsblk -f report.
> Spent 2+ hours rsync'ing with:
> sudo rsync -av /home/ /mnt/homesde1
> which worked entirely within the same 6 port controller as this raid10
> is running on.
>
> reboot failed, moved the data cable to the motherboard port 5 or 6 (or
> maybe 1 or 2, 6 ports, nfi which is 0 and which is 5) but its on the
> mobo ports now, should be easily found at boot time.
>
> Finally look up root pw, get in to fix /etc/fstab and get booted.
> Talk about portable devicenames, that drive is now /dev/sdk1 !!!  And
> empty of a LABELname but now has the 360gigs of data I just rsync'd to it.
> but on reboot, its now /dev/sdb1 and empty.
>
> from a df:
> gene@coyote:~$ df
> Filesystem      1K-blocks      Used  Available Use% Mounted on
> udev             16327704         0   16327704   0% /dev
> tmpfs             3272684      1888    3270796   1% /run
> /dev/sda1       863983352 376505108  443516596  46% /
> tmpfs            16363420      1244   16362176   1% /dev/shm
> tmpfs                5120         8       5112   1% /run/lock
> /dev/sda3        47749868       132   45291728   1% /tmp
> /dev/md0p1     1796382580 334985304 1370072300  20% /home
> /dev/sdb1      1967892164        28 1867855888   1% /mnt/homesde1
> tmpfs             3272684      2544    3270140   1% /run/user/1000
> gene@coyote:~$
> and gparted now says that indeed, /dev/sdb is the drive with the label
> "homesde1" on it.
> And showing 31GiB used. What for unless thats ext4
> overhead. All I can see on /mnt/homesde1 is lost+found, which is empty.
> So at this point I still have a home raid10, and have NDI


What does "NDI" mean?


> where the he!!
> the rsync line actually copied 360 Gb of stuff from home to.


Please post:

# ls -a /mnt

# ls -a /mnt/homesde1


> gene@coyote:~$ sudo smartctl -a /dev/sdb


Please use /dev/disk/by-id/* paths.


> ...
> Device Model:     Gigastone SSD <- the devices name
> Serial Number:    GST02TBG221146
> ...
> User Capacity:    2,048,408,248,320 bytes [2.04 TB]
> Sector Size:      512 bytes logical/physical
> ...
> Form Factor:      2.5 inches
> TRIM Command:     Available
> ...
> SMART overall-health self-assessment test result: PASSED


Okay.


> SMART Attributes Data Structure revision number: 1
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE UPDATED
> WHEN_FAILED RAW_VALUE
>    1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age
> Always       -       0
>    5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age
> Always       -       0
>    9 Power_On_Hours          0x0032   100   100   050    Old_age
> Always       -       884
> ...


Attribute values of 100 for all VALUE and WORST make sense for a brand-new drive, which contradicts the "Old_age" and "Pre-fail" labels (?).


How long has that drive been in your computer? How many hours per week has it been on?


> SMART Error Log Version: 1
> No Errors Logged


Okay.


> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Please run:

# smartctl -t short /dev/disk/by-id/*GST02TBG221146*


Wait a few minutes.  Then run:


# smartctl -x /dev/disk/by-id/*GST02TBG221146*


We do not need to see the whole output, but please post the "SMART Self-test log structure revision number 1" section. It should show the short test.


> please $diety, deliver me from linux's vaporous disk naming scheme that
> changes faster than the weather. Even device LABEL= does not work. I
> mounted that drive by its label to /mnt/homesde1 and rsync'd /home/ to
> it but that 360Gb of data went someplace else.


Use script(1) to capture your console sessions. Save the files it generates. Use cat(1) to display their contents.
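For example (a sketch; the echoed text and file name are placeholders):

```shell
#!/bin/sh
# Record a command's output with script(1), then display it with cat(1).
# script(1) is in the bsdutils package on Debian, installed by default.
# -q suppresses the start/stop banners; -c runs one command and exits.
script -q -c 'echo example session output' session.txt
cat session.txt
rm session.txt
```

Run script(1) without -c to capture an entire interactive session; type `exit` to stop recording.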


> Since the data, according
> to what I see in gparted, actually went to /dev/sdk1,
> which is another
> of the 2T gigastones, I intend to make a raid6 out of, no harm to my
> data is done. My raid10 was not destroyed. But I'm burned out and
> frustrated. This is hardware, not a roll of the dice per boot.
>
> I can easily erase and restart that drive for a raid with gparted, But
> howinhell do I get a stable drive detection system so I know what I am
> doing??????????????????????????????????????????
>
> Besides that, I'm running low on hair too.


Please use /dev/disk/by-id/* paths.


On 1/13/24 09:23, gene heskett wrote:
> One last question before I embark on a replay of what I just did, and
> failed at. This time by using device labels.
>
> Does making a raid erase the drives label field in a gpt partition scheme?


It has been several years since I used mdadm(8), but I suspect the answer depends upon whether you give whole disks as arguments or give disk partitions as arguments. If you give whole disks, it would not surprise me if mdadm(8) overwrote any and all sectors as it pleased, including MBR and/or GPT partition tables. If you give partitions, I would expect mdadm(8) to not write outside those partitions, so GPT labels should be untouched. Please also see mdadm(8) "CREATE MODE" and the discussion about partition type and version-1.x metadata.
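If I were building that RAID 6 today, the sketch would look something like this.  The serial numbers are hypothetical placeholders; check yours with `ls -l /dev/disk/by-id/` first, and note that this destroys whatever is on those partitions:

```
# mdadm --create /dev/md1 --level=6 --raid-devices=4 \
      /dev/disk/by-id/ata-Gigastone_SSD_SERIAL1-part1 \
      /dev/disk/by-id/ata-Gigastone_SSD_SERIAL2-part1 \
      /dev/disk/by-id/ata-Gigastone_SSD_SERIAL3-part1 \
      /dev/disk/by-id/ata-Gigastone_SSD_SERIAL4-part1
```

Partition each disk first (gparted works) and hand mdadm the partitions, not the whole disks, so the GPT and its labels stay where you put them.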


On 1/13/24 15:21, gene heskett wrote:
> I'm looking for a solution to a broken install, all caused by the
> installer finding a plugged in FDTI usb-serial adapter so it
> automatically assumed I was blind and automatically installed brltty and
> orca, which are not removable once installed without ripping the system
> apart rendering it unbootable. If orca is disabled, the system will
> _NOT_ reboot. And I catch hell for discriminating against the blind when
> I complained at the time.
>
> That took me 20+ installs to get this far because if I removed the exec
> bits on orca,  disabling it=no reboot=yet another re-install go thru the
> same thing with orca yelling at me for every keystroke entered, till
> someone took pity on me and wrote to unplug the usb stuff which looks
> like a weeping willow tree here, nothing more or less.


Do you have a USB drive with an installation of Debian? If not, build one. I used SanDisk Ultra Fit USB 3.0 16 GB drives for many years. Now I use Samsung UM410 16 GB SSD's and a StarTech USB to SATA adapter cable.


Then disconnect all the drives except the 4 @ 1 TB SSD's for the RAID10 /home, boot your USB Debian drive, assemble the RAID10, mount the file system read-only, and test for the 30 second delay.
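From the live USB system, the assembly and read-only mount would look something like this (a sketch; the md device name and mount point are assumptions):

```
# mdadm --assemble --scan --readonly
# mount -o ro /dev/md0p1 /mnt
```

With only the four RAID 10 members connected, --scan should find exactly one array, and the read-only mount protects your data while you test.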


David
