I had an issue some months back that turned out to be a bad RAM stick
in my NAS. The errors would not show up right after a restart, but after
some usage the machine would hit the bad RAM and things would fail. :(

This may not be your issue, but I remember how annoying it was to figure out.
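
If you want a quick way to look for this kind of intermittent corruption,
here is a rough sketch, not a definitive test. It hashes the same unchanged
file several times and counts the distinct digests it sees. The function
names (digest, repeat_digests, is_consistent) and the default of five passes
are my own assumptions for illustration. A file that is not being modified
should give one digest every time; more than one suggests the data is being
damaged somewhere between the disk and the CPU, which could be RAM but could
also be a cable or controller. A dedicated memory tester such as memtest86+
is the more direct check.

```python
import hashlib
from collections import Counter


def digest(path, algo="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at path, read in chunks."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives do not have to fit in memory.
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def repeat_digests(path, passes=5, algo="sha256"):
    """Hash the same file `passes` times and count each distinct digest."""
    return Counter(digest(path, algo) for _ in range(passes))


def is_consistent(counts):
    """True when every pass produced the same digest."""
    return len(counts) == 1
```

Call repeat_digests() on one of the suspect archives and pass the result to
is_consistent(). Running several copies of this at once, the way the script
does, puts more memory in play and may make a marginal stick easier to catch.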

On Fri, May 22, 2026 at 9:53 AM Charles Curley
<[email protected]> wrote:
>
> I have four four terabyte hard drives. Each has a partition on it. The
> four partitions comprise a RAID 5 array using mdadm. On top of that,
> LUKS encryption, then LVM with ext4 logical volumes.
>
> On one LVM partition I have a number of backup files, tarred,
> bzipped, and sha256 and sha512 summed. I have a script which will find
> checksum files, and execute the appropriate program to test the
> archives. It puts each program into the background, parallelizing any
> number of checksum tests.
>
> Starting about a week ago, the script finds an error in one or more
> files out of several. Results are inconsistent: one pass may find an
> error in a given file, the next pass may not find any errors in it. Running
> checksums manually, one at a time, does not turn up an error. Running
> "tar tvf" finds no error in a suspect file. Running "bunzip2 -t" also
> turns up no error. Only running the script turns up any errors.
>
> I create two checksum files when I create the backups, for sha256 and
> sha512. After this problem surfaced (about a week ago), I then made two
> new checksum files of a suspect file. The two checksum file pairs
> (e.g. both sha512sum files) show the same checksums. The script now
> tests using both the old and new checksum files. Sometimes only one pair
> of checksum files fails the suspect file.
>
> In addition to all of that, I also get the occasional "bad message"
> error. I have no idea what that means, but an fsck seems to deal with
> it.
>
> To be thorough, I have run extended SMART tests on the hard drives,
> kicked mdadm into testing the RAID array, and fscked the LVM partitions
> on the RAID array. Only fsck turned up issues, and that has not stopped.
>
> I also back some of this up to offsite USB drives. I ran the script on
> one of those, using a different computer. No errors reported.
>
> I have a hypothesis as to what is going on, but would like to hear from
> you before I discuss it.
>
> --
> Does anybody read signatures any more?
>
> https://charlescurley.com
> https://charlescurley.com/blog/
>


-- 
- Andrew "lathama" Latham -
