> On Aug 21, 2025, at 4:07 AM, Miles Goodhew <[email protected]> wrote:
>
> Hi Robert,
> I'm not an expert on the low-level details and "modern" Ceph, so I hope I
> don't lead you on any wild goose chases, but I might at least give some leads.
> It seems odd that the metrics mention NVMe - I'm guessing that it's just a
> cross-product test that tries all tools on all devices.
Recent releases of smartctl pass through stats for NVMe devices via the
nvme-cli command "nvme". Whether it invokes that for every device, and in
what order, I don't know.
> SMART test failure is more of an issue. It's a pity the error message is so
> nondescript. Some things I can think of from simplest to most complicated are:
> * Are smartmontools installed on the drive host?
Does it happen with other drives on the same host?
If your chassis vendor makes drive firmware available, check for an
update.
> * Does the monitoring UID have sudo access?
> * Does a manual "sudo smartctl -a /dev/sdc" give the same or similar result?
> * Is the drive managed by a hardware RAID controller or concentrator (Like
> Dell PERC or a USB adapter or something)
> * (This is a stretch) Is there an OSD for the drive that's given the "NVME"
> class?
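To make the manual checks in Miles's list a bit more concrete, here is a rough Python sketch. The exit-status bit meanings come from the smartctl(8) man page; the function names and the use of "sudo -n" are my own choices for illustration, not what Ceph itself runs.

```python
import shutil
import subprocess

# Meaning of each bit in smartctl's exit status, per the smartctl(8) man page.
SMARTCTL_STATUS_BITS = [
    "command line did not parse",
    "device open failed, or device is in low-power mode",
    "a SMART or ATA command to the disk failed, or checksum error",
    "SMART status check returned DISK FAILING",
    "prefail attributes at or below threshold",
    "attributes were at or below threshold in the past",
    "device error log contains errors",
    "self-test log contains errors",
]


def decode_smartctl_status(status):
    """Return the descriptions of the bits set in a smartctl exit status."""
    return [
        text
        for bit, text in enumerate(SMARTCTL_STATUS_BITS)
        if status & (1 << bit)
    ]


def manual_check(dev="/dev/sdc"):
    """Hypothetical helper: run the manual checks and return a list of findings."""
    if shutil.which("smartctl") is None:
        return ["smartctl is not on PATH: is smartmontools installed?"]
    # -n makes sudo fail instead of prompting for a password, which is
    # closer to what a non-interactive daemon would experience.
    proc = subprocess.run(
        ["sudo", "-n", "smartctl", "-a", dev],
        capture_output=True,
        text=True,
        check=False,
    )
    findings = decode_smartctl_status(proc.returncode)
    if proc.returncode == 1 and "sudo" in proc.stderr:
        # sudo also exits with 1 on its own failures, so the status is ambiguous.
        findings.append("sudo may have refused the command: " + proc.stderr.strip())
    return findings
```

Calling manual_check("/dev/sdc") on the OSD host as the user the daemon runs as should show whether the problem is a missing binary, sudo, or smartctl itself. Note that an exit status of 1 is ambiguous: if it came from smartctl it would be bit 0 (command line did not parse), but it may equally be sudo's own failure code.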
>
> Hope that gives you something.
>
> M0les.
>
>
> On Thu, 21 Aug 2025, at 17:15, Robert Sander wrote:
>> Hi,
>>
>> On a new cluster with version 19.2.3 the device health metrics only show a
>> smartctl error:
>>
>> {
>> "20250821-000313": {
>> "dev": "/dev/sdc",
>> "error": "smartctl failed",
>> "nvme_smart_health_information_add_log_error": "nvme returned an error: sudo: exit status: 1",
>> "nvme_smart_health_information_add_log_error_code": -22,
>> "nvme_vendor": "ata",
>> "smartctl_error_code": -22,
>> "smartctl_output": "smartctl returned an error (1): stderr:\nsudo: exit status: 1\nstdout:\n"
>> }
>> }
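If you want to check whether every device reports the same failure, something like the following Python sketch could scan the JSON that "ceph device get-health-metrics <devid>" prints. The sample is trimmed from the record above; the function name is hypothetical, and reading -22 as a negated errno is my assumption, not something the output states.

```python
import errno
import json

# Trimmed copy of the record from the original post.
SAMPLE = """
{
  "20250821-000313": {
    "dev": "/dev/sdc",
    "error": "smartctl failed",
    "nvme_vendor": "ata",
    "smartctl_error_code": -22
  }
}
"""


def failed_snapshots(metrics_json):
    """Return (timestamp, device, error, errno name) for each failed snapshot."""
    rows = []
    for stamp, rec in sorted(json.loads(metrics_json).items()):
        if "error" not in rec:
            continue  # this snapshot holds real SMART data
        code = rec.get("smartctl_error_code")
        name = None
        if isinstance(code, int) and code < 0:
            # Assumption: a negative code is a negated errno value.
            name = errno.errorcode.get(-code, "unknown")
        rows.append((stamp, rec.get("dev"), rec["error"], name))
    return rows


for row in failed_snapshots(SAMPLE):
    print(row)
```

On Linux, errno 22 is EINVAL, which fits a wrapper reporting that the command it ran failed rather than a status produced by the drive itself.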
>>
>> The device in question (like all the other in the cluster) is a Samsung
>> MZ7L37T6 SATA SSD.
>>
>> What is happening here?
>>
>> Regards
>> --
>> Robert Sander
>> Linux Consultant
>>
>> Heinlein Consulting GmbH
>> Schwedter Str. 8/9b, 10119 Berlin
>>
>> https://www.heinlein-support.de
>>
>> Tel: +49 30 405051 - 0
>> Fax: +49 30 405051 - 19
>>
>> Amtsgericht Berlin-Charlottenburg - HRB 220009 B
>> Geschäftsführer: Peer Heinlein - Sitz: Berlin
>> _______________________________________________
>> ceph-users mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
>>