Hi Thomas,

On 3/17/25 16:50, Thomas Lange wrote:
> In setup-storage, we call
> udevadm settle --timeout=10
>
> before we determine the list of disks like this:
>
> # read the sizes and partition tables of all disks listed in $FAI::disks
> &FAI::get_current_disks;
>
> Maybe the timeout is too short for your hardware?


I have checked with our vendor: the next-to-last step of their burn-in testing is running

dd if=/dev/zero of=/dev/$h bs=1M count=4096

where $h is taken from

lsblk -ido Name,Type,Model | grep disk | cut -d " " -f 1

followed by running `sync` just before powering the machine off via sysrq.
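
Put together, that step presumably looks roughly like the sketch below. This is my reconstruction from the commands above, not the vendor's actual script; `wipe_disks` and `DRY_RUN` are names made up for illustration, and with the default `DRY_RUN=1` it only prints the `dd` commands instead of running them.

```shell
#!/bin/sh
# Sketch only: reconstructs the wipe step from the commands quoted above.
# wipe_disks and DRY_RUN are hypothetical names, not part of any vendor tooling.
# With DRY_RUN=1 (the default) nothing is written; commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

wipe_disks() {
    # Each argument is a kernel device name without the /dev/ prefix.
    for h in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "dd if=/dev/zero of=/dev/$h bs=1M count=4096"
        else
            # Overwrites the first 4 GiB of the disk with zeros (destructive).
            dd if=/dev/zero of="/dev/$h" bs=1M count=4096
        fi
    done
}

# Usage (destructive when DRY_RUN=0):
#   wipe_disks $(lsblk -ido Name,Type,Model | grep disk | cut -d " " -f 1)
#   sync
```

The power-off via sysrq would then follow right after the `sync`.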

So maybe the power-off happened too quickly and the NVMe still had some writing to do even though the `sync` call had already returned? Frankly, I don't know.

So far, I have failed to reproduce the problem myself on one of the nodes where this happened. Given that it affected fewer than 5% of the delivered nodes, I would for now attribute this issue to the burn-in tests and not to fai-client. What do you think?

Thus, feel free to close and sorry for the noise!

Cheers

Carsten
