Yes, I agree about the reservation; that was the next thing I was about to
focus on.

Please do show your reservation config.

On Wed, Nov 26, 2025, 3:26 PM Christopher Samuel via slurm-users <
[email protected]> wrote:

> On 11/13/25 2:16 pm, Lee via slurm-users wrote:
>
> > 1. When I look at our 8 non-MIG DGXs, via `scontrol show node=dgxXY |
> > grep Gres`, 7/8 DGXs report "Gres=gpu:*H100*:8(S:0-1)" while dgx09
> > reports "Gres=gpu:*h100*:8(S:0-1)"
>
> Two thoughts:
>
> 1) Looking at the 24.11 code, when it uses NVML to get the names,
> everything gets lowercased. So I wonder whether the new ones are being
> correctly discovered by NVML while the older ones are not, and are
> therefore using the uppercase values from your config?
>
>         gpu_common_underscorify_tolower(device_name);
>
> I would suggest making sure the GPU names are lower-cased everywhere for
> consistency.
>
> 2) From memory (away from work at the moment), slurmd caches hwloc
> library information in an XML file. You might want to find that file
> on an older and a newer node and compare the two to see whether the same
> difference shows up there. It could also be interesting to stop slurmd on
> an older node, move that XML file out of the way, and start slurmd again
> to see whether that changes how it reports the node.
>
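
For the first point above, a rough way to spot nodes whose Gres type is not
lower-cased might look like the sketch below. The helper names
(gres_type, normalize_gpu_name, check_gres_line) are hypothetical and not part
of Slurm, and the "lowercase plus spaces to underscores" rule is only an
assumption based on the name gpu_common_underscorify_tolower, not something
copied from the Slurm source.

```shell
# Sketch only: these helpers are hypothetical, not Slurm tooling.

# Extract the GPU type from a line such as "Gres=gpu:H100:8(S:0-1)".
gres_type() {
    sed -n 's/.*Gres=gpu:\([^:]*\):.*/\1/p'
}

# Lowercase a name and turn spaces into underscores.
# ASSUMPTION: this mirrors what gpu_common_underscorify_tolower() does,
# judging only by its name.
normalize_gpu_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr ' ' '_'
}

# Report whether the type in a Gres line already matches its normalized form.
check_gres_line() {
    gres_name=$(printf '%s\n' "$1" | gres_type)
    gres_want=$(normalize_gpu_name "$gres_name")
    if [ "$gres_name" = "$gres_want" ]; then
        echo "ok: $gres_name"
    else
        echo "mismatch: $gres_name (expected $gres_want)"
    fi
}

# The two values reported in this thread.
check_gres_line "Gres=gpu:H100:8(S:0-1)"
check_gres_line "Gres=gpu:h100:8(S:0-1)"
```

On a real node the argument would presumably come from
`scontrol show node=dgxXY | grep Gres`, as in the original report.
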
> Also, I saw you posted "slurmd -G" output from the new node; could you
> post that from an older one too, please?
>
> Best of luck,
> Chris
> --
> Chris Samuel  :  http://www.csamuel.org/  :  Philadelphia, PA, USA
>
> --
> slurm-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>