Hi everyone, Has anyone seen an issue where the CUDA_VISIBLE_DEVICES environment variable is set to an integer (0, 1, 2, or 3 for us) instead of the MIG UUID (MIG-xxx) when AMD SMT is enabled? I'm not sure whether this is a bug, but it feels like one. Certain libraries, such as PyTorch 1.13, cannot find a MIG device when CUDA_VISIBLE_DEVICES is set to an integer.
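For anyone who wants to compare, here is a minimal sketch for checking which form the variable takes inside a job. It assumes entries are comma-separated and that UUID-style entries begin with "MIG-" or "GPU-"; the helper name classify_visible_devices is hypothetical and only for illustration.

```python
import os


def classify_visible_devices(value):
    """Label each comma-separated entry of a CUDA_VISIBLE_DEVICES string.

    Returns a list of (entry, kind) tuples where kind is one of
    'index', 'mig-uuid', 'gpu-uuid', or 'other'.
    """
    labels = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            # Skip empty fragments such as those from an unset variable.
            continue
        if entry.isdigit():
            labels.append((entry, "index"))
        elif entry.startswith("MIG-"):
            # Assumption: MIG devices are identified by a "MIG-" prefix.
            labels.append((entry, "mig-uuid"))
        elif entry.startswith("GPU-"):
            # Assumption: full GPUs are identified by a "GPU-" prefix.
            labels.append((entry, "gpu-uuid"))
        else:
            labels.append((entry, "other"))
    return labels


if __name__ == "__main__":
    raw = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    print(f"CUDA_VISIBLE_DEVICES={raw!r}")
    for entry, kind in classify_visible_devices(raw):
        print(f"{entry}: {kind}")
```

Running this inside a job with SMT enabled and again with it disabled should show whether the entries come back as plain indices or as MIG-xxx identifiers.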
Thanks,
--
Zachary Newell
Research Computing Engineer
NSHE System Computing Services