Mohammed,

Generally, you can think of federation as a way to centrally track and manage multiple clusters; it is mostly a way to run a single 'sreport' or 'sacct' command across all of them. There are added abilities, such as being able to specify which cluster a job is sent to, but for all intents and purposes the clusters themselves are independent, so a given job still runs entirely within one cluster.

It sounds like what you may want is multiple partitions in the same cluster. You can have two partitions that each comprise one of the sets of nodes and a third that comprises all nodes. Or you can have a single partition with all nodes and use features to delineate which node can do what (a node-locked license, for example). Then you can send a job to a specific subset of nodes.
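
As a rough sketch of the two layouts described above, a slurm.conf fragment might look something like the following. The node names (node01 to node04), partition names (setA, setB, all), and feature names (licA, licB) are made up for illustration, and the CPU count just mirrors the 2 x 2 x 8 setup from your message; adjust everything to match your real hardware.

```
# Hypothetical nodes: two sets of two nodes, 8 CPUs each.
# The Feature strings are arbitrary labels, e.g. for a node-locked license.
NodeName=node[01-02] CPUs=8 Feature=licA State=UNKNOWN
NodeName=node[03-04] CPUs=8 Feature=licB State=UNKNOWN

# Option 1: one partition per set of nodes, plus a third covering all nodes.
PartitionName=setA Nodes=node[01-02] State=UP
PartitionName=setB Nodes=node[03-04] State=UP
PartitionName=all  Nodes=node[01-04] Default=YES State=UP

# Option 2: keep only the "all" partition above and select nodes
# by feature at submit time instead of by partition.
```

With something like this, 'sbatch -p setA job.sh' would target only the first set of nodes, 'sbatch -p all -n 32 job.sh' could spread 32 tasks across all four nodes, and 'sbatch -p all --constraint=licA job.sh' would restrict the job to nodes carrying that feature.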

There are quite a few other ways to design the setup you describe, but separate clusters are not one of them.

Brian Andrus

On 6/26/2023 6:11 AM, mohammed shambakey wrote:
Hi

Just out of interest, I wonder what the exact difference is between Slurm multi-cluster and federation (apart from unique job IDs and federation limitations). Usually, I use the "-Mall" option with multi-cluster. Initially, I thought federation would send tasks to more than one cluster at once. For example, I had 2 clusters, each with 2 nodes, and each node has 8 CPUs. I submitted more than 16 tasks and thought the federation would submit 16 tasks to the first cluster and the additional tasks to the second cluster. But it seems federation acts similarly to multi-cluster, as only one cluster receives the tasks.

Regards

--
Mohammed
