Mohammed,
Generally, you can think of federation as a way to centrally track and
manage multiple clusters: mostly a way to run a single 'sreport' or
'sacct' command across all of them. There are added abilities, such as
being able to specify which cluster a job is sent to, but for all
intents and purposes the clusters themselves are independent, so a
single job always runs entirely within one cluster.
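As a rough sketch of what that looks like from the command line (the
cluster names cluster1/cluster2 and the script job.sh are made up for
illustration; check the man pages for your Slurm version):

```shell
# Submit to one named cluster (cluster1 is a hypothetical name).
sbatch -M cluster1 job.sh

# Offer several clusters; the whole job still lands on exactly one
# of them, it is not split between the two.
sbatch -M cluster1,cluster2 job.sh

# View queued jobs and accounting records across the federation.
squeue --federation
sacct --federation
```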
It sounds like what you may want is multiple partitions in the same
cluster. You can have two partitions that each comprise one set of
nodes and a third that comprises all nodes. Or use a single partition
with all nodes and define features that delineate what each node can do
(a node-locked license, for example). Then you can send a job to a
specific subset of nodes.
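A minimal slurm.conf sketch of the first layout, assuming four 8-CPU
nodes as in your example. The node names, partition names and feature
labels are all invented here, so adjust them to your site:

```
# Hypothetical node names and feature labels.
NodeName=node[01-02] CPUs=8 Features=groupA
NodeName=node[03-04] CPUs=8 Features=groupB

# Two partitions covering one set of nodes each, plus one covering all.
PartitionName=setA Nodes=node[01-02] MaxTime=INFINITE State=UP
PartitionName=setB Nodes=node[03-04] MaxTime=INFINITE State=UP
PartitionName=all  Nodes=node[01-04] Default=YES MaxTime=INFINITE State=UP
```

With something like that, "sbatch --partition=setA job.sh" stays on the
first pair of nodes, "sbatch --partition=all --constraint=groupB job.sh"
picks nodes by feature, and "sbatch --partition=all --ntasks=24 job.sh"
can spread across three of the four nodes, which a federation of two
separate clusters would not do.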
There are quite a few other ways to design the ability you describe,
but separate clusters is not one of them.
Brian Andrus
On 6/26/2023 6:11 AM, mohammed shambakey wrote:
Hi
Just out of interest, I wonder what the exact difference is between
Slurm multi-cluster and federation (apart from unique job IDs and
federation limitations). Usually, I use the "-Mall" option with
multi-cluster.
Initially, I thought the federation would send tasks to more than one
cluster at once (e.g., I had 2 clusters, each cluster has 2 nodes, and
each node has 8 CPUs. I submitted more than 16 tasks, and thought the
federation would submit 16 tasks to the first cluster and the
additional tasks to the second cluster). But it seems the federation
acts similarly to multi-cluster, as only one cluster receives the
tasks.
Regards
--
Mohammed