Cluster Requirements
Note
Clustering is only available in the non-personal CML products.
If you are installing CML-Personal or CML-Personal Plus, this information about clustering does not apply to your CML installation. Personal CML products can only be configured as standalone CML installations.
A CML cluster consists of multiple CML instances, where one CML instance acts as the controller for one or more computes.
When you deploy a cluster, the controller can be either a CML VM or a CML bare metal deployment. All computes in a cluster should be of the same type: either all VMs or all bare metal servers. The supported deployment options for a CML cluster are:
A bare metal cluster
Controller is a CML bare metal deployment, configured as a controller.
Computes are CML bare metal deployments, configured as computes.
The controller may be configured to also run node VMs.
A cluster of VMs
Controller is a CML VM deployment, configured as a controller.
Computes are CML VM deployments, configured as computes.
The controller may be configured to also run node VMs.
Dedicated VM controller with bare metal computes
Controller is a CML VM deployment, configured as a controller that does not also run node VMs.
Computes are CML bare metal deployments, configured as computes.
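The three options above can be sketched as a small planning check. This is only an illustration, not part of CML: the `ClusterPlan` type, the `is_supported_plan` function, and the `VM` / `BARE_METAL` constants are hypothetical names, and the sketch assumes that any combination not listed above is unsupported.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for the two deployment types named in the list above.
VM = "vm"
BARE_METAL = "bare_metal"


@dataclass
class ClusterPlan:
    controller_kind: str         # VM or BARE_METAL
    compute_kinds: List[str]     # one entry per compute
    controller_runs_nodes: bool  # True if the controller also runs node VMs


def is_supported_plan(plan: ClusterPlan) -> bool:
    """Return True if the plan matches one of the three listed options."""
    kinds = set(plan.compute_kinds)
    if len(kinds) != 1:
        # Either no computes at all, or a mix of VMs and bare metal servers.
        return False
    compute_kind = next(iter(kinds))
    if compute_kind not in (VM, BARE_METAL):
        return False
    if plan.controller_kind == compute_kind:
        # All-bare-metal or all-VM cluster; the controller may run node VMs.
        return True
    if plan.controller_kind == VM and compute_kind == BARE_METAL:
        # VM controller with bare metal computes: the controller is dedicated.
        return not plan.controller_runs_nodes
    # Assumption: a bare metal controller with VM computes is not listed,
    # so this sketch treats it as unsupported.
    return False
```

For example, `is_supported_plan(ClusterPlan(VM, [BARE_METAL, BARE_METAL], False))` accepts a dedicated VM controller with two bare metal computes, while the same plan with `controller_runs_nodes=True` is rejected.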
Best practices for CML cluster deployments include:
Ideally, all computes are uniform (same CPU speed and core count, same amount of memory, etc.).
The cluster controller should be a dedicated VM or bare metal CML instance. That is, the controller should not also run node VMs, if possible.
If you are deploying the cluster computes as VMs on ESXi, then each compute should run on a dedicated ESXi host with no other active VMs.
In particular, it generally does not make sense to run two separate CML compute VMs on the same ESXi host. You could instead just run one larger CML compute VM.
The only time it might make sense to run a smaller CML compute VM on an ESXi host is when that is needed to keep all of the computes uniform.
If you are deploying a cluster, your cluster controller and cluster computes must meet the System Requirements of a regular, standalone CML deployment. Additionally, a cluster deployment must meet the following requirements:
The controller and all computes must run the same version of CML.
Computes must support VMX extensions. The controller must also support VMX if the controller also runs node VMs.
At least two machines (physical or VMs) are needed to form a cluster: one controller and one or more computes.
The controller and each compute need at least two NICs. We recommend 10 Gbps NICs.
Two adapter cards are not required, but the OS must see two active interfaces (ports).
One interface is used for the UI / API / management access to the CML server.
The other interface is used for intra-cluster communication.
As with current releases, if separation of UI/API and external traffic is required, then you can use additional NICs on the controller.
Latency between all computes in the cluster should be at most 5 ms.
The controller and each compute must be layer-2 adjacent (e.g. connected to the same subnet for the NIC that is used for intra-cluster communication). The front-end / first NIC should normally be in the same subnet, too.
The controller needs disk space to store (and host) the content of the reference platform ISO, which is copied to the controller’s local drive during initial setup.
Computes need additional disk space to store the data for each running node’s VM. If the controller is configured to also run node VMs, then it also needs disk space for the nodes’ data.
Ideally, all computes should have the same CPU type / capabilities.
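Several of the requirements above lend themselves to an automated pre-flight check. The following is a minimal sketch under stated assumptions, not a CML tool: the `Host` type and the `check_cluster` and `cpuinfo_has_vmx` functions are hypothetical, and the host data is assumed to have been collected by some other means. The only system detail relied on is that Linux lists `vmx` in the `flags` lines of `/proc/cpuinfo` on CPUs with Intel VT-x enabled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    role: str            # "controller" or "compute"
    cml_version: str
    active_nics: int     # interfaces the OS sees as active
    has_vmx: bool
    runs_node_vms: bool  # only meaningful for the controller


def cpuinfo_has_vmx(cpuinfo_text: str) -> bool:
    """Look for 'vmx' in the 'flags' lines of /proc/cpuinfo content."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "vmx" in value.split():
            return True
    return False


def check_cluster(hosts: List[Host], max_latency_ms: float) -> List[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    controllers = [h for h in hosts if h.role == "controller"]
    computes = [h for h in hosts if h.role == "compute"]
    if len(controllers) != 1 or len(computes) < 1:
        problems.append("need exactly one controller and at least one compute")
    if len({h.cml_version for h in hosts}) > 1:
        problems.append("CML versions differ between hosts")
    for h in hosts:
        if h.active_nics < 2:
            problems.append(f"{h.name}: fewer than two active interfaces")
        # Computes always need VMX; the controller only if it runs node VMs.
        needs_vmx = h.role == "compute" or h.runs_node_vms
        if needs_vmx and not h.has_vmx:
            problems.append(f"{h.name}: VMX not available")
    # max_latency_ms is the worst measured latency between any two computes.
    if max_latency_ms > 5.0:
        problems.append("latency between computes exceeds 5 ms")
    return problems
```

On a Linux host, `cpuinfo_has_vmx(open("/proc/cpuinfo").read())` could supply the `has_vmx` field. Disk space and layer-2 adjacency are not covered by this sketch.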
The intra-cluster network connecting computes should meet these requirements:
The network must support jumbo frames. (With modern NICs and offloading, this may not be a strict requirement, but it can help with performance.)
The network must allow IPv6 link-local traffic between all hosts.
The network should not connect other nodes or carry any other traffic. That is, it should be on a dedicated VLAN.
For maximum performance, a dedicated physical network is recommended.
While not required, the intra-cluster communication network should be isolated from the UI/API network.
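The network requirements above can be partly verified from interface data. The helpers below are a hedged sketch with hypothetical names (`is_jumbo_mtu`, `has_ipv6_link_local`, `same_subnet`); they use only Python's standard `ipaddress` module. The 9000-byte MTU threshold is an assumption based on common jumbo frame configurations, since this document does not state a specific value.

```python
import ipaddress
from typing import Iterable


def is_jumbo_mtu(mtu: int, min_mtu: int = 9000) -> bool:
    """True if the MTU meets the threshold (9000 is an assumed default)."""
    return mtu >= min_mtu


def has_ipv6_link_local(addresses: Iterable[str]) -> bool:
    """True if any address is an IPv6 link-local (fe80::/10) address."""
    for text in addresses:
        # Drop a prefix length ("/64") and a zone ("%eth1") if present.
        bare = text.split("/")[0].split("%")[0]
        try:
            addr = ipaddress.ip_address(bare)
        except ValueError:
            continue  # skip anything that is not an IP address
        if addr.version == 6 and addr.is_link_local:
            return True
    return False


def same_subnet(interfaces: Iterable[str]) -> bool:
    """True if all 'address/prefix' strings fall in one network."""
    networks = {ipaddress.ip_interface(i).network for i in interfaces}
    return len(networks) == 1
```

On Linux, the MTU of an interface can be read from `/sys/class/net/<iface>/mtu`. A shared subnet and the presence of link-local addresses are necessary conditions only: they do not prove that the network actually passes jumbo frames or IPv6 link-local traffic between hosts, which still has to be tested end to end.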