diff --git a/enhancements/installer/vsphere-ipi-zonal.md b/enhancements/installer/vsphere-ipi-zonal.md index 428b072aaa..1eb750e5b4 100644 --- a/enhancements/installer/vsphere-ipi-zonal.md +++ b/enhancements/installer/vsphere-ipi-zonal.md @@ -4,7 +4,7 @@ authors: - "@jcpowermac" reviewers: - "@rvanderp3" - - "@bostrt" + - "@vr4manta" - "@JoelSpeed" approvers: - "@rvanderp3" @@ -13,7 +13,7 @@ api-approvers: - "@JoelSpeed" - "@deads2k" creation-date: 2021-09-21 -last-updated: 2022-08-24 +last-updated: 2024-09-26 status: implementable see-also: - "/enhancements/" @@ -23,6 +23,7 @@ superseded-by: - tracking-link: - https://issues.redhat.com/browse/SPLAT-320 +- https://issues.redhat.com/browse/SPLAT-1728 --- @@ -40,55 +41,56 @@ tracking-link: ## Summary The goal of this enhancement is to provide the ability to install in a -vSphere environment with multiple datacenters and clusters. +vSphere environment with multiple types of failure domains. -This will be an opinionated design, the vCenter datacenter will always be a `region` -and a vCenter cluster will always be a `zone`. +The failure domain types include: +- Multiple vCenters, Datacenters and Clusters +- Host and VM Groups - where Clusters are a region and ESXi nodes (Host Group) are the zones. +This is important in vCenter clusters that are stretched over physical datacenters. ## Motivation Users of OpenShift would like the ability to deploy -within multiple datacenters and clusters to increase +within multiple physical and virtual datacenters and clusters to increase reliability. Customers would also like to take advantage of the concept of regions and zoning that this type of deployment would offer. 
- https://issues.redhat.com/browse/RFE-845 +- https://issues.redhat.com/browse/RFE-4540 +- https://issues.redhat.com/browse/RFE-4803 +- https://issues.redhat.com/browse/RFE-5527 - https://issues.redhat.com/browse/OCPPLAN-4927 +- https://issues.redhat.com/browse/OCPSTRAT-1577 ### Goals -- Support regions and zones in vSphere using multiple datacenters (region) and -clusters (zone) -- Support installation into multiple datacenters and multiple clusters +Install OpenShift on vSphere with a defined topology. This includes: + - using multiple vCenters, datacenters and clusters + - using cluster and host groups (including vm-host groups and rules) ### Non-Goals -- Support multiple subnets -- Support multiple vcenters +## Existing and Proposal -Note: The platform spec will be modified to support this at a future date. -Only a single item in `vcenters` will be supported for the initial release. - -## Proposal - -Modification of the installer to support the provisioning of masters in defined -datacenters and clusters. +Modification of the installer to support: +- the provisioning of control plane nodes in defined datacenters and clusters. +- provisioning in a stretched vSphere cluster using a cluster as a region and hosts as zones ### Workflow Description ### api -#### Infrastructure +#### Infrastructure spec Using the upstream [vSphere cluster api](https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/apis/v1beta1/vspherefailuredomain_types.go) and the upstream [vSphere cloud config manager](https://github.com/kubernetes/cloud-provider-vsphere/blob/master/pkg/common/config/types_yaml.go) as examples to implement parameters of `VSpherePlatformSpec`. These parameters include the optional and required information to manage a OpenShift cluster on vSphere.
+Current platform spec before additions for vm-host based zonal ```golang -// VSpherePlatformFailureDomainSpec holds the region and zone failure domain and -// the vCenter topology of that failure domain. +// VSpherePlatformFailureDomainSpec holds the region and zone failure domain and the vCenter topology of that failure domain. type VSpherePlatformFailureDomainSpec struct { // name defines the arbitrary but unique name // of a failure domain. @@ -145,14 +147,18 @@ type VSpherePlatformTopology struct { ComputeCluster string `json:"computeCluster"` // networks is the list of port group network names within this failure domain. - // Currently, we only support a single interface per RHCOS virtual machine. + // If feature gate VSphereMultiNetworks is enabled, up to 10 network adapters may be defined. + // 10 is the maximum number of virtual network devices which may be attached to a VM as defined by: + // https://configmax.esp.vmware.com/guest?vmwareproduct=vSphere&release=vSphere%208.0&categories=1-0 // The available networks (port groups) can be listed using // `govc ls 'network/*'` - // The single interface should be the absolute path of the form + // Networks should be in the form of an absolute path: // //network/. // +kubebuilder:validation:Required - // +kubebuilder:validation:MaxItems=1 + // +openshift:validation:FeatureGateAwareMaxItems:featureGate="",maxItems=1 + // +openshift:validation:FeatureGateAwareMaxItems:featureGate=VSphereMultiNetworks,maxItems=10 // +kubebuilder:validation:MinItems=1 + // +listType=atomic Networks []string `json:"networks"` // datastore is the absolute path of the datastore in which the @@ -180,6 +186,22 @@ type VSpherePlatformTopology struct { // +kubebuilder:validation:Pattern=`^/.*?/vm/.*?` // +optional Folder string `json:"folder,omitempty"` + + // template is the full inventory path of the virtual machine or template + // that will be cloned when creating new machines in this failure domain. 
+ // The maximum length of the path is 2048 characters. + // + // When omitted, the template will be calculated by the control plane + // machineset operator based on the region and zone defined in + // VSpherePlatformFailureDomainSpec. + // For example, for zone=zonea, region=region1, and infrastructure name=test, + // the template path would be calculated as //vm/test-rhcos-region1-zonea. + // +openshift:enable:FeatureGate=VSphereControlPlaneMachineSet + // +kubebuilder:validation:MinLength=1 + // +kubebuilder:validation:MaxLength=2048 + // +kubebuilder:validation:Pattern=`^/.*?/vm/.*?` + // +optional + Template string `json:"template,omitempty"` } // VSpherePlatformVCenterSpec stores the vCenter connection fields. @@ -210,6 +232,7 @@ type VSpherePlatformVCenterSpec struct { // a topology. // +kubebuilder:validation:Required // +kubebuilder:validation:MinItems=1 + // +listType=set Datacenters []string `json:"datacenters"` } @@ -223,6 +246,7 @@ type VSpherePlatformNodeNetworkingSpec struct { // that will be used in respective status.addresses fields. // --- // + Validation is applied via a patch, we validate the format as cidr + // +listType=set // +optional NetworkSubnetCIDR []string `json:"networkSubnetCidr,omitempty"` @@ -239,6 +263,7 @@ type VSpherePlatformNodeNetworkingSpec struct { // the IP address from the VirtualMachine's VM for use in the status.addresses fields. // --- // + Validation is applied via a patch, we validate the format as cidr + // +listType=atomic // +optional ExcludeNetworkSubnetCIDR []string `json:"excludeNetworkSubnetCidr,omitempty"` } @@ -256,21 +281,32 @@ type VSpherePlatformNodeNetworking struct { // VSpherePlatformSpec holds the desired state of the vSphere infrastructure provider. // In the future the cloud provider operator, storage operator and machine operator will // use these fields for configuration. 
+// +kubebuilder:validation:XValidation:rule="!has(oldSelf.apiServerInternalIPs) || has(self.apiServerInternalIPs)",message="apiServerInternalIPs list is required once set" +// +kubebuilder:validation:XValidation:rule="!has(oldSelf.ingressIPs) || has(self.ingressIPs)",message="ingressIPs list is required once set" +// +kubebuilder:validation:XValidation:rule="!has(oldSelf.vcenters) && has(self.vcenters) ? size(self.vcenters) < 2 : true",message="vcenters can have at most 1 item when configured post-install" type VSpherePlatformSpec struct { // vcenters holds the connection details for services to communicate with vCenter. - // Currently, only a single vCenter is supported. + // Currently, only a single vCenter is supported, but in tech preview 3 vCenters are supported. + // Once the cluster has been installed, you are unable to change the current number of defined + // vCenters except in the case where the cluster has been upgraded from a version of OpenShift + // where the vsphere platform spec was not present. You may make modifications to the existing + // vCenters that are defined in the vcenters list in order to match with any added or modified + // failure domains. // --- // + If VCenters is not defined use the existing cloud-config configmap defined // + in openshift-config. - // +openshift:enable:FeatureSets=TechPreviewNoUpgrade - // +kubebuilder:validation:MaxItems=1 // +kubebuilder:validation:MinItems=0 + // +openshift:validation:FeatureGateAwareMaxItems:featureGate="",maxItems=1 + // +openshift:validation:FeatureGateAwareMaxItems:featureGate=VSphereMultiVCenters,maxItems=3 + // +kubebuilder:validation:XValidation:rule="size(self) != size(oldSelf) ? size(oldSelf) == 0 && size(self) < 2 : true",message="vcenters cannot be added or removed once set" + // +listType=atomic // +optional VCenters []VSpherePlatformVCenterSpec `json:"vcenters,omitempty"` // failureDomains contains the definition of region, zone and the vCenter topology. 
// If this is omitted failure domains (regions and zones) will not be used. - // +openshift:enable:FeatureSets=TechPreviewNoUpgrade + // +listType=map + // +listMapKey=name // +optional FailureDomains []VSpherePlatformFailureDomainSpec `json:"failureDomains,omitempty"` @@ -279,53 +315,261 @@ type VSpherePlatformSpec struct { // If this field is omitted, networking defaults to the legacy // address selection behavior which is to only support a single address and // return the first one found. - // +openshift:enable:FeatureSets=TechPreviewNoUpgrade // +optional NodeNetworking VSpherePlatformNodeNetworking `json:"nodeNetworking,omitempty"` + + // apiServerInternalIPs are the IP addresses to contact the Kubernetes API + // server that can be used by components inside the cluster, like kubelets + // using the infrastructure rather than Kubernetes networking. These are the + // IPs for a self-hosted load balancer in front of the API servers. + // In dual stack clusters this list contains two IP addresses, one from IPv4 + // family and one from IPv6. + // In single stack clusters a single IP address is expected. + // When omitted, values from the status.apiServerInternalIPs will be used. + // Once set, the list cannot be completely removed (but its second entry can). + // + // +kubebuilder:validation:MaxItems=2 + // +kubebuilder:validation:XValidation:rule="size(self) == 2 && isIP(self[0]) && isIP(self[1]) ? ip(self[0]).family() != ip(self[1]).family() : true",message="apiServerInternalIPs must contain at most one IPv4 address and at most one IPv6 address" + // +listType=atomic + // +optional + APIServerInternalIPs []IP `json:"apiServerInternalIPs"` + + // ingressIPs are the external IPs which route to the default ingress + // controller. The IPs are suitable targets of a wildcard DNS record used to + // resolve default route host names. + // In dual stack clusters this list contains two IP addresses, one from IPv4 + // family and one from IPv6. 
+ // In single stack clusters a single IP address is expected. + // When omitted, values from the status.ingressIPs will be used. + // Once set, the list cannot be completely removed (but its second entry can). + // + // +kubebuilder:validation:MaxItems=2 + // +kubebuilder:validation:XValidation:rule="size(self) == 2 && isIP(self[0]) && isIP(self[1]) ? ip(self[0]).family() != ip(self[1]).family() : true",message="ingressIPs must contain at most one IPv4 address and at most one IPv6 address" + // +listType=atomic + // +optional + IngressIPs []IP `json:"ingressIPs"` + + // machineNetworks are IP networks used to connect all the OpenShift cluster + // nodes. Each network is provided in the CIDR format and should be IPv4 or IPv6, + // for example "10.0.0.0/8" or "fd00::/8". + // +listType=atomic + // +kubebuilder:validation:MaxItems=32 + // +kubebuilder:validation:XValidation:rule="self.all(x, self.exists_one(y, x == y))" + // +optional + MachineNetworks []CIDR `json:"machineNetworks"` } + ``` -### CCCMO +vm-host zonal changes required to vsphere infrastructure objects +```diff +diff --git a/config/v1/types_infrastructure.go b/config/v1/types_infrastructure.go +index 0daa62d30..09189861f 100644 +--- a/config/v1/types_infrastructure.go ++++ b/config/v1/types_infrastructure.go +@@ -1162,8 +1162,31 @@ type VSpherePlatformLoadBalancer struct { + Type PlatformLoadBalancerType `json:"type,omitempty"` + } + +-// VSpherePlatformFailureDomainSpec holds the region and zone failure domain and +-// the vCenter topology of that failure domain. ++// The VSphereFailureDomainZoneType is a string representation of a failure domain ++// zone type. There are two supportable types HostGroup and ComputeCluster ++// +enum ++type VSphereFailureDomainZoneType string ++ ++// The VSphereFailureDomainRegionType is a string representation of a failure domain ++// region type. 
There are two supportable types ComputeCluster and Datacenter ++// +enum ++type VSphereFailureDomainRegionType string ++ ++const ( ++ // HostGroupFailureDomainZone is a failure domain zone for a vCenter vm-host group. ++ HostGroupFailureDomainZone VSphereFailureDomainZoneType = "HostGroup" ++ // ComputeClusterFailureDomainZone is a failure domain zone for a vCenter compute cluster. ++ ComputeClusterFailureDomainZone VSphereFailureDomainZoneType = "ComputeCluster" ++ // DatacenterFailureDomainRegion is a failure domain region for a vCenter datacenter. ++ DatacenterFailureDomainRegion VSphereFailureDomainRegionType = "Datacenter" ++ // ComputeClusterFailureDomainRegion is a failure domain region for a vCenter compute cluster. ++ ComputeClusterFailureDomainRegion VSphereFailureDomainRegionType = "ComputeCluster" ++) ++ ++// VSpherePlatformFailureDomainSpec holds the region and zone failure domain and the vCenter topology of that failure domain. ++// +openshift:validation:FeatureGateAwareXValidation:featureGate=VSphereHostVMGroupZonal,rule="has(self.zoneAffinity) && self.zoneAffinity.type == 'HostGroup' ? has(self.regionAffinity) && self.regionAffinity.type == 'ComputeCluster' : true",message="when zoneAffinity type is HostGroup, regionAffinity type must be ComputeCluster" ++// +openshift:validation:FeatureGateAwareXValidation:featureGate=VSphereHostVMGroupZonal,rule="has(self.zoneAffinity) && self.zoneAffinity.type == 'ComputeCluster' ? has(self.regionAffinity) && self.regionAffinity.type == 'Datacenter' : true",message="when zoneAffinity type is ComputeCluster, regionAffinity type must be Datacenter" ++// +openshift:validation:FeatureGateAwareXValidation:featureGate=VSphereHostVMGroupZonal,rule="has(self.zoneAffinity) && self.zoneAffinity.type == 'HostGroup' ? 
has(self.zoneAffinity) && has(self.zoneAffinity.hostGroup) : true",message="when zoneAffinity type is HostGroup, hostGroup must be defined" + type VSpherePlatformFailureDomainSpec struct { + // name defines the arbitrary but unique name + // of a failure domain. +@@ -1188,6 +1211,21 @@ type VSpherePlatformFailureDomainSpec struct { + // +kubebuilder:validation:Required + Zone string `json:"zone"` + ++ // regionAffinity holds the type of region, Datacenter or ComputeCluster. ++ // When set to Datacenter, this means the region is a vCenter Datacenter as defined in topology. ++ // When set to ComputeCluster, this means the region is a vCenter Cluster as defined in topology. ++ // +openshift:validation:featureGate=VSphereHostVMGroupZonal ++ // +optional ++ RegionAffinity *VSphereFailureDomainRegionAffinity `json:"regionAffinity,omitempty"` ++ ++ // zoneAffinity holds the type of the zone and the hostGroup, whose ++ // vmGroup and hostGroup names in vCenter correspond to ++ // vm-host groups of type Virtual Machine and Host respectively. It also ++ // contains the vmHostRule, which is an affinity vm-host rule in vCenter. ++ // +openshift:validation:featureGate=VSphereHostVMGroupZonal ++ // +optional ++ ZoneAffinity *VSphereFailureDomainZoneAffinity `json:"zoneAffinity,omitempty"` ++ + // server is the fully-qualified domain name or the IP address of the vCenter server. + // +kubebuilder:validation:Required + // +kubebuilder:validation:MinLength=1 +@@ -1277,6 +1315,82 @@ type VSpherePlatformTopology struct { + Template string `json:"template,omitempty"` + } + ++// VSphereFailureDomainZoneAffinity contains the vCenter cluster vm-host group (virtual machine and host types) ++// and the vm-host affinity rule that together create an affinity configuration for vm-host based zonal. ++// This configuration within vCenter creates the required association between a failure domain, virtual machines ++// and ESXi hosts to create a vm-host based zone.
++// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'HostGroup' ? has(self.hostGroup.hostGroup) : !has(self.hostGroup) || !has(self.hostGroup.hostGroup)",message="hostGroup is required when type is HostGroup, and forbidden otherwise" ++// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'HostGroup' ? has(self.hostGroup.vmGroup) : !has(self.hostGroup) || !has(self.hostGroup.vmGroup)",message="vmGroup is required when type is HostGroup, and forbidden otherwise" ++// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'HostGroup' ? has(self.hostGroup.vmHostRule) : !has(self.hostGroup) || !has(self.hostGroup.vmHostRule)",message="vmHostRule is required when type is HostGroup, and forbidden otherwise" ++// +union ++type VSphereFailureDomainZoneAffinity struct { ++ // type is the string representation of the VSphereFailureDomainZoneType with available options of ++ // ComputeCluster and HostGroup. ++ // When set to ComputeCluster, this means the vCenter cluster defined is the zone. ++ // When set to HostGroup, hostGroup must be configured with hostGroup, vmGroup and vmHostRule and ++ // this means the zone is defined by the grouping of those fields. ++ // +kubebuilder:validation:Enum:=HostGroup;ComputeCluster ++ // +kubebuilder:validation:MinLength=9 ++ // +kubebuilder:validation:MaxLength=14 ++ // +kubebuilder:validation:Type:=string ++ // +kubebuilder:validation:Required ++ // +unionDiscriminator ++ Type VSphereFailureDomainZoneType `json:"type"` ++ ++ // hostGroup holds the vmGroup and the hostGroup names in vCenter ++ // corresponds to a vm-host group of type Virtual Machine and Host respectively. Is also ++ // contains the vmHostRule which is an affinity vm-host rule in vCenter. 
++ // +unionMember,optional ++ HostGroup *VSphereFailureDomainHostGroup `json:"hostGroup,omitempty"` ++} ++ ++// VSphereFailureDomainRegionAffinity contains the region type which is the string representation of the ++// VSphereFailureDomainRegionType with available options of Datacenter and ComputeCluster. ++// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'Datacenter' || self.type == 'ComputeCluster'",message="regionAffinity type must be either Datacenter or ComputeCluster, and forbidden otherwise" ++// +union ++type VSphereFailureDomainRegionAffinity struct { ++ // type is the string representation of the VSphereFailureDomainRegionType with available options of ++ // Datacenter and ComputeCluster. ++ // When set to Datacenter, this means the vCenter Datacenter defined is the region. ++ // When set to ComputeCluster, this means the vCenter cluster defined is the region. ++ // +kubebuilder:validation:MinLength=9 ++ // +kubebuilder:validation:MaxLength=14 ++ // +kubebuilder:validation:Enum:=ComputeCluster;Datacenter ++ // +kubebuilder:validation:Type:=string ++ // +kubebuilder:validation:Required ++ // +unionDiscriminator ++ Type VSphereFailureDomainRegionType `json:"type"` ++} ++ ++// VSphereFailureDomainHostGroup holds the vmGroup and the hostGroup names in vCenter ++// corresponds to a vm-host group of type Virtual Machine and Host respectively. Is also ++// contains the vmHostRule which is an affinity vm-host rule in vCenter. ++type VSphereFailureDomainHostGroup struct { ++ // vmGroup is the name of the vm-host group of type virtual machine within vCenter for this failure domain. ++ // vmGroup is limited to 80 characters. 
++ // This field is required when the VSphereFailureDomain ZoneType is HostGroup ++ // +kubebuilder:validation:MinLength=1 ++ // +kubebuilder:validation:MaxLength=80 ++ // +kubebuilder:validation:Required ++ VMGroup string `json:"vmGroup"` ++ ++ // hostGroup is the name of the vm-host group of type host within vCenter for this failure domain. ++ // hostGroup is limited to 80 characters. ++ // This field is required when the VSphereFailureDomain ZoneType is HostGroup ++ // +kubebuilder:validation:MinLength=1 ++ // +kubebuilder:validation:MaxLength=80 ++ // +kubebuilder:validation:Required ++ HostGroup string `json:"hostGroup"` ++ ++ // vmHostRule is the name of the affinity vm-host rule within vCenter for this failure domain. ++ // vmHostRule is limited to 80 characters. ++ // This field is required when the VSphereFailureDomain ZoneType is HostGroup ++ // +kubebuilder:validation:MinLength=1 ++ // +kubebuilder:validation:MaxLength=80 ++ // +kubebuilder:validation:Required ++ VMHostRule string `json:"vmHostRule"` ++} ++ + // VSpherePlatformVCenterSpec stores the vCenter connection fields. + // This is used by the vSphere CCM. + type VSpherePlatformVCenterSpec struct { +``` -The CCCMO cloud-config transformation already exists but will -need to be modified to support the VSpherePlatformSpec. +### Cluster Cloud Controller Manager Operator (CCCMO or 3cmo) -### installer +The 3cmo translates the existing legacy in-tree cloud provider config to the external CCM config. +If the Infrastructure vSphere spec is provided, the 3cmo will include those fields in the configuration: +internal and external network, and vCenter configuration. +This allows day-two configuration of vSphere zonal: if more than one failure domain is defined, +the CCM config labels section is defined.
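
The day-two behavior described for the 3cmo can be sketched as follows. This is a minimal illustration under stated assumptions, not the operator's actual code: `FailureDomain`, `Labels` and `labelsFor` are hypothetical names, and the use of the `openshift-region` and `openshift-zone` tag categories (the installer's `TagCategoryRegion` and `TagCategoryZone` values) in the labels section is assumed here.

```go
package main

import "fmt"

// FailureDomain is a hypothetical, pared-down stand-in for
// VSpherePlatformFailureDomainSpec; only the fields used below are kept.
type FailureDomain struct {
	Name   string
	Region string
	Zone   string
}

// Labels is a hypothetical model of the CCM config labels section: the
// vCenter tag categories used to discover a node's region and zone.
type Labels struct {
	Region string
	Zone   string
}

const (
	// Assumed to match the installer's TagCategoryRegion and TagCategoryZone.
	tagCategoryRegion = "openshift-region"
	tagCategoryZone   = "openshift-zone"
)

// labelsFor returns a labels section only when more than one failure domain
// is defined; with zero or one failure domain it returns nil, so no labels
// section would be written.
func labelsFor(fds []FailureDomain) *Labels {
	if len(fds) <= 1 {
		return nil
	}
	return &Labels{Region: tagCategoryRegion, Zone: tagCategoryZone}
}

func main() {
	single := []FailureDomain{{Name: "fd-1", Region: "region1", Zone: "zonea"}}
	multi := append(single, FailureDomain{Name: "fd-2", Region: "region1", Zone: "zoneb"})

	fmt.Println(labelsFor(single) == nil) // labels section omitted
	fmt.Println(*labelsFor(multi))        // labels section populated
}
```

Returning `nil` for the single-failure-domain case keeps a non-zonal cluster's generated config free of a labels section, which is the distinction the paragraph above draws.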
-#### Platform Spec +### Cluster API and CAPV provider (near future) -The platform spec needs to be modified to support our initial goals of multiple -datacenters (regions) and clusters (zones) and vcenters. This platform spec -design is based on the changes suggested in openshift/api, Cluster API for vSphere -and out-of-tree CCM. +The introduction of CAPI vSphere-based machines will require additional manifests to be created and applied +to the cluster being installed (that is, not only to the cluster-api instance but also to the cluster being built). +These new manifests will need to include the following objects: -We are adding multiple additional parameters to the Platform struct: +- VSphereCluster +- VSphereDeploymentZone +- VSphereFailureDomain +- VSphereMachine -- `VCenters` -- `FailureDomains` +`VSphereCluster` already exists for the installer to utilize CAPV. The two new objects +`VSphereDeploymentZone` and `VSphereFailureDomain` were added to the installer to support +vm-host group zonal. CAPV makes it significantly easier to deploy into vm-host group zonal +with just manifest creation. + +### installer +#### Platform Spec + +The platform spec needs to support vSphere topology and zonal deployment. +This platform spec design is based on the changes in openshift/api, Cluster API for vSphere +and out-of-tree CCM. `VCenters` will contain the connections and configuration for each vCenter -that is required for the out-of-tree CCM. Note: Only a single vCenter will -be supported by this effort. While the out-of-tree CCM supports multiple vCenters -the out-of-tree CSI does not. +that is required for the out-of-tree CCM. `FailureDomains` will define the configuration of a region, zone, and topology. `Topology` defines the vCenter objects that make up a region or zone -including `Datacenter`, `ComputeCluster`, `Hosts`, `Networks` and a `Datastore`.
+including `Datacenter`, `ComputeCluster`, `Hosts`, `Networks`, `Datastore` and +optionally `ResourcePool`, `Folder`, `Template`, and `TagIDs`. -The existing platform spec vcenter parameters will be deprecated -but _not_ removed or remove support for using those parameters. +vm-host group zonal adds a `hostGroup` field; the named host group must already exist in vCenter prior to installation. -This is an extension of the existing platform spec. The parameters below -will be added to the existing platform struct. +The existing platform spec vcenter parameters are deprecated +but not removed, and using those parameters remains supported. The deprecated +parameters, however, will not gain the new features that failure domains provide. ```golang package vsphere +import ( + configv1 "github.com/openshift/api/config/v1" +) + // DiskType is a disk provisioning type for vsphere. // +kubebuilder:validation:Enum="";thin;thick;eagerZeroedThick type DiskType string +// FailureDomainType is the string representation name of the failure domain type. +// There are three defined failure domain types currently: HostGroup, Datacenter and ComputeCluster. +// Each represents a vCenter object type within a vSphere environment. +// +kubebuilder:validation:Enum=HostGroup;Datacenter;ComputeCluster +type FailureDomainType string + const ( // DiskTypeThin uses Thin disk provisioning type for vsphere in the cluster. DiskTypeThin DiskType = "thin" @@ -335,28 +579,62 @@ const ( // DiskTypeEagerZeroedThick uses EagerZeroedThick disk provisioning type for vsphere in the cluster. DiskTypeEagerZeroedThick DiskType = "eagerZeroedThick" + + // TagCategoryRegion the tag category associated with regions. + TagCategoryRegion = "openshift-region" + + // TagCategoryZone the tag category associated with zones. + TagCategoryZone = "openshift-zone" +) + +const ( + // ControlPlaneRole represents control-plane nodes. + ControlPlaneRole = "control-plane" + // ComputeRole represents worker nodes.
+ ComputeRole = "compute" + // BootstrapRole represents bootstrap nodes. + BootstrapRole = "bootstrap" +) + +const ( + // HostGroupFailureDomain is a failure domain for a vCenter vm-host group. + HostGroupFailureDomain FailureDomainType = "HostGroup" + // ComputeClusterFailureDomain is a failure domain for a vCenter compute cluster. + ComputeClusterFailureDomain FailureDomainType = "ComputeCluster" + // DatacenterFailureDomain is a failure domain for a vCenter datacenter. + DatacenterFailureDomain FailureDomainType = "Datacenter" ) -// Platform stores any global configuration used for vsphere platforms +// Platform stores any global configuration used for vsphere platforms. type Platform struct { // VCenter is the domain name or IP address of the vCenter. - VCenter string `json:"vCenter"` + // Deprecated: Use VCenters.Server + DeprecatedVCenter string `json:"vCenter,omitempty"` // Username is the name of the user to use to connect to the vCenter. - Username string `json:"username"` + // Deprecated: Use VCenters.Username + DeprecatedUsername string `json:"username,omitempty"` // Password is the password for the user to use to connect to the vCenter. - Password string `json:"password"` + // Deprecated: Use VCenters.Password + DeprecatedPassword string `json:"password,omitempty"` // Datacenter is the name of the datacenter to use in the vCenter. - Datacenter string `json:"datacenter"` + // Deprecated: Use FailureDomains.Topology.Datacenter + DeprecatedDatacenter string `json:"datacenter,omitempty"` // DefaultDatastore is the default datastore to use for provisioning volumes. - DefaultDatastore string `json:"defaultDatastore"` + // Deprecated: Use FailureDomains.Topology.Datastore + DeprecatedDefaultDatastore string `json:"defaultDatastore,omitempty"` // Folder is the absolute path of the folder that will be used and/or created for // virtual machines. The absolute path is of the form //vm//. 
- Folder string `json:"folder,omitempty"` + // +kubebuilder:validation:Pattern=`^/.*?/vm/.*?` + // +optional + // Deprecated: Use FailureDomains.Topology.Folder + DeprecatedFolder string `json:"folder,omitempty"` // Cluster is the name of the cluster virtual machines will be cloned into. - Cluster string `json:"cluster,omitempty"` + // Deprecated: Use FailureDomains.Topology.Cluster + DeprecatedCluster string `json:"cluster,omitempty"` // ResourcePool is the absolute path of the resource pool where virtual machines will be // created. The absolute path is of the form //host//Resources/. - ResourcePool string `json:"resourcePool,omitempty"` + // Deprecated: Use FailureDomains.Topology.ResourcePool + DeprecatedResourcePool string `json:"resourcePool,omitempty"` // ClusterOSImage overrides the url provided in rhcos.json to download the RHCOS OVA ClusterOSImage string `json:"clusterOSImage,omitempty"` @@ -398,30 +676,47 @@ type Platform struct { // +optional DefaultMachinePlatform *MachinePool `json:"defaultMachinePlatform,omitempty"` // Network specifies the name of the network to be used by the cluster. - Network string `json:"network,omitempty"` + // Deprecated: Use FailureDomains.Topology.Network + DeprecatedNetwork string `json:"network,omitempty"` // DiskType is the name of the disk provisioning type, // valid values are thin, thick, and eagerZeroedThick. When not // specified, it will be set according to the default storage policy // of vsphere. DiskType DiskType `json:"diskType,omitempty"` - // vcenters holds the connection details for services to communicate with vCenter. + // VCenters holds the connection details for services to communicate with vCenter. // Currently only a single vCenter is supported. 
// +kubebuilder:validation:Optional - // +kubebuilder:validation:MaxItems=1 + // +kubebuilder:validation:MaxItems=3 // +kubebuilder:validation:MinItems=1 VCenters []VCenter `json:"vcenters,omitempty"` - // failureDomains holds the VSpherePlatformFailureDomainSpec which contains + // FailureDomains holds the VSpherePlatformFailureDomainSpec which contains // the definition of region, zone and the vCenter topology. // If this is omitted failure domains (regions and zones) will not be used. // +kubebuilder:validation:Optional FailureDomains []FailureDomain `json:"failureDomains,omitempty"` + + // nodeNetworking contains the definition of internal and external network constraints for + // assigning the node's networking. + // If this field is omitted, networking defaults to the legacy + // address selection behavior which is to only support a single address and + // return the first one found. + // +optional + NodeNetworking *configv1.VSpherePlatformNodeNetworking `json:"nodeNetworking,omitempty"` + + // LoadBalancer defines how the load balancer used by the cluster is configured. + // LoadBalancer is available in TechPreview. + // +optional + LoadBalancer *configv1.VSpherePlatformLoadBalancer `json:"loadBalancer,omitempty"` + // Hosts defines network configurations to be applied by the installer. Hosts is available in TechPreview. + Hosts []*Host `json:"hosts,omitempty"` } // FailureDomain holds the region and zone failure domain and // the vCenter topology of that failure domain. type FailureDomain struct { - // name defines the name of the FailureDomain. - // This name is abritrary. + // name defines the name of the FailureDomain + // This name is arbitrary but will be used + // in VSpherePlatformDeploymentZone for association. 
// +kubebuilder:validation:Required // +kubebuilder:validation:MinLength=1 // +kubebuilder:validation:MaxLength=256 @@ -444,6 +739,15 @@ type FailureDomain struct { // Topology describes a given failure domain using vSphere constructs // +kubebuilder:validation:Required Topology Topology `json:"topology"` + + // RegionType is the type of the region failure domain; valid values are "Datacenter" and "ComputeCluster" + // +kubebuilder:validation:Enum=Datacenter;ComputeCluster + // +optional + RegionType FailureDomainType `json:"regionType,omitempty"` + // ZoneType is the type of the zone failure domain; valid values are "ComputeCluster" and "HostGroup" + // +kubebuilder:validation:Enum=ComputeCluster;HostGroup + // +optional + ZoneType FailureDomainType `json:"zoneType,omitempty"` } // Topology holds the required and optional vCenter objects - datacenter, @@ -462,7 +766,6 @@ type Topology struct { // +kubebuilder:validation:MaxLength=2048 ComputeCluster string `json:"computeCluster"` // networks is the list of networks within this failure domain - // +kubebuilder:validation:Optional Networks []string `json:"networks,omitempty"` // datastore is the name or inventory path of the datastore in which the // virtual machine is created/located. @@ -477,12 +780,31 @@ type Topology struct { // +kubebuilder:validation:Pattern=`^/.*?/host/.*?/Resources.*` // +optional ResourcePool string `json:"resourcePool,omitempty"` - // folder is the name or inventory path of the folder in which the + // folder is the inventory path of the folder in which the // virtual machine is created/located. // +kubebuilder:validation:MinLength=1 // +kubebuilder:validation:MaxLength=2048 + // +kubebuilder:validation:Pattern=`^/.*?/vm/.*?` // +optional Folder string `json:"folder,omitempty"` + // template is the inventory path of the virtual machine or template + // that will be used for cloning.
+	// +kubebuilder:validation:MinLength=1
+	// +kubebuilder:validation:MaxLength=2048
+	// +kubebuilder:validation:Pattern=`^/.*?/vm/.*?`
+	// +optional
+	Template string `json:"template,omitempty"`
+	// tagIDs is an optional set of tags to add to an instance. Specified tagIDs
+	// must use URN-notation instead of display names. A maximum of 10 tag IDs may be specified.
+	// +kubebuilder:example=`urn:vmomi:InventoryServiceTag:5736bf56-49f5-4667-b38c-b97e09dc9578:GLOBAL`
+	// +optional
+	TagIDs []string `json:"tagIDs,omitempty"`
+
+	// hostGroup is the name of the vm-host group of type host within vCenter for this failure domain.
+	// This field is required when the FailureDomain zoneType is HostGroup
+	// +kubebuilder:validation:MaxLength=80
+	// +optional
+	HostGroup string `json:"hostGroup,omitempty"`
 }

 // VCenter stores the vCenter connection fields
@@ -499,7 +821,7 @@ type VCenter struct {
 	// +kubebuilder:validation:Minimum=1
 	// +kubebuilder:validation:Maximum=32767
 	// +kubebuilder:default=443
-	Port uint `json:"port,omitempty"`
+	Port int32 `json:"port,omitempty"`
 	// Username is the username that will be used to connect to vCenter
 	// +kubebuilder:validation:Required
 	Username string `json:"user"`
@@ -511,11 +833,63 @@ type VCenter struct {
 	// +kubebuilder:validation:MinItems=1
 	Datacenters []string `json:"datacenters"`
 }
-```

-Add [platform validation](https://github.com/openshift/installer/blob/master/pkg/types/vsphere/validation/platform.go)
-for the new struct fields that are required.
+// Host defines host VMs to generate as part of the installation.
+type Host struct {
+	// FailureDomain refers to the name of a FailureDomain as described in https://github.com/openshift/enhancements/blob/master/enhancements/installer/vsphere-ipi-zonal.md
+	// +optional
+	FailureDomain string `json:"failureDomain"`
+	// NetworkDeviceSpec to be applied to the host
+	// +kubebuilder:validation:Required
+	NetworkDevice *NetworkDeviceSpec `json:"networkDevice"`
+	// Role defines the role of the node
+	// +kubebuilder:validation:Enum="";bootstrap;control-plane;compute
+	// +kubebuilder:validation:Required
+	Role string `json:"role"`
+}
+// NetworkDeviceSpec defines network config for static IP assignment.
+type NetworkDeviceSpec struct {
+	// gateway is an IPv4 or IPv6 address which represents the subnet gateway,
+	// for example, 192.168.1.1.
+	// +kubebuilder:validation:Format=ipv4
+	// +kubebuilder:validation:Format=ipv6
+	Gateway string `json:"gateway,omitempty"`
+
+	// ipAddrs is a list of one or more IPv4 and/or IPv6 addresses and CIDR to assign to
+	// this device, for example, 192.168.1.100/24. IP addresses provided via ipAddrs are
+	// intended to allow explicit assignment of a machine's IP address.
+	// +kubebuilder:validation:Format=ipv4
+	// +kubebuilder:validation:Format=ipv6
+	// +kubebuilder:example=`192.168.1.100/24`
+	// +kubebuilder:example=`2001:DB8:0000:0000:244:17FF:FEB6:D37D/64`
+	// +kubebuilder:validation:Required
+	IPAddrs []string `json:"ipAddrs"`
+
+	// nameservers is a list of IPv4 and/or IPv6 addresses used as DNS nameservers, for example,
+	// 8.8.8.8. A nameserver is not provided by a fulfilled IPAddressClaim. If DHCP is not the
+	// source of IP addresses for this network device, nameservers should include a valid nameserver.
+	// +kubebuilder:validation:Format=ipv4
+	// +kubebuilder:validation:Format=ipv6
+	// +kubebuilder:example=`8.8.8.8`
+	Nameservers []string `json:"nameservers,omitempty"`
+}
+
+// IsControlPlane checks if the current host is a master.
+func (h *Host) IsControlPlane() bool {
+	return h.Role == ControlPlaneRole
+}
+
+// IsCompute checks if the current host is a worker.
+func (h *Host) IsCompute() bool {
+	return h.Role == ComputeRole
+}
+
+// IsBootstrap checks if the current host is a bootstrap.
+func (h *Host) IsBootstrap() bool {
+	return h.Role == BootstrapRole
+}
+```

 #### Set infrastructure spec

@@ -525,8 +899,6 @@ generate VSpherePlatformSpec and provide the values from the install-config plat

 #### MachinePool

-The `MachinePool` [struct](https://github.com/openshift/installer/blob/master/pkg/types/vsphere/machinepool.go#L5-L26)
-needs a single change to include the zones.
 The `Zones` like the other cloud providers will determine the location where
 the virtual machine will reside within a vSphere environment.

@@ -537,11 +909,79 @@ type MachinePool struct {
 }
 ```

-Add [machinepool validation](https://github.com/openshift/installer/blob/master/pkg/types/vsphere/validation/machinepool.go)
+##### Creating CAPV `VSphereFailureDomains` and `VSphereDeploymentZones` for host zonal
+
+To implement host zonal in vSphere the virtual machines need to be added to a vm group.
+CAPV gives this capability by creating `VSphereFailureDomains` and `VSphereDeploymentZones`
+
+```golang
+for _, failureDomain := range installConfig.Config.VSphere.FailureDomains {
+	if failureDomain.ZoneType == vsphere.HostGroupFailureDomain {
+		dz := &capv.VSphereDeploymentZone{
+			TypeMeta: metav1.TypeMeta{},
+			ObjectMeta: metav1.ObjectMeta{
+				Name: failureDomain.Name,
+			},
+			Spec: capv.VSphereDeploymentZoneSpec{
+				Server: fmt.Sprintf("https://%s", failureDomain.Server),
+				FailureDomain: failureDomain.Name,
+				ControlPlane: ptr.To(true),
+				PlacementConstraint: capv.PlacementConstraint{
+					ResourcePool: failureDomain.Topology.ResourcePool,
+					Folder: failureDomain.Topology.Folder,
+				},
+			},
+		}
+
+		dz.SetGroupVersionKind(capv.GroupVersion.WithKind("VSphereDeploymentZone"))
+
+		fd := &capv.VSphereFailureDomain{
+			TypeMeta: metav1.TypeMeta{},
+			ObjectMeta: metav1.ObjectMeta{
+				Name: failureDomain.Name,
+			},
+			Spec: capv.VSphereFailureDomainSpec{
+				Region: capv.FailureDomain{
+					Name: failureDomain.Region,
+					Type: capv.FailureDomainType(failureDomain.RegionType),
+					TagCategory: "openshift-region",
+				},
+				Zone: capv.FailureDomain{
+					Name: failureDomain.Zone,
+					Type: capv.FailureDomainType(failureDomain.ZoneType),
+					TagCategory: "openshift-zone",
+				},
+				Topology: capv.Topology{
+					Datacenter: failureDomain.Topology.Datacenter,
+					ComputeCluster: &failureDomain.Topology.ComputeCluster,
+					Hosts: &capv.FailureDomainHosts{
+						VMGroupName: fmt.Sprintf("%s-%s", clusterID.InfraID, failureDomain.Name),
+						HostGroupName: failureDomain.Topology.HostGroup,
+					},
+					Networks: failureDomain.Topology.Networks,
+					Datastore: failureDomain.Topology.Datastore,
+				},
+			},
+		}
+		fd.SetGroupVersionKind(capv.GroupVersion.WithKind("VSphereFailureDomain"))
+
+		manifests = append(manifests, &asset.RuntimeFile{
+			Object: fd,
+			File: asset.File{Filename: fmt.Sprintf("01_vsphere-failuredomain-%s.yaml", failureDomain.Name)},
+		})
+
+		manifests = append(manifests, &asset.RuntimeFile{
+			Object: dz,
+			File:
+				asset.File{Filename: fmt.Sprintf("01_vsphere-deploymentzone-%s.yaml", failureDomain.Name)},
+		})
+	}
+}
+
+```

 #### Cloud Config

-The out-of-tree CCM is required for this change. The out-of-tree CCM also updates
+The external CCM is required for this change. The out-of-tree CCM also updates
 the cloud-config ini configuration. Below is an example of what the cloud-config
 will change to. A `VirtualCenter` section will be added per vcenter in `vcenters`.
 `datacenters` is a comma-delimited string that will contain
@@ -561,70 +1001,45 @@ all the datacenters per region.
     zone = openshift-zone
 ```

-The external CCM will not be installed by default in 4.12. As a result
-for 4.12 a vSphere zonal installation will also require TechPreviewNoUpgrade
-to be enabled. The installer will properly configure the cloud-config
-based on this requirement.
+In a future release the installer will change from ini to yaml once all operators support it.

 #### Text-based User Interface (TUI)

 There are too many options to support this configuration.
-Deploying into multiple datacenters/clusters will only be supported via
+Deploying a topology with failure domains will only be supported via
 the `install-config.yaml`

-#### Terraform
-
-Terraform will need to change to support cloning the control plane virtual
-machines in multiple datacenters and clusters.
-
-With the added information provided in the platform spec from the
-`VCenters` and `FailureDomains` includes
-all the parameters we will need to create the appropriate tags,
-tag categories and vCenter objects to provision RHCOS instances.
+#### The use of Cluster API and the vSphere provider in the Installer

-##### Terraform variables and TFVarsSources
+The installer uses CAPI and the CAPV provider to provision the bootstrap and control plane nodes for vSphere.
+Terraform is no longer used for vSphere installation.

-The terraform `config` struct will need to be modified.
-`FailureDomains` provide the vSphere objects that are needed
-for importing the OVA. Each `FailureDomain` will have an
-individual template. `NetworksInFailureDomains` contains
-the managed object id for each port group name.
-The `ControlPlanes` is a list of the Machine Provider Spec
-which contains all the required parameters for provisioning
-the control plane RHCOS guests. And finally `DatacentersFolders`
-is a map with the key a union of the datacenter and folder name.
-`folder` contains `Datacenter` and `Name` of the vCenter folder.

+#### Machine and MachineSet
+The control plane machines are now created with CAPI/CAPV and a CAPV machine is used for deployment.
+No modification is required to the machine object to support vm-host zonal.

-```golang
+The compute workspace will add a single field `vmGroup` to indicate that
+the guest needs to be added to this vm-host group of type virtual machine.

-type config struct {
-	VSphereURL string `json:"vsphere_url"`
-	...
+##### OVA import

-	// vcenters can still remain a map for easy lookups
-	VCenters map[string]vtypes.VCenter `json:"vsphere_vcenters"`
-	FailureDomains []vtypes.FailureDomain `json:"vsphere_failure_domains"`
-	NetworksInFailureDomains map[string]string `json:"vsphere_networks"`
-	ControlPlanes []*machineapi.VSphereMachineProviderSpec `json:"vsphere_control_planes"`
-	DatacentersFolders map[string]*folder `json:"vsphere_folders"`
-}
-```
+For each `FailureDomain` an OVA import will need to occur.
+If there is only a single zone then a single import will be required.

-#### Machine and MachineSet
-The control plane
-[Machines](https://github.com/openshift/installer/blob/b0b96468893db2240e82ba2aa0935679c8c49201/pkg/asset/machines/vsphere/machines.go#L19-L64)
-will need to be modified to create a Machine per `FailureDomain`.
+##### vm-host zonal specifics

-For each `FailureDomain` an additional `MachineSet` for
-the compute nodes will need to be created based on the `MachinePool`
-`zones` configuration.
+The vCenter vm-host group of type host is required for each zone prior to installation. Each `FailureDomain` has
+a `hostGroup` field that is required when `zoneType` is `HostGroup`. The vCenter vm-host group will contain
+the list of ESXi hosts that are associated with that zone.

-##### OVA import
+Tags will also continue to be required prior to installation. The tag category openshift-region
+will be associated with a tag created and applied to the vCenter cluster object. The tag category openshift-zone
+will be associated with a tag created and applied to each ESXi host in the zone, which is also defined by the vm-host group (type host).

-For each `FailureDomain` a ova import will need to occur.
-If there is only a single zone then a single import will be required.
+The installer will create a vm-host group of type virtual machine per failure domain.
+It will also create a vm-host group rule per failure domain.

 ### User Stories

@@ -634,142 +1049,97 @@ If there is only a single zone then a single import will be required.

 ### API Extensions

-
 ### Risks and Mitigations

-- The out-of-tree CCM is required for this work. It will need to be enabled at
-installation time.
-
 ## Design Details

-The vSphere platform spec and configuration will change to include `vcenters` and `FailureDomains`. The `MachinePool` will also add an
-optional `zones` field.
-
-`FailureDomains` will contain a unique name including the following parameters:
-
-- region
-- zone
-- topology
-  - datacenter
-  - cluster
-  - networks
-  - datastore
-  - resourcePool
-  - folder
-
-The vSphere platform spec will be the default configuration for the cloud-config
-adding the additional datacenters from the machinepool.
-
-The master virtual machines will be provisioned with terraform per datacenter
-and cluster. In the case of workers multiple machinesets will need to be
-configured - one per datacenter/cluster pair.
-
-### install-config
+### Scenario #1 - Datacenter-based region, cluster-based zone

 ```yaml
-apiVersion: v1
-baseDomain: example.com
-controlPlane:
-  name: "master"
-  replicas: 3
-  platform:
-    vsphere:
-      zones:
-      - "us-east-1"
-      - "us-east-2"
-      - "us-east-3"
-compute:
-- name: "worker"
-  replicas: 4
-  platform:
-    vsphere:
-      zones:
-      - "us-east-1"
-      - "us-east-2"
-      - "us-east-3"
-      - "us-west-1"
-platform:
+platform:
   vsphere:
-    apiVIP: "192.168.0.1"
-    ingressVIP: "192.168.0.2"
-    vCenter: "vcenter"
-    username: "username"
-    password: "password"
-    network: port-group
-    datacenter: datacenter
-    cluster: vcs-mdcnc-workload-1
-    defaultDatastore: workload_share_vcsmdcncworkload_Yfyf6
+    apiVIP: 10.38.201.130
+    ingressVIP: 10.38.201.131
     vcenters:
-    - server: "vcenter"
-      user: "vcenter-username"
-      password: "vcenter-password"
+    - server: vcenter.ci.ibmc.devcluster.openshift.com
+      user: ''
+      password: ''
       datacenters:
-      - IBMCloud
-      - datacenter-2
+      - cidatacenter
     failureDomains:
     - name: us-east-1
       region: us-east
       zone: us-east-1a
+      server: vcenter.ci.ibmc.devcluster.openshift.com
       topology:
-        computeCluster: /${vsphere_datacenter}/host/vcs-mdcnc-workload-1
+        datacenter: cidatacenter
+        computeCluster: /cidatacenter/host/cicluster
         networks:
-        - network1
-        datastore: workload_share_vcsmdcncworkload_Yfyf6
+        - ci-vlan-1287
+        datastore: /cidatacenter/datastore/vsanDatastore
     - name: us-east-2
       region: us-east
       zone: us-east-2a
+      server: vcenter.ci.ibmc.devcluster.openshift.com
       topology:
-        computeCluster: /${vsphere_datacenter}/host/vcs-mdcnc-workload-2
+        datacenter: cidatacenter
+        computeCluster: /cidatacenter/host/cicluster2
         networks:
-        - network1
-        datastore: workload_share_vcsmdcncworkload2_vyC6a
-    - name: us-east-3
-      region: us-east
-      zone: us-east-3a
+        - ci-vlan-1287
+        datastore: /cidatacenter/datastore/vsanDatastore
+```
+
+### Scenario #2 - Cluster-based region, Host-based zone
+
+```yaml
+platform:
+  vsphere:
+    apiVIP: 10.93.60.130
+    ingressVIP: 10.93.60.131
+    vcenters:
+    -
+      server: 10.93.60.138
+      user: administrator@vsphere.local
+      password: ''
+      datacenters:
+      - nested8-datacenter
+    failureDomains:
+    - name: us-east-1
+      region: us-east
+      regionType: ComputeCluster
+      zone: us-east-1a
+      zoneType: HostGroup
+      server: 10.93.60.138
       topology:
-        computeCluster: /${vsphere_datacenter}/host/vcs-mdcnc-workload-3
+        datacenter: nested8-datacenter
+        computeCluster: /nested8-datacenter/host/nested-cluster
         networks:
-        - network1
-        datastore: workload_share_vcsmdcncworkload3_joYiR
-    - name: us-west-1
-      region: us-west
-      zone: us-west-1a
+        - VM Network
+        datastore: /nested8-datacenter/datastore/fs-cicluster-nfs
+        hostGroup: us-east-1a
+    - name: us-east-2
+      region: us-east
+      regionType: ComputeCluster
+      zone: us-east-2a
+      zoneType: HostGroup
+      server: 10.93.60.138
       topology:
-        datacenter: datacenter-2
-        computeCluster: /datacenter-2/host/vcs-mdcnc-workload-4
+        datacenter: nested8-datacenter
+        computeCluster: /nested8-datacenter/host/nested-cluster
         networks:
-        - network1
-        datastore: workload_share_vcsmdcncworkload3_joYiR
+        - VM Network
+        datastore: /nested8-datacenter/datastore/fs-cicluster-nfs
+        hostGroup: us-east-2a
 ```

-Each vcenter datacenter defined in either master or worker machinepool will
-need to be added to the out-of-tree CCM configuration. This allows the CCM
-to find the virtual machines across multiple vcenter datacenters.
-
 ### Open Questions

 ### Test Plan

-- Configure only to run IBM, unable to run in VMC without multiple datacenters
-or clusters. The existing CI vSphere infrastructure
-can be extended to include an additional datacenter
-and cluster.
-
-- The additional vSphere-specific jobs should run with this configuration
-(csi,etc)

 ### Graduation Criteria

-#### Dev Preview -> Tech Preview
-
-- CI: A new vSphere-specific job will need to be added for the installer
-and periodics to support the new configuration
-of regions, zones and multiple datacenters and clusters.
-
 #### Tech Preview -> GA

-- More testing (upgrade, downgrade, scale)
-- Sufficient time for feedback

 #### Removing a deprecated feature

@@ -800,5 +1170,4 @@ configuration - maybe documentation.

 ## Infrastructure Needed

-- Multiple datacenter and multiple cluster vSphere environment for development
-and CI.
+Existing infrastructure will be used, but nested provisioning may be required to test host-based zonal installations.
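
Illustrative note, not part of the patch: the `Topology` comments in this change state that `hostGroup` is required when `zoneType` is `HostGroup` and carries a `MaxLength=80` marker. A minimal Go sketch of how installer-side validation might enforce that rule is shown here. The trimmed `FailureDomain` and `Topology` types and the `validateHostGroup` helper are hypothetical stand-ins for illustration and are not taken from the installer source.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// FailureDomainType mirrors the enum values named in the API comments.
type FailureDomainType string

const (
	ComputeClusterFailureDomain FailureDomainType = "ComputeCluster"
	HostGroupFailureDomain      FailureDomainType = "HostGroup"
)

// maxHostGroupLength matches the MaxLength=80 marker on hostGroup.
const maxHostGroupLength = 80

// Topology is a trimmed, hypothetical copy holding only the field checked here.
type Topology struct {
	HostGroup string
}

// FailureDomain is a trimmed, hypothetical copy of the install-config type.
type FailureDomain struct {
	Name     string
	ZoneType FailureDomainType
	Topology Topology
}

// validateHostGroup is a hypothetical helper. It rejects a failure domain
// whose zoneType is HostGroup but has no hostGroup, and any hostGroup
// longer than the documented maximum.
func validateHostGroup(fd FailureDomain) error {
	if fd.ZoneType == HostGroupFailureDomain && fd.Topology.HostGroup == "" {
		return fmt.Errorf("failureDomain %q: topology.hostGroup is required when zoneType is HostGroup", fd.Name)
	}
	if utf8.RuneCountInString(fd.Topology.HostGroup) > maxHostGroupLength {
		return fmt.Errorf("failureDomain %q: topology.hostGroup exceeds %d characters", fd.Name, maxHostGroupLength)
	}
	return nil
}

func main() {
	domains := []FailureDomain{
		{Name: "us-east-1", ZoneType: HostGroupFailureDomain, Topology: Topology{HostGroup: "us-east-1a"}},
		{Name: "us-east-2", ZoneType: HostGroupFailureDomain}, // hostGroup deliberately missing
	}
	for _, fd := range domains {
		if err := validateHostGroup(fd); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("valid:", fd.Name)
	}
}
```

The check is kept separate from the kubebuilder markers because a conditional requirement (one field required only when another has a given value) cannot be expressed by a per-field `+optional` marker alone.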