K8s requests/limits for Calico components? #9587

Open
rafalr-ntropy opened this issue Dec 11, 2024 · 5 comments

Comments

@rafalr-ntropy

rafalr-ntropy commented Dec 11, 2024

I'm using the tigera-operator-v3.28.2 Helm chart and couldn't find easily digestible documentation on how to set CPU/memory requests/limits for the Calico components (tigera-operator, api-server, typha, node, csi-node-driver, kube-controllers). The values documented in https://artifacthub.io/packages/helm/projectcalico/tigera-operator and in https://github.com/projectcalico/calico/blob/master/charts/tigera-operator/values.yaml aren't enough.

Following https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.APIServerDeploymentPodSpec, I tried setting values.yaml to:

apiServer:
  enabled: true
  spec:
    apiServerDeployment:
      spec:
        template:
          spec:
            containers:
            - name: calico-apiserver
              resources:
                limits:
                  cpu: 200m
                  memory: 192Mi
                requests:
                  cpu: 100m
                  memory: 192Mi
            topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/hostname
              whenUnsatisfiable: ScheduleAnyway
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: ScheduleAnyway
installation:
  calicoNetwork:
    bgp: Enabled
    ipPools:
    - blockSize: 25
      cidr: 172.19.0.0/16
      encapsulation: IPIP
      name: default
  cni:
    ipam:
      type: Calico
    type: Calico
  controlPlaneReplicas: 2
  enabled: true
  kubernetesProvider: EKS
  logging:
    cni:
      logSeverity: Info
  nodeUpdateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
resources:
  limits:
    cpu: 200m
    memory: 128Mi
  requests:
    cpu: 100m
    memory: 128Mi

As a result calico-apiserver doesn't have any requests/limits set. They are set only for tigera-operator.

How should I set CPU/memory requests/limits for all Calico components?

Expected Behavior

The documentation or the Helm chart's values.yaml should provide a clear example of how to set CPU/memory requests/limits for all Calico components.

Your Environment

  • Calico version
    v3.28.2
  • Calico dataplane (iptables, windows etc.)
    iptables
  • Orchestrator version (e.g. kubernetes, mesos, rkt):
    EKS 1.31 platform version eks.12
  • Operating System and version:
    Amazon Linux 2, kernel 5.10.225-213.878.amzn2.x86_64
  • Link to your project (optional):
@caseydavenport
Member

@rafalr-ntropy I think you're on the right track. Unfortunately our documentation here is pretty bad and needs to be fixed.

I think the reason it's not working for you is an extra spec section in your apiserver block. It should look like this IIUC:

apiServer:
  enabled: true
  apiServerDeployment:
    spec:
      template:
        spec:
          containers:
          - name: calico-apiserver
            resources:
              limits:
                cpu: 200m
                memory: 192Mi
              requests:
                cpu: 100m
                memory: 192Mi
          topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/hostname
            whenUnsatisfiable: ScheduleAnyway
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/zone
            whenUnsatisfiable: ScheduleAnyway
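
One way to see why the extra spec level gets in the way: as far as I understand the chart, everything under the apiServer key in values.yaml (apart from enabled) is rendered as the spec of the APIServer custom resource, so the values themselves should not contain a spec key. The sketch below shows roughly what the rendered resource would look like under that assumption; it is an illustration, not output captured from the chart.

```yaml
# Assumed shape of the APIServer resource rendered from the values above.
# The chart supplies the top-level "spec:" itself, so values.yaml starts
# one level below it (at apiServerDeployment).
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec:
  apiServerDeployment:
    spec:
      template:
        spec:
          containers:
          - name: calico-apiserver
            resources:
              limits:
                cpu: 200m
                memory: 192Mi
              requests:
                cpu: 100m
                memory: 192Mi
```

With a nested spec in values.yaml the resource would end up with spec.spec.apiServerDeployment, which does not match the field the operator reads.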

@rafalr-ntropy
Author

Hi @caseydavenport,
thanks for the quick response!
Your proposed solution works as expected.
In https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.APIServer I noticed that a spec field is present on the APIServer object.

Do you have similar examples of how to set limits/requests for Calico components other than the apiserver?

@caseydavenport
Member

@rafalr-ntropy glad it worked!

I believe there should be equivalent fields you can specify in the Installation object for the remaining components: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.InstallationSpec

For example calicoNodeDaemonSet, calicoKubeControllersDeployment, and typhaDeployment
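
A minimal values.yaml sketch of what that might look like for those three fields, following the same nesting as the apiServer example. The container names (calico-node, calico-kube-controllers, calico-typha) are my assumption of the supported values and should be checked against the API reference linked above; the resource figures are placeholders, not recommendations.

```yaml
# Hypothetical fragment for the tigera-operator chart's values.yaml.
# Container names are assumed; resource values are illustrative only.
installation:
  enabled: true
  calicoNodeDaemonSet:
    spec:
      template:
        spec:
          containers:
          - name: calico-node
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
              limits:
                cpu: 500m
                memory: 256Mi
  calicoKubeControllersDeployment:
    spec:
      template:
        spec:
          containers:
          - name: calico-kube-controllers
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
              limits:
                cpu: 200m
                memory: 128Mi
  typhaDeployment:
    spec:
      template:
        spec:
          containers:
          - name: calico-typha
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
              limits:
                cpu: 200m
                memory: 128Mi
```

These keys sit directly under installation, alongside calicoNetwork and cni, because the installation values map onto the Installation spec.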

@rafalr-ntropy
Author

Thanks @caseydavenport for the clarification. I started with the csiNodeDriverDaemonSet field under the installation key in the values passed to the Helm chart. When I use the configuration below (calico-csi and csi-node-driver-registrar are the two containers deployed as part of the csi-node-driver DaemonSet):

installation:
  calicoNetwork:
    bgp: Enabled
    ipPools:
    - blockSize: 25
      cidr: 172.16.0.0/16
      encapsulation: IPIP
      name: default
  cni:
    ipam:
      type: Calico
    type: Calico
  controlPlaneReplicas: 2
  csiNodeDriverDaemonSet:
    spec:
      template:
        spec:
          containers:
          - name: calico-csi
            resources:
              limits:
                cpu: 200m
                memory: 128Mi
              requests:
                cpu: 100m
                memory: 128Mi
          - name: csi-node-driver-registrar
            resources:
              limits:
                cpu: 200m
                memory: 128Mi
              requests:
                cpu: 100m
                memory: 128Mi
  enabled: true
  kubernetesProvider: EKS
  logging:
    cni:
      logSeverity: Info
  nodeUpdateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

I get the following error:

Error: cannot patch "default" with kind Installation: Installation.operator.tigera.io "default" is invalid: [spec.csiNodeDriverDaemonSet.spec.template.spec.containers[0].name: Unsupported value: "csi-node-driver-registrar": supported values: "csi-node-driver", spec.csiNodeDriverDaemonSet.spec.template.spec.containers[1].name: Unsupported value: "calico-csi": supported values: "csi-node-driver"]

When I use the supported value csi-node-driver (which doesn't seem correct according to the description of the name field in https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.CSINodeDriverDaemonSetContainer), as shown below:

installation:
  calicoNetwork:
    bgp: Enabled
    ipPools:
    - blockSize: 25
      cidr: 172.16.0.0/16
      encapsulation: IPIP
      name: default
  cni:
    ipam:
      type: Calico
    type: Calico
  controlPlaneReplicas: 2
  csiNodeDriverDaemonSet:
    spec:
      template:
        spec:
          containers:
          - name: csi-node-driver
            resources:
              limits:
                cpu: 200m
                memory: 128Mi
              requests:
                cpu: 100m
                memory: 128Mi
  enabled: true
  kubernetesProvider: EKS
  logging:
    cni:
      logSeverity: Info
  nodeUpdateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

I don't get an error, but the CPU/memory requests/limits aren't applied to the containers in the csi-node-driver pods.
Can you help me with this problem?

@caseydavenport
Member

Hm, I think this issue you're describing was fixed in this PR: tigera/operator#3254

What version of the operator do you have installed?
