Calico ClusterRole missing perms which causes calicoctl to error #11683

Open
bvierra opened this issue Nov 3, 2024 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

bvierra commented Nov 3, 2024

What happened?

On a new install with Calico as the CNI, log into a k8s node and run calicoctl.sh ipam check; it fails with a permissions error:

root@k8s-worker-1:/home/ansible# calicoctl.sh ipam check
Checking IPAM for inconsistencies...

Loading all IPAM blocks...
Found 5 IPAM blocks.
 IPAM block 10.233.110.128/26 affinity=host:k8s-worker-1:
 IPAM block 10.233.113.0/26 affinity=host:k8s-control-1:
 IPAM block 10.233.66.0/26 affinity=host:k8s-worker-4:
 IPAM block 10.233.85.64/26 affinity=host:k8s-control-2:
 IPAM block 10.233.93.192/26 affinity=host:k8s-control-3:
IPAM blocks record 5 allocations.

Loading all IPAM pools...
  10.233.64.0/18
Found 1 active IP pools.

Loading all nodes.
failed to list nodes: connection is unauthorized: nodes is forbidden: User "system:serviceaccount:kube-system:calico-cni-plugin" cannot list resource "nodes" in API group "" at the cluster scope

The fix was to apply:

diff --git a/roles/network_plugin/calico/templates/calico-cr.yml.j2 b/roles/network_plugin/calico/templates/calico-cr.yml.j2
index 7ddec1698..5e6651761 100644
--- a/roles/network_plugin/calico/templates/calico-cr.yml.j2
+++ b/roles/network_plugin/calico/templates/calico-cr.yml.j2
@@ -11,6 +11,7 @@ rules:
       - namespaces
     verbs:
       - get
+      - list
   - apiGroups: [""]
     resources:
       - pods/status

Note that I added list for pods, nodes, and namespaces, because all three needed the permission. I first split nodes out into its own rule with list and get, and then namespaces errored on the next run.
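One way to confirm which verbs are missing, before and after the change, is to ask the API server directly. The following is a minimal sketch, not part of the original report: it assumes kubectl is available with a kubeconfig that is allowed to impersonate service accounts (for example cluster-admin). The service account name is copied from the error message above; the function name check_calico_perms is hypothetical.

```shell
#!/usr/bin/env bash
# Service account taken from the "forbidden" error in the report.
SA="system:serviceaccount:kube-system:calico-cni-plugin"

# Hypothetical helper: prints one line per verb/resource pair with the
# API server's answer (yes/no) for the impersonated service account.
check_calico_perms() {
  local verb resource answer
  for resource in pods nodes namespaces; do
    for verb in get list; do
      # "kubectl auth can-i" answers yes or no; --as impersonates the
      # service account, --all-namespaces checks at cluster-wide scope.
      answer=$(kubectl auth can-i "$verb" "$resource" \
        --as="$SA" --all-namespaces 2>/dev/null)
      printf '%-4s %-10s %s\n' "$verb" "$resource" "${answer:-unknown}"
    done
  done
}
```

Calling check_calico_perms prints six lines. Given the error reported above, the list rows would be expected to show no before the diff is applied, and all rows should show yes afterwards.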

What did you expect to happen?

I expected it not to error and get a result similar to the following:

root@k8s-worker-1:/home/ansible# calicoctl.sh ipam check
Checking IPAM for inconsistencies...

Loading all IPAM blocks...
Found 5 IPAM blocks.
 IPAM block 10.233.110.128/26 affinity=host:k8s-worker-1:
 IPAM block 10.233.113.0/26 affinity=host:k8s-control-1:
 IPAM block 10.233.66.0/26 affinity=host:k8s-worker-4:
 IPAM block 10.233.85.64/26 affinity=host:k8s-control-2:
 IPAM block 10.233.93.192/26 affinity=host:k8s-control-3:
IPAM blocks record 5 allocations.

Loading all IPAM pools...
  10.233.64.0/18
Found 1 active IP pools.

Loading all nodes.
Found 0 node tunnel IPs.

Loading all workload endpoints.
Found 5 workload IPs.
Workloads and nodes are using 5 IPs.

Loading all handles
Looking for top (up to 20) nodes by allocations...
  k8s-worker-4 has 1 allocations
  k8s-control-2 has 1 allocations
  k8s-control-3 has 1 allocations
  k8s-worker-1 has 1 allocations
  k8s-control-1 has 1 allocations
Node with most allocations has 1; median is 1

Scanning for IPs that are allocated but not actually in use...
Found 0 IPs that are allocated in IPAM but not actually in use.
Scanning for IPs that are in use by a workload or node but not allocated in IPAM...
Found 0 in-use IPs that are not in active IP pools.
Found 0 in-use IPs that are in active IP pools but have no corresponding IPAM allocation.

Scanning for IPAM handles with no matching IPs...
Found 0 handles with no matching IPs (and 5 handles with matches).
Scanning for IPs with missing handle...
Found 0 handles mentioned in blocks with no matching handle resource.
Check complete; found 0 problems.

How can we reproduce it (as minimally and precisely as possible)?

Do an install with the Calico setup (my Calico vars are included below). Log into a k8s node and run calicoctl.sh ipam check. Apply the diff from above, then run the same command again and it works.

My group_vars/k8s_cluster/k8s-net-calico.yml is as follows; you should not need all of the BGP settings.

---
calico_cni_name: k8s-pod-network
peer_with_router: true
nat_outgoing: true
nat_outgoing_ipv6: false
calico_pool_name: "default-pool"
calico_pool_blocksize: 26
calico_pool_cidr: 10.233.64.0/18
calico_cni_pool: true
global_as_num: "64513"
calico_mtu: 1500
calico_veth_mtu: 1500
calico_advertise_cluster_ips: true
calico_datastore: "kdd"
typha_enabled: true
calico_network_backend: 'bird'
calico_ipip_mode: 'Never'
calico_vxlan_mode: 'Never'
calico_apiserver_enabled: true
peers:
  - router_id: "10.10.130.1"
    as: "64512"

OS

Linux 6.8.0-48-generic x86_64
PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

Version of Ansible

ansible [core 2.16.12]
  config file = None
  configured module search path = ['/home/bvierra/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/bvierra/p/homelab/.direnv/python-3.12.3/lib/python3.12/site-packages/ansible
  ansible collection location = /home/bvierra/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/bvierra/p/homelab/.direnv/python-3.12.3/bin/ansible
  python version = 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] (/home/bvierra/p/homelab/.direnv/python-3.12.3/bin/python)
  jinja version = 3.1.4
  libyaml = True

Version of Python

Python 3.12.3

Version of Kubespray (commit)

bb7b4e0

Network plugin used

calico

Full inventory with variables

I can add this if needed, but it is large and there appeared to be some things I would have to redact :)

Command used to invoke ansible

ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml

Output of ansible run

as above

Anything else we need to know

No response

bvierra added the kind/bug label Nov 3, 2024
@tico88612
Member

I compared this with the official Calico settings and didn't see any need for additional permissions. Could you provide more information about this?

@bvierra
Author

bvierra commented Nov 5, 2024

Hey,
Thanks for looking into this. I agree that their official settings are as you describe. However, I can confirm that on a brand new cluster (new Ubuntu VMs created from a fresh ISO install), kubespray with the configs stated above does not work without the change to the CR.

Over the next few days I will stand up a cluster without kubespray, following the Calico install instructions, to see whether this is an error in Calico's default permissions or a conflict with something else kubespray does.
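For that comparison, a small helper could dump the rules of the relevant ClusterRole from each cluster so the two can be diffed. This is only a sketch under assumptions: the ClusterRole name calico-cni-plugin is assumed to match the service account in the error message and may differ in a given install, the kubeconfig context names are placeholders, and the function name dump_cni_role_rules is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "resources verbs" for every rule of a ClusterRole.
#   $1  kubeconfig context to query (placeholder, supply your own)
#   $2  ClusterRole name; defaults to calico-cni-plugin, which is an ASSUMPTION
dump_cni_role_rules() {
  local context="$1"
  local role="${2:-calico-cni-plugin}"
  # jsonpath iterates over .rules and prints one rule per line.
  kubectl --context "$context" get clusterrole "$role" \
    -o jsonpath='{range .rules[*]}{.resources}{" "}{.verbs}{"\n"}{end}'
}
```

Running it once against the kubespray cluster and once against the manually installed one, saving each output to a file and diffing the two, would show whether the verbs actually differ between the installs.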

@mzaian
Contributor

mzaian commented Nov 8, 2024

Hey @bvierra ,

Which versions of kubespray and calico are you using in your setup?
