feat: add known issue about fleet upgrade stuck (#697)
* feat: add known issue about fleet upgrade stuck

Signed-off-by: PoAn Yang <[email protected]>

* Update docs/upgrade/v1-3-2-to-v1-4-0.md

Co-authored-by: Jillian <[email protected]>

---------

Signed-off-by: PoAn Yang <[email protected]>
Co-authored-by: Jillian <[email protected]>
FrankYang0529 and jillian-maroket authored Dec 30, 2024
1 parent c331c22 commit 45ea47b
Showing 3 changed files with 86 additions and 0 deletions.
32 changes: 32 additions & 0 deletions docs/upgrade/v1-3-2-to-v1-4-0.md
@@ -82,3 +82,35 @@ kubectl delete svc longhorn-replica-manager -n longhorn-system --ignore-not-foun
```

---

### 3. Upgrade stuck on waiting for Fleet

When upgrading from v1.3.2 to v1.4.0, the upgrade process may become stuck on waiting for Fleet to become ready. This issue is caused by a race condition when Rancher is redeployed.

Check the Harvester logs and Fleet history for the following indicators:

- The manifest pod is stuck in the `deployed` status.
- The upgrade is pending with a chart version that has been deployed.

Example:

```shell
> kubectl logs -n harvester-system -l harvesterhci.io/upgradeComponent=manifest
wait helm release cattle-fleet-system fleet fleet-104.0.2+up0.10.2 0.10.2 deployed

> helm history -n cattle-fleet-system fleet
REVISION  UPDATED                   STATUS           CHART                  APP VERSION  DESCRIPTION
26        Tue Dec 10 03:09:13 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
27        Sun Dec 15 09:26:54 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
28        Sun Dec 15 09:27:03 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
29        Mon Dec 16 05:57:03 2024  deployed         fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
30        Mon Dec 16 05:57:13 2024  pending-upgrade  fleet-103.1.5+up0.9.5  0.9.5        Preparing upgrade
```

To fix the issue, roll back Fleet to the last revision that has the `deployed` status (revision 29 in the example above).

```shell
helm rollback fleet -n cattle-fleet-system <last-deployed-revision>
```
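
If you prefer not to read the revision number by eye, the lookup can be scripted. The following is a minimal sketch, not part of the official procedure: `last_deployed_revision` is a hypothetical helper name, and it assumes the default table output of `helm history`, where the timestamp occupies five whitespace-separated fields so that the status is the seventh field.

```shell
# Hypothetical helper: reads `helm history` table output on stdin and prints
# the revision number of the last row whose STATUS column is "deployed".
# Assumes the default table layout: $1 = revision, $2-$6 = timestamp, $7 = status.
last_deployed_revision() {
  awk '$7 == "deployed" { rev = $1 } END { if (rev != "") print rev }'
}

# Usage (run against the cluster; inspect the value before rolling back):
#   rev=$(helm history -n cattle-fleet-system fleet | last_deployed_revision)
#   echo "last deployed revision: ${rev}"
#   helm rollback fleet -n cattle-fleet-system "${rev}"
```

The helper prints nothing when no row has the `deployed` status, so check that the variable is not empty before you pass it to `helm rollback`.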

---
27 changes: 27 additions & 0 deletions versioned_docs/version-v1.3/upgrade/v1-3-2-to-v1-4-0.md
@@ -82,3 +82,30 @@ kubectl delete svc longhorn-replica-manager -n longhorn-system --ignore-not-foun
```

---

### 3. Upgrade stuck on waiting for Fleet

When upgrading from v1.3.2 to v1.4.0, the upgrade process may become stuck on waiting for Fleet to become ready. This issue is caused by a race condition when Rancher is redeployed.

The following output indicates that the issue exists. The manifest pod is stuck in the `deployed` status, and the Fleet history shows that the upgrade is pending with a chart version that has already been deployed.

```shell
> kubectl logs -n harvester-system -l harvesterhci.io/upgradeComponent=manifest
wait helm release cattle-fleet-system fleet fleet-104.0.2+up0.10.2 0.10.2 deployed

> helm history -n cattle-fleet-system fleet
REVISION  UPDATED                   STATUS           CHART                  APP VERSION  DESCRIPTION
26        Tue Dec 10 03:09:13 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
27        Sun Dec 15 09:26:54 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
28        Sun Dec 15 09:27:03 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
29        Mon Dec 16 05:57:03 2024  deployed         fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
30        Mon Dec 16 05:57:13 2024  pending-upgrade  fleet-103.1.5+up0.9.5  0.9.5        Preparing upgrade
```

To fix the issue, roll back Fleet to the last revision that has the `deployed` status (revision 29 in the example above).

```shell
helm rollback fleet -n cattle-fleet-system <last-deployed-revision>
```

---
27 changes: 27 additions & 0 deletions versioned_docs/version-v1.4/upgrade/v1-3-2-to-v1-4-0.md
@@ -82,3 +82,30 @@ kubectl delete svc longhorn-replica-manager -n longhorn-system --ignore-not-foun
```

---

### 3. Upgrade stuck on waiting for Fleet

When upgrading from v1.3.2 to v1.4.0, the upgrade process may become stuck on waiting for Fleet to become ready. This issue is caused by a race condition when Rancher is redeployed.

The following output indicates that the issue exists. The manifest pod is stuck in the `deployed` status, and the Fleet history shows that the upgrade is pending with a chart version that has already been deployed.

```shell
> kubectl logs -n harvester-system -l harvesterhci.io/upgradeComponent=manifest
wait helm release cattle-fleet-system fleet fleet-104.0.2+up0.10.2 0.10.2 deployed

> helm history -n cattle-fleet-system fleet
REVISION  UPDATED                   STATUS           CHART                  APP VERSION  DESCRIPTION
26        Tue Dec 10 03:09:13 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
27        Sun Dec 15 09:26:54 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
28        Sun Dec 15 09:27:03 2024  superseded       fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
29        Mon Dec 16 05:57:03 2024  deployed         fleet-103.1.5+up0.9.5  0.9.5        Upgrade complete
30        Mon Dec 16 05:57:13 2024  pending-upgrade  fleet-103.1.5+up0.9.5  0.9.5        Preparing upgrade
```

To fix the issue, roll back Fleet to the last revision that has the `deployed` status (revision 29 in the example above).

```shell
helm rollback fleet -n cattle-fleet-system <last-deployed-revision>
```

---
