fix: clean up netdr and solve image and version issues
Signed-off-by: ONE7live <[email protected]>
ONE7live committed Nov 27, 2024
1 parent 3a6a818 commit 6cea4dc
Showing 20 changed files with 477 additions and 662 deletions.
22 changes: 7 additions & 15 deletions Makefile
@@ -9,7 +9,7 @@ REGISTRY_PASSWORD?=""
REGISTRY_SERVER_ADDRESS?=""
KIND_IMAGE_TAG?="v1.25.3"

TARGETS := clusterlink-floater \
TARGETS := netdr-floater \

CTL_TARGETS := netctl

@@ -22,8 +22,8 @@ CTL_TARGETS := netctl
# Example:
# make
# make all
# make clusterlink-controller-manager
# make clusterlink-controller-manager GOOS=linux
# make netdr-floater
# make netdr-floater GOOS=linux
CMD_TARGET=$(TARGETS) $(CTL_TARGETS)

.PHONY: all
@@ -41,8 +41,8 @@ $(CMD_TARGET):
#
# Example:
# make images
# make image-clusterlink-controller-manager
# make image-clusterlink-controller-manager GOARCH=arm64
# make image-netdr-floater
# make image-netdr-floater GOARCH=arm64
IMAGE_TARGET=$(addprefix image-, $(TARGETS))
.PHONY: $(IMAGE_TARGET)
$(IMAGE_TARGET):
@@ -57,7 +57,7 @@ images: $(IMAGE_TARGET)
#
# Example
# make multi-platform-images
# make mp-image-clusterlink-controller-manager
# make mp-image-netdr-floater
MP_TARGET=$(addprefix mp-image-, $(TARGETS))
.PHONY: $(MP_TARGET)
$(MP_TARGET):
@@ -92,15 +92,7 @@ test:

upload-images: images
@echo "push images to $(REGISTRY)"
docker push ${REGISTRY}/clusterlink-controller-manager:${VERSION}
docker push ${REGISTRY}/kosmos-operator:${VERSION}
docker push ${REGISTRY}/clusterlink-agent:${VERSION}
docker push ${REGISTRY}/clusterlink-proxy:${VERSION}
docker push ${REGISTRY}/clusterlink-network-manager:${VERSION}
docker push ${REGISTRY}/clusterlink-floater:${VERSION}
docker push ${REGISTRY}/clusterlink-elector:${VERSION}
docker push ${REGISTRY}/clustertree-cluster-manager:${VERSION}
docker push ${REGISTRY}/scheduler:${VERSION}
docker push ${REGISTRY}/netdr-floater:${VERSION}

.PHONY: release
release:
38 changes: 23 additions & 15 deletions README.md
@@ -44,7 +44,7 @@ I0205 16:27:26.258964 2765415 init.go:69] write opts success
$ cat config.json
{
"namespace": "kosmos-system",
"version": "v0.2.0",
"version": "v0.0.2",
"protocol": "tcp",
"podWaitTime": 30,
"port": "8889",
@@ -58,12 +58,12 @@ $ cat config.json
* The `netctl check` command reads `config.json`, creates a `DaemonSet` named `Floater` along with some related resources, collects the `IP` addresses of all `Floater` pods, and then enters each `Pod` in turn to run a `Ping` or `Curl` command. Note that these checks run concurrently, and the degree of concurrency is adjusted dynamically according to the `maxNum` parameter in `config.json`.
````bash
$ netctl check
I0205 16:34:06.147671 2769373 check.go:61] use config from file!!!!!!
I0205 16:34:06.148619 2769373 floater.go:73] create Clusterlink floater, namespace: kosmos-system
I0205 16:34:06.157582 2769373 floater.go:83] create Clusterlink floater, apply RBAC
I0205 16:34:06.167799 2769373 floater.go:94] create Clusterlink floater, version: v0.2.0
I0205 16:34:09.178566 2769373 verify.go:79] pod: clusterlink-floater-9dzsg is ready. status: Running
I0205 16:34:09.179593 2769373 verify.go:79] pod: clusterlink-floater-cscdh is ready. status: Running
I1127 11:18:16.689718 1257705 check.go:65] use config from file!!!!!!
I1127 11:18:16.690956 1257705 floater.go:73] create NetDoctor floater, namespace: kosmos-system
I1127 11:18:16.704187 1257705 floater.go:83] create NetDoctor floater, apply RBAC resources.
I1127 11:18:16.721158 1257705 floater.go:94] create NetDoctor floater, version: v0.0.2
I1127 11:18:19.751548 1257705 verify.go:79] pod: netdr-floater-9fzhs is ready. status: Running
I1127 11:18:19.754697 1257705 verify.go:79] pod: netdr-floater-t6b7z is ready. status: Running
Do check... 100% [================================================================================] [0s]
+-----+----------------+----------------+-----------+-----------+
| S/N | SRC NODE NAME | DST NODE NAME | TARGET IP | RESULT |
@@ -78,7 +78,7 @@ Do check... 100% [==============================================================
| 1 | ecs-net-dr-002 | ecs-net-dr-001 | 10.0.1.86 | EXCEPTION |exec error: unable to upgrade |
| 2 | ecs-net-dr-001 | ecs-net-dr-002 | 10.0.2.29 | EXCEPTION |connection: container not......|
+-----+----------------+----------------+-----------+-----------+-------------------------------+
I0205 16:34:09.280220 2769373 do.go:93] write opts success
I1127 11:18:19.995105 1257705 do.go:154] write opts success
````
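
The bounded concurrency described for `netctl check` can be illustrated with a small Go sketch. This is not the project's implementation: `checkResult`, `runChecks`, and the `probe` callback are hypothetical names, and the assumption that every ordered pair of distinct nodes is probed is inferred only from the sample output table. It shows one common way to keep at most `maxNum` probes in flight, using a buffered channel as a semaphore.

```go
package main

import (
	"fmt"
	"sync"
)

// checkResult is a hypothetical record of one connectivity probe.
type checkResult struct {
	Src, Dst string
	OK       bool
}

// runChecks calls probe for every ordered pair of distinct nodes,
// allowing at most maxNum probes to run at the same time.
func runChecks(nodes []string, maxNum int, probe func(src, dst string) bool) []checkResult {
	if maxNum < 1 {
		maxNum = 1 // guard against a zero or negative setting
	}
	sem := make(chan struct{}, maxNum) // counting semaphore
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []checkResult
	)
	for _, src := range nodes {
		for _, dst := range nodes {
			if src == dst {
				continue // assumption: a node is not probed against itself
			}
			wg.Add(1)
			sem <- struct{}{} // blocks while maxNum probes are already running
			go func(src, dst string) {
				defer wg.Done()
				defer func() { <-sem }() // release the slot
				ok := probe(src, dst)
				mu.Lock()
				results = append(results, checkResult{Src: src, Dst: dst, OK: ok})
				mu.Unlock()
			}(src, dst)
		}
	}
	wg.Wait()
	return results
}

func main() {
	nodes := []string{"ecs-net-dr-001", "ecs-net-dr-002"}
	// Stand-in probe that always succeeds; a real probe would exec Ping or Curl in a Pod.
	results := runChecks(nodes, 2, func(src, dst string) bool { return true })
	for _, r := range results {
		fmt.Printf("%s -> %s ok=%v\n", r.Src, r.Dst, r.OK)
	}
}
```

Results are appended in completion order, so a caller that needs a stable table would sort them before printing.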

* During the execution of the `check` command, a progress bar will display the verification progress. After the command is executed, the check results will be printed and saved in the file `resume.json`.
@@ -106,7 +106,7 @@ I0205 16:34:09.280220 2769373 do.go:93] write opts success
$ vim config.json
{
"namespace": "kosmos-system",
"version": "v0.2.0",
"version": "v0.0.2",
"protocol": "tcp",
"podWaitTime": 30,
"port": "8889",
@@ -122,12 +122,12 @@ $ vim config.json
* The `netctl resume` command rechecks only the cluster nodes that had problems in the previous check. Because production environments contain a large number of nodes, a single full check may take a long time to produce results, so it is useful to retest only the nodes that were abnormal last time. The `resume` command reads the `resume.json` file and rechecks those nodes. You can run it repeatedly until no abnormal results remain, and then perform a full check.
````bash
$ netctl resume
I0205 16:34:06.147671 2769373 check.go:61] use config from file!!!!!!
I0205 16:34:06.148619 2769373 floater.go:73] create Clusterlink floater, namespace: kosmos-system
I0205 16:34:06.157582 2769373 floater.go:83] create Clusterlink floater, apply RBAC
I0205 16:34:06.167799 2769373 floater.go:94] create Clusterlink floater, version: v0.2.0
I0205 16:34:09.178566 2769373 verify.go:79] pod: clusterlink-floater-9dzsg is ready. status: Running
I0205 16:34:09.179593 2769373 verify.go:79] pod: clusterlink-floater-cscdh is ready. status: Running
I1127 11:18:16.689718 1257705 check.go:65] use config from file!!!!!!
I1127 11:18:16.690956 1257705 floater.go:73] create NetDoctor floater, namespace: kosmos-system
I1127 11:18:16.704187 1257705 floater.go:83] create NetDoctor floater, apply RBAC resources.
I1127 11:18:16.721158 1257705 floater.go:94] create NetDoctor floater, version: v0.0.2
I1127 11:18:19.751548 1257705 verify.go:79] pod: netdr-floater-9fzhs is ready. status: Running
I1127 11:18:19.754697 1257705 verify.go:79] pod: netdr-floater-t6b7z is ready. status: Running
Do check... 100% [================================================================================] [0s]
+-----+----------------+----------------+-----------+-----------+
| S/N | SRC NODE NAME | DST NODE NAME | TARGET IP | RESULT |
@@ -139,6 +139,14 @@ Do check... 100% [==============================================================

* The `netctl clean` command cleans up all resources created by `NetDoctor`.
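
The `resume` workflow described earlier (recheck only the pairs that failed, and repeat until nothing is abnormal) can be sketched in Go as follows. The schema of `resume.json` is not shown in this README, so the `pair` type, `recheck`, `resumeUntilClean`, and the `probe` callback are all hypothetical stand-ins rather than the tool's real types.

```go
package main

import "fmt"

// pair is a hypothetical source/destination node pair; the real
// resume.json schema is not documented here.
type pair struct{ Src, Dst string }

// recheck probes only the given pairs and returns those that still fail.
func recheck(failed []pair, probe func(p pair) bool) []pair {
	var still []pair
	for _, p := range failed {
		if !probe(p) {
			still = append(still, p)
		}
	}
	return still
}

// resumeUntilClean repeats recheck until no pair fails or maxRounds
// is reached, and returns whatever is still failing.
func resumeUntilClean(failed []pair, maxRounds int, probe func(p pair) bool) []pair {
	for round := 0; round < maxRounds && len(failed) > 0; round++ {
		failed = recheck(failed, probe)
	}
	return failed
}

func main() {
	failed := []pair{
		{Src: "ecs-net-dr-002", Dst: "ecs-net-dr-001"},
		{Src: "ecs-net-dr-001", Dst: "ecs-net-dr-002"},
	}
	// Stand-in probe: only traffic originating from ecs-net-dr-001 recovers.
	probe := func(p pair) bool { return p.Src == "ecs-net-dr-001" }
	left := resumeUntilClean(failed, 3, probe)
	fmt.Println("pairs still failing:", len(left))
}
```

Bounding the loop with `maxRounds` is a design choice of this sketch, so that a permanently broken link does not cause endless retries.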

### netdr-floater Image

#### Building from source
# Clone the project source code
$ git clone https://github.com/kosmos-io/netdoctor.git
# Run the make command to build the image ghcr.io/kosmos-io/netdr-floater:latest
$ make image-netdr-floater

## Contribute Code

* We welcome help in any form, including but not limited to improving documentation, asking questions, fixing bugs, and adding features.
40 changes: 25 additions & 15 deletions README_zh.md
@@ -43,7 +43,7 @@ I0205 16:27:26.258964 2765415 init.go:69] write opts success
$ cat config.json
{
"namespace": "kosmos-system",
"version": "v0.2.0",
"version": "v0.0.2",
"protocol": "tcp",
"podWaitTime": 30,
"port": "8889",
@@ -57,12 +57,12 @@ $ cat config.json
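* The `netctl check` command reads `config.json`, creates a `DaemonSet` named `Floater` along with some related resources, collects the `IP` addresses of all `Floater` pods, and then enters each `Pod` in turn to run a `Ping` or `Curl` command. Note that these checks run concurrently, and the degree of concurrency is adjusted dynamically according to the `maxNum` parameter in `config.json`.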
````bash
$ netctl check
I0205 16:34:06.147671 2769373 check.go:61] use config from file!!!!!!
I0205 16:34:06.148619 2769373 floater.go:73] create Clusterlink floater, namespace: kosmos-system
I0205 16:34:06.157582 2769373 floater.go:83] create Clusterlink floater, apply RBAC
I0205 16:34:06.167799 2769373 floater.go:94] create Clusterlink floater, version: v0.2.0
I0205 16:34:09.178566 2769373 verify.go:79] pod: clusterlink-floater-9dzsg is ready. status: Running
I0205 16:34:09.179593 2769373 verify.go:79] pod: clusterlink-floater-cscdh is ready. status: Running
I1127 11:18:16.689718 1257705 check.go:65] use config from file!!!!!!
I1127 11:18:16.690956 1257705 floater.go:73] create NetDoctor floater, namespace: kosmos-system
I1127 11:18:16.704187 1257705 floater.go:83] create NetDoctor floater, apply RBAC resources.
I1127 11:18:16.721158 1257705 floater.go:94] create NetDoctor floater, version: v0.0.2
I1127 11:18:19.751548 1257705 verify.go:79] pod: netdr-floater-9fzhs is ready. status: Running
I1127 11:18:19.754697 1257705 verify.go:79] pod: netdr-floater-t6b7z is ready. status: Running
Do check... 100% [================================================================================] [0s]
+-----+----------------+----------------+-----------+-----------+
| S/N | SRC NODE NAME | DST NODE NAME | TARGET IP | RESULT |
@@ -77,7 +77,7 @@ Do check... 100% [==============================================================
| 1 | ecs-net-dr-002 | ecs-net-dr-001 | 10.0.1.86 | EXCEPTION |exec error: unable to upgrade |
| 2 | ecs-net-dr-001 | ecs-net-dr-002 | 10.0.2.29 | EXCEPTION |connection: container not......|
+-----+----------------+----------------+-----------+-----------+-------------------------------+
I0205 16:34:09.280220 2769373 do.go:93] write opts success
I1127 11:18:19.995105 1257705 do.go:154] write opts success
````
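* While the `check` command runs, a progress bar shows the verification progress. When the command finishes, the check results are printed and saved in the file `resume.json`.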
````bash
@@ -104,7 +104,7 @@ I0205 16:34:09.280220 2769373 do.go:93] write opts success
$ vim config.json
{
"namespace": "kosmos-system",
"version": "v0.2.0",
"version": "v0.0.2",
"protocol": "tcp",
"podWaitTime": 30,
"port": "8889",
@@ -120,12 +120,12 @@ $ vim config.json
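* The `netctl resume` command rechecks only the cluster nodes that had problems in the previous check. Because production environments contain a large number of nodes, a single full check may take a long time to produce results, so it is useful to retest only the nodes that were abnormal last time. The `resume` command reads the `resume.json` file and rechecks those nodes. You can run it repeatedly until no abnormal results remain, and then perform a full check.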
````bash
$ netctl resume
I0205 16:34:06.147671 2769373 check.go:61] use config from file!!!!!!
I0205 16:34:06.148619 2769373 floater.go:73] create Clusterlink floater, namespace: kosmos-system
I0205 16:34:06.157582 2769373 floater.go:83] create Clusterlink floater, apply RBAC
I0205 16:34:06.167799 2769373 floater.go:94] create Clusterlink floater, version: v0.2.0
I0205 16:34:09.178566 2769373 verify.go:79] pod: clusterlink-floater-9dzsg is ready. status: Running
I0205 16:34:09.179593 2769373 verify.go:79] pod: clusterlink-floater-cscdh is ready. status: Running
I1127 11:18:16.689718 1257705 check.go:65] use config from file!!!!!!
I1127 11:18:16.690956 1257705 floater.go:73] create NetDoctor floater, namespace: kosmos-system
I1127 11:18:16.704187 1257705 floater.go:83] create NetDoctor floater, apply RBAC resources.
I1127 11:18:16.721158 1257705 floater.go:94] create NetDoctor floater, version: v0.0.2
I1127 11:18:19.751548 1257705 verify.go:79] pod: netdr-floater-9fzhs is ready. status: Running
I1127 11:18:19.754697 1257705 verify.go:79] pod: netdr-floater-t6b7z is ready. status: Running
Do check... 100% [================================================================================] [0s]
+-----+----------------+----------------+-----------+-----------+
| S/N | SRC NODE NAME | DST NODE NAME | TARGET IP | RESULT |
@@ -137,6 +137,16 @@ Do check... 100% [==============================================================

* The `netctl clean` command cleans up all resources created by `NetDoctor`.

### netdr-floater Image

#### Building from source
````bash
# Clone the project source code
$ git clone https://github.com/kosmos-io/netdoctor.git
# Run the make command to build the image ghcr.io/kosmos-io/netdr-floater:latest
$ make image-netdr-floater
````

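## Contribute Code

* We welcome help in any form, including but not limited to improving documentation, asking questions, fixing bugs, and adding features.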
2 changes: 1 addition & 1 deletion cmd/floater/app/floater.go
@@ -16,7 +16,7 @@ func NewFloaterCommand(ctx context.Context) *cobra.Command {
opts := options.NewOptions()

cmd := &cobra.Command{
Use: "clusterlink-floater",
Use: "netdr-floater",
Long: `Environment for executing commands`,
RunE: func(cmd *cobra.Command, args []string) error {
if err := Run(ctx, opts); err != nil {
14 changes: 7 additions & 7 deletions cmd/floater/main.go
@@ -1,17 +1,17 @@
package main

import (
"context"
"fmt"
"os"

apiserver "k8s.io/apiserver/pkg/server"
"k8s.io/component-base/cli"

"github.com/kosmos.io/netdoctor/cmd/floater/app"
)

func main() {
ctx := context.TODO()
ctx := apiserver.SetupSignalContext()
cmd := app.NewFloaterCommand(ctx)
err := cmd.Execute()
if err != nil {
fmt.Print(err)
}
code := cli.Run(cmd)
os.Exit(code)
}
4 changes: 2 additions & 2 deletions cmd/netdr/main.go
@@ -4,11 +4,11 @@ import (
"k8s.io/component-base/cli"
"k8s.io/kubectl/pkg/cmd/util"

app "github.com/kosmos.io/netdoctor/pkg"
"github.com/kosmos.io/netdoctor/pkg/netdr"
)

func main() {
cmd := app.NewNetDoctorCtlCommand()
cmd := netdr.NewNetDoctorCtlCommand()
if err := cli.RunNoErrOutput(cmd); err != nil {
util.CheckErr(err)
}