[STRMCMP-1659] k8s v1.24 upgrade (#296)
This PR updates the operator to use Kubernetes 1.24 APIs, tested on Kubernetes 1.27.

## Changes

1. Updates project dependencies in `go.mod` to use the K8S 1.24 APIs
2. Updates the WordCount example application to use Flink 1.16 (previously Flink 1.8, which predates [Flink's new memory management configs](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/mem_setup/))
3. Configures the `SubmitJobRequest` struct type to consider the fields all as `omitempty` so they're included in the SubmitJob only if non-empty. This resolves a problem with newer versions of Flink that attempted to start the job from an empty-string savepoint URI, which would consistently fail. These fields are [optional](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jars-jarid-run), so omitting them should be fine.
4. Introduces a Zap logging wrapper since the Prometheus logging package is dead.

### Testing

This was developed and tested against a [k3d cluster](https://k3d.io) of Kubernetes 1.27 with the following specs:

```
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4+k3s1", GitCommit:"36645e7311e9bdbbf2adb79ecd8bd68556bc86f6", GitTreeState:"clean", BuildDate:"2023-07-28T09:46:05Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/arm64"}
```

### Steps

#### 1. Build Operator

```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator → make IMAGE_NAME=$REPOSITORY ./boilerplate/lyft/docker_build/docker_build.sh
------------------------------------
DOCKER BUILD
------------------------------------
[+] Building 28.6s (17/17) FINISHED docker:desktop-linux
 => [internal] load build definition from Dockerfile 0.0s
 => => transferring dockerfile: 895B 0.0s
 => [internal] load .dockerignore 0.0s
 => => transferring context: 2B 0.0s
 => [internal] load metadata for docker.io/library/alpine:3.17 1.0s
 => [internal] load metadata for docker.io/library/golang:1.20.2-alpine3.17 1.8s
 => [auth] library/alpine:pull token for registry-1.docker.io 0.0s
 => [auth] library/golang:pull token for registry-1.docker.io 0.0s
 => [builder 1/7] FROM docker.io/library/golang:1.20.2-alpine3.17@sha256:87734b78d26a52260f303cf1b40df45b0797f972bd0250e56937c42114bf472c 0.0s
 => CACHED [stage-1 1/2] FROM docker.io/library/alpine:3.17@sha256:f71a5f071694a785e064f05fed657bf8277f1b2113a8ed70c90ad486d6ee54dc 0.0s
 => [internal] load build context 0.1s
 => => transferring context: 203.13kB 0.1s
 => CACHED [builder 2/7] RUN apk add git openssh-client make curl bash 0.0s
 => CACHED [builder 3/7] COPY go.mod go.sum /go/src/github.com/lyft/flinkk8soperator/ 0.0s
 => CACHED [builder 4/7] WORKDIR /go/src/github.com/lyft/flinkk8soperator 0.0s
 => CACHED [builder 5/7] RUN go mod download 0.0s
 => [builder 6/7] COPY . /go/src/github.com/lyft/flinkk8soperator/ 0.2s
 => [builder 7/7] RUN go mod vendor && make linux_compile 25.8s
 => [stage-1 2/2] COPY --from=builder /artifacts /bin 0.1s
 => exporting to image 0.1s
 => => exporting layers 0.1s
 => => writing image sha256:01be975d7ac4f16e3dce41a98088a6ea5e7865fd6ab08f40960e1312f8f0f63c 0.0s
 => => naming to docker.io/library/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b 0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview
flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b built locally.
```

Next, tag the image and push to my local Docker image registry:

```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator → docker tag flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b localhost:56230/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator → docker push localhost:56230/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b
The push refers to repository [localhost:56230/flinkk8soperator]
1b411adf446d: Pushed
d2d9d24a8c2a: Layer already exists
ed05140: digest: sha256:3e203d359db91d5ea61b0bbc47b28f2b5bfc79505552b7ed07663f8b8abff94a size: 739
```

#### 2. Deploy the Operator YAMLs

Update the `deploy/flinkk8soperator.yaml` file to use the image tag and name of my local image registry:

```
image: flink-operator-dev-registry:56230/flinkk8soperator:ed051403f4b52fd3e9c0ee429a2e476b74356e4b
```

Apply all the YAMLs in `deploy` using the [Quick Start Guide](https://github.com/lyft/flinkk8soperator/blob/master/docs/quick-start-guide.md#operator-installation).

#### 3. Build the Word Count app

```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount → docker build . -t flink-wordcount:latest
[+] Building 9.4s (15/15) FINISHED docker:desktop-linux
 => [internal] load .dockerignore 0.0s
 => => transferring context: 2B 0.0s
 => [internal] load build definition from Dockerfile 0.0s
 => => transferring dockerfile: 362B 0.0s
 => [internal] load metadata for docker.io/library/flink:1.16.2-scala_2.12-java11 1.0s
 => [internal] load metadata for docker.io/library/maven:3.9.3 1.0s
 => [auth] library/flink:pull token for registry-1.docker.io 0.0s
 => [auth] library/maven:pull token for registry-1.docker.io 0.0s
 => [builder 1/4] FROM docker.io/library/maven:3.9.3@sha256:de70becda2f183567e105530460b6cbedfc014b64f27c2003d36d7fc85bc222f 0.0s
 => [internal] load build context 0.0s
 => => transferring context: 1.74kB 0.0s
 => CACHED [stage-1 1/3] FROM docker.io/library/flink:1.16.2-scala_2.12-java11@sha256:f11b1cd44c964cd644d0ee6b01281f3c5a57f2c7a519939c9529e6cbbd96f615 0.0s
 => CACHED [builder 2/4] COPY src /usr/src/app/src 0.0s
 => [builder 3/4] COPY pom.xml /usr/src/app 0.0s
 => [builder 4/4] RUN mvn -f /usr/src/app/pom.xml clean package 8.2s
 => [stage-1 2/3] COPY --from=builder /usr/src/app/target/ /code/target 0.0s
 => [stage-1 3/3] RUN ln -s /code/target /opt/flink/flink-web-upload 0.1s
 => exporting to image 0.0s
 => => exporting layers 0.0s
 => => writing image sha256:b1615553beb1c657957d2825d7302b472961abe93e63ce87fcb8658b5525b18e 0.0s
 => => naming to docker.io/library/flink-wordcount:latest 0.0s
```

Tag and push the image to my local registry:

```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount → docker tag flink-wordcount:latest localhost:56230/flink-wordcount:latest
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount → docker push localhost:56230/flink-wordcount:latest
The push refers to repository [localhost:56230/flink-wordcount]
662da9bda804: Pushed
103047c74d48: Pushed
18c49fee7a8c: Pushed
67376cc00cc7: Pushed
63bd38732135: Pushed
a70c997c3eb4: Pushed
f087e5bef299: Layer already exists
d493a7b98dcf: Layer already exists
4fb17f51b932: Layer already exists
5a04c4a9f49c: Layer already exists
4b97ee001b98: Layer already exists
22236f4b6e33: Layer already exists
1b9698f962dc: Layer already exists
latest: digest: sha256:bf33ff2f8b9e2fb23ee9845744a174a94775b308bc34884c1335295c17c3574f size: 3040
```

#### 4. Deploy Word Count

Update the `examples/wordcount/flink-operator-custom-resource.yaml` file to reference the new image with:

```
image: flink-operator-dev-registry:56230/flink-wordcount:latest
imagePullPolicy: Always
```

```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount → kubectl get FlinkApplication
No resources found in flink-operator namespace.
```

Apply the YAML:

```
scott /Users/scott/go/src/github.com/lyft/flinkk8soperator/examples/wordcount → kc apply -f flink-operator-custom-resource.yaml
flinkapplication.flink.k8s.io/wordcount-operator-example created
```

Monitor the Flink Application:

```
→ kubectl get FlinkApplication -w
NAME PHASE CLUSTER HEALTH JOB HEALTH JOB RESTARTS AGE
wordcount-operator-example ClusterStarting 29s
wordcount-operator-example Savepointing 49s
wordcount-operator-example SubmittingJob 49s
wordcount-operator-example SubmittingJob 49s
wordcount-operator-example SubmittingJob Green 49s
```

---------

Co-authored-by: Scott Kidder <[email protected]>
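
For reviewers, the effect of change 3 (the `omitempty` tags on `SubmitJobRequest`) can be illustrated with a small standalone sketch. The two struct types below are hypothetical stand-ins, not the operator's actual definitions; the field names are assumed from the request body documented for Flink's `/jars/:jarid/run` endpoint. It only shows how `encoding/json` drops zero-valued fields once `omitempty` is set, so an unset savepoint path is no longer sent as an empty string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// legacyRequest is a hypothetical "before" shape: every field is always serialized,
// so an unset SavepointPath is sent as "savepointPath":"".
type legacyRequest struct {
	SavepointPath string `json:"savepointPath"`
	Parallelism   int32  `json:"parallelism"`
	EntryClass    string `json:"entryClass"`
}

// submitJobRequest is a hypothetical "after" shape: zero-valued fields are omitted.
type submitJobRequest struct {
	SavepointPath string `json:"savepointPath,omitempty"`
	Parallelism   int32  `json:"parallelism,omitempty"`
	EntryClass    string `json:"entryClass,omitempty"`
}

// encode marshals v to a JSON string and panics on error (acceptable for a sketch).
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// No savepoint configured: only the populated field should reach Flink.
	fmt.Println(encode(legacyRequest{Parallelism: 2}))
	fmt.Println(encode(submitJobRequest{Parallelism: 2}))
}
```

One caveat of this approach: `omitempty` also drops legitimate zero values (for example `false` or `0`), which is acceptable here only because the Flink REST API treats these fields as optional.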