Before you begin, you need gcloud, kubectl, and helm installed, plus a Google Cloud project in which you are authenticated.
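A quick preflight check can catch a missing tool before any cloud resources are created. The following is a minimal sketch; the function name `missing_tools` is hypothetical and not part of this repository.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report anything that still has to be installed (prints nothing if all are present).
missing=$(missing_tools gcloud kubectl helm)
[ -z "$missing" ] || echo "Install before continuing: $missing" >&2
```

Once all three are present, `gcloud auth list` and `gcloud config get-value project` show which account and project are active.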
- Create a Kubernetes cluster and get the credentials
export CLUSTER_NAME=maken-cluster
export CLUSTER_ZONE=europe-north1-a
gcloud container clusters create $CLUSTER_NAME \
--zone $CLUSTER_ZONE \
--node-locations $CLUSTER_ZONE \
--num-nodes 2 \
--enable-autoscaling --min-nodes 2 --max-nodes 5
gcloud container clusters get-credentials $CLUSTER_NAME --zone $CLUSTER_ZONE
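After fetching credentials, it can be worth confirming that kubectl really points at the new cluster, especially if you work with several. The sketch below relies on the `gke_<project>_<location>_<cluster>` naming that `get-credentials` uses for kubeconfig contexts; the function names are hypothetical.

```shell
# Build the kubeconfig context name that get-credentials creates:
# gke_<project>_<location>_<cluster>
expected_context() {
  printf 'gke_%s_%s_%s' "$1" "$2" "$3"
}

# Succeeds only when kubectl's current context is the given cluster.
context_matches() {
  [ "$(kubectl config current-context)" = "$(expected_context "$@")" ]
}
```

For example: `context_matches "$(gcloud config get-value project)" "$CLUSTER_ZONE" "$CLUSTER_NAME" || echo "kubectl points at a different cluster"`.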
- A DaemonSet is needed to ensure that every cluster node sets a proper value for vm.max_map_count (at least 262144), which OpenDistro requires in order to work.
kubectl apply -f max_map_count.yaml
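The contents of max_map_count.yaml are not shown here. As a rough illustration of the technique only, the sketch below prints a DaemonSet that raises the setting from a privileged init container and then idles. The resource name, labels, and container images are assumptions and may differ from the actual file.

```shell
# Print an illustrative DaemonSet manifest (NOT the repository's max_map_count.yaml).
# All names and images below are assumptions chosen for the example.
print_max_map_count_daemonset() {
  cat <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: max-map-count-setter
spec:
  selector:
    matchLabels:
      app: max-map-count-setter
  template:
    metadata:
      labels:
        app: max-map-count-setter
    spec:
      initContainers:
        # vm.max_map_count is node-wide, so a privileged container can set it.
        - name: sysctl
          image: busybox:1.36
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        # Idle container that keeps the pod running on every node.
        - name: pause
          image: registry.k8s.io/pause:3.9
EOF
}
```

The output could be inspected first and then piped to `kubectl apply -f -`. Because a DaemonSet schedules one pod per node, nodes added later by the autoscaler receive the setting as well.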
- Provision a node pool with at least 2 nodes, each with 32 GB RAM and 8 CPUs (e2-standard-8; n2-standard-16 is recommended for 1M vectors in 2 shards)
gcloud container node-pools create "$CLUSTER_NAME-pool" --cluster $CLUSTER_NAME \
--machine-type e2-standard-8 --num-nodes 2 \
--disk-size 500G --disk-type pd-ssd \
--enable-autoscaling --min-nodes 2 --max-nodes 5 \
--zone $CLUSTER_ZONE
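Before removing the default pool, it may help to wait until the nodes of the new pool are Ready so that workloads have somewhere to go. A minimal sketch, assuming the pool is named `$CLUSTER_NAME-pool` as above; the function names and the 600-second timeout are arbitrary choices.

```shell
# Label selector that GKE attaches to every node of "<cluster>-pool".
pool_selector() {
  printf 'cloud.google.com/gke-nodepool=%s-pool' "$1"
}

# Block until all nodes of the pool report the Ready condition.
wait_for_pool() {
  kubectl wait --for=condition=Ready nodes \
    --selector "$(pool_selector "$1")" --timeout=600s
}
```

Call it as `wait_for_pool "$CLUSTER_NAME"`.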
Optionally, delete the default node pool:
gcloud container node-pools delete default-pool --cluster $CLUSTER_NAME --zone $CLUSTER_ZONE
- Deploy OpenSearch with the custom values.yaml.
helm install maken-index --values=values.yaml opensearch/opensearch
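The chart reference `opensearch/opensearch` only resolves if a Helm repository named `opensearch` has been added locally. If it has not, something like the following sketch should work. The function names are hypothetical, and the StatefulSet name is an assumption based on the chart defaults that values.yaml may override.

```shell
# Register the OpenSearch chart repository and refresh the local chart index.
add_opensearch_repo() {
  helm repo add opensearch https://opensearch-project.github.io/helm-charts/
  helm repo update
}

# Block until the StatefulSet created by the release has rolled out.
# The default name is an assumption; pass another name if values.yaml changes it.
wait_for_opensearch() {
  kubectl rollout status "statefulset/${1:-opensearch-cluster-master}" --timeout=600s
}
```

Run `add_opensearch_repo` before `helm install`, and `wait_for_opensearch` afterwards to see when the pods are up.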
- Create a load balancer and get the IP (Ctrl-C when ready)
kubectl apply -f load_balancer.yaml
kubectl get service/maken-index-service --output jsonpath='{.status.loadBalancer.ingress[0].ip}' --watch
To expose OpenDistro outside the cluster, set the property networking.gke.io/load-balancer-type in load_balancer.yaml to "External" and redeploy the service. In that case, it is strongly recommended to add loadBalancerSourceRanges to restrict access to known IP ranges.
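To illustrate what loadBalancerSourceRanges looks like, the sketch below builds a JSON patch for the service from a list of CIDR blocks. The function name is hypothetical and the CIDRs in the example are documentation-only addresses. Editing load_balancer.yaml remains the durable way to set the field, since a later `kubectl apply` of a file without it may revert the patch.

```shell
# Build a JSON patch that restricts the load balancer to the given CIDR blocks.
source_ranges_patch() {
  ranges=""
  for cidr in "$@"; do
    # Append each CIDR as a quoted JSON string, comma-separated.
    ranges="${ranges:+$ranges,}\"$cidr\""
  done
  printf '{"spec":{"loadBalancerSourceRanges":[%s]}}' "$ranges"
}

# Example (placeholder range, replace with your own):
# kubectl patch service maken-index-service \
#   --patch "$(source_ranges_patch 203.0.113.0/24)"
```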
- Wait until the internal load balancer is available and then create a config map.
export CLUSTER_INDEX_IP=$(kubectl get service/maken-index-service --output jsonpath='{.status.loadBalancer.ingress[0].ip}')
kubectl create configmap maken-index-api-config \
--from-literal ES_HOST=$CLUSTER_INDEX_IP \
--from-literal ES_SCHEME=https
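If the IP has not been assigned yet, `CLUSTER_INDEX_IP` ends up empty and the config map is created with a blank `ES_HOST`. The waiting can be scripted instead of watched by hand. A minimal polling sketch follows; the function name, the attempt count, and the `POLL_INTERVAL` variable are assumptions for illustration.

```shell
# Poll a service until it reports a load balancer IP, then print that IP.
# $1: service name, $2: maximum attempts (default 60).
# POLL_INTERVAL: seconds between attempts (default 5).
wait_for_lb_ip() {
  service=$1
  attempts=${2:-60}
  while [ "$attempts" -gt 0 ]; do
    ip=$(kubectl get "service/$service" \
      --output jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null) || ip=""
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  return 1  # no IP was assigned within the allowed attempts
}
```

For example, `export CLUSTER_INDEX_IP=$(wait_for_lb_ip maken-index-service)` could replace the plain `kubectl get` when scripting the setup.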
- Deploy the API and get the IP (Ctrl-C when ready)
kubectl apply -f api.yaml
kubectl get service/maken-api-service --output jsonpath='{.status.loadBalancer.ingress[0].ip}' --watch
To manage the index and ingest data, forward a local port to the index service. The commands below refresh the cluster credentials and then forward local port 8080 to port 9200 of the first OpenSearch master pod.
gcloud container clusters get-credentials $CLUSTER_NAME --zone $CLUSTER_ZONE
kubectl port-forward $(kubectl get pod --selector="app.kubernetes.io/component=opensearch-cluster-master,app.kubernetes.io/instance=maken-index,app.kubernetes.io/name=opensearch" --output jsonpath='{.items[0].metadata.name}') 8080:9200
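With the port-forward running, a cluster health request from a second terminal is a simple way to confirm that the index is reachable. This is a sketch under assumptions: the `OPENSEARCH_USER` and `OPENSEARCH_PASSWORD` variable names are hypothetical, and `--insecure` assumes a self-signed certificate, which may not match your values.yaml.

```shell
# Query cluster health through the local port-forward.
# $1: local port (default 8080, matching the port-forward above).
# Credentials come from OPENSEARCH_USER / OPENSEARCH_PASSWORD (assumed names).
cluster_health() {
  curl --silent --insecure \
    --user "${OPENSEARCH_USER:?set OPENSEARCH_USER}:${OPENSEARCH_PASSWORD:?set OPENSEARCH_PASSWORD}" \
    "https://localhost:${1:-8080}/_cluster/health?pretty"
}
```

A `status` of `green` or `yellow` in the response indicates that the cluster is accepting requests.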