The playbooks in this repo allow you to configurably set up, on AWS
- An Aerospike cluster
- Aerospike java benchmarking clients
- The Aerospike Prometheus/Grafana monitoring stack
This includes setup of all the necessary AWS VPC infrastructure.
Install boto - the python wrapper for the AWS SDK. You may run into compatibility problems between python versions and boto versions. I made use of this tweak and am running under a virtualenv. If you set up your own virtualenv and install boto into it you should be OK, as I've baked the other necessary settings into this repo.
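For example, a minimal virtualenv setup might look like this ( a sketch only - versions aren't pinned here, and whether you need boto, boto3 or both depends on your Ansible version ):

python3 -m venv ansible-env
source ansible-env/bin/activate
pip install ansible boto boto3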
You need your AWS credentials on disk as per https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html. The user you use will need the AmazonEC2FullAccess role.
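If you don't already have a credentials file, a minimal one looks like the following ( the values are placeholders ):

mkdir -p ~/.aws
cat > ~/.aws/credentials <<'EOF'
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
EOF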
The SSH change described in the SSH section is recommended.
A full list of commands to set up on a linux platform is given in linux-commands-for-setup.md
For macOS install see macos-commands-for-setup.md
Once you have done all this
ansible-playbook aws-setup-plus-aerospike-install.yml
will set up an Aerospike cluster on AWS that you can access from your own machine. By default this will be 3 nodes, one in each of 3 separate AZs, of type c5d.large, with the ephemeral disk partitioned 4 ways.
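The defaults can be overridden at setup time using --extra-vars, described below - for example, to build a larger cluster ( values here are illustrative ):

ansible-playbook aws-setup-plus-aerospike-install.yml --extra-vars="{'cluster_hosts_per_az':2,'cluster_instance_type':'c5d.2xlarge'}"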
ansible-playbook aerospike-java-client-setup.yml
will set up a configured java benchmarking client allowing immediate use (IPs taken care of)
ansible-playbook aerospike-monitoring-setup.yml
will set up the monitoring stack including Prometheus and Grafana. You can then simply access the exposed Grafana endpoint.
A video showing end to end setup of the full Aerospike cluster/client/monitoring stack on a fresh Vagrant instance is available.
The first 10 minutes deal with setting up the pre-requisites (virtualenv/ansible/boto/IAM), with the remainder showing use of the playbooks.
The pre-requisites listed above can be troublesome to install given changes in versions of the various requirements.
To that end, a containerised image containing all the pre-requisites can be built or pulled from DockerHub. See this README for details.
- ansible-playbook aws-setup.yml creates the AWS VPC infrastructure and the cluster instances
- ansible-playbook install-aerospike.yml just does the Aerospike install (aws-setup-plus-aerospike-install.yml simply calls aws-setup.yml followed by install-aerospike.yml)
- ansible-playbook remove-aerospike.yml will remove the Aerospike install (you can use this + install-aerospike.yml to change your install without having to rebuild the instances)
- aws-teardown.yml removes all AWS hosts and VPC components
- reload-config.yml will upload a new aerospike.conf to all hosts and restart - useful if iteratively creating a new conf file.
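For example, a typical iteration when experimenting with install options ( the version shown is illustrative ) is:

ansible-playbook remove-aerospike.yml
ansible-playbook install-aerospike.yml --extra-vars="{'aerospike_version':'4.8.0.3'}"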
Everything you are likely to want to change can be found in vars/cluster-config.yml
- cluster_identifier - default = aerospike - the prefix used to identify all the AWS assets. By changing this you can run multiple clusters
- cluster_instance_type - default = c5d.large
- cluster_hosts_per_az - default = 1
- partitions_per_device - default = 4 ( nvme volumes will automatically be partitioned for you )
- enterprise - default = false to allow single click setup of Community. If true, a feature key location must be specified ( see below )
- encryption_at_rest - default = false. If true, aerospike.conf will be appropriately modified and a key file generated
- tls_enabled - default = false. If true, aerospike.conf will be appropriately modified, all certificates appropriately located, and connecting clients appropriately configured
- strong_consistency - default = false. If true, aerospike.conf will be appropriately modified. The roster will be automatically set for you, with rack awareness, assuming each subnet constitutes a separate 'rack'
- all_flash - default = false. If true, the first partition on each disk, i.e. (1 / partitions_per_device) of the available space, will be allocated for index on device. Both this and partition-tree-sprigs will usually require custom setting. For accurate sizing consult all flash sizing. You can also consult Automated All Flash Setup
- monitoring_enabled - default = false. If true the Aerospike Prometheus agent will be installed, configured and started on the cluster nodes.
- kafka_enabled - default = false. If true, install Kafka Connect on each Aerospike node and configure the cluster so that it is correctly linked to Kafka Connect.
- aerospike_distribution - default = el6. Determines the distribution used.
- aerospike_version - default = latest
- aerospike_tools_version - default = 7.0.3. Determines the version of Aerospike Tools installed on the client machine
- ami_locator_string - the latest version of the AMZN2 AMI is used ( dynamically looked up). Other builds can be used by modifying this string.
- replication_factor - default = 2
- aerospike_mem_pct - fraction of available memory to allocate to the 'test' namespace. Default = 80%
- feature_key - path for an Enterprise feature key. Undefined by default so the setup works out of the box.
- partition_tree_sprigs - partition tree sprig count - used if defined (undefined by default). See Automated All Flash Setup for more detail
- client_instance_type - instance type used for the Aerospike java client - defaults to cluster_instance_type
- aerospike_client_per_az_count - clients per az in client_az_list
- monitoring_instance_type - instance type used for monitoring instance - defaults to cluster_instance_type
- spark_instance_type - instance type used for Spark workers - defaults to cluster_instance_type
- spark_worker_per_az_count - spark workers per az in cluster_az_list
On the AWS side you can modify via vars/aws-config.yml
- aws_region - default = us-east-1
- cluster_az_list - default = [a,b,d] - c can be a little flaky
- client_az_list - default is first az in the cluster az list
- use_ipify - default = true. The ipify service is used to determine your public IP. It can be unreliable. If you get errors relating to the ipify_facts task, set use_ipify to false and set public_port_access_cidr to a mask that includes your IP, e.g. <your_address>/32 or 0.0.0.0/0 ( matches everything )
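For example, using the --extra-vars mechanism described below ( the address here is a placeholder - substitute your own ):

ansible-playbook aws-setup-plus-aerospike-install.yml --extra-vars="{'use_ipify':false,'public_port_access_cidr':'203.0.113.7/32'}"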
All the configuration options above can be modified via the command line using the --extra-vars
option and a JSON formatted argument. e.g.
ansible-playbook aws-setup-plus-aerospike-install.yml --extra-vars="{'aerospike_version':'4.8.0.3','cluster_instance_type':'c5d.2xlarge'}"
Alternatively, vars/cluster-config.yml
can be modified.
To use Enterprise, a feature key argument must be supplied, as well as setting enterprise = true
ansible-playbook aws-setup-plus-aerospike-install.yml --extra-vars="{'enterprise':true,'feature_key':'/path/to/my/features.conf'}"
The template in assets/aerospike.conf.j2 has the IP addresses of the hosts and the device names injected. If you wish to use a different configuration, edit this file.
- Dedicated SSH Key
- VPC
- Subnets
- Routing
- Security Group
- Selects most recent AMI in a given category ( e.g. Amazon Linux 2 )
- Creates instances using selected AMI
- Creates a local 'quick access' script ( lets you get into your cluster via scripts/cluster-quick-ssh.sh 1/2/3 etc )
- Partitions nvme volumes ( # of partitions per disk is configurable )
- Takes an Aerospike configuration file ( editable ) and injects AWS local IP addresses of instances for discovery purposes
- Adds partitioned volumes as devices to namespace
- Allows choice of Enterprise / Community
- Sources features.conf
- Installs Aerospike ( version / distribution can be specified )
- Starts Aerospike
As above, a script is created allowing ready access to the cluster
./scripts/cluster-quick-ssh.sh 1
will get you into node 1 in the cluster, and so on. This avoids tedious copying of IP addresses. Note that this uses the key generated by the playbook, stored locally as <cluster_identifier>.aws.pem - your own keys are not used.
source scripts/ip-address-list.sh
allows IP addresses to be referenced at the command line, for example
echo ${AERO_CLUSTER_IPS[0]}
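This makes it easy to script against the whole cluster. A quick sketch, assuming the default cluster_identifier of aerospike ( so the key is aerospike.aws.pem ) and the default Amazon Linux 2 ec2-user account:

source scripts/ip-address-list.sh
for ip in "${AERO_CLUSTER_IPS[@]}"; do ssh -i aerospike.aws.pem ec2-user@"$ip" hostname; done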
./scripts/client-quick-ssh.sh 1
logs you into your client instance
cd aerospike-client-java/benchmarks
./as-benchmark-w.sh
will load 10m keys into your cluster
./as-benchmark-rw.sh
will run a 50/50 workload
Tuning parameters such as rate, key set size, read/write workload proportion, thread count and object spec can all be set via ./as-benchmark-common.sh
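The scripts wrap the standard Aerospike Java benchmarks tool, so as a rough sketch, the kind of invocation being configured looks like this ( flag values are illustrative - check as-benchmark-common.sh for the variables the scripts actually use ):

./run_benchmarks -h <cluster_ip> -p 3000 -n test -k 10000000 -o S:100 -w RU,50 -z 16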
Necessary TLS configuration including installation of a CA and use of correct flags will be automatically configured if tls_enabled is set to true.
At the end of the output for ansible-playbook aerospike-monitoring-setup.yml
you will see the message
Grafana dashboard available at http://<IP>:4000
Copy and paste this into your browser. User/Pass is admin/admin. Changing the password is recommended.
Select Home -> Aerospike -> Namespace View to see your first dashboard.
Follow the instructions in Using the benchmarking client to generate read/write activity that you can watch
Note that the Grafana and Prometheus ports (4000 & 9090) are locked to 'your' IP address. If you want to lock to a different address range, uncomment public_port_access_cidr
in vars/aws-config.yml
and change to the required range.
In the recipe section are some assets supporting one-touch rolling upgrades and cluster moves, as used in my Summit 2020 talk. Watch this space for full scripts.
These scripts can be used to create a full Aerospike stack in GCP, although the Ansible tooling doesn't allow easy creation of GCP instances.
Start by creating a host in GCP to act as your Ansible host and do the setup described in Quick Start.
Make use of the gcp branch of this repo, as some small tweaks were needed to get things to work.
Then create your cluster/client/monitoring instances, maybe using VM templates (a GCP thing) to ensure consistency, give them names, and then add the host names to the inventory/hosts file. Examples of what to do are given in the inventory/hosts file in the gcp branch.
TLS-enabled Aerospike is built using pre-built key pairs, which are exposed in this project - see private. These keys are not to be used for production purposes. Instructions in certificates tell you how to create your own, which can be used to replace the ones provided.
To use aql with TLS enabled
aql --tls-enable --tls-name=aerospike_ansible_demo_cluster --tls-cafile=/etc/aerospike/certs/ca.crt -p 4333
Similarly, for asadm
asadm --tls-enable --tls-name=aerospike_ansible_demo_cluster --tls-cafile=/etc/aerospike/certs/ca.crt -p 4333
and asinfo
asinfo --tls-enable --tls-name=aerospike_ansible_demo_cluster --tls-cafile=/etc/aerospike/certs/ca.crt -p 4333 <YOUR_COMMAND_HERE>
ansible-playbook spark-cluster-setup.yml
will create a Spark cluster, enabled with Aerospike Spark Connect. The playbook sets up spark_worker_per_az_count instances of type spark_instance_type in each of the cluster_az_list availability zones.
The following can be set in vars/spark-vars.yml
- scala_version
- spark_version
- hadoop_version
- aerospike_spark_connect_version
Note these will change over time. spark_version in particular will need modification when the current Spark version changes (else the Spark download will fail).
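If the download does fail, the version can be overridden at the command line without editing the file ( the version shown is illustrative ):

ansible-playbook spark-cluster-setup.yml --extra-vars="{'spark_version':'3.0.1'}"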
At Aerospike Connect for Spark you can find an article going through this setup process in detail, including a full, at-scale example. It's a 5-minute read.
Note that the Spark web ports (8080 & 8081) are locked to 'your' IP address. If you want to lock to a different address range, uncomment public_port_access_cidr
in vars/aws-config.yml
and change to the required range.
ansible-playbook kafka-cluster-setup.yml
will create a Kafka cluster and will configure and install Aerospike Kafka Connect. The playbook sets up kafka_worker_per_az_count instances of type kafka_instance_type in each of the cluster_az_list availability zones.
The following can be set in vars/kafka-vars.yml
- kafka_version
- kafka_connect_product_version
- default_kafka_topic (set to aerospike by default)
You need to have the following Ansible roles installed - sleighzy.zookeeper & sleighzy.kafka. To do this, run
ansible-galaxy install sleighzy.zookeeper sleighzy.kafka
In vars/cluster-config.yml
both Aerospike Enterprise and Kafka Connect must be enabled, so make sure you have the following set
kafka_connect_enabled: true
enterprise: true
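For example, these can be supplied at the command line when building the cluster ( the feature key path is a placeholder ):

ansible-playbook aws-setup-plus-aerospike-install.yml --extra-vars="{'enterprise':true,'feature_key':'/path/to/my/features.conf','kafka_connect_enabled':true}"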
To test, log into a Kafka host and watch the aerospike topic
./scripts/kafka-quick-ssh.sh
/opt/kafka/bin/kafka-console-consumer.sh --topic aerospike --bootstrap-server localhost:9092
Now log into an Aerospike host and insert a record
./scripts/cluster-quick-ssh.sh
aql
insert into test(PK,value) values(1,1)
You should see the following message from Kafka in the console consumer window
{"msg":"write","key":["test",null,"pEPwXQXZYiArWau0Aq+uFzfb9mo=",null],"gen":1,"exp":0,"lut":1636468874425,"bins":[{"name":"value","type":"int","value":1}]}
A recommended approach is to use the pre-built Ansible client container, particularly if there is any difficulty setting up the Ansible pre-requisites - see this README for details.
If you see
Received disconnect from 18.207.231.181 port 22:2: Too many authentication failures
or similar when using Ansible, try adding
IdentitiesOnly=yes to your .ssh/config file
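For example, a one-liner that applies this globally ( it appends to your existing config, so review first if you already manage that file ):

printf 'Host *\n  IdentitiesOnly yes\n' >> ~/.ssh/config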
Note that the ssh port (22) is locked to 'your' IP address. If you want to lock to a different address range, uncomment public_port_access_cidr
in vars/aws-config.yml
and change to the required range.
Disk partitioning relies on devices being named /dev/nvme* & we ignore nvme0 as this is usually the boot volume.
Dash is very handy for Ansible documentation
Please use the issues feature.