Automated Aerospike Certification Tool (ACT) testing on AWS and GCP.
Packer is used to create machine images with ACT binaries installed and supporting systemd units to automate running tests.
Terraform is used to provision the cloud resources and apply specific test parameters.
- Packer builds machine images
- Terraform provisions instances based on configured test parameters
- The
cloud-init
configurations:- Partition the volumes (optional)
- Run
act_prep
on the devices (optional) - Copy test-specific parameters and configuration to the instance
- The
systemd
units:- Capture basic system information
- Execute the test with either
act_storage
oract_index
- Capture
iostat
output during the test run - Upload results and logs to object storage (optional)
- Shutdown the instance (optional)
The quick start example will spin up a single m5d.large
EC2 instance in the
us-west-2
AWS region and automatically run a default 1X ACT test for 24 hours.
Before you begin you must
setup AWS credentials
and create an EC2 Key Pair
in the us-west-2
AWS region.
Navigate to the aws-quick-start
example module:
cd terraform/examples/aws-quick-start
View terraform/examples/aws-quick-start/config/act_storage.conf
. This is the
ACT configuration file that will be copied onto each instance. It is documented
on the ACT Github repo.
View terraform/examples/aws-quick-start/config/main.tf
. This is the main
Terraform configuration. The act_simple
module is where the variables are
set that determine the instance type and device configuration for the test. To
see a description of all the available variables see
terraform/modules/aws-aerospike-act/variables.tf
.
Initialize Terraform:
terraform init
Apply the Terraform plan, passing your EC2 key pair name as a variable so that you will be able to SSH into the instance after it has been provisioned.
terraform apply -var aws_ec2_key_pair=<YOUR EC2 KEY PAIR NAME>
Type yes
when prompted.
The output will include the SSH command you can use to login to the instance:
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
Outputs:
device_names = /dev/nvme1n1
ssh_logins = {
"ACT simple (m5d.large #1)" = "ssh [email protected]"
}
SSH into the instance and verify the cloud-init
script is running act_prep
on the local SSD device by tailing /var/log/cloud-init-output.log
as sudo:
$ sudo tail -f /var/log/cloud-init-output.log
This log will include the act_prep
output which will take a while to run:
Cloud-init v. 19.3-3.amzn2 running 'modules:final' at Sun, 12 Jul 2020 12:23:37 +0000. Up 11.56 seconds.
Leaving devices unpartitioned: /dev/nvme1n1
Running act_prep on devices: /dev/nvme1n1
/dev/nvme1n1 size = 75000000000 bytes, 572204 large blocks
cleaning device /dev/nvme1n1
.....................................................................................................
salting device /dev/nvme1n1
.....................................................................................................
Created symlink from /etc/systemd/system/cloud-init.target.wants/aerospike-act.service to /etc/systemd/system/aerospike-act.service.
Cloud-init v. 19.3-3.amzn2 finished at Sun, 12 Jul 2020 12:59:42 +0000. Datasource DataSourceEc2. Up 2176.80 seconds
When the cloud-init
process is complete the aerospike-act
service will start
automatically. Check the status with:
sudo systemctl status aerospike-act
The service runs a bunch of scripts before running the main act executable so the output is a bit messy, but, the service should be "Loaded" and "Active":
● aerospike-act.service - Aerospike ACT
Loaded: loaded (/etc/systemd/system/aerospike-act.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2020-07-12 12:59:42 UTC; 6min ago
Process: 2961 ExecStartPre=/bin/bash -c { set -x && cat /etc/system-release && lscpu && lsblk && ifconfig; } >> /var/log/act/$ACT_TEST/sysinfo.txt 2>&1 (code=exited, status=0/SUCCESS)
Process: 2952 ExecStartPre=/bin/bash -c { echo "AWS Instance ID: $(wget -T 2 -q -O - http://169.254.169.254/latest/meta-data/instance-id)" && echo "AWS Instance Type: $(wget -T 2 -q -O - http://169.254.169.254/latest/meta-data/instance-type)" && echo "AWS AMI ID: $(wget -T 2 -q -O - http://169.254.169.254/latest/meta-data/ami-id)"; } >> /var/log/act/$ACT_TEST/sysinfo.txt 2>&1 (code=exited, status=0/SUCCESS)
Process: 2947 ExecStartPre=/bin/bash -c echo "Creating /var/log/act/$ACT_TEST" && /bin/mkdir -p /var/log/act/$ACT_TEST && touch /var/log/act/$ACT_TEST/sysinfo.txt (code=exited, status=0/SUCCESS)
Main PID: 2968 (bash)
CGroup: /system.slice/aerospike-act.service
├─2968 /bin/bash -c echo "Running $ACT_CMD" && /usr/sbin/$ACT_CMD $ACT_CONFIG > /var/log/act/$ACT_TEST/$ACT_CMD.stdout.txt 2> /var/log/act/$ACT_TEST/...
└─2970 /usr/sbin/act_storage /opt/act/act_storage.conf
If for any reason the service failed, you can view the logs with
journalctl -u aerospike-act
. For example, to tail the last 10 lines:
journalctl -u aerospike-act -n 10
As the test is going to run for 24 hours, it should show "Running act_storage" as the last log entry:
-- Logs begin at Sun 2020-07-12 11:40:00 UTC, end at Sun 2020-07-12 13:07:18 UTC. --
Jul 12 12:59:42 ip-172-31-45-102.us-west-2.compute.internal systemd[1]: Starting Aerospike ACT...
Jul 12 12:59:42 ip-172-31-45-102.us-west-2.compute.internal bash[2947]: Creating /var/log/act/quick_start
Jul 12 12:59:42 ip-172-31-45-102.us-west-2.compute.internal systemd[1]: Started Aerospike ACT.
Jul 12 12:59:42 ip-172-31-45-102.us-west-2.compute.internal bash[2968]: Running act_storage
Tail the output of ACT:
tail -f /var/log/act/*/act_storage.stdout.txt
Tail the output of iostat
which is also setup to run periodically and log it's
output during the test:
tail -f /var/log/act/*/iostat.stdout.txt
View a latency report of the test output thus far:
act_latency -l /var/log/act/*/act_storage.stdout.txt -h reads -h large-block-writes -n 3 -e 3 -t 60
When you are done testing, tear down the environment:
terraform destroy
Navigate into the packer/
directory:
cd packer
Build the act-aws.json
template:
packer build act-aws.json
To build in a different region:
packer build -var region=us-east-1 act-aws.json
To build a specific version of ACT specify the git ref and ACT version strings:
ACT 6.3
packer build -var act_version=6.3 -var act_git_ref=bb9b87b <build_template>
ACT 6.2
packer build -var act_version=6.2 -var act_git_ref=9ad31ad <build_template>
ACT 6.1
packer build -var act_version=6.1 -var act_git_ref=ed2584b <build_template>
ACT 6.0
packer build -var act_version=6.0 -var act_git_ref=5637286 <build_template>
ACT 5.3
packer build -var act_version=5.3 -var act_git_ref=7df031f <build_template>
ACT 5.2
packer build -var act_version=5.2 -var act_git_ref=0ea97dc <build_template>
ACT 5.1
packer build -var act_version=5.1 -var act_git_ref=db9961f <build_template>
ACT 5.0
packer build -var act_version=5.0 -var act_git_ref=2cca411 <build_template>
The test will be run automatically when auto_start=true
. However, the test can
be run manually by setting auto_start=false
and either invoking the systemd
units directly or running the ACT binaries directly.
The ACT source code is cloned to /home/ec2-user/act
and the binaries are
installed to /usr/sbin
. Run with:
sudo act_prep ...
sudo act_storage ...
sudo act_index ...
sudo act_latency ...
The ACT config file can be found in /opt/act/
Use systemctl
to start|stop|restart|status
the aerospike-act
service.
View logs with journalctl -u aerospike-act
.
The aerospike-act
service uses environment variables that were installed by
cloud-init
to /opt/act/environment
. They can be manually altered before
(re)starting the service.
The service is setup to store it's output at /var/log/act/<test-name>
.
The systemd
unit files are setup to run after cloud-init
completes. If the
devices are being prepped with act_prep
at boot, which is the default, then
the cloud-init
process can take a while (up to an hour).
To see if cloud-init
is still running:
$ sudo cloud-init status
status: running
To view output from the cloud-init
process, including partitioning and running
act_prep
on the devices, see the cloud-init-output.log
:
$ tail -f /var/log/cloud-init-output.log