This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

autoscaling 3.3 design

Steve Jones edited this page Sep 6, 2017 · 4 revisions

Overview

This design covers our implementation of AWS Auto Scaling.

Tracking

Status   Draft
Updated  2012/12/19  Initial document, focus is preparation for Sprint 1
Updated  2013/01/28  Updated for Sprint 2 functionality (which was previously for Sprint 1)
Updated  2013/02/01  Updated with Sprint 2 final details
Updated  2013/04/03  Updated with Sprint 5 final details
Updated  2013/04/09  Added more developer-targeted design info

Out of Scope

Implementation of items related to features that we do not support is out of scope. These features are:

  • Placement groups
  • Spot Instances
  • SNS
  • VPC

Feature Dependencies

  • CW - Listing alarms associated with a scaling policy
  • EC2 - Verifying metadata, managing instances, tagging
  • ELB - Health checks, register/deregister instances

Related Features

This feature relates to the following features in this release:

  • EC2 - Tagging / Filtering
  • EC2 - Instance health checking
  • EC2 - Idempotent instance launch
  • CloudWatch
  • ElasticLoadBalancing
  • VM Types API

Analysis

Open Issues

  • Are we supporting "pagination" for describe actions?
  • Where are ARNs supported instead of names? (are ARNs supported for the policy name in PutScalingPolicy?)
  • What error occurs when setting desired capacity or executing policy is rejected due to cooldown?
  • Are only security group names accepted? (or can group identifiers be used as well?)
  • Is manual termination of instances permitted when termination is suspended?
  • Timing of adding to load balancer (is running "running"?, InService?)

Resolved Issues

  • What is the expected behaviour when the desired capacity is set to less than min or more than max? (fail - 400 - ValidationError - New SetDesiredCapacity value 2 is above max value 1 for the AutoScalingGroup.)
  • Should resource type/id be included in the tag information when describing groups with tags? (Yes)
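
The first resolved issue above (desired capacity outside the group bounds) can be illustrated with a minimal Python sketch. The ValidationError class and the function name are hypothetical, and the wording of the "below min" message is an assumption, since only the "above max" message is quoted in this document.

```python
class ValidationError(Exception):
    """Hypothetical stand-in for the service's 400 ValidationError."""
    status_code = 400


def check_desired_capacity(desired: int, min_size: int, max_size: int) -> None:
    """Reject a desired capacity outside the group's [min, max] bounds."""
    if desired > max_size:
        # Wording follows the message quoted in the resolved issue.
        raise ValidationError(
            f"New SetDesiredCapacity value {desired} is above max value "
            f"{max_size} for the AutoScalingGroup."
        )
    if desired < min_size:
        # Assumed wording: the design only quotes the "above max" case.
        raise ValidationError(
            f"New SetDesiredCapacity value {desired} is below min value "
            f"{min_size} for the AutoScalingGroup."
        )
```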

Design

This section provides design details relevant to developers.

Implementation Summary

DB                  eucalyptus_autoscaling
Source modules      scaling                 Contents under com.eucalyptus.scaling package
                    scaling-common          Contents under com.eucalyptus.scaling.common package
Service class       AutoScalingService
Component ID class  AutoScaling
Service type name   autoscaling
Internal URI        /internal/AutoScaling   Internal SOAP

Entities

The following entities will be added:

  • Launch configuration
  • Auto scaling group
  • Auto scaling instance
  • Auto scaling policy
  • Scaling activity
  • Tag / Auto scaling group tag

Service Impact

New SOAP / Query API user facing services are added.

Input Validation

The majority of input validation for the service occurs prior to routing to the service implementation. The AutoScalingMessageValidator class processes all messages (as configured in autoscaling-model.xml) prior to service invocation.

AutoScalingMessage exposes a validate method that returns a list of validation errors. Per-field validation is defined by annotations; validation that requires message context is performed by overriding the validate method. The annotations used are:

  • FieldRegex - A string field must match a provided regular expression
  • FieldRange - A numeric value or length of a collection must match the specified range

Annotations are defined in AutoScalingMessageValidation along with related regular expressions.
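
The annotation-driven approach can be sketched in Python as follows. This is an analogue only, not the Java implementation: the helper names (field_regex, field_range), the Message and UpdateGroup classes, the pattern and the bounds are all hypothetical. Field metadata plays the role of the FieldRegex and FieldRange annotations, and the subclass overrides validate for a check that needs message context.

```python
import re
from dataclasses import dataclass, field, fields
from typing import List


def field_regex(pattern: str) -> dict:
    """Metadata marking a string field that must fully match `pattern`."""
    return {"regex": re.compile(pattern)}


def field_range(minimum: int, maximum: int) -> dict:
    """Metadata marking a number (or collection length) bounded by a range."""
    return {"range": (minimum, maximum)}


@dataclass
class Message:
    def validate(self) -> List[str]:
        """Return a list of validation errors; an empty list means valid."""
        errors = []
        for f in fields(self):
            value = getattr(self, f.name)
            if value is None:
                continue
            regex = f.metadata.get("regex")
            if regex is not None and not regex.fullmatch(value):
                errors.append(f"{f.name}: '{value}' does not match {regex.pattern}")
            bounds = f.metadata.get("range")
            if bounds is not None:
                # Numbers are checked directly, collections by their length.
                size = value if isinstance(value, int) else len(value)
                if not bounds[0] <= size <= bounds[1]:
                    errors.append(f"{f.name}: {size} is outside {bounds[0]}..{bounds[1]}")
        return errors


@dataclass
class UpdateGroup(Message):
    # Hypothetical message; the pattern and bounds are illustrative only.
    name: str = field(default="", metadata=field_regex(r"[A-Za-z0-9_-]{1,255}"))
    min_size: int = field(default=0, metadata=field_range(0, 1000))
    max_size: int = field(default=0, metadata=field_range(0, 1000))

    def validate(self) -> List[str]:
        # Check requiring message context, added by overriding validate().
        errors = super().validate()
        if self.min_size > self.max_size:
            errors.append("min_size must not exceed max_size")
        return errors
```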

Validation requiring (cloud) contextual information is performed in the service implementation.

Service vs Activities

The service implementation is generally disconnected from the back-end scaling activities. The service implementation (AutoScalingService) is similar to other services and follows the Gen2ools patterns.

Activities are performed by ActivityManager, which allows a single task to be run for each auto scaling group, subject to exponential backoff on failures. On every clock tick (ten seconds by default) the activity manager can perform work on the following tasks:

  • Timeout - Timeout scaling activities that presumably failed without a state update
  • Expiry - Delete records for scaling activities older than the maximum retained age
  • ZoneHealth - Update auto scaling's record of which Availability Zones are up
  • Recovery - Resume activities for instances in unstable states (mid launch, etc)
  • Scaling - Perform auto scaling work (launch, terminate)
  • InstanceCleanup - Find and terminate unexpected auto scaling instances
  • MetricsSubmission - Submit auto scaling metrics to CloudWatch

The autoscaling.suspendedtasks property can be used to suspend any of the above tasks (e.g. "InstanceCleanup, MetricsSubmission" would suspend two tasks).
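
As an illustration of how a value of autoscaling.suspendedtasks could be interpreted, here is a small Python sketch. The function names are hypothetical, and the exact parsing rules (comma separated, surrounding whitespace ignored, case-sensitive names) are assumptions based on the example value above.

```python
TASKS = (
    "Timeout", "Expiry", "ZoneHealth", "Recovery",
    "Scaling", "InstanceCleanup", "MetricsSubmission",
)


def parse_suspended(value: str) -> set:
    """Split a comma-separated property value into a set of task names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def runnable_tasks(suspended_property: str) -> list:
    """Return the tasks that may still run, in their listed order."""
    suspended = parse_suspended(suspended_property)
    return [task for task in TASKS if task not in suspended]
```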

In addition to task suspension it is possible to suspend auto scaling processes for all groups using the autoscaling.suspendedprocesses property.
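
The per-group exponential backoff mentioned for ActivityManager can be sketched as below. This is a minimal Python illustration counted in clock ticks; the class name, the doubling rule and the cap of 64 ticks are assumptions and are not specified by this design.

```python
from dataclasses import dataclass

MAX_BACKOFF_TICKS = 64  # assumed cap; not specified in this design


@dataclass
class GroupBackoff:
    """Tracks when the next task for one auto scaling group may run."""
    failures: int = 0
    next_tick: int = 0

    def may_run(self, tick: int) -> bool:
        return tick >= self.next_tick

    def record(self, tick: int, success: bool) -> None:
        if success:
            # A success clears the backoff; the group is eligible next tick.
            self.failures = 0
            self.next_tick = tick + 1
        else:
            # Each consecutive failure doubles the delay, up to the cap.
            self.failures += 1
            self.next_tick = tick + min(2 ** self.failures, MAX_BACKOFF_TICKS)
```

With the default ten second tick, a first failure in this sketch would delay the group by two ticks (about twenty seconds), a second consecutive failure by four ticks, and so on.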

Tasks

ScalingProcessTask represents a task to run (which may be subject to backoff). Each task is composed of one or more ScalingActivityTasks. A ScalingActivityTask typically sends a message to another component and does something with the response.

The task framework runs all activities for a task concurrently.

Tasks are executed with an ActivityContext which provides clients to access other (possibly remote) components.
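
The relationship between a task, its activities and the context can be sketched in Python as follows. The class shapes are hypothetical analogues of ScalingActivityTask and ActivityContext, the clients are plain callables, and a thread pool stands in for whatever concurrency mechanism the task framework actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


class ActivityContext:
    """Hypothetical context handing out clients for other components."""

    def __init__(self, clients: Dict[str, Callable[[str], str]]):
        self._clients = clients

    def client(self, component: str) -> Callable[[str], str]:
        return self._clients[component]


class ScalingActivityTask:
    """Sends one message to a component and post-processes the response."""

    def __init__(self, component: str, message: str):
        self.component = component
        self.message = message

    def run(self, context: ActivityContext) -> str:
        response = context.client(self.component)(self.message)
        # Placeholder for "does something with the response".
        return response.upper()


def run_process_task(activities: List[ScalingActivityTask],
                     context: ActivityContext) -> List[str]:
    """Run all activities of one task concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(activities))) as pool:
        return list(pool.map(lambda activity: activity.run(context), activities))
```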

HA

The following items support operation in a high availability environment where there may be primary/secondary auto scaling components.

  • Auto scaling activities are only performed when the AutoScaling component is locally enabled.
  • In memory state is persisted to the database or can be recovered.
  • Activities that do not complete are timed out and will be retried.

Failover can degrade auto scaling performance with activities taking longer to complete than would otherwise be the case.
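
The combination of "only when locally enabled" and "timed out and retried" can be sketched as below. This is a Python illustration under assumptions: the Activity class, its status values and the function name are hypothetical, and the five minute timeout is the default of autoscaling.activitytimeout.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

ACTIVITY_TIMEOUT = timedelta(minutes=5)  # autoscaling.activitytimeout default


@dataclass
class Activity:
    activity_id: str
    started: datetime
    status: str = "InProgress"  # hypothetical status values


def time_out_stale(activities: List[Activity], now: datetime,
                   locally_enabled: bool) -> List[str]:
    """Mark in-progress activities older than the timeout as failed.

    Returns the identifiers that were timed out so they can be retried.
    Nothing is done unless the component is enabled on this host.
    """
    if not locally_enabled:
        return []
    timed_out = []
    for activity in activities:
        if activity.status == "InProgress" and now - activity.started > ACTIVITY_TIMEOUT:
            activity.status = "Failed"
            timed_out.append(activity.activity_id)
    return timed_out
```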

Testability

Auto scaling uses thin abstraction layers to facilitate mocking for test purposes:

  • DispatchingClient - Allows simulation of message exchanges
  • LaunchConfigurations / PersistenceLaunchConfigurations - Allows simulation of persistence

Unit tests are added for the service implementation and activity manager.
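
The idea behind the thin abstraction layers can be sketched in Python as follows. The send signature, the RecordingClient test double and the launch_one function are hypothetical; they only illustrate how code written against a DispatchingClient-like interface can be exercised without a real message exchange.

```python
from typing import Dict, List, Protocol


class DispatchingClient(Protocol):
    """Hypothetical Python analogue of the message-dispatch abstraction."""

    def send(self, action: str, params: Dict[str, str]) -> Dict[str, str]: ...


class RecordingClient:
    """Test double: records requests and replays canned responses."""

    def __init__(self, responses: Dict[str, Dict[str, str]]):
        self.responses = responses
        self.sent: List[str] = []

    def send(self, action: str, params: Dict[str, str]) -> Dict[str, str]:
        self.sent.append(action)
        return self.responses[action]


def launch_one(client: DispatchingClient, image_id: str) -> str:
    """Code under test: launches an instance and returns its identifier."""
    reply = client.send("RunInstances", {"ImageId": image_id})
    return reply["InstanceId"]
```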

Function

This section provides implementation details and descriptions of functionality beyond the core auto scaling features.

Service Summary

Version               2011-01-01             Version of AutoScaling API supported
Service URI           /services/AutoScaling  SOAP / Query API
Eucarc variable name  AWS_AUTO_SCALING_URL
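
To show how the values above fit together in a Query API request, here is a minimal Python sketch that builds an unsigned request URL. The function name is hypothetical, the host and port in the usage note are placeholders, and request signing is deliberately left out.

```python
from urllib.parse import urlencode

API_VERSION = "2011-01-01"


def query_url(service_url: str, action: str, **params: str) -> str:
    """Build an unsigned Query API URL for the given action."""
    query = {"Action": action, "Version": API_VERSION, **params}
    # Parameters are sorted only to make the output deterministic.
    return f"{service_url}?{urlencode(sorted(query.items()))}"
```

For example, assuming AWS_AUTO_SCALING_URL holds the full service URL, such as the placeholder http://cloud.example:8773/services/AutoScaling, calling query_url with that value and "DescribeLaunchConfigurations" yields a URL ending in ?Action=DescribeLaunchConfigurations&Version=2011-01-01.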

Supported Actions

This section describes implemented auto scaling service actions. Actions that are not implemented relate to scheduled scaling and notifications.

For more information on administrative support by action, see Administrative Functionality below.

Action                               Policy Enforced  Admin Enabled  Notes
CreateAutoScalingGroup               Y                -              Placement group, VPC Zone Identifier not supported.
CreateLaunchConfiguration            Y                -              Spot Price, EBS Optimized not supported.
CreateOrUpdateTags                   Y                -
DeleteAutoScalingGroup               Y                Y
DeleteLaunchConfiguration            Y                Y
DeletePolicy                         Y                Y
DeleteTags                           Y                -
DescribeAdjustmentTypes              Y                -
DescribeAutoScalingGroups            Y                Y
DescribeAutoScalingInstances         Y                Y
DescribeLaunchConfigurations         Y                Y
DescribeMetricCollectionTypes        Y                -
DescribePolicies                     Y                Y
DescribeScalingActivities            Y                Y
DescribeScalingProcessTypes          Y                -
DescribeTags                         Y                -
DescribeTerminationPolicyTypes       Y                -
DisableMetricsCollection             Y                Y
EnableMetricsCollection              Y                Y
ExecutePolicy                        Y                Y
PutScalingPolicy                     Y                -
ResumeProcesses                      Y                Y
SetDesiredCapacity                   Y                Y
SuspendProcesses                     Y                Y
TerminateInstanceInAutoScalingGroup  Y                Y
UpdateAutoScalingGroup               Y                Y              Placement group, VPC Zone Identifier not supported.

IAM Integration

Item       Value                                        Notes
Vendor     autoscaling                                  Prefix for actions, e.g. autoscaling:DescribeLaunchConfigurations
Resources  autoscalinggroup
           instance
           launchconfiguration
           scalingactivity
           scalingpolicy
           tag
Actions    autoscaling:*                                We will permit use in policy of any actions from the supported API version
Quotas     autoscaling:quota-autoscalinggroupnumber
           autoscaling:quota-launchconfigurationnumber
           autoscaling:quota-scalingpolicynumber

The ARN of an Auto Scaling resource has the form:

  arn:aws:autoscaling::<account id>:<resource type>:<resource id>:[<name>/<value>]+

An example policy for service access:

    {
       "Statement":[{
          "Effect":"Allow",
          "Action":"autoscaling:*",
          "Resource":"*"
       }]
    }

Eucalyptus Extensions

IAM

Support for quotas and resources (see above).

An example policy for limiting auto scaling groups:

    {
      "Statement":[{
        "Effect":"Limit",
        "Action":"autoscaling:createautoscalinggroup",
        "Resource":"*",
        "Condition":{
          "NumericLessThanEquals":{
            "autoscaling:quota-autoscalinggroupnumber":"2"
          }
        }
      }]
    }
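
A quota policy of this shape could be evaluated roughly as in the Python sketch below. The function names are hypothetical, the matching is simplified to a single action string and a single condition operator, and the rule that the limit bounds the total number of groups after creation is an assumption.

```python
import json

POLICY = json.loads("""
{
  "Statement":[{
    "Effect":"Limit",
    "Action":"autoscaling:createautoscalinggroup",
    "Resource":"*",
    "Condition":{
      "NumericLessThanEquals":{
        "autoscaling:quota-autoscalinggroupnumber":"2"
      }
    }
  }]
}
""")


def quota_limit(policy: dict, action: str, quota_key: str):
    """Return the numeric limit for `action`, or None if no Limit statement applies."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Limit":
            continue
        if statement["Action"].lower() != action.lower():
            continue
        condition = statement.get("Condition", {}).get("NumericLessThanEquals", {})
        if quota_key in condition:
            return int(condition[quota_key])
    return None


def may_create_group(policy: dict, existing_groups: int) -> bool:
    # Assumption: the quota bounds the total count after the new group is added.
    limit = quota_limit(policy, "autoscaling:CreateAutoScalingGroup",
                        "autoscaling:quota-autoscalinggroupnumber")
    return limit is None or existing_groups + 1 <= limit
```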

Administrative Functionality

We will extend the standard functionality for administrative purposes.

Listing Resources

The following actions will support listing items for all accounts/users:

  • DescribeAutoScalingGroups
  • DescribeAutoScalingInstances
  • DescribeLaunchConfigurations
  • DescribePolicies
  • DescribeScalingActivities

To enable this, the parameter verbose is passed as a name selector.
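
As an illustration, an administrative listing request could carry verbose in the position where a group name would normally go. The Python sketch below encodes this using the Query API list convention (<prefix>.member.N); the helper name is hypothetical, and the assumption that verbose is sent as the first name of AutoScalingGroupNames is our reading of the sentence above.

```python
from urllib.parse import urlencode


def member_params(prefix: str, values: list) -> dict:
    """Encode a list as Query API `<prefix>.member.N` parameters (1-based)."""
    return {f"{prefix}.member.{i}": v for i, v in enumerate(values, start=1)}


# Unsigned query string for an administrative listing of all groups.
params = {"Action": "DescribeAutoScalingGroups", "Version": "2011-01-01"}
params.update(member_params("AutoScalingGroupNames", ["verbose"]))
query = urlencode(params)
```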

Deleting Resources

The following actions will support administrative deletion of resources:

  • DeleteAutoScalingGroup
  • DeleteLaunchConfiguration
  • DeletePolicy

For administrative deletion the ARN of the resource must be used (not the simple name), e.g.:

  arn:aws:autoscaling::013765657871:launchConfiguration:6789b01a-a9c9-489f-9d24-ed39533fca61:launchConfigurationName/Test
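
An ARN shaped like the example above can be taken apart as in the following Python sketch. The ScalingArn type and the function names are hypothetical, and the parser assumes the colon-separated layout of the example (empty region, account, resource type, resource identifier, then one or more name/value pairs).

```python
from typing import Dict, NamedTuple


class ScalingArn(NamedTuple):
    account_id: str
    resource_type: str
    resource_id: str
    names: Dict[str, str]


def is_arn(name_or_arn: str) -> bool:
    """Distinguish an ARN from a simple resource name."""
    return name_or_arn.startswith("arn:aws:autoscaling:")


def parse_arn(arn: str) -> ScalingArn:
    parts = arn.split(":", 7)
    if len(parts) != 8 or parts[:3] != ["arn", "aws", "autoscaling"]:
        raise ValueError(f"Not an Auto Scaling ARN: {arn}")
    _, _, _, _region, account_id, resource_type, resource_id, tail = parts
    names = {}
    # The tail holds one or more colon-separated name/value pairs.
    for pair in tail.split(":"):
        key, _, value = pair.partition("/")
        names[key] = value
    return ScalingArn(account_id, resource_type, resource_id, names)
```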

Updating Resources

The following actions will support administrative update of resources:

  • DisableMetricsCollection
  • EnableMetricsCollection
  • ExecutePolicy
  • ResumeProcesses
  • SetDesiredCapacity
  • SuspendProcesses
  • TerminateInstanceInAutoScalingGroup
  • UpdateAutoScalingGroup

For administrative modification the ARN of the resource must be used (not the simple name).

Administrative Suspension

Scaling processes for auto scaling groups are automatically placed under administrative suspension when failing to launch instances (subject to the related configuration properties). Scaling processes can be resumed by the owning account or by an administrator.

If an administrator manually suspends scaling processes for an auto scaling group then the group is also shown as being administratively suspended.
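
One plausible reading of the automatic suspension rule is sketched below in Python. This design does not spell out how the related properties combine, so the rule itself is an assumption: a group is suspended once it has at least autoscaling.suspensionlaunchattemptsthreshold failed launch attempts and has been failing for at least autoscaling.suspensiontimeout. The function name is hypothetical.

```python
from datetime import datetime, timedelta

LAUNCH_ATTEMPTS_THRESHOLD = 15          # autoscaling.suspensionlaunchattemptsthreshold default
SUSPENSION_TIMEOUT = timedelta(days=1)  # autoscaling.suspensiontimeout default


def should_suspend(failed_attempts: int, failing_since: datetime,
                   now: datetime) -> bool:
    """Assumed rule: suspend once launches have failed often enough, for long enough."""
    return (failed_attempts >= LAUNCH_ATTEMPTS_THRESHOLD
            and now - failing_since >= SUSPENSION_TIMEOUT)
```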

Configuration

The following properties are added for auto scaling configuration:

Property                                       Default value  Notes
autoscaling.activityexpiry                     42d            6 weeks
autoscaling.activitytimeout                    5m
autoscaling.maxlaunchincrement                 20
autoscaling.maxregistrationretries             5
autoscaling.suspendedprocesses
autoscaling.suspendedtasks
autoscaling.suspensionlaunchattemptsthreshold  15
autoscaling.suspensiontimeout                  1d
autoscaling.untrackedinstancetimeout           5m
autoscaling.zonefailurethreshold               5m

Interval values have a suffix to denote the units:

  • ms - Milliseconds
  • s - Seconds
  • m - Minutes
  • h - Hours
  • d - Days
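
Interval values with these suffixes can be parsed as in the following Python sketch. The function name is hypothetical, and the sketch assumes a value is a single non-negative integer followed directly by one suffix.

```python
import re
from datetime import timedelta

_UNITS = {"ms": "milliseconds", "s": "seconds", "m": "minutes",
          "h": "hours", "d": "days"}
_INTERVAL = re.compile(r"(\d+)(ms|s|m|h|d)")


def parse_interval(text: str) -> timedelta:
    """Convert a value such as '42d' or '5m' into a timedelta."""
    match = _INTERVAL.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"Invalid interval: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```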

Upgrade

No upgrade impact noted.

Packaging

No specific packaging requirements.

Documentation

Administrative functionality should be documented.

Security

Use of an ARN in place of a simple name adds the risk of permitting deletion of other accounts' resources.

Testing

No specific test cases noted.

References

tag:rls-3.3 tag:autoscaling


