server-group instance replacement does not transfer ENI #720
Unanswered
adamlundrigan asked this question in Help
We use the `server-group` module of terraform-aws-asg to run HAProxy instances for outbound connection proxying. Our EIPs are attached to ENIs and those ENIs are attached to the server by the `attach-eni` command (from terraform-aws-server) in user-data.
This works great when we use Terraform to force a rolling deploy; the old instance is removed, the ENIs detached, a new instance started, and the ENIs reattached.
However, when the ASG decides to replace an instance, it does so in the opposite order - it spins up a new instance then removes the old one. This means that the new instance can't attach the ENIs during first boot. Insufficient error checking meant the ASG didn't know the new instance was "incomplete", so we ended up with a proxy running which could not forward any traffic.
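To illustrate the kind of error checking that was missing, here is a minimal sketch of a post-boot check that reports the instance as unhealthy when fewer ENIs are attached than expected. It uses the plain AWS CLI rather than anything from terraform-aws-server; the function names, the tag filter, and the expected count are all hypothetical placeholders, and the instance role is assumed to allow `ec2:DescribeNetworkInterfaces` and `autoscaling:SetInstanceHealth`.

```shell
#!/usr/bin/env bash
# Sketch only: function names and arguments are hypothetical, not part of any module.

# Count the ENIs that match a tag filter and are attached to the given instance.
count_attached_enis() {
  local instance_id="$1" tag_filter="$2"
  aws ec2 describe-network-interfaces \
    --filters "Name=attachment.instance-id,Values=${instance_id}" "${tag_filter}" \
    --query 'length(NetworkInterfaces)' \
    --output text
}

# Tell the ASG the instance is unhealthy if it has fewer tagged ENIs than expected.
verify_enis_or_mark_unhealthy() {
  local instance_id="$1" expected="$2" tag_filter="$3" attached
  attached="$(count_attached_enis "$instance_id" "$tag_filter")"
  if [ "$attached" -lt "$expected" ]; then
    echo "only ${attached}/${expected} ENIs attached; marking ${instance_id} unhealthy" >&2
    aws autoscaling set-instance-health \
      --instance-id "$instance_id" \
      --health-status Unhealthy
    return 1
  fi
}
```

A call at the end of user-data might look like `verify_enis_or_mark_unhealthy "$INSTANCE_ID" 2 "Name=tag:Role,Values=proxy"` (values made up). On its own this only stops a broken proxy from sitting in service silently; it does not fix the ordering problem, since the next replacement would hit the same situation.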
Troubleshooting notes from our internal ticket
The relevant section of the `user-data.sh` script for the ASG looks like this:
This finds all the available ENIs tagged for the server and attaches them in order.
In the replacement case the ENIs are still attached to the failing server so their status is not `available`, meaning they don't get attached.
I've looked through the ASG documentation and don't see a way to make it behave like the rolling-deploy script - terminate first then spin a new instance.
Any suggestions on what we should do to prevent this issue from recurring?
I guess the user-data script could use the AWS CLI to force-detach the ENIs?
https://docs.aws.amazon.com/cli/latest/reference/ec2/detach-network-interface.html
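Roughly what I have in mind for the force-detach approach, as an untested sketch using only the plain AWS CLI (`describe-network-interfaces`, `detach-network-interface --force`, the `network-interface-available` waiter, and `attach-network-interface`). The function name, the tag filter argument, and the starting device index of 1 are assumptions for illustration, not what our current script or `attach-eni` does.

```shell
#!/usr/bin/env bash
# Sketch only: reclaim_enis and its arguments are hypothetical.

# Force-detach every tagged ENI that is still attached somewhere,
# wait for it to become available, then attach it to this instance.
reclaim_enis() {
  local instance_id="$1" tag_filter="$2" device_index=1 eni attachment

  for eni in $(aws ec2 describe-network-interfaces \
      --filters "$tag_filter" \
      --query 'NetworkInterfaces[].NetworkInterfaceId' \
      --output text); do

    # With --output text, an ENI that has no attachment yields "None".
    attachment="$(aws ec2 describe-network-interfaces \
      --network-interface-ids "$eni" \
      --query 'NetworkInterfaces[0].Attachment.AttachmentId' \
      --output text)"

    if [ "$attachment" != "None" ]; then
      aws ec2 detach-network-interface --attachment-id "$attachment" --force
      aws ec2 wait network-interface-available --network-interface-ids "$eni"
    fi

    aws ec2 attach-network-interface \
      --network-interface-id "$eni" \
      --instance-id "$instance_id" \
      --device-index "$device_index"
    device_index=$((device_index + 1))
  done
}
```

This would be called as something like `reclaim_enis "$INSTANCE_ID" "Name=tag:Role,Values=proxy"` (tag made up). It does not preserve any particular ENI ordering, and force-detaching takes the ENIs away from the old instance while it may still be serving traffic, so I'm not sure it is the right fix.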
Tracked in ticket #110206