
Managing Access to Ephemeral Infrastructure At Scale

StrongDM manages and audits access to infrastructure.
  • Role-based, attribute-based, & just-in-time access to infrastructure
  • Connect any person or service to any infrastructure, anywhere
  • Logging like you've never seen

Managing a static fleet of StrongDM servers is dead simple. You create the server in the StrongDM console, place the public key file on the box, and it’s done! This scales really well for small deployments, but as your fleet grows, the burden of manual tasks grows with it.

With the advent of automated scaling solutions for cloud environments, like AWS Auto Scaling Groups, we need a way for our StrongDM inventory to change in real time along with the underlying servers. The solution: automation, automation, automation!

The DevOps mindset is key: we want to automate cloud infrastructure so it operates without manual intervention. We can write scripts that hook into instance boot and shutdown events and automatically adjust our StrongDM inventory accordingly.

The examples in this post are written for AWS, but all major cloud providers should provide a similar API for instance information, lifecycle hooks, and metadata-like tags.

Automation: the hooks

For server access, there are two lifecycle events that we care about: server boot and server shutdown. We are going to write scripts that hook into these events and execute StrongDM CLI commands to perform the necessary actions.

We’ll need API keys to talk to StrongDM and our cloud provider, in this case AWS.

StrongDM and AWS API Authentication

StrongDM provides admin tokens that facilitate access to sdm admin CLI calls. The admin token has the following permissions:

On the AWS side, servers were given a programmatic (non-console) user account with one IAM policy attached. This IAM policy can also be attached to an instance role instead of embedding credentials into the script!

The policy contains one statement: allow ec2:DescribeTags. This API call is required to build the server’s name in StrongDM.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": "ec2:DescribeTags",
      "Resource": "*"
    }
  ]
}

Naming Convention

Server names will be dynamically built using the server’s AWS EC2 tags, with the following schema:

$APP-$ENV-$PROCESS-$INSTANCE_ID

A server with the following tags:

would result in the following name in the StrongDM inventory:

$ sdm status
SERVER                             STATUS            PORT      TYPE
testapp-stage-wweb-00deadbeef      connected         60456     ssh
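
The naming schema can be sketched as a small shell function. This is a minimal illustration rather than the production script: build_sdm_server_name is a hypothetical helper, and it assumes the instance ID has the usual i-<hex> form so that the i- prefix can be dropped.

```shell
#!/bin/bash
# Hypothetical helper: assemble a StrongDM server name from tag values.
# Arguments: app, env, process, and the full EC2 instance ID (e.g. i-00deadbeef).
build_sdm_server_name() {
  local app="$1" env_name="$2" process="$3" instance_id="$4"
  # Drop the leading "i-" so only the hex portion of the ID remains.
  local trimmed="${instance_id#i-}"
  printf '%s-%s-%s-%s\n' "$app" "$env_name" "$process" "$trimmed"
}

# Literal values stand in for the tag lookups done at boot.
build_sdm_server_name testapp stage wweb i-00deadbeef
```

Keeping the name construction in one function means the boot and shutdown scripts cannot drift apart in how they derive the name.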

 

Startup

We've built a custom base AMI based on the Amazon Linux operating system. Our provisioning workflow includes a per-boot CloudInit script that executes when the server is powered on, including restarts. This script automatically identifies information about the server, uses the StrongDM command line tool to register it with StrongDM, and installs the on-demand SSH keys.

In /var/lib/cloud/scripts/per-instance/00_register_with_strongdm.sh:
#!/bin/bash
# Look up this instance's ID and private IP from the EC2 metadata service
INSTANCE_ID="$(curl --silent http://169.254.169.254/latest/meta-data/instance-id)"
# Drop the "i-" prefix from the instance ID
INSTANCE_ID_TRIMMED="$(echo "$INSTANCE_ID" | cut -d '-' -f 2)"
LOCAL_IP="$(curl --silent http://169.254.169.254/latest/meta-data/local-ipv4)"
# Read the app, env, and process tags used to build the server name
APP="$(aws ec2 describe-tags --filters Name=resource-id,Values="$INSTANCE_ID" Name=key,Values=app --query 'Tags[0].Value' --output text)"
ENV="$(aws ec2 describe-tags --filters Name=resource-id,Values="$INSTANCE_ID" Name=key,Values=env --query 'Tags[0].Value' --output text)"
PROCESS="-$(aws ec2 describe-tags --filters Name=resource-id,Values="$INSTANCE_ID" Name=key,Values=process --query 'Tags[0].Value' --output text)-"
SDM_SERVER_NAME="$APP-$ENV$PROCESS$INSTANCE_ID_TRIMMED"
# Download and install the StrongDM CLI
curl --silent -o sdm.zip -L https://app.strongdm.com/releases/cli/linux
unzip sdm.zip
mv sdm /usr/local/bin/sdm
rm sdm.zip
# Prepare the SSH directory for the user StrongDM connects as
mkdir -p "/home/sshuser/.ssh/"
touch "/home/sshuser/.ssh/authorized_keys"
chmod 0700 "/home/sshuser/.ssh/"
chmod 0600 "/home/sshuser/.ssh/authorized_keys"
chown -R "sshuser:sshuser" "/home/sshuser/"
# Authenticate, then register the server and capture the public key it returns
/usr/local/bin/sdm login
PUBLIC_KEY=$(/usr/local/bin/sdm admin servers add -p "$SDM_SERVER_NAME" "sshuser@$LOCAL_IP")
# Install the key so StrongDM can connect as sshuser
# (assumes the add command prints the key on stdout)
echo "$PUBLIC_KEY" >> "/home/sshuser/.ssh/authorized_keys"
# Touch a "lockfile" so the server can be deregistered when it's shutdown
# (see shared/roles/strongdm_target/templates/remove_from_strongdm.sh.j2:10)
touch /var/lock/strongdm-registered
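
One failure mode worth guarding against: when a tag is missing, the AWS CLI prints the literal string None for a null --query result in text output, and that string would end up in the server name. The sketch below shows one way to catch this before registering. require_tag is a hypothetical helper, not part of the original script, and the literal values stand in for real describe-tags output.

```shell
#!/bin/bash
# Hypothetical helper: reject a tag value that is empty or "None".
# The AWS CLI prints "None" in text output when a --query result is null,
# which is what happens when the requested tag does not exist.
require_tag() {
  local tag_name="$1" tag_value="$2"
  if [ -z "$tag_value" ] || [ "$tag_value" = "None" ]; then
    echo "missing required tag: $tag_name" >&2
    return 1
  fi
  printf '%s\n' "$tag_value"
}

# Example with a literal value standing in for the describe-tags output.
APP="$(require_tag app "testapp")" || exit 1
echo "app tag is $APP"
```

Failing early here keeps a half-named server such as None-stage-web-... out of the inventory.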

Shutdown

To remove servers from inventory, we’ve included another script in our base AMI. It is a runlevel 0 script that removes the server from the StrongDM inventory when the system is halted (i.e. AWS ASG instance termination, console terminations, or restarts).

In /etc/init.d/remove_from_strongdm:
#!/bin/bash
# chkconfig: 0123456 99 01
# description: Deregister server from StrongDM at shutdown
# LOCKFILE is created by the cloudinit script
LOCKFILE=/var/lock/remove_from_strongdm

start() {
    # set up the logger: send stdout and stderr to syslog, tagged with the script name
    exec 1> >(logger -s -t "$(basename "$0")") 2>&1
    touch "${LOCKFILE}"
}

stop() {
    # set up the logger
    exec 1> >(logger -s -t "$(basename "$0")") 2>&1
    # Remove our lock file (-f so a missing file is not an error)
    rm -f "${LOCKFILE}"
    # Rebuild the same server name that was used at registration
    INSTANCE_ID="$(curl --silent http://169.254.169.254/latest/meta-data/instance-id)"
    INSTANCE_ID_TRIMMED="$(echo "$INSTANCE_ID" | cut -d '-' -f 2)"
    APP="$(aws ec2 describe-tags --filters Name=resource-id,Values="$INSTANCE_ID" Name=key,Values=app --query 'Tags[0].Value' --output text)"
    ENV="$(aws ec2 describe-tags --filters Name=resource-id,Values="$INSTANCE_ID" Name=key,Values=env --query 'Tags[0].Value' --output text)"
    PROCESS="-$(aws ec2 describe-tags --filters Name=resource-id,Values="$INSTANCE_ID" Name=key,Values=process --query 'Tags[0].Value' --output text)-"
    SDM_SERVER_NAME="$APP-$ENV$PROCESS$INSTANCE_ID_TRIMMED"
    # Deregister the server from the StrongDM inventory
    /usr/local/bin/sdm admin servers delete "$SDM_SERVER_NAME"
}

case "$1" in
    start) start ;;
    stop) stop ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac
exit 0
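
A transient network error during shutdown would leave a stale entry in the inventory, so it can help to retry the delete a few times. The following is a minimal sketch of that idea, not something from the original script: retry is a hypothetical helper, and the attempt count and delay shown in the comment are arbitrary.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  while true; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# In stop(), this might wrap the delete call, for example:
#   retry 3 2 /usr/local/bin/sdm admin servers delete "$SDM_SERVER_NAME"
retry 3 0 true && echo "command succeeded"
```

Keep the total retry time short, since the instance only has a limited window to run shutdown scripts before it is terminated.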

User Access Model: StrongDM Okta Sync

At Betterment, all of our access management lives in one single place: our identity provider. We wire up all of our authentication and access control to Okta. With this in mind, we wanted a way for Okta attributes, in our case group memberships, to propagate to server access in StrongDM. When we reached out to the engineers assisting us with the implementation, they came up with an amazing solution.

They wrote a stopgap tool for us that looks up an engineer’s groups in Okta and grants access to a set of servers based on a regular expression. It was a small tool written in Go that we could run on a schedule. The matching algorithm was controlled by a YAML file that allowed us to easily map Okta groups to StrongDM servers.

For example, if an engineer is added to the strongdm/testapp Okta group, they will now have access to all servers whose names start with testapp. Since our servers followed a distinct naming pattern, we were able to write well-defined regular expressions to match applications to their respective servers.

Here’s a small excerpt from the matchers.yml configuration file:

groups:
  - name: strongdm/testapp
    servers:
      - testapp.*

In our case, the YAML file was stored in source control, and a Jenkins job ran the sync every five minutes.
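
To make the matching concrete, here is a rough shell sketch of how a group's pattern selects servers by name. It is only an illustration of the idea: the real sync tool was written in Go, servers_matching is a hypothetical helper, and whole-name matching (grep -x) is an assumption about how the patterns were applied.

```shell
#!/bin/bash
# Hypothetical helper: print the server names that match a group's pattern.
# Assumption: a pattern must match the whole server name (grep -x);
# the real sync tool may have anchored patterns differently.
# Usage: servers_matching <pattern> <server_name>...
servers_matching() {
  local pattern="$1"
  shift
  local name
  for name in "$@"; do
    if printf '%s\n' "$name" | grep -E -q -x -- "$pattern"; then
      printf '%s\n' "$name"
    fi
  done
}

# Only the first name matches the strongdm/testapp pattern.
servers_matching 'testapp.*' \
  testapp-stage-wweb-00deadbeef \
  otherapp-prod-worker-0123abcd
```

Because the app name leads every server name, a simple prefix pattern per group is enough to separate one application's servers from another's.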

Simple Steps to Manage Access to Ephemeral Servers

Voilà! With a few little scripts shimmed into our instance boot and shutdown lifecycle hooks, we can rely on our dynamic infrastructure to successfully register and deregister itself from our StrongDM inventory. Small steps, like writing a few bash scripts and plugging them into an AMI, can save your operations team valuable time when working with a large-scale deployment and allow your application engineers to SSH into dynamic infrastructure with ease.

You can try StrongDM out for yourself with a free, 14-day trial.

To learn more about how StrongDM helps companies with managing permissions, make sure to check out our Managing Permissions Use Case.
