Stop Using AWS Access Keys: AWS Security Token Service (STS) & Federation Complete Guide 2025
By Srikanth Ch, Senior DevOps Engineer | thedevopstooling.com
Prevent credential leaks: a practical 2025 guide to AWS STS — temporary creds, cross-account patterns, federation (SAML/OIDC), CI/CD best practices, and audit-ready implementations.
I’ve been working with AWS for over a decade, and I can tell you that the shift from permanent IAM user credentials to temporary STS credentials was one of the most significant security improvements we made in production environments. Imagine giving long-lived AWS credentials to every application, every CI/CD pipeline, every contractor, and every automation script in your infrastructure — that’s the security nightmare AWS Security Token Service (STS) prevents.
In this comprehensive guide, I’ll walk you through everything you need to know about AWS STS, from basic concepts to advanced federation patterns, cross-account strategies, and real-world implementations. Whether you’re preparing for the AWS Solutions Architect or Security Specialty certification, or simply want to secure your cloud infrastructure better, this guide has you covered.
📋 TL;DR — Key Takeaways
- Use STS/roles everywhere — eliminate long-lived access keys and replace IAM users with federated access and role assumption patterns.
- Prefer OIDC for CI/CD; use ExternalId for third-party roles — GitHub Actions, GitLab, and EKS should use OIDC federation; always require ExternalId when granting third-party access to prevent confused deputy attacks.
- Short sessions for high-privilege roles; enforce MFA for federation — production roles should have 15-60 minute sessions; require MFA at the identity provider level for all federated access.
- Monitor AssumeRole events in CloudTrail + GuardDuty — track all role assumptions, set up alarms for unusual patterns, and use GuardDuty to detect anomalous credential usage.
- Use session policies and source-identity for least privilege & auditability — further restrict assumed role permissions with inline session policies; track individual users through multi-tenant applications with source identity.
What is AWS Security Token Service (STS)?
AWS Security Token Service is a global web service that enables you to request temporary, limited-privilege credentials for your AWS resources. Think of STS as the digital equivalent of a visitor badge system at a secure building — you get access for exactly as long as you need it, with precisely the permissions required, and once the time expires, the credentials automatically become useless.
Here’s what makes STS powerful: instead of creating permanent IAM users with long-lived access keys that could be compromised, leaked, or misused, you generate temporary credentials that expire automatically. These credentials consist of an access key ID, secret access key, and a session token that work together to authenticate requests.
In my experience deploying hundreds of AWS workloads, I’ve seen STS become the foundation of modern cloud security architectures. Let me share some real scenarios where STS becomes absolutely critical:
Your CI/CD pipeline needs to deploy applications across multiple AWS accounts. Rather than storing permanent credentials in Jenkins or GitHub Actions, you use STS to assume roles dynamically. The credentials live only for the duration of the deployment — typically 15 to 60 minutes.
A third-party auditor needs temporary access to review your compliance posture. Instead of creating a permanent IAM user that might accidentally remain active after the audit, you provide them with federated access through STS that expires after their engagement ends.
Your developers need console access to production for emergency troubleshooting. Using AWS IAM Identity Center with STS, they authenticate through your corporate identity provider and get temporary credentials valid for one hour; access lapses automatically when the session expires.
Your containerized applications running in EKS need AWS permissions. Rather than embedding access keys in container images, you use IRSA (IAM Roles for Service Accounts) which leverages STS and OIDC federation to provide temporary credentials to individual pods.
The pattern here is clear: temporary, scoped, automatically expiring access. This is the foundation of the principle of least privilege in cloud environments.

AWS STS Architecture: How It All Works Together
Understanding how STS fits into the broader AWS security architecture will help you design better systems. Let me break down the key components and how they interact.
At its core, STS acts as a broker between your identity (whether that’s an IAM user, role, or federated identity) and the AWS resources you need to access. When you make a request to STS, you’re essentially saying: “I am X, I need to become Y, and here’s proof of my identity.” STS validates your request, checks the trust policy of the target role, and if everything checks out, issues you a temporary credential set.
Here’s the fundamental flow I’ve used countless times in production environments:
You start with a principal — this could be an IAM user, an application, or even a user from your corporate directory. This principal calls an STS API operation, such as AssumeRole. STS evaluates the trust policy attached to the target IAM role to determine if this principal is allowed to assume that role. If the trust policy allows it, STS generates temporary security credentials with a default lifetime of one hour (though you can request up to twelve hours depending on the role configuration).
These temporary credentials include three components: an access key ID, a secret access key, and a session token. All three must be used together when making AWS API calls. The credentials are cryptographically bound to the session and cannot be used beyond their expiration time.
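To make the "all three together" point concrete, here is a minimal Python sketch. The helper names `session_kwargs` and `assume_role_session` are hypothetical; the only assumptions about boto3 are the `assume_role` call and the three credential keyword arguments of `boto3.Session`. boto3 is imported lazily so the mapping helper can be tried without AWS access.

```python
from typing import Dict, Mapping


def session_kwargs(sts_response: Mapping) -> Dict[str, str]:
    """Map an STS response onto the three credential arguments of boto3.Session."""
    creds = sts_response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        # Omitting the session token makes AWS reject the temporary key pair.
        "aws_session_token": creds["SessionToken"],
    }


def assume_role_session(role_arn: str, session_name: str, duration: int = 3600):
    """Assume a role and return a boto3 Session bound to the temporary credentials."""
    import boto3  # lazy import: only needed when actually calling AWS

    response = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=duration,
    )
    return boto3.Session(**session_kwargs(response))
```

A client created from the returned session, for example `assume_role_session(arn, "demo").client("s3")`, signs every request with the access key ID, secret access key, and session token as a set.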
Now let’s talk about the major STS API operations you’ll encounter in real-world DevOps scenarios:
AssumeRole is the workhorse of STS. This is what you use when you want to switch from one role to another, either within the same account or across accounts. I use this constantly for cross-account deployments, where my CI/CD pipeline in the tools account needs to deploy into production accounts. The calling principal must have permissions to call sts:AssumeRole, and the target role must have a trust policy that allows the principal to assume it.
AssumeRoleWithSAML enables federated access through SAML 2.0 identity providers like Azure Active Directory, Okta, or Google Workspace. When a user authenticates through your corporate SSO, the identity provider generates a SAML assertion, and AWS STS exchanges this for temporary AWS credentials. This is how you enable your developers to log into the AWS console using their corporate email credentials instead of separate AWS passwords.
AssumeRoleWithWebIdentity is designed for mobile and web applications that authenticate users through identity providers like Amazon Cognito, Facebook, Google, or any OIDC-compliant provider. While this was more common a few years ago, it’s now largely replaced by Cognito’s enhanced authentication flow for consumer applications.
AssumeRoleWithOIDC is not an actual STS API. The name is sometimes used as shorthand for the modern pattern of OpenID Connect federation, which runs through AssumeRoleWithWebIdentity. This is what powers IAM Roles for Service Accounts in EKS, GitHub Actions OIDC integration, and other modern CI/CD patterns. Instead of storing AWS credentials in GitHub secrets, you configure GitHub as an OIDC provider in AWS, and your workflows assume roles directly using web identity tokens.
GetSessionToken is used when you need to add multi-factor authentication to existing IAM user credentials. This is particularly useful when you want to enforce MFA for sensitive operations. The temporary credentials you get back inherit the permissions of the IAM user who called it, and they carry MFA context only if you passed an MFA serial number and token code in the request. Policies that require aws:MultiFactorAuthPresent will deny sessions created without MFA.
GetFederationToken allows an IAM user to obtain temporary credentials for federated users. This is useful when you need to provide access to external entities but want to maintain control through your IAM users. However, in modern architectures, I generally recommend using AssumeRole patterns instead because they provide cleaner separation and better auditability.
GetCallerIdentity is a diagnostic API that returns details about the IAM entity making the call. I use this constantly in troubleshooting to verify which credentials are actually being used by applications and scripts. It’s particularly helpful when debugging cross-account access issues.
Here’s a practical analogy that helps my students understand STS architecture: Think of AWS accounts as office buildings in a corporate campus. IAM roles are like job positions with specific access rights. STS is the security checkpoint that validates your employee badge, checks that you’re authorized to work in that position today, gives you a temporary visitor pass, and sets an expiration time based on how long you need access. Once that time expires, your pass stops working automatically — you can’t accidentally leave it active or forget to revoke it.
🚀 2025 Pro Tip: Switch to Regional STS Endpoints
Here’s something that catches many teams by surprise: by default, most AWS SDKs and older CLI configurations use the global STS endpoint at sts.amazonaws.com. While this works, it creates two significant problems I’ve seen in production environments.
First, latency. If your workload runs in ap-southeast-1 but STS requests route to the global endpoint (which physically resolves to us-east-1), you’re adding unnecessary round-trip time to every role assumption. For applications that assume roles frequently, this adds up.
Second, resiliency. During the major us-east-1 outage in December 2021, the global STS endpoint became unavailable, which meant even applications running in other regions couldn’t assume roles. This was a painful lesson for many organizations about single points of failure.
The 2025 Standard: Regional Endpoints
Configure your applications to use regional STS endpoints like sts.us-east-1.amazonaws.com or sts.eu-west-1.amazonaws.com. This keeps STS traffic local to your region, reduces latency, and provides better fault isolation.
How to Enable Regional Endpoints:
For the AWS CLI and SDK, set this environment variable:
export AWS_STS_REGIONAL_ENDPOINTS=regional
Or add it to your AWS configuration file at ~/.aws/config:
[default]
sts_regional_endpoints = regional
region = us-east-1
For applications using the AWS SDK, configure it programmatically:
# Python boto3
import boto3

sts_client = boto3.client(
    'sts',
    region_name='us-east-1',
    endpoint_url='https://sts.us-east-1.amazonaws.com'
)

// Node.js AWS SDK v3
import { STSClient } from "@aws-sdk/client-sts";

const client = new STSClient({
  region: "us-east-1",
  endpoint: "https://sts.us-east-1.amazonaws.com"
});
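Hardcoding us-east-1 as in the snippets above defeats the purpose for workloads in other regions. A small sketch of deriving the endpoint from the workload's own region follows. The helper names are hypothetical, and it assumes the region is exposed through the standard `AWS_REGION` or `AWS_DEFAULT_REGION` environment variables.

```python
import os
from typing import Mapping, Optional


def resolve_region(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the region from the usual AWS environment variables, if set."""
    return env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")


def regional_sts_endpoint(region: str) -> str:
    """Build the regional STS endpoint URL for the given region."""
    # China regions use the amazonaws.com.cn DNS suffix.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"https://sts.{region}.{suffix}"


if __name__ == "__main__":
    region = resolve_region() or "us-east-1"  # fallback is an arbitrary choice
    print(regional_sts_endpoint(region))
```

The resulting URL can be passed as `endpoint_url` to `boto3.client('sts', ...)`. In most setups, setting `sts_regional_endpoints = regional` is simpler and avoids building URLs by hand.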
When This Matters Most:
Regional endpoints are especially critical for high-frequency role assumption patterns like EKS IRSA (where pods assume roles every hour) and Lambda functions (which might assume roles on every invocation). The cumulative latency savings and improved fault tolerance are significant.
I migrated a high-traffic EKS cluster from global to regional STS endpoints last year and saw a twenty percent reduction in role assumption latency, which translated to faster pod startup times and better overall performance.
Diagram: STS credential flow. Principal → Trust Policy Check → STS → Temporary Credentials (Access Key ID + Secret Access Key + Session Token) → Target AWS Resources.
Why Temporary Credentials Matter: The Security Imperative
The Real Cost of Credential Leaks
Let me share something that fundamentally changed how I think about cloud security. In 2019, I was consulting for a company that had suffered a security breach. An old IAM user credential — created years ago for a contractor who had long since left — was committed to a public GitHub repository. Within hours, their AWS account was being used to mine cryptocurrency, resulting in tens of thousands of dollars in unexpected charges.
This scenario happens more often than you’d think, and it illustrates why temporary credentials aren’t just a best practice — they’re essential for modern cloud security.
Three Critical Security Principles
Temporary credentials embody three critical security principles that have become non-negotiable in well-architected cloud environments:
Principle 1: Least Privilege
The principle of least privilege means granting only the minimum permissions necessary to accomplish a specific task. With permanent IAM user credentials, you’re often forced to grant broader permissions because the user might need different things at different times.
With STS, you can grant exactly the permissions needed for right now. When my CI/CD pipeline deploys to production, it assumes a role with deployment permissions for exactly fifteen minutes. When a developer needs read access to troubleshoot, they get a different role with read-only permissions for one hour. The credentials automatically align with the task.
Principle 2: Zero Standing Privileges
Zero standing privileges is a concept that’s gained tremendous traction in cloud security. The idea is simple: no one should have permanent access to sensitive environments. Instead, access is granted on-demand through temporary credentials.
I’ve implemented this pattern for production access at multiple organizations — developers don’t have any permanent production permissions. When they need access, they request it through a workflow that grants them temporary credentials via STS for the duration of their troubleshooting session. This dramatically reduces the attack surface.
Principle 3: Automatic Credential Rotation
Automatic credential rotation happens inherently with STS. You don’t need to remember to rotate keys or worry about credentials sitting unused in configuration files. Every time you assume a role, you get fresh credentials. When they expire, they simply stop working. There’s no manual rotation, no key management overhead, and no risk of forgetting to rotate credentials.
The Old Way vs. The STS Way
Let me give you a concrete scenario that demonstrates the security advantage. Imagine a developer needs one hour of access to query a production RDS database to investigate a performance issue.
In the old world, you might:
- Create an IAM user with database access
- Generate access keys
- Share them securely (hopefully)
- Hope the developer deletes them when done (they often don’t)
- Remember to delete the IAM user later (sometimes this gets forgotten)
With STS, the flow becomes:
- Developer authenticates through corporate SSO
- They request database access through your access management system
- STS issues temporary credentials valid for one hour
- Credentials automatically expire after the hour
- No manual cleanup needed, no risk of forgotten credentials
The security posture is dramatically different. Even if those temporary credentials were somehow compromised, they’d be useless after sixty minutes. No permanent foothold, no lateral movement opportunity, no long-term damage.
Reflection Prompt: Take a moment to audit your current AWS environment. Where are permanent IAM user credentials still being used? In CI/CD pipelines? For application access? For third-party integrations? Each of these is a potential candidate for migration to STS-based temporary credentials. In my experience, most organizations can eliminate eighty to ninety percent of their permanent credentials by properly leveraging STS.
STS API Calls Deep Dive: Practical Examples
Theory is valuable, but you learn DevOps by doing. Let me walk you through practical examples of each major STS operation, including the actual AWS CLI commands and JSON structures you’ll work with.
Quick Reference: STS API Operations at a Glance
| API Call | Typical Use Case | Duration | Key Trust/Policy Note |
|---|---|---|---|
| AssumeRole | Cross-account access, CI/CD deployments, service-to-service | 15 min – 12 hrs | Trust policy must allow principal ARN; supports ExternalId condition |
| AssumeRoleWithSAML | Corporate SSO (Azure AD, Okta, Google Workspace) | 15 min – 12 hrs | Trust policy requires SAML provider ARN; attribute mapping critical |
| AssumeRoleWithWebIdentity | OIDC federation (GitHub Actions, EKS IRSA, web/mobile apps) | 15 min – 12 hrs | Trust policy requires OIDC provider; sub claim limits repos/namespaces |
| GetSessionToken | MFA-protected API access with IAM user creds | 15 min – 36 hrs | Inherits IAM user permissions; pass the MFA serial number and token code to get an MFA-authenticated session |
| GetFederationToken | Legacy federated access via IAM user (rarely used now) | 15 min – 36 hrs | Permissions capped by calling IAM user; prefer AssumeRole instead |
| GetCallerIdentity | Diagnostic/troubleshooting — verify current identity | N/A (no temp creds) | No permissions required; returns current identity details |
Pro Tip: For production systems, always use AssumeRole or AssumeRoleWithWebIdentity — they provide the cleanest separation, best auditability, and proper zero-trust architecture.
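The duration column can be turned into a pre-flight check so that a bad `--duration-seconds` value fails in your own tooling rather than at the STS call. This is a sketch only: `DURATION_LIMITS` and `validate_duration` are hypothetical names, and the `role_max_session` parameter stands for the role's configured maximum session duration, which caps the AssumeRole family below the 12-hour ceiling.

```python
from typing import Optional

# (min_seconds, max_seconds) per API, taken from the quick-reference table.
DURATION_LIMITS = {
    "AssumeRole": (900, 43200),
    "AssumeRoleWithSAML": (900, 43200),
    "AssumeRoleWithWebIdentity": (900, 43200),
    "GetSessionToken": (900, 129600),
    "GetFederationToken": (900, 129600),
}


def validate_duration(api: str, seconds: int,
                      role_max_session: Optional[int] = None) -> int:
    """Return `seconds` if it is within the limits for `api`, else raise ValueError."""
    if api not in DURATION_LIMITS:
        raise ValueError(f"{api} does not issue temporary credentials")
    low, high = DURATION_LIMITS[api]
    # The role's own maximum session duration further caps the AssumeRole* calls.
    if role_max_session is not None and api.startswith("AssumeRole"):
        high = min(high, role_max_session)
    if not low <= seconds <= high:
        raise ValueError(f"{api} duration must be between {low} and {high} seconds")
    return seconds
```

For example, `validate_duration("AssumeRole", 7200, role_max_session=3600)` raises, which mirrors what happens when you request two hours from a role still on the one-hour default.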
AssumeRole: Cross-Account and Same-Account Access
This is the API call you’ll use most frequently. Here’s how it works in practice. Let’s say you have a role named DeploymentRole in your production account, and your CI/CD pipeline needs to assume it.
aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/DeploymentRole \
  --role-session-name pipeline-deployment-session \
  --duration-seconds 3600
The response contains your temporary credentials:
{
  "Credentials": {
    "AccessKeyId": "ASIAIOSFODNN7EXAMPLE",
    "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "SessionToken": "FwoGZXIvYXdzEBQaD...",
    "Expiration": "2025-03-15T12:00:00Z"
  },
  "AssumedRoleUser": {
    "AssumedRoleId": "AROAIDPPEZS35WEXAMPLE:pipeline-deployment-session",
    "Arn": "arn:aws:sts::123456789012:assumed-role/DeploymentRole/pipeline-deployment-session"
  }
}
In your scripts, you’d then export these as environment variables:
export AWS_ACCESS_KEY_ID="ASIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
export AWS_SESSION_TOKEN="FwoGZXIvYXdzEBQaD..."
Pro tip: The --role-session-name parameter is your friend for auditability. Make it descriptive so CloudTrail logs clearly show who assumed the role and why. I use patterns like {service}-{purpose}-{timestamp} so troubleshooting is easier later.
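A sketch of the {service}-{purpose}-{timestamp} pattern in Python follows. The function name `build_session_name` is hypothetical. It relies on the STS constraint that a role session name is 2 to 64 characters drawn from letters, digits, underscore, and `+=,.@-`, so anything else is replaced and the prefix is trimmed to leave room for the timestamp.

```python
import re
from datetime import datetime, timezone
from typing import Optional

MAX_LEN = 64  # STS limit for RoleSessionName
_DISALLOWED = re.compile(r"[^\w+=,.@-]", re.ASCII)


def build_session_name(service: str, purpose: str,
                       now: Optional[datetime] = None) -> str:
    """Build a '{service}-{purpose}-{timestamp}' name that STS will accept."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # Replace characters STS rejects (spaces, slashes, ...) with hyphens.
    prefix = _DISALLOWED.sub("-", f"{service}-{purpose}")
    # Trim the prefix, never the timestamp, so names stay sortable by time.
    prefix = prefix[: MAX_LEN - len(stamp) - 1]
    return f"{prefix}-{stamp}"


if __name__ == "__main__":
    print(build_session_name("pipeline", "deploy prod"))
```

The result can be passed straight to `--role-session-name`, and it shows up verbatim in the assumed-role ARN recorded by CloudTrail.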
AssumeRoleWithSAML: Corporate SSO Integration
When you integrate AWS with your corporate identity provider using SAML, this is the API call happening behind the scenes. While you typically won’t call this directly (the AWS sign-in endpoint or your federation tooling makes the call on your behalf), understanding the flow helps with troubleshooting.
Your identity provider (like Azure AD) generates a SAML assertion after authenticating the user. The assertion is then exchanged for credentials with a call equivalent to:
aws sts assume-role-with-saml \
  --role-arn arn:aws:iam::123456789012:role/DeveloperRole \
  --principal-arn arn:aws:iam::123456789012:saml-provider/ExampleCorpProvider \
  --saml-assertion base64-encoded-saml-assertion
The key here is the trust relationship in your IAM role. It must explicitly trust your SAML provider:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::123456789012:saml-provider/ExampleCorpProvider"
    },
    "Action": "sts:AssumeRoleWithSAML",
    "Condition": {
      "StringEquals": {
        "SAML:aud": "https://signin.aws.amazon.com/saml"
      }
    }
  }]
}
I’ve set up SAML federation with dozens of identity providers, and the most common issues I see are mismatched attribute mappings and incorrect trust policies. Always verify that your SAML assertion includes the attributes AWS expects.
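One way to check the attribute mapping is to decode a captured assertion and look for the two attributes AWS requires, Role and RoleSessionName. The sketch below uses only the Python standard library. The function names are hypothetical, it assumes you captured the base64 SAML response yourself, and it does not validate the signature, so treat it as a debugging aid only.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Dict, List

SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
ROLE_ATTR = "https://aws.amazon.com/SAML/Attributes/Role"
SESSION_ATTR = "https://aws.amazon.com/SAML/Attributes/RoleSessionName"


def aws_attributes(encoded_assertion: str) -> Dict[str, List[str]]:
    """Return every SAML attribute in the assertion as name -> list of values."""
    root = ET.fromstring(base64.b64decode(encoded_assertion))
    found: Dict[str, List[str]] = {}
    # iter() walks the whole tree, so this works for a bare Assertion
    # as well as an Assertion wrapped in a samlp:Response.
    for attr in root.iter(f"{{{SAML_NS}}}Attribute"):
        values = [v.text.strip()
                  for v in attr.findall(f"{{{SAML_NS}}}AttributeValue") if v.text]
        found[attr.get("Name")] = values
    return found


def missing_aws_attributes(encoded_assertion: str) -> List[str]:
    """List the required AWS attributes that are absent or empty."""
    attrs = aws_attributes(encoded_assertion)
    return [name for name in (ROLE_ATTR, SESSION_ATTR) if not attrs.get(name)]
```

Each Role value is expected to be a comma-separated pair of role ARN and SAML provider ARN, so a quick look at `aws_attributes(...)[ROLE_ATTR]` also catches a mistyped provider name.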
AssumeRoleWithWebIdentity: OIDC and Modern CI/CD
This is where modern DevOps really shines. Instead of storing AWS credentials in GitHub Actions or GitLab CI, you configure OIDC federation. Here’s how GitHub Actions would assume an AWS role:
First, configure GitHub as an OIDC provider in AWS and create a role with this trust policy:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
      },
      "StringLike": {
        "token.actions.githubusercontent.com:sub": "repo:yourorg/yourrepo:*"
      }
    }
  }]
}
In your GitHub Actions workflow:
- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
    aws-region: us-east-1
Behind the scenes, GitHub generates a JWT token that AWS STS validates and exchanges for temporary credentials. No secrets stored anywhere — it’s beautiful.
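When a workflow is denied, it helps to see which claims the token carries and whether its sub would match the trust policy. The following sketch decodes a JWT payload without verifying the signature (STS does the real verification) and approximates IAM's StringLike matching with `fnmatchcase`. Both function names are hypothetical, and `fnmatchcase` also treats square brackets as wildcards, which IAM does not, so this is an approximation.

```python
import base64
import json
from fnmatch import fnmatchcase


def jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def sub_allowed(sub: str, pattern: str) -> bool:
    """Approximate an IAM StringLike check of the sub claim against a pattern."""
    return fnmatchcase(sub, pattern)
```

With the trust policy above, `sub_allowed("repo:yourorg/yourrepo:ref:refs/heads/main", "repo:yourorg/yourrepo:*")` is true, while a token from any other repository fails the match.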
The same pattern works for EKS with IRSA. Each Kubernetes service account can be mapped to an IAM role, and pods automatically receive temporary credentials:
kubectl annotate serviceaccount my-app-sa \
eks.amazonaws.com/role-arn=arn:aws:iam::123456789012:role/MyAppRole
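When your pod starts, the EKS mutating webhook injects the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables and mounts a projected service account token into the pod. The AWS SDK in your container exchanges that token for temporary credentials through AssumeRoleWithWebIdentity and refreshes them automatically before they expire.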
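For troubleshooting IRSA it can help to reproduce by hand what the SDK does inside the pod: read the role ARN and token file path from the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables, then call AssumeRoleWithWebIdentity with the token. This is a sketch under that assumption. The function names and the `irsa-debug` session name are hypothetical, and boto3 is imported lazily so the settings check runs anywhere.

```python
import os
from typing import Mapping, Tuple


def irsa_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (role_arn, token_file) from the variables IRSA injects into the pod."""
    try:
        return env["AWS_ROLE_ARN"], env["AWS_WEB_IDENTITY_TOKEN_FILE"]
    except KeyError as missing:
        raise RuntimeError(
            f"IRSA variable {missing} is not set; is the service account annotated?"
        ) from None


def assume_role_from_pod(session_name: str = "irsa-debug") -> dict:
    """Exchange the projected service account token for temporary credentials."""
    import boto3  # lazy import: only needed inside the pod

    role_arn, token_file = irsa_settings()
    with open(token_file) as handle:
        token = handle.read().strip()
    response = boto3.client("sts").assume_role_with_web_identity(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        WebIdentityToken=token,
    )
    return response["Credentials"]
```

If `irsa_settings()` raises inside a pod, the service account annotation or the pod's `serviceAccountName` is the first thing to check; application code normally needs none of this because the SDK's default credential chain handles it.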
GetSessionToken: Adding MFA Enforcement
When you want to require MFA for sensitive operations, GetSessionToken is your tool:
aws sts get-session-token \
  --serial-number arn:aws:iam::123456789012:mfa/username \
  --token-code 123456 \
  --duration-seconds 3600
This returns temporary credentials that have MFA authentication attached. You can then use IAM policy conditions to require MFA for specific actions:
{
  "Effect": "Allow",
  "Action": "ec2:TerminateInstances",
  "Resource": "*",
  "Condition": {
    "Bool": {
      "aws:MultiFactorAuthPresent": "true"
    }
  }
}
GetCallerIdentity: The Troubleshooting Hero
This simple API call has saved me countless hours of debugging:
aws sts get-caller-identity
Returns:
{
  "UserId": "AROAI23EXAMPLE:session-name",
  "Account": "123456789012",
  "Arn": "arn:aws:sts::123456789012:assumed-role/RoleName/session-name"
}
I use this constantly to verify which credentials my application or script is actually using. When a developer says “I can’t access this resource,” the first thing I do is have them run this command to see what role they’re actually operating under.
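In scripts, the ARN returned by this call can be split into its parts so the check becomes an assertion instead of an eyeball comparison. A minimal sketch with hypothetical names (`Caller`, `parse_caller_arn`), covering the common ARN shapes for IAM users and assumed roles:

```python
from typing import NamedTuple, Optional


class Caller(NamedTuple):
    account: str
    kind: str               # e.g. "assumed-role", "user", "federated-user", "root"
    name: str               # role or user name (including any path)
    session: Optional[str]  # role session name, only for assumed roles


def parse_caller_arn(arn: str) -> Caller:
    """Split the Arn field of a GetCallerIdentity response into its parts."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    account, resource = parts[4], parts[5]
    kind, _, rest = resource.partition("/")
    if kind == "assumed-role":
        # assumed-role/<role name>/<session name>
        name, _, session = rest.partition("/")
        return Caller(account, kind, name, session or None)
    return Caller(account, kind, rest, None)
```

A deployment script could then fail fast with something like `assert parse_caller_arn(arn).name == "DeploymentRole"` before touching any resources.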
Real-World Runnable Examples: Copy-Paste-Ready Workflows
Let me give you three production-ready, fully working examples you can adapt to your environment immediately.
Example 1: GitHub Actions OIDC Federation (Complete Workflow)
This complete example shows how to deploy to AWS from GitHub Actions without storing any credentials.
Step 1: Create the IAM OIDC Provider (one-time setup)
#!/bin/bash
# create-github-oidc-provider.sh
# Run this once to set up GitHub OIDC federation

ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1

echo "✅ GitHub OIDC provider created"
Step 2: Create the IAM Role with Trust Policy
#!/bin/bash
# create-github-deploy-role.sh
# Creates a role that GitHub Actions can assume

GITHUB_ORG="your-org"
GITHUB_REPO="your-repo"
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

# Create trust policy
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:${GITHUB_ORG}/${GITHUB_REPO}:*"
        }
      }
    }
  ]
}
EOF

# Create the role
aws iam create-role \
  --role-name GitHubActionsDeployRole \
  --assume-role-policy-document file://trust-policy.json \
  --description "Role for GitHub Actions to deploy to AWS"

# Attach deployment permissions (customize as needed)
aws iam attach-role-policy \
  --role-name GitHubActionsDeployRole \
  --policy-arn arn:aws:iam::aws:policy/AmazonECS_FullAccess

echo "✅ Role created: arn:aws:iam::${ACCOUNT_ID}:role/GitHubActionsDeployRole"
Step 3: GitHub Actions Workflow
# .github/workflows/deploy.yml
name: Deploy to AWS

on:
  push:
    branches: [main]

# Required for OIDC federation
permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsDeployRole
          aws-region: us-east-1
          role-session-name: GitHubActions-${{ github.run_id }}

      - name: Verify identity
        run: |
          echo "Current AWS identity:"
          aws sts get-caller-identity

      - name: Deploy application
        run: |
          # Your deployment commands here
          aws ecs update-service \
            --cluster production \
            --service my-app \
            --force-new-deployment
          echo "✅ Deployment triggered"

# Expected output when workflow runs:
# Current AWS identity:
# {
#   "UserId": "AROAEXAMPLE:GitHubActions-123456",
#   "Account": "123456789012",
#   "Arn": "arn:aws:sts::123456789012:assumed-role/GitHubActionsDeployRole/GitHubActions-123456"
# }
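Result: Zero secrets stored in GitHub. Credentials are temporary (1 hour by default) and scoped to your specific repository. Note that the repo:your-org/your-repo:* pattern in the trust policy accepts any branch, tag, or pull request in that repository; to restrict the role to a single branch, match a sub value such as repo:your-org/your-repo:ref:refs/heads/main instead.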
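If you manage several repositories, generating the trust policy from code keeps the sub condition consistent. The sketch below is one possible shape, with the hypothetical helper `github_trust_policy`. It assumes GitHub's documented sub format for branch builds, `repo:<org>/<repo>:ref:refs/heads/<branch>`, and switches to an exact StringEquals match when a branch is given.

```python
import json
from typing import Optional

GITHUB_ISSUER = "token.actions.githubusercontent.com"


def github_trust_policy(account_id: str, org: str, repo: str,
                        branch: Optional[str] = None) -> dict:
    """Build a trust policy for GitHub Actions, optionally pinned to one branch."""
    aud_key = f"{GITHUB_ISSUER}:aud"
    sub_key = f"{GITHUB_ISSUER}:sub"
    if branch:
        # Exact match: only workflows running on this branch may assume the role.
        condition = {"StringEquals": {
            aud_key: "sts.amazonaws.com",
            sub_key: f"repo:{org}/{repo}:ref:refs/heads/{branch}",
        }}
    else:
        # Wildcard match: any ref in the repository may assume the role.
        condition = {
            "StringEquals": {aud_key: "sts.amazonaws.com"},
            "StringLike": {sub_key: f"repo:{org}/{repo}:*"},
        }
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_ISSUER}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": condition,
        }],
    }


if __name__ == "__main__":
    print(json.dumps(github_trust_policy("123456789012", "your-org", "your-repo", "main"), indent=2))
```

The printed JSON can be saved as trust-policy.json and passed to `aws iam create-role` exactly as in Step 2.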
Example 2: EKS IRSA (IAM Roles for Service Accounts) Setup
This example shows how to give individual pods in EKS AWS permissions without sharing credentials.
Step 1: Create IAM Role for Service Account
#!/bin/bash
# setup-irsa.sh
# Sets up IRSA for a Kubernetes service account

CLUSTER_NAME="my-eks-cluster"
SERVICE_ACCOUNT="my-app-sa"
NAMESPACE="production"
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTER_NAME \
  --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")

# Create trust policy for the service account
cat > irsa-trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}",
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF

# Create the IAM role
aws iam create-role \
  --role-name MyAppS3AccessRole \
  --assume-role-policy-document file://irsa-trust-policy.json

# Attach permissions (example: S3 read access)
cat > s3-access-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-app-bucket",
        "arn:aws:s3:::my-app-bucket/*"
      ]
    }
  ]
}
EOF

aws iam put-role-policy \
  --role-name MyAppS3AccessRole \
  --policy-name S3Access \
  --policy-document file://s3-access-policy.json

echo "✅ IAM Role created: arn:aws:iam::${ACCOUNT_ID}:role/MyAppS3AccessRole"
Step 2: Create Kubernetes Service Account
# k8s-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app-sa
  namespace: production
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/MyAppS3AccessRole

# Apply the service account
kubectl apply -f k8s-serviceaccount.yaml
Step 3: Deploy Pod Using the Service Account
# k8s-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      serviceAccountName: my-app-sa  # Links to IRSA role
      containers:
        - name: my-app
          image: my-app:latest
          command: ["/bin/sh", "-c"]
          args:
            - |
              echo "Testing AWS credentials..."
              aws sts get-caller-identity
              echo "Listing S3 bucket..."
              aws s3 ls s3://my-app-bucket/
              # Your application logic here

# Deploy and verify
kubectl apply -f k8s-deployment.yaml

# Check pod logs to see credentials working
kubectl logs -n production deployment/my-app

# Expected output:
# Testing AWS credentials...
# {
#   "UserId": "AROAEXAMPLE:my-app-sa-123456",
#   "Account": "123456789012",
#   "Arn": "arn:aws:sts::123456789012:assumed-role/MyAppS3AccessRole/my-app-sa-123456"
# }
# Listing S3 bucket...
# 2025-03-15 10:30:00 12345 data.json
Result: Each pod gets temporary AWS credentials automatically. Credentials rotate every hour. No secrets in container images or environment variables.
Example 3: Cross-Account Deployment Script with Error Handling
This production-ready script safely assumes a role in another account with proper error handling and automatic cleanup. It does not refresh credentials, so keep the deployment shorter than the session duration.
#!/bin/bash
# cross-account-deploy.sh
# Production-ready script for cross-account deployments with STS
set -euo pipefail # Exit on error, undefined vars, pipe failures
# Configuration
TARGET_ACCOUNT="444444444444"
TARGET_ROLE="DeploymentRole-Prod"
EXTERNAL_ID="unique-external-id-12345"
ROLE_ARN="arn:aws:iam::${TARGET_ACCOUNT}:role/${TARGET_ROLE}"
SESSION_NAME="deploy-$(date +%Y%m%d-%H%M%S)-$$"
SESSION_DURATION=900 # 15 minutes
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
exit 1
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
# Verify we're in the correct source account
verify_source_identity() {
log "Verifying source account identity..."
local identity=$(aws sts get-caller-identity 2>&1)
if [ $? -ne 0 ]; then
error "Failed to get caller identity. Is AWS CLI configured?"
fi
local current_account=$(echo "$identity" | jq -r '.Account')
local current_arn=$(echo "$identity" | jq -r '.Arn')
log "Current identity: ${current_arn}"
log "Current account: ${current_account}"
}
# Assume role in target account
assume_role() {
log "Assuming role: ${ROLE_ARN}"
local credentials=$(aws sts assume-role \
--role-arn "$ROLE_ARN" \
--role-session-name "$SESSION_NAME" \
--external-id "$EXTERNAL_ID" \
--duration-seconds "$SESSION_DURATION" \
--query 'Credentials' \
--output json 2>&1)
if [ $? -ne 0 ]; then
error "Failed to assume role: $credentials"
fi
# Export credentials to environment
export AWS_ACCESS_KEY_ID=$(echo "$credentials" | jq -r '.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$credentials" | jq -r '.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$credentials" | jq -r '.SessionToken')
local expiration=$(echo "$credentials" | jq -r '.Expiration')
log "✅ Role assumed successfully"
log "Credentials expire at: ${expiration}"
# Verify assumed role
local assumed_identity=$(aws sts get-caller-identity --query 'Arn' --output text)
log "Operating as: ${assumed_identity}"
}
# Deploy application
deploy_application() {
log "Starting deployment..."
# Example: Update ECS service
aws ecs update-service \
--cluster production \
--service my-app \
--force-new-deployment \
--output json > /tmp/deployment-result.json
if [ $? -eq 0 ]; then
local deployment_id=$(jq -r '.service.deployments[0].id' /tmp/deployment-result.json)
log "✅ Deployment triggered: ${deployment_id}"
else
error "Deployment failed"
fi
}
# Wait for deployment to complete (optional)
wait_for_deployment() {
log "Waiting for deployment to stabilize..."
aws ecs wait services-stable \
--cluster production \
--services my-app
if [ $? -eq 0 ]; then
log "✅ Deployment completed successfully"
else
warn "Deployment wait timed out or failed"
fi
}
# Cleanup
cleanup() {
log "Cleaning up..."
unset AWS_ACCESS_KEY_ID
unset AWS_SECRET_ACCESS_KEY
unset AWS_SESSION_TOKEN
rm -f /tmp/deployment-result.json
log "✅ Cleanup completed"
}
# Main execution
main() {
log "========================================="
log "Cross-Account Deployment Script"
log "Target: ${TARGET_ACCOUNT}/${TARGET_ROLE}"
log "========================================="
verify_source_identity
assume_role
deploy_application
wait_for_deployment
cleanup
log "========================================="
log "✅ Deployment process completed"
log "========================================="
}
# Run with trap for cleanup on exit
trap cleanup EXIT
main "$@"
# main already ran cleanup on success; clear the trap so it does not run twice
trap - EXIT
# Expected output:
# [2025-03-15 10:30:00] =========================================
# [2025-03-15 10:30:00] Cross-Account Deployment Script
# [2025-03-15 10:30:00] Target: 444444444444/DeploymentRole-Prod
# [2025-03-15 10:30:00] =========================================
# [2025-03-15 10:30:00] Verifying source account identity...
# [2025-03-15 10:30:01] Current identity: arn:aws:iam::111111111111:user/deployer
# [2025-03-15 10:30:01] Current account: 111111111111
# [2025-03-15 10:30:01] Assuming role: arn:aws:iam::444444444444:role/DeploymentRole-Prod
# [2025-03-15 10:30:02] ✅ Role assumed successfully
# [2025-03-15 10:30:02] Credentials expire at: 2025-03-15T10:45:02Z
# [2025-03-15 10:30:02] Operating as: arn:aws:sts::444444444444:assumed-role/DeploymentRole-Prod/deploy-20250315-103000-12345
# [2025-03-15 10:30:02] Starting deployment...
# [2025-03-15 10:30:05] ✅ Deployment triggered: ecs-svc/1234567890
# [2025-03-15 10:30:05] Waiting for deployment to stabilize...
# [2025-03-15 10:35:20] ✅ Deployment completed successfully
# [2025-03-15 10:35:20] Cleaning up...
# [2025-03-15 10:35:20] ✅ Cleanup completed
# [2025-03-15 10:35:20] =========================================
# [2025-03-15 10:35:20] ✅ Deployment process completed
# [2025-03-15 10:35:20] =========================================
Key Features:
- ✅ Proper error handling with set -euo pipefail
- ✅ Credential verification before and after role assumption
- ✅ Session name includes timestamp and PID for uniqueness
- ✅ Automatic cleanup on script exit (success or failure)
- ✅ Clear logging with timestamps and colors
- ✅ External ID for security
- ✅ Production-ready with wait logic and health checks
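The session-name convention the script relies on (prefix, timestamp, process ID) is easy to reproduce in other tooling. Here is a minimal Python sketch of it. The helper name `build_session_name` and the `deploy` prefix are assumptions for illustration, and the validation reflects the STS limits on RoleSessionName (2 to 64 characters drawn from letters, digits, and `_+=,.@-`).

```python
import os
import re
from datetime import datetime, timezone

# RoleSessionName must be 2-64 characters from this set.
SESSION_NAME_PATTERN = re.compile(r"^[\w+=,.@-]{2,64}$", re.ASCII)

def build_session_name(prefix="deploy", now=None, pid=None):
    """Return a name such as deploy-<date>-<time>-<pid> (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    pid = os.getpid() if pid is None else pid
    name = f"{prefix}-{now:%Y%m%d-%H%M%S}-{pid}"
    if not SESSION_NAME_PATTERN.match(name):
        raise ValueError(f"invalid role session name: {name!r}")
    return name

print(build_session_name(now=datetime(2025, 3, 15, 10, 30, 0), pid=12345))
```

A unique name per run makes each deployment distinguishable in CloudTrail, since the session name becomes the last segment of the assumed-role ARN.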
Cross-Account Access: The Right Way
Cross-account access is where STS truly demonstrates its power. Let me walk you through a pattern I’ve implemented dozens of times: setting up secure CI/CD deployments across multiple AWS accounts.
The scenario: You have a tools account where your CI/CD pipelines run, and you need to deploy applications into development, staging, and production accounts. Here’s how to architect this securely using STS.
Step 1: Create a Deployment Role in Target Accounts
In each target account (dev, staging, prod), create an IAM role with the permissions needed for deployment:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": [
"ecs:UpdateService",
"ecs:DescribeServices",
"ecr:GetAuthorizationToken",
"s3:PutObject",
"s3:GetObject"
],
"Resource": "*"
}]
}
Step 2: Configure the Trust Policy
The critical part is the trust policy that determines who can assume this role. In your deployment role, add a trust policy that allows the CI/CD pipeline’s role from the tools account:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/CICDPipelineRole"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "unique-external-id-12345"
}
}
}]
}
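Notice the ExternalId condition. It guards against the confused deputy problem, where an attacker tricks the trusted principal into assuming your role on the attacker's behalf: a caller that does not supply the agreed external ID is refused. The condition matters most when the trusted principal is a third party that acts for many customers, but it is a cheap extra check for internal pipelines too.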
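If you maintain the same trust policy in several target accounts, generating it keeps the copies consistent. The following is a minimal sketch that only builds the JSON document shown above; the function name `build_trust_policy` is an assumption, and applying the result to a role is left to whatever tooling you already use.

```python
import json

def build_trust_policy(principal_arn, external_id):
    """Build a trust policy letting one principal assume the role with an ExternalId."""
    if not external_id:
        raise ValueError("external_id must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": principal_arn},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }

policy = build_trust_policy(
    "arn:aws:iam::111111111111:role/CICDPipelineRole",
    "unique-external-id-12345",
)
print(json.dumps(policy, indent=2))
```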
Step 3: Grant AssumeRole Permission in the Tools Account
In your tools account, the CI/CD role needs permission to assume roles in target accounts:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": [
"arn:aws:iam::222222222222:role/DeploymentRole-Dev",
"arn:aws:iam::333333333333:role/DeploymentRole-Staging",
"arn:aws:iam::444444444444:role/DeploymentRole-Prod"
]
}]
}
Step 4: Implement in Your CI/CD Pipeline
Your deployment script then becomes:
#!/bin/bash
# Assume role in target account
CREDENTIALS=$(aws sts assume-role \
--role-arn arn:aws:iam::444444444444:role/DeploymentRole-Prod \
--role-session-name "prod-deployment-${BUILD_ID}" \
--external-id unique-external-id-12345 \
--duration-seconds 900)
# Extract and export credentials (quote the variable so the JSON is passed intact)
export AWS_ACCESS_KEY_ID=$(echo "$CREDENTIALS" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDENTIALS" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDENTIALS" | jq -r '.Credentials.SessionToken')
# Now all AWS CLI commands use the assumed role
aws ecs update-service --cluster prod-cluster --service my-app --force-new-deployment
This creates a security chain: Pipeline Identity → Assume Deployment Role → Temporary Credentials → Deploy. Each link is auditable via CloudTrail, and the credentials expire automatically after fifteen minutes.
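The extract-and-export step can also be done in Python when the pipeline is not a shell script. This sketch parses the JSON that `aws sts assume-role` prints and returns the three environment variables. The function name `credentials_to_env` and the sample values are placeholders, not real credentials.

```python
import json

def credentials_to_env(assume_role_output):
    """Map the JSON printed by `aws sts assume-role` to AWS_* environment variables."""
    creds = json.loads(assume_role_output)["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }

# Placeholder values shaped like the CLI output; not real credentials.
sample = json.dumps({"Credentials": {
    "AccessKeyId": "ASIAEXAMPLE",
    "SecretAccessKey": "example-secret",
    "SessionToken": "example-token",
    "Expiration": "2025-03-15T10:45:02Z",
}})
env = credentials_to_env(sample)
print(sorted(env))
```

Passing the returned mapping as the environment of a child process (for example `subprocess.run(cmd, env={**os.environ, **env})`) keeps the temporary credentials scoped to that one command instead of the whole build.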
Diagram Suggestion: Create a flow diagram showing: Tools Account (CI/CD Pipeline) → Trust Policy Verification → STS AssumeRole → Temporary Credentials → Production Account Resources. Use cyan arrows (#00D9F5) for the credential flow and navy boxes (#0C1A2B) for accounts.
Reflection Prompt: If your team currently shares AWS access keys between accounts or manually copies credentials for cross-account work, how would migrating to this STS-based pattern improve both security and operational efficiency? Think about auditability, credential rotation, and reduced manual overhead.
Federation and Identity Providers: Enabling Workforce Access
Federation is how you enable your workforce — employees, contractors, partners — to access AWS resources using their existing corporate credentials. This eliminates the need to create and manage separate AWS user accounts for each person, which becomes unmanageable at scale.
Let me walk you through the most common federation patterns I’ve implemented in enterprise environments.
Pattern 1: AWS Identity Center (The Modern Standard)
AWS Identity Center, formerly AWS SSO, is now the recommended approach for workforce federation. It provides a centralized place to manage access across all your AWS accounts and applications.
How It Works:
You connect Identity Center to your corporate directory (like Azure AD, Okta, or Google Workspace). When users authenticate through your identity provider, Identity Center uses STS behind the scenes to provide them with temporary credentials for specific AWS accounts and permission sets.
The Key Benefits:
The beauty of this approach is that your users log in once using their corporate credentials, and they can access multiple AWS accounts without needing separate credentials for each. When someone leaves your organization, you disable their corporate account once, and they immediately lose access to all AWS resources.
Real-World Impact:
I implemented this at a company with two hundred AWS accounts last year. Before Identity Center, they were managing hundreds of IAM users manually. After migration, new employees automatically got the right AWS access based on their department, and offboarding became instant and automatic.
Pattern 2: SAML Federation (The Enterprise Backbone)
For organizations that need fine-grained control or have complex authentication requirements, SAML 2.0 federation provides flexibility. Your identity provider becomes the source of truth for authentication, while AWS handles authorization through IAM roles.
The Authentication Flow:
The flow works like this: A user tries to access AWS. They’re redirected to your identity provider (like Okta or Azure AD) for authentication. After successful authentication, the identity provider generates a SAML assertion containing user attributes and group memberships. This assertion is sent to AWS, which validates it and calls AssumeRoleWithSAML to issue temporary credentials.
Critical Configuration: Attribute Mapping:
The key to successful SAML federation is getting the attribute mappings right. Your SAML assertion must include attributes that AWS can map to IAM roles. Here’s what I typically configure:
The SAML assertion includes attributes like Role (which specifies which IAM role or roles the user can assume), RoleSessionName (which identifies the user in CloudTrail logs), and SessionDuration (which determines how long the credentials should be valid).
Trust Policy Structure:
Your IAM roles then have trust policies that accept your SAML provider:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:saml-provider/AzureAD"
},
"Action": "sts:AssumeRoleWithSAML",
"Condition": {
"StringEquals": {
"SAML:aud": "https://signin.aws.amazon.com/saml"
}
}
}]
}
Pattern 3: OIDC Federation (The Modern CI/CD Pattern)
OpenID Connect federation has revolutionized how we handle CI/CD authentication. Instead of storing AWS credentials in your CI/CD system, you configure it as an OIDC identity provider in AWS.
The GitHub Actions Example:
GitHub Actions is a perfect example. When you configure OIDC federation with GitHub, here’s what happens behind the scenes: GitHub generates a JSON Web Token (JWT) for your workflow. The JWT contains claims about the repository, branch, and workflow. Your GitHub Actions workflow requests AWS credentials. AWS validates the JWT against your OIDC provider configuration. If valid, STS issues temporary credentials using AssumeRoleWithWebIdentity.
Granular Trust Controls:
The trust policy in your IAM role specifies exactly which repositories and branches can assume the role:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "repo:myorg/myrepo:ref:refs/heads/main"
}
}
}]
}
This means only workflows running from the specified repository and branch can assume the role. No credentials stored anywhere, automatic credential rotation with each workflow run, and complete auditability.
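To see why a workflow on another branch is rejected, it helps to model the condition check. The sketch below is a deliberately simplified stand-in for IAM's evaluation: it handles only StringEquals and StringLike, and `fnmatchcase` is a rough approximation of IAM wildcard matching. The function name `claims_match` and the sample claims are assumptions.

```python
from fnmatch import fnmatchcase

PROVIDER = "token.actions.githubusercontent.com"

def claims_match(condition, claims):
    """Simplified model: every StringEquals/StringLike entry must match its claim."""
    for operator, expected in condition.items():
        if operator not in ("StringEquals", "StringLike"):
            raise ValueError(f"unsupported operator: {operator}")
        for key, pattern in expected.items():
            # "token.actions.githubusercontent.com:sub" -> "sub"
            claim = claims.get(key.rsplit(":", 1)[-1])
            if claim is None:
                return False
            if operator == "StringEquals":
                matched = claim == pattern
            else:
                matched = fnmatchcase(claim, pattern)
            if not matched:
                return False
    return True

condition = {"StringEquals": {
    f"{PROVIDER}:aud": "sts.amazonaws.com",
    f"{PROVIDER}:sub": "repo:myorg/myrepo:ref:refs/heads/main",
}}
main_run = {"aud": "sts.amazonaws.com",
            "sub": "repo:myorg/myrepo:ref:refs/heads/main"}
feature_run = {"aud": "sts.amazonaws.com",
               "sub": "repo:myorg/myrepo:ref:refs/heads/feature-x"}
print(claims_match(condition, main_run), claims_match(condition, feature_run))
```

Swapping StringEquals for StringLike with a pattern such as `repo:myorg/myrepo:*` would admit every branch of the repository, which is why the exact-match form is the safer default for production roles.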
Universal Application:
I’ve implemented this pattern for EKS clusters using IRSA, for GitLab CI/CD pipelines, and even for Auth0-based web applications. The pattern is consistent: configure the OIDC provider, create roles with specific trust policies, and let STS handle the credential issuance.
Pattern 4: External Workforce and Third-Party Access
When you need to grant access to external entities — contractors, auditors, partners, vendors — federation becomes even more critical. You definitely don’t want to create permanent IAM users for people outside your organization.
The approach I recommend:
For short-term access (audits, consultants), use your existing federation with time-limited permission sets. Grant them access through Identity Center with an expiration date. When the engagement ends, simply remove their permission set assignment — no credential cleanup needed.
For longer-term external access (partner integrations, managed service providers), use role assumption with external IDs. The external party assumes a role in your account, but only if they provide the correct external ID:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::external-account:root"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "unique-id-provided-by-vendor"
}
}
}]
}
The external ID prevents the confused deputy problem where an attacker could trick the external party into accessing your resources.
Real-World Example: I worked with a company that needed to give their managed security service provider access to read CloudTrail logs and security findings. Instead of creating permanent credentials, we created a role with read-only security permissions, required an external ID, and limited the session duration to four hours. The security provider’s automation would assume the role every four hours to collect logs. This meant if their own infrastructure was compromised, any stolen credentials would be useless after four hours at most.
Pattern 5: IAM Roles Anywhere (For On-Premise and Hybrid Servers)
If you have servers running outside AWS — whether on-premise data centers, other cloud providers, or edge locations — the traditional approach was to create IAM users with long-lived access keys. This creates the exact security problems STS was designed to solve: permanent credentials that can leak, don’t rotate automatically, and violate zero-trust principles.
AWS IAM Roles Anywhere, launched in 2022 and now the 2025 standard for hybrid access, extends the temporary credential model to infrastructure outside AWS using PKI (Public Key Infrastructure) and X.509 certificates.
How IAM Roles Anywhere Works:
Instead of managing access keys, you use digital certificates to prove identity. Here’s the flow I’ve implemented for hybrid environments:
Your organization has a Certificate Authority (CA) — this could be your corporate CA, AWS Private CA, or any trusted CA. You register this CA with IAM Roles Anywhere as a trust anchor. Your on-premise servers obtain X.509 certificates from this CA, which serve as their identity credentials.
When a server needs AWS access, it runs the AWS signing helper utility, which creates a cryptographic signature using its private key. This signature is sent to IAM Roles Anywhere along with the certificate. IAM Roles Anywhere validates the certificate against the trust anchor and, if valid, calls STS to issue temporary credentials. The server receives standard AWS temporary credentials (access key, secret key, session token) just like any other STS operation.
These credentials expire after the configured session duration (typically one to twelve hours), and the server automatically requests fresh credentials by repeating the signing process.
Why This Matters for Security:
The private key never leaves the server — only signatures are transmitted. Even if network traffic is intercepted, an attacker can’t use it to impersonate the server. No long-lived AWS credentials exist anywhere on-premise. Certificate rotation is handled by your existing PKI infrastructure. Access can be instantly revoked by removing the trust anchor or updating the role’s trust policy.
Quick Setup Example:
First, create a trust anchor in IAM Roles Anywhere using your CA certificate:
# Upload your CA certificate to create a trust anchor
aws rolesanywhere create-trust-anchor \
--name "OnPremise-CA" \
--source "sourceType=CERTIFICATE_BUNDLE,sourceData={x509CertificateData=$(cat ca-cert.pem)}" \
--enabled
Create an IAM role that trusts this certificate-based identity:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Service": "rolesanywhere.amazonaws.com"
},
"Action": [
"sts:AssumeRole",
"sts:TagSession",
"sts:SetSourceIdentity"
],
"Condition": {
"ArnEquals": {
"aws:SourceArn": "arn:aws:rolesanywhere:us-east-1:123456789012:trust-anchor/ta-12345"
}
}
}]
}
Create a profile that maps certificates to the role:
aws rolesanywhere create-profile \
--name "OnPremise-BackupServers" \
--role-arns "arn:aws:iam::123456789012:role/BackupServerRole" \
--duration-seconds 3600
On your on-premise server, install the AWS signing helper and configure it:
# Install the credential helper
wget https://rolesanywhere.amazonaws.com/releases/1.0.0/aws_signing_helper_linux
chmod +x aws_signing_helper_linux
mv aws_signing_helper_linux /usr/local/bin/
# Configure AWS CLI to use the credential helper
cat >> ~/.aws/config << EOF
[profile onprem]
credential_process = /usr/local/bin/aws_signing_helper_linux credential-process --certificate /etc/ssl/server-cert.pem --private-key /etc/ssl/server-key.pem --trust-anchor-arn arn:aws:rolesanywhere:us-east-1:123456789012:trust-anchor/ta-12345 --profile-arn arn:aws:rolesanywhere:us-east-1:123456789012:profile/pr-12345 --role-arn arn:aws:iam::123456789012:role/BackupServerRole
EOF
# Now use AWS CLI normally - credentials are obtained automatically
aws s3 ls --profile onprem
Real-World Use Cases I’ve Implemented:
Backup servers: On-premise backup appliances that need to write backup data to S3 Glacier. Instead of embedding access keys in the appliance configuration, they use IAM Roles Anywhere with certificates that rotate annually through our PKI system.
Monitoring agents: Data center servers running CloudWatch agents that send metrics to AWS. Each server has a unique certificate, and IAM Roles Anywhere provides temporary credentials for metric submission. If a server is decommissioned, its certificate is revoked and AWS access ends immediately.
Legacy application migration: During a lift-and-shift migration, applications running on-premise needed temporary AWS access during the transition period. IAM Roles Anywhere provided secure access without modifying application code to use AWS credentials.
The Hybrid Security Model:
IAM Roles Anywhere brings the Zero Standing Privileges principle to your entire infrastructure, not just AWS-native resources. Combined with AWS Identity Center for human access, OIDC for CI/CD, and IRSA for Kubernetes, you can achieve a truly credential-less architecture where every access path uses temporary credentials backed by cryptographic identity.
I’ve seen organizations reduce their IAM user count from hundreds to single digits by combining these five federation patterns. The operational overhead drops dramatically, security posture improves measurably, and audit compliance becomes straightforward.
Temporary Credential Lifetimes: Understanding Sessions
One question I get constantly is: “How long should my STS session last?” The answer, as with most security decisions, is: as short as practical.
When you assume a role using STS, you can request a session duration between fifteen minutes and twelve hours; if you do not specify one, AssumeRole defaults to one hour. The ceiling is the MaxSessionDuration configured on the IAM role itself. A request above that ceiling is not trimmed down: the AssumeRole call fails with a validation error, so always request a duration at or below the role's maximum.
Here’s how I think about session durations for different use cases:
For CI/CD deployments, keep sessions short — fifteen to thirty minutes is usually plenty. Most deployment operations complete quickly, and short sessions limit the damage window if credentials are somehow exposed. My typical Jenkins pipeline assumes a role, deploys the application, runs health checks, and the credentials expire before the pipeline even finishes cleaning up.
For interactive troubleshooting, one to four hours makes sense. This gives developers enough time to investigate issues, run queries, and perform debugging without constantly re-authenticating. But it’s short enough that if a developer’s laptop is compromised mid-session, the exposure window is limited.
For automated monitoring and data collection, match the session duration to your collection interval. If you’re polling metrics every five minutes, a fifteen-minute session is perfect. If you’re doing hourly batch processing, a two-hour session might make sense.
For emergency break-glass access, use shorter sessions with higher privileges. When someone needs admin access for emergency repairs, give them a powerful role but limit the session to one hour. This enforces the principle that elevated privileges should be temporary.
One important nuance: you cannot extend a session once it’s issued. When your credentials are about to expire, you need to assume the role again to get fresh credentials. Most AWS SDKs handle this automatically, but if you’re writing custom scripts, you need to detect approaching expiration and refresh:
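# The expiration timestamp comes from the assume-role response; capture it when
# you assume the role, e.g. expiration=$(echo "$credentials" | jq -r '.Expiration')
# Seconds remaining until expiry (GNU date syntax)
remaining=$(( $(date -d "$expiration" +%s) - $(date +%s) ))
# If expiring within 5 minutes, re-assume the role
if [ "$remaining" -lt 300 ]; then
# refresh_credentials is your own function that calls aws sts assume-role again
refresh_credentials
fi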
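The same expiry check is straightforward in Python, and it avoids depending on GNU `date`. This is a minimal sketch that works from the `Expiration` string returned by `aws sts assume-role`; the function name `needs_refresh` and the five-minute margin are assumptions you would tune.

```python
from datetime import datetime, timedelta, timezone

def needs_refresh(expiration, now=None, margin=timedelta(minutes=5)):
    """Return True when credentials expire within `margin` (or already have)."""
    # Accept both "...Z" and "...+00:00" timestamp endings.
    expires_at = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return expires_at - now < margin

start = datetime(2025, 3, 15, 10, 30, 0, tzinfo=timezone.utc)
late = datetime(2025, 3, 15, 10, 41, 0, tzinfo=timezone.utc)
print(needs_refresh("2025-03-15T10:45:02Z", now=start))
print(needs_refresh("2025-03-15T10:45:02Z", now=late))
```

Refreshing a few minutes early, rather than waiting for an ExpiredToken error, avoids failing a request that is already in flight when the credentials lapse.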
Session Policies: Fine-Tuning Permissions
Here’s a powerful feature that many people don’t know about: when you assume a role, you can provide an inline session policy that further restricts the permissions. This is perfect for implementing just-in-time, least-privilege access.
Let’s say you have a deployment role with broad permissions, but for a specific deployment you only need S3 access. You can assume the role with a session policy:
aws sts assume-role \
--role-arn arn:aws:iam::123456789012:role/DeploymentRole \
--role-session-name limited-s3-session \
--policy '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": "s3:*",
"Resource": "arn:aws:s3:::my-deployment-bucket/*"
}]
}'
The resulting temporary credentials can only perform S3 operations on objects in that specific bucket, even though the underlying role has broader permissions. The session policy acts as a permission filter: the effective permissions are the intersection of the role's policies and the session policy, so it can only restrict, never expand, what the role allows.
I use session policies when I need to delegate temporary access for specific tasks without creating dedicated roles for every scenario.
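The filtering behaviour can be pictured as a set intersection. The sketch below is a toy model, not IAM's real evaluation logic: it looks only at action patterns and ignores resources, conditions, and explicit denies. The function names and the sample action lists are assumptions.

```python
from fnmatch import fnmatchcase

def allows(action_patterns, action):
    """True if any pattern such as "s3:*" matches the action (case-insensitive)."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in action_patterns)

def effective_allow(role_actions, session_actions, action):
    """A session may do only what BOTH the role policy and session policy allow."""
    return allows(role_actions, action) and allows(session_actions, action)

role_actions = ["s3:*", "ecs:UpdateService", "ecs:DescribeServices"]
session_actions = ["s3:*"]

for action in ("s3:PutObject", "ecs:UpdateService", "iam:CreateUser"):
    print(action, effective_allow(role_actions, session_actions, action))
```

The second case shows the restriction (the role allows the ECS call but the session policy filters it out), and the third shows that a session policy cannot grant something the role never had.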
Credential Lifetime Best Practices
After years of production experience, here are my recommendations for session durations:
Production deployments: 15 minutes
Your CI/CD pipeline should move fast. If a deployment takes longer than fifteen minutes, you might have bigger problems to solve first.
Developer sandbox access: 4 hours
Enough time to work without constant re-authentication, short enough to limit exposure from a compromised developer machine.
Automated system roles: 1 hour
Most automation doesn’t need longer sessions. If your scripts run longer, consider breaking them into smaller tasks.
Security auditor access: 1-2 hours
Long enough for investigation, short enough to minimize risk from compromised audit credentials.
Third-party vendor access: 1 hour
Never give external entities long-lived sessions. If they need ongoing access, they should re-authenticate frequently.
Emergency admin access: 1 hour
High-privilege access should always be time-bounded. If you need it longer, you should re-authorize.
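One way to keep these recommendations from drifting is to encode them once and look them up wherever a role is assumed. This is a minimal sketch: the use-case keys and the helper `duration_for` are assumptions, the auditor entry uses the upper end of the 1-2 hour range, and the bounds are the 15-minute and 12-hour limits that apply to AssumeRole.

```python
MIN_SECONDS = 900     # 15 minutes, the shortest duration AssumeRole accepts
MAX_SECONDS = 43200   # 12 hours, the longest a role can be configured to allow

RECOMMENDED_SECONDS = {
    "production-deployment": 15 * 60,
    "developer-sandbox": 4 * 3600,
    "automated-system": 3600,
    "security-auditor": 2 * 3600,
    "third-party-vendor": 3600,
    "emergency-admin": 3600,
}

def duration_for(use_case):
    """Look up the recommended DurationSeconds for a use case (hypothetical helper)."""
    seconds = RECOMMENDED_SECONDS[use_case]
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(f"{use_case}: {seconds}s is outside the allowed range")
    return seconds

print(duration_for("production-deployment"), duration_for("developer-sandbox"))
```

The returned value is what you would pass as `--duration-seconds` on the CLI; the target role's MaxSessionDuration still has to be at least that large.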
Mini Quiz: You have an IAM role configured with MaxSessionDuration of 2 hours. When assuming this role, you request a session duration of 4 hours. What duration will your temporary credentials actually have?
Answer: None. The AssumeRole call fails with a validation error, because the requested 4 hours exceeds the role's MaxSessionDuration of 2 hours. STS does not silently shorten the request: the role's maximum is a hard limit, so you must ask for 2 hours or less to get credentials at all.
STS Best Practices: Security and Operations
After years of building and securing AWS environments, I’ve developed a set of patterns that consistently lead to better security outcomes. Let me share the STS best practices that matter most in production environments.
Replace IAM Users with Roles Everywhere Possible
This is the single most impactful security improvement you can make. IAM users with long-lived access keys are the primary source of credential leaks in AWS environments. Your goal should be to have zero or near-zero IAM users.
For human access, use federation through AWS Identity Center or SAML. For programmatic access from AWS services, use IAM roles. For CI/CD systems, use OIDC federation. For third-party access, use cross-account role assumption.
I’ve guided several organizations through this migration, and the pattern is consistent: start by identifying all IAM users, categorize them by usage (human, application, service, third-party), then systematically replace each category with role-based access. The security improvement is dramatic — you eliminate the entire class of “leaked access key” vulnerabilities.
Always Use External IDs for Third-Party Access
When you create a role that will be assumed by a third-party (a vendor, partner, or external service), always require an external ID. This prevents the confused deputy problem where an attacker tricks the third party into accessing your resources.
Generate a unique, random external ID for each third-party relationship:
# Generate a secure external ID
external_id=$(openssl rand -base64 32)
echo "Use this External ID: $external_id"
Then require it in your trust policy:
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::vendor-account:root"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "your-unique-external-id"
}
}
}
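AWS does not treat the external ID as a secret (anyone who can read the role's trust policy can see it), so its protection comes from being unique per relationship rather than hidden. Still, share it only with the authorized third party and never reuse one ID across vendors or customers.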
Use Source Identity for Enhanced Auditability
When your applications assume roles on behalf of users, you can use the source identity feature to maintain the connection to the actual user. This is invaluable for audit trails.
aws sts assume-role \
--role-arn arn:aws:iam::123456789012:role/ApplicationRole \
--role-session-name app-session \
--source-identity user@example.com
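The source identity then appears in CloudTrail logs for all actions taken with those temporary credentials. This means you can trace actions back to the actual user, even when they’re operating through an application role. For the call to succeed, the role's trust policy must also allow the sts:SetSourceIdentity action for the calling principal, and once set, the value cannot be changed for the rest of the session, including any chained role sessions.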
I use this pattern in multi-tenant applications where many users share the same application role. Source identity lets me answer questions like “which user deleted this S3 object?” even though they all used the same role.
Restrict Role Assumption with Conditions
Use IAM policy conditions to add extra security controls around who can assume roles and when. Common conditions I implement:
IP address restrictions:
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "*",
"Condition": {
"IpAddress": {
"aws:SourceIp": ["10.0.0.0/8", "172.16.0.0/12"]
}
}
}
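This restricts role assumption to requests that originate from the listed ranges. Treat the private ranges shown here as placeholders: aws:SourceIp is evaluated against the public address AWS sees, so in practice you list the public egress CIDRs of your corporate network or VPN. For requests that arrive through a VPC endpoint, aws:VpcSourceIp is the key to use instead.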
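The IpAddress operator performs a CIDR containment test on the caller's address. The sketch below reproduces that test locally with Python's `ipaddress` module, which is handy for sanity-checking a list of ranges before it goes into a policy. The function name `ip_allowed` is an assumption, and the CIDRs are documentation ranges standing in for corporate egress addresses.

```python
import ipaddress

def ip_allowed(source_ip, allowed_cidrs):
    """Return True if source_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)

# Documentation ranges used as stand-ins for real egress addresses.
allowed = ["203.0.113.0/24", "198.51.100.0/24"]

print(ip_allowed("203.0.113.42", allowed))
print(ip_allowed("192.0.2.10", allowed))
```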
MFA requirement:
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "*",
"Condition": {
"Bool": {
"aws:MultiFactorAuthPresent": "true"
}
}
}
This requires multi-factor authentication before assuming sensitive roles.
Time-based restrictions:
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "*",
"Condition": {
"DateGreaterThan": {
"aws:CurrentTime": "2025-01-01T09:00:00Z"
},
"DateLessThan": {
"aws:CurrentTime": "2025-01-01T17:00:00Z"
}
}
}
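This limits role assumption to a single fixed window, here 09:00 to 17:00 UTC on 1 January 2025. IAM date conditions compare against absolute timestamps and cannot express recurring business hours, so a policy like this suits time-boxed access (a maintenance window, a contractor engagement) or has to be regenerated by automation for each window.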
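Because the Date operators take absolute timestamps, a condition block like this one has to be produced for a specific day. Here is a minimal sketch of generating it. The helper name `business_hours_condition` and the 09:00 to 17:00 UTC defaults are assumptions for illustration.

```python
from datetime import date, datetime, time, timezone

def business_hours_condition(day, start=time(9, 0), end=time(17, 0)):
    """Build the DateGreaterThan/DateLessThan block for one UTC day."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    opens = datetime.combine(day, start, tzinfo=timezone.utc)
    closes = datetime.combine(day, end, tzinfo=timezone.utc)
    return {
        "DateGreaterThan": {"aws:CurrentTime": opens.strftime(fmt)},
        "DateLessThan": {"aws:CurrentTime": closes.strftime(fmt)},
    }

print(business_hours_condition(date(2025, 1, 1)))
```

A scheduled job could call this each morning and update the policy, which is the usual workaround when access really does need to follow a daily schedule.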
Prefer OIDC Over Access Keys for CI/CD
If you’re still storing AWS access keys in your CI/CD system, migrating to OIDC federation should be your top priority. The security benefits are enormous:
- No credentials stored in your CI/CD system
- Automatic credential rotation with each build
- Fine-grained control over which repositories and branches can access which roles
- Complete auditability through CloudTrail
- Zero credential management overhead
The setup takes an hour or two, and the ongoing maintenance is zero. I’ve migrated dozens of CI/CD systems to OIDC, and I’ve never regretted it.
Implement Role Chaining Carefully
Role chaining is when you assume one role, then use those credentials to assume another role. While this is sometimes necessary, be careful: a session obtained through role chaining is limited to a maximum of one hour, regardless of the target role's MaxSessionDuration, and requesting more makes the call fail.
A common scenario where I use role chaining: a Lambda function in account A needs to access resources in account B and C. The Lambda has an execution role in account A, which can assume roles in B and C. Because the function already runs with role credentials, this call is itself a chained session:
import boto3

# Lambda assumes role in target account
sts_client = boto3.client('sts')
assumed_role = sts_client.assume_role(
    RoleArn='arn:aws:iam::target-account:role/TargetRole',
    RoleSessionName='lambda-session'
)
# Use the assumed role credentials
session = boto3.Session(
    aws_access_key_id=assumed_role['Credentials']['AccessKeyId'],
    aws_secret_access_key=assumed_role['Credentials']['SecretAccessKey'],
    aws_session_token=assumed_role['Credentials']['SessionToken']
)
Those assumed credentials can go on to assume yet another role if the permissions and the next role's trust policy allow it, but every further hop is still capped at one hour and makes the audit trail harder to follow. If you need deeper chaining, you should have a very good reason.
Use Short Sessions for High-Privilege Roles
Administrative and production roles should have maximum session durations of one to two hours. This limits the window of exposure if credentials are compromised.
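I configure all production access roles with a one-hour maximum, which is also the lowest value MaxSessionDuration accepts:
{
"MaxSessionDuration": 3600
}
For emergency break-glass roles with full admin access, I go even shorter: thirty minutes. Because MaxSessionDuration cannot be set below 3600 seconds, that shorter limit has to come from the duration requested at assume time (for example, --duration-seconds 1800). High privileges should require frequent re-authentication.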
🚀 The Golden Rule: Replace long-lived credentials with STS everywhere possible. Every permanent IAM access key in your environment is a security risk waiting to happen. Make it a goal to have zero IAM users with active access keys: federate human access and use roles for everything else.
Monitoring, Logging, and Troubleshooting STS Activity
Security is only as good as your ability to detect when something goes wrong. STS provides extensive logging capabilities through AWS CloudTrail, and understanding how to monitor and troubleshoot STS activity is essential for maintaining a secure environment.
CloudTrail: Your STS Audit Log
Every STS API call is logged to CloudTrail, giving you a complete audit trail of who assumed which roles, when, and from where. The key events you should monitor:
AssumeRole events show when roles are assumed:
{
"eventName": "AssumeRole",
"userIdentity": {
"type": "IAMUser",
"principalId": "AIDAI123456",
"arn": "arn:aws:iam::111111111111:user/developer",
"accountId": "111111111111",
"userName": "developer"
},
"requestParameters": {
"roleArn": "arn:aws:iam::222222222222:role/ProductionAccess",
"roleSessionName": "developer-prod-session",
"durationSeconds": 3600
},
"sourceIPAddress": "203.0.113.42"
}
Pay attention to unusual patterns: role assumptions from unexpected IP addresses, unusual times, or by unfamiliar principals.
AssumeRoleWithSAML and AssumeRoleWithWebIdentity events show federated access:
{
"eventName": "AssumeRoleWithWebIdentity",
"userIdentity": {
"type": "WebIdentityUser",
"principalId": "accounts.google.com:user@example.com",
"userName": "user@example.com"
},
"requestParameters": {
"roleArn": "arn:aws:iam::123456789012:role/WebAppRole",
"roleSessionName": "web-session"
}
}
These logs let you trace federated access back to the actual user who authenticated.
CloudWatch Metrics and Alarms
While CloudTrail gives you detailed logs, CloudWatch metrics let you detect patterns and anomalies. I typically create metric filters and alarms for:
Failed AssumeRole attempts:
{ $.eventName = "AssumeRole" && $.errorCode = "AccessDenied" }
Alert when there are more than five failed attempts in five minutes — this could indicate an attack or misconfiguration.
Role assumptions from unusual locations:
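{ ($.eventName = "AssumeRole") && ($.sourceIPAddress != 10.*) }
Alert when roles are assumed from outside your corporate network. Metric filter patterns compare sourceIPAddress as text and do not understand CIDR notation, so express ranges with * wildcards on octet boundaries; a range such as 172.16.0.0/12 that does not fall on an octet boundary needs one term per second octet, or a check performed outside the filter.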
High-privilege role usage:
{ $.eventName = "AssumeRole" && $.requestParameters.roleArn = "*AdminRole*" }
Get notified whenever anyone assumes an administrative role.
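When you want to test these three checks against exported CloudTrail records before wiring up alarms, the same logic can be run locally. The sketch below mirrors the filters in Python and does proper CIDR matching. The function names and the corporate ranges are assumptions, and the sample event reuses the fields from the AssumeRole record shown earlier.

```python
import ipaddress
from fnmatch import fnmatchcase

CORPORATE_CIDRS = [ipaddress.ip_network(c) for c in ("10.0.0.0/8", "172.16.0.0/12")]

def is_failed_assume_role(event):
    """AssumeRole call that was rejected with AccessDenied."""
    return (event.get("eventName") == "AssumeRole"
            and event.get("errorCode") == "AccessDenied")

def is_outside_corporate_network(event):
    """AssumeRole call whose source address is not in the corporate ranges."""
    if event.get("eventName") != "AssumeRole":
        return False
    try:
        address = ipaddress.ip_address(event.get("sourceIPAddress", ""))
    except ValueError:
        return True  # not an IP (e.g. an AWS service name); flag for manual review
    return not any(address in network for network in CORPORATE_CIDRS)

def is_admin_role_assumption(event):
    """AssumeRole call targeting a role whose ARN contains "AdminRole"."""
    role_arn = (event.get("requestParameters") or {}).get("roleArn", "")
    return event.get("eventName") == "AssumeRole" and fnmatchcase(role_arn, "*AdminRole*")

event = {
    "eventName": "AssumeRole",
    "requestParameters": {"roleArn": "arn:aws:iam::222222222222:role/ProductionAccess"},
    "sourceIPAddress": "203.0.113.42",
}
print(is_failed_assume_role(event),
      is_outside_corporate_network(event),
      is_admin_role_assumption(event))
```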
GuardDuty Findings
AWS GuardDuty has several detection capabilities specifically for STS abuse:
Suspicious role assumption patterns might indicate compromised credentials being used to explore your environment. GuardDuty detects unusual role chaining, roles being assumed from suspicious IP addresses, and rapid successive role assumptions.
Anomalous API calls after role assumption can indicate an attacker using stolen temporary credentials. GuardDuty learns normal behavior patterns and alerts on deviations.
I’ve seen GuardDuty catch several security incidents: a compromised CI/CD pipeline that started assuming roles it normally doesn’t touch, a leaked temporary credential being used from an unexpected geographic region, and an insider attempting to escalate privileges through role chaining.
IAM Access Analyzer
IAM Access Analyzer helps you identify overly permissive role trust policies. It analyzes your IAM roles and alerts you when roles can be assumed by external entities or have overly broad trust policies.
For example, it will flag this dangerous trust policy:
{
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "sts:AssumeRole"
}
This allows anyone in any AWS account to assume your role — a critical security vulnerability that Access Analyzer will immediately detect.
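If you want a quick local check before Access Analyzer sees a policy, something like the following sketch can flag the obvious case. It only looks for an unconditioned wildcard principal and is far less thorough than Access Analyzer; the function name is my own.

```python
def wildcard_principal_statements(trust_policy):
    """Return Allow statements whose principal is "*" and that carry no Condition."""
    flagged = []
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        # Principal may be the bare string "*" or a mapping such as {"AWS": "*"}
        aws = principal.get("AWS") if isinstance(principal, dict) else principal
        values = aws if isinstance(aws, list) else [aws]
        if "*" in values and not stmt.get("Condition"):
            flagged.append(stmt)
    return flagged

open_policy = {"Statement": [
    {"Effect": "Allow", "Principal": {"AWS": "*"}, "Action": "sts:AssumeRole"}
]}
scoped_policy = {"Statement": [
    {"Effect": "Allow",
     "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
     "Action": "sts:AssumeRole"}
]}
print(len(wildcard_principal_statements(open_policy)),
      len(wildcard_principal_statements(scoped_policy)))
```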
Troubleshooting Common STS Issues
Based on thousands of hours debugging AWS access issues, here are the most common STS problems and how to resolve them:
Problem: “User is not authorized to perform sts:AssumeRole”
This means the principal trying to assume the role doesn’t have permission. Check two places:
- Does the principal have sts:AssumeRole permission in their policies?
- Does the role’s trust policy allow this principal?
Both must be true. Use this diagnostic:
# Check what identity you're using
aws sts get-caller-identity
# Try to assume the role
aws sts assume-role --role-arn [role-arn] --role-session-name test
# If it fails, check CloudTrail for the detailed error
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRole
Problem: “The security token included in the request is expired”
Your temporary credentials have expired. This is normal behavior — temporary credentials have a finite lifetime. Your application needs to detect approaching expiration and request fresh credentials:
import boto3
from datetime import datetime, timedelta, timezone

ROLE_ARN = 'arn:aws:iam::123456789012:role/MyRole'
REFRESH_MARGIN = timedelta(minutes=5)
_cached_credentials = None  # reused between calls until close to expiry

def get_credentials_with_refresh():
    """Return cached credentials, re-assuming the role when they are about to expire."""
    global _cached_credentials
    now = datetime.now(timezone.utc)
    # Refresh on first use, or when fewer than 5 minutes of validity remain
    if (_cached_credentials is None
            or _cached_credentials['Expiration'] - now < REFRESH_MARGIN):
        sts_client = boto3.client('sts')
        response = sts_client.assume_role(
            RoleArn=ROLE_ARN,
            RoleSessionName='app-session'
        )
        _cached_credentials = response['Credentials']
    return _cached_credentials
Problem: “An error occurred (403) when calling the AssumeRole operation”
This typically indicates a trust policy issue. The role’s trust policy doesn’t allow your principal to assume it. Remember, trust policies are resource-based policies attached to the role being assumed.
Common mistakes in trust policies:
- Wrong principal ARN
- Missing or incorrect conditions
- Typos in the account ID
- Forgetting to include sts:ExternalId when required
Real-World Case Study: I once debugged an issue where a Lambda function in account A couldn’t assume a role in account B, despite the policies looking correct. After hours of investigation, we discovered that the Lambda execution role had been deleted and recreated with the same name but a different unique ID. When you save a trust policy that names a role ARN as the principal, IAM resolves that ARN to the role’s unique ID, so the trust policy in account B still pointed at the old, deleted role. The fix was re-saving the trust policy with the role ARN so that it resolved to the new role. This taught me to re-apply every trust policy that references a role whenever that role is deleted and recreated.
STS Security Incident Response Playbook
When temporary credentials are compromised or STS activity appears suspicious, fast and systematic response is critical. Here’s the production-tested incident response playbook I’ve used during real security events.
Step 1: Detect — Identify the Security Event
Detection Sources:
- GuardDuty Finding: UnauthorizedAccess, CredentialAccess, or Persistence type findings referencing STS
- CloudTrail Alarm: Multiple failed AssumeRole attempts, unusual geographic access, or role chaining anomalies
- Manual Report: User or security team identifies suspicious STS activity
- Third-Party Alert: Security vendor or partner reports potential credential compromise
Immediate Actions:
# Query CloudTrail for recent STS activity
# (the source IP lives inside the CloudTrailEvent JSON string, so extract it with jq)
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRole \
--start-time $(date -u -d '2 hours ago' +%Y-%m-%dT%H:%M:%S) \
--max-results 50 \
--output json | \
jq -r '.Events[] | (.CloudTrailEvent | fromjson) as $e | [.EventTime, .Username, $e.sourceIPAddress, $e.requestParameters.roleArn] | @tsv'
# Check for unusual role assumptions
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRoleWithSAML \
--start-time $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%S) \
--query 'Events[?contains(Resources[0].ResourceName, `suspicious-role`)]'
What to Look For:
- Role assumptions from unexpected IP addresses or geographic locations
- High-frequency role assumptions (credential stuffing attempts)
- Role assumptions during unusual hours (overnight, weekends)
- Unusual role chaining patterns
- Failed AssumeRole attempts followed by successful ones (privilege escalation)
- AssumeRole calls with session names containing suspicious patterns
Document Everything: Create an incident ticket immediately. Record the first detection time, affected roles, suspicious IP addresses, and initial evidence.
Step 2: Isolate — Contain the Threat
Priority: Stop ongoing unauthorized access immediately.
Option A: Revoke All Active Sessions for a Role (Nuclear Option)
# Add explicit deny to the role's inline policy - revokes ALL active sessions
aws iam put-role-policy \
--role-name CompromisedRole \
--policy-name RevokeAllSessions-$(date +%Y%m%d-%H%M%S) \
--policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Deny",
"Action": "*",
"Resource": "*",
"Condition": {
"DateLessThan": {
"aws:TokenIssueTime": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"
}
}
}
]
}'
echo "✅ All sessions issued before $(date -u +%Y-%m-%dT%H:%M:%SZ) are now revoked"
How It Works: This policy denies all actions for credentials issued before the current timestamp. Any active sessions become immediately useless, while new role assumptions will work normally.
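The same revocation policy can be generated programmatically, which helps when the playbook is automated. This is a minimal sketch assuming you pass a timezone-aware cutoff; the helper names are illustrative, and is_denied only mirrors the date comparison rather than calling AWS.

```python
import json
from datetime import datetime, timezone

def build_revoke_policy(cutoff):
    """Deny everything for sessions whose token was issued before `cutoff` (timezone-aware)."""
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"DateLessThan": {"aws:TokenIssueTime": stamp}},
        }],
    }

def is_denied(token_issued_at, cutoff):
    """Mirror the DateLessThan test: older tokens match the Deny, newer ones do not."""
    return token_issued_at < cutoff

cutoff = datetime(2025, 3, 15, 14, 40, tzinfo=timezone.utc)
print(json.dumps(build_revoke_policy(cutoff), indent=2))
```

A session issued at 14:30 UTC falls before the cutoff and is denied, while one issued at 14:45 UTC is untouched. That is why legitimate users can simply re-assume the role after the revocation.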
Option B: Restrict Role to Specific IPs (Surgical Option)
# Modify trust policy to only allow role assumption from known good IPs
# (this affects new AssumeRole calls only; use Option A to cut off sessions already issued)
aws iam update-assume-role-policy \
--role-name CompromisedRole \
--policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"AWS": "arn:aws:iam::123456789012:root"},
"Action": "sts:AssumeRole",
"Condition": {
"IpAddress": {
"aws:SourceIp": ["10.0.0.0/8", "YOUR-KNOWN-GOOD-IP/32"]
}
}
}]
}'
echo "✅ Role now only assumable from trusted IPs"
Option C: Disable Role Assumption Entirely (Emergency Stop)
# Replace trust policy with explicit deny
# (blocks new role assumptions only; use Option A to cut off sessions already issued)
aws iam update-assume-role-policy \
--role-name CompromisedRole \
--policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Deny",
"Principal": {"AWS": "*"},
"Action": "sts:AssumeRole"
}]
}'
echo "🛑 Role assumption completely disabled"
For Third-Party Compromises:
# If external ID was compromised, rotate it immediately
# Update trust policy with new external ID
NEW_EXTERNAL_ID=$(openssl rand -base64 32)
echo "New External ID: $NEW_EXTERNAL_ID"
# Update trust policy (coordinate with third party)
aws iam update-assume-role-policy \
--role-name ThirdPartyRole \
--policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"AWS": "arn:aws:iam::VENDOR-ACCOUNT:root"},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "'$NEW_EXTERNAL_ID'"
}
}
}]
}'
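If you script the rotation in Python instead of shell, the standard secrets module gives you an unpredictable external ID. The sketch below is illustrative: the helper names and the example vendor account ID are assumptions, and the pattern reflects the characters and length range STS accepts for an external ID as I understand them.

```python
import re
import secrets

# External IDs may contain letters, digits and _+=,.@:/- (2 to 1224 characters)
EXTERNAL_ID_PATTERN = re.compile(r"[\w+=,.@:/-]{2,1224}", re.ASCII)

def new_external_id(num_bytes=32):
    """Generate a random, URL-safe external ID from `num_bytes` of entropy."""
    return secrets.token_urlsafe(num_bytes)

def trust_policy_for_vendor(vendor_account_id, external_id):
    """Build a trust policy that requires the given external ID."""
    if not EXTERNAL_ID_PATTERN.fullmatch(external_id):
        raise ValueError("external ID contains unsupported characters or has a bad length")
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{vendor_account_id}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }

external_id = new_external_id()
policy = trust_policy_for_vendor("111122223333", external_id)  # hypothetical vendor account
print(policy["Statement"][0]["Principal"]["AWS"])
```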
Step 3: Rotate — Eliminate Attacker Persistence
Recreate the compromised role with a new unique ID:
#!/bin/bash
# rotate-compromised-role.sh
ROLE_NAME="CompromisedRole"
BACKUP_SUFFIX=$(date +%Y%m%d-%H%M%S)
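# 1. Get current role details and save the trust policy for step 5
aws iam get-role --role-name $ROLE_NAME > role-backup-${BACKUP_SUFFIX}.json
jq '.Role.AssumeRolePolicyDocument' role-backup-${BACKUP_SUFFIX}.json > trust-policy.json
# 2. Get all attached policies
aws iam list-attached-role-policies --role-name $ROLE_NAME > policies-backup-${BACKUP_SUFFIX}.json
# 3. Get all inline policies (names only; export each document with get-role-policy before deleting)
aws iam list-role-policies --role-name $ROLE_NAME > inline-policies-backup-${BACKUP_SUFFIX}.json
# 4. Delete the old role (this invalidates ALL sessions)
# delete-role fails while policies are attached, so detach and delete them first
# (also remove the role from any instance profiles)
for arn in $(aws iam list-attached-role-policies --role-name $ROLE_NAME \
--query 'AttachedPolicies[].PolicyArn' --output text); do
aws iam detach-role-policy --role-name $ROLE_NAME --policy-arn $arn
done
for name in $(aws iam list-role-policies --role-name $ROLE_NAME \
--query 'PolicyNames[]' --output text); do
aws iam delete-role-policy --role-name $ROLE_NAME --policy-name $name
done
aws iam delete-role --role-name $ROLE_NAME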
# 5. Recreate role with same name but new unique ID
aws iam create-role \
--role-name $ROLE_NAME \
--assume-role-policy-document file://trust-policy.json \
--description "Recreated after security incident ${BACKUP_SUFFIX}"
# 6. Reattach policies from backup
# (Script continues with policy reattachment...)
echo "✅ Role rotated successfully. All previous sessions invalid."
For Federated Access:
# If SAML/OIDC provider trust was abused, recreate trust with stricter conditions
# Add MFA requirement, IP restrictions, time-based access
For Cross-Account Access:
- Coordinate with all accounts that can assume the role
- Update trust policies in all accounts simultaneously
- Rotate external IDs for all third-party relationships
Step 4: Forensic — Understand the Breach
Collect comprehensive evidence for post-incident analysis:
#!/bin/bash
# forensic-collection.sh
# Collect all STS-related activity for forensic analysis
INCIDENT_DATE="2025-03-15"
OUTPUT_DIR="incident-forensics-$(date +%Y%m%d-%H%M%S)"
mkdir -p $OUTPUT_DIR
# 1. Collect all STS API calls for the affected time period
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRole \
--start-time ${INCIDENT_DATE}T00:00:00 \
--end-time ${INCIDENT_DATE}T23:59:59 \
--max-items 1000 > ${OUTPUT_DIR}/assume-role-events.json
# 2. Collect all actions taken by suspected sessions
SUSPICIOUS_SESSION="suspicious-session-name"
aws cloudtrail lookup-events \
--start-time ${INCIDENT_DATE}T00:00:00 \
--query "Events[?contains(CloudTrailEvent, '$SUSPICIOUS_SESSION')]" \
> ${OUTPUT_DIR}/suspicious-session-activity.json
# 3. Extract unique IP addresses used
jq -r '.Events[].CloudTrailEvent | fromjson | .sourceIPAddress' ${OUTPUT_DIR}/assume-role-events.json | \
sort -u > ${OUTPUT_DIR}/unique-ips.txt
# 4. Collect VPC Flow Logs if the access came from EC2/Lambda
# (Coordinate with network team for flow log analysis)
# 5. Export GuardDuty findings
aws guardduty list-findings \
--detector-id YOUR-DETECTOR-ID \
--finding-criteria '{"Criterion":{"service.action.actionType":{"Eq":["AWS_API_CALL"]}}}' \
> ${OUTPUT_DIR}/guardduty-findings.json
# 6. Check for data exfiltration or resource creation
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=CreateAccessKey \
--start-time ${INCIDENT_DATE}T00:00:00 \
> ${OUTPUT_DIR}/access-key-creation.json
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=RunInstances \
--start-time ${INCIDENT_DATE}T00:00:00 \
> ${OUTPUT_DIR}/ec2-launches.json
echo "✅ Forensic data collected in ${OUTPUT_DIR}/"
Analyze the Attack Timeline:
- First unauthorized AssumeRole call (entry point)
- Escalation attempts (trying to assume other roles)
- Data access (S3 GetObject, RDS connections, etc.)
- Resource creation (EC2 instances, Lambda functions)
- Persistence mechanisms (creating new access keys, IAM users)
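Once the events are collected, ordering them into phases is mostly a sorting exercise. The sketch below assumes the JSON shape that aws cloudtrail lookup-events writes, where each entry carries the full record as a JSON string in CloudTrailEvent; the phase mapping and sample events are hypothetical.

```python
import json

# Hypothetical mapping from CloudTrail event names to the phases listed above
PHASES = {
    "AssumeRole": "entry or escalation",
    "RunInstances": "resource creation",
    "CreateAccessKey": "persistence",
    "CreateUser": "persistence",
}

def build_timeline(lookup_output):
    """Return (time, phase, event name, source IP) tuples in chronological order."""
    rows = []
    for event in lookup_output.get("Events", []):
        # The detailed record is a JSON string, so it needs a second parse
        record = json.loads(event["CloudTrailEvent"])
        name = record.get("eventName", "")
        rows.append((
            record.get("eventTime", ""),
            PHASES.get(name, "other"),
            name,
            record.get("sourceIPAddress", ""),
        ))
    return sorted(rows)

def _event(time, name, ip="203.0.113.42"):
    """Build a sample entry shaped like lookup-events output."""
    return {"CloudTrailEvent": json.dumps(
        {"eventTime": time, "eventName": name, "sourceIPAddress": ip})}

sample = {"Events": [
    _event("2025-03-15T14:41:10Z", "CreateAccessKey"),
    _event("2025-03-15T14:30:02Z", "AssumeRole"),
    _event("2025-03-15T14:33:47Z", "RunInstances"),
]}
for row in build_timeline(sample):
    print(*row, sep="  ")
```

Note that lookup-events returns management events only. Data access such as S3 GetObject has to come from data-event logs delivered to your trail's S3 bucket, so that phase needs a separate source.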
Key Questions to Answer:
- How did the attacker initially gain access to assume the role?
- What actions did they take with the temporary credentials?
- Did they create any persistence mechanisms?
- Was data exfiltrated?
- Were other roles or accounts affected?
Step 5: Remediate — Fix Root Causes
Address the security gaps that allowed the incident:
Trust Policy Hardening:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/CICDPipelineRole"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "NEW-ROTATED-EXTERNAL-ID",
"aws:RequestedRegion": "us-east-1"
},
"IpAddress": {
"aws:SourceIp": ["10.0.0.0/8", "52.94.76.0/22"]
}
}
}]
}
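One trap when stacking conditions: a JSON object cannot usefully hold the same key twice, so two blocks with the same operator inside one Condition collapse into whichever comes last in most parsers. A small sketch like this, using only the standard json module, can catch that before you apply a policy. The function name is my own.

```python
import json

def find_duplicate_keys(policy_text):
    """Return keys that appear more than once within the same JSON object."""
    duplicates = []

    def check_pairs(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(policy_text, object_pairs_hook=check_pairs)
    return duplicates

broken = '''{"Condition": {
  "StringEquals": {"sts:ExternalId": "abc"},
  "StringEquals": {"aws:RequestedRegion": "us-east-1"}
}}'''
print(find_duplicate_keys(broken))
print(json.loads(broken)["Condition"]["StringEquals"])
```

Python's parser keeps only the second StringEquals block here, silently dropping the external ID check. Put all keys for one operator in a single block.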
Enable Additional Controls:
# 1. Enable MFA requirement for sensitive roles
# (Add to trust policy or permission policies)
# 2. Reduce maximum session duration
aws iam update-role \
--role-name SensitiveRole \
--max-session-duration 3600 # 1 hour only
# 3. Enable AWS CloudTrail data events for all S3 buckets
# (Catch data exfiltration earlier)
# 4. Configure AWS Config rules for compliance
aws configservice put-config-rule \
--config-rule '{
"ConfigRuleName": "iam-policy-no-statements-with-admin-access",
"Description": "Checks customer managed IAM policies for statements that grant full administrative access",
"Source": {
"Owner": "AWS",
"SourceIdentifier": "IAM_POLICY_NO_STATEMENTS_WITH_ADMIN_ACCESS"
}
}'
# 5. Enable GuardDuty if not already enabled
aws guardduty create-detector --enable
Implement Session Policies for Least Privilege:
# For applications that need restricted access
import boto3
import json

def assume_role_with_session_policy():
    sts = boto3.client('sts')
    # Session policy further restricts the role's permissions
    session_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::specific-bucket/*"
        }]
    }
    response = sts.assume_role(
        RoleArn='arn:aws:iam::123456789012:role/ApplicationRole',
        RoleSessionName='restricted-session',
        Policy=json.dumps(session_policy),
        DurationSeconds=900  # 15 minutes only
    )
    return response['Credentials']
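It helps to remember that a session policy can only narrow what the role already allows; it never grants anything new. A deliberately simplified model treats each policy as a set of exact action names and takes the intersection. Real evaluation also considers resources, conditions, wildcards, and explicit denies, so read this as an illustration only.

```python
def effective_actions(role_actions, session_actions):
    """Simplified model: a session can use only actions allowed by BOTH policies."""
    return set(role_actions) & set(session_actions)

role_policy_actions = {"s3:GetObject", "s3:PutObject", "s3:DeleteObject"}
session_policy_actions = {"s3:GetObject", "s3:PutObject", "ec2:RunInstances"}

allowed = effective_actions(role_policy_actions, session_policy_actions)
print(sorted(allowed))  # ['s3:GetObject', 's3:PutObject']
```

The session loses s3:DeleteObject because the session policy omits it, and it does not gain ec2:RunInstances because the role never had it.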
Step 6: Report — Document and Notify
Internal Reporting:
## STS Security Incident Report
**Date:** 2025-03-15
**Incident ID:** INC-2025-0315-001
**Severity:** High
### Executive Summary
Unauthorized AssumeRole activity detected on ProductionDeployRole from
unknown IP 203.0.113.42 between 14:30-15:45 UTC.
### Timeline
- 14:30 UTC: First unauthorized AssumeRole detected
- 14:32 UTC: GuardDuty alert triggered
- 14:35 UTC: Security team notified
- 14:40 UTC: Role sessions revoked via policy update
- 15:00 UTC: Role recreated with stricter trust policy
- 15:45 UTC: Forensic collection completed
### Impact
- 12 unauthorized AWS API calls executed
- No data exfiltration detected
- No resources created
- $0 unexpected charges incurred
### Root Cause
OIDC provider trust policy lacked branch restrictions, allowing
attacker to assume role from forked repository.
### Remediation Actions
1. ✅ Role sessions revoked immediately
2. ✅ Trust policy updated with branch restrictions
3. ✅ External ID rotated and shared with authorized parties
4. ✅ Maximum session duration reduced to 1 hour
5. ✅ CloudWatch alarms configured for this role
6. ⏳ Company-wide OIDC trust policy audit (Due: 2025-03-20)
### Lessons Learned
- OIDC trust policies must include specific branch/tag restrictions
- Need faster alerting on unusual AssumeRole patterns
- Incident response playbook worked well, executed in 15 minutes
Compliance & Billing Notifications:
# 1. Check for unexpected AWS charges
aws ce get-cost-and-usage \
--time-period Start=${INCIDENT_DATE},End=$(date -d "${INCIDENT_DATE} + 1 day" +%Y-%m-%d) \
--granularity DAILY \
--metrics BlendedCost \
--group-by Type=DIMENSION,Key=SERVICE
# 2. Notify compliance team if regulated data was accessed
# (Check CloudTrail for access to S3 buckets containing PII/PHI)
# 3. If customer data was accessed, trigger breach notification process
External Notifications (if required):
- Notify third-party vendors if their credentials were affected
- Coordinate external ID rotation with all partners
- Report to relevant compliance authorities (GDPR, HIPAA, etc.) if data breach occurred
Post-Incident Actions:
- Schedule post-mortem meeting within 48 hours
- Update incident response playbook based on lessons learned
- Conduct table-top exercise for similar scenarios
- Review and update all STS trust policies across the organization
- Implement additional detective controls identified during forensics
Recovery Time Objectives (RTOs) for STS Incidents:
- Detection to Containment: < 15 minutes
- Containment to Rotation: < 30 minutes
- Full Forensic Analysis: < 24 hours
- Remediation Complete: < 48 hours
Common STS Mistakes to Avoid
After consulting on hundreds of AWS environments, I’ve seen the same STS mistakes repeatedly. Let me help you avoid them.
Overly Permissive AssumeRole Permissions
The worst mistake I see is granting broad AssumeRole permissions:
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "*"
}
This allows the principal to assume any role they can reach. Instead, explicitly list the roles:
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": [
"arn:aws:iam::123456789012:role/DeploymentRole",
"arn:aws:iam::456789012345:role/ProductionReadOnly"
]
}
If you need wildcards, at least scope them by naming convention:
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "arn:aws:iam::*:role/CICD-*"
}
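Before shipping a wildcard like this, it is worth listing which role ARNs it would actually cover. The sketch below approximates IAM's * and ? matching with Python's fnmatchcase, which is close enough for role ARNs that contain no square brackets; the role names are made up.

```python
from fnmatch import fnmatchcase

def assumable(role_arns, resource_pattern):
    """Return the ARNs matched by a wildcard Resource pattern (approximation)."""
    return [arn for arn in role_arns if fnmatchcase(arn, resource_pattern)]

candidate_roles = [
    "arn:aws:iam::123456789012:role/CICD-Deploy",
    "arn:aws:iam::123456789012:role/AdminRole",
    "arn:aws:iam::999999999999:role/CICD-Anything",
]
for arn in assumable(candidate_roles, "arn:aws:iam::*:role/CICD-*"):
    print(arn)
```

The pattern covers CICD-prefixed roles in every account, including ones you do not own, so the naming convention only helps if the trust policies on the other side are tight.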
Not Enabling CloudTrail for STS Events
STS events are crucial for security investigations, but some organizations don’t log them properly. Ensure your CloudTrail trail is:
- Logging global service events (where STS events appear)
- Enabled in all regions
- Writing to a secure S3 bucket with proper retention
- Monitored with CloudWatch alarms
I’ve investigated security incidents where the lack of STS logging meant we couldn’t determine which roles were compromised or what actions were taken.
Implementing Federation Without MFA
When you set up SAML or OIDC federation, enforce multi-factor authentication at the identity provider level. Without MFA, federated access is only as secure as user passwords — and users are notoriously bad at choosing strong, unique passwords.
In your trust policies, you can also require an MFA claim from the identity provider. The MFA condition key below is illustrative; use whichever attribute your IdP actually asserts:
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:saml-provider/Corporate"
},
"Action": "sts:AssumeRoleWithSAML",
"Condition": {
"StringEquals": {
"saml:aud": "https://signin.aws.amazon.com/saml",
"saml:multi-factor-auth-present": "true"
}
}
}
Misconfigured Trust Policies Leading to Unintended Access
Trust policies are the gatekeeper for role assumption, and small mistakes can have big consequences. I’ve seen trust policies that accidentally granted access to the wrong account:
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::12345678901:root" // Typo! Wrong account
},
"Action": "sts:AssumeRole"
}
Always double-check account IDs. Use IAM Access Analyzer to detect overly permissive trust policies.
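A simple pre-deployment check catches many of these typos. This sketch validates the principal ARN format and compares the account ID against an allow-list; the pattern covers only root, role, and user principals, and the allow-list contents are hypothetical.

```python
import re

# Covers account root, role and user principals only
PRINCIPAL_ARN = re.compile(r"arn:aws:iam::(\d*):(?:root|role/.+|user/.+)")

def suspicious_principals(principal_arns, known_accounts):
    """Return ARNs that are malformed or reference an account not in `known_accounts`."""
    flagged = []
    for arn in principal_arns:
        match = PRINCIPAL_ARN.fullmatch(arn)
        account = match.group(1) if match else None
        # AWS account IDs are always exactly 12 digits
        if account is None or len(account) != 12 or account not in known_accounts:
            flagged.append(arn)
    return flagged

known = {"123456789012"}
principals = [
    "arn:aws:iam::12345678901:root",         # 11 digits: malformed
    "arn:aws:iam::123456789012:root",        # expected account
    "arn:aws:iam::123456789021:role/Deploy", # transposed digits: valid format, wrong account
]
print(suspicious_principals(principals, known))
```

The length check alone catches a dropped digit, but a transposed digit still forms a valid account ID, which is why the allow-list matters.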
Excessive Role Chaining
While role chaining (assuming a role, then using those credentials to assume another role) is sometimes necessary, chains longer than one hop create complexity and debugging nightmares. AWS also caps a chained session at one hour, regardless of the role’s maximum session duration.
If you find yourself needing deep role chains, reconsider your architecture. Usually, there’s a cleaner design that doesn’t require complex chaining.
Setting Extremely Long Session Durations
I’ve seen roles configured with twelve-hour maximum session durations for convenience. This defeats much of the security benefit of temporary credentials.
For most use cases, sessions should be one to four hours. Only increase session durations when you have a specific, documented reason, and never set them longer than necessary.
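The rules around requested durations are easy to encode as a guard in your own tooling. The sketch below reflects the limits as I understand them: a 15-minute minimum, the role's maximum session duration as the ceiling, and a one-hour ceiling when role chaining. The function name is illustrative, and STS rejects out-of-range requests rather than adjusting them.

```python
MIN_SESSION_SECONDS = 900      # 15 minutes
CHAINED_SESSION_LIMIT = 3600   # role chaining caps sessions at 1 hour

def check_duration(requested, role_max_session=3600, chained=False):
    """Return `requested` if it is within the limits, otherwise raise ValueError."""
    limit = min(role_max_session, CHAINED_SESSION_LIMIT) if chained else role_max_session
    if requested < MIN_SESSION_SECONDS:
        raise ValueError(f"{requested}s is below the {MIN_SESSION_SECONDS}s minimum")
    if requested > limit:
        raise ValueError(f"{requested}s exceeds the {limit}s limit for this session")
    return requested

print(check_duration(900))                            # short production session
print(check_duration(14400, role_max_session=14400))  # four hours, role allows it
```

Calling check_duration(7200, role_max_session=43200, chained=True) raises an error, because the chained limit applies even when the role itself allows twelve hours.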
Ignoring External ID Requirements
When granting third-party access, always require and verify external IDs. Skipping this opens you to confused deputy attacks where an attacker tricks the third party into accessing your resources.
Generate unique, unpredictable external IDs for each third-party relationship and treat them as secrets.
STS Pricing: The Cost of Security
One of the beautiful aspects of AWS STS is that the service itself is completely free. You don’t pay per API call, per assumed role, or per temporary credential issued. AWS provides STS at no charge as part of the IAM service.
However, this doesn’t mean STS is without cost considerations. The actions you take with temporary credentials can absolutely incur costs, and security incidents involving STS can be very expensive.
What’s Actually Free
Every STS API call is free:
- AssumeRole calls: Free
- AssumeRoleWithSAML calls: Free
- AssumeRoleWithWebIdentity calls: Free
- GetSessionToken calls: Free
- GetCallerIdentity calls: Free
The temporary credentials themselves have no cost. You can assume roles millions of times per day at no charge.
Where Costs Can Occur
The costs come from what you do with those credentials:
API calls and service usage made with temporary credentials are billed normally. If someone assumes a role and launches expensive EC2 instances or transfers terabytes of data, you pay for those resources just as you would with any other credentials.
Data transfer costs can accumulate if roles are used to transfer large amounts of data, especially cross-region or to the internet.
Resource creation with assumed roles is billed normally. If a compromised temporary credential is used to launch cryptocurrency miners, you’ll receive a massive bill.
Security Incident Costs
The most expensive STS-related scenarios I’ve seen involve compromised temporary credentials being used for unauthorized activities:
Cryptocurrency mining using stolen temporary credentials can rack up hundreds of thousands in EC2 costs within hours. While STS itself is free, the resources launched with compromised credentials are very expensive.
Data exfiltration can lead to significant data transfer costs, plus the incalculable cost of a data breach.
Resource hijacking where attackers modify existing resources or create new ones for their purposes.
This is why short session durations are so important. Even if credentials are compromised, limiting their validity to fifteen or thirty minutes dramatically reduces potential damage.
Cost Optimization Through STS
Paradoxically, proper use of STS can actually reduce costs:
Eliminating unused IAM users means fewer potential vectors for unauthorized access that could result in unexpected charges.
Scoped permissions through session policies can prevent accidental resource creation or modification that leads to costs.
Audit trails through CloudTrail help you quickly identify and stop expensive unauthorized activities.
Real-World Caution: STS makes your environment more secure, but security is about the whole system — the IAM policies, trust relationships, monitoring, and incident response procedures. A poorly configured role assumed via STS can be just as dangerous as a compromised IAM user. The temporary nature of STS credentials limits the damage window, but proper security hygiene remains essential.
Conclusion: STS as Your AWS Security Foundation
If I could give you one piece of advice that would most improve your AWS security posture, it would be this: embrace temporary credentials everywhere possible. AWS Security Token Service isn’t just another service — it’s the foundation of modern cloud security architecture.
Think about what STS enables: you can grant access that automatically expires, you can scope permissions to exactly what’s needed right now, you can audit every role assumption, and you can eliminate the entire class of problems caused by long-lived credentials. This is the difference between “we have AWS security” and “we have AWS security done right.”
In my years working with AWS, I’ve seen the industry shift from permanent IAM users being the norm to temporary credentials being the standard. Organizations that embrace this shift — federating workforce access, using OIDC for CI/CD, implementing cross-account role assumption, and eliminating standing privileges — consistently have better security outcomes and fewer incidents.
Remember: STS is the backbone of secure, temporary, and scalable access in AWS. If IAM defines who can access what, STS is the mechanism that makes that access temporary, auditable, and safe.
The journey to a fully role-based, temporary-credential architecture takes time. Start by identifying your highest-risk permanent credentials — perhaps IAM users in your CI/CD system or access keys embedded in applications. Migrate those first. Then tackle workforce access through federation. Finally, establish role assumption patterns for cross-account access and third-party integrations.
Your AWS environment will be more secure, more auditable, and honestly, easier to manage. Temporary credentials eliminate entire categories of operational overhead — no more rotating keys, no more wondering which credentials are still in use, no more manual credential lifecycle management.
Reflection Prompt: Look at your AWS environment right now. How many permanent IAM user credentials exist? Which of those could be replaced with STS-based temporary access in the next sprint? What’s stopping you from making that change today?
Frequently Asked Questions
What is AWS STS?
AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for accessing AWS resources. These credentials consist of an access key ID, secret access key, and session token that work together for authentication. Unlike permanent IAM user credentials, STS credentials automatically expire after a defined period (fifteen minutes to twelve hours for assumed roles), significantly reducing security risks.
How does AssumeRole work in AWS?
AssumeRole is an STS API operation that allows an IAM entity to temporarily take on the permissions of an IAM role. When you call AssumeRole, you specify the ARN of the role you want to assume. STS validates that your current identity is allowed to assume that role by checking the role’s trust policy. If authorized, STS issues temporary security credentials with the permissions attached to the target role. These credentials expire after the requested session duration.
How do I enable cross-account access using STS?
To enable cross-account access, create an IAM role in the target account with the necessary permissions and a trust policy that allows principals from the source account to assume it. In the source account, grant your users or roles permission to call sts:AssumeRole for the target role. When assuming the role, use the AWS CLI or SDK with the --role-arn parameter pointing to the role in the target account. For enhanced security, use external IDs in the trust policy when granting access to third parties.
Is AWS STS secure?
Yes, AWS STS is secure when properly configured. It provides several security advantages over permanent credentials: automatic expiration limits the damage window if credentials are compromised, temporary credentials can be scoped with session policies for least-privilege access, all role assumptions are logged in CloudTrail for auditability, and the absence of long-lived credentials eliminates entire classes of credential leakage vulnerabilities. However, security depends on proper configuration of trust policies, appropriate session durations, and monitoring of STS activity.
How long do AWS STS credentials last?
STS credential lifetimes vary by API and configuration. AssumeRole credentials last from fifteen minutes to twelve hours; you can request any duration up to the role’s MaxSessionDuration setting, and a request above that setting fails rather than being shortened. Sessions obtained through role chaining are limited to one hour. GetSessionToken credentials last from fifteen minutes to thirty-six hours. AssumeRoleWithSAML and AssumeRoleWithWebIdentity credentials last from fifteen minutes to twelve hours. Most AWS SDKs automatically refresh credentials before they expire when using IAM roles.
Is AWS STS free?
Yes, AWS STS itself is completely free. There are no charges for API calls to STS, regardless of how many times you assume roles or generate temporary credentials. However, any AWS resources or services you access using STS credentials are billed normally. Actions taken with temporary credentials incur the same costs as actions taken with permanent credentials — the difference is in security posture, not pricing.
Next Steps: Implementing STS in Your Environment
Ready to improve your AWS security with STS? Here’s your action plan:
Week 1: Audit your IAM users and identify candidates for migration to role-based access.
Week 2: Implement federation through AWS Identity Center for workforce access.
Week 3: Kill Your CI/CD Access Keys with Terraform (Deploy This Today)
Don’t just plan to migrate to OIDC — deploy this production-ready Terraform configuration to enable GitHub Actions OIDC federation immediately. This eliminates stored credentials in your CI/CD system and takes less than fifteen minutes to implement.
# terraform/github-oidc-federation.tf
# This configuration enables GitHub Actions to assume AWS roles without storing credentials
# Create the OIDC provider for GitHub Actions
resource "aws_iam_openid_connect_provider" "github" {
url = "https://token.actions.githubusercontent.com"
client_id_list = ["sts.amazonaws.com"]
# GitHub's thumbprint - verify current value at:
# https://github.blog/changelog/2022-01-13-github-actions-update-on-oidc-based-deployments-to-aws/
thumbprint_list = ["1c58a3a8518e8759bf075b76b750d4f2df264fcd"]
tags = {
Name = "github-actions-oidc"
Environment = "production"
ManagedBy = "terraform"
}
}
# Create the IAM role that GitHub Actions will assume
resource "aws_iam_role" "github_actions" {
name = "github-actions-deploy-role"
assume_role_policy = data.aws_iam_policy_document.github_assume_role.json
max_session_duration = 3600 # 1 hour - adjust based on your deployment time
tags = {
Name = "github-actions-deploy-role"
Environment = "production"
Purpose = "CI/CD deployments via OIDC"
}
}
# Define the trust policy - which repositories can assume this role
data "aws_iam_policy_document" "github_assume_role" {
statement {
actions = ["sts:AssumeRoleWithWebIdentity"]
effect = "Allow"
principals {
type = "Federated"
identifiers = [aws_iam_openid_connect_provider.github.arn]
}
condition {
test = "StringEquals"
variable = "token.actions.githubusercontent.com:aud"
values = ["sts.amazonaws.com"]
}
# CRITICAL: This condition limits access to your specific repositories
# Replace 'my-org' and 'my-repo' with your actual GitHub organization and repository
condition {
test = "StringLike"
variable = "token.actions.githubusercontent.com:sub"
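# Allow any ref in the specified repository (branches, tags, pull requests, environments)
# Prefer the exact-branch list below for production roles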
values = ["repo:my-org/my-repo:*"]
# For more granular control, specify exact branches:
# values = [
# "repo:my-org/my-repo:ref:refs/heads/main",
# "repo:my-org/my-repo:ref:refs/heads/production"
# ]
}
}
}
# Attach deployment permissions to the role
# Example: ECS deployment permissions (customize for your use case)
resource "aws_iam_role_policy" "github_actions_deploy" {
name = "github-actions-deployment-permissions"
role = aws_iam_role.github_actions.id
policy = data.aws_iam_policy_document.deployment_permissions.json
}
data "aws_iam_policy_document" "deployment_permissions" {
# ECS deployment permissions
statement {
actions = [
"ecs:UpdateService",
"ecs:DescribeServices",
"ecs:DescribeTaskDefinition",
"ecs:RegisterTaskDefinition"
]
resources = ["*"]
}
# ECR permissions for pushing images
statement {
actions = [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:PutImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload"
]
resources = ["*"]
}
# S3 deployment permissions
statement {
actions = [
"s3:PutObject",
"s3:GetObject",
"s3:ListBucket"
]
resources = [
"arn:aws:s3:::my-deployment-bucket",
"arn:aws:s3:::my-deployment-bucket/*"
]
}
}
# Output the role ARN for use in GitHub Actions
output "github_actions_role_arn" {
description = "ARN of the IAM role for GitHub Actions - add this to your workflow"
value = aws_iam_role.github_actions.arn
}
output "setup_instructions" {
description = "Next steps to complete the setup"
value = <<-EOT
✅ OIDC Provider and Role Created Successfully!
Next steps:
1. Add this role ARN to your GitHub Actions workflow:
role-to-assume: ${aws_iam_role.github_actions.arn}
2. Update your .github/workflows/deploy.yml with:
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${aws_iam_role.github_actions.arn}
          aws-region: us-east-1
3. Delete your old AWS access keys from GitHub Secrets
4. Test a deployment to verify OIDC authentication works
EOT
}
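The sub condition in the trust policy deserves a closer look, because the wildcard form admits more than branches. GitHub puts the triggering context into the sub claim, for example repo:ORG/REPO:ref:refs/heads/main for a branch push and repo:ORG/REPO:pull_request for a pull request. The sketch below approximates StringLike matching with fnmatchcase to show the difference between the wildcard and the exact-branch list; it is a local illustration, not how AWS evaluates the policy internally.

```python
from fnmatch import fnmatchcase

def sub_allowed(sub_claim, patterns):
    """Approximate StringLike: True if the sub claim matches any pattern."""
    return any(fnmatchcase(sub_claim, pattern) for pattern in patterns)

wildcard = ["repo:my-org/my-repo:*"]
exact_branches = [
    "repo:my-org/my-repo:ref:refs/heads/main",
    "repo:my-org/my-repo:ref:refs/heads/production",
]

claims = [
    "repo:my-org/my-repo:ref:refs/heads/main",
    "repo:my-org/my-repo:ref:refs/heads/feature-x",
    "repo:my-org/my-repo:pull_request",
    "repo:other-org/my-repo:ref:refs/heads/main",
]
for claim in claims:
    print(claim, sub_allowed(claim, wildcard), sub_allowed(claim, exact_branches))
```

With the wildcard, feature branches and pull request runs can assume the role; with the exact list, only main and production can. Neither form admits a repository from a different organization.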
Deploy This Configuration:
# Initialize Terraform
terraform init
# Preview the changes
terraform plan
# Deploy (this takes ~30 seconds)
terraform apply
# Terraform will output the role ARN and next steps
Update Your GitHub Workflow:
After applying the Terraform configuration, update your .github/workflows/deploy.yml:
name: Deploy to AWS
on:
  push:
    branches: [main]
# CRITICAL: These permissions enable OIDC token generation
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # No AWS credentials in secrets - OIDC handles it
      - name: Configure AWS credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy-role
          aws-region: us-east-1
          role-session-name: GitHubActions-${{ github.run_id }}
      - name: Verify AWS identity
        run: aws sts get-caller-identity
      - name: Deploy your application
        run: |
          # Your deployment commands here
          echo "Deploying with temporary credentials..."
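The "Verify AWS identity" step prints the caller's ARN, and it can be worth checking that value rather than just eyeballing it. Here is a minimal Python sketch of that check, assuming the workflow assumed the role named in the example above. The `parse_assumed_role_arn` helper and the sample JSON are illustrative, not part of any AWS SDK; only the `UserId`, `Account`, and `Arn` keys and the assumed-role ARN layout come from STS itself.

```python
import json


def parse_assumed_role_arn(arn: str) -> dict:
    """Split an STS assumed-role ARN into account, role name, and session name."""
    # Layout: arn:<partition>:sts::<account-id>:assumed-role/<role-name>/<session-name>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "sts":
        raise ValueError(f"not an STS ARN: {arn!r}")
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "assumed-role":
        raise ValueError(f"not an assumed-role ARN: {arn!r}")
    return {"account": parts[4], "role": resource[1], "session": resource[2]}


# Hypothetical output of `aws sts get-caller-identity` inside the workflow.
sample_output = json.dumps({
    "UserId": "AROAEXAMPLEROLEID:GitHubActions-42",
    "Account": "123456789012",
    "Arn": "arn:aws:sts::123456789012:assumed-role/github-actions-deploy-role/GitHubActions-42",
})

identity = parse_assumed_role_arn(json.loads(sample_output)["Arn"])

# Fail fast if the job is running as anything other than the expected role.
if identity["role"] != "github-actions-deploy-role":
    raise SystemExit(f"unexpected role: {identity['role']}")
print(identity)
```

In a real pipeline you would feed the actual command output into the parser instead of the sample string. The session name in the ARN is the `role-session-name` from the workflow, which is what makes individual runs traceable later.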
Security Wins:
- ✅ Zero credentials stored in GitHub
- ✅ Fresh, short-lived credentials issued on every workflow run
- ✅ Scoped access to specific repositories and branches
- ✅ Complete audit trail in CloudTrail
- ✅ No more key rotation headaches
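The "scoped access" point depends entirely on how tightly the role's trust policy matches the token's `sub` claim. The sketch below approximates IAM's `StringLike` wildcard matching in plain Python so you can test candidate patterns before deploying them. It assumes a trust policy that conditions on the GitHub `sub` claim; the `string_like` helper, the organization name, and the repository name are hypothetical, and the real evaluation always happens inside IAM.

```python
import re


def string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' matches any run of characters, '?' exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


# Hypothetical sub claims in the format GitHub uses for branch pushes and pull requests.
main_branch = "repo:my-org/my-repo:ref:refs/heads/main"
pull_request = "repo:my-org/my-repo:pull_request"
lookalike_org = "repo:my-org-evil/other-repo:ref:refs/heads/main"

# Tight pattern: only the main branch of one repository.
print(string_like("repo:my-org/my-repo:ref:refs/heads/main", main_branch))
print(string_like("repo:my-org/my-repo:ref:refs/heads/main", pull_request))

# Loose pattern: the missing "/" lets a lookalike organization through.
print(string_like("repo:my-org*", lookalike_org))
print(string_like("repo:my-org/*", lookalike_org))
```

The last two checks show why the trailing delimiter matters: `repo:my-org*` also matches any organization whose name merely starts with `my-org`, while `repo:my-org/*` does not.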
Week 4: Establish cross-account role assumption patterns for multi-account deployments.
Ongoing: Monitor STS activity through CloudTrail and continuously reduce your permanent credential footprint.
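As a starting point for that ongoing monitoring, here is a rough Python sketch that counts role assumptions per role from CloudTrail records. It assumes you have already loaded the `Records` array from a CloudTrail log file into memory. The sample records are made up for illustration, and `summarize_assume_role_events` is a hypothetical helper; the field names (`eventSource`, `eventName`, `requestParameters.roleArn`, `errorCode`) follow the CloudTrail record format.

```python
from collections import Counter

# STS API calls that hand out role credentials.
STS_ASSUME_EVENTS = {"AssumeRole", "AssumeRoleWithWebIdentity", "AssumeRoleWithSAML"}


def summarize_assume_role_events(records):
    """Count role assumptions per (event name, role ARN, outcome)."""
    counts = Counter()
    for record in records:
        if record.get("eventSource") != "sts.amazonaws.com":
            continue
        if record.get("eventName") not in STS_ASSUME_EVENTS:
            continue
        params = record.get("requestParameters") or {}
        # CloudTrail sets errorCode only when the call failed.
        outcome = "denied" if record.get("errorCode") else "ok"
        counts[(record["eventName"], params.get("roleArn", "unknown"), outcome)] += 1
    return counts


# Made-up sample records, trimmed to the fields used above.
ROLE = "arn:aws:iam::123456789012:role/github-actions-deploy-role"
sample_records = [
    {"eventSource": "sts.amazonaws.com", "eventName": "AssumeRoleWithWebIdentity",
     "requestParameters": {"roleArn": ROLE, "roleSessionName": "GitHubActions-41"}},
    {"eventSource": "sts.amazonaws.com", "eventName": "AssumeRoleWithWebIdentity",
     "requestParameters": {"roleArn": ROLE, "roleSessionName": "GitHubActions-42"}},
    {"eventSource": "sts.amazonaws.com", "eventName": "AssumeRoleWithWebIdentity",
     "errorCode": "AccessDenied", "requestParameters": {"roleArn": ROLE}},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "requestParameters": {"bucketName": "my-deployment-bucket"}},
]

summary = summarize_assume_role_events(sample_records)
for key, count in sorted(summary.items()):
    print(key, count)
```

A sudden rise in the "denied" bucket for a deployment role is a reasonable first signal to alert on, since it usually means either a misconfigured workflow or someone probing the trust policy.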
About the Author: Srikanth Ch is a Senior DevOps Engineer with over a decade of experience architecting and securing cloud infrastructure at scale. He founded thedevopstooling.com to share practical DevOps knowledge and help engineers advance their cloud careers. Connect with him on LinkedIn or follow thedevopstooling.com for weekly DevOps tutorials.
