AWS Step Functions



AWS Step Functions is an AWS service that allows you to design event-driven, serverless applications. Its workflows are modeled as finite state machines and integrate with many AWS services. Let's explore what exactly AWS Step Functions are and the different aspects related to them.

AWS Step Functions

AWS Step Functions is a low-code, serverless, orchestration service that enables developers to create event-driven workflows in AWS using AWS Lambda and other AWS services.

That's a lot of keywords! Let's break down each of these:

Low-code enables developers to write simple "code" which is later compiled into a complete program. It abstracts the complex structures and intricacies involved in a programming language in favor of readability and faster delivery.

Serverless computing is a cloud-native model that lets developers build applications without needing to maintain servers that run 24/7. Instead, the applications are spun up on ephemeral compute environments which execute the code and then shut down once the work is done.

An orchestration service is a platform that allows you to connect multiple services or applications. In the case of AWS Step Functions, the services can be AWS Lambda Functions or other AWS services like Amazon SNS and AWS Batch.

Event-driven workflows are designed to trigger based on events. Event-driven systems are required when the input to a service or application depends on the outcome of the previous one.

AWS Lambda is AWS's serverless computing platform which allows you to define functions in a variety of languages like Node.js, Python, and Go.

AWS Step Functions is an excellent solution when you need to design an event-driven application and take advantage of serverless computing at the same time.

Let's see how you can improve the scalability and availability of an example application using microservice architecture and AWS Step Functions.

For this example, let's assume you run an online shopping website. This online shopping website has a checkout option that calculates the taxes and other charges and outputs a total. Your website also offers an automatic discount if the total crosses Rs. 1000.

In the traditional monolithic design, all of this functionality is packaged into one service and deployed. When traffic to your online shopping website increases, you need to scale up the entire service accordingly. Also, if a bug causes an issue in your "discount" logic, you would need to either disable the entire checkout functionality or redeploy the entire service with the fix.


In the event-driven microservice architecture, by contrast, each service is packaged separately and deployed independently. These services are connected using AWS Step Functions and the Amazon States Language. When traffic to your online shopping website increases, the "checkout" and "discount" Lambda functions can scale independently and automatically. If your "discount" service has an issue, only that service needs to be disabled until the fix is available, without impacting other services like "checkout".


How Do AWS Step Functions Work?

Let's take a look at the different components in AWS Step Functions:

State Machine

The state machine is a well-known concept from the theory of computation and compiler design. A state machine is an abstract model of a machine that moves between "states" in response to inputs. A finite state machine is a state machine that has only a finite number of states.
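To make the idea concrete, here is a minimal sketch of a finite state machine in Python. The state names and events (an order moving through checkout) are invented for illustration and are not part of any AWS API.

```python
# Transition table: (current_state, event) -> next_state.
# All state and event names here are hypothetical.
TRANSITIONS = {
    ("CartReady", "checkout"): "TotalCalculated",
    ("TotalCalculated", "discount_applies"): "DiscountApplied",
    ("TotalCalculated", "no_discount"): "Done",
    ("DiscountApplied", "display"): "Done",
}


def run(start, events):
    """Feed events to the machine one at a time and return the visited states."""
    state = start
    visited = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state!r} on {event!r}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


if __name__ == "__main__":
    print(run("CartReady", ["checkout", "no_discount"]))
```

The table has a fixed, finite set of states, and the machine is only ever in one of them; an event that has no entry for the current state is rejected rather than silently ignored.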

AWS Step Functions are built around the concept of state machines. AWS Step Functions use the term state machine to refer to an application workflow. Let's understand the meaning of states and transitions with AWS Step Functions in mind.


States

States are the individual steps in a workflow. States in AWS Step Functions can make decisions based on their input, perform actions, and send output to other states.

Two types of tasks can be executed as part of a state:

1. Service Task: Service tasks allow you to integrate the various AWS services into your workflow. They can easily be used for automated steps such as invoking a Lambda function.

2. Activity Task: Activity tasks allow you to connect your workflow to a task that is performed by a worker hosted on Amazon EC2, Amazon ECS, or even your own hardware. This is useful when you want to use custom-made services which are difficult to migrate to the cloud.


Transitions

A transition occurs when one state has finished its execution and the workflow is ready to move to the next step. AWS Step Functions is responsible for transitioning the workflow from one state to the next.

State definitions and transition logic are written in a custom, JSON-based, structured language called the Amazon States Language. We will take a closer look at the Amazon States Language later in this article.

Now that we understand states and transitions, we can look at an example state and then at workflows.

Example State

Let's take an example state which is defined in the Amazon States Language:
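A state matching the field descriptions that follow might look like the sketch below. It is written as a Python dictionary and serialized to Amazon States Language JSON; the Lambda ARN (including REGION, ACCOUNT_ID, and the function name) is a placeholder, and the comment text is an assumption.

```python
import json

# Placeholder ARN: REGION, ACCOUNT_ID, and the function name are not real values.
LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:HelloWorld"

hello_world_state = {
    "HelloWorld": {
        "Comment": "Invoke a Lambda function, then move on",  # optional; wording assumed
        "Type": "Task",
        "Resource": LAMBDA_ARN,
        "Next": "AfterHelloWorldState",
    }
}

# Print the JSON form that would appear inside a state machine's "States" block.
print(json.dumps(hello_world_state, indent=2))
```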

In the above example,

  • HelloWorld is the name of the state.
  • Type represents the type of state. In this case, we have chosen the type as Task.
  • Resource indicates which resource needs to be used as part of the task. In this case, we have chosen to execute a Lambda function.
  • Next is the next state to be executed after the current task is completed. In this case, we have chosen to go to the AfterHelloWorldState state.
  • Comment is an optional field in case you want to add comments in your state definition.
  • Task is only one of several state types. Others that are not shown in this example include Choice, Fail, Wait, and Parallel.

Standard and Express Workflows

A workflow describes the different states in an AWS Step Function and contains the logic of the transitions. There are two types of workflows:

Standard Workflows: Standard workflows can be used when you want to create long-running and non-idempotent tasks. Examples include starting an EMR Cluster or processing payments.

Express Workflows: Express workflows can be used when you want to create short, frequent, and idempotent tasks. Examples include data ingestion and mobile application backends.

One of the main differences between Standard and Express Workflows is the way AWS bills the execution of these workflows. The table below summarizes the differences:

| Characteristic | Standard Workflows | Express Workflows |
| --- | --- | --- |
| Max Duration | 1 year | 5 minutes |
| Execution Start Rate | Over 2,000 per second | Over 100,000 per second |
| State Transition Rate | Over 4,000 per second per account | Nearly unlimited |
| Execution History | Can be viewed using the AWS Console, CloudWatch, or the API | Can be viewed using CloudWatch |
| Execution Strategy | Exactly once (tasks and states are never executed more than once) | At least once (tasks and states can be executed more than once) |
| Pricing | Per state transition | Number of executions, their duration, and memory consumption |

Idempotent: An idempotent task is one that can be run multiple times without changing the result. For example, updating the same key in a database with the same value can be considered an idempotent task. But programmatically creating a database is not an idempotent task, as repeated runs of the program might create multiple copies of the database.
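Because an at-least-once execution strategy may run a step more than once, the distinction matters in practice. The following Python sketch contrasts the two kinds of operation; the function names and the in-memory "store" are hypothetical stand-ins for a real database.

```python
def set_price(store, key, value):
    """Idempotent: repeating the call leaves the store in the same state."""
    store[key] = value
    return store


def add_order(orders, order):
    """Not idempotent: every repeated call appends another copy."""
    orders.append(order)
    return orders


# Simulate the same step being delivered three times.
store = {}
for _ in range(3):
    set_price(store, "sku-1", 250)

orders = []
for _ in range(3):
    add_order(orders, {"sku": "sku-1"})

print(store)        # a single entry, however many times the update ran
print(len(orders))  # grows with every repeated run
```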

Use Cases For AWS Step Functions

AWS Step Functions have many use cases. Let's explore three different use cases:

Microservice Orchestration

An application that uses a microservices architecture consists of loosely coupled services that are designed, developed, and deployed independently. This allows the application to be easily scalable and boast increased resiliency.

AWS Step Functions, with their event-driven design, allow for easy microservice orchestration and management. Either Standard or Express Workflows can be used, depending on the application's objective. AWS Lambda functions can be used to implement the individual microservices.

Data Processing

You can use AWS Step Functions to deal with the processing of large volumes of data, especially in cases where there is continuous data ingestion. AWS Step Functions can dynamically provision resources and ensure high availability, resulting in efficient data processing.

Some examples include image processing, coordinating ETL (Extract, Transform, and Load) jobs, and batch processing. AWS Step Functions integrate with other data processing services provided by AWS like Amazon EMR, AWS Glue, and Amazon Athena. Machine learning workflows can also be orchestrated with the help of AWS Step Functions and Amazon SageMaker.

IT and Security Automation

AWS Step Functions also provide a smart way to automate repetitive and time-consuming tasks. You can create workflows that automatically retry failed tasks to manage errors in your workflow. Step Functions can pause workflows at a specific step and resume the workflow at a later point, which allows for human intervention as and when required.

Some examples of automation can be automating the deployment of AWS CloudFormation Stacks, patching instances, and auto-remediation of security incidents.
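Automatic retries are declared on a Task state through its Retry field. The sketch below builds such a state as a Python dictionary and works out the nominal wait before each retry; the PatchInstance function name and the specific numbers are assumptions, and the calculation ignores any jitter or maximum-delay settings.

```python
import json

# Retry up to 3 times on task failure, doubling the wait each time.
retry_policy = {
    "ErrorEquals": ["States.TaskFailed"],
    "IntervalSeconds": 2,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}

# Hypothetical task; the ARN is a placeholder.
patch_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:PatchInstance",
    "Retry": [retry_policy],
    "End": True,
}


def retry_delays(policy):
    """Nominal seconds waited before each retry: interval * backoff ** attempt."""
    return [
        policy["IntervalSeconds"] * policy["BackoffRate"] ** attempt
        for attempt in range(policy["MaxAttempts"])
    ]


print(json.dumps(patch_state, indent=2))
print(retry_delays(retry_policy))
```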

How to Create a Serverless Workflow with AWS Step Functions and AWS Lambda?

Let's walk through how to create a serverless workflow with AWS Step Functions and AWS Lambda, using the online shopping website mentioned earlier in the article. To recap, this website has two services:

  1. Checkout: Calculates the total amount by adding taxes and other charges to the checkout amount.

  2. Discount: Applies a discount if the total amount calculated from the checkout service is greater than Rs. 1000.

This is an illustrative example to show you how AWS Step Functions work, and hence dummy business logic is used. This does not represent the true functioning of an online shopping website.

The screenshots are taken using the New AWS Console. If you are using the Old AWS Console, you can still follow the same steps.


Prerequisites

  • AWS Account
  • Basic knowledge of AWS Lambda

Create the AWS Lambda Functions

We will be creating three AWS Lambda functions for this example.

  1. Log in to your AWS Account.
  2. Open the AWS Console. Search for "Lambda" in the Search Bar. Select Lambda.
  3. Click on the Create function button.
  4. In the Create function page:
    • Keep "Author from scratch" selected.
    • Give the "Function name" as "CheckOut".
    • Select the "Runtime" as "Node.js 16.x".
    • Select the "Architecture" as "x86_64".
    • Don't change any other settings.
    • Click the Create function button.
  5. Wait for a few seconds till the AWS Lambda function is created. After the function is created you will be automatically redirected to the CheckOut AWS Lambda Function page.
  6. Replace the code in the "Code source" section with the below code:
  7. Click the Deploy button. The CheckOut AWS Lambda function is now ready.

  8. Now repeat the above steps to create two more Lambda functions with the "Function name" and "Code source" as given below:

    a. Function name: Discount

    Code source:

    b. Function name: DisplayTotal

    Code source:

  9. After all three AWS Lambda functions are created, you can now create the AWS Step Function.

Create the AWS Step Function

  1. Log in to your AWS Account.
  2. Open the AWS Console. Search for "Step Functions" in the Search Bar. Select Step Functions.
  3. Click on the Create state machine button.

  4. Select the "Write your workflow in code" option.

  5. Select the "Type" as "Standard".

  6. In the "Definition" editor, replace the example snippet with the below definition:

Make sure you replace the REGION and ACCOUNT_ID with the AWS Region and your AWS Account Number, respectively.
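A definition with the shape this walkthrough describes could be generated as in the sketch below, written in Python and printed as Amazon States Language JSON. The state names ApplyDiscount and DisplayTotal and the three Lambda function names come from this walkthrough; the state names CalculateTotal and IsDiscountEligible, the `total` field checked by the Choice state, and the comment text are assumptions.

```python
import json


def lambda_arn(function_name, region="REGION", account_id="ACCOUNT_ID"):
    """Build a Lambda ARN; the defaults are the placeholders named in the text."""
    return f"arn:aws:lambda:{region}:{account_id}:function:{function_name}"


definition = {
    "Comment": "Sketch of the online shop checkout workflow",
    "StartAt": "CalculateTotal",
    "States": {
        # Adds taxes and charges to the input amount.
        "CalculateTotal": {
            "Type": "Task",
            "Resource": lambda_arn("CheckOut"),
            "Next": "IsDiscountEligible",
        },
        # Branch on the total; "$.total" is an assumed output field.
        "IsDiscountEligible": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.total",
                    "NumericGreaterThan": 1000,
                    "Next": "ApplyDiscount",
                }
            ],
            "Default": "DisplayTotal",
        },
        "ApplyDiscount": {
            "Type": "Task",
            "Resource": lambda_arn("Discount"),
            "Next": "DisplayTotal",
        },
        "DisplayTotal": {
            "Type": "Task",
            "Resource": lambda_arn("DisplayTotal"),
            "End": True,
        },
    },
}

print(json.dumps(definition, indent=2))
```

Calling `lambda_arn("CheckOut", region=..., account_id=...)` with real values is one way to avoid editing the placeholders by hand.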

  7. After you have added the correct REGION and ACCOUNT_ID, the visual graph should update to show the workflow. In case the graph has not been refreshed, you can click the "refresh" option on the top left.

  8. Click the Next button at the bottom left of the page.

  9. In the "Specify details" page:

    • Give the "State machine name" as "OnlineShopCheckout".
    • For "Execution role", select "Create new role".
    • Don't change any other settings.
    • Click the Create state machine button at the end of the page.
  10. After a few seconds your AWS Step Function OnlineShopCheckout will be available for use.

Test the AWS Step Function

Let's test the AWS Step Function with two different values - Rs. 1000 and Rs. 500, and see how the AWS Step Function handles each of these cases.

  1. Log in to your AWS Account.
  2. Open the AWS Console. Search for "Step Functions" in the Search Bar. Select Step Functions.
  3. Select OnlineShopCheckout.
  4. Click the Start execution button.

  5. In the "Start execution" page, set the "Input" to {"amount": 1000} and click the Start execution button.

  6. Once the execution is completed, click the "DisplayTotal" block and go to the "Step output" section. This is where you will find the total amount.

In this case, since the total amount after taxes and charges was greater than 1000, the "ApplyDiscount" state was also called (indicated by the green shading).

  7. Now repeat the above steps but replace the "amount" value with 500.
  8. Once the execution is completed, click the "DisplayTotal" block and go to the "Step output" section. This is where you will find the total amount. In this case, since the total amount after taxes and charges was less than 1000, the "ApplyDiscount" state was not called (no shading).

In this example, we learned how to use AWS Lambda functions to create an example AWS Step Function. You can create complicated, event-driven workflows with hundreds of states and integrate AWS services easily into your workflows.
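The branching behavior seen in the two test runs can also be reproduced locally. The sketch below is plain Python rather than the Node.js code deployed to Lambda, and the 18% tax rate, the 10% discount, and the `total` field name are assumptions chosen only so that an input of 1000 crosses the threshold and 500 does not.

```python
TAX_RATE = 0.18            # assumed tax rate, not taken from the article
DISCOUNT_RATE = 0.10       # assumed discount rate, not taken from the article
DISCOUNT_THRESHOLD = 1000  # the Rs. 1000 threshold from the example


def checkout(event):
    """Stand-in for the CheckOut function: add taxes to the amount."""
    total = round(event["amount"] * (1 + TAX_RATE), 2)
    return {"amount": event["amount"], "total": total}


def discount(event):
    """Stand-in for the Discount function: reduce the total."""
    return {**event, "total": round(event["total"] * (1 - DISCOUNT_RATE), 2)}


def display_total(event):
    """Stand-in for the DisplayTotal function: format the final total."""
    return {"message": f"Total amount: Rs. {event['total']:.2f}"}


def run_workflow(event):
    """Run the steps in order, recording which ones were visited."""
    visited = ["CheckOut"]
    state = checkout(event)
    if state["total"] > DISCOUNT_THRESHOLD:  # plays the role of the Choice state
        state = discount(state)
        visited.append("ApplyDiscount")
    visited.append("DisplayTotal")
    return visited, display_total(state)


if __name__ == "__main__":
    for amount in (1000, 500):
        print(run_workflow({"amount": amount}))
```

With these assumed rates, only the run with an amount of 1000 passes through the discount step, mirroring the shaded and unshaded ApplyDiscount block in the console.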

When to Use Step Functions?

AWS Step Functions manage your application's components and logic, allowing you to write less code and focus on developing your application rapidly. Here are a few general workflow patterns where AWS Step Functions can be used:


1. Function Orchestration: Workflows that involve a sequential order of tasks that need to be performed one after the other.

2. Branching: Workflows that involve choosing the next task based on the output of the previous task. There can be two or more branches.

3. Error Handling: Workflows can involve automatic retry mechanisms and error handling.

4. Parallel Processing: Workflows that can take advantage of parallel processing by running multiple tasks at the same time.

5. Pause and Resume: Workflows that pause for another task to complete, and then automatically resume the next task in the sequence.
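Two of these patterns map directly onto state types. As a rough sketch, the Python below builds a Parallel state with two branches and a Wait state that pauses for a fixed delay; the function names (ResizeImage, ExtractMetadata), the state names, and the 30-second delay are all hypothetical.

```python
import json


def task(function_name, **extra):
    """Build a Task state that invokes a Lambda function (placeholder ARN)."""
    state = {
        "Type": "Task",
        "Resource": f"arn:aws:lambda:REGION:ACCOUNT_ID:function:{function_name}",
    }
    state.update(extra)
    return state


# Parallel processing: both branches run at the same time.
parallel_state = {
    "Type": "Parallel",
    "Branches": [
        {"StartAt": "ResizeImage",
         "States": {"ResizeImage": task("ResizeImage", End=True)}},
        {"StartAt": "ExtractMetadata",
         "States": {"ExtractMetadata": task("ExtractMetadata", End=True)}},
    ],
    "Next": "WaitBeforePublish",
}

# A timed pause: the workflow resumes at "Publish" after 30 seconds.
wait_state = {"Type": "Wait", "Seconds": 30, "Next": "Publish"}

print(json.dumps({"ProcessImage": parallel_state,
                  "WaitBeforePublish": wait_state}, indent=2))
```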

Integrations & Development Tools

AWS Step Functions allow for integration with multiple AWS Services. Here are some examples of the integrations:

  1. Invoke Lambda Functions
  2. Publish a message to an SNS Topic
  3. Send messages to an SQS Queue
  4. Run a task on ECS
  5. Trigger an EMR job
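For service integrations, the Task state's Resource names the service action rather than a Lambda function. As an illustration, the sketch below builds a state that publishes to an SNS topic; the state name, the topic name, and the `message` input field are assumptions.

```python
import json

publish_state = {
    "NotifyCustomer": {
        "Type": "Task",
        # Service integration for publishing to Amazon SNS.
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {
            # Placeholder topic ARN; replace REGION and ACCOUNT_ID.
            "TopicArn": "arn:aws:sns:REGION:ACCOUNT_ID:order-updates",
            # The ".$" suffix takes the value from the state's input at that path.
            "Message.$": "$.message",
        },
        "End": True,
    }
}

print(json.dumps(publish_state, indent=2))
```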

AWS also provides certain tools to enable developers to work with Step Functions more easily and faster. These are:

1. AWS CDK: The AWS Cloud Development Kit (CDK) provides the libraries and modules needed to develop AWS Step Functions. It allows you to define tasks programmatically and integrates with the Amazon States Language.

2. AWS Toolkit for VS Code: AWS Toolkit is an extension for the VS Code editor. Using this extension you can easily create state machines (workflows), execute state machines and visualize state machines as a graph.

Advantages and Disadvantages of AWS Step Functions

AWS Step Functions offer many advantages such as:

  • Easy microservice orchestration for event-driven applications
  • In-built retry mechanism and smart error handling
  • Integration with a wide variety of AWS Services

At the same time, they have a few disadvantages:

  • Restricted to a serverless programming paradigm
  • Might not be cost-effective if you are dealing with high-traffic services which need to run for long periods
  • Vendor lock-in with AWS, as the Amazon States Language cannot be used with other cloud providers

Monitoring AWS Step Functions

AWS provides several tools to monitor AWS Step Functions:

1. Amazon CloudWatch: AWS stores metrics such as the number of state machines started, the execution time of activities, and how many Lambda functions failed. You can find a detailed description of the different types of metrics available in the next section.

2. X-Ray: AWS X-Ray can be used to visualize the components of your state machine, understand performance bottlenecks, and debug requests that resulted in an error. X-Ray works with other AWS Services as well, so you will be able to analyze your service integrations as well.


AWS Step Functions Metrics

AWS collects extensive metrics on AWS Step Function executions. These metrics are stored in CloudWatch under the AWS/States namespace. Let's take a look at the categories of metrics:

Execution Metrics

Metrics for how many times a state machine started, succeeded, and failed. Also, the time it took for the last execution of the state machine.

Example Metrics: ExecutionTime, ExecutionsSucceeded, ExecutionsTimedOut

Activity Metrics

Metrics for how many activity tasks started, succeeded, and failed. Also, the time it took for each of the activities to complete.

Example Metrics: ActivityRunTime, ActivitiesSucceeded, ActivitiesTimedOut

Lambda Function Metrics

Metrics for how many Lambda functions started, succeeded, and failed. Also, the execution time of each Lambda function run.

Example Metrics: LambdaFunctionRunTime, LambdaFunctionsSucceeded, LambdaFunctionsTimedOut

Service Integration Metrics

Metrics for how many service tasks started, succeeded, and failed. Also, the time it took for each of the service tasks to complete.

Example Metrics: ServiceIntegrationRunTime, ServiceIntegrationsSucceeded, ServiceIntegrationsTimedOut

API Metrics

These metrics collect information about the requests to the Step Functions API.

Example Metrics: ThrottledEvents, ProvisionedBucketSize, ProvisionedRefillRate, ConsumedCapacity
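These metrics can be read programmatically from CloudWatch. The sketch below only builds the query parameters for failed executions of one state machine; it assumes they would be passed to boto3's CloudWatch `get_metric_statistics` call, and the state machine ARN, the one-hour window, and the five-minute period are placeholders.

```python
from datetime import datetime, timedelta, timezone


def failed_executions_query(state_machine_arn, hours=1, now=None):
    """Build keyword arguments for a CloudWatch metric statistics request."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/States",
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,          # seconds per data point
        "Statistics": ["Sum"],  # total failures in each period
    }


if __name__ == "__main__":
    # Placeholder ARN; replace REGION and ACCOUNT_ID before use.
    arn = "arn:aws:states:REGION:ACCOUNT_ID:stateMachine:OnlineShopCheckout"
    query = failed_executions_query(arn)
    print(query)
    # With boto3 installed and credentials configured, the request would be:
    #   boto3.client("cloudwatch").get_metric_statistics(**query)
```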


Conclusion

  • In this article, you learned what AWS Step Functions are and how they work.
  • We looked at the two different types of workflows - Standard Workflows and Express Workflows.
  • Then we understood the different use cases - microservice orchestration, data processing, and IT automation.
  • We took a look at how we can create an example AWS Step Function with the help of AWS Lambda functions.
  • We learned about the different developer tools offered by AWS - AWS CDK and AWS Toolkit for VS Code.
  • We went through a few advantages and disadvantages of using AWS Step Functions.
  • Finally, we understood how to monitor AWS Step Functions with Amazon CloudWatch and AWS X-Ray.