
AWS Step Functions Monitoring

AWS Step Functions lets you coordinate work across distributed components by expressing workflows as state machines and tasks. With Site24x7's AWS integration, you can monitor and alert on metrics such as Execution Time to understand the behavior of your state machines.

Setup and configuration

  • If you haven't done so already, enable Site24x7 access to your AWS resources by creating a cross-account IAM role between your AWS account and Site24x7's AWS account. Alternatively, create an IAM user for Site24x7 and generate security credentials. Learn more.
  • On the Integrate AWS Account page, select Step Functions in the Services to be discovered section. Learn more.

Policies and Permissions

Assign the AWS managed policy ReadOnlyAccess to the Site24x7 entity (IAM role or IAM user) to help Site24x7 access and collect information about your state machines. If you're assigning a custom policy, please make sure the following read-level actions are present in the policy JSON. Learn more.

  • "states:ListStateMachines",
  • "states:DescribeStateMachine",
  • "states:ListActivities",
  • "states:DescribeExecution",
  • "states:ListExecutions",
  • "states:GetExecutionHistory",
  • "states:ListTagsForResource"
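
As an illustration, the actions listed above can be assembled into a custom policy document. The sketch below is a minimal example, assuming a single Allow statement scoped to all resources ("Resource": "*"). The statement ID Site24x7StepFunctionsReadOnly and the helper name build_policy are hypothetical, and you may prefer a narrower resource scope in your own account.

```python
import json

# Read-level actions listed in this document.
STEP_FUNCTIONS_ACTIONS = [
    "states:ListStateMachines",
    "states:DescribeStateMachine",
    "states:ListActivities",
    "states:DescribeExecution",
    "states:ListExecutions",
    "states:GetExecutionHistory",
    "states:ListTagsForResource",
]


def build_policy(actions=STEP_FUNCTIONS_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Site24x7StepFunctionsReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: not restricted to specific ARNs
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy JSON so it can be pasted into the IAM policy editor.
    print(json.dumps(build_policy(), indent=2))
```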

Polling frequency

Site24x7 collects metric data points about Step Functions executions at the configured poll frequency (from 1 minute to 1 day). Learn more.

IT Automations

You can add automations for the AWS services supported by Site24x7. Log in to Site24x7 and go to Admin > IT Automation Templates (+) > Add Automation Templates. Once automations are added, you can schedule them to be executed one after the other.

You can now start a state machine execution using AWS Step Functions automations.
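
For context, starting an execution corresponds to the StartExecution operation of the AWS Step Functions API. The following is a minimal sketch of that call, not a description of how Site24x7 implements its automation. It assumes a client object that behaves like boto3.client("stepfunctions"); the helper name start_execution and the payload are hypothetical.

```python
import json


def start_execution(sfn_client, state_machine_arn, payload):
    """Start one state machine execution and return its execution ARN.

    sfn_client is assumed to behave like boto3.client("stepfunctions").
    """
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),  # the execution input is passed as a JSON string
    )
    return response["executionArn"]
```

With boto3 installed and credentials that allow states:StartExecution, a call might look like start_execution(boto3.client("stepfunctions"), your_state_machine_arn, {"orderId": 42}), where the ARN and input are placeholders for your own values.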


Each step function is considered a basic monitor. Learn more.

Supported metrics

Attribute | Description | Data type | Statistics
Execution Time | Measures the interval between the time the execution starts and the time it ends. | Seconds | Average
Execution Throttled | Measures the number of times state entered events and retries have been throttled. | Count | Sum
Executions Aborted | Measures the number of aborted or terminated executions. | Count | Sum
Executions Failed | Measures the number of failed executions. | Count | Sum
Executions Started | Measures the number of started executions. | Count | Sum
Executions Succeeded | Measures the number of successfully completed executions. | Count | Sum
Execution Timed Out | Measures the number of executions that timed out for any reason. | Count | Sum
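
To make the Statistics column concrete, the sketch below shows how raw samples from one poll interval could be collapsed into a single data point: an average for Execution Time and a sum for the count metrics. This is an illustration only; the aggregate helper and the sample values are hypothetical and do not describe Site24x7's internal implementation.

```python
from statistics import mean

# Statistic applied to each metric, as listed in the table above.
METRIC_STATISTICS = {
    "Execution Time": "Average",
    "Execution Throttled": "Sum",
    "Executions Aborted": "Sum",
    "Executions Failed": "Sum",
    "Executions Started": "Sum",
    "Executions Succeeded": "Sum",
    "Execution Timed Out": "Sum",
}


def aggregate(metric, samples):
    """Collapse the raw samples of one poll interval into a single value."""
    if not samples:
        return None  # no data points were collected in this interval
    if METRIC_STATISTICS[metric] == "Average":
        return mean(samples)
    return sum(samples)
```

For example, aggregate("Execution Time", [2.0, 4.0]) averages the two durations, while aggregate("Executions Failed", [1, 0, 2]) adds up the failures reported in the interval.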

To view data

  • Sign in to the Site24x7 web console. On the left navigation pane, choose AWS and choose your monitored AWS account.
  • In the menu dropdown, choose Step Functions.
  • From the list of monitored state machines, choose the state machine for which you want to see metrics.

AWS Step Functions Monitoring Interface


Use the Summary tab to gain insight into your step function executions. By default, time series charts for all state machine metrics are displayed.

Work Flow Graph

A color-coded visual workflow of your state machine is displayed. You can hover over each state to view more information. For example, when you hover over a failed state, you can see which runtime error caused the failure, along with the service name of the called resource and the action performed on it.


The Amazon States Language (a JSON-based structured language) definition of the state machine is shown.


The state machine execution history is displayed in reverse chronological order. You can choose a specific execution to view the list of events that occurred in that execution, along with the timestamp, JSON input data, type, state details, and more.
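
The event list for an execution corresponds to the data returned by the GetExecutionHistory operation of the AWS Step Functions API, which is why the states:GetExecutionHistory permission is needed. The sketch below shows one way to fetch the most recent events first. It assumes a client that behaves like boto3.client("stepfunctions"); the helper name recent_events is hypothetical, and this is not a description of how Site24x7 retrieves the data.

```python
def recent_events(sfn_client, execution_arn, limit=10):
    """Return (timestamp, type) pairs for an execution, newest event first.

    sfn_client is assumed to behave like boto3.client("stepfunctions").
    """
    response = sfn_client.get_execution_history(
        executionArn=execution_arn,
        maxResults=limit,
        reverseOrder=True,  # list events in descending order of timestamp
    )
    return [(event["timestamp"], event["type"]) for event in response["events"]]
```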


The AWS resources referenced in your state machine activities (DynamoDB tables, SNS topics, Lambda functions, ECS, and SQS queues) are displayed along with their status. Note: A resource's status is shown only if the resource is monitored by Site24x7. You can also set thresholds and be notified when any of these services fail by clicking the pencil icon under Action.


Estimate future values of the following performance metrics and make informed decisions about adding capacity or scaling your AWS infrastructure.

  • Execution Time
  • Execution Throttled
  • Executions Failed
  • Execution Timed Out
