Terraforming Datadog Workflows

by Olakitan Aladesuyi Dec 18, 2025

The Datadog Terraform provider recently added the datadog_workflow_automation resource. This new resource lets teams provision and manage Datadog workflows with Terraform, bringing the ease and flexibility of infrastructure-as-code to observability automation.

At Runa, we have found workflows to be especially useful in operations management, alert optimization, and incident flow automation. However, managing these workflows solely via the Datadog UI quickly becomes difficult. Changes are hard to track, reviews are not centralized, and small edits can easily get lost. Combined with the challenge of maintaining workflows that serve multiple teams with different priorities, these gaps become painful and error-prone.

This is why I have spent the past few months migrating two of our existing operations management workflows to Terraform. In this post, I will take you through the key technical challenges, the solutions, and the pitfalls to avoid when adopting Terraform for Datadog workflows.

 

Key Technical Challenges and Solutions

Complex JSON Structure Management

The most significant challenge was managing the complex JSON structure required for the spec_json parameter. Datadog workflows have intricate step definitions with nested parameters, outbound edges, and display configurations.

Solution: Export the workflow from the Datadog console as Terraform code.

The Datadog console provides an “Export” option for existing workflows that outputs a complete Terraform representation of the workflow. This gives you the exact JSON format that Datadog expects, minimizing trial-and-error. From there, you can refactor the spec_json using jsonencode() and Terraform locals to make it more modular and maintainable.

Steps to export:

  1. Go to your workflow in the Datadog console.
  2. Click on the Export option in the menu bar.
  3. Select the Terraform option.
  4. Copy the generated spec_json content.
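
As a rough illustration of the refactoring step, the exported spec can be split into locals and reassembled with jsonencode(). This is only a sketch: the local names (check_monitor_step, workflow_spec) and the output are hypothetical, and the parameters list is left empty where the exported values would go.

```hcl
locals {
  # Hypothetical local holding one step copied from the exported spec_json.
  check_monitor_step = {
    name          = "CheckMonitorStatus"
    actionId      = "com.datadoghq.core.if"
    parameters    = [] # paste the exported parameters here
    outboundEdges = []
  }

  # Reassemble the spec from named pieces. The trigger reuses the step name,
  # so renaming the step only needs one change.
  workflow_spec = {
    triggers = [
      {
        startStepNames = [local.check_monitor_step.name]
        monitorTrigger = {}
      }
    ]
    steps = [local.check_monitor_step]
  }
}

# Preview the JSON that would be passed to spec_json.
output "spec_json_preview" {
  value = jsonencode(local.workflow_spec)
}
```

Running terraform plan with this output is a cheap way to compare the generated JSON against the original export before wiring it into the resource.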

Dynamic Step References and Dependencies

One of the trickiest aspects was handling dynamic references between steps that are conditionally included. For example, one workflow includes a call to another workflow only when a specific criterion is met, which means that the outbound edges (the “links” between steps) must reference steps that may or may not exist.

Solution: Use Terraform locals to define conditional lists of outbound edges. This approach lets you model branching logic without duplicating JSON structures.

locals {
  check_monitor_outbound_edges = var.enable_notifications ? [
    {
      "nextStepName" : "SendSlackNotification",
      "branchName" : "true"
    }
  ] : []
}

The step definition inside spec_json then references the local:

{
  "name" : "CheckMonitorStatus",
  "actionId" : "com.datadoghq.core.if",
  "parameters" : [ ... ],
  "outboundEdges" : local.check_monitor_outbound_edges
}
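
Because edges and steps are toggled separately, it can help to verify that every edge points at a step that actually exists. The sketch below is one possible guard, not part of the original setup: it uses trimmed-down steps, hypothetical local names, and a check block, which assumes Terraform 1.5 or later and reports a warning rather than blocking the run.

```hcl
variable "enable_notifications" {
  type    = bool
  default = false
}

locals {
  # Trimmed-down steps, keeping only the fields needed for the check.
  steps = concat(
    [
      {
        name = "CheckMonitorStatus"
        outboundEdges = var.enable_notifications ? [
          { nextStepName = "SendSlackNotification", branchName = "true" }
        ] : []
      }
    ],
    var.enable_notifications ? [{ name = "SendSlackNotification" }] : []
  )

  # Names of all steps that will be present in the workflow.
  step_names = [for s in local.steps : s.name]

  # Every step name referenced by an outbound edge.
  edge_targets = flatten([
    for s in local.steps : [for e in try(s.outboundEdges, []) : e.nextStepName]
  ])

  # Edge targets that do not match any defined step.
  dangling_edge_targets = [
    for t in local.edge_targets : t if !contains(local.step_names, t)
  ]
}

check "workflow_edges_resolve" {
  assert {
    condition     = length(local.dangling_edge_targets) == 0
    error_message = "Outbound edges reference undefined steps: ${join(", ", local.dangling_edge_targets)}"
  }
}
```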

AWS Services Integration and Connection Management

Both workflows integrate with multiple AWS services, which require authentication. In the Datadog console, AWS connections are easily managed by creating an appropriate connection for each environment. Terraform code, however, is environment agnostic, so you must supply the AWS connection that matches the environment being deployed.

Solution: Maintain a mapping of pre-existing environment-specific AWS connections:

locals {
  aws_connection_map = {
    "sandbox" = { connection_id : "12345", label : "SAMPLE_CONNECTION_1" }
    "dev"     = { connection_id : "67890", label : "SAMPLE_CONNECTION_2" }
  }
}
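
One way to pick the right entry at deploy time is to index the map with an environment variable. The sketch below assumes a var.env input restricted to the two sample environments; the variable declaration, its validation rule, and the derived local names are illustrative assumptions.

```hcl
variable "env" {
  type        = string
  description = "Target environment (declaration assumed for this sketch)"

  validation {
    condition     = contains(["sandbox", "dev"], var.env)
    error_message = "The env value must be one of: sandbox, dev."
  }
}

locals {
  aws_connection_map = {
    "sandbox" = { connection_id : "12345", label : "SAMPLE_CONNECTION_1" }
    "dev"     = { connection_id : "67890", label : "SAMPLE_CONNECTION_2" }
  }

  # Resolve the connection for the environment being deployed.
  aws_connection       = local.aws_connection_map[var.env]
  aws_connection_id    = local.aws_connection.connection_id
  aws_connection_label = local.aws_connection.label
}
```

The validation rule makes an unknown environment fail early with a readable message instead of an invalid-index error on the map lookup.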

Dynamic Workflow Naming

This is not a challenge as such, but it is worth mentioning: without parameterized names, every provisioned workflow ends up with the same name.

Solution: Use Terraform variables to parameterize workflow names.

resource "datadog_workflow_automation" "my_workflow" {
  name = "My-workflow-${var.team_name}-${var.env}"
  # other arguments (description, tags, published, spec_json) omitted for brevity
}

 

JavaScript Code in Workflow Steps

Many workflow steps include JavaScript code for data transformation. Managing this code within Terraform strings can quickly become messy and hard to read.

Best Practices:

  • Write and test your code directly in the Datadog workflow JavaScript step editor or any IDE.
  • Maintain proper indentation and commenting.
  • Export the working code as JSON from Datadog or your IDE once it has been validated.

{
  "name" : "ParseOutput",
  "actionId" : "com.datadoghq.datatransformation.func",
  "parameters" : [
    {
      "name" : "script",
      "value" : "// parse JSON string \nlet parsedFaults = $.Steps.DescribeOutput;\nlet trigger_monitor_url = $.Source.monitor.url;\n\n ... // rest of logic here"
    }
  ]
}
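
An alternative worth considering, sketched here under assumptions, is to keep the validated script in its own file and load it with Terraform's file() function, which reads the content as-is without template interpolation. The scripts/parse_output.js path and the local names are hypothetical; jsonencode() takes care of escaping newlines and quotes when the step is embedded in spec_json.

```hcl
locals {
  # Assumed layout: the validated script lives next to the Terraform code.
  parse_output_script = file("${path.module}/scripts/parse_output.js")

  # Step definition that embeds the script content as a plain string.
  parse_output_step = {
    "name" : "ParseOutput",
    "actionId" : "com.datadoghq.datatransformation.func",
    "parameters" : [
      {
        "name" : "script",
        "value" : local.parse_output_script
      }
    ]
  }
}
```

Keeping the script in a .js file preserves indentation and comments, and lets editors and linters treat it as ordinary JavaScript.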

 

Complete Example: Putting It All Together

Here’s a sample workflow that combines the patterns discussed above.

locals {
  # Create conditional notification steps
  notification_steps = var.enable_notifications ? [
    {
      "name" : "SendSlackNotification",
      "actionId" : "com.datadoghq.slack.send_simple_message",
      "parameters" : [
        {
          "name" : "teamId",
          "value" : "MYSAMPLETEAM"
        },
        {
          "name" : "channel",
          "value" : "#alerts"
        },
        {
          "name" : "text",
          "value" : "Alert triggered: "
        }
      ],
      "display" : {
        "bounds" : {
          "x" : 0,
          "y" : 432
        }
      }
    }
  ] : []

  # Create conditional outbound edges 
  # If notifications are enabled, route to notification step; otherwise, workflow ends
  check_monitor_outbound_edges = var.enable_notifications ? [
    {
      "nextStepName" : "SendSlackNotification",
      "branchName" : "true"
    }
  ] : []

  sample_workflow_name = "Sample-workflow-${var.team_name}"

  # AWS connections per environment, already created in the Datadog console
  aws_connection_map = {
    "sandbox" = { connection_id : "abcd", label : "MY_SAMPLE_AWS_CONNECTION_1" }
    "dev"     = { connection_id : "abcdefg", label : "MY_SAMPLE_AWS_CONNECTION_2" }
  }
  aws_connection_id    = local.aws_connection_map["sandbox"].connection_id
  aws_connection_label = local.aws_connection_map["sandbox"].label
}

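variable "enable_notifications" {
  type        = bool
  description = "Enable Slack notifications in the workflow"
  default     = false
}

# Declarations for the other variables referenced in this example
variable "team_name" {
  type        = string
  description = "Team name used in the workflow name"
}

variable "workflow_tags" {
  type        = list(string)
  description = "Tags applied to the workflow"
  default     = []
}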

resource "datadog_workflow_automation" "sample_workflow" {
  name        = local.sample_workflow_name
  description = "Sample workflow demonstrating Terraform patterns for Datadog workflow automation"
  tags        = var.workflow_tags
  published   = true

  spec_json = jsonencode(
    {
      "triggers" : [
        {
          "startStepNames" : [
            "CheckMonitorStatus"
          ],
          "monitorTrigger" : {}
        }
      ],
      "steps" : concat([
        {
          "name" : "CheckMonitorStatus",
          "actionId" : "com.datadoghq.core.if",
          "parameters" : [
            {
              "name" : "joinOperator",
              "value" : "or"
            },
            {
              "name" : "conditions",
              "value" : [
                {
                  "comparisonOperator" : "eq",
                  "leftValue" : "",
                  "rightValue" : "Alert"
                },
                {
                  "comparisonOperator" : "eq",
                  "leftValue" : "",
                  "rightValue" : "Warn"
                }
              ]
            }
          ],
          "outboundEdges" : local.check_monitor_outbound_edges,
          "display" : {
            "bounds" : {
              "x" : 0,
              "y" : 0
            }
          }
        }
      ], local.notification_steps),
      "handle" : local.sample_workflow_name,
      "connectionEnvs" : [
        {
          "env" : "default",
          "connections" : [
            {
              "connectionId" : local.aws_connection_id,
              "label" : local.aws_connection_label
            }
          ]
        }
      ],
      "inputSchema" : {
        "parameters" : [
          {
            "name" : "MonitorURL",
            "type" : "STRING",
            "defaultValue" : "sample_workflow"
          }
        ]
      }
    }
  )
}
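
To plan or apply the example, the configuration also needs the Datadog provider. The block below is a minimal sketch of that setup. The variable names datadog_api_key and datadog_app_key are assumptions, and no provider version is pinned here, so check which release introduced datadog_workflow_automation before adding a constraint.

```hcl
terraform {
  required_providers {
    datadog = {
      source = "DataDog/datadog"
    }
  }
}

# Hypothetical variable names for the credentials Terraform authenticates with.
variable "datadog_api_key" {
  type      = string
  sensitive = true
}

variable "datadog_app_key" {
  type      = string
  sensitive = true
}

provider "datadog" {
  api_key = var.datadog_api_key
  app_key = var.datadog_app_key
}
```

The provider can also read these credentials from the DD_API_KEY and DD_APP_KEY environment variables, which avoids passing them as Terraform variables.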

Critical Pitfalls to Avoid

1. Permissions & Ownership

Pitfall: Workflows created via Terraform are owned by Terraform, not by individual users. Standard-role users cannot publish or edit them directly. While this is not necessarily a problem, it could become a blocker in some situations.

Impact: Teams may be unable to manage workflows after they’re deployed.

Solution: Have an admin grant users permission to publish/unpublish workflows.

2. Step Function Limitations

Pitfall: Datadog workflows currently have limited capabilities with Express Step Functions.

Impact: Workflows may fail on steps that call Step Functions.

Solution: Prefer Standard Step Functions and document this limitation for your team.

3. Datastore Management

Pitfall: Datastores cannot be provisioned via Terraform, and workflows need explicit permissions to access them.

Impact: Workflow may fail due to missing datastore access.

Solution:

  • Create your datastore in the Datadog console and reference it by ID in Terraform.
  • Grant Terraform a Manager role in the datastore:
    1. Go to the Datastore page in Datadog.
    2. Click the Settings icon.
    3. Select Edit Permissions.
    4. Add Terraform with the Manager role.

4. AWS Connection Management

Pitfall: AWS connections are environment-specific and must be pre-created in Datadog.

Impact: Workflow will fail if connections are missing or incorrect.

Solution:

  • Create AWS connections for each environment in the Datadog console and reference them by ID in Terraform.
    1. Click Actions in the Datadog sidebar.
    2. Select Connections.
    3. Click the + New Connection button at the top of the connections page.
    4. Select AWS from the list of possible integrations.
    5. Complete the form, then copy the generated IAM policy statement (the statement is only shown when the connection is created).
    6. In the AWS console, create a new role and attach the generated policy statement to it.

Benefits Achieved

  1. Team Autonomy: Each team can own and modify their workflow variations independently.
  2. Environment Consistency: Identical workflows can be deployed across environments with environment-specific configurations.
  3. Infrastructure as Code: Workflows are now part of the infrastructure, enabling automated deployments and rollbacks.
  4. Documentation: The Terraform code serves as living documentation of the workflow logic.

Terraform support for Datadog workflows is a game-changer. It brings version control, consistency, and automation to what used to be a manual, UI-driven process. While the JSON structures can be complicated to set up, the long-term benefits outweigh the initial setup cost.

Further Reading

For more details on Datadog workflows, here are some resources: