AI Agents

Opsitron uses AI agents to handle the repetitive, error-prone parts of infrastructure management while keeping humans in control of critical decisions.

What AI Does

Planning

When a request is submitted, an AI agent:

Reads your infrastructure — examines the existing Terraform code, module versions, and cloud configuration
Understands the context — knows your naming conventions, SSM paths, networking setup, and DNS structure
Generates a plan — produces a structured implementation plan with steps, affected resources, estimated effort, and risks
Flags concerns — identifies potential issues like cost spikes, security implications, or missing prerequisites
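
The plan described above could be modeled roughly as follows. This is a minimal sketch for illustration only: the class names, field names, and sample values are hypothetical and are not Opsitron's actual plan schema.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    """One step of a hypothetical implementation plan."""
    description: str
    affected_resources: list[str] = field(default_factory=list)


@dataclass
class ImplementationPlan:
    """Hypothetical plan shape: steps, estimated effort, risks, flagged concerns."""
    summary: str
    steps: list[PlanStep]
    estimated_effort_hours: float
    risks: list[str] = field(default_factory=list)
    concerns: list[str] = field(default_factory=list)

    def affected_resources(self) -> list[str]:
        # Collect resources across all steps, de-duplicated, in first-seen order.
        seen: dict[str, None] = {}
        for step in self.steps:
            for resource in step.affected_resources:
                seen.setdefault(resource, None)
        return list(seen)

    def needs_attention(self) -> bool:
        # Anything flagged as a risk or concern deserves a closer look in review.
        return bool(self.risks or self.concerns)


# Illustrative values only; the resource addresses are made up for this example.
plan = ImplementationPlan(
    summary="Add an ECR repository for the billing service",
    steps=[
        PlanStep("Add ECR repository", ["aws_ecr_repository.billing"]),
        PlanStep(
            "Add lifecycle policy",
            ["aws_ecr_repository.billing", "aws_ecr_lifecycle_policy.billing"],
        ),
    ],
    estimated_effort_hours=0.5,
    concerns=["No image scanning configured"],
)
```

Keeping affected resources on each step, rather than only on the plan, lets a reviewer see which step touches which resource while still getting a de-duplicated overall list.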

Implementation

After a plan is approved, the AI agent:

Writes Terraform code — creates properly structured files following your repository conventions
Validates the code — runs tofu init, tofu validate, tofu fmt, and tofu plan
Creates a pull request — with a clear description of changes and plan output
Handles feedback — if the PR needs changes, the agent can revise based on review comments
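
The validation step can be pictured as running the four commands in order and stopping at the first failure. The sketch below assumes that stop-on-failure behavior and that tofu is on the PATH; the function names and the injectable runner are hypothetical, and Opsitron's real agent may add flags or behave differently.

```python
import subprocess
from typing import Callable

# The four commands named above, in the order listed.
VALIDATION_COMMANDS: list[list[str]] = [
    ["tofu", "init"],
    ["tofu", "validate"],
    ["tofu", "fmt"],
    ["tofu", "plan"],
]

# A runner takes (command, working directory) and returns (exit code, output).
Runner = Callable[[list[str], str], tuple[int, str]]


def run_command(cmd: list[str], cwd: str) -> tuple[int, str]:
    """Run one command and return its exit code with combined stdout/stderr."""
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


def validate(workdir: str, runner: Runner = run_command) -> list[tuple[str, int, str]]:
    """Run each validation command in order; stop after the first failure.

    Assumption: a non-zero exit code means the step failed and later steps
    are skipped. Returns (command, exit code, output) for every step that ran.
    """
    results: list[tuple[str, int, str]] = []
    for cmd in VALIDATION_COMMANDS:
        code, output = runner(cmd, workdir)
        results.append((" ".join(cmd), code, output))
        if code != 0:
            break
    return results
```

Calling validate("path/to/stack") in a directory containing Terraform code would run the real commands. Passing a fake runner makes the sequencing testable without tofu installed.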

Chatbot

The Opsitron chatbot (powered by the same AI) helps staff:

Plan requests — “I need to deploy a new container app” → suggests the right request type and checks prerequisites
Monitor requests — “What’s happening with request #19?” → shows current status, tasks, errors, and troubleshooting steps
Debug issues — reads GitHub Actions logs and CloudWatch logs to identify failures
Review infrastructure — analyzes Terraform configs against the AWS Well-Architected Framework

What AI Doesn’t Do

AI never applies changes directly to your AWS accounts. Every change goes through:

A pull request in your GitHub repository
Your CI/CD pipeline (GitHub Actions)
Terraform plan/apply with full output visibility

By default, staff review every plan before implementation begins. For routine operations (DNS zones, ECR repos), auto-approval rules can be configured, but even auto-approved changes create PRs and run through CI.
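
The guarantee here is that auto-approval only skips the staff review of the plan and never the pull request or CI. A minimal sketch of that decision, assuming rules are keyed by request type; the type names and the Decision shape are hypothetical and do not reflect Opsitron's actual rule format:

```python
from dataclasses import dataclass

# Hypothetical request-type identifiers for the routine operations named above.
AUTO_APPROVED_REQUEST_TYPES = frozenset({"dns_zone", "ecr_repository"})


@dataclass(frozen=True)
class Decision:
    needs_staff_review: bool
    opens_pull_request: bool
    runs_ci: bool


def decide(request_type: str) -> Decision:
    """Decide how a plan proceeds once it has been generated."""
    auto_approved = request_type in AUTO_APPROVED_REQUEST_TYPES
    return Decision(
        needs_staff_review=not auto_approved,
        # These two are unconditional: every change gets a PR and a CI run.
        opens_pull_request=True,
        runs_ci=True,
    )
```

Hard-coding the last two fields to True, rather than deriving them from the rule set, mirrors the point that no rule can bypass the PR or the pipeline.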

Platform Knowledge

The AI agents are equipped with deep knowledge of:

Platform modules — every variable, output, and best practice for each infrastructure module
AWS Well-Architected Framework — security, reliability, cost, and performance guidance
Your client’s infrastructure — via MCP tools that read configuration, DNS, shared services, and cloud accounts
Cost implications — specific cost data for Aurora, Fargate, NAT Gateways, and other common resources

This knowledge is continuously updated as the platform evolves. When a new module version is released or a convention changes, the AI automatically picks it up.

Skills

AI agents use skills — specialized instruction sets for specific tasks:

Skill	Purpose
Security Review	IAM least privilege, network security, encryption
Cost Estimation	Resource cost awareness with platform-specific pricing
Terraform Best Practices	File organization, naming, variables, patterns
AWS Well-Architected	Six pillars alignment with environment-specific guidance

Clients can also add custom skills in their config repository for organization-specific conventions, compliance requirements, or architectural patterns.
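
One way to picture how custom skills sit alongside the built-in ones is a simple name-to-instructions mapping. This is a sketch under stated assumptions: the function name, the sample custom skill, and the rule that a custom skill overrides a built-in of the same name are all hypothetical. The source does not say how name conflicts are resolved.

```python
# Built-in skills and their purposes, taken from the table above.
BUILT_IN_SKILLS: dict[str, str] = {
    "Security Review": "IAM least privilege, network security, encryption",
    "Cost Estimation": "Resource cost awareness with platform-specific pricing",
    "Terraform Best Practices": "File organization, naming, variables, patterns",
    "AWS Well-Architected": "Six pillars alignment with environment-specific guidance",
}


def resolve_skills(custom_skills: dict[str, str]) -> dict[str, str]:
    """Combine built-in skills with a client's custom skills.

    Assumption: on a name clash the custom skill takes precedence.
    """
    merged = dict(BUILT_IN_SKILLS)  # copy so the built-ins are never mutated
    merged.update(custom_skills)
    return merged


# Hypothetical custom skill a client might keep in its config repository.
skills = resolve_skills({"Tagging Policy": "Required cost-center and owner tags"})
```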