Building Terraback: A CLI for Reverse-Engineering Live Cloud Infrastructure into Terraform
Where the idea came from
I kept running into the same wall on consulting engagements.
A client would have years of cloud infrastructure built up organically — some of it through the console, some of it from early Terraform experiments that got abandoned, some of it from scripts a contractor wrote in 2019 and then left. The leadership team would eventually decide, usually right before an audit or a platform migration, that everything needed to be in Terraform. By next quarter.
Doing that by hand is grim. You end up with an engineer clicking through the AWS console, writing resource blocks by hand, running terraform import, watching terraform plan produce a 400-line diff, and spending the next three hours figuring out which attribute is wrong. Multiply that by a few thousand resources across three accounts. That's a team-quarter of work, minimum.
Existing tools helped with parts of it. Terraformer, Former2, the provider-specific exporters like aztfexport and cf-terraforming. Each one covered a slice of the problem, but every one I tried broke on the bits that actually mattered at scale — cross-resource dependencies, clean modular output, multi-cloud parity from a single CLI surface.
So I built the tool I wished existed. I called it Terraback.
The approach
The design constraints came out of the consulting work, not out of a whiteboard session.
Local-first, single binary. No SaaS middleman, no uploading an account's infrastructure map to someone else's servers. Run the CLI against your cloud credentials, get Terraform code out. For enterprise clients in banking and gaming, anything that involves shipping infrastructure data to a third party is a non-starter before the security review even opens.
One CLI, three providers. AWS, Azure, and GCP from the same binary with a consistent command structure. Most real environments are already multi-cloud, and bouncing between provider-specific tools that each have their own conventions is its own friction.
Dependency-aware scanning by default. A security group references a VPC; the VPC contains subnets; the subnets attach to route tables. Generated Terraform that hardcodes resource IDs instead of using proper references is technically valid and practically useless — you can't move it, you can't template it, and the first plan after a rebuild is a disaster. Walking the dependency graph before emitting any code was non-negotiable.
Modular output, not a flat dump. Nobody wants a ten-thousand-line main.tf. Resources should come out organised by service area — compute, networking, storage, identity — with a directory structure you'd actually commit to a repository.
terraform plan clean against the live environment. This is the real bar. Emitting HCL is easy. Emitting HCL that, after importing, produces zero diff against the thing it was generated from — that's the entire game. Anything less is a toy.
The implementation (where the real work was)
The headline architecture is unremarkable: a CLI in Go, a plugin-style layer for each provider, a dependency resolver, a template engine for output, a caching layer. The interesting work was in the details that never make it into a README.
Provider schemas are each their own universe
Every cloud provider has its own resource model, its own naming conventions, its own way of expressing required vs optional attributes, and its own collection of edge cases that only show up in real environments.
AWS has resources where one field's validity depends on another field's value. Azure has resources that expose different attributes depending on whether they were created via ARM templates, Bicep, or the portal. GCP has resources where the API returns fields that the Terraform provider doesn't accept on input. Translating each provider's live state into correct Terraform HCL — right argument names, right block structure, right handling of computed fields — was the unglamorous majority of the work.
The tooling support doesn't help as much as you'd hope. Provider schemas exist in machine-readable form, but they describe what Terraform accepts, not what a given resource's live state looks like. The translation layer between "what the cloud API returned" and "what Terraform wants to see" has to be handwritten, one resource type at a time.
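To make that concrete, here is a minimal sketch of what one handwritten translation layer can look like. Everything here — the Mapper interface, sgMapper, the field table — is hypothetical illustration, not Terraback's actual code; the point is that each resource type needs its own explicit mapping from API response fields to provider arguments, and that fields the API returns but the provider rejects must be dropped.

```go
package main

import "fmt"

// Attrs is a generic attribute bag as returned by a cloud API.
type Attrs map[string]any

// Mapper translates one resource type's live API state into the
// argument set the Terraform provider actually accepts. One
// handwritten implementation per resource type.
type Mapper interface {
	TerraformType() string
	Map(apiState Attrs) (Attrs, error)
}

// sgMapper is a hypothetical mapper for AWS security groups.
type sgMapper struct{}

func (sgMapper) TerraformType() string { return "aws_security_group" }

// fieldMap is the handwritten API-field -> provider-argument table.
var fieldMap = map[string]string{
	"GroupName":   "name",
	"Description": "description",
	"VpcId":       "vpc_id",
}

func (sgMapper) Map(api Attrs) (Attrs, error) {
	out := Attrs{}
	// Copy only known fields; the API returns plenty (OwnerId,
	// request metadata) that the provider rejects on input.
	for apiField, tfArg := range fieldMap {
		if v, ok := api[apiField]; ok {
			out[tfArg] = v
		}
	}
	if out["description"] == nil {
		return nil, fmt.Errorf("security group missing required description")
	}
	return out, nil
}

func main() {
	var m Mapper = sgMapper{}
	tf, _ := m.Map(Attrs{"GroupName": "web", "Description": "web tier", "VpcId": "vpc-123", "OwnerId": "000000000000"})
	fmt.Println(tf["name"], tf["vpc_id"])
}
```

The allow-list direction matters: copying everything and deleting known-bad fields breaks the moment an API adds a field, while copying only known-good fields degrades to a missing attribute you can fix.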
Dependency graphs have cycles you don't expect
The first version of the dependency resolver worked by following explicit references — a subnet belongs to a VPC, so emit the VPC first. That handled maybe 70% of cases. The remaining 30% showed me corners of cloud provider design I'd never consciously thought about.
IAM roles that trust other roles that trust the original role. Security groups that reference each other for inter-service traffic. Route tables that depend on internet gateways that depend on VPCs that are only there because of the route tables. Classic graph theory problems that the cloud providers cheerfully let you create because they don't care about declarative ordering — they resolve at runtime.
The fix was topological sorting with cycle detection, plus a small set of provider-specific rules for breaking cycles safely. For example: if two security groups reference each other, emit them both first without the cross-references, then emit the references as separate aws_security_group_rule blocks. It works, but "works" hides a lot of provider-specific research.
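A sketch of the sorting step, assuming nothing about Terraback's internals beyond what's described above: Kahn's algorithm emits everything that can be ordered, and whatever is left over after the queue drains must sit on a cycle, so it gets handed to the provider-specific cycle-breaking rules.

```go
package main

import "fmt"

// topoSort orders resources so dependencies are emitted first, using
// Kahn's algorithm. Nodes left over when the queue drains belong to
// cycles and are returned separately, so provider-specific rules can
// break them (e.g. splitting mutual security-group references out
// into separate rule resources).
func topoSort(deps map[string][]string) (order, cyclic []string) {
	indegree := map[string]int{}
	dependents := map[string][]string{}
	for node, reqs := range deps {
		if _, ok := indegree[node]; !ok {
			indegree[node] = 0
		}
		for _, d := range reqs {
			if _, ok := indegree[d]; !ok {
				indegree[d] = 0
			}
			indegree[node]++ // node waits on d
			dependents[d] = append(dependents[d], node)
		}
	}
	var queue []string
	for n, deg := range indegree {
		if deg == 0 {
			queue = append(queue, n)
		}
	}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		order = append(order, n)
		delete(indegree, n)
		for _, m := range dependents[n] {
			indegree[m]--
			if indegree[m] == 0 {
				queue = append(queue, m)
			}
		}
	}
	for n := range indegree {
		cyclic = append(cyclic, n) // stuck on a cycle
	}
	return order, cyclic
}

func main() {
	order, cyclic := topoSort(map[string][]string{
		"subnet": {"vpc"},
		"sg-a":   {"sg-b", "vpc"}, // mutual reference:
		"sg-b":   {"sg-a", "vpc"}, // classic SG cycle
	})
	fmt.Println(order, cyclic)
}
```

In this toy graph, vpc and subnet come out in dependency order while sg-a and sg-b land in the cyclic bucket, which is exactly where the "emit both, then emit the cross-references separately" rule takes over.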
Import block correctness is the actual product
Terraform 1.5 introduced import blocks as a first-class declarative way to import existing resources. That became the core output format — for every resource Terraback generates, it also emits a matching import block so you can terraform plan your way to a clean state without running manual terraform import commands.
The hard part isn't emitting the import block. It's making sure the .tf you generated alongside it produces zero diff against the imported resource.
Zero-diff is a tight constraint. It means every optional attribute you didn't set must match the provider's default. Every computed attribute has to be either omitted or exactly right. Every nested block has to be structured the way the provider expects. One wrong field, one attribute you didn't realise was being defaulted in the background, and the plan output is a wall of red.
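One piece of that constraint can be shown directly. This is a simplified sketch, not Terraback's actual pruning logic, and the example attribute names are made up — but it captures the rule: computed attributes never get written into config, and attributes that match the provider default get dropped so they can't drift into the diff.

```go
package main

import (
	"fmt"
	"reflect"
)

// pruneDefaults drops computed-only attributes and attributes whose
// live value equals the provider default. Emitting either is how a
// generated config produces a wall of red on its first plan.
func pruneDefaults(live, defaults map[string]any, computed map[string]bool) map[string]any {
	out := map[string]any{}
	for k, v := range live {
		if computed[k] {
			continue // the provider sets this; never write it
		}
		if def, ok := defaults[k]; ok && reflect.DeepEqual(def, v) {
			continue // same as the default; writing it is noise
		}
		out[k] = v
	}
	return out
}

func main() {
	live := map[string]any{
		"instance_type": "t3.micro",
		"monitoring":    false,             // provider default
		"arn":           "arn:aws:ec2:...", // computed by the provider
	}
	out := pruneDefaults(live,
		map[string]any{"monitoring": false},
		map[string]bool{"arn": true})
	fmt.Println(out) // only instance_type survives
}
```

The catch, as the text says, is knowing the defaults and the computed set per attribute, per resource, per provider — that table is where the months went.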
Getting the generated code to plan clean, consistently, across thousands of resources, across three providers, was the single longest-running engineering problem in the project. It's the thing that separates "tool that writes HCL" from "tool that produces Terraform you can ship."
Bulk performance is its own small research project
Serial imports on a large environment are a non-starter. A small AWS account might have 3,000 resources; a large enterprise account is easily 30,000+. At a few hundred milliseconds per API call, and often several calls per resource, a serial scan won't finish before Christmas.
Parallel scanning introduces two new problems. The first is rate limiting — every provider has throttling, most providers don't document the actual limits, and some providers return cryptic errors that don't clearly indicate you've been throttled. The second is memory — holding 30,000 resource descriptions in memory while you walk a dependency graph over them will happily eat a developer laptop alive.
The answer ended up being a bounded worker pool with per-provider rate limit profiles (derived empirically, because documented limits are usually wrong), plus a caching layer that lets you re-scan a partially-changed environment without refetching everything. The cache turned out to be disproportionately useful — consulting work involves iterating on output repeatedly, and rescanning a 10,000-resource account from scratch each time makes the tool unusable for the exact use case it was built for.
Templated output
Different teams have different Terraform conventions. Some organise by environment, some by service, some by team ownership. Some use workspaces, some use separate state files per module. Some name modules with prefixes, some with suffixes, some with hyphens, some with underscores, and all of them consider everyone else's convention wrong.
Rather than pick one and be wrong for everyone, Terraback generates output through Jinja2 templates. The default templates produce a sensible structure — compute, networking, storage, identity as top-level directories — but every layout decision is overridable. Want all security groups in one file? Change one template. Want resources grouped by tag? Change another. The tool adapts to your repo conventions instead of forcing its own.
This was the single biggest ergonomic unlock. Early users consistently pushed back on generated structure until it matched what they already did elsewhere. Templates made "what it generates" a user decision rather than a developer decision.
What I learned building it
A few lessons that weren't obvious at the start:
You don't really know a cloud provider until you've tried to faithfully serialise its state into another tool's language. Reading docs teaches you the happy path. Reverse-engineering live state teaches you every corner of the API that's been deprecated, renamed, or quietly changed behaviour across versions. I know AWS, Azure, and GCP at a level now that no certification or course could have produced.
The gap between "generates HCL" and "generates correct HCL" is enormous. Anyone can write a tool that emits resource blocks. Writing one where the output reliably plans clean against the source environment is a different project entirely, and the last 5% takes longer than the first 95%.
Tooling beats heroics, but building the tooling is often more work than the heroics it replaces. I could have manually imported each client's infrastructure faster, in the short term, than I spent building Terraback. The payoff is in repeatability — the tool helps every engagement, not just the one that inspired it.
Templates are an escape hatch from opinionated decisions. The more opinionated a code-generation tool is about its output structure, the less it gets adopted. Making layout configurable via templates was a late addition that should have been in the first version.
Cross-provider parity is the hardest constraint. Every decision has to work three different ways at once. A resolver that works perfectly for AWS but handles Azure awkwardly is a resolver that needs to be redesigned. Keeping the CLI surface consistent across three cloud APIs that have almost nothing in common forced better abstractions than I would have found working on one provider at a time.
Where it sits now
Terraback works. It handles the full list of services that most real environments actually use — compute, networking, storage, IAM, databases, containers. It generates clean, modular, importable Terraform that plans clean against live infrastructure. The cache layer is fast enough to iterate on output. The template system is flexible enough to match most teams' conventions.
The engineering is the thing I'm proudest of. Building something like this end-to-end — designing the architecture, wrestling the provider APIs, solving the dependency resolution, getting the generated code to plan clean — is the kind of project every infrastructure engineer should do at least once. It reshapes how you think about cloud platforms, IaC, and the gap between declarative intent and actual cloud state.
If you're a DevOps or platform engineer who's ever wondered what's really under the hood when Terraform talks to a cloud provider, try building a tool like this yourself. It's the best technical education I've given myself in years.