There's a particular smell to audit season. Calendars go dark for two weeks. Someone owns a spreadsheet with 200 rows and a column called "evidence location." Engineers get pinged to screenshot a console they last touched in March. And the whole organization quietly accepts that, for a quarter, we will stop building and start proving — as if the two were different activities.
They aren't. The reason audits feel like fire drills is that we treat evidence as something we go find after the fact, instead of something the system emits as it runs. Change that one assumption and the audit stops being an event. It becomes a query.
The real mechanic: evidence is a build artifact
Here's the reframe I keep coming back to. A control is just a claim about how your system behaves. "Encryption in transit is enforced." "Production access requires MFA and is logged." "Every deploy is peer-reviewed." Every one of those claims is already true or false in your infrastructure at this exact moment, and most of them are already knowable from data you have — IAM policies, pipeline logs, Terraform state, your CSPM findings, your ticketing system.
The gap is that the proof lives in a dozen consoles and the auditor wants it in a binder. So every cycle we run humans as the integration layer between "the system is compliant" and "here is a PDF that says so." That human integration layer is the fire drill. It's slow, it's error-prone, and worst of all it produces evidence that's stale the moment it's captured.
The fix is to treat evidence the way we treat any other build output. If a control claim is testable, write the test. Run it in CI. Store the result, timestamped and signed, the same way you store a coverage report or a SAST scan. When the auditor shows up, you don't go gather anything — you hand them a feed. The work moved left, into the pipeline, where the cost of being wrong is a failed check instead of a finding.
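To make "timestamped and signed" concrete, here is a minimal sketch using only the Python standard library. The function names, the record fields, and the choice of HMAC-SHA256 with a shared key are all assumptions for illustration; a real pipeline might well prefer asymmetric signing with a key the build agent cannot read back.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _canonical(record: dict) -> bytes:
    # Stable serialization so the same record always produces the same bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def make_evidence(control_id: str, passed: bool, details: dict, key: bytes) -> dict:
    """Build a timestamped evidence record and attach an HMAC-SHA256 signature."""
    record = {
        "control_id": control_id,  # hypothetical internal control identifier
        "passed": passed,
        "details": details,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }
    record["signature"] = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return record


def verify_evidence(record: dict, key: bytes) -> bool:
    """Recompute the signature over every field except the signature itself."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    expected = hmac.new(key, _canonical(unsigned), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record.get("signature", ""))
```

Stored next to the coverage report, a record like this lets anyone holding the key confirm later that the result was not edited after the build produced it.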
Why this is suddenly practical, not aspirational
People have wanted "continuous compliance" for a decade. What has actually changed is that the two regimes most regulated shops live under, SOC 2 and PCI DSS, are both becoming machine-addressable.
On the SOC side, NIST's OSCAL (Open Security Controls Assessment Language) turns control catalogs, system security plans, and assessment results into structured JSON, XML, or YAML instead of prose. That matters more than it sounds. A SOC 2 report as a PDF is a document you read once and file. The same content expressed in OSCAL is something a machine can diff, map, and ingest. When your vendor's controls and your own controls speak the same schema, the "shared responsibility" handwaving turns into an actual join. You can answer "which of my controls depend on a sub-processor control that just changed?" without a human re-reading anyone's report.
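The "actual join" can be sketched in a few lines. The shapes below are deliberately simplified stand-ins and not the OSCAL schema: a vendor catalog is assumed to be a plain mapping of control ID to statement text, and the dependency map from your controls to vendor controls is something you would have to maintain yourself. All IDs and function names are hypothetical.

```python
from __future__ import annotations


def changed_controls(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """IDs whose statement was added, removed, or modified between two versions."""
    return {cid for cid in old.keys() | new.keys() if old.get(cid) != new.get(cid)}


def affected_controls(
    dependencies: dict[str, set[str]], changed: set[str]
) -> dict[str, set[str]]:
    """Map each of our controls to the changed vendor controls it relies on."""
    return {
        mine: deps & changed
        for mine, deps in dependencies.items()
        if deps & changed
    }


# Hypothetical data: two versions of a sub-processor's control statements.
vendor_old = {"V-1": "TLS 1.2 or later", "V-2": "Backups run daily"}
vendor_new = {"V-1": "TLS 1.2 or later", "V-2": "Backups run weekly", "V-3": "New control"}

# Hypothetical map of our controls to the vendor controls they depend on.
our_dependencies = {"ENC-01": {"V-1"}, "BCP-04": {"V-2"}, "LOG-02": {"V-1", "V-3"}}

impacted = affected_controls(our_dependencies, changed_controls(vendor_old, vendor_new))
```

With structured inputs the question reduces to a set intersection, which is the point: the hard part is getting both sides into a shared schema, not the query.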
On the PCI side, the move toward explicit, current cryptography requirements — named algorithms, named key strengths, defined expiry for the weak stuff — is a gift to anyone who wants to automate. Vague controls can't be tested; "use strong cryptography" is a debate. "TLS configurations must not negotiate the following ciphers" is an assertion you can run against every endpoint on a schedule and fail the build when it drifts. The more prescriptive the requirement, the more cleanly it compiles into code. For once, specificity in a standard is the feature, not the burden.
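As a rough sketch of what such an assertion could look like, the probe below offers one cipher at a time and treats a completed handshake as a violation. It uses Python's standard `ssl` module; the denylist, host names, and function names are hypothetical, and the real list should come from your own policy rather than from this example. Two caveats apply: `set_ciphers` only governs TLS 1.2 and earlier, so the probe caps the protocol version, and a local OpenSSL build that has dropped a cipher cannot probe for it.

```python
from __future__ import annotations

import socket
import ssl

# Hypothetical denylist in OpenSSL cipher-name form, for illustration only.
FORBIDDEN_CIPHERS = ("RC4-SHA", "DES-CBC3-SHA")


def accepts_cipher(host: str, cipher: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if the server completes a handshake using only this cipher."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # We are probing cipher support, not validating the server's identity.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    # set_ciphers does not control TLS 1.3 suites, so cap the version.
    context.maximum_version = ssl.TLSVersion.TLSv1_2
    # Raises ssl.SSLError if the local OpenSSL cannot offer this cipher;
    # failing loudly is safer than reporting a pass we could not verify.
    context.set_ciphers(cipher)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLError:
        return False


def find_violations(hosts, ciphers=FORBIDDEN_CIPHERS, probe=accepts_cipher) -> list[tuple[str, str]]:
    """List every (host, cipher) pair where a forbidden cipher was accepted."""
    return [(h, c) for h in hosts for c in ciphers if probe(h, c)]
```

A scheduled job could call `find_violations` over the endpoint inventory and fail when the list is non-empty. The `probe` parameter exists so the decision logic can be tested without touching the network.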
Put those together and the architecture is obvious: prescriptive requirements become automated checks, automated checks emit structured results, and structured results map back to a structured control framework. Evidence stops being a screenshot and becomes a record in a system of record.
What I'd actually build first
Don't try to boil the framework. The trap with "compliance as code" is that someone tries to automate all 300 controls in one heroic quarter, burns out, and the spreadsheet survives. Start where the pain and the determinism overlap.
- Pick your most-asked, most-automatable controls. Encryption settings, MFA enforcement, logging coverage, change-management linkage between a deploy and an approved PR. These are the ones auditors always probe and your cloud APIs already answer authoritatively.
- Write the check as a test that fails the pipeline, not a report someone reads. If a control can be violated by a merge, the merge should be what catches it. Evidence and enforcement are the same artifact when you do this right.
- Emit results in a structured format and keep the lineage. Record the timestamp, the commit, the responsible owner, and the control each result maps to. That lineage is what turns "we think we're compliant" into "here's the immutable trail."
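The three steps above can be sketched together for the change-management case. Everything here is an assumption made for illustration: the shape of the deploy record, the control ID, the owner name, and the rule that at least one approver must differ from the author. In practice the deploy data would come from your pipeline and source-control APIs.

```python
from __future__ import annotations

from datetime import datetime, timezone


def deploy_is_peer_reviewed(deploy: dict) -> tuple[bool, str]:
    """Pass only if the deploy links to a PR approved by someone other than its author."""
    pr = deploy.get("pr")
    if not pr:
        return False, "deploy is not linked to a pull request"
    reviewers = set(pr.get("approved_by", [])) - {pr.get("author")}
    if not reviewers:
        return False, "no approval from someone other than the author"
    return True, "approved by " + ", ".join(sorted(reviewers))


def run_check(control_id: str, owner: str, deploy: dict, check=deploy_is_peer_reviewed) -> dict:
    """Run one check and return a structured result that carries its lineage."""
    passed, reason = check(deploy)
    return {
        "control_id": control_id,          # the control this result maps to
        "owner": owner,                    # the team responsible for keeping it green
        "commit": deploy.get("commit"),    # what was deployed
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "passed": passed,
        "reason": reason,
    }


def exit_code(results: list[dict]) -> int:
    """Non-zero when any check failed, so the pipeline step fails with it."""
    return 0 if all(r["passed"] for r in results) else 1
```

A pipeline step would serialize each result as JSON for the evidence store and then exit with `exit_code(results)`, so the same run both blocks the unreviewed deploy and records why.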
The cultural shift is bigger than the technical one. The moment a control is a failing test, it becomes the owning team's problem to keep green — not the GRC team's problem to chase down once a quarter. Compliance stops being a thing done to engineering and becomes a property engineering maintains, like uptime. That's the whole game.
I'll be honest about the limit: not everything compiles. Judgment, policy intent, vendor governance, the human review of an exception — those still need people, and they should. But every control you automate is a row you never argue about again, and it frees your sharpest people to spend the audit on the things that actually require a brain.
So here's the challenge. Before your next assessment, count how many of your evidence requests are screenshots of a setting a machine could have checked. That number is your fire-drill tax. Pick the top five, turn them into pipeline checks this cycle, and aim for the most boring audit of your career. In this line of work, boring is the highest compliment there is.
