Introducing OpenForm: Documents as Code

OpenForm is an open-source framework that gives AI agents and developers a context layer and tooling to automate transaction and compliance documents.

OpenForm treats documents as code. By defining them as structured artifacts, it makes them composable, modular, portable, testable, and easy to version.

To understand why that matters, it helps to look at how document infrastructure is usually built today.

Documents Are APIs Nobody Designed

If you've built software in insurance, finance, healthcare, or other regulated domains, you've probably maintained a document layer made up of PDF templates, field mapping spreadsheets, validation rules, template rendering APIs, and test suites. All of this falls on a "forms person" on the dev team who interacts with legal, compliance, and template designers.

This pattern is so common that teams stop noticing how strange it is. We have mature infrastructure for APIs, databases, queues, and files. We do not have the same discipline at the document layer, even though regulated documents carry structure, rules, and downstream consequences. Rendering gets treated as the definition of the document when it is really only one output.

The result is drift and a high maintenance burden. When a document changes, the templates, field mappings, validation rules, rendering code, and tests all have to change with it. This is not a tooling gap. It is a structural one.

Documents as Code

With OpenForm, each document is defined as a structured artifact with explicit fields, parties, layers, validation rules, and rendering logic. That artifact becomes the source of truth for capture, validation, and output. Instead of scattering document behavior across templates, frontend code, backend services, and operational workarounds, you define the document once and let the rest of the system work from that definition.

Your application populates the artifact with data. Validation runs against it deterministically. The same definition can render to multiple output layers. Consider a W-9 artifact: it can declare that taxpayer name and TIN are required, that tax classification changes which fields are relevant, and that the validated instance can render to a web intake flow, a completed PDF, or structured JSON for downstream systems.
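To make the W-9 example concrete, here is a minimal TypeScript sketch of the idea. It is not OpenForm's actual schema or API: the `Artifact` and `FieldDef` shapes, the `validate` and `renderJson` helpers, and the classification values are all hypothetical names chosen for illustration, and the W-9 rules are heavily simplified.

```typescript
// Hypothetical shapes for illustration only; not OpenForm's real schema or API.
type Answers = Record<string, string | undefined>;

interface FieldDef {
  id: string;
  label: string;
  required: boolean;
  // When present, the field only applies if this predicate returns true.
  appliesWhen?: (data: Answers) => boolean;
  // Returns true when a non-empty value is well formed.
  isValid?: (value: string) => boolean;
}

interface Artifact {
  name: string;
  version: string;
  fields: FieldDef[];
}

interface ValidationError {
  field: string;
  message: string;
}

// Builds a predicate that accepts only the listed values.
const oneOf = (...allowed: string[]) => (value: string): boolean =>
  allowed.indexOf(value) !== -1;

// A simplified W-9: name and TIN are required, and the tax
// classification decides whether the LLC code is relevant.
const w9: Artifact = {
  name: "w9",
  version: "0.1.0",
  fields: [
    { id: "name", label: "Taxpayer name", required: true },
    {
      id: "taxClassification",
      label: "Federal tax classification",
      required: true,
      isValid: oneOf("individual", "c_corp", "s_corp", "partnership", "llc"),
    },
    {
      id: "llcTaxCode",
      label: "LLC tax classification (C, S, or P)",
      required: true,
      appliesWhen: (data) => data["taxClassification"] === "llc",
      isValid: oneOf("C", "S", "P"),
    },
    {
      id: "tin",
      label: "Taxpayer identification number",
      required: true,
      // Nine digits once hyphens are removed (SSN or EIN format).
      isValid: (value) => /^\d{9}$/.test(value.replace(/-/g, "")),
    },
  ],
};

// Deterministic validation: the same artifact and data always yield the same errors.
function validate(artifact: Artifact, data: Answers): ValidationError[] {
  const errors: ValidationError[] = [];
  for (const field of artifact.fields) {
    if (field.appliesWhen && !field.appliesWhen(data)) continue;
    const value = data[field.id];
    if (value === undefined || value.trim() === "") {
      if (field.required) {
        errors.push({ field: field.id, message: `${field.label} is required` });
      }
      continue;
    }
    if (field.isValid && !field.isValid(value)) {
      errors.push({ field: field.id, message: `${field.label} is not valid` });
    }
  }
  return errors;
}

// One possible output layer: structured JSON, produced only from a valid instance.
function renderJson(artifact: Artifact, data: Answers): string {
  const errors = validate(artifact, data);
  if (errors.length > 0) {
    throw new Error(`Cannot render ${artifact.name}: ${errors.length} validation error(s)`);
  }
  const out: Record<string, string> = {};
  for (const field of artifact.fields) {
    if (field.appliesWhen && !field.appliesWhen(data)) continue;
    const value = data[field.id];
    if (value !== undefined) out[field.id] = value;
  }
  return JSON.stringify({ artifact: artifact.name, version: artifact.version, data: out });
}

// An LLC filer who has not supplied the LLC code gets one error, for llcTaxCode.
console.log(validate(w9, { name: "Ada Lovelace", taxClassification: "llc", tin: "12-3456789" }));
```

The point of the sketch is that the conditional rule lives next to the field it governs, so a PDF renderer or a web intake flow could be written against the same `Artifact` value in the same way `renderJson` is.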

Once the document exists that way, agents no longer have to infer the form from layout. They can inspect the artifact for its fields, requirements, conditions, and outputs. That unlocks workflows that are difficult to do cleanly in the old stack: conversational form completion, automatic UI and intake generation, deterministic validation before submission, and multimodal document I/O across chat, voice, translation, summaries, and rendered outputs.
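As a rough illustration of conversational form completion, the sketch below shows how an agent loop could decide what to ask next by reading a field list instead of parsing a layout. The `FieldSpec` shape, the declarative `onlyWhen` condition, and `nextMissingField` are assumptions made for this example, not part of OpenForm.

```typescript
// Hypothetical, serializable field description; not OpenForm's actual schema.
interface FieldSpec {
  id: string;
  prompt: string;
  required: boolean;
  // Declarative condition: the field applies only when another field has this value.
  onlyWhen?: { field: string; equals: string };
}

type Collected = Record<string, string>;

const w9Fields: FieldSpec[] = [
  { id: "name", prompt: "What is the taxpayer's legal name?", required: true },
  { id: "taxClassification", prompt: "What is the federal tax classification?", required: true },
  {
    id: "llcTaxCode",
    prompt: "Is the LLC taxed as a C corporation, S corporation, or partnership?",
    required: true,
    onlyWhen: { field: "taxClassification", equals: "llc" },
  },
  { id: "tin", prompt: "What is the taxpayer identification number?", required: true },
];

// A field applies when it has no condition or when its condition is met.
function applies(spec: FieldSpec, collected: Collected): boolean {
  return spec.onlyWhen === undefined || collected[spec.onlyWhen.field] === spec.onlyWhen.equals;
}

// Returns the next required, applicable field with no answer yet, or null when done.
function nextMissingField(specs: FieldSpec[], collected: Collected): FieldSpec | null {
  for (const spec of specs) {
    if (!spec.required || !applies(spec, collected)) continue;
    const answer = collected[spec.id];
    if (answer === undefined || answer.trim() === "") return spec;
  }
  return null;
}

// After the user says the entity is an LLC, the next question is the LLC code.
const next = nextMissingField(w9Fields, { name: "Ada Lovelace", taxClassification: "llc" });
console.log(next === null ? "Form complete" : next.prompt);
```

Because the condition is data and not code, the same list could be handed to an agent as context, serialized to JSON, or used to generate an intake UI.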

That changes the shape of the work. Developers can version and test document logic. Compliance teams can inspect what a document requires. AI agents can work from the same structure the application uses. The document becomes infrastructure instead of glue code.

AI Makes This Timely

For years, teams could absorb the maintenance cost of broken document infrastructure. It was painful, but it stayed contained inside engineering teams. AI changes that because agents hit the weakness of the old model immediately.

An agent asked to complete a regulated document does not want a screenshot or a PDF blob. It needs structure. It needs to know what fields exist, which are required, which are conditional, and what counts as valid. Without that, the agent is forced to guess.

In regulated workflows, guessing correctly 80% of the time is not good enough. Output that looks right but violates the underlying rules is worse than no output at all. The constraint is not intelligence. It is infrastructure. As more teams try to automate document-heavy workflows, the need for machine-readable document definitions becomes obvious.

What Exists Today

OpenForm is open source under the MIT license. Today you can define artifacts, populate them with data, validate them, and render them without a signup flow or cloud dependency. The framework includes TypeScript libraries for integrating artifacts into applications, CLI tooling for working with them locally and in pipelines, and MCP support so agent frameworks can discover, fill, validate, and render artifacts through a standard tool interface.
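To give a feel for what a standard tool interface can look like, the sketch below derives an MCP-style tool descriptor (a name, a description, and a JSON Schema for the input) from a list of fields. This is an assumption-laden illustration: the `SimpleField` shape, the `toFillTool` helper, and the `fill_` tool name are invented here and do not describe the tools OpenForm actually exposes.

```typescript
// Hypothetical field shape, for illustration only.
interface SimpleField {
  id: string;
  description: string;
  required: boolean;
  allowed?: string[];
}

// Subset of an MCP tool descriptor: name, description, and JSON Schema input.
interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: "string"; description: string; enum?: string[] }>;
    required: string[];
  };
}

// Derives a "fill" tool from the field list (tool name is an assumption).
function toFillTool(artifactName: string, fields: SimpleField[]): ToolDescriptor {
  const properties: ToolDescriptor["inputSchema"]["properties"] = {};
  const required: string[] = [];
  for (const field of fields) {
    properties[field.id] = field.allowed
      ? { type: "string", description: field.description, enum: field.allowed }
      : { type: "string", description: field.description };
    if (field.required) required.push(field.id);
  }
  return {
    name: `fill_${artifactName}`,
    description: `Populate the ${artifactName} artifact with field values.`,
    inputSchema: { type: "object", properties, required },
  };
}

const fillW9 = toFillTool("w9", [
  { id: "name", description: "Taxpayer name", required: true },
  {
    id: "taxClassification",
    description: "Federal tax classification",
    required: false,
    allowed: ["individual", "c_corp", "s_corp", "partnership", "llc"],
  },
  { id: "tin", description: "Taxpayer identification number", required: true },
]);

console.log(JSON.stringify(fillW9, null, 2));
```

A flat `required` list cannot express conditional requirements, so a real integration would still need to run the artifact's own validation after the tool call.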

That means the foundation is already there: one definition for the document, deterministic validation around it, multiple outputs from the same source of truth, and a tooling surface that both applications and agents can work with directly.

Where We Are Taking This

We think documents should be treated with the same rigor as the rest of software: structured, versioned, validated, and open. That is what Documents as Code means in practice. Our goal is not just to ship a framework. It is to make document infrastructure easier to implement, easier to maintain, and easier for AI systems to work with.

Over time, that means a public registry of common artifacts so teams do not have to recreate canonical documents like W-9s, 1099s, and ACH authorizations from scratch, plus stronger tooling around how those artifacts are defined, shared, and used across systems. The direction is straightforward: fewer one-off document implementations, more reusable infrastructure.

Start with the docs site and inspect how an artifact is defined. If your team is dealing with regulated forms, internal document rendering systems, or AI workflows that keep running into PDFs, reach out. We would like to compare notes, learn where the current stack breaks, and hear your feedback.