The IF Software Factory: Every Team Member is an Army
Most of the team at Immersive Fusion is holding a coffee right now. A few are looking at a screen. One is on a walk. One is on a customer call. The machines are the ones typing.
This is not a joke, a provocation, or a recruiting line. It is the operating model. Immersive Fusion runs as an AI-native software factory. Engineering, product management, and go-to-market all run on top of a shared governance layer where humans define intent, set constraints, and own outcomes, and agents do the work in between. Code, specs, pricing updates, sales collateral, legal reviews, release notes, market research, blog posts: the shape of the day is the same. A human opens a ticket with intent. Agents converge on the answer. A human approves the result.
We are a small team running a company at the scale that AI makes possible. That is the point, not the caveat.
Something changed in late 2024. Long-horizon agentic coding workflows stopped compounding errors and started compounding correctness. Several teams recognized this independently, including our own. The factory pattern that emerged was, for most of the industry, aimed at code. We aimed it at the whole company.
One Repository Governs Everything
Every AI-native company faces the same question on day one: where does the agent read truth from? Get that wrong and you get confident hallucination at enterprise scale. Get it right and you get leverage.
Our answer is a single upstream source of truth. It holds the business strategy, pricing, product specifications, roadmaps, competitive analysis, messaging standards, legal posture, brand voice, marketing narratives, engineering conventions, and the work tracker. It is the constitution of the company, and every agent, every human, and every downstream surface reads from it.
Everything downstream subscribes to it. The docs site, the company site, the product surfaces, the pitch decks, and every external channel pull business context from the same upstream source. When pricing changes upstream, every downstream surface stays in sync. When a competitive claim gets refined, the messaging standards update and every agent writing a blog, a landing page, or a sales email picks up the new language on the next run.
This is not a content management system. It is a governance layer. The repository is how we tell a small army of agents what kind of company we are.
What lives upstream:
| Domain | What it contains | Why agents need it |
|---|---|---|
| Strategy | Business model, GTM, competitive analysis, vision | Every agent inherits positioning without having to infer it |
| Product | Specs, roadmaps, feature inventories | PM and sales agents cannot drift from what engineering actually built |
| Marketing | Messaging standards, narratives, banned phrases, tone rules | One voice across blog, site, decks, social, docs |
| Finance | Pricing, tiers, unit economics | No agent ever quotes a stale price |
| Legal | Contracts, trademarks, brand usage rules | Every public artifact inherits the legal guardrails automatically |
| Engineering | Conventions, architectural decisions, runbooks | Code agents inherit the house style |
| Work tracker | Active epics, spikes, tasks, status | Agents resume work where the last session left off |
A human can read this repository and understand the company in an afternoon. An agent can read it in a second. Both get the same picture. That symmetry is the point.
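To make "everything downstream subscribes to it" concrete, here is a minimal sketch of the pattern. The file layout and field names are hypothetical, not our actual repository schema: the point is that every surface renders from one upstream record instead of hard-coding its own copy.

```python
import json
from pathlib import Path

# Hypothetical upstream layout: one file holds pricing truth.
UPSTREAM = Path("governance")

def load_pricing() -> dict:
    """Read pricing from the single upstream source of truth."""
    return json.loads((UPSTREAM / "finance" / "pricing.json").read_text())

def render_pricing_page(pricing: dict) -> str:
    """The pricing page renders directly from the upstream record."""
    rows = [f"{tier['name']}: ${tier['price_per_month']}/mo"
            for tier in pricing["tiers"]]
    return "\n".join(rows)

def render_sales_email(pricing: dict) -> str:
    """A sales agent quotes the same record, so it can never go stale."""
    cheapest = min(pricing["tiers"], key=lambda t: t["price_per_month"])
    return f"Plans start at ${cheapest['price_per_month']}/mo."
```

When the upstream file changes, every renderer picks up the new numbers on its next run; no surface holds a private copy that can drift.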
The Three Factories
The factory is not one pipeline. It is three, each with its own cadence and its own definition of done, all wired into the same governance layer. In each one, the work splits cleanly between what humans do and what machines do. That split is the design.
Factory One: Engineering
Our engineering factory is the part most people mean when they say "AI writes the code."
What the machines do. Agents read the work item, load the relevant context from the governance layer, generate the code, write the tests, author the migration, update the documentation, and run the work through a gated pipeline. The pipeline has multiple automated verification layers covering security, correctness, architectural compliance, and behavioral regression. Every merge to main passes automated gates. When agents generate tests for agent-generated code, mutation testing verifies that those tests would actually catch bugs, not just run green. The pipeline decides whether the work ships. No human peer review is required for agent-generated code, because the gates are stricter than peer review would be.
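The gate pattern itself is simple to sketch. The gate names and artifact fields below are illustrative stand-ins, not our actual pipeline; real gates would shell out to scanners, test runners, and mutation-testing tools.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class GateResult:
    name: str
    passed: bool

def security_gate(artifact: dict) -> GateResult:
    """Illustrative gate: no secrets may appear in the diff."""
    leaked = any("SECRET" in line for line in artifact["diff"])
    return GateResult("security", not leaked)

def mutation_gate(artifact: dict) -> GateResult:
    """A surviving mutant means the tests would not catch that bug."""
    return GateResult("mutation",
                      artifact["mutants_killed"] == artifact["mutants_total"])

def run_gates(artifact: dict,
              gates: list[Callable[[dict], GateResult]]) -> list[GateResult]:
    """Run every gate; the pipeline, not a human reviewer, decides."""
    return [gate(artifact) for gate in gates]

def merge_allowed(results: list[GateResult]) -> bool:
    """Merge to main only when every gate goes green."""
    return all(r.passed for r in results)
```

The human never reads the diff line by line; they read the failing gate and sharpen the intent.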
What the humans do. Humans write the intent, shape the constraints, and judge whether the output matches what was asked for. They hold the shape of the system in their head and intervene when the agent is about to make a decision that only a human should make. When the gates go green, the human approves and moves on. When they do not, the human reads the failing signal, sharpens the intent, and the agents try again. The human does not type code. The human thinks about the system. A developer who types all day is a developer who is not thinking. A developer who is thinking is a developer who is directing an army.
Factory Two: Product Management
The product factory is the one most companies have not built yet. Roadmaps, specs, feature definitions, competitive positioning, and release planning all live as structured artifacts in the upstream repository.
What the machines do. Agents read the artifacts, reason across them, and produce the next set of proposals. A new competitive move triggers a teardown. A shipped feature triggers a release note, a docs update, a marketing brief, a sales enablement update, and a pricing page review, all from the same commit. The machines do the legwork of turning a product decision into every downstream artifact that decision implies.
What the humans do. Humans decide what is worth building, what is not, and what the company sounds like when it talks about the work. The PM is not a ticket factory. The PM is a taste function. The PM chooses which competitive signals matter, which customer requests are real, and which proposed features deserve a human's scarce attention. The agents cannot do that, and we are not trying to make them.
Factory Three: Sales and Go-to-Market
The sales factory is the most surprising one to outsiders. Landing pages, competitive comparison pages, outbound sequences, pitch decks, demo scripts, objection handlers, and analyst briefing materials are all generated from the same governance layer.
What the machines do. When a competitor publishes a new feature, an agent reads the announcement, cross-references our positioning, drafts a counter-punch blog post, updates the relevant compare page, and queues a refresh of the demo deck. Every outbound sequence, every briefing document, every landing page rewrite starts as agent output that a human then shapes.
What the humans do. Humans are in the room with the customer. The seller reads the prospect, listens for what they are not saying, and decides when to push and when to hold back. The seller does not write collateral. The seller shows up with collateral already written and spends the meeting being present. Everything a seller used to spend the week building now exists before the week starts.
One Team, No Walls
One of the quiet consequences of building this way is that the factory is flat. Every team member gets the same AI tools, the same upstream source of truth, and the same process for turning intent into output. The same prompts work for the founder as for the most recent hire. No private stack for leadership and a thinner stack for everyone else. No research team hoarding capabilities the rest of the company cannot see.
Most companies are organized around walls. Engineering builds, then throws code over the wall to QA. Product writes specs, then throws them over the wall to engineering. Marketing gets the feature list after the feature ships. Sales learns about the product from a slide deck someone in marketing made last quarter. Finance finds out about a pricing change when the invoice comes back wrong. Every wall creates a translation layer, and every translation layer loses signal.
The factory has no walls. There is one upstream source of truth, and everyone reads from it. The same source, in the same format, with the same depth. The engineer who joined last month opens a work item the same way the CTO opens a work item. The marketer drafting a landing page has access to the same governance layer as the engineer shipping a production fix. The seller preparing a demo briefing uses the same model ladder that answers customer questions inside the product.
The result is that every person in the company has the same mental model of what we are building, why, for whom, and how far along we are. Nobody needs to wait for a sync meeting to find out what another team decided. Nobody needs to reverse-engineer strategy from a Slack thread. The information is there. It has always been there. The factory just makes "there" the same place for everyone. As Tobi Lutke put it in Shopify's AI-first memo: "The fundamental skill of using AI well is to be able to state a problem with enough context, in such a way that the task is plausibly solvable." The governance layer provides the context. The humans bring the problem.
The Assistant That Builds Itself
There is one more thing about the factory that matters more than the rest. The AI assistant we sell runs inside the factory that builds it.
Tessa is the AI assistant that lives inside our product. When a customer opens our platform, Tessa is the interface that diagnoses traces, explains topology, and proposes fixes. Tessa also runs inside our own workflow. She is not one of the agents inside the software factory. She is the orchestrator. She coordinates agents, routes tasks to the right model, and removes the noise and minutiae so the humans can focus on the work that matters. She is the colleague who handles the tedious parts before you even notice they exist.
This is dogfooding with teeth. Most vendors who claim to "use our own product internally" mean they have a dashboard open in a tab somewhere. We mean something different. When a developer opens a work item, Tessa is the one marshaling the agents that converge on the answer, the same orchestrator a customer would talk to. When a product manager writes a spec, the research capability running underneath is the same capability we sell. When a seller prepares a demo, the explainer in the briefing document came from the same model ladder that answers "what is wrong with this trace" for a paying customer.
That symmetry is not an accident. It is the compounding loop. Every improvement we make to Tessa for customers is an improvement to Tessa for us. Every insight we get from using Tessa to build software becomes a feature request that customers benefit from. The assistant we ship gets better because we use it. The factory that builds the assistant gets faster because the assistant gets better. The cycle closes on itself, and each turn is shorter than the last.
Tessa builds Tessa. The tests that exercise her behavior, the skills that extend her capability, the prompts that shape her voice, and the documentation that teaches her to everyone who joins the company were all produced inside the same factory she helps run. The human role in this loop is not writing the code. It is setting the specification, tasting the output, and holding the veto. The recursive part is the point. The human oversight is what keeps it honest.
One Model Is a Monoculture
The factory does not run on a single AI provider. Tessa, our customer-facing assistant, runs on OpenAI's GPT models today, and we published that model ladder so customers can see exactly which model lands on which query. Inside the factory, we run Tessa and Anthropic's Claude side by side. Different models are stronger at different things. A model that is excellent at long-context code reasoning may be mediocre at UI copy. A model that writes like a poet may be too expensive to run for deep investigation.
Over time, we plan to bring that same model diversity to our customers. The architecture is designed for it. But even today, the factory runs on multiple providers.
The first reason is resilience. A factory that runs on a single model provider is a factory with a single point of failure. Providers deprecate models on their own timelines. Providers change prices. Providers suffer outages. Providers change safety policies mid-quarter. We deliberately architect across providers so that no one vendor can brick the production line.
The sharper reason is adversarial robustness. A monoculture is easier to manipulate with adversarial prompts than a routed ensemble. When one agent on one model flags something, and a review agent on a different model either confirms or contradicts it, the disagreement is where the interesting information lives. The factory pays attention to disagreements more than to agreements, because disagreements are where the signal hides.
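A minimal sketch of the cross-model check, with stand-in callables where a real system would wrap two different providers' APIs (the function names and agreement test here are hypothetical simplifications):

```python
from typing import Callable

# A "model" is anything that maps a question to an answer.
Model = Callable[[str], str]

def cross_check(question: str, primary: Model, reviewer: Model) -> dict:
    """Route the same question to two independent models and surface
    disagreement rather than averaging it away."""
    a, b = primary(question), reviewer(question)
    return {
        "primary": a,
        "reviewer": b,
        # Crude agreement test; a real pipeline would compare semantics.
        "agree": a.strip().lower() == b.strip().lower(),
    }
```

When `agree` is false, the disagreement is escalated, because that is where the signal hides. An adversarial prompt that fools one model rarely fools both in the same way.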
Measure Ten Times, Cut Once
There is an old carpentry proverb that says measure twice and cut once. We run the factory on a much more extreme version of the same idea.
The factory flips the traditional ratio of planning to implementation. We spend the vast majority of our time planning, refining, sharpening intent, and drawing the lines the agents are not allowed to cross. The actual handoff to agents is a small fraction of the day. Implementation is the cheap part now. Planning is where the leverage is. As Kent Beck observed about the AI shift: "The whole landscape of what's 'cheap' and what's 'expensive' has shifted." When cutting is cheap, measuring is the job.
When implementation is fast, cheap, and nearly unbounded in volume, a sloppy plan turns into a sloppy artifact at machine speed. A half-formed intent produces a half-formed product, in parallel, across every surface, before anyone has time to notice. Speed without precision is a way to dig a very deep hole very fast. The only defense is to measure more times, not fewer.
So we measure ten times. We write the intent. We argue about the intent. We write the scenarios the system has to handle and the edge cases it is not allowed to get wrong. We write the constraints the agent is not allowed to violate. We write the gates that will verify the output. We decide what the system will look like when it is correct, before the agent writes a single line. And only then do we hand off.
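The shape of a "measured" work item can be sketched as a checklist the handoff cannot skip. The field names below are illustrative, not our actual tracker schema:

```python
from dataclasses import dataclass, field

@dataclass
class WorkItem:
    """Hypothetical shape of a fully measured work item: the handoff
    is blocked until every dimension is pinned down."""
    intent: str = ""
    scenarios: list[str] = field(default_factory=list)   # what it must handle
    constraints: list[str] = field(default_factory=list) # lines not to cross
    gates: list[str] = field(default_factory=list)       # how output is verified
    definition_of_done: str = ""                         # what correct looks like

    def ready_for_agents(self) -> bool:
        """Only a complete measurement earns a cut."""
        return bool(self.intent and self.scenarios and self.constraints
                    and self.gates and self.definition_of_done)
```

An item with intent but no constraints or gates is a half-formed plan, and the factory refuses to run it at machine speed.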
When we hand off, we know what we will get. Not because we can predict every token the model will emit, but because the shape of the output has already been pinned down by the shape of the request. The agent runs inside a small, well-lit corridor. We keep it from wandering. It does not surprise us in the dimensions that matter, because those dimensions are already measured.
Human on the loop is not a safety blanket draped on top of a reckless system. It is the structural load-bearing element that lets the system move this fast at all. Remove the planning rigor and the gates, and the factory turns into a very expensive way to generate very confident nonsense. Keep them, and you get something that actually ships.
Why This Wins
Skeptics of AI-native operations usually raise one of three objections. They deserve real answers, not slogans.
"This only works for trivial code." We ship an observability platform with a 3D rendering engine, a native desktop application, a distributed tracing backend, an AI assistant with tool access to our own API, and a multi-tenant cloud service. None of this is trivial and most of it is written by agents inside a gated pipeline. The factory pattern is not a toy. The gates are the point.
"You will lose quality." Quality is a measurable property, and the right measurement is not "did a human read every line." The right measurements are: do the automated gates catch regressions, does the product pass contract tests, does mutation testing confirm the tests are real, do integration tests exercise the actual paths, and does the end user see fewer bugs. Our gates are stricter than most code review processes we have worked under in the past. We catch more than we used to. We ship faster than we used to. If you want to see how often the factory ships, check our release history on Steam. That cadence is the factory in production. Kent Beck, the creator of test-driven development, calls AI "an unpredictable genie that grants your wishes, but oftentimes in unexpected ways" and argues that TDD becomes a superpower when working with AI agents. We agree. The verification layer matters more than ever when the generation layer is a model.
"You are just an API wrapper over someone else's model." The model is a commodity. Our leverage is not the model. Our leverage is the governance layer, the verification pipeline, the propagation graph, the work decomposition, the taste function on top, and the judgment loop around the edges. Tobi Lutke calls the real skill "context engineering, not prompt engineering." A competitor with the same API key and no governance layer does not get the same output. We know because we have watched it not happen.
The Economics
The math of an AI-native factory is the part that makes every board meeting easier.
A traditional software company hires for throughput. It needs more engineers to ship more features, more PMs to define more roadmap, more marketers to produce more content, more sellers to fill more calendars. Each hire adds coordination cost. At some size, coordination cost exceeds throughput gain, and the company stalls.
An AI-native factory hires for judgment. Throughput comes from compute. Judgment comes from people, and judgment scales sub-linearly with headcount because a small group of people with taste can direct a much larger volume of work than they could ever produce by hand. The coordination cost curve bends.
The receipts are the output. Between November 1, 2025 and mid-April 2026, across 23 active repositories, the Immersive Fusion factory shipped:
- 3,613 commits in five and a half months
- 657 commits per month, averaging 22 commits every calendar day
- An active surface that spans a 3D desktop application, a web application, a distributed tracing backend, an AI assistant with multi-model orchestration, a multi-tenant cloud service, an open-source trace generator, an MCP server, a documentation site, a company site, and the governance layer that keeps it all in sync
The heaviest-commit repositories give a sense of where the work actually lives. The 3D desktop application recorded roughly 1,000 commits in the window. The web application: 570. The AI assistant: 510. The governance layer itself: 565. These are not stubs, refactors, or auto-generated churn. They are the real shipping velocity of a serious product.
Now extrapolate to a traditional organization that produces the same output. A 3D rendering engine typically requires 8 to 12 engineers, because Unity and HDRP are specialist skillsets with high code churn. A production web application at that commit volume implies 4 to 6 engineers plus a dedicated QA team. An AI assistant with tool orchestration and multi-model routing implies a 3 to 5 person platform team. Backend services covering analytics, charts, auth, storage, and ingestion add another 5 to 8 engineers. Content, documentation, and the marketing site account for 3 to 5 people across product marketing, technical writing, and web. The governance layer has no traditional equivalent. Its work gets distributed across product ops, strategy, legal, sales enablement, and finance, which at the output volume we produce would add 4 to 6 more people.
That is 27 to 42 individual contributors before you count the managers, PMs, directors, and VPs required to coordinate between them. Add coordination overhead at a typical ratio of one manager per five to seven ICs, and the comparable traditional organization lands in the range of 30 to 50 people producing roughly the same surface area.
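As a sanity check, the figures in this section reproduce from the stated inputs (the manager ratios are applied at the favorable end of each range, matching the text):

```python
# Throughput arithmetic from the commit data above.
commits = 3613
months = 5.5                    # November 1, 2025 to mid-April 2026
per_month = commits / months    # ~657 commits per month
per_day = per_month / 30        # ~22 commits per calendar day

# Headcount arithmetic: 27 to 42 ICs, one manager per five to seven ICs.
ics_low, ics_high = 27, 42
org_low = ics_low + ics_low // 7      # 27 ICs + 3 managers = 30
org_high = ics_high + ics_high // 5   # 42 ICs + 8 managers = 50
```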
We are not 30 to 50 people. We are a small team running a company at the scale that AI makes possible. That is the leverage.
We are not claiming the factory is free. Tokens cost money. The governance layer costs discipline. But the alternative, at the scale of work we produce, would cost a full org chart. That is the math.
Does This Work at Scale?
The honest steelman of the factory model is: it works because we are small. In larger organizations, information is not just a logistics problem. It is a political one. Transparency is not always in people's best interest, at the top of the hierarchy or the bottom. The governance layer we describe here depends on clear decision rights, clear accountability, and a willingness to write down what the company actually thinks. That is hard to do at ten people. It is much harder to do at a thousand.
We take that seriously. The factory model is not a universal prescription. It is an operating choice, and it rests on a set of commitments that a larger organization would have to make deliberately. Decision rights have to be explicit, not implicit. Accountability for changes to the governance layer has to be owned, not diffused. The people with the authority to change how the company thinks have to be the same people who are accountable for the outcomes when those changes propagate.
A BDFL (benevolent dictator for life, in the open-source sense) can maintain a governance layer by force of personality. A functioning organization cannot. What a functioning organization can do is treat the governance layer like the constitution it is: write down who can amend it, how amendments propagate, and what the review process looks like. Most companies do not do this because they have never had to. In a pre-AI world, tribal knowledge carried the load. In an AI-native world, tribal knowledge is a liability, because every agent inherits it badly.
So the honest answer is: the factory works at our scale because we built it, and it will work at larger scale only for companies that make the organizational commitments the governance layer requires. That work is harder than writing the markdown. It is the work of deciding what kind of company you are, and being willing to say so out loud.
The Force Multiplier Cuts Both Ways
There is a second honest limit, and it is more uncomfortable than the first.
The models are a force multiplier. That cuts both ways. Point a high-throughput agent at a well-decomposed problem inside a clean architecture, and the output compounds correctness. Point the same agent at a tangled codebase with ambiguous intent, unclear decision rights, and no house style, and the output compounds the mess. Faster. At scale. With an air of confidence that makes the mess harder to see, not easier.
This is the part of the AI-native story that gets glossed over in most manifestos. The governance layer is not magic. It does not elevate a team that cannot write a clear spec. It does not substitute for software design literacy. It does not teach an inexperienced team which abstractions to keep, which to kill, and which to never build in the first place. It amplifies whatever the team already is. If the team can hold the shape of a well-designed system in its head, the agents produce a lot of well-designed system. If the team cannot, the agents produce a lot of something else, and nobody reading the diff at machine speed will catch it until the interest on that debt comes due.
Clean architecture and disciplined design patterns matter more in the factory, not less. When humans typed the code, bad design slowed them down enough to notice. When agents type the code, bad design gets shipped at the full speed of the pipeline. The gates catch functional regressions. The gates do not catch a design that will make the next feature twice as hard to build. That is a human's job, and it is a job that requires having been wrong about design before, recovered from it, and learned the pattern.
We are not claiming a monopoly on this kind of experience. We are claiming that the factory is a poor fit for a team that has not yet developed it. A small team with deep software-design literacy can run the factory and get leverage. A small team without that literacy will get the same leverage in the wrong direction. The deciding variable is not the governance layer. The deciding variable is the judgment of the people writing to it.
This is why we spend ninety percent of our time on planning. The plan is where design literacy shows up. The plan is where a senior engineer's instinct that "this abstraction will bite us in six months" becomes a constraint the agent has to respect. The plan is where a product manager's taste for what to cut becomes a specification the agent cannot drift past. Without those humans, the factory is a faster way to produce code that a more experienced team will have to rewrite. With them, it is a multiplier on the work of people who already knew what good looked like.
The Honest Part
We will say out loud what a lot of factory posts skip.
This is hard. The governance layer takes real investment to build and real discipline to maintain. A repository that is out of date is worse than no repository at all, because agents treat it as gospel. Every upstream update has to propagate. Every gate has to hold. Every intent has to be written down with enough precision that an agent stays inside the corridor. A sloppy intent produces a sloppy artifact at machine speed, which is a dangerous combination.
We also learned the hard way that not every task belongs in the factory. Some conversations are better had out loud, on a whiteboard, with two humans and no model. The factory is for execution. The whiteboard is for invention. Confusing the two burns both.
And the humans who thrive in this model are not the humans who thrived in the last one. The instincts that matter now are: write down what you want, notice what you do not like, stop the loop when it matters, and trust the system when it does not. These are not the instincts most of us were hired for ten years ago. We are learning them together.
The Part Only Humans Can Do
If you read this far and came away thinking the humans are mostly operating the machine, you missed the most important part.
The humans are not here because the machine is incomplete. The humans are here because software companies exist to serve other humans, and the people on the receiving end can tell the difference.
A customer calling during an outage is not looking for a perfectly structured status update. They are looking for someone who understands that their quarter is on the line and their team is not sleeping. An investor on a quarterly call is not looking for a report that could have been generated. They are looking for a founder whose eyes tell them whether to worry. A candidate on a recruiting call is not looking for talking points about benefits and mission. They are looking for a future colleague they can read the room with. An employee going through a hard month is not looking for a Slack autoresponder. They are looking for someone who will notice, and say something, and mean it.
Empathy is the one capability that does not scale from a prompt. It does not compound on iteration. It does not emerge from a better model. It comes from a human being who has been through something similar, or who can sit with someone who has, and who is willing to show up without knowing in advance what the right thing to say is. The machine does not do that. It will not do that in the version after this one, either. Only another human can relate to a human, because relating is the thing that requires having been one.
This is not a sentimental add-on to the factory story. It is the reason the factory is worth building in the first place. We automate the work that does not need a human so the humans can spend their day doing the work that does. Writing boilerplate did not need empathy. Talking to a customer whose product is on fire does. Sitting with a teammate who just got bad news does. Noticing a candidate's hesitation and asking the question behind the question does. The factory buys us the minutes and hours that get returned to the conversations only humans can have.
Every company building on AI right now has a choice to make about what the freed time becomes. You can reinvest it into more throughput and run harder at scale. You can reinvest it into the humans and make them more available to the people who need them. We chose the second. The first is a line item on a spreadsheet. The second is why the company exists.
What You Will See From Us
Every post you read on this blog, every feature you see in the product, every landing page you land on, every tier on our pricing table, and every answer our AI assistant gives inside the product came out of this factory. That is not a marketing claim. It is a description of how the work happens. We publish the model ladder that powers Tessa. We publish the skill inventory that powers Tessa's internal workflows. We are willing to show the work because the work is the advantage.
If you are building an observability platform to run alongside the AI systems your team is shipping, we think you will find we understand the terrain in a way that 2D dashboard vendors simply cannot. We live inside the same loop you are trying to monitor.
And if you are building your own AI-native company, we hope this post is useful. We do not think the factory pattern is an edge case. We think it is the beginning of how the next generation of software companies get built, and the companies that figure out the governance layer first will have a decade of compounding leverage over the companies that do not.
Most of our team is finishing a cup of coffee right now. The army is still working.
Start Free. Immersive. AI-guided. Full-stack observability. Built by an army of agents, directed by humans who own the outcome. Enter the World of Your Application®.
Dan Kowalski
Father, technology aficionado, gamer, Gridmaster
About Immersive Fusion
Immersive Fusion (immersivefusion.com) is pioneering the next generation of observability by merging spatial computing and AI to make complex systems intuitive, interactive, and intelligent. As the creators of IAPM, we deliver solutions that combine web, 3D/VR, and AI technologies, empowering teams to visualize and troubleshoot their applications in entirely new ways. This approach enables rapid root-cause analysis, reduces downtime, and drives higher productivity, transforming observability from static dashboards into an immersive, intelligent experience. Learn more about Immersive Fusion, or join us, on LinkedIn, Mastodon, X, YouTube, Facebook, Instagram, GitHub, and Discord.