A CXO Briefing

What it actually takes to ship enterprise AI

Field notes from building and running AI systems in production.

Most AI projects stall not because the model fails, but because everything around it was never built. This briefing is about that everything, drawn from systems we have put into production across sectors.

Executive summary

Most enterprise AI dies in the gap between a demo that impresses and a system that survives real use.

The failure numbers are real: Gartner expects more than 40% of agentic AI projects to be cancelled by the end of 2027, and a 2025 MIT study of over 300 initiatives found that roughly 95% delivered no measurable return. But the cause is rarely the model. By the time a model is famous enough for you to read about it, it is already good enough for most business work.

What actually separates the systems that ship is unglamorous, and it is where almost all the effort goes: choosing a narrow problem worth solving, taming messy real-world data, getting the compliance and data path right, and building something the client can run without the people who built it. The lessons below come from putting working systems into production, not from theory.

The reframe

The model is the easy part.

This is the single most useful thing to understand before you fund an AI project, and the one that surprises most leaders.

In a real build, the call to the AI model is often the smallest, cheapest, simplest line in the whole system. On one of our live systems the AI processing costs a few hundred rupees a year; the platform around it is almost the entire bill, and almost the entire effort.

The work that decides success is the unglamorous part: pulling in messy inputs that never look like the demo, making data save and stay saved across many users at once, handling the edge cases real life throws up, and meeting the compliance rules your sector lives under. None of that is about which model you chose. A project that treats the model as the hard part, and everything else as a detail, is the project that stalls.

The lessons

Six lessons from putting AI into production.

None of these are theoretical. Each one is a place we have seen projects either hold up or fall over, and each comes with the cost of ignoring it.

Start with one painful, measurable workflow

The systems that ship solve one specific, irritating job with a number attached: a query that takes days, work double-handled across a team, a report that swallows a morning. A mandate like "AI for the company" has no owner and no finish line.

Ignore it and

Scope sprawls, no one can say whether it worked, and the project dies of ambiguity.

Budget for the system, not the model

The AI call is the cheap part. The real work is ingesting messy inputs, making data persist and sync correctly, and handling the cases that never appear on stage. Most of what determines success has nothing to do with the model you picked.

Ignore it and

You get a dazzling demo that quietly breaks the first week real data hits it.

Settle the data path before the model

Where data is allowed to live and be processed often rules out options before quality is even discussed, especially for regulated firms. Decide residency, retention, and access first, then pick a model that fits inside those walls.

Ignore it and

You build on a model you are later forced to rip out for a compliance reason you could have seen on day one.
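
For a technology lead, the "walls first, model second" order can be sketched as a simple filter. This is a minimal illustration, not a compliance tool: the names `ModelOption`, `DataPathPolicy`, and `models_inside_policy`, the region codes, and the retention figures are all hypothetical, and real rules will have more dimensions than the two shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    """A candidate model, described only by its data-path properties (hypothetical)."""
    name: str
    region: str          # where inference runs
    retention_days: int  # how long the provider keeps inputs


@dataclass(frozen=True)
class DataPathPolicy:
    """The walls agreed before any model is discussed (hypothetical fields)."""
    allowed_regions: frozenset
    max_retention_days: int


def models_inside_policy(options, policy):
    """Return only the options that satisfy the data-path rules."""
    return [
        option
        for option in options
        if option.region in policy.allowed_regions
        and option.retention_days <= policy.max_retention_days
    ]


# Illustrative values only: a firm that must process data in-country
# and allows no retention of inputs by the provider.
policy = DataPathPolicy(allowed_regions=frozenset({"in"}), max_retention_days=0)
candidates = [
    ModelOption("model-a", region="us", retention_days=30),
    ModelOption("model-b", region="in", retention_days=0),
]
shortlist = models_inside_policy(candidates, policy)
```

Quality comparisons then happen only among the models left on the shortlist, which is the order the lesson argues for.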

Never depend on a single model

Models get deprecated, repriced, or go unavailable in your region. A primary with a cheaper fallback behind it turns an outage or a price change into a configuration switch, not a crisis, and means no vendor can hold your system hostage.

Ignore it and

One vendor's bad day or pricing decision becomes your outage, with nothing to fall back to.
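
One way to picture the "configuration switch" is an ordered list of providers that the system tries in turn. The sketch below is an assumption about shape, not a description of any vendor's API: `call_with_fallback`, `primary_model`, and `fallback_model` are hypothetical stand-ins, and a real system would add timeouts, retries, and logging.

```python
from typing import Callable, List, Sequence, Tuple


class AllModelsFailed(Exception):
    """Raised when every configured provider has failed."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, answer) from the first that works."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure moves us to the next one
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Hypothetical stand-ins for real vendor clients.
def primary_model(prompt: str) -> str:
    raise TimeoutError("primary unavailable")


def fallback_model(prompt: str) -> str:
    return f"summary of: {prompt}"


# The order of this list is the "configuration": swapping or reordering
# entries changes which model is used without touching the calling code.
PROVIDERS = [("primary", primary_model), ("fallback", fallback_model)]

used, answer = call_with_fallback("quarterly report", PROVIDERS)
```

Because the calling code only ever sees `call_with_fallback`, a deprecation or price change becomes an edit to `PROVIDERS` rather than a rebuild.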

Make the output a work product, not a chat

Value shows up when the system hands a person something they can act on at once: a triaged action with the next step spelled out, a ready document, a flagged exception. Not a chat window they have to coax. Build for the user's real job.

Ignore it and

You ship a clever tool no one adopts, because using it is more work than the old way.

Build it so they can run it without you

A system that needs its builder for every change is not shipped, it is leased. Self-service admin, plain-English error logs, backups the team can restore, and a clean handover are what make a project something the client owns. Plan for hardening after launch: real use always surfaces what the demo could not.

Ignore it and

You leave behind an expensive dependency that stalls the moment the builder steps away.

The gap

What done looks like, versus what shipped looks like.

A demo and a production system can look identical in a meeting. They are not the same thing, and the distance between them is where budgets and timelines quietly disappear. This is the gap your technology lead is paid to close.

	Looks done: the demo	Actually shipped: production
The inputs	Clean, hand-picked sample data	Messy reality: odd date formats, forwarded email chains, duplicates, blanks
The data	Held in the browser or a spreadsheet	Saves, syncs across users, dedupes, and survives a refresh or a crash
The model	One impressive model	A primary with a fallback, chosen to fit the data path and compliance
The output	Plausible-looking text	A structured work product someone acts on without rechecking it
When it breaks	It doesn't, on stage	It fails safely, is logged in plain English, and can be recovered
Who runs it	The people who built it	The client's own team, with documented handover and admin controls

In our experience the bugs that surface after go-live are almost never about the AI. They are about data not saving, records duplicating on re-upload, a real-world date format the system rejected, or a screen losing the user's place under live updates. Plan for this hardening phase; it is normal, and it is where a system earns trust.
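
Two of the post-launch bugs named above, a rejected date format and duplicates on re-upload, can be illustrated with a short sketch. It is an assumption-laden example, not our production code: the field names `customer` and `date`, the list of accepted formats, and the day-first reading of ambiguous dates are all hypothetical choices made for illustration.

```python
from datetime import date, datetime
from typing import Optional

# Formats tried in order. Day-first is an assumption for this sketch.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")


def parse_date(raw: str) -> Optional[date]:
    """Return a date for any recognised format, or None instead of raising."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def ingest(store: dict, review_queue: list, record: dict) -> str:
    """Save a record once; route unreadable dates to a person rather than rejecting them."""
    parsed = parse_date(record["date"])
    if parsed is None:
        review_queue.append(record)
        return "needs_review"
    # Normalised key, so a re-upload with different spacing or date style is caught.
    key = (record["customer"].strip().lower(), parsed)
    if key in store:
        return "duplicate"
    store[key] = record
    return "saved"


store, review_queue = {}, []
first = ingest(store, review_queue, {"customer": "Acme", "date": "03/02/2025"})
again = ingest(store, review_queue, {"customer": " acme ", "date": "2025-02-03"})
odd = ingest(store, review_queue, {"customer": "Acme", "date": "next Tuesday"})
```

The design choice worth noting is that an unreadable date goes to a review queue instead of failing the upload, which is what "fails safely" means in practice.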

The shape

What a first project actually looks like.

A useful first AI project is small, fast, and narrow on purpose. The aim is one working thing in real hands, not a platform on paper. The timeline below is indicative, not a fixed timetable: the point is the order, not the exact weeks.

Weeks 1-2

Pick the workflow, agree the success metric, and settle the data path. These are the decisions that prevent expensive rework later.

Weeks 3-6

Build the narrow version on real data, with a model fallback and the plumbing that makes it hold up under daily use.

Week 7 on

Put it in real hands, harden what live use exposes, then expand from a base that already works.

The pattern: most of the risk is retired in the first two weeks, before much is built at all. The firms that rush past those weeks pay for it in month three.

The takeaway

A 30-second check before you greenlight.

Run a proposed project against these four profiles. The one it matches gives the verdict: start now, narrow it first, fix the data first, or wait.

A named owner and a success metric that is a number

Someone accountable, and a clear definition of what winning looks like

Start now

Right idea, but the scope is too broad to ship in weeks

The ambition is sound; the first cut needs to be smaller

Narrow it first

Sound use case, but the data is messy or scattered

The idea works only once the underlying data is usable

Fix data first

No clear owner or no agreed measure of success

An experiment with no accountability and no finish line

Not yet

Have a workflow worth shipping?

We take narrow, painful problems to production systems your team can run without us.

This briefing is general guidance for business leaders, not investment or legal advice. The lessons here are drawn from our own production deployments across sectors; all examples are generalised and anonymised under client confidentiality. Third-party figures cited as published: Gartner, "Over 40% of Agentic AI Projects Will Be Canceled by End of 2027" (June 2025); MIT study of enterprise AI initiatives, as reported 2025. Figures may be revised over time; verify before relying on them. Current as of June 2026.