Vendor Concentration Risk in the Age of the Three-Lab AI Stack

Every AI strategy deck I've seen in the last year has a slide that says something like "model-agnostic" or "we can swap providers at any time." It's a comforting line. It's also, in most cases, not true — and the gap between that claim and reality is exactly the kind of thing a board should be poking at before it becomes a footnote in a post-mortem.

Here's the uncomfortable starting point. Strip away the logos and the wrappers, and the overwhelming majority of serious AI capability in production traces back to a very small number of frontier labs. Call it three, give or take, depending on how you count. Your "AI vendor" — the SaaS tool, the copilot, the agent platform you bought — is very often a thin layer over one of those labs' APIs. So when you tell yourself you have a diversified AI supply chain, what you may actually have is a diversified set of invoices pointing at the same two or three underlying models. That's not diversification. That's correlation wearing a costume.

The real mechanic: concentration stacks vertically, not horizontally

The instinct in vendor risk is to count vendors. We're trained to ask "how many suppliers do we have for this critical function?" and feel good when the answer is more than one. But AI concentration doesn't live at the horizontal layer where you're shopping. It lives vertically, all the way down the stack, and it compounds at every floor.

Start at the top. Your application vendors mostly resell a handful of foundation models. Those labs, in turn, run on a startlingly narrow base of compute — one dominant accelerator vendor whose chips are the currency the whole industry trades in. Those chips are fabricated by essentially one company at the leading node, in one geography that carries its own geopolitical tail risk. And the whole apparatus depends on physical inputs that don't care about your procurement timeline: power, water for cooling, advanced packaging capacity, and yes, mundane things like the helium and specialty gases that fabrication and cooling quietly rely on. When people say "AI is just software," they're skipping the part where software runs on a global physical supply chain with single points of failure at almost every layer.

So the honest map of your AI dependency isn't a wide row of vendors. It's a narrow column. And a narrow column means a disruption at any floor — a lab changing its terms, a chip allocation crunch, an export-control shift, a packaging shortage, a regional event — propagates upward through everything you've built, regardless of how many vendor contracts you signed.

Index funds are quietly force-feeding the concentration

There's a second-order effect that doesn't show up in a security review but absolutely belongs in a board conversation, because it shapes whether this concentration eases or intensifies. The capital flooding into this space is not, for the most part, discerning capital making bets on the best architecture. A lot of it is passive — index funds and the retirement accounts behind them mechanically buying the largest companies because they're the largest companies. The biggest AI and AI-adjacent names get force-fed inflows simply by virtue of their index weight.

That matters for resilience because it removes the normal market pressure that would fund alternatives. When capital flows to incumbents automatically rather than on merit, the second and third sources you'd want for a healthy supply chain get starved of the oxygen they'd need to mature into real options. The market is, structurally, reinforcing the monoculture. You can't assume "the market will sort out diversity" when the dominant flow of money is a thermostat set to "buy whatever is already biggest."

I'm not making a market-timing call here — I have no idea what any of this is worth, and neither does anyone telling you they do. The point for an operator is narrower and more durable: the conditions that produced the concentration are self-reinforcing, so betting your roadmap on it dissolving on its own is not a plan.

What this looks like through a fintech operator's lens

I run security and DevOps for a company that serves more than 1,500 financial institutions, and the mental model I keep coming back to is one regulators have drilled into financial services for years: concentration risk and concentration in critical third parties. We already know how to think about this. If every bank in a region clears through the same one processor, the system is fragile no matter how solid any single bank is. AI concentration is the same shape, just earlier in its regulatory life. The frameworks aren't new; we just haven't pointed them at the model layer yet.

The trap I watch teams fall into is treating model access like electricity — assumed, fungible, always on, priced per unit. It's closer to a specialized commodity with a fragile supply chain and a counterparty who can reprice, deprecate, or restrict you with a changelog entry. Model deprecation alone is an operational risk most shops have never rehearsed: the exact model your prompts, evals, and fine-tunes were built around gets retired, and your "drop-in" replacement behaves differently enough to break things in ways your tests don't catch. That's not a hypothetical for the future. That's a Tuesday.

The questions a board should be asking now

I try not to write bullet lists, but governance questions are the exception, because a board's leverage is precisely in the questions it forces management to answer in writing. So here's what I'd want on the table:

What is our true upstream count? Not how many AI vendors we pay — how many distinct foundation labs we actually depend on once you trace through the wrappers. Make management draw the vertical column, not the horizontal row.
What breaks, and how fast, if our primary model is repriced, rate-limited, or deprecated? If the answer is "we'd switch," demand the rehearsal, not the assertion. A failover you've never executed is a hope, not a control.
Where are we exposed to the physical layer? Compute allocation, regional dependency, supply constraints we don't control. We don't have to solve geopolitics, but we should know which roadmap commitments are silently leaning on it.
What's our portability cost, in real terms? Prompts, evals, fine-tunes, and agent scaffolding are switching costs masquerading as productivity. Measure them before you need them.
Are we designing for substitution, or for lock-in? Abstraction layers, an internal eval harness that's model-independent, and at least one credible second source kept warm — these are cheap insurance now and impossible to retrofit during a crisis.

None of this is an argument against using frontier AI. I use it, I ship it, and the capability is real. It's an argument against confusing access to a thing with control over it. The companies that come through the next few years intact won't be the ones who picked the "right" lab. They'll be the ones who assumed every layer of the stack could fail and built so that no single failure took the roadmap with it.

So here's the challenge for the next board meeting: stop accepting "model-agnostic" as a statement of fact and start treating it as a claim that has to be proven. Ask management to demonstrate the switch, not describe it. If they can't — and most can't yet — then you've just found the most important item on the AI roadmap, and it isn't a model. It's resilience.

AI GovernanceVendor RiskBoard StrategyResilience

Case Studies & Practice

Open Source