Is There an AI Bubble? ROI, CapEx, and Market Analysis
The Anatomy of Tech Bubbles
Calling the race for models, chips, and data centers a “bubble” in the classic sense is a category error. The Dutch tulip mania of the 1630s, as framed by Charles P. Kindleberger, was an episode in which the price of the asset detached from its ability to generate income, utility, or material expansion. It was a resale spiral, not a productivity thesis. In Manias, Panics, and Crashes, Kindleberger describes the pattern as displacement, euphoria, easy credit, and reversal. Tulips were speculative tickets with socially constructed scarcity. Foundation models, GPUs, and electrical capacity don’t work that way: they may be overpriced at certain moments, but they remain productive assets because they run inference, train systems, reduce cycle time in corporate tasks, and rely on heavy physical infrastructure to exist.
The correct analogy, therefore, isn’t with a rare bulb traded among intermediaries; it’s with railroad tracks laid down before there’s enough cargo to occupy them. Carlota Perez explains this mechanism precisely in Technological Revolutions and Financial Capital: big technological waves go through an installation phase in which financial capital runs ahead of the real economy, funding infrastructure before mature business models appear. This happened with canals, railways, electrification, and the internet. The dot-com bubble around 2000 destroyed a great deal of B2C equity capital, but it built the network on which Web 2.0 later thrived: fiber optics, servers, scalable enterprise software platforms, and global operating standards.
What we see between 2024 and 2026 looks more like that mechanism than empty speculation. The central difference is structural: those leading spending today aren’t cash-strapped startups trying to buy attention; they are highly profitable, vertically integrated companies like Microsoft, Alphabet, Meta, and Amazon—financing computational capacity the way you expand ports before commerce definitively increases. The risk isn’t that the technology lacks usefulness; it’s the mismatch between the investment timeline and the timeline for capturing value.
The numbers reinforce this reading. David Cahn (Sequoia Capital) estimated that the ecosystem would need to generate roughly US$ 600 billion in annual revenue to justify the installed base associated with today’s race; observed revenue came in around US$ 294 billion, leaving a significant gap between CapEx (capital expenditure) and monetization (Sequoia Capital, 2024). This signals a typical phase described by Perez: relative excess capacity before demand matures.
At the same time, Jim Covello (Goldman Sachs) summarized the challenge by noting that Big Tech is heading toward something close to US$ 1 trillion in capital expenditures while many corporate use cases still suffer from high costs and insufficient reliability for critical flows (Goldman Sachs Global Macro Research, 2024). In plain business terms: there’s a lot of road being paved before there’s enough toll-paying traffic.
Even so, reducing this movement to irrationality ignores a point highlighted by MIT Technology Review: Big Tech’s frenzy reflects the installation of a base technology comparable to cloud or broadband, not just tactical bets on passing applications. When Microsoft integrates copilots across entire corporate stacks, when Google reorganizes search and productivity around multimodal models, and when Meta subsidizes open ecosystems to accelerate downstream adoption, the game is B2B infrastructure. It’s less about selling “a chatbot” and more about redefining entire layers of the technology stack: accelerated computing, data orchestration, and inference security, plus integration with ERP, CRM, and internal tools.
This distinction matters because purely ornamental bubbles collapse without leaving a meaningful economic legacy; bubbles tied to infrastructure leave durable assets even when they destroy financial value in the short term. So the correct anatomy of any possible “AI bubble” tends to be less “mania without substance” and more “overinvestment before maturation.” This framing explains how exuberant stock-market behavior can coexist with real material fundamentals.
In the coming years, outcomes will likely be decided less by lab rhetoric and more by converting installed capacity into auditable productivity gains within user companies. If tulips depended on belief in future resale value of the asset, today’s cycle depends on operational domestication of that capacity through viable processes. There may be severe corrections in specific valuations (and brutal margin compression), along with consolidation among suppliers; still, data centers will continue existing, clusters will keep running, and corporate integrations will advance. That’s what separates financial froth from strategic infrastructure.
The Trillion-Dollar CapEx Dilemma: Accounting ROI vs Economic ROI
The central friction in this phase isn’t technological; it’s accounting and operational. Data centers, power contracts, high-speed networks, and GPUs such as H100 and Blackwell hit balance sheets now, while operational returns at customer companies show up slowly, in fragments, and often below what was promised. It’s like building a port before commercial routes reorganize: concrete (and debt) arrives before ships do.
David Cahn formalized this asymmetry by asking how much revenue would be needed to sustain today’s race: about US$ 600 billion per year versus something close to US$ 294 billion effectively generated by the ecosystem in 2024/2025 (Sequoia Capital, 2024). That gap doesn’t prove irrationality; it indicates that much of the market prices future adoption as if it were already consolidated today.
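The size of that gap can be made explicit with a few lines of arithmetic. The sketch below uses only the two figures quoted above; the function name and the three derived ratios are illustrative choices, not part of Cahn's analysis.

```python
# Figures quoted in the text (Sequoia Capital, 2024), in billions of US dollars.
REQUIRED_REVENUE_BN = 600.0
OBSERVED_REVENUE_BN = 294.0


def revenue_gap(required: float, observed: float) -> dict:
    """Return the absolute gap, the coverage ratio, and the growth multiple needed."""
    if required <= 0 or observed <= 0:
        raise ValueError("revenue figures must be positive")
    return {
        "gap_bn": required - observed,           # revenue still missing
        "coverage": observed / required,         # share of the requirement already met
        "growth_multiple": required / observed,  # how much revenue must multiply
    }


result = revenue_gap(REQUIRED_REVENUE_BN, OBSERVED_REVENUE_BN)
print(f"gap: US$ {result['gap_bn']:.0f} billion")
print(f"coverage: {result['coverage']:.0%}")
print(f"required growth multiple: {result['growth_multiple']:.2f}x")
```

On these figures, observed revenue covers a little under half of the requirement, so revenue would have to roughly double for the installed base to pay for itself at today's CapEx levels.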
This mismatch completely changes how we discuss ROI (Return on Investment). In earlier cycles it was enough to demonstrate technical capability: lower response times or better textual quality seemed sufficient to justify broad pilots. Now boards and CFOs require incremental economic proof after inference: what did it cost to serve each case in production; what real utilization rate was extracted from purchased clusters; how much human rework remains embedded in the process; what incremental cash flow entered after governed revisions.
In executive practice there’s a recurring risk: buying premium access to models or reserving accelerated capacity without robust governance can turn operations into an “executive fleet,” transporting tasks equivalent to local deliveries at higher cost. Infrastructure can be technically excellent yet economically oversized for poorly redesigned or poorly governed processes.
That’s why tracking KPIs has shifted from being only an engineering discipline to being a financial discipline as well. The truly useful indicators include cost per useful inference, GPU utilization (both average and at peak), tail latency under real load, the percentage of responses accepted without human intervention (the human-in-the-loop acceptance rate), energy cost per workload, and time to payback per deployed application. Without this dashboard, ROI becomes narrative.
With it you can separate two phenomena often conflated by markets: local efficiency versus net economic return. A copilot might reduce minutes on a specific task yet still destroy margin if it adds compliance review, if compute consumption across the end-to-end pipeline rises above the gains captured, or if rework proves hard to measure early in the rollout. It’s analogous to automating an industrial line so that throughput rises on the operations dashboard while true profitability worsens at monthly close because scrap rates and manual inspection have gone up.
The data also show abundant supply before corporate demand fully matures. CB Insights has been recording robust volumes of venture capital directed at the AI chain (infrastructure, tooling, and enterprise applications) even as investors began demanding stronger evidence of sustainable monetization rather than narrative-driven growth (CB Insights, 2025). Plentiful capital extends cycles even when final returns remain undefined: additional capacity gets funded on the expectation that winning cases will emerge later.
Strategically this favors “tractor-layer” suppliers: chip manufacturers and cloud operators tend to win before success becomes uniform at the corporate endpoint, where buyers capture value only if their internal processes are adequate (fertile soil) rather than improvised (rocky soil).
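The difference between local efficiency and net economic return can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: the `UseCaseMonth` fields, the cost model, and every number in the example are hypothetical and chosen only to show how a visible time saving can shrink once review, rework, and compute are charged against it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCaseMonth:
    """One month of a hypothetical copilot deployment (all values are illustrative)."""
    tasks: int                      # tasks that went through the copilot
    minutes_saved_per_task: float   # local efficiency gain on the task itself
    review_minutes_per_task: float  # extra compliance or review time added
    rework_rate: float              # share of tasks a human has to redo
    rework_minutes: float           # minutes spent on each reworked task
    loaded_cost_per_hour: float     # fully loaded labor cost
    compute_cost: float             # inference and platform spend for the month
    accepted_outputs: int           # outputs accepted without human changes


def local_gain(m: UseCaseMonth) -> float:
    """Labor value of the minutes saved, ignoring every downstream cost."""
    return m.tasks * m.minutes_saved_per_task / 60 * m.loaded_cost_per_hour


def net_return(m: UseCaseMonth) -> float:
    """Local gain minus the cost of review, rework, and compute."""
    overhead_minutes = m.tasks * (
        m.review_minutes_per_task + m.rework_rate * m.rework_minutes
    )
    overhead_cost = overhead_minutes / 60 * m.loaded_cost_per_hour
    return local_gain(m) - overhead_cost - m.compute_cost


def cost_per_useful_inference(m: UseCaseMonth) -> float:
    """Compute spend divided by the outputs that were actually accepted."""
    if m.accepted_outputs == 0:
        raise ValueError("no accepted outputs: cost per useful inference is undefined")
    return m.compute_cost / m.accepted_outputs


# Hypothetical inputs, not taken from any cited source.
month = UseCaseMonth(
    tasks=10_000, minutes_saved_per_task=6.0, review_minutes_per_task=3.0,
    rework_rate=0.1, rework_minutes=20.0, loaded_cost_per_hour=60.0,
    compute_cost=4_000.0, accepted_outputs=8_000,
)
print(f"local gain: {local_gain(month):,.0f}")
print(f"net return: {net_return(month):,.0f}")
print(f"cost per useful inference: {cost_per_useful_inference(month):.2f}")
```

With these made-up inputs the local gain is about 60,000 while the net return is about 6,000: the task-level dashboard looks ten times better than the monthly close.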
The executive consequence is direct: it’s not enough to ask whether technology works—you must ask where unit economics close consistently over time. The trillion-dollar dilemma arises when part of the market assumes all installed capacity will find proportional monetization within too short a timeframe to be realistic.
So far these numbers suggest another reality: genuine value creation is happening, but it is distributed unevenly across infrastructure providers, intermediate platforms, and final corporate buyers. Treating this cycle as merely a race for newer chips risks buying too much horsepower for processes that weren’t redesigned well enough. Treating it as disciplined allocation (measuring real utilization rates, incremental margin per use case, and speed to scalable production) has a better chance of turning inflated CapEx into defensible economic advantage before markets lose patience with promises lacking substance.
Financial Skepticism about Corporate Adoption
Wall Street has progressively stopped rewarding demonstrations that were only technical and started demanding consistent proof of economic capture instead. This shift is typical of the “Trough of Disillusionment” in Gartner’s Hype Cycle: the technology keeps advancing while financial analysts raise the minimum acceptable standard for public evidence.
Rather than accepting raw adoption metrics (users, pilots, demos, prompts) as a sufficient proxy for created value, boards now ask for verifiable margin expansion, an auditable reduction in cost per process, and net impact on revenue. The change seems subtle, but it alters the entire decision logic: people stop admiring a “factory-automated” prototype just for its apparent speed and start asking how many defective units it produces per shift, how much capital it ties up, and how quickly it pays back.
It was in this context that the thesis presented by Jim Covello gained traction in the report Gen AI: too much spend, too little benefit? Goldman Sachs framed the issue by stating that Big Tech is moving toward something close to US$ 1 trillion in capital expenditures over the coming years, while the economics of corporate adoption have not yet demonstrated a proportional return (Goldman Sachs Global Macro Research, 2024). The core argument is not anti-technology; it is against hasty generalization.
Covello describes a paradox any CFO recognizes quickly: models are too expensive to replace simple, cheap tasks at broad scale, yet still not reliable enough for the hard, valuable workflows where the potential economic premium is higher. In management terms: you often end up paying something like strategic-consulting rates for administrative work that is too cheap to be worth fully replacing, and you still need human supervision when the system is applied to sensitive decisions.
This misalignment erodes ROI by compressing value at both ends: excess cost remains where the work was already too cheap to justify full replacement; excess risk remains where the work was too expensive to accept failures without reliable supervision.
The practical consequence shows up in boards: projects previously approved under a generic justification (“strategic learning”) now must survive objective questions about payback period, residual error rate, and compliance impact. In this debate, the gap mentioned earlier keeps returning as a reference point: US$ 600 billion needed annually versus US$ 294 billion observed (Sequoia Capital, 2024). For institutional investors, it works as a basic test with three possible readings: the road cost far more than today’s tolls can justify, future traffic needs to grow dramatically, or a significant portion of capital was allocated too early within the correct cycle.
Carlota Perez helps here without caricature: tech bubbles often fund useful infrastructure before economic models mature fully; Wall Street reacts less to the philosophical content of this sequence and more to the timing required by internal and external financial visibility (“when does it start showing up on the balance sheet?”).
This skepticism also grows because real cases begin replacing promotional benchmarks. When companies test internal copilots, they discover raw gains eroded by additional review (governance), human rework, or computational consumption above what is captured at the operational endpoint.
The GitHub Copilot case illustrates this technical-economic ambiguity through publicly cited operational metrics: it reportedly accounted for 46% of code produced in certain contexts by 2025, after reaching 15 million reported users (GitHub; Microsoft Research, 2025). But operational data indicate a relevant increase in time spent on code review, along with a reported average interval of 11 weeks until consistent net gains appear (GitHub; Microsoft Research, 2025).
For financial analysts, this weighs more than slogans about instant productivity because raw gain without net gain is equivalent to boosting sales via excessive discounting: volume looks impressive quarter by quarter, but margin tells another story at monthly close. The same logic applies when automation speeds up customer support or legal back office without guaranteeing final quality under appropriate supervision.
At this point Daron Acemoglu and Simon Johnson’s Power and Progress also enters the discussion, arguing that widely celebrated technologies do not always distribute efficiency at the promised speed when they are implemented as a purely linear instrument focused only on immediate cost reduction. Wall Street now applies this filter to generative bets by asking, in essence, whether unit economics improve without degrading service or increasing operational risk outside the controlled niches where strong supervision is feasible, such as assisted programming under intense review, carefully curated internal search via RAG (Retrieval-Augmented Generation), or technical support with limited scope.
As long as this response remains inconsistent outside those specific niches, much of enterprise adoption will be treated by the market less as proven productivity and more as an expensive option on a future that has not yet materialized economically with sufficient predictability.
Real Operational Challenges After the Model
The often underestimated bottleneck is not in the model; it’s in the operations that come after it. Generative systems accelerate initial output (text, code, a first response, a first analysis) but shift human work into later validation stages that are less visible yet more costly when final, auditable quality is involved.
In industrial terms, it resembles installing a machine that doubles line speed but transfers subtle defects to final inspection: initial dashboards celebrate throughput while the P&L inherits extra inspection, rework, and residual risk. This paradox explains why measuring only “output generated” confuses activity with the real economic efficiency delivered to internal or external customers.
GitHub Copilot makes this point clearly again, using the metrics already cited because they connect technique to operational economics: it reached 15 million users by 2025 and reportedly accounted for close to 46% of code produced in the analyzed contexts (GitHub; Microsoft Research, 2025). At first glance that seems to end the productivity discussion, but telemetry showed a relevant increase in time spent on code review. The cited reason recurs across these technical scenarios: suggestions may look syntactically plausible yet be semantically unsound because of local architectural inconsistency or insufficient testing. The result is an effect familiar to experienced engineering directors: writing code becomes faster while making it maintainable becomes slower.
That’s why, according to the cited sources, teams take about 11 weeks to register consistent net gains (GitHub; Microsoft Research, 2025). Financially, this changes business cases completely, because ROI depends less on licensing alone and more on the maturity of internal technical conventions capable of absorbing this new production flow while maintaining final quality under appropriate governance.
In other words, benefit appears sustainably only when review evolves alongside generation from early iterations—reducing accumulated rework.
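The gap between "first positive week" and "cumulative break-even" is worth spelling out, since business cases often conflate them. The sketch below assumes a deliberately simple ramp: the 11-week figure comes from the text, while the weekly values (a net loss of 2 units per week during the ramp, a net gain of 5 afterward) are hypothetical.

```python
from itertools import accumulate
from typing import Optional, Sequence


def first_positive_week(weekly_net: Sequence[float]) -> Optional[int]:
    """First 1-based week whose own net gain is positive, or None if there is none."""
    for week, value in enumerate(weekly_net, start=1):
        if value > 0:
            return week
    return None


def cumulative_breakeven_week(weekly_net: Sequence[float]) -> Optional[int]:
    """First 1-based week in which the running total is no longer negative."""
    for week, total in enumerate(accumulate(weekly_net), start=1):
        if total >= 0:
            return week
    return None


RAMP_WEEKS = 11      # interval reported in the text before net gains appear
WEEKLY_LOSS = -2.0   # hypothetical: review overhead outweighs the speedup
WEEKLY_GAIN = 5.0    # hypothetical: steady-state net gain after the ramp

weekly_net = [WEEKLY_LOSS] * RAMP_WEEKS + [WEEKLY_GAIN] * 15

print("first positive week:", first_positive_week(weekly_net))
print("cumulative break-even week:", cumulative_breakeven_week(weekly_net))
```

Under these assumptions the first positive week is week 12, but the losses accumulated during the ramp are only recovered in week 16. A business case that counts gains from the first positive week overstates the return.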
There is also corrosive organizational risk tied to cognitive dependence, especially among junior professionals.
When less experienced developers receive “good enough” solutions rapidly, they may produce more lines without consolidating the mental models needed for full technical judgment: critical decomposition, rigorous reading of trade-offs, and deep understanding of performance and safety.
In the short term production rises; in the medium term an operational base forms that cannot diagnose complex incidents.
For software-intensive companies, this becomes a direct strategic implication: you reduce apparent cost today while weakening your internal capability tomorrow.
Raw gains turn into hidden liabilities when code accepted out of convenience replaces code accepted out of technical conviction.
This pattern appears outside engineering too.
Klarna showed a similar partial reversal: its AI handled 67% of chats, covering 2.3 million conversations, and reduced average resolution time from 11 minutes to 2 minutes, a workload described as equivalent to that of about 700 agents (Klarna, 2024).
Between 2025 and 2026, it had to reintroduce humans under a flexible model after a perceived deterioration in quality and satisfaction, publicly acknowledged by CEO Sebastian Siemiatkowski (Bloomberg, 2025; Bloomberg, 2026).
The lesson applies directly: drastic compression of execution time does not automatically mean better resolution.
In software you get inflated review; in support you get fast contact resolved poorly; in regulated operations you get exceptions escalated too late.
Acemoglu and Johnson warn against exactly this recurring mistake: using technology primarily as a straight cost cut often produces systems that look efficient on spreadsheets but are fragile in reality.
Finally comes quantitative evidence tied to human execution: Morgan Stanley points out that companies that underinvest in human training relative to technical investment record ROI 60% lower (Morgan Stanley Research, 2026).
Without proportional training, the promised gains turn into a silent accumulation of operational friction that erodes margin, quality, and internal capability month after month.
The correct design treats these systems as supervised amplifiers powered by robust processes: risk-oriented review, clear accountability trails for humans, metrics separating raw gain versus net gain, training aligned with technological ambition.
Without that, mechanization becomes invisible operational debt charging interest later.
Cultural and Social Impacts When Automation Becomes a Straight Cut
The common promise tied to mechanization fails for the same reason seen in earlier cycles: it confuses apparent work removal with real value creation.
Acemoglu and Johnson argue in Power and Progress that distributive effects depend on institutional choices made during implementation.
When the priority becomes replacing people just to compress payroll, the frequent outcome is not shared abundance; it’s more fragile services, less professional autonomy, and the silent transfer of costs to workers and customers.
At the business level, the sequence is visible in the numbers: first the transaction cost indicator falls; then come more rework, higher churn, and lower trust.
In practice, the spreadsheet improves first; experience degrades afterward.
The Klarna case helps because it shows the difference between apparent efficiency and economic sustainability.
In February 2024, its assistant handled 67% of chats, covered about 2.3 million reported conversations, and reduced average resolution time from 11 minutes to 2 minutes, delivering capacity described as equivalent to the work of approximately 700 agents (Klarna, 2024).
For anyone seeking instant productivity, it looked like a perfect script: lower headcount, more speed, same coverage.
But customer service does not work like an assembly line: solving quickly doesn’t mean solving well.
Between 2025 and 2026, Sebastian Siemiatkowski publicly admitted that an excessive focus on cost had compromised perceived quality and satisfaction, leading the company to reintroduce human work under a flexible model aimed at restoring service quality and contextual judgment, according to the cited sources (Bloomberg, 2025; Bloomberg, 2026).
The managerial conclusion here is harsh: mass-replacing people with mechanization without properly redesigning the service is equivalent to swapping experienced managers for a script that keeps working until a relevant exception appears.
This retreat should be read less as an isolated technological failure and more as evidence of a wrong managerial choice about where to capture value.
There is a decisive difference between using models to eliminate repetitive tasks under human supervision versus emptying roles whose value depends on empathy, discernment, and implicit negotiation.
When that distinction is ignored, two effects emerge.
For customers, the service becomes cheaper for the company but worse for those who need it: superficial fast answers; late escalations; a sense of abandonment when cases fall outside the standard pattern.
For workers, a bifurcated market grows: a minority of highly qualified professionals designs and supervises systems; most enter intermittent flexible regimes correcting failures left behind by poorly calibrated automation.
The “Uber-style” model associated with Klarna’s return to human agents illustrates this shift, as described in the Bloomberg sources mentioned earlier (Bloomberg, 2026).
From a cultural perspective, excellence criteria change.
When the dominant incentive becomes average interaction time or automated volume per channel, professionals learn quickly that nuance turns into cost and care turns into inefficiency.
This corrodes internal standards long before annual reports reflect it.
And it eventually shows up in the quantitative evidence cited earlier: companies underinvesting in training relative to technical investment record ROI 60% lower (Morgan Stanley Research, 2026).
Sustainable productivity comes from combining well-scoped automation with repositioning qualified work—because context matters at key moments.
Technical progress can broaden prosperity or concentrate gains by degrading essential services depending on the institutional architecture chosen during implementation, as Acemoglu & Johnson emphasize.
Agentic AI in Business Practice and Necessary Maturity
Moving beyond passive chatbots toward agentic AI changes the nature of the corporate problem.
A system based on LLM + RAG responds better because it consults internal documents but still operates like an informed junior analyst waiting for step-by-step instructions.
By contrast, an agent connected directly to transactional systems such as ERP, CRM, billing, and workflow tools acts more like an operational coordinator: interpreting context, deciding next actions, calling APIs, logging events, and monitoring exceptions until task closure.
That difference may seem semantic but becomes economic because it changes the underlying flow:
The chatbot reduces interface friction;
The agent changes execution within the workflow.
Integrated with Salesforce, SAP, or ServiceNow, it stops being only a conversational layer and becomes an intermediate execution mechanism, automating recurring decisions that previously required constant supervision across sales, purchasing, customer support, collections, and financial operations.
Computational autonomy without organizational maturity creates a manager without clear instructions or authority boundaries.
The theoretical gain can be enormous because agents take actions rather than merely make suggestions, but it requires elements many companies have not yet fully built:
Reliable data across legacy systems;
Clear authorization rules;
Auditable trails;
Robust exception handling design.
Morgan Stanley provides a useful snapshot of this stage:
In research involving over 800 companies, the bank reported a high projected average ROI of 171% for agentic AI initiatives, but only 11% of organizations managed to move from pilot to operating the technology at scale (Morgan Stanley Research, 2026).
Strategic takeaway: economic potential exists but gets trapped by classic enterprise execution bottlenecks.
It’s like buying a modern fleet for national distribution and discovering that logistics hubs still run on disconnected spreadsheets and warehouses are poorly synchronized. A powerful asset needs an ecosystem ready enough to absorb it.
That’s why so many proof-of-concept demos impress while core processes fail:
A sales agent generates value only if it consults CRM history, validates ERP limits, checks real-time inventory, drafts proposals according to legal policy, and records everything under proper governance.
Any broken link (duplicated data, a poorly defined permission, an unstable API, an ambiguous fiscal rule) interrupts autonomy and requires human intervention.
So the better programs abandon the fantasy of a universal substitute agent and adopt domain-specialized agents with narrow scope and hard metrics:
Completion rate without escalation;
Cost per resolved transaction;
Percentage of exceptions correctly routed;
Net impact on SLA and operational margin.
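The first three metrics in the list above can be computed directly from an agent's run log. The sketch below is one possible way to do it: the `AgentRun` record and its fields are hypothetical, "resolved" is assumed to mean any completed run (escalated or not), and the SLA and margin impact is left out because it needs finance data that a run log does not contain.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AgentRun:
    """One task handled by an agent (hypothetical log schema)."""
    completed: bool         # the task reached closure
    escalated: bool         # a human had to step in
    was_exception: bool     # the case fell outside the standard path
    routed_correctly: bool  # only meaningful when was_exception is True
    cost: float             # compute plus human time charged to the run


def _ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def scorecard(runs: Sequence[AgentRun]) -> dict:
    """Compute the three log-derived metrics for a batch of runs."""
    completed = [r for r in runs if r.completed]
    exceptions = [r for r in runs if r.was_exception]
    return {
        "completion_without_escalation": _ratio(
            sum(1 for r in completed if not r.escalated), len(runs)
        ),
        "cost_per_resolved_transaction": _ratio(
            sum(r.cost for r in runs), len(completed)
        ),
        "exceptions_correctly_routed": _ratio(
            sum(1 for r in exceptions if r.routed_correctly), len(exceptions)
        ),
    }


# Four made-up runs: two clean, one escalated exception, one failed exception.
runs = [
    AgentRun(True, False, False, False, 0.40),
    AgentRun(True, False, False, False, 0.40),
    AgentRun(True, True, True, True, 1.20),
    AgentRun(False, True, True, False, 1.00),
]
card = scorecard(runs)
for name, value in card.items():
    print(f"{name}: {value}")
```

Charging the cost of failed runs to the resolved ones is a deliberate choice here: it keeps abandoned or escalated work from disappearing out of the unit cost.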
Among Morgan Stanley’s data points, the most important may be the one that captures what actually happens in practice:
Companies underinvesting in human training relative to technical investment recorded ROI 60% lower (Morgan Stanley Research, 2026).
This dismantles simplistic claims that increasing autonomy reduces internal training needs; the opposite is true:
The higher the degree of agency delegated to a system, the more sophistication teams need in defining policies, monitoring emergent behavior, auditing decisions, and redesigning adjacent processes.
A financial agent that reconciles payments and renegotiates collections doesn’t eliminate experienced operators; it demands better professionals to define guardrails, compliance rules, transactional handling, and the treatment of sensitive exceptions.
Without human investment, promised ROI evaporates into silent rework, regulatory exposure, and reputational loss, costs that rarely appear in initial business cases.
Corporate maturity at this frontier will be measured less by how many agents are announced and more by the discipline of coupling them with real operations: access control, observability, executive accountability, and consistent transactional integrations.
Advanced companies see that competitive gain isn’t about indiscriminately replacing human interfaces with smart automation, but about decomposing processes to find where recurring decisions can be delegated with low marginal risk and high auditability.
This shifts where competitive advantage centers: instead of choosing an impressive model, it becomes a matter of orchestrating master data, transactional integration, governance, and operational scaling.
Anyone who treats agentic AI as critical corporate software will have a concrete chance to capture part of the ROI projected by Morgan Stanley; anyone who treats it as a glamorous chatbot extension will remain stuck in purgatory: pilots full of internal demos but little verifiable economic transformation.
Hybrid Future And New Success Metrics
The useful post-2026 debate replaces the binary question (“will the bubble burst?”) with a practical one:
Which companies can turn expensive computing capacity into reliable productivity?
The answer tends to favor organizations that abandon the childish logic of automation as a synonym for linear payroll cuts and adopt a hybrid design in which systems handle volume while humans preserve judgment, exception handling, and accountability.
Perez frames it well:
After an installation phase financed by abundant capital comes an infrastructure “domestication” phase through viable operating models.
In plain language, the runway has been paved; now the winners are those who operate regular flights—with safety, occupancy, and margin.
Klarna’s walk-back serves again as the emblematic warning, because it links the wrong KPI to its results:
The company showed impressive initial gains by automating 67% of chats, covering about 2.3 million reported conversations, and reducing average resolution time from 11 minutes to 2 minutes (Klarna, 2024),
but it had to reintroduce humans after realizing that speed without quality degraded satisfaction and effective resolution (Bloomberg, 2025; Bloomberg, 2026).
That invalidates the KPI, not the technology, which remains valid provided governance is adequate.
New metrics must track human-machine symbiosis:
Measuring only headcount reduction and automated volume is as myopic as evaluating a logistics network by only counting trucks removed while ignoring delay, damage, and churn.
Mature operations migrate their OKRs toward indicators of that symbiosis:
Acceptance rate without rework;
Percentage of cases resolved correctly on the first interaction;
Net time to value captured after human review;
Exception density per automated process;
Impact on NPS, margin, and operational risk.
In software development, Copilot is a case in point:
It’s not enough to celebrate “46% of code” or “15 million users.”
The decisive indicator is whether the gain survives an increase in code review
And the average reported interval until real net productivity appears (11 weeks), cited in the GitHub/Microsoft Research sources mentioned earlier.
The same reasoning applies in legal services, finance, and sales:
A system may accelerate first-pass work but worsen final outcomes by increasing supervision needs or introducing errors that are hard to detect.
Strategic data comes precisely from companies that left the lab and hit a wall on execution:
A Morgan Stanley study involving more than 800 companies indicated that agentic AI initiatives could carry a high projected average ROI (171%), but only a small portion (11%) moved from pilot to scaled production.
In addition, companies underinvesting in human training relative to technical investment recorded ROI 60% lower (Morgan Stanley Research, 2026),
showing that future profitability depends less on “more powerful” models
and more on institutional capacity: training teams, defining authority levels, auditing exceptions, and integrating legacy systems while keeping compliance intact.
Stanford HAI reinforces this axis through annual reports of the AI Index:
Widespread diffusion does not automatically mean widespread value creation;
As technology matures, governance rigor and responsible implementation move from being an extra layer to becoming a basic economic condition for sustainable value capture (Stanford HAI, 2025).
For boards, this implies revisiting corporate portfolio goals:
Replace OKRs like “automate X% of interactions” or “reduce Y% labor cost”
With goals tied to net quality:
Increase correct resolution rate with minimal supervision;
Reduce process cost without statistically significant drops in quality;
Shorten time to net team productivity;
Maintain a fully auditable trail for decisions delegated to agents.
It also makes sense to create composite indicators:
Net productivity per FTE augmented by the system;
Operational confidence index per automated workflow;
Incremental return adjusted for regulatory risk.
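One possible way to define these composite indicators is sketched below. The formulas, parameter names, and sample numbers are assumptions made for illustration; the article does not prescribe a specific calculation, and each organization would calibrate its own.

```python
def net_productivity_per_fte(hours_saved, review_hours, rework_hours, fte):
    """Hours gained per FTE after subtracting human review and rework time."""
    if fte <= 0:
        raise ValueError("fte must be positive")
    return (hours_saved - review_hours - rework_hours) / fte


def operational_confidence(correct_rate, exception_density, audit_coverage):
    """Hypothetical 0-to-1 index for one automated workflow.

    A multiplicative form is an assumption: it makes a weak score on any
    single dimension pull the whole index down.
    """
    return correct_rate * (1.0 - exception_density) * audit_coverage


def risk_adjusted_return(incremental_return, incident_probability, expected_penalty):
    """Incremental return minus the expected cost of a regulatory incident."""
    return incremental_return - incident_probability * expected_penalty


# Illustrative inputs only.
print(net_productivity_per_fte(400, 120, 80, 10))      # 20.0 hours per FTE
print(round(operational_confidence(0.9, 0.1, 1.0), 4)) # 0.81
print(risk_adjusted_return(100_000, 0.02, 1_000_000))
```

The multiplicative confidence index is a deliberate choice in this sketch: a workflow with no auditable trail (`audit_coverage` of 0) scores zero regardless of how accurate it looks.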
This even changes how we read today’s excessive investment:
Sequoia’s macro question about US$600 billion remains valid as a test of CapEx discipline at the macroeconomic level,
but at the micro level the selection process tends to be less dramatic and less apocalyptic:
some stock theses compress, some suppliers disappear, and many pilots die,
while companies that treat these systems as supervised operational infrastructure,
not as an accounting shortcut for cutting people,
tend to exit the cycle with better processes, defensible margins, and durable competitive advantage.
Conclusion
The most useful reading of a possible AI bubble is not binary; it is economic. Excess expectation in parts of the market coexists with real creation of productive capacity, but value capture is proving far more difficult than the commercial narrative suggested. The signals gathered throughout this article point in that direction: Sequoia’s question about US$600 billion remains relevant as a test of CapEx discipline; Klarna’s case showed that automating 67% of chats and cutting average resolution time from 11 minutes to 2 minutes does not guarantee better net outcomes; and Morgan Stanley’s research, with a projected ROI of 171% for agentic AI but only 11% of initiatives in scaled production, exposes the mismatch between promise and execution. The central point, therefore, is not to deny the technology but to separate apparent throughput from sustainable operational performance.
The next cycle should be less about buying capacity and more about proving governance, integration, and net productivity. Boards, CFOs, and operations leaders will need to decide where AI works as critical infrastructure, where it remains an expensive experiment, and which processes require permanent human supervision. The main risk isn’t only paying too much for models or data centers; it’s institutionalizing the wrong KPIs and turning local gains into systemic loss. Those who move forward with composite metrics, auditable trails, and serious investment in training are likely to convert CapEx into operational advantage; those who treat AI as an accounting shortcut will probably discover too late that scaling without control destroys ROI.
Further Reading
Recommended Books
- Prediction Machines: The Simple Economics of Artificial Intelligence by Ajay Agrawal, Joshua Gans, and Avi Goldfarb (Harvard Business Review Press, 2018). This book provides an economic framework for understanding how artificial intelligence affects business strategy and society—focusing on declining prediction costs and their implications for ROI and capital allocation.
- AI Superpowers: China, Silicon Valley, and the New World Order by Kai-Fu Lee (Houghton Mifflin Harcourt, 2018). While it doesn’t focus directly on ROI or CapEx, this book offers an in-depth view of the global AI landscape—including massive investments and technological races—that can influence perceptions of value and bubble risk.
- Working with AI: Real Stories of Human-Machine Collaboration by Thomas H. Davenport and Steven Miller (MIT Press, 2022). This book explores collaboration between humans and AI across different business contexts—offering practical insights into implementation challenges and opportunities—which is crucial for understanding real ROI and associated human-capital costs tied to AI.
Reference Links
- MIT Technology Review – A news portal with deep analysis on emerging technologies—including AI—that often publishes articles about economic impact, investments, and the future of the AI market.
- Morgan Stanley Research Reports – A source for reports on the AI market, corporate readiness assessments, and financial analyses that may include data about CapEx and ROI in AI implementations, such as the “Enterprise AI Readiness Guide” mentioned in the article.
- GitHub Blog – GitHub’s official blog frequently publishes updates and insights about its tools—including GitHub Copilot—and may contain case studies or data on productivity impacts from AI in software development.
