Buying AI Well
How to diligence AI companies before the liabilities become yours
The numbers tell the story before the strategy does. Global M&A surged roughly 40% in 2025 to a record $4.9 trillion of deal value, with nearly half of strategic technology transactions over $500 million touching an AI-native target. Private AI companies raised $226 billion in the first quarter of 2026 alone, more than in the whole of calendar 2025, and 266 AI M&A deals closed in that same quarter, 90% above the prior-year run rate. Median private AI multiples sit at roughly 30x EV/Revenue, against 7x to 8x for public SaaS.
That premium is doing real work. It is paying for talent density, proprietary training data, customer outcomes, and an option on what these systems become in three years. It is also, with uncomfortable frequency, paying for legal, technical and governance risk that the buyer has not actually priced. The acquirers writing the largest cheques are behaving like collectors at a private auction: confident, fast, surrounded by advisors. What many of them lack is the diligence apparatus to verify what is in the crate.
This is not an argument against acquiring AI. The strategic case for buying capability rather than building it is, in many sectors, self-evident. The argument is narrower and aimed at the people whose names appear on the deal documents. Standard tech M&A diligence will miss the things in an AI target that matter most, and the cost of missing them is borne by the buyer in regulatory liability, in customer flight, and in goodwill that gets impaired in year two. The patterns below are where the holes tend to be.
The Data Question
In an AI business the asset is the model, and the model is the data it was fed. Diligence frameworks designed for SaaS contracts and code repositories will not interrogate that hierarchy, and lawyers who have not internalised it tend to miss the central issue entirely.
The first job is to trace, line by line, the provenance of every training dataset still material to model performance. Licensed corpora routinely include change-of-control carve-outs and field-of-use restrictions, and the licensor will sometimes treat the M&A event as the moment to renegotiate. Scraped public web data carries a separate exposure profile. The New York Times v. OpenAI litigation is now in active discovery, with the court ordering OpenAI to produce twenty million de-identified ChatGPT logs in late 2025, and the eventual finding on market substitution will materially reset the boundary on fair use for the rest of the industry. The English High Court’s November 2025 judgment in Getty Images v. Stability AI went largely against Getty on the principal copyright claims, but on grounds that hinged on UK jurisdiction over where training had occurred, rather than on a clean substantive answer to whether ingestion of copyrighted works is lawful. The legal map across the United States, the EU and Asia remains fragmented, and the buyer inherits the ambiguity.
Where personal data has been ingested into training, the position compounds. Consent obtained for one purpose does not automatically survive into model training, and Article 5 of the GDPR has been actively enforced against AI training corpora by the Italian, French and Dutch supervisory authorities. The CNIL’s evolving guidance, the Hamburg DPA’s published position that model weights themselves may constitute personal data in certain conditions, and ongoing supervisory work on healthcare and credit-scoring use cases should be read together rather than separately. A target that cannot produce a defensible audit trail from data subject to model output is not asking the buyer to absorb a compliance project. It is asking the buyer to absorb a contingent liability of unknown size and unknown timing.
What the Model Actually Does
A trained model is not a piece of software in any sense a CFO recognises from prior tech diligence. It is a statistical artefact whose behaviour drifts as inputs drift, whose failure modes are probabilistic, and whose performance on the population the target sells to may not match performance on the population the buyer sells to. The implication, for any acquirer with material customer exposure in regulated end-markets, is that an independent model audit should be commissioned during diligence on the same terms as a Quality of Earnings review.
The work is no longer exotic. Specialist providers are reasonably mature and the scope is well understood: benchmarking across realistic input distributions, bias and disparate-impact testing across protected characteristics relevant to the target’s regulated use cases, robustness testing against adversarial prompts, and an honest assessment of how the target manages drift in production. Equally telling is the documentation. A target that can produce model cards, versioning records, retraining cadences and rollback procedures has a capability that is institutionalised. A target that cannot has a capability sitting in the heads of four engineers, none of whom is contractually obliged to stay. In a market where Meta has offered first-year compensation reaching $100 million for senior AI researchers, and OpenAI has issued retention awards of up to $1.5 million per researcher in response, the distinction between institutional and individual capability is not a footnote. It is a price.
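As a rough illustration of the disparate-impact testing described above, the sketch below compares favourable-outcome rates across groups and flags any group whose rate falls below four-fifths of the best-treated group's rate. The 0.8 threshold is borrowed from the "four-fifths" rule of thumb used in US employment-selection practice and is an assumption here, not a legal test for every regulated use case. The function name, group labels and counts are hypothetical.

```python
from typing import Dict, Tuple

# Assumption: the "four-fifths" rule of thumb; the right threshold depends on
# the jurisdiction and the regulated use case.
FOUR_FIFTHS = 0.8


def disparate_impact(
    outcomes: Dict[str, Tuple[int, int]],
    threshold: float = FOUR_FIFTHS,
) -> Dict[str, Tuple[float, bool]]:
    """Return {group: (ratio to best-treated group, flagged)}.

    `outcomes` maps a group label to (favourable_decisions, total_decisions).
    Groups with no decisions are skipped.
    """
    rates = {g: fav / total for g, (fav, total) in outcomes.items() if total > 0}
    if not rates or max(rates.values()) == 0:
        raise ValueError("need at least one group with a non-zero favourable rate")
    best = max(rates.values())
    return {g: (r / best, r / best < threshold) for g, r in rates.items()}


if __name__ == "__main__":
    # Hypothetical audit sample: approvals out of total decisions per group.
    sample = {"group_a": (60, 100), "group_b": (42, 100)}
    for group, (ratio, flagged) in disparate_impact(sample).items():
        print(f"{group}: ratio={ratio:.2f} flagged={flagged}")
```

A flag from a check like this is a prompt for further investigation by the audit provider, not a finding of unlawful bias in itself.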
Governance That Will Hold Up
AI-native companies that have scaled quickly tend to have built remarkable products on governance that is, to put the matter generously, thin. Boards forming a view on a target’s AI risk posture should expect to find four things, and should treat the absence of any one of them as a diligence finding: a written policy on permitted and prohibited applications with sign-off authority specified; a live inventory of every model in production, including the shadow tools that individual teams have deployed without central oversight; a defined process for approving new deployments and material model updates, with model-risk-management discipline borrowed from financial services where relevant; and an incident response protocol, tested at least once, for handling AI failures that touch customers or regulators.
For buyers that are themselves regulated (financial services, insurance, healthcare, critical infrastructure), the acquirer’s own operating licence is in scope from the moment the deal closes. The EU AI Act’s principal obligations for high-risk Annex III systems become enforceable on 2 August 2026, with a Commission and Council proposal to defer certain high-risk obligations into late 2027 still moving through the legislative process and not yet law. Treating that timetable as soft is a category error. Acquirers should be modelling compliance investment into the integration P&L, not discovering it in the second post-close board meeting.
The Ownership Chain
IP in AI is dirtier than in conventional software, and the chain breaks at predictable points. Open-source components incorporated under copyleft licences (notably GPL and AGPL) can taint a commercial product if the engineers were not careful. Contractor agreements without proper assignment language can leave critical model improvements owned by people who no longer work at the company. Targets that spun out of university research, particularly from US and European institutions, almost always carry residual institutional rights, joint-ownership claims, or grant-funded restrictions on commercial use. Surfacing these issues before exclusivity is straightforward. Surfacing them afterwards, once leverage has shifted to the institutional licensor or to a former post-doc, is the kind of work that produces material adverse change debates and price chips no one budgeted for.
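A first pass over the copyleft exposure described above can be as simple as screening the target's dependency inventory for GPL-family licence identifiers. The sketch below assumes the target can export a list of components with SPDX-style licence identifiers; the component names and the function name are hypothetical, and a hit only marks a component for legal review of how it is linked and distributed.

```python
from typing import Iterable, List, Tuple

# Strong-copyleft families named in the text. LGPL identifiers begin with
# "LGPL-", so this prefix test deliberately does not flag them.
COPYLEFT_PREFIXES = ("GPL-", "AGPL-")


def flag_copyleft(components: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (component, licence) pairs whose licence looks GPL or AGPL."""
    return [
        (name, licence)
        for name, licence in components
        if licence.upper().startswith(COPYLEFT_PREFIXES)
    ]


if __name__ == "__main__":
    # Hypothetical inventory exported from the target's build system.
    inventory = [
        ("tokenizer-lib", "Apache-2.0"),
        ("vector-index", "AGPL-3.0-only"),
        ("eval-harness", "MIT"),
        ("data-loader", "GPL-3.0-or-later"),
    ]
    for name, licence in flag_copyleft(inventory):
        print(f"review needed: {name} ({licence})")
```

A screen like this says nothing about contractor assignments or university rights, which still have to be traced through the underlying agreements.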
The People Equation
Most AI acquisitions are, in commercial substance, talent acquisitions in which the technology is the explanatory variable. The buyer’s valuation model treats the engineering organisation as a going concern. The post-close reality is grimmer: roughly 47% of acquired employees leave by the end of year one and 75% by the end of year three, with voluntary attrition during integration running approximately 3.6 times the baseline rate. In AI organisations, where value sits with a small and identifiable group of researchers, the impact on deal value is non-linear. The buyer has not bought ninety engineers; the buyer has bought eight whose names appear on three papers.
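The arithmetic behind that concentration risk is worth writing down. The sketch below applies the attrition figures quoted above to a group of eight key researchers, under two loud simplifying assumptions: that key people leave at the same rate as acquired employees generally, and that each departure is independent of the others. Neither is likely to hold in practice (researchers often leave together), so the output is an illustration, not a forecast. The function names are hypothetical.

```python
def expected_survivors(key_people: int, attrition: float) -> float:
    """Expected number of key people still in seat, given a cumulative attrition rate."""
    return key_people * (1.0 - attrition)


def prob_all_stay(key_people: int, attrition: float) -> float:
    """Probability that nobody in the key group has left (assumes independence)."""
    return (1.0 - attrition) ** key_people


if __name__ == "__main__":
    KEY_PEOPLE = 8  # the "eight whose names appear on three papers"
    for label, rate in (("year one", 0.47), ("year three", 0.75)):
        print(
            f"{label}: expected in seat = {expected_survivors(KEY_PEOPLE, rate):.1f}, "
            f"P(all stay) = {prob_all_stay(KEY_PEOPLE, rate):.4f}"
        )
```

At the year-three rate, the expected number of the eight still in seat is two, and on these assumptions the chance of keeping the whole group intact even through year one is well under one percent.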
Retention work begins during diligence, not after signing. The first task is to map dependency rather than seniority, identifying the five to ten individuals whose departure would meaningfully change what the buyer has bought. That group rarely overlaps cleanly with the management chart in the data room. The second task is to design retention with a clear-eyed view of the alternative bid. The same engineer the acquirer is trying to retain at $1.5 million may be holding an open offer at four times that figure from a frontier lab, and compensation alone will not be enough. Mission, autonomy, compute budget, equity in something with credible upside, and freedom from corporate process matter as much. Meta, despite offering nine-figure packages, has reportedly retained only about 64% of its senior AI talent, which is what happens when money has been solved for and the operating environment has not.
Cultural fit deserves comparable scrutiny. AI-native talent operating in fifteen-person teams concludes, with depressing regularity, that the job inside a large acquirer has become slower than the work merits. The acquirers that keep people are the ones that ring-fence the team, protect its operating model, and accept that integration on conventional timelines will, on the present mathematics, destroy the asset they paid for.
The Regulatory Perimeter
The reverse acquihire has become the most distinctive transaction form of this cycle, and it now sits squarely in the regulatory crosshairs. Microsoft’s March 2024 arrangement with Inflection (a $620 million non-exclusive technology licence and $30 million for legal waivers, with substantially all of Inflection’s seventy-person workforce moving to Microsoft) was reviewed by the UK CMA, which cleared the deal but formally designated it a merger; by the German Federal Cartel Office, which concluded that the transaction constituted a notifiable concentration but ultimately declined jurisdiction for lack of German nexus; and by the US Federal Trade Commission, which opened a formal inquiry. Amazon’s June 2024 hiring of the Adept leadership team, Google’s $2.7 billion arrangement with Character.AI in August 2024 (now under DOJ investigation), and Google’s $2.4 billion reverse acquihire of Windsurf on 11 July 2025 have together changed the regulatory frame. The Windsurf deal collapsed an active $3 billion OpenAI transaction at the moment exclusivity expired, and Cognition acquired the remaining assets within seventy-two hours. The structural innovation that allowed acquirers to avoid Hart-Scott-Rodino notification is no longer beneath the regulators’ notice. FTC Chair Andrew Ferguson has stated publicly that the agency will examine acquihires and reverse acquihires as a category.
For cross-border buyers, the antitrust review timetable for AI transactions has lengthened materially, and the European Commission’s Article 22 referral mechanism remains an unsettled risk for sub-threshold deals that nevertheless raise innovation concerns. Those drafting the MAC clause, the long-stop date and the deal insurance stack should now assume twelve to eighteen months of regulatory contingency on consequential AI acquisitions, rather than the six months that sponsor capital is accustomed to in tech.
The Way In
Bain’s 2026 research suggests that roughly 70% of AI value will be realised inside core operating functions (sales, marketing, manufacturing, supply chain, customer service) rather than in standalone product. The implication for buyers is clean: the right question is not whether the target has an impressive demo, but whether the capability can be embedded in existing revenue, cost or service lines without breaking the things that already work.
The acquirers seeing AI clearly are those treating AI diligence as a discrete workstream, separately scoped and separately budgeted alongside financial, legal, commercial and tax diligence. The acquirers folding it into existing technology diligence are the ones discovering the holes during the year-two impairment test.
The most expensive lesson of the cycle so far has not been overpaying. Premiums on AI assets can sometimes be defended on the strength of underlying optionality, and a 30x multiple that prices an option correctly is not a bad use of capital. The most expensive lesson has been buying the wrong thing: a model whose data rights expire on close, a team whose key talent had a parallel offer the acquirer never saw, a product whose regulatory perimeter shifts under it within twelve months. These are diligence questions, and they have answers when they are asked in time.
Speed remains a competitive advantage in AI deal markets. Conviction without rigour is how acquirers end up owning liabilities they never priced.