Dec 10, 2025      Merelda Wu <merelda@melio.ai>

Beyond the Hype: How Gen AI Delivered Real Value in 2025

The breakthrough wasn’t a new model - it was the engineering, data discipline, and workflow design around it 🚀

2025 was the year generative AI stopped being a demo and started delivering real business value. Not the gimmicky kind, but actual, measurable outcomes in operations, compliance, customer service, and decision-making.

And the teams who made the biggest progress all learned the same lesson: the breakthrough wasn’t a new model. It was the engineering, data discipline, and workflow design around the model.

From Pilots to Production

In 2023-24 everyone asked “Can we use GenAI for this?”

In 2025 the real question became: “Can we trust it at scale?”

And the shift showed up everywhere:

  • 88% of companies now use AI in at least one function,
  • but only ~33% have managed to scale it beyond pilots,
  • and only 39% reported any meaningful profit impact from AI this year [1].

Most organisations are experimenting, but only a minority are operationalising.

Across our client work we saw the same pattern:

  • Banks moved from analysing documents to automating full review workflows with SLAs and audit trails.
  • Insurers rolled out domain-specific AI assistants for claims processing and underwriting.
  • Telecom and healthcare adopted AI copilots for customer support and clinical documentation [2][3].

And none of these wins hinged on choosing the “best model.” They came from clean data, reliable pipelines, well-defined workflows, and strong evaluation.

Three Themes That Defined 2025

1. Reliability Became the Currency

Teams stopped asking whether an AI answer was “clever” and started asking whether the system was consistent, auditable, and monitored.

Evaluation frameworks like HELM Safety, AIR-Bench, and FACTS [4] started showing up in real deployments, not just research.

Even AI agents followed this trend. While 62% of organisations experimented with agentic systems, only 23% scaled them [1], almost always with strict controls, guardrails, and monitoring.

The rule of the year:

If it isn’t reliable, it isn’t going into production.

2. Data Stewardship Became a Competitive Advantage

2025 made it painfully clear: good data teams win.

The highest-performing companies weren’t the ones with the biggest models - they were the ones with clean, structured, well-governed data and continuous feedback loops.

And at the same time, the “free data” era ended fast:

Restricted or non-scrapable web data jumped from ~5-7% to ~20-33% of available tokens in a single year [4].

This pushed the value firmly toward proprietary, well-maintained internal datasets, the kind that only disciplined organisations can leverage.

3. Companies Stopped Asking for Custom Models, and Started Asking for Custom Workflows

As open-source models closed the performance gap to proprietary ones, shrinking it from 8% to just 1.7% [4], the question changed from “Which model?” to “Which workflow?”

The biggest ROI came from end-to-end workflow redesign, not model tinkering.

By the end of the year it was clear:

An average model with great engineering beats a great model with average engineering.

What We’re Watching in 2026

1. Multi-Agent “Teams” Instead of One Super-Agent

The all-purpose autonomous agent didn’t land in 2025. But small, specialised agents working together quietly did.

Expect architectures where one agent fetches data, one reasons, one validates, and another executes: essentially, AI microservices.
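To make the fetch, reason, validate, execute split concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular production system: the `Task` record, the four agent functions, and the invoice example are all hypothetical, and the "reasoning" step is a plain rule standing in for a model call so the sketch runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """State handed from one agent to the next (hypothetical structure)."""
    query: str
    data: dict = field(default_factory=dict)
    decision: str = ""
    approved: bool = False
    log: list = field(default_factory=list)  # audit trail of the steps that ran


def fetch_agent(task: Task) -> Task:
    # Stand-in for a retrieval step (database, API, document store).
    task.data = {"invoice_total": 120.0, "approved_limit": 500.0}
    task.log.append("fetch")
    return task


def reason_agent(task: Task) -> Task:
    # Stand-in for a model call; a fixed rule keeps the example deterministic.
    within_limit = task.data["invoice_total"] <= task.data["approved_limit"]
    task.decision = "pay" if within_limit else "escalate"
    task.log.append("reason")
    return task


def validate_agent(task: Task) -> Task:
    # Independent check: only decisions from a known set are allowed through.
    task.approved = task.decision in {"pay", "escalate"}
    task.log.append("validate")
    return task


def execute_agent(task: Task) -> Task:
    # Act only on validated decisions; record what happened either way.
    outcome = task.decision if task.approved else "skipped"
    task.log.append(f"execute:{outcome}")
    return task


# Each agent has one narrow job and the same interface, like a small service.
PIPELINE = [fetch_agent, reason_agent, validate_agent, execute_agent]


def run(query: str) -> Task:
    task = Task(query=query)
    for agent in PIPELINE:
        task = agent(task)
    return task


result = run("Should invoice 1042 be paid?")
print(result.decision, result.log)
```

Because every agent takes and returns the same record, each one can be tested, monitored, or swapped out on its own, which is the property that makes the microservices comparison apt.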

2. Evaluation Becomes Non-Negotiable

In 2026, no enterprise AI will go live without an evaluation checklist and a monitoring dashboard.

With 59 federal AI-related regulations introduced in the US in 2024 alone (double the previous year) [4], this shift is now unavoidable.

Think of this as the early days of an “ISO for AI quality.”
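One way to picture an evaluation checklist is as a release gate that blocks deployment unless every check clears its threshold. The sketch below is an assumption-laden illustration in Python: the metric names, values, and thresholds are invented for the example and do not come from any standard or from the benchmarks cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """One line of a hypothetical evaluation checklist."""
    name: str
    value: float              # measured on a held-out evaluation set
    threshold: float          # agreed with the business before launch
    higher_is_better: bool = True

    def passed(self) -> bool:
        if self.higher_is_better:
            return self.value >= self.threshold
        return self.value <= self.threshold


def release_gate(checks):
    """Return (ok, failures): ok is True only if every check passes."""
    failures = [c.name for c in checks if not c.passed()]
    return len(failures) == 0, failures


# Illustrative numbers only.
checks = [
    Check("answer_accuracy", 0.93, 0.90),
    Check("hallucination_rate", 0.04, 0.02, higher_is_better=False),
    Check("p95_latency_seconds", 1.8, 2.0, higher_is_better=False),
]

ok, failures = release_gate(checks)
print("release approved" if ok else f"release blocked by: {failures}")
```

The same list of checks can feed a monitoring dashboard after launch, so the bar a system had to clear to go live is also the bar it is held to in production.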

3. Privacy-First Infrastructure

With global AI regulation accelerating, privacy can’t be bolted on later.

Companies will favour platforms that guarantee data residency, encryption-in-use, and secure workflows by default, especially as countries invest heavily in sovereign AI (Canada $2.4B, France €109B, Saudi Arabia $100B, etc.) [4].

4. ROI Overtakes Benchmarks

Executives are asking:

Did it reduce cost, improve speed, or de-risk a process?

Not:

Is ChatGPT, Gemini, or Claude better for my workflow?

Funding in 2026 will follow ROI, not novelty.

The Road Ahead

2025 taught us that most companies didn’t need “more AI.” They needed:

  • cleaner data
  • better workflows
  • solid evaluation

The real progress happened in the boring places: version control, schema cleanup, pipeline reliability, monitoring, and governance.

At Melio AI, we’ve always believed that’s where the magic actually happens. And 2025 proved it.

The most impactful AI in 2026 won’t feel like AI at all. It will sit quietly in the background, triggering actions, preventing issues, and making decisions faster and more consistently than before.

Better systems, not bigger models. Better outcomes, not bigger hype.

That’s the direction we’re building toward: AI that just works, and gets out of the way until you need it.


References


  1. McKinsey & Company – The state of AI in 2025: State of AI Global Survey (Nov 2025). Key stats: ~88% of firms use AI in at least one function; ~33% have scaled AI beyond pilots; 62% experimenting with “AI agents,” 23% scaling them; only 39% see enterprise-level EBIT impact.

  2. IBM Institute for Business Value – AI in Telecom 2025 Study (Oct 2024). Insights: 65% of telco executives say their AI projects haven’t delivered expected value (often due to data and scaling gaps); 69% of telecoms have generative AI live in customer care; 44% use “agentic AI” in operations; leading telcos use AI performance dashboards to monitor reliability and ensure governance.

  3. Menlo Ventures – State of AI in Healthcare 2025 Report (Oct 2025). Highlights: Healthcare industry deploying AI at 2.2× the rate of the broader economy; health systems reached 27% adoption vs ~9% across other industries; large deployments like Kaiser Permanente’s rollout of AI assistants across 40 hospitals (the largest in healthcare to date) exemplify the rapid shift from pilots to production in healthcare.

  4. Stanford HAI – 2025 AI Index Report (Apr 2025). Key insights: Business AI adoption rose from 55% (2023) to 78% (2024); new RAI benchmarks (HELM Safety, AIR-Bench, FACTS) emerging; open-source models closed performance gap with closed models from 8% to 1.7%; in 2024, U.S. agencies introduced 59 AI-related regulations; major national AI investments by Canada, France, China, Saudi Arabia, etc.