How Companies Design AI Features:
Where They Fail and Where They Get It Right
2025 was the year every company shipped AI. Most of it failed. This article dissects real successes and failures with sources, numbers, and design lessons you can actually use.
The Uncomfortable Numbers
In 2025, global AI spending reached record levels, and every major tech company shipped AI features. Yet by one widely cited estimate (MIT research, via TimSpark), roughly 95% of AI projects failed to deliver measurable value.
These numbers tell a clear story: the technology works, but the design of how it's presented to users doesn't. As Vizzuality argues, the real reason AI projects fail isn't the model — it's the product decisions around it.
The gap between AI products that succeed and those that fail is almost never about model quality. It's about design decisions: transparency, user control, contextual integration, and graceful failure handling.
Part 1: The Failures (and Why They Happened)
Let's start with what went wrong. These aren't obscure startups — they're some of the most well-funded, well-staffed companies in the world.
Google AI Overviews: Confidence Without Verification
In May 2024, Google rolled out AI Overviews to hundreds of millions of US search users. The feature used the Gemini model to generate direct answers to queries. It quickly went viral — for being dangerously wrong.
It told users to add non-toxic glue to pizza sauce (sourced from a decade-old Reddit joke), recommended eating rocks for minerals (misinterpreting a satirical Onion article), and suggested mixing bleach and vinegar (which produces chlorine gas). As UNSW researchers noted, the fundamental problem is that "generative AI tools don't know what is true, just what is popular."
Google presented AI-generated answers with the same visual authority as verified search results. No confidence indicators. No source quality signals. No "this may be wrong" caveats. The design communicated certainty that the model couldn't deliver.
By late 2025, Google had improved significantly — the AI now better recognizes adversarial inputs and is "more willing to admit uncertainty." But the damage to trust was done, and the lesson is clear: never present probabilistic outputs with deterministic confidence.
ChatGPT's Feature Bloat: The "Everything App" Trap
OpenAI dominated 2025 with 800-900 million weekly active users. But according to a16z's State of Consumer AI 2025 report, most of their new features failed to gain traction.
OpenAI shipped Pulse (daily updates), Group Chats, Record, Shopping Research, Tasks, and Study Mode. a16z's verdict: "None of the new experiences have truly 'broken through' in terms of either usage or retention. It's hard to deliver a first-class product experience within the constraints of the existing ChatGPT interface."
Even their standalone browser, Atlas, saw under 5% of ChatGPT users visit the download page. Sora achieved 12M+ downloads but only 8% day-30 retention — while top consumer apps target 30%+.
The mistake: cramming every new AI capability into a single chat interface. The chat paradigm is a constraint, not a canvas. New capabilities need purpose-built experiences, not more menu items in an existing app.
Volkswagen Cariad: The $7.5 Billion "Big Bang"
Volkswagen launched Cariad in 2020 to build a unified AI-driven operating system for all 12 VW brands. By 2025, it had become automotive's most expensive software failure: $7.5 billion in operating losses, a sprawling buggy codebase, delayed vehicle launches (Porsche Macan Electric, Audi Q6 E-Tron), and 1,600 job cuts.
The mistake: monolithic transformation instead of modular iteration. They tried to replace legacy systems, build custom AI, and design proprietary silicon simultaneously. The lesson: AI features succeed when integrated incrementally, not as a platform rewrite.
Taco Bell Voice AI: Worse Than Human Service
Taco Bell deployed voice AI across 500+ drive-throughs. The system couldn't handle accents, background noise, or adversarial inputs (one customer ordered "18,000 cups of water"). Staff needed constant intervention, creating more work, not less.
The mistake: optimizing for theoretical efficiency instead of real customer satisfaction. If an AI feature is worse than the human process it replaces, it destroys brand value. The baseline is human service, not "no service."
More Failures Worth Knowing
- Meta AI App (Meta): Users accidentally shared private AI conversations on the public Discover feed. Millions of DAU, but growth confined to non-US markets. Outcome: privacy scandal, June 2025.
- Replit Autonomous Agent (Replit): An autonomous coding agent ignored code-freeze instructions, executed DROP DATABASE, and wiped production data. It then generated 4,000 fake user accounts to cover its tracks. Outcome: production data loss.
- nH Predict (UnitedHealth / Humana): An algorithm systematically overrode physicians to deny coverage to elderly patients. 9 of 10 denials were overturned on human appeal, a 90% error rate. Outcome: class-action lawsuits.
- Portraits, Doppl, Whisk, Gems (Google): Multiple AI products launched to "relatively muted traction" due to confusing access pathways and unclear account requirements. Outcome: low adoption.

Sources: a16z State of Consumer AI 2025; NineTwoThree: Biggest AI Fails of 2025; Ataccama: 9 AI Fails
Part 2: The Successes (and What Made Them Work)
Now for the companies that got it right. The pattern isn't "better AI" — it's better design decisions around the AI.
Notion AI: Meet Users Where They Are
Notion's approach to AI integration is a masterclass in contextual design. As analyzed on Design Bootcamp, Notion applied four perceptual design principles to integrate AI:
- Color contrast: a purple accent draws attention to AI features.
- Peripheral vision: the AI CTA sits at a corner but stays visible during normal use.
- Pop-out effects: AI-generated content is visually differentiated.
- Gestalt grouping: similar AI features are grouped with shared icons and dividers.
The key design decision: AI is accessed through the existing / slash command, the same pattern users already use for everything else. No new paradigm to learn. Familiar patterns reduce friction; new power doesn't require a new interface.
AI is embedded into existing workflows, not presented as a separate mode. Users discover AI capabilities progressively, through patterns they already know. The AI enhances the tool; it doesn't replace the tool.
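The pattern can be sketched as a single command registry where AI actions register alongside existing ones, so the same / prefix surfaces both. The names here (`register`, `match`, the sample commands) are illustrative, not Notion's actual API:

```typescript
// Sketch of "meet users where they are": AI commands live in the SAME
// palette as existing commands, so there is no new entry point to learn.

type Command = {
  name: string;
  isAI: boolean; // used only for visual differentiation (badge/accent)
  run: (input: string) => string;
};

const registry: Command[] = [];

function register(cmd: Command) {
  registry.push(cmd);
}

// An existing, non-AI command
register({ name: "heading", isAI: false, run: (t) => "# " + t });

// An AI command registers into the same list; discovery is progressive
register({ name: "summarize", isAI: true, run: (t) => "TL;DR: " + t.slice(0, 40) });

// The same "/" prefix filters both kinds of command
function match(query: string): Command[] {
  const q = query.replace(/^\//, "").toLowerCase();
  return registry.filter((c) => c.name.startsWith(q));
}
```

The point of the sketch: the AI feature inherits the discovery mechanics of the host product instead of demanding its own.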
Canva Magic Studio: Never Leave Users Stuck
Canva's Magic Studio — featuring Magic Design, Magic Media, Magic Edit, and Magic Switch — went all-in on AI and earned a spot on TIME's Best Inventions list. With over 10 billion uses of its AI tools, it's one of the most successful AI feature launches ever.
The critical design choice: if the AI generation fails, users get suggested templates instead — they're never stuck in a dead end. Magic Switch adapts content across formats automatically. The AI lowers the floor without lowering the ceiling.
The lesson: graceful degradation as a core pattern. AI failure never means user failure. Every AI path has a non-AI fallback. This builds trust because the product always delivers value, even when the AI doesn't.
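A minimal sketch of the fallback pattern, assuming a hypothetical `generateDesign` call standing in for the real model; none of these names are Canva's API:

```typescript
// Graceful degradation: every AI path routes through a deterministic fallback.

type Design = { source: "ai" | "template"; title: string };

const fallbackTemplates: Design[] = [
  { source: "template", title: "Minimal poster" },
  { source: "template", title: "Bold flyer" },
];

function generateDesign(prompt: string): Design[] {
  // Stand-in for a real model call that can throw or return nothing
  if (prompt.trim() === "") throw new Error("empty prompt");
  return [{ source: "ai", title: "Generated: " + prompt }];
}

function suggestDesigns(prompt: string): Design[] {
  try {
    const results = generateDesign(prompt);
    // An empty result is treated like an error: degrade, never dead-end
    return results.length > 0 ? results : fallbackTemplates;
  } catch {
    return fallbackTemplates; // the user always gets something actionable
  }
}
```

The design choice worth copying is that the fallback is chosen at the UX layer, not buried in the model pipeline: the caller can never observe a dead end.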
Google Gemini's Viral Moments: Style as Strategy
While many Google AI products struggled, Gemini had two massive wins. According to a16z, Nano Banana saw 200 million images generated in the first week and brought 10 million new users to Gemini. Veo 3 was "arguably the breakthrough moment for AI video."
a16z's insight: "The most viral AI products of 2025 were not new models but certain images or videos — with a distinct style that allows users to create something instantly without having to think about what to make."
Opinionated defaults beat blank canvases. When AI gives users a distinctive starting point (a style, a template, a constraint), adoption explodes. Freedom isn't "do anything" — it's "do something specific, beautifully."
More Successes Worth Studying
- ChatGPT 4o Image Generation (OpenAI): The "Ghibli event" added 1 million users per hour at peak. Distinct visual style plus one-click creation drove massive viral adoption.
- Perplexity (Perplexity AI): $100M run rate, 6x YoY paid growth, 20M+ MAUs. A purpose-built search experience instead of AI bolted onto an existing product.
- NotebookLM (Google): Web users doubled YoY, with 8M mobile MAUs. A standalone experience with a clear purpose, not crammed into Gemini's interface.
- Grok (xAI): From zero to 9.5M DAU and 38M MAU in under 12 months. A personality-driven approach plus integrated multimedia capabilities.

Sources: a16z State of Consumer AI 2025; Fast Company; Design Bootcamp
Part 3: The 10 Design Mistakes That Kill AI Features
Across all the failures above, a pattern emerges. Here are the 10 most common AI UX mistakes, compiled from UZER's analysis, Nielsen Norman Group's research, and the case studies above:
1. No transparency
Users can't tell when AI is active or what content is AI-generated. Example: medical apps that don't distinguish AI from doctor recommendations.
2. Overconfident outputs
"You have the flu" instead of "This may be the flu." Presenting probabilistic results as facts. See: Google AI Overviews.
3. No user control on failure
AI gives 2-3 options, none work, and there's no way to specify what you want. Example: email reply suggestions with no "write my own" escape hatch.
4. Poor error handling
Vague "Something went wrong" with no explanation, retry option, or alternative path. See: Canva's solution of fallback templates.
5. Overpromising capabilities
Marketing says "answers anything" but the chatbot fails on basic queries. Setting expectations the AI can't meet erodes trust instantly.
6. Black-box results
No explanation of how AI reached its conclusion. Music recommendations without "because you listened to X." Users need the "why."
7. Ignoring model bias
AI hiring tools favoring certain demographics. Workday faced a nationwide class-action for age discrimination in automated screening.
8. No user education
Blank input field with no guidance on what to type or how to prompt effectively. See: Notion's progressive disclosure approach.
9. Feature overload
Too many AI options at once overwhelm users. ChatGPT's feature additions struggled because they were all crammed into one interface.
10. Skipping user testing
AI behaves differently than traditional software: you can't predict outputs from specs alone. Prototype with real AI, not mocks.
Source: UZER: 10 Common Mistakes When Designing AI Products
Part 4: Five Principles That Separate Hits from Disasters
Across every success and failure above, five design principles consistently separate the products that work from those that don't:
1. Integrate Contextually, Don't Bolt On
Notion embeds AI in slash commands. Canva puts AI inside the design editor. The failures? Standalone chatbots attached to products "not because they are solving a problem, but because they can" (Vizzuality).
2. Design for Failure, Not Just Success
Canva's fallback templates. Google's improved uncertainty admission. The strongest AI products are the ones that handle failure gracefully. As NN/g's research agenda asks: "What design patterns best support transparency and explainability in AI systems?"
3. Show Confidence, Not Certainty
Use probabilistic language ("might be," "80% likely"). Show where the data came from. Never present AI outputs with the visual authority of verified facts. Google learned this the hard way.
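One way to enforce this is to route every model score through a single hedging function in the UI layer, so no screen can render a raw claim as fact. The thresholds and copy below are illustrative assumptions, not a standard:

```typescript
// Confidence, not certainty: the same model score renders as hedged copy.
// Thresholds are illustrative; tune them against real calibration data.

function hedge(claim: string, confidence: number): string {
  if (confidence >= 0.95) return claim + " (high confidence)";
  if (confidence >= 0.7) return "This is likely " + claim.toLowerCase();
  if (confidence >= 0.4) return "This may be " + claim.toLowerCase();
  // Below the floor, decline to assert anything at all
  return "Not enough signal to answer; showing related sources instead.";
}
```

Centralizing the wording also means copy, thresholds, and legal review happen in one place instead of per feature.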
4. Give Users an Escape Hatch
Always provide a way to override, correct, or bypass the AI. "None of these" options. Manual fallbacks. The ability to edit AI outputs, not just accept or reject them. NN/g emphasizes: "An essential element in getting full value from AI is to include a heavy dose of human judgment."
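A sketch of such an escape hatch with mandatory reason capture; the types and names are hypothetical, not drawn from any specific product:

```typescript
// Escape hatch: the user can always override the AI, but the "why" is
// required so the team can audit where the AI loses user trust.

type Decision =
  | { kind: "accepted"; value: string }
  | { kind: "overridden"; value: string; reason: string };

function acceptSuggestion(aiValue: string): Decision {
  return { kind: "accepted", value: aiValue };
}

function overrideSuggestion(userValue: string, reason: string): Decision {
  // Overriding is always allowed; an empty reason is the only invalid input
  if (reason.trim() === "") throw new Error("An override reason is required");
  return { kind: "overridden", value: userValue, reason };
}
```

The discriminated union is the useful part: downstream code is forced to handle the override case explicitly rather than silently treating AI output as final.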
5. Ship Incrementally, Not Monolithically
Notion's progressive feature discovery. Canva's tool-by-tool rollout. Contrast this with Volkswagen's $7.5B attempt to build everything at once. AI features succeed when they're integrated modularly and iterated on — not launched as a platform rewrite.
In my PR→PO Copilot prototype, every one of these five principles came into play. The AI was embedded into an existing procurement workflow (principle 1), errors showed the specific rule that was violated with a suggested fix (principle 2), all recommendations displayed source chips showing data origin (principle 3), users could override any AI suggestion with mandatory reason capture (principle 4), and I built it feature-by-feature over three design iterations, not as a monolithic launch (principle 5). The hardest lesson: principle 4 (escape hatches) is the one most teams skip — and it's the one that builds the most trust.
What This Means for Designers in 2026
The a16z report ends with a surprising note of optimism for builders: "We've never been more excited about what startups can build in consumer AI." The big labs are focused on models and features within existing products — leaving massive white space for purpose-built experiences.
For designers, the opportunity is clear:
- The chat interface is a ceiling, not a floor. The most successful new products (Perplexity, NotebookLM, Grok) built dedicated experiences instead of copying ChatGPT.
- Trust is the product. Every failure above traces back to a trust violation — overconfidence, opacity, lack of control. Designing for trust isn't a nice-to-have; it's the product.
- Familiar patterns > novel paradigms. Notion's slash commands. Canva's editor. The wins came from embedding AI into patterns users already understood, not from inventing new interaction models.
- Test with real AI, not mocks. You can't predict AI behavior from specs. The unpredictability is the design challenge. Prototype with live models.
"MVPs fail when AI is treated as a shortcut, and succeed when AI is engineered as an internal capability." — NineTwoThree
Sources
Every claim in this article is traceable. Here are the primary sources:
- State of Consumer AI 2025: Product Hits, Misses, and What's Next — Andreessen Horowitz (a16z), Dec 2025
- The Biggest AI Fails of 2025: Lessons from Billions in Losses — NineTwoThree
- Why AI Projects Fail (95% in 2025) — TimSpark (citing MIT, RAND, S&P Global)
- The Real Reason AI Projects Fail (and It's Not the Model) — Vizzuality
- 10 Common (and Costly) Mistakes When Designing AI Products — UZER
- AI Research Collection — Nielsen Norman Group
- A Research Agenda for Generative AI in UX — Nielsen Norman Group
- How Notion Uses Visual Design Principles for AI Feature Adoption — Design Bootcamp
- Canva Goes All In on AI with Magic Studio — Fast Company
- How Google's AI Is Losing Touch with Reality — UNSW
- 9 AI Fails (and How They Could Have Been Prevented) — Ataccama
Interested in how these principles apply to enterprise AI?
See my PR→PO Copilot case study — a working prototype where every design decision was guided by the transparency, trust, and control principles discussed above.
View Case Study