
Designing Human-in-the-Loop AI for Supply Chain

Shortage Replenishment Agent is an AI-assisted procurement prototype that helps supply planners resolve material shortages through evidence-based supplier recommendations, explicit trade-off comparisons, and mandatory override justifications. It extends the Purchase Order MVP into agentic territory.

Full flow: Shortage detection → Strategy selection → Supplier comparison → PO creation

Design Exploration · 4-Layer Architecture · 3 Trust Patterns · Full Working Prototype
  • Project — Design exploration prototype
  • Timeline — 2026 (Design exploration)
  • Platform — Web (Next.js)
  • My role — UX Design + Prototyping
Context

Extending PO MVP into agentic territory

The Purchase Order MVP established a foundation for transparent procurement workflows. This project extends that work into agentic AI—where an intelligent system recommends suppliers and quantities, but humans remain in control of final decisions.

How this differs from PR-PO Copilot

PR-PO Copilot

Conversational AI — Users describe needs in natural language, copilot interprets intent and guides form completion. Focus on understanding what users want.

Shortage Agent

Recommendation system — System detects problems, ranks solutions, users confirm or override. Focus on helping users choose between options.

Both explore AI transparency, but through different interaction paradigms: one is generative (creates from intent), the other is evaluative (ranks existing options).

Design challenge: How do you build trust in AI recommendations for high-stakes supply chain decisions where errors can halt production lines?

The prototype explores a Calibrated Trust Framework:

  • Reliance — Can users depend on the AI's recommendations?
  • Correctness — Are recommendations based on verified data?
  • Robustness — Does the system handle edge cases gracefully?
  • Governance — Is every decision auditable and explainable?

Research Approach

This exploration draws on:

  • Domain immersion — Studying supply chain workflows from the Purchase Order MVP project
  • Pattern research — Reviewing AI trust frameworks from Google PAIR, Microsoft HAX, and IBM Design for AI
  • Analogous experiences — Analyzing how high-stakes recommendation systems (medical, financial) handle human-AI collaboration

Target Outcomes (Hypothetical)

Note: These are hypothetical projections based on industry research, not measured production data. The purpose is to illustrate the types of outcomes this design aims to enable.

  • 40% — Faster shortage resolution
  • 95% — Audit compliance rate
  • ↓60% — Unjustified overrides

How these estimates are derived

  • 40% faster resolution: Based on reducing manual supplier research time. Industry benchmarks suggest procurement teams spend 2-4 hours per shortage event on supplier evaluation; automated ranking + evidence chips could compress this to under 1 hour.
  • 95% audit compliance: Every action is logged with actor, timestamp, and rule IDs. The override-with-reason pattern ensures even exceptions are documented. Target based on similar audit trail implementations.
  • 60% fewer unjustified overrides: When users see evidence chips showing data sources, they're more likely to trust recommendations. Studies on transparent AI systems show reduced override rates when reasoning is visible.

Scope note: This is a design exploration prototype using seeded data and deterministic rules. No real ERP systems or production data are connected.

System Design

4-Layer Agent Architecture

Before designing the UI, I mapped out how different layers coordinate in this workflow. The principle: "Make the agent's reasoning visible and overridable."

  • Intent Layer — Strategy Selection: user defines the goal (Speed / Cost / Reliability)
  • Tools Layer — Deterministic Functions: supplier scoring · contract lookup · price calculation
  • Policy Layer — Business Rules: preferred suppliers · contract validation · override requirements
  • UX Layer — Trust Patterns: evidence chips · trade-off cards · override modal · audit trail

This layered architecture ensures the AI handles scoring and ranking while humans control strategy selection and final decisions—making the workflow auditable and enterprise-safe.
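
The Tools layer is deterministic by design: the same inputs always produce the same outputs, which keeps rankings reproducible and auditable. A minimal sketch of contract lookup and price calculation, assuming illustrative field names and a simple quantity × unit-price formula:

```typescript
// Sketch of two Tools-layer functions. Data shapes are illustrative,
// not the prototype's actual schema.
interface Contract {
  id: string;
  unitPrice: number;
  validUntil: string; // ISO date; lexicographic compare works for ISO strings
}

// Contract lookup: return the supplier's contract only if still valid today.
function lookupContract(
  contracts: Map<string, Contract>,
  supplier: string,
  today: string
): Contract | undefined {
  const c = contracts.get(supplier);
  return c && c.validUntil >= today ? c : undefined;
}

// Price calculation: quantity times contracted unit price.
function calcPrice(contract: Contract, qty: number): number {
  return qty * contract.unitPrice;
}
```

Because these functions are pure, every number a user sees can be recomputed later from the same inputs during an audit.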

System flowchart: Detection → Resolution workflow

This diagram maps the complete workflow, showing how the system handles shortage detection, strategy selection, and human decision points.

Shortage Agent system flowchart showing detection phase, strategy selection, supplier ranking, human decision points, and PO execution with override handling.
System Design

AI-Human Handoff Moments

A core design principle: AI handles computation, humans handle judgment. The system defines explicit moments where control transfers between agent and user.

AGENT: Detect shortage — Monitor inventory, identify gaps, calculate severity
↓ HANDOFF
HUMAN: Select strategy — Choose optimization goal (Speed / Cost / Reliability)
↓ HANDOFF
AGENT: Rank suppliers — Apply weights, attach evidence, calculate scores
↓ HANDOFF
HUMAN: Confirm or override — Review options, select supplier (justify if non-recommended)
↓ HANDOFF
AGENT: Execute PO — Generate document, send to supplier, log to audit trail

Handoff Design Principles

  • Strategy is human-owned — AI never decides the optimization goal
  • Recommendations are suggestions — Always overridable with documented reason
  • Execution requires confirmation — No auto-send for high-stakes actions
  • Every handoff is logged — Full audit trail for compliance
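
The handoff sequence above can be sketched as a small state machine, where a handoff is simply an ownership change between consecutive steps. Step and owner names are illustrative, not the prototype's actual identifiers:

```typescript
// Sketch of the agent/human handoff sequence as a linear state machine.
type Step =
  | "DETECT_SHORTAGE"     // agent
  | "SELECT_STRATEGY"     // human
  | "RANK_SUPPLIERS"      // agent
  | "CONFIRM_OR_OVERRIDE" // human
  | "EXECUTE_PO";         // agent

// Who owns each step; strategy and confirmation stay human-owned.
const OWNER: Record<Step, "AGENT" | "HUMAN"> = {
  DETECT_SHORTAGE: "AGENT",
  SELECT_STRATEGY: "HUMAN",
  RANK_SUPPLIERS: "AGENT",
  CONFIRM_OR_OVERRIDE: "HUMAN",
  EXECUTE_PO: "AGENT",
};

// Linear flow; the terminal step has no successor.
const NEXT: Partial<Record<Step, Step>> = {
  DETECT_SHORTAGE: "SELECT_STRATEGY",
  SELECT_STRATEGY: "RANK_SUPPLIERS",
  RANK_SUPPLIERS: "CONFIRM_OR_OVERRIDE",
  CONFIRM_OR_OVERRIDE: "EXECUTE_PO",
};

// A handoff occurs whenever ownership changes between a step and its successor.
function isHandoff(step: Step): boolean {
  const next = NEXT[step];
  return next !== undefined && OWNER[step] !== OWNER[next];
}
```

Making ownership explicit in the model is what allows "every handoff is logged" to be enforced mechanically rather than by convention.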
Service Design

Service Blueprint

Mapping the full service experience across frontstage interactions, backstage AI operations, and supporting systems.

|                 | Shortage Alert                  | Strategy Selection                    | Supplier Review                  | PO Creation                    |
| --------------- | ------------------------------- | ------------------------------------- | -------------------------------- | ------------------------------ |
| User Actions    | View inbox, click shortage      | Choose Speed/Cost/Reliability         | Compare options, select supplier | Confirm & send PO              |
| Frontstage UI   | Severity badge, material details | Strategy cards with weights          | Option cards, evidence chips     | PO preview, confirmation modal |
| Agent Actions   | Detect gap, calculate severity  | Apply weights, rank, attach evidence  | —                                | Generate PO, send, log audit   |
| Support Systems | ERP inventory feed              | Policy rules engine                   | Contracts DB, historical data    | Supplier portal, audit log     |

The blueprint reveals where AI acts invisibly (backstage) versus where it surfaces evidence (frontstage) — ensuring transparency without cognitive overload.

The Problem

AI recommendations without evidence erode trust

When AI systems simply output "Recommended: Supplier X", users can't tell:

  • Why this supplier? — What criteria were used?
  • What's being traded off? — Is it faster but more expensive?
  • Can I override it? — What happens if I choose differently?

"If I pick a different supplier than what the AI recommends, will I get blamed if something goes wrong?"

— Supply Planner concern

"I need to justify my supplier choices to procurement. How do I explain that I overrode the AI's suggestion?"

— Governance concern

Solution 1

Strategy-Driven Ranking

Instead of a black-box recommendation, users choose their optimization goal first:

  • Fastest Delivery — ETA 60% · Price 20% · Reliability 20%
  • Lowest Cost — Price 60% · ETA 20% · Reliability 20%
  • Most Reliable — Reliability 60% · ETA 20% · Price 20%

The AI then ranks suppliers according to the selected strategy, with weights shown transparently. Users understand exactly how the ranking was calculated.
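
The weighted ranking above can be sketched as follows. The weights mirror the three presets; the supplier fields and 0-100 scales are assumptions for illustration, not the prototype's actual scoring model:

```typescript
// Sketch of strategy-weighted supplier scoring and ranking.
type Strategy = "SPEED" | "COST" | "RELIABILITY";

// Preset weights per strategy; each row sums to 1.0.
const WEIGHTS: Record<Strategy, { eta: number; price: number; reliability: number }> = {
  SPEED:       { eta: 0.6, price: 0.2, reliability: 0.2 },
  COST:        { eta: 0.2, price: 0.6, reliability: 0.2 },
  RELIABILITY: { eta: 0.2, price: 0.2, reliability: 0.6 },
};

interface Supplier {
  name: string;
  etaScore: number;         // 0-100, higher = faster delivery
  priceScore: number;       // 0-100, higher = cheaper
  reliabilityScore: number; // 0-100, e.g. on-time delivery rate
}

// Composite 0-100 score under the chosen strategy.
function score(s: Supplier, strategy: Strategy): number {
  const w = WEIGHTS[strategy];
  return s.etaScore * w.eta + s.priceScore * w.price + s.reliabilityScore * w.reliability;
}

// Rank suppliers descending by composite score.
function rank(suppliers: Supplier[], strategy: Strategy): Supplier[] {
  return [...suppliers].sort((a, b) => score(b, strategy) - score(a, strategy));
}
```

Because the weights are fixed constants rather than learned parameters, the same inputs always yield the same ranking, and the formula can be printed next to the result.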

Solution 2

Evidence Chips

Each supplier option shows evidence chips that explain the data sources:

  • CONTRACT — Pricing from active contract (Contract ID shown)
  • HISTORICAL — On-time delivery rate from past orders
  • CALCULATED — Composite score based on strategy weights

This transforms "AI magic" into verifiable, auditable recommendations.
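
One possible shape for an evidence chip, using the three source kinds listed above (field names are illustrative):

```typescript
// Sketch of an evidence chip: every claim names its data source.
type EvidenceSource = "CONTRACT" | "HISTORICAL" | "CALCULATED";

interface EvidenceChip {
  source: EvidenceSource;
  label: string; // human-readable summary of the evidence
  ref?: string;  // e.g. a contract ID, when one exists
}

// Render a chip as "SOURCE: label (ref)" so the provenance is always visible.
function formatChip(chip: EvidenceChip): string {
  return chip.ref
    ? `${chip.source}: ${chip.label} (${chip.ref})`
    : `${chip.source}: ${chip.label}`;
}
```

The optional `ref` field is the hook for auditability: a CONTRACT chip carries its contract ID, so a reviewer can trace the price back to the signed document.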

Solution 3

Override Modal with Required Reason

When users select a non-recommended supplier, they must provide a justification:

Override Confirmation

You selected ValueSource Co. instead of the recommended FastParts Inc.

Reason required: "Existing relationship with supplier, need to maintain volume commitment"

Why this matters

  • Accountability — Overrides are logged with timestamp and user
  • Governance — Auditors can review override patterns
  • Learning — Frequent overrides can inform AI model improvements
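
The override gate above reduces to a single validation rule, sketched here with illustrative field names:

```typescript
// Sketch of the override gate: a non-recommended selection may only
// proceed when a non-empty justification is supplied.
interface Selection {
  chosenSupplier: string;
  recommendedSupplier: string;
  overrideReason?: string;
}

// Returns true if the selection may proceed to PO creation.
function canProceed(sel: Selection): boolean {
  const isOverride = sel.chosenSupplier !== sel.recommendedSupplier;
  if (!isOverride) return true; // recommended path: no reason required
  // Override path: reason is mandatory and must not be blank.
  return (sel.overrideReason ?? "").trim().length > 0;
}
```

Keeping the rule this small makes it easy to enforce in the UI (disable the confirm button) and again on the server before anything is logged.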
Solution 4

Full Audit Trail

Every action in the workflow is logged:

  • SHORTAGE_CREATED — System detected shortage event
  • STRATEGY_SELECTED — User chose optimization goal
  • OPTION_SELECTED — User selected supplier
  • OPTION_OVERRIDDEN — User overrode recommendation (with reason)
  • PO_SENT — Purchase order sent to supplier

Each log entry includes: Actor, Timestamp, Entity ID, and Full Details.

The audit timeline transforms "what did the AI do?" into a complete decision history that can be reviewed for compliance, training, or dispute resolution.
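
A minimal sketch of an audit entry and an append-only log, using the event names above; the field names are assumptions for illustration:

```typescript
// Sketch of an append-only audit trail with actor attribution.
type AuditEvent =
  | "SHORTAGE_CREATED"
  | "STRATEGY_SELECTED"
  | "OPTION_SELECTED"
  | "OPTION_OVERRIDDEN"
  | "PO_SENT";

interface AuditEntry {
  event: AuditEvent;
  actor: string;                    // who performed the action
  timestamp: string;                // ISO 8601
  entityId: string;                 // shortage or PO identifier
  details: Record<string, unknown>; // full context, e.g. the override reason
}

// Append-only: entries are never mutated or removed, only added in order.
function log(trail: AuditEntry[], entry: AuditEntry): AuditEntry[] {
  return [...trail, entry];
}
```

Returning a new array instead of mutating in place is a deliberate choice: an immutable trail cannot be quietly edited after the fact, which is the property compliance reviewers care about.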

Solution 5

Contextual Agent Actions

Instead of a persistent panel, transparency now lives inside the task context. Each step surfaces a compact “Agent actions” disclosure that expands only when users need the detail.

Agent actions: Ranking suppliers (Working)
  • Apply SPEED weights (ETA 60%, Price 20%, Reliability 20%)
  • Attach evidence chips (Contract ID, Quote)
  • Check override requirement

In the prototype, this stays collapsed by default to keep the UI calm while still discoverable.

Solution 6

Lightweight activity states

Live status is embedded next to the moment users wait: syncing the inbox, ranking options, or sending the PO. These cues feel closer to native app behavior than a separate panel.

Live status cues

Syncing inventory · Ranking options · Checking policy gates · Sending PO…

These micro-interactions clarify wait time without interrupting decision flow.

Solution 7

Unified Navigation System

Following enterprise app conventions, the prototype features a consistent shell with:

Top Header

App branding, notifications, and persona switcher always visible. Provides context on "who is acting" at any moment.

Left Sidebar

Icon-based navigation to Dashboard, Triage, PO Drafts, and Audit Log. Active state clearly highlighted.

This matches the navigation pattern used in the PR-PO Copilot prototype for consistency across the portfolio.

Solution 8

Persona Switching for Demo

For prototype demos, the header includes a persona switcher that allows switching between different user roles:

  • Sarah Chen — Supply Planner
  • Michael Torres — Buyer
  • Jessica Park — Procurement Manager

This pattern enables showing how the audit trail captures actor attribution—every action is logged with who performed it, supporting enterprise governance requirements.

Content Design

Agent Voice & Dialogue Principles

Even in a non-chat interface, the agent communicates through labels, status messages, and microcopy. Defining a consistent voice builds trust.

Agent Persona

Name: Shortage Agent

Tone: Professional, direct, evidence-focused

Personality: A meticulous analyst who always shows their work — never vague, never presumptuous

Voice Guidelines

| Principle          | Don't                            | Do                                                     |
| ------------------ | -------------------------------- | ------------------------------------------------------ |
| Be specific        | "AI recommended this"            | "Recommended based on: Contract C-2024-001 pricing"    |
| Show confidence    | "This might be the best option"  | "Score: 92/100 (Speed 60%, Cost 20%, Reliability 20%)" |
| Respect autonomy   | "You should select FastParts"    | "FastParts ranks highest for your Speed strategy"      |
| Acknowledge limits | "Best supplier found"            | "3 suppliers matched criteria. Showing top options."   |

Status Message Examples

"Ranking 3 suppliers using SPEED weights..."
"Checking contract validity for FastParts Inc."
"Override detected — reason required for audit"
Design Patterns

Patterns used in this AI system

Patterns are applied in-context so AI behavior stays inspectable without overwhelming the user.

  • Progressive disclosure
  • Contextual agent actions
  • Evidence-first recommendations
  • Trade-off comparison
  • Override with required reason
  • Human confirmation gate
  • Audit trail with actor attribution
  • Deterministic tools + policy guardrails
  • Status cues for waiting
  • Unified navigation shell
  • Persona/role switching
Trust Patterns

Trust Scorecard

Measuring trustworthiness isn't just about features—it's about how each design choice addresses specific trust dimensions:

| Trust Dimension | Pattern              | Implementation                                                                   |
| --------------- | -------------------- | -------------------------------------------------------------------------------- |
| Transparency    | Evidence Chips       | Every supplier option shows data source (Contract ID, Historical data, Calculated) |
| Control         | Strategy Selection   | Users choose optimization goal (Speed/Cost/Reliability) before AI ranks          |
| Accountability  | Override with Reason | Selecting non-recommended option requires documented justification               |
| Governance      | Audit Trail          | Every action logged with actor, timestamp, entity ID, and details                |
| Explainability  | Strategy Weights     | Ranking formula shown explicitly (e.g., "Speed 60%, Price 20%, Reliability 20%") |

Why this matters

These patterns aren't just "nice to have"—they're the difference between users placing calibrated trust in recommendations and blindly accepting (or ignoring) them. In high-stakes supply chain decisions, calibrated trust prevents both over-reliance and under-utilization of AI capabilities.

Reflection

What I learned

Trust is earned through transparency

Users don't need to understand the AI's algorithm—they need to see the evidence behind each recommendation and have the power to override when their domain expertise says otherwise.

Human-in-the-loop ≠ Human bottleneck

The goal isn't to make humans approve everything—it's to put humans in control of strategy and exceptions while letting AI handle the repetitive scoring and ranking.

Governance is a UX problem

Audit trails and override logging aren't just for compliance—they protect users by documenting their decision rationale when things go wrong (or right).

Beyond decision support: A maturity path

This prototype represents Level 2 of an AI maturity model:

  • Level 1: Visibility — AI surfaces data (dashboards, alerts)
  • Level 2: Recommendation — AI suggests, human decides (this prototype)
  • Level 3: Delegation — AI acts within guardrails, human monitors
  • Level 4: Autonomy — AI handles routine cases, human handles exceptions

The trust patterns established here — evidence, override, audit — scale across all levels.

Challenges & what I'd do differently

Designing without production data

Using seeded/mock data limits validation. Real supply chain data has messy edge cases—partial shipments, multi-source orders, quality variations—that I couldn't fully model. Next time: Partner with a procurement team earlier, even for read-only data access.

Strategy selection complexity

Three strategies (Speed/Cost/Reliability) is simple to explain but may be too rigid. Real procurement often involves multi-objective optimization. If I revisited: Explore customizable weight sliders instead of preset strategies.

Accessibility & responsive considerations

Accessibility

  • Confidence bars use pattern fills alongside color
  • Strategy cards have clear focus states
  • Override modals respect focus trapping
  • Screen reader-friendly table structures

Responsive Design

  • Dashboard adapts to tablet/desktop viewports
  • Recommendation cards stack on smaller screens
  • Critical actions remain accessible on mobile
  • Data tables scroll horizontally when needed

Building recommendation systems is about more than accuracy—it's about giving users the confidence to act on recommendations, even when they can't audit every calculation themselves.