by Kumar Srivastava

Contributor

Why human-in-the-loop is the only path to trustworthy AI in CPG R&D

Opinion

Sep 17, 2025 · 9 mins

Artificial Intelligence, Manufacturing Industry, Regulation

The future of food reformulation and the crisis no one talks about.

Walk down any supermarket aisle today, and you’ll see the symptoms of a reformulation crisis. Sugar taxes, sodium reduction targets, sustainability mandates and shifting consumer preferences are rewriting the product landscape. A ketchup that once passed compliance tests in 2019 may now face red flags under updated UK salt guidelines. A baby formula that seemed competitive last year could suddenly be non-compliant if EU fortification rules change.

Reformulation used to be a once-in-a-decade exercise. Today, it’s a constant drumbeat. Brands are under pressure to retool entire product portfolios every 12–18 months — not just for compliance, but to chase consumer trends (low-sugar, plant-based, sustainable packaging) while still hitting cost and margin targets.

And yet, the reformulation process in most consumer packaged goods (CPG) organizations is still a patchwork of Excel sheets, siloed lab notebooks and institutional memory. It’s slow, error-prone and heavily dependent on the intuition of a few veteran formulators. When you combine this with volatile ingredient supply chains and shifting regulatory regimes, the result is predictable: late launches, failed pilots and missed revenue opportunities.

It’s no wonder McKinsey reports that over 70% of new product launches in CPG fail to meet their revenue targets. Reformulation should be a competitive advantage. Too often, it’s a graveyard of wasted R&D spend.

The AI temptation (and why it’s dangerous without humans in the loop)

It’s no surprise that CPG companies are rushing to bring artificial intelligence (AI) into the reformulation process. Active learning and optimization, generative models and predictive analytics promise faster iteration, smarter trade-offs and data-driven confidence.

But here’s the inconvenient truth: AI on its own cannot guarantee that a reformulated product will work in the real world.

Left unchecked, AI systems will:

Propose formulations that violate FDA or EFSA regulations (like exceeding fortification limits for vitamins or misclassifying allergen thresholds).
Suggest ingredients that are unavailable or cost-prohibitive in current supply chains.
Optimize for lab-scale outcomes that collapse when scaled up on a factory homogenizer or UHT line.
Hallucinate solutions that look elegant on paper but fail consumer sensory panels.

This is not a hypothetical risk. In 2023, Nestlé announced it would reformulate over 100 products to reduce sodium and sugar across European markets. Despite their sophisticated R&D machine, reports from FoodNavigator noted that pilot-scale failures delayed launches for multiple SKUs because plant equipment couldn’t handle the new recipes at throughput.

The lesson is clear: AI can be a powerful tool, but without human-in-the-loop (HITL) design, it will make costly, real-world mistakes.

What HITL really means in formulation

Human-in-the-loop is not just a buzzword. It is the only mechanism that ensures AI-driven formulation platforms are trustworthy, compliant and factory-ready.

At its core, HITL design acknowledges that AI excels at exploring vast design spaces and finding optimal trade-offs, but humans must:

Define the guardrails (legal, technical, sensory, commercial).
Validate the data and calibrate the models.
Interpret the trade-offs in context of brand, consumer and factory realities.
Approve go/no-go decisions at each stage.

Think of it as the marriage of active learning & optimization and human governance: the AI proposes, the human disposes.

The 9 HITL checkpoints

Through my work with some of the largest CPG companies globally, I’ve seen where reformulation projects succeed and where they fail. The difference almost always comes down to how deliberately the human checkpoints are designed.

Here are the nine HITL stages that matter most:

Project intake and goal definition. Success requires clear, measurable objectives (for example: maximize stability, minimize cost, keep pH between 4.15 and 6.7).
Design-space and constraints sign-off. Regulatory and process engineers must confirm the AI cannot propose infeasible or unlawful solutions.
Data validation and QC. Every lab measurement must be normalized, traceable and verified before it feeds the model.
Model calibration and validation. Scientists must review uncertainty coverage to ensure the model isn’t overconfident.
Optimization proposal review. Humans evaluate if the AI’s candidate formulations make practical sense.
Experiment execution and results acceptance. Labs confirm that results are real and replicable.
Trade-off and Pareto selection. Cross-functional teams align on which trade-offs are acceptable.
Pilot and scale-up readiness gate. Manufacturing ensures formulations will run on actual equipment.
Regulatory and final release approval. Legal, regulatory and leadership confirm full compliance before launch.

Each checkpoint has a clear success criterion: reduce the risk of failure at the next stage.
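
The trade-off and Pareto selection checkpoint lends itself to a small illustration. The sketch below is not taken from any particular platform: the candidate names, the two objectives (cost and a predicted stability score) and all values are invented for the example. It reduces a candidate list to the non-dominated set, so the cross-functional team only debates genuine trade-offs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float        # hypothetical cost per kg; lower is better
    stability: float   # hypothetical predicted stability score; higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = a.cost <= b.cost and a.stability >= b.stability
    strictly_better = a.cost < b.cost or a.stability > b.stability
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only
candidates = [
    Candidate("A", cost=2.10, stability=0.80),
    Candidate("B", cost=1.90, stability=0.70),
    Candidate("C", cost=2.30, stability=0.75),  # costs more than A and is less stable
    Candidate("D", cost=2.50, stability=0.90),
]
front = pareto_front(candidates)
front_names = [c.name for c in front]
print(front_names)  # ['A', 'B', 'D']
```

Candidate C drops out because A beats it on both objectives. Choosing among A, B and D is a judgment call about cost versus stability, which is exactly the decision this checkpoint leaves to people.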

Ranking HITL by risk impact

Not all checkpoints carry the same weight. In practice, five of them matter the most for reducing catastrophic failure:

Regulatory and final release approval. Miss here, and you face recalls and lawsuits.
Design-space and constraints sign-off. If the AI searches outside real-world boundaries, every suggestion downstream is wasted.
Pilot and scale-up readiness. Lab wins mean nothing if the line can’t run the recipe.
Data validation and QC. Bad data equals bad models.
Model calibration and validation. An overconfident model is more dangerous than an inaccurate one.

These stages are where the cost of failure is measured in millions, not thousands. They deserve the most robust human oversight and UI/UX design.
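
One way a reviewer might test for overconfidence at the model calibration checkpoint is to compare the model's stated prediction intervals against held-out lab results. The sketch below assumes the model reports a mean and a standard deviation for each prediction and that errors are roughly normal. The measurements, the 95% nominal level and the tolerance are illustrative assumptions, not recommended values.

```python
def interval_coverage(y_true, y_mean, y_std, z=1.96):
    """Fraction of observations inside mean +/- z*std (z=1.96 is about 95% under a normal assumption)."""
    inside = [
        abs(y - m) <= z * s
        for y, m, s in zip(y_true, y_mean, y_std)
    ]
    return sum(inside) / len(inside)


def is_overconfident(coverage, nominal=0.95, tolerance=0.10):
    """Flag the model if empirical coverage falls well short of the nominal level."""
    return coverage < nominal - tolerance


# Hypothetical held-out viscosity measurements vs. model predictions
y_true = [10.2, 11.5, 9.8, 12.9, 10.0, 11.1, 9.1, 12.0]
y_mean = [10.0, 11.0, 10.0, 12.0, 10.1, 11.0, 10.0, 12.1]
y_std = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]

coverage = interval_coverage(y_true, y_mean, y_std)
print(coverage)                    # 0.625
print(is_overconfident(coverage))  # True
```

Here only five of eight measurements fall inside intervals that claim 95% coverage, so the model's uncertainty estimates are too narrow. A scientist would want to see that fixed before the optimizer is allowed to rely on them.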

Real-world evidence: Why HITL is non-negotiable

This isn’t just theory. Real-world evidence from across the CPG sector demonstrates the consequences of skipping HITL:

Baby food reformulation under scrutiny (UK, 2025). The UK government announced new salt and sugar reduction guidelines for foods targeting children under 36 months. Importantly, sweeteners are banned. Without human oversight, an AI optimizer could easily propose a stevia-based reformulation that would fail regulatory review and damage brand trust (UK Government – Plan for Change).
FDA warning letters (US, 2022–2024). The FDA has issued multiple warning letters to brands making unverified “low sugar” or “high protein” claims. These often stem from data quality issues or misapplied nutrient calculations — exactly the kind of error that HITL data validation prevents.
Unilever sustainable packaging (2023). When Unilever tried to switch several lines to recyclable mono-material packaging, they faced equipment compatibility issues that required costly plant retrofits. It wasn’t the AI or material science that failed — it was the lack of HITL at the scale-up readiness gate (Packaging Europe).

The pattern is obvious: when humans fail to set guardrails, validate data or check scale-up feasibility, the AI becomes untrustworthy.

Designing HITL for speed and quality

Critics will ask: Doesn’t human-in-the-loop slow things down? The opposite is true. Done right, HITL accelerates reformulation because it reduces late-stage failure.

The design principles are straightforward:

Make guardrails code, not guidelines. Regulatory, process and supply constraints should be encoded as executable rules, not buried in PDFs.
Automate the easy checks, elevate the hard ones. Unit normalization should be automatic; trade-off selection should be a cross-functional discussion.
Design UI/UX for decision gates. Every checkpoint should have a clear decision card: ✅ Pass, ⚠️ Amber (needs mitigation), ❌ Fail.
Record the rationale. Every override, every sign-off should be logged for audit and learning.
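
The principles above can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the rule names, field names, thresholds, formulation ID and reviewer are all hypothetical, and the numeric limits are placeholders rather than real regulatory values. It shows constraints as executable rules, a Pass/Amber/Fail outcome for the decision card, and a logged rationale for each sign-off.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Decision(Enum):
    PASS = "pass"
    AMBER = "amber"   # needs mitigation
    FAIL = "fail"


@dataclass(frozen=True)
class Rule:
    name: str
    field: str
    hard_max: float   # above this value: fail
    warn_max: float   # above this value but within hard_max: amber


# Placeholder limits; real values must come from regulatory and process experts
RULES = [
    Rule("sodium limit", "sodium_mg_per_100g", hard_max=600.0, warn_max=500.0),
    Rule("unit cost ceiling", "cost_per_kg", hard_max=3.00, warn_max=2.70),
]


def evaluate(formulation: dict) -> tuple[Decision, list[str]]:
    """Apply every rule and return the worst outcome plus the reasons for it."""
    decision, reasons = Decision.PASS, []
    for rule in RULES:
        value = formulation[rule.field]
        if value > rule.hard_max:
            decision = Decision.FAIL
            reasons.append(f"{rule.name}: {value} exceeds {rule.hard_max}")
        elif value > rule.warn_max:
            if decision is not Decision.FAIL:
                decision = Decision.AMBER
            reasons.append(f"{rule.name}: {value} is close to {rule.hard_max}")
    return decision, reasons


audit_log: list[dict] = []


def sign_off(formulation_id: str, decision: Decision, reasons: list[str],
             reviewer: str, rationale: str) -> None:
    """Record who decided what, and why, for audit and learning."""
    audit_log.append({
        "formulation_id": formulation_id,
        "decision": decision.value,
        "reasons": reasons,
        "reviewer": reviewer,
        "rationale": rationale,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })


candidate = {"sodium_mg_per_100g": 540.0, "cost_per_kg": 2.40}
decision, reasons = evaluate(candidate)
sign_off("F-001", decision, reasons, reviewer="regulatory lead",
         rationale="Accepted with a plan to cut sodium in the next iteration")
print(decision.value, reasons)
```

The automated rules handle the easy part. The amber outcome is what gets escalated to a person, and the log entry captures the reasoning behind the human decision rather than just the verdict.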

The best HITL platforms are not bureaucratic — they are lightweight, intuitive and transparent, allowing experts to focus only on the decisions that matter.

The competitive advantage of trustworthy AI

At the end of the day, CPG executives don’t care if the optimizer uses Gaussian Processes or TuRBO trust regions. They care about two questions:

Will this reformulation work at the plant on the first run?
Can I launch this product without regulatory, safety or brand risk?

Human-in-the-loop is how you answer “yes” to both.

Trustworthy AI in reformulation is not about speed alone. It’s about speed with certainty. That’s why HITL is not a compromise — it’s the competitive advantage.

The future is hybrid

The future of reformulation will not be humans versus AI. It will be humans plus AI, in a carefully orchestrated loop. AI will explore, optimize and accelerate. Humans will constrain, validate and approve.

The companies that master this hybrid model will ship reformulated products faster, safer and more profitably than their competitors. They will turn regulatory headwinds into market opportunities and consumer demand into sustainable growth.

The rest will drown in failed pilots, regulatory pushbacks and wasted launches.

The choice is clear. The only path to trustworthy reformulation AI is human-in-the-loop.

This article is published as part of the Foundry Expert Contributor Network.

Kumar Srivastava is a product-technology leader and AI strategist who helps CPG, finance and enterprise teams build GenAI systems that don’t just launch — they ship real value. He is currently the CTO at Turing Labs. He previously held that position at Hypersonix.
