Safety-Aware Multi-Agent Deep Reinforcement Learning for Adaptive Fault-Tolerant Control in Sensor-Lean Industrial Systems: Validation in Beverage CIP

Gonzalez-Potes, Apolinar; Felix-Cuadras, Ramon A.; Mena, Luis J.; Felix, Vanessa G.; Martinez-Pelaez, Rafael; Ostos, Rodolfo; Velarde-Alvarado, Pablo; Ochoa-Brust, Alberto

Abstract

Fault-tolerant control in safety-critical industrial systems demands adaptive responses to equipment degradation, parameter drift, and sensor failures while maintaining strict operational constraints. Traditional model-based controllers struggle under these conditions, requiring extensive retuning and dense instrumentation. Recent safe multi-agent reinforcement learning (MARL) frameworks with control barrier functions (CBFs) achieve real-time constraint satisfaction in robotics and power systems, yet they assume comprehensive state observability, which is incompatible with sensor-hostile industrial environments where instrumentation degradation and contamination risks dominate design constraints. This work presents a safety-aware multi-agent deep reinforcement learning framework for adaptive fault-tolerant control in sensor-lean industrial environments, achieving formal safety through learned implicit barriers under partial observability. The framework integrates four synergistic mechanisms: (1) a multi-layer safety architecture combining constrained action projection, prioritized experience replay, conservative training margins, and curriculum-embedded verification, achieving zero constraint violations; (2) multi-agent coordination via decentralized execution with learned complementary policies; (3) curriculum-driven sim-to-real transfer through progressive four-stage learning, achieving 85-92% performance retention without fine-tuning; and (4) offline extended Kalman filter validation enabling 70% instrumentation reduction (91-96% reconstruction accuracy) for regulatory auditing without real-time estimation dependencies.
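As a rough illustration of what constrained action projection with a conservative margin can look like, the sketch below clips each agent's proposed flow setpoint into a safe interval. It is not the authors' implementation: the function name, the margin value, and the upper flow limit are hypothetical, and only the 1.5 L/s lower bound comes from the abstract.

```python
import numpy as np

Q_MIN_L_S = 1.5    # hard minimum flow constraint stated in the abstract (L/s)
MARGIN_L_S = 0.1   # hypothetical conservative margin, not from the source
Q_MAX_L_S = 4.0    # hypothetical actuator upper limit, not from the source


def project_flow_setpoints(raw_setpoints):
    """Clip proposed flow setpoints (L/s) into the assumed safe interval.

    The lower bound is the hard constraint plus a margin, so a projected
    setpoint never sits exactly on the constraint boundary.
    """
    raw = np.asarray(raw_setpoints, dtype=float)
    return np.clip(raw, Q_MIN_L_S + MARGIN_L_S, Q_MAX_L_S)


if __name__ == "__main__":
    # Three agents propose setpoints; the first and last are out of range.
    print(project_flow_setpoints([1.0, 2.5, 9.0]))
```

Projection of this kind acts only on the commanded value; whether the plant actually holds the resulting flow depends on the hydraulics, which this sketch does not model.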
Validated through sustained deployment in commercial beverage manufacturing clean-in-place (CIP) systems, a representative safety-critical testbed with hard flow constraints (>= 1.5 L/s), harsh chemical environments, and zero-tolerance contamination requirements, the framework demonstrates superior control precision (coefficient of variation: 2.9-5.3% versus the 10% industrial standard) across three hydraulic configurations spanning a complexity range of 2.1-8.2 out of 10. Comprehensive validation comprising 37+ controlled stress-test campaigns and hundreds of production cycles (accumulated over 6 months) confirms zero safety violations, high reproducibility (CV variation < 0.3% across replicates), predictable complexity-performance scaling (R² = 0.89), and zero-retuning cross-topology transferability. The system has operated autonomously in active production for over 6 months, establishing a reproducible methodology for safe MARL deployment in partially observable, sensor-hostile manufacturing environments where analytical CBF approaches are structurally infeasible.
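The precision figures above are coefficients of variation, that is, the standard deviation of a signal divided by its mean, expressed as a percentage. A minimal sketch of that calculation follows; the function name and the sample flow readings are illustrative assumptions, not data from the study, and the use of the sample (n-1) standard deviation is also an assumption since the abstract does not say which estimator was used.

```python
import statistics


def coefficient_of_variation(samples):
    """Return the coefficient of variation of `samples` as a percentage.

    Uses the sample standard deviation (n-1 denominator); this choice
    is an assumption, not something stated in the abstract.
    """
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return 100.0 * statistics.stdev(samples) / mean


if __name__ == "__main__":
    # Hypothetical flow readings in L/s, for illustration only.
    flow_readings = [1.9, 2.0, 2.1]
    print(f"CV = {coefficient_of_variation(flow_readings):.1f}%")
```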

More information

Title according to WOS: ID WOS:001672039600001 Not found in local WOS DB
Journal title: TECHNOLOGIES
Volume: 14
Issue: 1
Publisher: MDPI
Publication date: 2026
DOI: 10.3390/technologies14010044

Notes: ISI