Soft MPCritic: Amortized Model Predictive Value Iteration
arxiv_cs_lg·Apr 3, 2026, 08:00 PM·8

Summary

soft MPCritic is a framework that combines reinforcement learning (RL) and model predictive control (MPC) and is designed to reduce the computational cost of using the two together. It learns in a soft value space and uses sample-based planning, via model predictive path integral control (MPPI), for both online control and value target generation. A terminal Q-function is trained with fitted value iteration, and an amortized warm-start strategy makes the method computationally practical while maintaining solution quality. Planning is scenario-based, using an ensemble of dynamics models. Together, these components allow soft MPCritic to learn effectively through robust, short-horizon planning on complex control tasks, and the authors present it as a scalable blueprint for synthesizing MPC policies.
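
The planning step described above can be sketched as a single MPPI iteration with a terminal Q-function. This is a minimal illustration on a scalar system, not the paper's implementation: the function name `mppi_plan`, its parameters, the Gaussian noise model, and the convention that the last action in the sequence is scored by the terminal Q-function are all assumptions made for the example.

```python
import math
import random


def mppi_plan(state, mean, dynamics, reward, terminal_q,
              num_samples=64, sigma=0.5, temperature=1.0, gamma=0.99, rng=None):
    """One MPPI iteration on a scalar system (illustrative only).

    Returns (new_mean, soft_value): the reweighted open-loop action sequence
    and a log-mean-exp estimate of the soft value at `state`.
    """
    rng = rng or random.Random(0)
    horizon = len(mean)
    samples, returns = [], []
    for _ in range(num_samples):
        # Perturb the nominal action sequence with Gaussian noise.
        actions = [m + rng.gauss(0.0, sigma) for m in mean]
        s, total = state, 0.0
        # Roll all but the last action through the model.
        for t, a in enumerate(actions[:-1]):
            total += (gamma ** t) * reward(s, a)
            s = dynamics(s, a)
        # Assumption: the terminal Q-function scores the last state-action
        # pair and stands in for everything beyond the short horizon.
        total += (gamma ** (horizon - 1)) * terminal_q(s, actions[-1])
        samples.append(actions)
        returns.append(total)

    # Softmax weights over returns, shifted by the max for numerical stability.
    best = max(returns)
    exps = [math.exp((r - best) / temperature) for r in returns]
    z = sum(exps)
    weights = [e / z for e in exps]

    # Weighted average of the sampled sequences gives the updated plan.
    new_mean = [sum(w * acts[t] for w, acts in zip(weights, samples))
                for t in range(horizon)]
    # Soft value: temperature * log(mean(exp(return / temperature))).
    soft_value = best + temperature * math.log(z / num_samples)
    return new_mean, soft_value
```

In a fitted value iteration loop, a quantity like `soft_value` evaluated at the next state could serve as the bootstrap term of the regression target for the Q-function; the exact target used by soft MPCritic is not given in this summary.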

Technical Impact

  • Addresses Scalability and Computational Challenges: soft MPCritic offers a practical way to integrate reinforcement learning (RL) and model predictive control (MPC), a combination that has historically been computationally expensive. This makes advanced control more feasible for larger and more complex systems.

  • Enhanced Planning Horizon and Decision Quality: Fitted value iteration aligns the learned Q-function with MPPI-based planning, so the terminal value accounts for what lies beyond the explicit rollout and implicitly extends the effective planning horizon. This can lead to better and more robust decisions in dynamic environments.

  • Improved Computational Efficiency: The amortized warm-start strategy significantly reduces the cost of generating batched MPPI-based value targets. This matters for real-time or near-real-time applications where compute is constrained.

  • Robustness through Ensemble Modeling: Scenario-based planning over an ensemble of dynamics models makes the control policy more robust to model uncertainty, leading to more reliable performance across varied conditions.

  • Blueprint for Advanced Control Systems: soft MPCritic serves as a "blueprint" for more scalable control algorithms in domains such as robotics, autonomous driving, and industrial automation, where it could enable high-performance MPC policies in settings where traditional methods might fail.
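
The warm-start and ensemble ideas in the list above can be illustrated with two small helpers. This is a sketch under stated assumptions, since the summary does not describe either mechanism in detail: `shift_plan` assumes the warm start reuses a previously planned action sequence shifted by one step, and `scenario_return` assumes scenarios are aggregated by averaging returns over the ensemble (a worst-case aggregation would be another option). Both names, and the convention that the terminal Q-function scores the last action, are hypothetical.

```python
def shift_plan(plan, fill=0.0):
    """Drop the first (already executed) action and pad the tail.

    Assumption: the result initializes the planner at the next state, so it
    needs fewer sampling iterations than a cold start from zeros.
    """
    return list(plan[1:]) + [fill]


def scenario_return(state, actions, models, reward, terminal_q, gamma=0.99):
    """Average the return of one action sequence over an ensemble of models.

    Each element of `models` is a callable (state, action) -> next_state.
    """
    total = 0.0
    for model in models:
        s, ret = state, 0.0
        # Roll all but the last action through this ensemble member.
        for t, a in enumerate(actions[:-1]):
            ret += (gamma ** t) * reward(s, a)
            s = model(s, a)
        # The terminal Q-function scores the final state-action pair.
        ret += (gamma ** (len(actions) - 1)) * terminal_q(s, actions[-1])
        total += ret
    return total / len(models)
```

A sampling-based planner could call `scenario_return` once per candidate action sequence, so that sequences which only look good under a single model are down-weighted.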

Reinforcement learning (RL) · Model predictive control (MPC) · soft MPCritic · Model predictive path integral control (MPPI) · Q-function · Fitted value iteration