Learning to Drift in Extreme Turning with Active Exploration and Gaussian Process Based MPC

This paper introduces AEDGPR-MPC, a framework that combines Model Predictive Control with Gaussian Process Regression and active exploration to correct vehicle model mismatch during extreme cornering drift control. GPR compensation alone reduces lateral error by 52.8% in simulation and 36.7% in RC car tests; active exploration reduces it further and also lowers velocity tracking RMSE.

Active Learning, Control, Robotics, Efficiency, Prediction

Guoqiang Wu, Cheng Hu, Wangjia Weng, Zhouheng Li, Yonghao Fu, Lei Xie, Hongye Su

Zhejiang University, State Key Laboratory of Industrial Control Technology

Generated by grok-3

Background Problem

Extreme cornering in autonomous driving is characterized by high sideslip angles. Traditional vehicle controllers such as Pure Pursuit (PP) struggle to maintain precision in this regime because of tire slip and model mismatch. The paper addresses the need for drift control that improves trajectory tracking accuracy in such scenarios. It focuses on mitigating vehicle model inaccuracies at two points, drift equilibrium calculation and MPC optimization, with the goal of safer and more precise control in high-speed racing and emergency maneuvers.

Method

The proposed method, termed AEDGPR-MPC, integrates Model Predictive Control (MPC) with Gaussian Process Regression (GPR) to address model mismatches in extreme cornering scenarios. The core idea is to use a double-layer GPR to correct vehicle dynamics model errors both in calculating drift equilibrium points and during MPC optimization for trajectory tracking. The implementation involves:

1) Employing a nominal single-track vehicle model with a simplified Pacejka Tire Model as the baseline.
2) Using GPR to learn and compensate for model errors by fitting residuals between the nominal and actual vehicle dynamics.
3) Introducing an active exploration strategy that leverages GPR variance to explore diverse cornering speeds, enriching the dataset and identifying optimal drift velocities.
4) Planning reference states with a minimum-time planner and tracking them using the GPR-corrected MPC controller.

Key points include the iterative data collection process to improve GPR accuracy and the balance between exploration (via variance) and exploitation (via distance to reference) to ensure stable control.
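The residual-learning and exploration ideas in the Method section can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-dimensional toy dynamics (`nominal_model`, `true_dynamics`), the RBF kernel and its hyperparameters, the `ResidualGP` class, and in particular the scoring rule `beta * std - distance`, which is only one plausible way to trade GPR variance against distance to the reference speed.

```python
import numpy as np


def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


class ResidualGP:
    """Minimal GP regressor for one scalar model residual (hypothetical helper)."""

    def __init__(self, length_scale=0.8, signal_var=1.0, noise_var=1e-4):
        self.length_scale = length_scale
        self.signal_var = signal_var
        self.noise_var = noise_var

    def fit(self, x, residual):
        self.x = x
        k = rbf_kernel(x, x, self.length_scale, self.signal_var)
        k += self.noise_var * np.eye(len(x))
        self.chol = np.linalg.cholesky(k)
        # alpha = K^{-1} y, computed through the Cholesky factor.
        self.alpha = np.linalg.solve(self.chol.T, np.linalg.solve(self.chol, residual))
        return self

    def predict(self, x_query):
        k_star = rbf_kernel(x_query, self.x, self.length_scale, self.signal_var)
        mean = k_star @ self.alpha
        v = np.linalg.solve(self.chol, k_star.T)
        var = self.signal_var - np.sum(v**2, axis=0)
        return mean, np.maximum(var, 0.0)


# Toy stand-ins (assumptions): a nominal model and a "true" system it mismatches.
def nominal_model(speed):
    return 0.5 * speed


def true_dynamics(speed):
    return 0.5 * speed + 0.3 * np.sin(speed)


# Fit the GP to residuals observed at speeds visited so far.
train_speed = np.linspace(2.0, 4.0, 9).reshape(-1, 1)
train_residual = (true_dynamics(train_speed) - nominal_model(train_speed)).ravel()
gp = ResidualGP().fit(train_speed, train_residual)

# Corrected prediction = nominal model + GP mean of the residual.
held_out = np.array([[2.6]])
held_out_mean, _ = gp.predict(held_out)
corrected = nominal_model(2.6) + held_out_mean[0]

# Exploration (assumed rule): prefer uncertain speeds, penalize distance to reference.
candidates = np.linspace(2.0, 6.0, 41).reshape(-1, 1)
_, var = gp.predict(candidates)
reference_speed = 3.5
beta = 2.0  # exploration weight, hypothetical
score = beta * np.sqrt(var) - np.abs(candidates.ravel() - reference_speed)
next_speed = float(candidates[np.argmax(score), 0])
```

In this sketch a larger `beta` pushes the selected speed toward regions the data has not covered, while a smaller `beta` keeps it near the planner's reference. In a full controller the GP mean would be added to the nominal dynamics inside the MPC prediction model rather than evaluated at a single point.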

Experiment

The experiments used the Matlab-Carsim simulation platform and a 1:10 scale RC car to validate the AEDGPR-MPC framework. The setup was a predefined track with distinct drift and non-drift sections, with reference trajectories generated by a minimum-time planner. Three configurations were compared: traditional MPC without GPR, MPC with GPR compensation, and MPC with GPR plus active exploration. Datasets were collected iteratively: the first lap ran traditional MPC, and subsequent laps added GPR compensation and then exploration.

In simulation, lateral error decreased by 52.8% with GPR (from 2.50 m to 1.18 m) and by an additional 27.1% with exploration (to 0.86 m); velocity tracking RMSE improved by 10.6% with exploration. In the RC car experiment, lateral error decreased by 36.7% with GPR (from 0.49 m to 0.31 m) and by another 29.0% with exploration (to 0.22 m), with velocity tracking RMSE decreasing by 7.2%.

The experimental design is reasonable for initial validation: controlled environments isolate the contributions of GPR and exploration. The results match the expected error reduction, but the evaluation does not test robustness to environmental variations (e.g., surface friction changes) or analyze scalability to full-sized vehicles, which limits the generalizability of the findings.
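The reported percentages can be cross-checked against the reported lateral errors. The short sketch below recomputes each stage-to-stage reduction and also the cumulative reduction relative to plain MPC; the cumulative figure is derived here from the reported errors and is not quoted from the paper. The function and variable names are illustrative.

```python
def relative_reduction(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Lateral errors in meters: (traditional MPC, MPC + GPR, MPC + GPR + exploration)
lateral_error_m = {
    "simulation": (2.50, 1.18, 0.86),
    "rc_car": (0.49, 0.31, 0.22),
}

summary = {}
for platform, (mpc, gpr, explore) in lateral_error_m.items():
    summary[platform] = {
        "gpr_vs_mpc": round(relative_reduction(mpc, gpr), 1),
        "explore_vs_gpr": round(relative_reduction(gpr, explore), 1),
        "explore_vs_mpc": round(relative_reduction(mpc, explore), 1),
    }
    print(platform, summary[platform])
```

The stage-to-stage values reproduce the reported 52.8% and 27.1% (simulation) and 36.7% and 29.0% (RC car). Measured against plain MPC, the full pipeline corresponds to a reduction of about 65.6% in simulation and 55.1% on the RC car.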

Further Thoughts

The integration of GPR with MPC for drift control opens avenues for adaptive control in robotics and autonomous vehicles, particularly in high-stakes scenarios like racing or emergency response. However, relying on GPR variance for exploration raises a concern: variance estimates can be biased, especially in data-sparse regions, and a biased estimate could misguide the exploration process. Hybrid exploration strategies, perhaps combining GPR variance with heuristic-based or reinforcement learning approaches, could improve robustness and merit deeper investigation. Additionally, the computational cost of GPR in real-time applications, which the paper does not detail, might pose challenges when scaling to full-sized vehicles with stricter latency requirements. More broadly, similar model correction techniques could be applied to other dynamic systems where environmental uncertainties are prevalent, such as drone navigation or industrial robotics. Future work could also explore federated learning to aggregate datasets from multiple vehicles, potentially improving GPR model generalization across varied conditions and vehicle types.