Operators of ridehail platforms such as Lyft and Uber will likely be early adopters of autonomous electric vehicles (AEVs), since AEVs promise lower costs, greater safety, and improved efficiency. While studies on the operation of ridehail systems with AEVs exist, nearly all ignore the need to recharge the vehicles during operation. We address this gap in our work on the ridehail problem with AEVs (RP-AEV).
In the RP-AEV, a decision maker (DM) operates a fleet of AEVs that serve requests arising randomly throughout a region. The DM is responsible for assigning AEVs to requests, as well as repositioning and recharging AEVs in anticipation of future requests. We model the RP-AEV as a Markov decision process.
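To make the model concrete, the sketch below shows one plausible encoding of the RP-AEV state and the three decision types named above. The field names and granularity are our own illustrative assumptions, not the paper's actual MDP formulation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# A minimal sketch of one plausible RP-AEV state/action encoding.
# All field names and granularity choices are illustrative assumptions.

@dataclass(frozen=True)
class AEV:
    location: Tuple[float, float]   # current position in the region
    battery: float                  # remaining charge as a fraction in [0, 1]
    free_at: int                    # decision epoch at which the AEV is idle again

@dataclass(frozen=True)
class Request:
    origin: Tuple[float, float]
    destination: Tuple[float, float]
    arrived_at: int                 # epoch at which the request appeared

@dataclass(frozen=True)
class State:
    epoch: int
    fleet: Tuple[AEV, ...]
    open_requests: Tuple[Request, ...]

# At each epoch the DM picks, per idle AEV, one of the three decision
# types described in the text: serve a request, reposition, or recharge.
@dataclass(frozen=True)
class Action:
    aev_index: int
    kind: str                       # "assign" | "reposition" | "recharge"
    target: Optional[int] = None    # request, zone, or charger index
```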
We compare classical approximate dynamic programming (ADP) solution methods with those of deep reinforcement learning (RL), which have garnered enthusiasm but have achieved only limited success on operational problems to date. From ADP, we explore novel heuristic policies, both alone and combined with lookaheads. From RL, we build on the approach of Holler et al. (2018), employing neural networks (NNs) both to determine the state representation (with single-layer NNs) and to learn state-action value functions (with deep NNs) via Q-learning.
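As a rough illustration of the RL side, the following is a minimal Q-learning sketch in which a deep network maps a state-action feature vector to a scalar value and is trained toward the one-step Bellman target. The architecture, feature dimensions, and hyperparameters are generic assumptions, not the design of Holler et al. (2018).

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Deep NN approximating Q(s, a) from a joint state-action feature vector."""
    def __init__(self, feature_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar Q-value
        )

    def forward(self, phi: torch.Tensor) -> torch.Tensor:
        return self.net(phi).squeeze(-1)

def q_learning_step(q, optimizer, phi, reward, phi_next_candidates, gamma=0.99):
    """One Q-learning update on a single transition.

    phi: features of the chosen (state, action) pair, shape (feature_dim,)
    phi_next_candidates: features of all feasible (next state, action') pairs,
        shape (num_actions, feature_dim)
    """
    with torch.no_grad():
        # Bootstrap target: reward plus discounted max over feasible next actions.
        target = reward + gamma * q(phi_next_candidates).max()
    loss = (q(phi.unsqueeze(0)) - target).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random data (purely illustrative):
feature_dim = 32
q = QNetwork(feature_dim)
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
q_learning_step(q, opt, torch.randn(feature_dim), 1.0, torch.randn(5, feature_dim))
```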
Additionally, we establish a dual bound with which to gauge the effectiveness of these approaches by calculating the expected value with perfect information. With perfect information, the RP-AEV decomposes so as to permit solution via Benders decomposition, where the master problem assigns AEVs to requests and the subproblem provides instructions for repositioning and recharging.
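To sketch the structure this describes, the compilable LaTeX fragment below states a generic perfect-information bound and a generic Benders master/subproblem split. The notation is illustrative, not the paper's exact formulation.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Hedged sketch with generic notation: V*(s_0) is the optimal expected
% value, omega a realized request sample path, y the assignment decisions,
% z the repositioning/recharging decisions.
\begin{align*}
  % Clairvoyant (perfect-information) dual bound: solving each sample
  % path with full foresight can only do better than any policy.
  V^{*}(s_0) \;\le\; \mathbb{E}_{\omega}\Big[\,
      \max_{(y,\,z)\,\in\,\mathcal{F}(\omega)} \; r^{\top}y + q^{\top}z
  \Big].
\end{align*}
For a fixed $\omega$, the inner problem splits into a Benders master and
subproblem:
\begin{align*}
  \text{(master)} \quad & \max_{y \in Y(\omega),\ \theta} \;\;
      r^{\top}y + \theta
      \quad \text{s.t.} \quad \theta \le \text{optimality cuts}, \\
  \text{(subproblem)} \quad & Q(y,\omega) \;=\;
      \max_{z \ge 0} \;\; q^{\top}z
      \quad \text{s.t.} \quad Wz \le h(\omega) - Ty,
\end{align*}
where the master assigns AEVs to requests and the subproblem plans
repositioning and recharging given that assignment.
\end{document}
```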