LEVERAGING FUTURE RELATIONSHIP REASONING FOR VEHICLE TRAJECTORY PREDICTION

Abstract

Understanding the interaction between multiple agents is crucial for realistic vehicle trajectory prediction. Existing methods have attempted to infer the interaction from the observed past trajectories of agents using pooling, attention, or graph-based methods, which rely on a deterministic approach. However, these methods can fail under complex road structures, as they cannot predict the various interactions that may occur in the future. In this paper, we propose a novel approach that uses lane information to predict a stochastic future relationship among agents. To obtain a coarse future motion of agents, our method first predicts the probability of lane-level waypoint occupancy of vehicles. We then utilize the temporal probability of passing adjacent lanes for each agent pair, assuming that agents passing adjacent lanes will interact strongly. We also model the interaction using a probabilistic distribution, which allows for multiple possible future interactions. The distribution is learned from the posterior distribution of interaction obtained from ground truth future trajectories. We validate our method on popular trajectory prediction datasets: nuScenes and Argoverse. The results show that the proposed method brings a remarkable gain in prediction accuracy and achieves state-of-the-art performance on a long-term prediction benchmark dataset.

1. INTRODUCTION

For safe autonomous driving, predicting a vehicle's future trajectory is crucial. Early heuristic prediction models utilized only the past trajectory of the target vehicle (Lin et al. (2000); Barth & Franke (2008)). However, with the advent of deep learning, more accurate predictions can be made by also considering the vehicle's relationship with the High-Definition (HD) map (Liang et al. (2020)). Since surrounding vehicles are not stationary, predicting relationships with them is much more complicated and has become essential for realistic trajectory prediction. Furthermore, since individual drivers control each vehicle, their interaction has a stochastic nature.

Previous works modeled interaction from the past trajectories of the surrounding vehicles by employing pooling, multi-head attention, or spatio-temporal graph methods. However, we observed that these methods easily fail under complex road structures. For example, Fig. 1 shows the past trajectories of agents (left) and the attention weights among agents (right) obtained by a previous method (Mercat et al. (2020)) that learned the interaction among agents using multi-head attention (MHA). Since agents 0 and 4 are expected to join in the future, the attention weight between them should be high. However, the model predicts a low attention weight between them, highlighting the difficulty of reasoning about future relationships between agents based solely on past trajectories. Incorporating the road structure should make the reasoning process much easier.

The decision-making process of human drivers can provide insights on how to model interaction. Drivers first set a goal that they are trying to reach on the map. Next, they roughly infer how the surrounding agents will behave in the future. Finally, they gauge the interaction with others by estimating how likely the future paths of other vehicles are to overlap with their own planned path.
Drivers treat an interaction as more significant the more the future paths of other vehicles overlap with their own. We define the interaction that results from this process as a "Future Relationship". We use the following approaches to model Future Relationship, as shown in Fig. 3.
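The notion of agents "passing adjacent lanes" can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the model proposed in this paper: the function name `adjacent_lane_probability`, the data layout, and the assumption that the lane occupancies of two agents are independent at each time step are all hypothetical choices made for the example. Given a per-agent probability of occupying each lane at each future step and a binary lane adjacency matrix, it computes, for every agent pair and step, the probability that the two agents are on the same or adjacent lanes.

```python
from typing import List

# Hypothetical layout: occupancy[agent][time_step][lane] is a probability,
# and each occupancy[agent][time_step] sums to 1 over lanes.
Occupancy = List[List[List[float]]]
# adjacency[a][b] is 1 if lanes a and b are the same or adjacent, else 0.
Adjacency = List[List[int]]


def adjacent_lane_probability(occupancy: Occupancy,
                              adjacency: Adjacency) -> List[List[List[float]]]:
    """Return rel[i][j][t]: probability that agents i and j occupy the same
    or adjacent lanes at step t, assuming independent lane occupancies
    (an assumption of this sketch, not a claim from the paper)."""
    n_agents = len(occupancy)
    n_steps = len(occupancy[0])
    n_lanes = len(adjacency)
    rel = [[[0.0] * n_steps for _ in range(n_agents)] for _ in range(n_agents)]
    for i in range(n_agents):
        for j in range(n_agents):
            for t in range(n_steps):
                # Sum over all lane pairs (a, b) that count as adjacent.
                rel[i][j][t] = sum(
                    occupancy[i][t][a] * adjacency[a][b] * occupancy[j][t][b]
                    for a in range(n_lanes)
                    for b in range(n_lanes)
                )
    return rel


# Toy example: three parallel lanes (0-1 and 1-2 adjacent), one future step.
adjacency = [[1, 1, 0],
             [1, 1, 1],
             [0, 1, 1]]
occupancy = [
    [[1.0, 0.0, 0.0]],  # agent 0 is certainly on lane 0
    [[0.0, 0.5, 0.5]],  # agent 1 is on lane 1 or lane 2 with equal chance
]
rel = adjacent_lane_probability(occupancy, adjacency)
```

In this toy case only the lane 0 and lane 1 combination counts as adjacent for the pair, so the score depends solely on how much probability agent 1 places on lane 1. A learned model would replace the hand-written occupancies with predicted ones and would not need the independence assumption.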



(2020); Zeng et al. (2021)) or surrounding agents ( Lee et al. (2017); Chandra et al. (

