# Coopernaut: End-to-End Driving with Cooperative Perception for Networked Vehicles

CVPR 2022

## Paper | Code | Dataset | Bibtex

Optical sensors and learning algorithms for autonomous vehicles have dramatically advanced in the past few years. Nonetheless, the reliability of today's autonomous vehicles is hindered by limited line-of-sight sensing capability and the brittleness of data-driven methods in handling extreme situations. With recent developments in telecommunication technologies, cooperative perception with vehicle-to-vehicle communication has become a promising paradigm to enhance autonomous driving in dangerous or emergency situations. We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving. Our model encodes LiDAR information into compact point-based representations that can be transmitted as messages between vehicles via realistic wireless channels. To evaluate our model, we develop AutoCastSim, a network-augmented driving simulation framework with example accident-prone scenarios. Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate over egocentric driving models in these challenging driving situations, with a 5 times smaller bandwidth requirement than the prior work V2VNet.

# Coopernaut Overview

We introduce Coopernaut, an end-to-end point-based model that uses cross-vehicle perception for vision-based cooperative driving. Our model encodes LiDAR information into compact point-based representations that can be transmitted as messages between vehicles via realistic wireless channels. It contains a Point Encoder that extracts critical information locally for sharing, a Representation Aggregator that merges multi-vehicle messages, and a Control Module that reasons over the joint messages. The message produced by the encoder consists of 128 keypoint coordinates and their corresponding features. Each received message is spatially transformed into the ego frame; the ego vehicle then merges the received messages and performs max voxel pooling on the joint representation. Finally, the Aggregator synthesizes the joint representation from all neighbors as well as the ego vehicle itself before passing it to the Control Module, which generates control decisions.
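The message-merging steps above can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's implementation: the 128-keypoint message shape follows the description above, while the 64-dim feature size, the voxel size, and the sender's relative pose are assumed values for the example.

```python
import numpy as np

def transform_to_ego(points, R, t):
    """Rotate/translate keypoint coordinates from a sender's frame into the ego frame."""
    return points @ R.T + t

def max_voxel_pool(points, feats, voxel_size=2.0):
    """Merge per-point features by taking an element-wise max within each voxel."""
    voxel_ids = np.floor(points / voxel_size).astype(np.int64)
    # Group points that fall into the same voxel
    keys, inverse = np.unique(voxel_ids, axis=0, return_inverse=True)
    pooled = np.full((len(keys), feats.shape[1]), -np.inf)
    for i, v in enumerate(inverse):
        pooled[v] = np.maximum(pooled[v], feats[i])
    centers = (keys + 0.5) * voxel_size  # one representative point per voxel
    return centers, pooled

rng = np.random.default_rng(0)
# 128 keypoints with (assumed) 64-dim features, for the ego car and one neighbor
ego_pts, ego_feats = rng.normal(size=(128, 3)), rng.normal(size=(128, 64))
msg_pts, msg_feats = rng.normal(size=(128, 3)), rng.normal(size=(128, 64))

# Assumed relative pose of the sender in the ego frame (yaw + offset)
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([10.0, 2.0, 0.0])

# Transform the neighbor's message into the ego frame, merge, and pool
all_pts = np.concatenate([ego_pts, transform_to_ego(msg_pts, R, t)])
all_feats = np.concatenate([ego_feats, msg_feats])
centers, pooled = max_voxel_pool(all_pts, all_feats)
print(centers.shape, pooled.shape)
```

Max pooling makes the joint representation invariant to how many points land in a voxel, so messages from any number of neighbors can be merged the same way.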

# AutoCastSim Framework

*Scenarios: Overtaking, Left Turn, Red Light Violation*
 We developed AutoCastSim, a simulation framework that offers network-augmented autonomous driving simulation on top of CARLA. This framework allows custom design of diverse traffic scenarios for training and evaluating autonomous driving models, and the simulated vehicles can be configured with realistic wireless communications. It also provides a path-planning-based oracle expert with access to privileged environment information, which generates action supervision for imitation learning. Shown above are three example challenging traffic scenarios we designed in AutoCastSim as the evaluation benchmark for Coopernaut. We have made AutoCastSim open-source; you can download the simulation framework here.
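The oracle expert's role in imitation learning can be illustrated with a minimal behavior-cloning loop. This is a toy sketch, not AutoCastSim's API: the observation/action dimensions, the linear policy, and the learning rate are all assumed for illustration; the only idea taken from the text is that the expert's privileged actions serve as supervision.

```python
import numpy as np

rng = np.random.default_rng(0)
obs = rng.normal(size=(256, 32))            # recorded observations (toy features)
expert_actions = rng.normal(size=(256, 3))  # oracle expert's (throttle, brake, steer) labels

W = np.zeros((32, 3))  # minimal linear policy for illustration
lr = 0.01
for _ in range(200):
    pred = obs @ W
    # Gradient of the mean-squared-error imitation loss w.r.t. W
    grad = obs.T @ (pred - expert_actions) / len(obs)
    W -= lr * grad

mse = np.mean((obs @ W - expert_actions) ** 2)
print(f"behavior-cloning loss after training: {mse:.4f}")
```

Because the expert sees privileged state the learner does not, the learner must recover the expert's decisions from sensing alone, which is exactly what the recorded trajectories supervise.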

# Qualitative Results

 Here we provide qualitative side-by-side comparisons between the No V2V Sharing driving model, which makes driving decisions solely based on line-of-sight sensing, and Coopernaut, our model that makes decisions based on the augmented field of view from cooperative perception. Please click on the thumbnails to switch to a specific scenario.

# Driving Dataset

*Scenarios: Overtaking, Left Turn, Red Light Violation*
 We provide a driving dataset for imitation learning on our benchmark. You can download the dataset here. Furthermore, you can collect your own dataset by running the data collection scripts provided in the public GitHub repository Coopernaut. The kick-start dataset consists of 3 scenarios, each of which has a Train Set and a Validation Set. The Train Set of a scenario contains on average 12 trajectories, of which 3 are accident-prone and 9 are normal driving trajectories.

# Citation

 If you are interested in citing AutoCastSim or Coopernaut in your work, we encourage you to use the following BibTeX:

```bibtex
@inproceedings{coopernaut,
  title     = {Coopernaut: End-to-End Driving with Cooperative Perception for Networked Vehicles},
  author    = {Jiaxun Cui and Hang Qiu and Dian Chen and Peter Stone and Yuke Zhu},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2022}
}
```