End-to-End Autonomy: A New Era of Self-Driving

CVPR 2024 Tutorial
18 June 2024

Available to registered CVPR attendees


Overview

This comprehensive half-day tutorial focuses on End-to-End Autonomous Driving (E2EAD), reflecting the significant shift toward this approach in both industry and academia. Traditional modular approaches in autonomous driving, while effective in specific contexts, often struggle with scalability, long-tail scenarios, and errors that compound across modules, paving the way for the end-to-end paradigm.

This tutorial aims to dissect the complexities and nuances of end-to-end autonomy, covering theoretical foundations, practical implementations and challenges, and future directions of this evolving technology.

Schedule

Frontiers in End-to-End Learning for Autonomous Driving

Presented by Jamie Shotton, Chief Scientist at Wayve. This segment delves into the latest advancements and methodologies in end-to-end learning tailored explicitly for autonomous driving, highlighting cutting-edge research and innovations in the field.

Fundamentals of Autonomous Driving, History, and Current State-of-the-Art

Presented by Hongyang Li, Principal Investigator at OpenDriveLab. This session will explore the evolution of autonomous driving from its early days to the current state-of-the-art technologies, focusing on the shift from modular to end-to-end approaches.

Towards Neural Simulator: Offline Validation of End-to-End Autonomous Driving

Presented by Nikhil Mohan, Lead Scientist at Wayve, this session will focus on the development and application of neural simulators in autonomous driving. It will cover their roles in enhancing the validation and efficiency of end-to-end systems and their impact on advancing research in the domain.

Learning Models of the World: Exploring Generative World Models in Autonomous Driving

Presented by Gianluca Corrado, Principal Scientist at Wayve. This session will focus on advancements in generative world models, exploring how integrating world models enables autonomous vehicles to anticipate and plan their actions, improving safety and efficiency on the road. It will also discuss how incorporating world models into driving algorithms can deepen the comprehension of human decisions, ultimately facilitating better adaptability to a broader array of real-world situations.

Language Meets Driving: Empowering End-to-End Autonomous Driving with Large Language Models (LLMs)

Presented by Oleg Sinavski, Principal Scientist at Wayve. This session will focus on the use of large visual-language models in autonomous driving: what are the benefits of adding a language modality to an acting autonomous robot? It will cover explainability, grounding the textual modality in perception and action, and language reasoning for planning.

Navigating the Future of End-to-End Autonomous Driving: Reflections and Future Directions

Presented by Elahe Arani, Head of AI Research at Wayve and Adjunct Assistant Professor in the Department of Mathematics and Computer Science at Eindhoven University of Technology. In the final session, we will reflect on the key insights gained throughout the tutorial, summarizing the fundamental concepts discussed in the preceding sessions. We will delve into the implications of end-to-end autonomous driving in the larger landscape of AI and transportation. Additionally, we will explore potential challenges and opportunities, paving the way for the next phase of innovations in this domain. This session aims to provide a cohesive understanding of the evolution, current state, and future trajectories of end-to-end autonomous driving, charting a course toward its broader adoption and integration into our daily lives.

Presenters

Jamie Shotton

Wayve

Jamie Shotton is the Chief Scientist at Wayve, building foundation models for embodied intelligence, such as GAIA and LINGO, to enable safe and adaptable autonomous vehicles. Before this, he was Partner Director of Science at Microsoft and Head of the Mixed Reality & AI Labs, where he shipped foundational features, including body tracking for Kinect and the hand- and eye-tracking that enable HoloLens 2’s instinctual interaction model. He has explored applications of AI in autonomous driving, mixed reality, virtual presence, human-computer interaction, gaming, robotics, and healthcare.

Hongyang Li

OpenDriveLab

Hongyang Li is a Senior Research Scientist at Shanghai AI Lab. He works on perception, sensor fusion, and applications for scalable products in autonomous driving. He received a Ph.D. from the Chinese University of Hong Kong in 2019 and is a recipient of the Hong Kong Ph.D. Fellowship Scheme. He has served as an industrial lecturer at Tsinghua University and an industry Ph.D. supervisor at Shanghai Jiao Tong University since 2021. He has been an Area Chair for CVPR 2023, CVPR 2024, and NeurIPS 2023.

Nikhil Mohan

Wayve

Nikhil is a Lead Scientist at Wayve, where he concentrates on research related to neural simulation methods and synthetic data generation. He holds a Master’s degree from Carnegie Mellon, specializing in signal processing and machine learning. During his tenure at Wayve, he has dedicated his efforts to various research areas, including self-supervised learning and driving policy learning.

Gianluca Corrado

Wayve

Gianluca is a Principal Scientist at Wayve, London. His research focuses on advancements in world modeling, particularly in the context of autonomous vehicles. He holds a PhD from the University of Trento in Italy, specializing in machine learning and focusing on computational biology. Formerly with Amazon, he contributed to deep reinforcement learning for recommender systems.

Oleg Sinavski

Wayve

Oleg Sinavski is a Principal Scientist at Wayve, London. His research interests focus on applying advances in large language models to the fields of self-driving, reinforcement learning, planning, and simulation. Previously, Oleg worked at Brain Corp in San Diego, CA, as the VP of R&D, where he led research efforts in scalable robotic navigation. Earlier in his career, he specialized in neuromorphic computation and hardware and holds a Ph.D. in computational neuroscience. Oleg gave a guest lecture at the Chalmers University of Technology on Language and Video-Generative AI in Autonomous Driving in Oct 2023.

Elahe Arani

Wayve

Elahe Arani serves as the Head of AI Research at Wayve and holds the position of Adjunct Assistant Professor in the Department of Mathematics and Computer Science at Eindhoven University of Technology. As both a neuroscientist and a computer scientist, her research is dedicated to advancing next-generation AI models by drawing inspiration from the functional and learning mechanisms of the human brain. Her primary focus is pushing the frontiers of AI, behaviour learning, and language to develop a scalable, adaptable, and reliable solution for autonomous driving.

Organizers

Long Chen, Staff Scientist at Wayve

Gianluca Corrado, Principal Scientist at Wayve

Vassia Simaiaki, Head of AI Research at Wayve

Oleg Sinavski, Principal Scientist at Wayve

Fergal Cotter, Staff Scientist at Wayve

Elahe Arani, Head of AI Research at Wayve

Nikhil Mohan, Lead Scientist at Wayve

Jamie Shotton, Chief Scientist at Wayve

Full Course Description

Modular Autonomous Driving: Benefits and Limitations

The world of self-driving cars is undergoing a major change. For a long time, autonomous driving was based on a modular approach, where different tasks like perception, prediction, and planning were individually developed and integrated. While this approach has the benefits of interpretability, verifiability, and ease of debugging, it also presents notable limitations. One of the most significant challenges is addressing long-tail scenarios—rare or unforeseen situations that are not adequately covered in the standard training or operation of these systems. The modular stack’s siloed nature, where each module is optimized for a specific task, can result in a lack of alignment with the overall driving objective. Errors from each module can compound as the sequential procedure progresses, leading to information loss. Moreover, the multi-task, multi-model deployment increases the computational burden and can potentially lead to sub-optimal resource utilization.
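The sequential hand-off described above can be sketched in a few lines of illustrative Python. Every function below is a hypothetical toy stand-in, not code from any real driving stack, but it shows how each stage consumes only the previous stage's output, so an object missed in perception can never be recovered downstream:

```python
# Toy modular pipeline: perception -> prediction -> planning.
# Each stage sees only the previous stage's output, so early errors
# propagate and compound -- the information loss described above.

def perceive(sensor_frame):
    # Stage 1: turn raw sensor readings into detected object indices
    # (here, a simple threshold on a toy intensity signal).
    return [i for i, v in enumerate(sensor_frame) if v > 0.5]

def predict(detections):
    # Stage 2: forecast each detection's future position.
    # A detection missed in stage 1 can never be predicted here.
    return {obj: obj + 1 for obj in detections}

def plan(predictions):
    # Stage 3: choose a control action given the forecasts.
    return "brake" if predictions else "cruise"

def modular_drive(sensor_frame):
    # The sequential hand-off between separately optimized modules.
    return plan(predict(perceive(sensor_frame)))

print(modular_drive([0.9, 0.1, 0.7]))  # -> brake
print(modular_drive([0.1, 0.2]))       # -> cruise
```

Note that each stage here is optimized (or, in this toy case, written) in isolation; nothing in `perceive` knows whether its threshold serves the final driving decision well, which is exactly the misalignment with the overall driving objective described above.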

The Paradigm Shift to End-to-End Autonomous Driving Systems

In contrast, end-to-end autonomous driving systems represent a paradigm shift. Defined as fully differentiable programs that take raw sensor data as input and produce planning and/or low-level control actions as output, these systems offer a unified approach. Their key advantage lies in their simplicity, combining perception, prediction, and planning into a single model that can be jointly trained. This holistic approach ensures that the entire system, including its intermediate representations, is optimized towards the ultimate task of safe and efficient driving. Shared backbones in these models also increase computational efficiency, offering a streamlined alternative to their modular counterparts.
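The interface of such a system can be sketched as a single function from raw sensor data straight to a control output, with no hand-crafted intermediate representations. The tiny linear "model" and its weights below are hypothetical placeholders for a jointly trained deep network:

```python
# Minimal sketch of the end-to-end interface: one jointly trained
# mapping from raw input to a control action. In a real system the
# internal features of a neural network would play the roles of
# perception, prediction, and planning, all optimized together.

def end_to_end_drive(sensor_frame, weights):
    # A single differentiable mapping: raw sensor frame -> control.
    score = sum(w * x for w, x in zip(weights, sensor_frame))
    # Clamp to a valid control range, e.g. a normalized steering value.
    return max(-1.0, min(1.0, score))

weights = [0.2, -0.1, 0.4]  # placeholder "learned" parameters
print(end_to_end_drive([0.9, 0.1, 0.7], weights))
```

Because the whole mapping is one differentiable program, a driving loss can be backpropagated through it end to end, which is what lets the intermediate representations be optimized for the ultimate driving objective rather than for per-module proxy tasks.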

Advancements and Challenges in End-to-End Autonomous Driving

End-to-end autonomous driving nevertheless faces challenges, notably in interpretability, generalizability, and effective validation. The lack of interpretability and generalizability was long a significant obstacle, but recent advancements in vision foundation models and multi-modal large language models (MLLMs) have started to change the landscape, and a growing body of work leverages these developments to extend the capabilities and reach of end-to-end systems across a wider array of scenarios. In this tutorial, we will thoroughly explore these advancements, examining how they address interpretability and generalizability in end-to-end autonomous driving systems, how these technologies are being integrated, and their impact on the field.

Evaluating End-to-End Autonomous Systems: Challenges and Innovations

Evaluating the reliability and safety of these systems presents a more complex challenge compared to modular approaches. In modular systems, each component, such as perception, prediction, and planning, can be tested separately using extensive datasets and well-established simulation technologies. This approach simplifies the validation process of the overall system. In contrast, with end-to-end systems, simulating realistic sensor inputs and real-world agent behaviors to comprehensively cover all potential long-tail scenarios is far from trivial. In this tutorial, we will delve into addressing this issue by exploring the development of generative world models and neural simulators for autonomous driving. These tools are not just pivotal in validating self-driving systems but also play a crucial role in accelerating end-to-end autonomous driving research. They provide easy-to-use data generation capabilities and realistic closed-loop validation, enhancing both the development and assessment of these advanced systems. Additionally, these advancements open up the potential of model-based reinforcement learning in real-world setups, offering new avenues for improving the performance and reliability of autonomous driving technologies.
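The closed-loop validation idea can be sketched as follows; the simulator, policy, and dynamics here are toy stand-ins for a learned world model and a driving model, meant only to show the interaction loop in which the simulator's next observation depends on the policy's action:

```python
# Toy closed-loop rollout: a simulator proposes observations, the
# driving policy acts, and the simulator rolls the world forward
# conditioned on that action. A neural simulator / world model would
# generate realistic sensor data here; we use a scalar "deviation".

def toy_simulator(state, action):
    # Toy dynamics: deviation decays and is nudged by the action.
    next_state = 0.9 * state + 0.1 * action
    observation = next_state  # stand-in for rendered sensor data
    return next_state, observation

def toy_policy(observation):
    # Stand-in for an end-to-end driving model: steer against the
    # observed deviation.
    return -observation

def closed_loop_rollout(initial_state, steps):
    state, obs = initial_state, initial_state
    trace = []
    for _ in range(steps):
        action = toy_policy(obs)          # policy acts on what it sees
        state, obs = toy_simulator(state, action)  # world reacts
        trace.append(state)
    return trace

print(closed_loop_rollout(1.0, 5))  # deviation shrinks each step
```

The key property, in contrast to open-loop replay of logged data, is that the policy's own actions shape the observations it receives next, so compounding behavioral errors (or, as here, corrective behavior) become visible during evaluation.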
