20 March 2023 | Engineering
Introducing radar: Wayve's sensor stack explained
In this post, we share our thoughts on sensor architectures for autonomous driving. Our AV2.0 approach is flexible to the choice of sensors, as we can learn to adapt to different architectures quickly. We discuss why we believe that cameras and radar will be the most important sensors for building a safe and affordable AI Driver system.
What is the optimal sensor suite for self-driving?
The composition and capabilities of the perception sensor suite required for Level 4 (L4) and Level 5 (L5) systems remain an open question. As a starting point, we can observe that human eyes might reasonably serve as a lower bound on the sensor suite capability that, when paired with adequate artificial intelligence software, can control a vehicle.
How do we reconcile this with the current state of the industry? L4 autonomy in the United States has been shown to work as intended in 2-3 operational design domains (ODDs) with impressive but costly sensor suites. Operational L4 platforms employ 15-30 cameras, 5-20 radars, and 5-7 lidars per vehicle. Even if we assume these sensor suites are adequate for L5 autonomy, this is a striking departure from humans, both in the number of sensors and in the diversity of wavelengths employed (from visible to infrared to millimetre wave). We might even treat these sensor suites as realizations of an upper bound of capability that is surely sufficient for self-driving.
While these sensor suites have been shown to work, is all of it necessary?
Wayve’s AV2.0 approach allows us to think differently about sensors
Wayve is on a mission to reimagine autonomous mobility through embodied intelligence. We are the first to deploy autonomous vehicles on public roads with end-to-end deep learning. This data-first approach is flexible to sensor choice and implicitly learns to balance the strengths and weaknesses of sensors.
This approach permits a sensor suite for as-safe-as-human driving that isn’t engineered around addressing edge cases through single-sensor systems analysis, but is instead one that provides an adequate signal to an intelligent consumer of data. As we look to design an “optimal” sensor suite that balances sensor capabilities, safety and affordability, however, additional attributes come into play:
- The sensor suite should support safer-than-human driving
- The sensor suite should be easy to manufacture, integrate, and maintain. Such an architecture supports the rapid adoption and dissemination of AV technology.
Why enhanced safety? Worldwide, about 1.3 million people die yearly from traffic accidents. This is the equivalent of more than 6 fully loaded Boeing 747 jets crashing and killing everyone aboard every day, 365 days a year. Autonomous vehicles represent an opportunity to maintain the societal benefits of the automobile while curtailing or even eliminating the awful payment we all make in return for these benefits. Super-human sensing is a foundation upon which we can build a super-human driver.
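The jet comparison above can be checked with a few lines of arithmetic. This is only a rough sketch: the annual figure is the one quoted in the text, while the seat count is an assumed round number that is not from this post (real 747 cabin layouts vary considerably).

```python
# Back-of-the-envelope check of the road-death comparison.
ANNUAL_ROAD_DEATHS = 1_300_000  # figure quoted in the text
DAYS_PER_YEAR = 365
ASSUMED_SEATS_PER_747 = 500     # assumption: round illustrative figure, layouts vary

deaths_per_day = ANNUAL_ROAD_DEATHS / DAYS_PER_YEAR
jets_per_day = deaths_per_day / ASSUMED_SEATS_PER_747

print(f"{deaths_per_day:.0f} road deaths per day")
print(f"{jets_per_day:.1f} fully loaded jets per day")
```

Under that assumed seat count the result lands above six jets per day, in line with the comparison in the text.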
Why affordability? If self-driving technology is going to scale and scale rapidly, it must be affordable on a total lifecycle basis. From the perspective of sensors, this means:
- Ruggedness and reliable operation, withstanding the extremes of the operational envelope over a typical automotive lifespan of 10+ years
- Straightforward integration into a variety of platforms
- A mature high-volume manufacturing/supply chain
Without these qualities, we have a high fixed hardware cost that can only be amortized in niche businesses and markets. Since 90% of road deaths occur in low- and middle-income countries, we would miss out on the lion’s share of the positive impact that self-driving cars could have.
With this definition of an optimal sensor suite in mind, let’s consider radar. Radar is an active and coherent sensing modality that operates at millimetre wavelengths, thousands of times longer than those of visible light. Important differences between radar and cameras are:
- Active sensing means that radar provides illumination-independent operation
- Coherence means that radar can provide a per-frame measurement of distance and motion (through the Doppler effect)
- With such a long wavelength, angular resolution and accuracy are far lower than a camera for any reasonably sized sensor aperture
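To make the last point concrete, the sketch below applies the textbook diffraction approximation (angular resolution of roughly wavelength divided by aperture) to a radar and a camera. Every number in it is an illustrative assumption rather than a specification of any real sensor: a 77 GHz carrier, green light at 550 nm, a hypothetical 10 cm radar antenna and a hypothetical 5 mm camera lens aperture.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def wavelength_m(frequency_hz: float) -> float:
    """Free-space wavelength for a given carrier frequency."""
    return C / frequency_hz

def diffraction_limit_deg(wavelength: float, aperture_m: float) -> float:
    """Approximate angular resolution (theta ~ lambda / D), in degrees."""
    return math.degrees(wavelength / aperture_m)

# Illustrative assumptions, not taken from the post or any datasheet.
RADAR_FREQ_HZ = 77e9           # assumed automotive radar carrier
VISIBLE_WAVELENGTH_M = 550e-9  # assumed mid-visible (green) light
RADAR_APERTURE_M = 0.10        # hypothetical 10 cm antenna
CAMERA_APERTURE_M = 0.005      # hypothetical 5 mm lens aperture

radar_wavelength = wavelength_m(RADAR_FREQ_HZ)
radar_res_deg = diffraction_limit_deg(radar_wavelength, RADAR_APERTURE_M)
camera_res_deg = diffraction_limit_deg(VISIBLE_WAVELENGTH_M, CAMERA_APERTURE_M)

print(f"radar wavelength: {radar_wavelength * 1e3:.2f} mm")
print(f"radar resolution: {radar_res_deg:.2f} deg")
print(f"camera resolution: {camera_res_deg:.4f} deg")
```

With these assumed values the radar resolves angles on the order of degrees while the camera resolves thousandths of a degree, even though the radar aperture is twenty times larger.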
Bearing these qualities in mind, consider camera failure modes. These are scenarios where the output of a sensor fails to provide enough information for the controller to operate safely. Examples of this for cameras include:
- Total failure of a sensor caused by a rock strike or perhaps a hardware fault
- Inadequate illumination, nighttime driving and low reflectance
- Inclement weather conditions such as rain, snow, and fog, as well as dirt on the lens
- High scene dynamic range (e.g. sun in frame, headlights) masking dim objects
- Changing scene dynamic range (e.g. entering/exiting tunnels, tree/building shadows)
Automotive radar can address these failure modes with an alternate sensing modality that provides:
- A different, uncorrelated risk of sensor hardware failure
- Active scene illumination, not dependent on the time of day or sun angle, indifferent to the presence of the sun in the field of view
- Different weather phenomenology, which offers complementary strengths in inclement weather
- A much longer wavelength, which yields different and complementary object reflectances
- Direct measurement of range and motion that otherwise would have to be derived from context/multiple frames in the camera
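As a rough illustration of the final point, the textbook relations behind radar range and motion measurement fit in a few lines. This is a sketch of the underlying physics only, not a description of how any particular automotive radar processes its signal; the 77 GHz carrier, the 400 ns echo delay and the 10 kHz Doppler shift are all hypothetical example values.

```python
C = 299_792_458.0  # speed of light in m/s

def range_from_delay(round_trip_s: float) -> float:
    """Target range from the round-trip echo delay: R = c * t / 2."""
    return C * round_trip_s / 2.0

def radial_velocity_from_doppler(doppler_hz: float, carrier_hz: float) -> float:
    """Radial (line-of-sight) speed from the Doppler shift: v = f_d * c / (2 * f_c)."""
    return doppler_hz * C / (2.0 * carrier_hz)

# Hypothetical example values, chosen only for illustration.
CARRIER_HZ = 77e9        # assumed carrier frequency
ECHO_DELAY_S = 400e-9    # assumed round-trip delay of one echo
DOPPLER_SHIFT_HZ = 10e3  # assumed Doppler shift of that echo

target_range_m = range_from_delay(ECHO_DELAY_S)
target_speed_mps = radial_velocity_from_doppler(DOPPLER_SHIFT_HZ, CARRIER_HZ)

print(f"range: {target_range_m:.1f} m")
print(f"radial speed: {target_speed_mps:.1f} m/s")
```

Both quantities come from a single measurement of one echo. A camera would need to infer them from scene context or from changes across several frames.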
Several disparate forces have worked constructively to make W-band automotive radar a remarkably affordable sensor. At the hardware level, years of R&D investments have yielded entire radar systems on a single PCB—thanks to advances in using efficient PCB-based transmission lines and antennas and low-cost CMOS fabrication processes. No moving parts, a small bill of materials, and conventional IC and PCB manufacturing methods result in a sensor whose production can be rapidly scaled using conventional manufacturing infrastructure. The ADAS automotive radar market alone is projected to exceed $10B in 2028, achieving:
- Substantial economies of scale
- An existing worldwide regulatory framework
- Deep penetration into the automotive supply chain, with radar already an increasingly common component of the vehicle sensor suite
The rapid adoption of W-band automotive radar makes sense when we consider how well-suited it is to automotive use. These sensors can withstand the challenging automotive environment, are cheap to manufacture, and may be mounted behind the vehicle’s exterior A-surface.
We began with a camera-only sensor suite because it was the fastest way to prototype our AV2.0 approach. Now that we are progressing towards commercial trials and deployment, radar—as compared to additional cameras or lidar—presents promising safety benefits at a low cost.
AV2.0 is well-positioned to unlock the value of camera and radar with end-to-end learning
Sensor performance isn’t just about sensor physics. As an example, consider the mantis shrimp: its eyes are by some measures far more capable than ours (12 or more photoreceptor types against our three), yet its perception is vastly poorer because so little brain matter is attached to those eyes. Similarly, we must consider the performance of our embodied ‘driving’ intelligence that unlocks the value of cameras and radar.
The industry has recently seen machine learning used successfully for radar-camera fusion in perception. Transformer neural network architectures, which are very capable of aligning representations between camera and radar data, have accelerated this progress. In addition, our end-to-end neural network is not constrained by a hand-engineered scene representation. Instead, it learns a representation that best enables our system to leverage the complementary strengths of disparate sensing modalities.
For this reason, we’re excited to announce the introduction of radar in our sensor suite, starting with our second-generation autonomous driving system in development. This platform will provide the opportunity to demonstrate this thesis with the aim of delivering substantial safety enhancements. As we continue to see advancements in sensor technology, we will regularly assess what we consider to be an optimal AV2.0 sensor suite and build our end-to-end neural network to learn to adapt to new sensor modalities and mixes.