Educating the $7 Trillion Driverless “Brain”
As the Trends editors forecast roughly 20 years ago, autonomous highway vehicles, including automobiles, trucks and specialized delivery systems, are poised to disrupt the transportation of people and goods, leading to an era dominated by "mobility as a service." Small unmanned vehicles will deliver things like pizzas, groceries, and Ikea dressers. People will watch movies, read books, do work and even have sex in transit. Cities will be planned differently, as we'll need less parking and roads will be less crowded. Insurance will change as traffic injuries and fatalities decrease by as much as 90%. Many car-related businesses will disappear. And the owner-driver automobile industry we've known for over 110 years will change, as most people no longer buy cars at all.
How and when this self-driving technology disrupts transportation depends on the "design" of these technologies and how they are perceived by consumers. To understand why, consider the history of personal computers and smartphones. Neither became a must-have consumer product until it adopted an intuitive interface: the graphical user interface popularized by the Macintosh and the multi-touch screen of the iPhone, respectively. Similarly, self-driving cars have to leverage design to make autonomous driving approachable. In fact, the "brains" within the driverless car will need to make passengers as comfortable and confident as they are when climbing into an Uber or Lyft.
Most car manufacturers are already offering semi-autonomous driving aids, like auto-braking or auto-steering which can keep you in a lane, as safety features. At the same time, they’ve been slowly figuring out the best way for the driver to hand over control to the car and then take it back.
Consider Tesla's "Autopilot." Today's Autopilot can really only handle parking and highway driving, and it requires you to keep a hand on the wheel at all times; otherwise it sounds an alarm and forces you to take over. Even so, Tesla's Autopilot has been involved in three driver fatalities to date. For the time being, this "Level 2" product makes a lot of sense for Tesla, since its business model is to sell cars.
Meanwhile, Uber and Lyft have both been investing considerably in their own secretive self-driving programs, intended to sell a service rather than cars.
In early 2019, Lyft employed 300 engineers and was working to double that number, with $200 million in new investments from self-driving technology partner Magna. It also runs the booking platform for Aptiv, which has 30 vehicles circling the Las Vegas Strip offering autonomous rides between the hotels and major attractions.
In contrast, Uber has over 1,000 employees working on self-driving vehicles. Until recently, Uber was experimenting in Arizona; the state has a perfect climate, wide roads, light traffic, and generous regulations, making it a hotbed for the autonomous driving industry. However, in 2018, Uber pulled out of testing in Tempe after one of its vehicles fatally struck a pedestrian at night. As a result, Uber suspended its own self-driving vehicle testing and focused on partnering with others including Toyota.
Meanwhile, Waymo is already offering a service based on "Level 4" self-driving technology. With Level 4 autonomy, you can say, "Pick me up here, and drop me off there," and the car is smart enough to handle the rest. Since Waymo is first to market during a time when autonomous vehicle fatalities are still front-page news, the public perception of self-driving cars may largely be shaped by this one company — at least to start. Which may help explain why Waymo is positioning autonomous vehicles not as exciting and futuristic, but as a non-threatening public utility here to help prevent the 1.25 million auto-accident deaths that occur worldwide each year.
As a result, Waymo's designers and engineers have worked to make the service "courteous and cautious," in the words of Dan Chu, the head of product at Waymo. That narrative is reinforced across the full experience, called Waymo One, the company's Mobility as a Service solution now being refined in Chandler, AZ. "Courteous and cautious" is evident everywhere, from the car's interface, which cautions and reassures you at every turn, to the way it drives, braking early and often to avoid even the whiff of an accident. To assuage people's fears of climbing into "robocars" (and to prevent a truly catastrophic media event), Waymo One vehicles still carry human "drivers," who sit awkwardly at the self-steering wheel with their hands in their laps, prepared to hit a red "stop" button secured in the car's center cup holder.
At the heart of every self-driving car's "design challenge" is delivering vehicles that can operate with extremely high safety while sharing the streets with human-driven automobiles, pedestrians, bicycles, and other autonomous vehicles. To do so, a self-driving car has to continually answer one crucial question: "What will the cars, pedestrians and cyclists around me do over the next five seconds?" This problem is called behavior prediction. Recently, Anthony Levandowski, a former top engineer at Waymo, explained that "...the reason why nobody has consistently achieved level 4 or 5 functionality is because today's software is not good enough to predict the future. It's still nowhere close to matching the instincts of human drivers, which is the single most important factor in road safety."
At this very moment, companies like Waymo, Tesla, Ford and GM’s Cruise subsidiary are trying to solve the behavior prediction problem with deep learning, an approach in which multi-layered neural networks are trained on enormous data sets. Neural networks achieve better accuracy with more data, so the more data a company can collect, the better its vehicles will be at behavior prediction.
With behavior prediction, the input data might be what another car did over the last five seconds. The desired output data would be a prediction about what that car will do over the next five seconds. If you have a ten-second recording of what a car did, then you have an input-output pair that can be used for training. You don’t need a human to manually label anything.
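This "free labeling" property can be sketched in a few lines of code. The snippet below is a minimal illustration, not any company's actual pipeline; the function name and the 10 Hz sampling assumption (so 50 timesteps equals five seconds) are hypothetical.

```python
import numpy as np

def make_training_pairs(trajectory, past=50, future=50):
    """Slice one recorded trajectory (assumed 10 Hz, so 50 steps = 5 s)
    into (input, target) pairs for behavior prediction.
    The labels come free from the recording itself -- no human annotation."""
    pairs = []
    for t in range(past, len(trajectory) - future + 1):
        x = trajectory[t - past:t]    # what the car did over the last 5 s
        y = trajectory[t:t + future]  # what it actually did over the next 5 s
        pairs.append((x, y))
    return pairs

# A toy trajectory: 150 timesteps (15 s) of (x, y) positions for one tracked car.
traj = np.cumsum(np.random.randn(150, 2), axis=0)
pairs = make_training_pairs(traj)
print(len(pairs))   # 51 input-output pairs from a single 15-second recording
```

Every extra second of recording yields additional training pairs, which is why fleet mileage translates so directly into training data.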
It may not even be necessary for a company to upload videos. Instead, a vehicle can simply save a recording of an abstracted representation of what is going on around it.
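To see why abstraction matters for upload bandwidth, consider a rough back-of-the-envelope comparison. All figures below (six fields per object, ~100 KB per compressed camera frame) are illustrative assumptions, not published specifications.

```python
from dataclasses import dataclass

# Hypothetical abstracted representation: instead of raw camera frames,
# the car logs one compact state vector per tracked object per timestep.
@dataclass
class TrackedObject:
    obj_type: int   # e.g. 0 = car, 1 = pedestrian, 2 = cyclist
    x: float        # position relative to the ego vehicle (meters)
    y: float
    vx: float       # velocity (m/s)
    vy: float
    heading: float  # radians

FLOAT_BYTES = 4

def abstracted_bytes(n_objects, steps):
    # 6 fields * 4 bytes, per object, per timestep
    return n_objects * steps * 6 * FLOAT_BYTES

def raw_video_bytes(steps, cameras=8, frame_kb=100):
    # assumption: ~100 KB per compressed frame, per camera
    return steps * cameras * frame_kb * 1024

# Ten seconds at 10 Hz with 20 nearby objects:
print(abstracted_bytes(20, 100))  # 48,000 bytes (~47 KB)
print(raw_video_bytes(100))       # 81,920,000 bytes (~78 MB)
```

Under these assumptions, the abstracted log is more than a thousand times smaller than the raw video, which is what makes fleet-wide uploads over home Wi-Fi plausible.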
According to analyst Trent Eady, Tesla may have a sustainable competitive advantage in behavior prediction because it now has over 500,000 Hardware 2 and Hardware 3 vehicles on the road. These vehicles have eight cameras covering 360 degrees, a forward-facing radar, a so-called Full Self-Driving (FSD) Computer for running neural networks, and the ability to save data while driving and later upload it via the customer's Wi-Fi network when parked at home. Even if Tesla uploads only the abstracted representations from Hardware 2 and 3 cars and not the raw video, the fleet is a massive source of training data for behavior prediction, with no need for labor-intensive human labeling. And since neural networks improve with more data, this could become a real advantage for Tesla in a capability central to autonomous driving.
To maximize the value of this data, it pays to prioritize uploading examples where Tesla's existing behavior-prediction neural network made a wrong prediction, forcing the driver to intervene. Why? Because training on corrected mistakes improves a network much faster than training on random data.
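A simple trigger for this kind of prioritized upload might compare the on-board prediction against what actually happened and flag only the large misses. This is a sketch under assumed names and thresholds; the 2-meter average displacement error cutoff is purely illustrative.

```python
import numpy as np

ERROR_THRESHOLD_M = 2.0  # hypothetical: average displacement error, in meters

def should_upload(predicted_path, actual_path, threshold=ERROR_THRESHOLD_M):
    """Flag a clip for upload only when the on-board network's prediction
    diverged badly from reality -- i.e., a corrected mistake worth learning from."""
    ade = np.mean(np.linalg.norm(predicted_path - actual_path, axis=1))
    return ade > threshold

# The network predicted the lead car would keep going straight...
predicted = np.column_stack([np.linspace(0, 50, 50), np.zeros(50)])
# ...but it actually drifted two lanes over.
actual = np.column_stack([np.linspace(0, 50, 50), np.linspace(0, 7, 50)])

print(should_upload(predicted, actual))   # True -- a mistake worth uploading
```

Clips where prediction and reality agree are discarded on the car, so the training set skews toward exactly the cases the network currently gets wrong.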
This is a specific form of the "long tail" problem, in which some driving scenarios might occur only once in a million miles. Notably, a company whose fleet drives a billion miles a month, as Tesla's will once it reaches about 1 million vehicles, could collect up to 1,000 examples per month of once-in-a-million-mile occurrences, every month. Such data sets of rare occurrences would be relatively small even for a company with a million vehicles, yet they would be extremely valuable, since dealing with the long tail is critical to autonomous driving.
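The arithmetic behind that claim is simple enough to write down. The per-vehicle mileage figure below is an assumption chosen to match the article's billion-miles-a-month fleet total.

```python
# Back-of-the-envelope: how many once-in-a-million-mile events does a
# large fleet see per month? (Figures assumed to match the text.)
vehicles = 1_000_000
miles_per_vehicle_per_month = 1_000              # assumption: ~1,000 miles each
fleet_miles = vehicles * miles_per_vehicle_per_month   # 1 billion fleet-miles

event_rate = 1 / 1_000_000        # one occurrence per million miles driven
examples_per_month = fleet_miles * event_rate
print(int(examples_per_month))    # 1000 examples of a "rare" event, every month
```

A solo test fleet driving a few million miles a year would need centuries to gather the same number of examples of any given rare scenario.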
So, what’s the bottom line?
Three crucial elements are needed to build a mass-market self-driving vehicle.
First, there are data collection systems consisting of (1) camera arrays providing 360-degree vision, (2) positioning technologies including GPS, and (3) SONAR, RADAR and/or LIDAR sensors; together these provide full situational awareness. This technology exists, and the current challenge is to transform it from rare and expensive to cheap and commonplace. That's mostly a matter of exploiting Moore's Law, economies of scale, and learning-curve effects. This part of the equation is driven by a wide range of specialized vendors supporting Waymo, Tesla, GM, and others.
Second, there is the transformation of the traditional automobile from a piloted analog device into a robot with a passenger compartment and infotainment. Every legacy automobile company and a wide array of startups have been developing this piece of the puzzle for at least 15 years. Batteries and electric drive motors combine with steering and braking systems to create a car that can be driven by a computer and perhaps by a person, as well. Here software-oriented companies like Waymo have partnered with automakers such as Chrysler, while others, including Tesla, have created vertically integrated solutions of their own. At the moment, legacy automakers like GM, BMW and Ford seem to have an advantage in this area.
The third and most important component of a self-driving vehicle is the enormously powerful "brain of the vehicle," which uses neural networks to perform behavior prediction. Given the mission of getting safely from point "A" to point "B," this "brain" analyzes status information about the previous five seconds, as provided by the data collection system, to direct the robot vehicle to take specific actions over the next five seconds. On an interstate highway at 60 miles per hour, this is pretty easy. At rush hour, on a crowded city street with pedestrians, bicycles and erratic human drivers, it becomes incredibly complex for a machine. But given enough learning and processing power, the neural network can supposedly compete with the average cabbie.
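The loop that "brain" runs can be sketched in skeleton form: accumulate five seconds of world state, predict what nearby agents will do, then choose an action. Every name here is illustrative, not any vendor's actual API.

```python
from collections import deque

HISTORY_STEPS = 50   # 5 seconds of context at an assumed 10 Hz

class DrivingBrain:
    """Toy perceive -> predict -> plan loop for illustration only."""
    def __init__(self, predictor, planner):
        self.history = deque(maxlen=HISTORY_STEPS)  # rolling 5 s of world state
        self.predictor = predictor  # behavior-prediction neural network
        self.planner = planner      # path planner / controller

    def step(self, world_state):
        self.history.append(world_state)
        if len(self.history) < HISTORY_STEPS:
            return "hold"           # not enough context accumulated yet
        futures = self.predictor(list(self.history))  # predicted next 5 s, per agent
        return self.planner(futures)                  # e.g. "brake", "steer_left"

# Toy stand-ins: predict that everyone stays put, and always act cautiously.
brain = DrivingBrain(predictor=lambda h: {}, planner=lambda f: "brake")
for t in range(60):
    action = brain.step({"t": t})
print(action)   # "brake" -- once 5 s of history has accumulated
```

The hard part, of course, is entirely inside `predictor`: the skeleton is trivial, while matching a human driver's instincts in that one function is the unsolved problem Levandowski describes.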
Today, training the neural network and exploiting its power is the chief competitive challenge. At this point, other aspects of Mobility as a Service pale by comparison.
Given this trend, we offer the following forecasts for your consideration.
First, by 2050, Mobility as a Service will become a $7 trillion-a-year global industry.
Research from Strategy Analytics and Intel illustrates the enormous potential of this game-changing technology. Under this scenario, Level 5 self-driving vehicles will not arrive on a large scale until around 2030. However, the Trends editors expect Level 3 and Level 4 vehicles to become widely available before 2025.
Second, NVIDIA will be a big winner in terms of supplying the "brains" behind MaaS.
Near-term, there are only two places where someone can get the AI computing horsepower needed for MaaS: NVIDIA and Tesla. And only one of these is an open platform that's available for the whole industry to build on. Tesla is currently working on a next-generation two-chip FSD computer delivering 144 trillion operations per second (TOPS), which competes against the NVIDIA DRIVE AGX Pegasus computer, which delivers 320 TOPS for AI perception, localization and path planning. Obviously, companies like Intel will not sit on the sidelines and leave everything to NVIDIA and Tesla, but at the moment the prospects for NVIDIA look bright. And,
Third, Tesla will remain highly competitive if it can turn its apparent advantage in learning from Autopilot experience into real functionality.
With half a million cars on the road, Tesla can theoretically learn much faster than any of its competitors. And, at this point, Waymo and GM's Cruise subsidiary are way behind in terms of cumulative learning experience. Some analysts say, "more data just gets you steeply diminishing returns, so it's practically useless." But in very complex domains like driving, this is not true, because billions of interactions give you a chance to encounter many examples of very rare but very important situations. Also, even with diminishing returns, simply piling on more examples has proven remarkably effective in cases like OpenAI's language-generation network, GPT-2.