Co-authored with Ajay Agrawal and Avi Goldfarb
[This post appeared in HBR Online on March 28, 2017]
It doesn’t take a tremendous amount of training to begin a job as a cashier at McDonald’s. Even on their first day, most new cashiers are good enough. And they improve as they serve more customers. Although a new cashier may be slower and make more mistakes than their experienced peers, society generally accepts that they will learn from experience.
We don’t often think of it, but the same is true of commercial airline pilots. We take comfort that airline transport pilot certification is regulated by the U.S. Department of Transportation’s Federal Aviation Administration and requires minimum experience of 1,500 hours of flight time, 500 hours of cross-country flight time, 100 hours of night flight time, and 75 hours of instrument operations time. But we also know that pilots continue to improve from on-the-job experience.
On January 15, 2009, when US Airways Flight 1549 was struck by a flock of Canada geese, shutting down all engine power, Captain Chesley “Sully” Sullenberger miraculously landed his plane in the Hudson River, saving the lives of all 155 people on board. Most reporters attributed his performance to experience. He had recorded 19,663 total flight hours, including 4,765 flying an A320. Sully himself reflected: “One way of looking at this might be that for 42 years, I’ve been making small, regular deposits in this bank of experience, education, and training. And on January 15, the balance was sufficient so that I could make a very large withdrawal.” Sully, and all his passengers, benefited from the thousands of people he’d flown before.
The difference between cashiers and pilots in what constitutes “good enough” is based on tolerance for error. Obviously, our tolerance is much lower for pilots. This is reflected in the amount of in-house training we require them to accumulate prior to serving their first customers, even though they continue to learn from on-the-job experience. We have different definitions for good enough when it comes to how much training humans require in different jobs.
The same is true of machines that learn.
Artificial intelligence (AI) applications are based on generating predictions. Unlike traditionally programmed computer algorithms, which are designed to take data and follow a specified path to produce an outcome, machine learning, the most common approach to AI these days, involves algorithms that improve through various learning processes. A machine is given data, including outcomes; it finds associations; and then, based on those associations, it takes new data it has never seen before and predicts an outcome.
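The learn-then-predict loop described above can be sketched in a few lines of Python. This is only an illustration, not how any product mentioned in this post works: it uses a simple nearest-neighbor vote as a stand-in for "finding associations," and the feature names, labels, and numbers are made up for the example.

```python
from collections import Counter
from math import dist

def train(examples):
    """For nearest-neighbor prediction, 'training' is just storing labeled examples."""
    return list(examples)

def predict(model, features, k=3):
    """Predict an outcome by majority vote among the k closest stored examples."""
    nearest = sorted(model, key=lambda ex: dist(ex[0], features))[:k]
    votes = Counter(outcome for _, outcome in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (hours of practice, errors per 100 tasks) -> outcome
training_data = [
    ((5, 20), "needs supervision"),
    ((10, 15), "needs supervision"),
    ((8, 18), "needs supervision"),
    ((200, 2), "ready"),
    ((150, 3), "ready"),
    ((180, 1), "ready"),
]

model = train(training_data)

# New data the model has never seen before
print(predict(model, (170, 2)))  # ready
print(predict(model, (7, 19)))   # needs supervision
```

The more labeled examples the model has stored, the more situations it can match new data against, which is the sense in which a machine "improves with experience."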
This means that intelligent machines need to be trained, just as pilots and cashiers do. Companies design systems to train new employees until they are good enough and then deploy them into service, knowing that they will improve as they learn from experience doing their job. While this seems obvious, determining what constitutes good enough is an important decision. In the case of machine intelligence, it can be a major strategic decision regarding timing: when to shift from in-house training to on-the-job learning.
There is no ready-made answer as to what constitutes “good enough” for machine intelligence. Instead, there are trade-offs. Success with machine intelligence will require taking these trade-offs seriously and approaching them strategically.
The first question firms must ask is what tolerance they and their customers have for error. We have high tolerance for error with some intelligent machines and a low tolerance for others. For example, Google’s Inbox application reads your email, uses AI to predict how you will want to respond, and generates three short responses for you to choose from. Many users report enjoying using the application even when it has a 70% failure rate (i.e., the AI-generated response is only useful 30% of the time). The reason for this high tolerance for error is that the benefit of reduced composing and typing outweighs the cost of wasted screen real estate when the predicted short response is wrong.
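The reasoning behind that tolerance is simple expected-value arithmetic, which can be sketched as follows. Only the 30% usefulness figure comes from the text above; the seconds saved, the seconds lost, and the "costly mistake" figure are hypothetical numbers chosen purely to illustrate the logic.

```python
def net_benefit(p_useful, gain_if_useful, loss_if_wrong):
    """Expected net benefit per prediction, in whatever unit gain and loss share."""
    return p_useful * gain_if_useful - (1 - p_useful) * loss_if_wrong

def break_even_accuracy(gain_if_useful, loss_if_wrong):
    """Share of useful predictions at which the expected net benefit is zero."""
    return loss_if_wrong / (gain_if_useful + loss_if_wrong)

# Hypothetical: a useful suggested reply saves 20 seconds of typing,
# a useless one costs 1 second of glancing at it.
reply_net = net_benefit(0.30, gain_if_useful=20, loss_if_wrong=1)
reply_threshold = break_even_accuracy(20, 1)

# Hypothetical: same gain, but a wrong prediction is 100,000 times as costly.
costly_threshold = break_even_accuracy(20, 100_000)

print(f"net seconds saved per email at 30% useful: {reply_net:.1f}")
print(f"break-even accuracy, cheap mistakes: {reply_threshold:.1%}")
print(f"break-even accuracy, costly mistakes: {costly_threshold:.2%}")
```

Under these assumed numbers, a suggestion feature pays off at well under 30% accuracy, while a system whose mistakes are very costly must be right almost every time before deployment makes sense.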
In contrast, we have low tolerance for error in the realm of autonomous driving. The first generation of autonomous vehicles, largely pioneered by Google, was trained using specialist human drivers who took a limited set of vehicles and drove them hundreds of thousands of kilometers. It was like a parent taking a teenager on supervised driving experiences before letting them drive on their own.
The human specialist drivers provide a safe training environment, but it is also an extremely limited one. The machine only learns about a small number of situations. It may take many millions of miles in varying environments and situations before the machine has learned how to deal with the rare incidents that are more likely to lead to accidents. For autonomous vehicles, real roads are nasty and unforgiving precisely because nasty or unforgiving human-caused situations can occur on them.
The second question to ask, then, is how important it is to capture user data in the wild. Understanding that training might take a prohibitively long time, Tesla rolled out autonomous vehicle capabilities to all its recent models. These capabilities included a set of sensors that collect environmental data as well as driving data that is uploaded to Tesla’s machine learning servers. In a very short period of time, Tesla can obtain training data just by observing how the drivers of its cars drive. The more Tesla vehicles there are on the roads, the more Tesla’s machines can learn.
However, in addition to passively collecting data as humans drive their Teslas, the company needs autonomous driving data to understand how its autonomous systems are operating. For that, it needs to have cars drive autonomously so that it can assess performance, but also assess when a human driver, required to be there and paying attention, chooses to intervene. Tesla’s ultimate goal is not to produce a copilot, or a teenager who drives under supervision, but a fully autonomous vehicle. That requires getting to the point where real people feel comfortable in a self-driving car.
Herein lies a tricky trade-off. In order to get better, Tesla needs its machines to learn in real situations. But putting its current cars in real situations means giving customers a relatively “young and inexperienced” driver, although perhaps one as good as or better than many young human drivers. Still, this is far riskier than beta testing whether, for example, Siri or Alexa understands what you said, or whether Google Inbox correctly predicts your response to an email. In the case of Siri, Alexa, or Google Inbox, a mistake means a lower-quality user experience. In the case of autonomous vehicles, it means putting lives at risk.
As Backchannel documented in a recent article, that experience can be scary. Cars can exit freeways without notice, or put on the brakes when mistaking an underpass for an obstruction. Nervous drivers may opt not to use the autonomous features, and, in the process, may hinder Tesla’s ability to learn. Furthermore, even if the company can persuade some people to become beta testers, are those the people it wants? After all, a beta tester for autonomous driving may be someone with a taste for more risk than the average driver. In that case, who is the company training its machines to be like?
Machines learn faster with more data, and more data is generated when machines are deployed in the wild. However, bad things can happen in the wild and harm the company brand. Putting products in the wild earlier accelerates learning but risks harming the brand (and perhaps the customer!); putting products in the wild later slows learning but allows for more time to improve the product in-house and protect the brand (and, again, perhaps the customer).
For some products, like Google Inbox, the answer to the trade-off seems clear because the cost of poor performance is low and the benefits from learning from customer usage are high. It makes sense to deploy this type of product in the wild early. For other products, like cars, the answer is less clear. As more companies seek to take advantage of machine learning, this is a trade-off more and more will have to make.
2 Replies to “The Trade-Off Every AI Company Will Make”
You leave out an important point: Each unit of experience is available to all the robots of the same type — at least by their next update. It is like having Sullenberger’s experience (and that of all other pilots) available to every pilot.
Right now robots mostly need more experience than humans to learn the same skill. That is improving, hard to say how quickly. But robots can gain experience from all their siblings, something humans cannot do nearly as quickly or as completely. So as robots are more broadly deployed their skills will advance faster than ours.