Does collecting more data lead to better decision-making? Competitive, data-savvy companies like Amazon, Google and Netflix have learned that data analysis alone doesn’t always produce optimum results. In this talk, data scientist Sebastian Wernicke breaks down what goes wrong when we make decisions based purely on data — and suggests a brainier way to use it.
00:11-Roy Price is a man that most of you have probably never heard about, even though he may have been responsible for 22 somewhat mediocre minutes of your life on April 19, 2013. He may have also been responsible for 22 very entertaining minutes, but not very many of you. And all of that goes back to a decision that Roy had to make about three years ago.
00:34-So you see, Roy Price is a senior executive with Amazon Studios. That’s the TV production company of Amazon. He’s 47 years old, slim, spiky hair, describes himself on Twitter as “movies, TV, technology, tacos.” And Roy Price has a very responsible job, because it’s his responsibility to pick the shows, the original content that Amazon is going to make. And of course that’s a highly competitive space. I mean, there are so many TV shows already out there, that Roy can’t just choose any show. He has to find shows that are really, really great. So in other words, he has to find shows that are on the very right end of this curve here.
01:16-So this curve here is the rating distribution of about 2,500 TV shows on the website IMDB, and the rating goes from one to 10, and the height here shows you how many shows get that rating. So if your show gets a rating of nine points or higher, that’s a winner. Then you have a top two percent show. That’s shows like “Breaking Bad,” “Game of Thrones,” “The Wire,” so all of these shows that are addictive, where after you’ve watched a season, your brain is basically like, “Where can I get more of these episodes?” That kind of show. On the left side, just for clarity, here on that end, you have a show called “Toddlers and Tiaras” —
01:58-— which should tell you enough about what’s going on on that end of the curve.
02:02-Now, Roy Price is not worried about getting on the left end of the curve, because I think you would have to have some serious brainpower to undercut “Toddlers and Tiaras.” So what he’s worried about is this middle bulge here, the bulge of average TV, you know, those shows that aren’t really good or really bad, they don’t really get you excited. So he needs to make sure that he’s really on the right end of this.
02:26-So the pressure is on, and of course it’s also the first time that Amazon is even doing something like this, so Roy Price does not want to take any chances. He wants to engineer success. He needs a guaranteed success, and so what he does is, he holds a competition.
02:42-So he takes a bunch of ideas for TV shows, and from those ideas, through an evaluation, they select eight candidates for TV shows, and then he just makes the first episode of each one of these shows and puts them online for free for everyone to watch. And so when Amazon is giving out free stuff, you’re going to take it, right? So millions of viewers are watching those episodes.
03:07-What they don’t realize is that, while they’re watching their shows, actually, they are being watched. They are being watched by Roy Price and his team, who record everything. They record when somebody presses play, when somebody presses pause, what parts they skip, what parts they watch again. So they collect millions of data points, because they want to have those data points to then decide which show they should make. And sure enough, so they collect all the data, they do all the data crunching, and an answer emerges, and the answer is, “Amazon should do a sitcom about four Republican US Senators.” They did that show.
03:42-So does anyone know the name of the show? (Audience: “Alpha House.”) Yes, “Alpha House,” but it seems like not too many of you here remember that show, actually, because it didn’t turn out that great. It’s actually just an average show, actually — literally, in fact, because the average of this curve here is at 7.4, and “Alpha House” lands at 7.5, so a slightly above average show, but certainly not what Roy Price and his team were aiming for. Meanwhile, however, at about the same time, at another company, another executive did manage to land a top show using data analysis, and his name is Ted, Ted Sarandos, who is the Chief Content Officer of Netflix, and just like Roy, he’s on a constant mission to find that great TV show, and he uses data as well to do that, except he does it a little bit differently. So instead of holding a competition, what he did — and his team of course — was they looked at all the data they already had about Netflix viewers, you know, the ratings they give their shows, the viewing histories, what shows people like, and so on. And then they use that data to discover all of these little bits and pieces about the audience: what kinds of shows they like, what kind of producers, what kind of actors. And once they had all of these pieces together, they took a leap of faith, and they decided to license not a sitcom about four Senators but a drama series about a single Senator. You guys know the show?
05:06-Yes, “House of Cards,” and Netflix of course, nailed it with that show, at least for the first two seasons.
05:16-“House of Cards” gets a 9.1 rating on this curve, so it’s exactly where they wanted it to be.
05:23-Now, the question of course is, what happened here? So you have two very competitive, data-savvy companies. They connect all of these millions of data points, and then it works beautifully for one of them, and it doesn’t work for the other one. So why? Because logic kind of tells you that this should be working all the time. I mean, if you’re collecting millions of data points on a decision you’re going to make, then you should be able to make a pretty good decision. You have 200 years of statistics to rely on. You’re amplifying it with very powerful computers. The least you could expect is good TV, right?
05:56-And if data analysis does not work that way, then it actually gets a little scary, because we live in a time where we’re turning to data more and more to make very serious decisions that go far beyond TV. Does anyone here know the company Multi-Health Systems? No one. OK, that’s good actually. OK, so Multi-Health Systems is a software company, and I hope that nobody here in this room ever comes into contact with that software, because if you do, it means you’re in prison.
06:30-If someone here in the US is in prison, and they apply for parole, then it’s very likely that data analysis software from that company will be used in determining whether to grant that parole. So it’s the same principle as Amazon and Netflix, but now instead of deciding whether a TV show is going to be good or bad, you’re deciding whether a person is going to be good or bad. And mediocre TV, 22 minutes, that can be pretty bad, but more years in prison, I guess, even worse.
07:01-And unfortunately, there is actually some evidence that this data analysis, despite having lots of data, does not always produce optimum results. And that’s not because a company like Multi-Health Systems doesn’t know what to do with data. Even the most data-savvy companies get it wrong. Yes, even Google gets it wrong sometimes.
07:19-In 2009, Google announced that they were able, with data analysis, to predict outbreaks of influenza, the nasty kind of flu, by doing data analysis on their Google searches. And it worked beautifully, and it made a big splash in the news, including the pinnacle of scientific success: a publication in the journal “Nature.” It worked beautifully for year after year after year, until one year it failed. And nobody could even tell exactly why. It just didn’t work that year, and of course that again made big news, including now a retraction of a publication from the journal “Nature.” So even the most data-savvy companies, Amazon and Google, they sometimes get it wrong. And despite all those failures, data is moving rapidly into real-life decision-making — into the workplace, law enforcement, medicine. So we should better make sure that data is helping.
08:18-Now, personally I’ve seen a lot of this struggle with data myself, because I work in computational genetics, which is also a field where lots of very smart people are using unimaginable amounts of data to make pretty serious decisions like deciding on a cancer therapy or developing a drug. And over the years, I’ve noticed a sort of pattern or kind of rule, if you will, about the difference between successful decision-making with data and unsuccessful decision-making, and I find this a pattern worth sharing, and it goes something like this.
08:49-So whenever you’re solving a complex problem, you’re doing essentially two things. The first one is, you take that problem apart into its bits and pieces so that you can deeply analyze those bits and pieces, and then of course you do the second part. You put all of these bits and pieces back together again to come to your conclusion. And sometimes you have to do it over again, but it’s always those two things: taking apart and putting back together again.
09:13-And now the crucial thing is that data and data analysis is only good for the first part. Data and data analysis, no matter how powerful, can only help you taking a problem apart and understanding its pieces. It’s not suited to put those pieces back together again and then to come to a conclusion. There’s another tool that can do that, and we all have it, and that tool is the brain. If there’s one thing a brain is good at, it’s taking bits and pieces back together again, even when you have incomplete information, and coming to a good conclusion, especially if it’s the brain of an expert.
09:47-And that’s why I believe that Netflix was so successful, because they used data and brains where they belong in the process. They use data to first understand lots of pieces about their audience that they otherwise wouldn’t have been able to understand at that depth, but then the decision to take all these bits and pieces and put them back together again and make a show like “House of Cards,” that was nowhere in the data. Ted Sarandos and his team made that decision to license that show, which also meant, by the way, that they were taking a pretty big personal risk with that decision. And Amazon, on the other hand, they did it the wrong way around. They used data all the way to drive their decision-making, first when they held their competition of TV ideas, then when they selected “Alpha House” to make as a show. Which of course was a very safe decision for them, because they could always point at the data, saying, “This is what the data tells us.” But it didn’t lead to the exceptional results that they were hoping for.
10:41-So data is of course a massively useful tool to make better decisions, but I believe that things go wrong when data is starting to drive those decisions. No matter how powerful, data is just a tool, and to keep that in mind, I find this device here quite useful. Many of you will …
11:00-Before there was data, this was the decision-making device to use.
11:06-Many of you will know this. This toy here is called the Magic 8 Ball, and it’s really amazing, because if you have a decision to make, a yes or no question, all you have to do is you shake the ball, and then you get an answer — “Most Likely” — right here in this window in real time. I’ll have it out later for tech demos.
11:23-Now, the thing is, of course — so I’ve made some decisions in my life where, in hindsight, I should have just listened to the ball. But, you know, of course, if you have the data available, you want to replace this with something much more sophisticated, like data analysis to come to a better decision. But that does not change the basic setup. So the ball may get smarter and smarter and smarter, but I believe it’s still on us to make the decisions if we want to achieve something extraordinary, on the right end of the curve. And I find that a very encouraging message, in fact, that even in the face of huge amounts of data, it still pays off to make decisions, to be an expert in what you’re doing and take risks. Because in the end, it’s not data, it’s risks that will land you on the right end of the curve.