Calculating Tournament Odds
What are the chances of a 16-seed winning it all?

The NCAA Basketball Tournament is one of the purest forms of competition that the world has ever seen. Once a team gets there, anything can happen. Many Americans participate in office pools and fill out brackets, hoping to pick the upsets that nobody else will see.

One of the most intriguing stories of the tournament is the lowly 16-seed. Through 2007 no 16-seed has ever won their first-round NCAA Tournament game. What are the odds of it happening? Is it possible to calculate the chances of an outcome that has never occurred? If they win the first game, what are the chances of them winning the tournament? How would you calculate it?

I've come up with a pretty good methodology for predicting a first-round upset. By my calculation, the chance of a 16-seed beating a 1-seed in the first round of the NCAA Basketball Tournament is 1.84%, or 54-to-1. After that it will take a little more creativity to come up with a number. So now I have to prove it, huh?

The Calculation

I originally did this analysis back in 1997 and posted it to a newsgroup. It was published in the online Journal of Basketball Studies (JoBS), an electronic clearinghouse of scientific research about the game of basketball. I made several assumptions back then and the author of JoBS, Dean Oliver, pretty much agreed that my assumptions were valid. The assumptions I made are:

(1) The final scores are representative of how well the teams played.
(2) The NCAA Tournament committee that does the seedings knew what it was doing, and everyone got the seed they deserved.
(3) All 1 seeds are created equal and all 16 seeds are created equal. I know this isn't true, but as a group, we can predict their performances.
(4) The difference between the winning and losing scores follows a normal distribution.

I took all of the results for first round games from 1985 to 2007 and massaged the data into a format I could use. For 1 vs. 16, I calculated the average margin of victory by the 1 seeds to be 24.56 points with a standard deviation of 12.24. Using a value of x = -1 (the narrowest margin that means a win for the 16 seed), I plugged these numbers into the cumulative normal distribution function (NORMDIST in Microsoft Excel, with the cumulative argument set to TRUE), and the value returned was 0.0184, or 1.84%. That's a 1 in 54 chance of victory for a given 16 seed in a 1 vs. 16 game.
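For readers without Excel, the same calculation can be sketched in Python. This is only an illustration: the function name upset_probability and its threshold parameter are my own labels, not part of the original analysis, and the sketch leans on the same normal-distribution assumption listed above.

```python
from statistics import NormalDist

def upset_probability(mean_margin, stdev_margin, threshold=-1.0):
    """Chance that the favorite's margin is at or below `threshold`,
    assuming margins are normally distributed (assumption 4)."""
    return NormalDist(mu=mean_margin, sigma=stdev_margin).cdf(threshold)

# 1 vs. 16 figures quoted in the article (1985 to 2007).
p_16 = upset_probability(24.56, 12.24)

print(f"Upset probability: {p_16:.4f}")
print(f"Roughly 1 in {1 / p_16:.0f}")
```

The cdf call plays the role of NORMDIST with the cumulative flag turned on, so the result should land very close to the 0.0184 quoted above.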

There have been 92 matchups so far, so chances are that it should've happened by now. But a 54 to 1 shot failing to come in over 92 tries isn't an unlikely scenario: if the games are independent, there's about an 18% chance of seeing no upsets at all. It's not a huge statistical anomaly. How many games would have to be played before this evaluation is suspect? If I live to be that old, I'll look forward to you rubbing it in my face. You won't be in such great shape yourself.
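Here is a rough sketch of that reasoning in Python. It assumes every 1 vs. 16 game is independent with the same 1.84% upset chance, and the 5% cutoff for calling the model "suspect" is my own arbitrary choice, not something from the original analysis.

```python
import math

P_UPSET = 0.0184   # per-game upset chance from the article
GAMES = 92         # 1 vs. 16 games played, 1985 to 2007

# Chance that every one of the 92 games goes to the 1 seed.
p_no_upset = (1 - P_UPSET) ** GAMES
p_at_least_one = 1 - p_no_upset

# Hypothetical cutoff: call the model suspect once a shutout this long
# would happen less than 5% of the time.
SUSPECT_LEVEL = 0.05
games_needed = math.ceil(math.log(SUSPECT_LEVEL) / math.log(1 - P_UPSET))

print(f"P(no upsets in {GAMES} games) = {p_no_upset:.3f}")
print(f"P(at least one upset)        = {p_at_least_one:.3f}")
print(f"Games before a shutout looks suspect: {games_needed}")
```

Under those assumptions a shutout through 92 games is unremarkable, and at four games a year it would take a few more decades without an upset before the 5% line is crossed.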

How valid is this method?

According to my research, a 16 seed should have beaten a 1 seed by now. But they haven't. Does that invalidate my methodology? Is it a good predictor?

As a test I ran the same analysis on all other first round games from 1985 to 2007. Using the calculated value of the chance for an upset in one game, I multiplied it by 92 to get the number of upsets that this methodology predicts. Those values are listed in the table below, along with the actual number of upsets that have occurred during that time period.

Matchup    Mean    Stdev   Upset Prob.   Proj. Upsets   Actual Upsets
1 vs. 16   24.56   12.24   0.0184         1.62           0
2 vs. 15   16.40   10.92   0.0556         4.89           4
3 vs. 14   11.03   10.63   0.1288        11.33          14
4 vs. 13    9.34   11.35   0.1811        15.94          17
5 vs. 12    4.76   10.13   0.2847        25.06          29
6 vs. 11    4.19   10.39   0.3086        27.15          26
7 vs. 10    2.48   10.86   0.3745        32.95          35
8 vs. 9    -0.56   10.95   0.4839        42.58          47
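The upset probabilities in the table can be re-derived from the mean and standard deviation columns. The sketch below does that, again using x = -1 as the upset threshold; the names TABLE, GAMES, and upset_probability are illustrative only. One thing to watch: the Proj. Upsets column as printed appears to equal the probability times 88 rather than 92 (for example, 0.0184 * 88 = 1.62), so the projections printed here for 92 games come out slightly higher than the table's.

```python
from statistics import NormalDist

# (matchup, mean margin, stdev, upset probability as printed in the table)
TABLE = [
    ("1 vs. 16", 24.56, 12.24, 0.0184),
    ("2 vs. 15", 16.40, 10.92, 0.0556),
    ("3 vs. 14", 11.03, 10.63, 0.1288),
    ("4 vs. 13", 9.34, 11.35, 0.1811),
    ("5 vs. 12", 4.76, 10.13, 0.2847),
    ("6 vs. 11", 4.19, 10.39, 0.3086),
    ("7 vs. 10", 2.48, 10.86, 0.3745),
    ("8 vs. 9", -0.56, 10.95, 0.4839),
]
GAMES = 92  # four games per matchup per year, 1985 to 2007

def upset_probability(mean_margin, stdev_margin, threshold=-1.0):
    """Normal-model chance that the better seed's margin is <= threshold."""
    return NormalDist(mu=mean_margin, sigma=stdev_margin).cdf(threshold)

recomputed = {m: upset_probability(mean, sd) for m, mean, sd, _ in TABLE}

for matchup, _, _, table_p in TABLE:
    p = recomputed[matchup]
    print(f"{matchup:9s} p={p:.4f} (table {table_p:.4f}) "
          f"projected over {GAMES} games: {p * GAMES:.2f}")
```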

I'll leave it as an exercise to the reader to show that the Projected Upsets and Actual Upsets are close enough to validate the methodology. Frankly, I was shocked to see what an accurate predictor this method was!
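One way to start that exercise is to treat each matchup as 92 independent games and ask how many standard deviations the actual upset count sits from the projected one. This is a sketch of one possible check, not the test I originally ran: the binomial model, the 92-game count, and the "within 2 standard deviations" yardstick are all assumptions, and the normal yardstick is coarse for the rarest upsets.

```python
import math

# (matchup, upset probability from the table, actual upsets from the table)
RESULTS = [
    ("1 vs. 16", 0.0184, 0),
    ("2 vs. 15", 0.0556, 4),
    ("3 vs. 14", 0.1288, 14),
    ("4 vs. 13", 0.1811, 17),
    ("5 vs. 12", 0.2847, 29),
    ("6 vs. 11", 0.3086, 26),
    ("7 vs. 10", 0.3745, 35),
    ("8 vs. 9", 0.4839, 47),
]
GAMES = 92  # assumed number of games per matchup, 1985 to 2007

def z_score(p, actual, games=GAMES):
    """Distance of the actual count from the binomial expectation,
    measured in binomial standard deviations."""
    expected = games * p
    spread = math.sqrt(games * p * (1 - p))
    return (actual - expected) / spread

z_scores = {m: z_score(p, actual) for m, p, actual in RESULTS}

for matchup, z in z_scores.items():
    print(f"{matchup:9s} z = {z:+.2f}")
```

Under these assumptions every matchup lands well inside two standard deviations of its projection, which is consistent with the claim that the method predicts well.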

According to the above numbers, there's a much better than even chance that at least one 2, 3, or 4 seed will lose in the first round. The expected number of such upsets per tournament is 1.462 (4*0.0556 + 4*0.1288 + 4*0.1811), and treating the 12 games as independent, the chance of at least one works out to about 79%, or nearly 4 to 1 in favor. That's pretty phenomenal. If you're filling out your bracket and you need a different strategy, take a chance and pick a 2, 3, or 4 to lose. Or bet somebody even-money on it. You may not win every year, but the odds are with you.
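The sum 4*0.0556 + 4*0.1288 + 4*0.1811 gives the expected number of 2, 3, or 4 seeds losing in one tournament. The chance that at least one of them loses is a separate calculation. A quick sketch, assuming the 12 games are independent and using the table's probabilities (the variable names are mine):

```python
import math

# Per-game upset probabilities from the table.
UPSET_PROBS = {"2 vs. 15": 0.0556, "3 vs. 14": 0.1288, "4 vs. 13": 0.1811}
GAMES_PER_MATCHUP = 4  # one game per region

# Expected number of 2/3/4 seeds that lose in the first round.
expected_upsets = sum(GAMES_PER_MATCHUP * p for p in UPSET_PROBS.values())

# Chance that all twelve favorites win, then its complement.
p_no_upsets = math.prod((1 - p) ** GAMES_PER_MATCHUP
                        for p in UPSET_PROBS.values())
p_at_least_one = 1 - p_no_upsets
odds_in_favor = p_at_least_one / p_no_upsets

print(f"Expected upsets:       {expected_upsets:.3f}")
print(f"P(at least one upset): {p_at_least_one:.3f}")
print(f"Odds in favor:         {odds_in_favor:.2f} to 1")
```

Either way you slice it, an even-money bet on at least one of these upsets looks favorable under the model.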

So what's it going to take for a 16 seed to make me look good?

How can a 16 seed do it?

In the past 23 years, 92 different coaches of 16-seed teams have tried to be the Cinderella story of the tournament and all of them have failed. Some have gotten close. In 1990 Michigan State and Murray State went to overtime. In 1989 Princeton took the last two shots in their 50-49 loss to Georgetown. In 2006, Albany was up by 12 with 12 minutes to go until UConn woke up and went on a run to win it (amazingly, I was interviewed prior to the game). Five games have been decided by 5 points or fewer. Eleven have been decided by 9 points or fewer. What's it going to take?

Perfect Execution: This is the way most coaches want to make it happen. Play smart. Minimize turnovers. Execute our game plan. A total team effort. Believe in yourselves and don't be intimidated. Frustrate the opponent early and get them out of their rhythm. When it's close late in the game, the crowd gets on your side, adrenaline kicks in, and you hope you can keep it close down the stretch. Failing that...

Bombs Away: Another possible scenario is that one or two players get into a zone and can't miss from 3-point range. If a team is making 3 points every time down the floor, it's hard for any team to overcome that.

The Gang's All Here: In 1983, NC State was hobbled by injuries all year. When the ACC Tournament came around, everyone was finally healthy. They won the ACC tournament and got a 6-seed in the NCAA tournament. With everybody healthy they played much better than a 6-seed. Of course they got a little lucky too, but they ended up being, at the time, the lowest seed ever to win the Big Dance. It's possible that a 16-seed is really better than their seeding due to injuries or the team finally gelling. That shifts the odds in their favor.

Busted!! The 1-seed comes into town, the players decide to visit a strip club, a fight breaks out, and half of the starters end up in jail. This is one of those random chance scenarios that no one can "make" happen, but it is a possibility when discussing the odds. We can't discount the chances that an off-court incident will somehow make the impossible happen. Other variants of this involve injuries at the beginning of the game or team illness for the 1-seed.

How it's getting less likely

Unfortunately, the more recent numbers paint a bleaker picture for the hopes of the 16-seed. When I did my analysis in 1997, I had calculated the odds of the 1 vs. 16 upset at 40 to 1. In 2007 it's 54 to 1. What has happened in the last 10 years that makes it less likely?

Play-in game: For the past few years there have actually been 65 teams in the Big Dance, with two teams playing to win the "right" to be the 16-seed in one of the brackets. So 1/4 of the 16 seeds are playing a game right before they play the 1-seed. That's not a rested team, which is definitely needed before playing in what could be the toughest game of their season. This is a permanent part of the tournament, and if the NCAA hands out any more automatic bids to new conferences and doesn't take any away, we'll have more play-in games to see who gets to be worn out for the 1-seed.

Teams stay closer to home: The NCAA tries to get teams to play as close to home as possible, with 1-seeds getting the highest priority. There's less of a chance for a home-court disadvantage or teams being weary from travel.

Everyone knows everyone: It's no longer the case that it's hard to find information about a 16-seed. There is so much information on the Internet and ESPN that the coach of a 1-seed will be able to find out a good deal about the team he's playing. The 1-seed won't be shocked when they see the team for the first time, which takes away from a possible "element of surprise".

No sleeper players: 15 years ago you could say that there was a chance that a future Hall of Famer would end up on a team that was a potential 16-seed. With all of the scouting that's going on now, the chances are smaller that a potential superstar will escape the sights of the top college programs. Players are scouted from the 7th grade and at countless basketball camps and rated on the Internet. The teams in the lower ranked conferences just aren't getting the players that they used to, and their chances of getting a "diamond in the rough" go down as the high school players get more scrutiny.

No mercy...ever: No coach of a 1-seed wants the dubious distinction of losing in the first round of the NCAAs. If they're a 1-seed they have National Championship aspirations, and losing that early would certainly not endear them to their schools or booster clubs. As the years go by, the pressure to win increases. They know about the close games and nothing scares a coach more than a potential upset by a team that they should beat handily, especially during March Madness. Coaches use this game to send a signal to everyone else in the tournament: We're the best team in the country, so fear us. Also this is probably the last game that the scrubs could potentially see any significant playing time and coaches like to reward them with some minutes in the tournament. That's not going to happen unless the game is well in hand. Chivalry be damned. Blowout city, here we come.

Welcome to the Computer Age: I don't know how the seeding committee for the NCAA Tournament did things 15 years ago, but now they use RPI. In the past there were probably some 16-seeds that should have been ranked higher, which increased the chance of an upset. With RPI ruling the selections, there's little chance that someone like the Ivy League champ will get only a 16-seed if they deserve higher.

Beyond Round 1...

So if a 16-seed pulls off the upset in the first round, what are their chances after that? I'm sure that one of the starting points they use in Las Vegas for determining a point spread is a formula concerning the RPI of the two teams involved. An 8/9 seed will have an RPI of about 30 or 35, and the 16-seed will most likely have one between 100 and 200. This would be an interesting study. My gut feeling is that the 8/9 seed would be a 10 to 1 favorite to win the game.

After the victory in Round 2? As analytical as I am, if this were to happen, there's really no way to calculate the chances after that. How many upsets have there been in the rest of the tournament? Are Cinderella teams going to be playing each other? Did the seeding committee really screw up and seed a team 16 when they are really a 10 or a 5? Perhaps the best thing going for them is that after winning two games, the 16-seed is going to feel like they have a 1 out of 16 chance to win the tournament, and who am I to argue with that? If they truly believe in themselves, it won't be mere numbers that determine their chances of winning, but their heart and determination.

Any comments? Criticism? Please feel free to contact me at graybill@mindspring.com. If I get enough good comments, I may start a blog or have some other type of feedback mechanism.