Calculating Tournament Odds
What are the chances of a 16-seed winning it all?
The NCAA Basketball Tournament is one of the purest forms of
competition that the world has ever seen. Once a team gets there
anything can happen. Many Americans participate in office pools
and fill out brackets, hoping to be able to pick the upsets that nobody
else will see.
One of the most intriguing stories of the tournament is the lowly
16-seed. Through 2007 no 16-seed has ever won their first-round NCAA
Tournament game. What are the odds of it happening? Is it possible to
calculate the chances of an outcome that has never occurred? If they
win the first game, what are the chances of them winning the
tournament? How would you calculate it?
I've come up with a pretty good methodology for predicting a
first-round upset. By that method, the chances of a 16-seed beating a
1-seed in the first round of the NCAA Basketball Tournament are 1.84%,
or 54-to-1. After that it will take a little more creativity to come up
with a number. So now I have to prove it, huh?
The Calculation
I originally did this analysis back in 1997 and posted it to a
newsgroup. It was published in the online Journal of Basketball Studies
(JoBS), an electronic clearinghouse of scientific research about the
game of basketball. I made several assumptions back then and the author
of JoBS, Dean Oliver, pretty much agreed that my assumptions
were valid. The assumptions I made are:
(1) The final scores are representative of how well the teams
played.
(2) The NCAA Tournament committee that does the seedings
knew what it was doing and everyone got the seeds they
deserved.
(3) All 1 seeds are created equal and all 16 seeds are created
equal. I know this isn't true, but as a group, we can predict their
performances.
(4) The difference between the winning and losing scores
follows a normal distribution.
I took all of the results for first-round games from 1985 to 2007 and
massaged the data into a format I could use. For 1 vs. 16, I calculated
the average margin of victory by the 1 seeds to be 24.56 points with a
standard deviation of 12.24. Using a value of x=-1 (a margin of victory
indicating a win for the 16 seed), I plugged it into the cumulative
normal distribution function (NORMDIST in Microsoft Excel) and the
value returned was 0.0184, or 1.84%. That's a 1 in 54 chance of
victory for a given 16 seed in a 1 vs. 16 game.
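The NORMDIST step can be reproduced outside of Excel. Here is a minimal Python sketch, assuming the margin of victory is normally distributed as in assumption (4). The function name `upset_probability` is my own label for illustration and is not part of the original spreadsheet.

```python
import math

def upset_probability(mean_margin, stdev, x=-1.0):
    """P(favorite's margin <= x) under a normal model.

    Mirrors Excel's NORMDIST(x, mean_margin, stdev, TRUE).
    x = -1 is the cutoff used in the text for a win by the lower seed.
    """
    z = (x - mean_margin) / stdev
    # Standard normal CDF written in terms of the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Mean and standard deviation for 1 vs. 16 games, taken from the text
p = upset_probability(24.56, 12.24)
print(f"1 vs. 16 upset probability: {p:.4f}")  # about 0.0184
print(f"roughly 1 in {1 / p:.0f}")             # roughly 1 in 54
```

Passing a different mean and standard deviation gives the figure for any other seed matchup.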
There have been 92 matchups so far, so on average it should've
happened by now. But a 54 to 1 shot failing to come in over 92 tries
isn't an unlikely scenario: at these odds there's about an 18% chance
of no upset at all. It's not a huge statistical
anomaly. How many games would have to be played before this evaluation
is suspect? If I live to be that old, I'll look forward to you rubbing
it in my face. You won't be in such great shape yourself.
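The "how many games" question can be given a rough answer. This is a sketch that assumes every 1 vs. 16 game is independent with the same 1.84% upset chance, and that treats a streak as suspect once it has less than a 5% chance of occurring. The 5% cutoff is my assumption, a common convention rather than anything from the original analysis.

```python
import math

p_upset = 0.0184  # single-game upset probability from the text
games = 92        # 1 vs. 16 games played from 1985 to 2007

# Chance that every one of those games goes to the 1 seed
p_no_upset = (1 - p_upset) ** games
print(f"P(no upset in {games} games) = {p_no_upset:.3f}")  # about 0.18

# Smallest number of games for which a zero-upset streak would have
# less than a 5% chance (the cutoff is an assumed convention)
threshold = 0.05
n_suspect = math.ceil(math.log(threshold) / math.log(1 - p_upset))
print(f"streak looks suspect after about {n_suspect} games")
```

With these inputs the cutoff lands at a little over 160 games, or about 40 tournaments at four games a year.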
How valid is this method?
According to my research, a 16 seed should have beaten a
1 seed by now. But they haven't. Does that invalidate my methodology?
Is it a good predictor?
As a test I ran the same analysis on all other first round games from
1985 to 2007. Using the calculated value of the chance for an upset in
one game, I multiplied it by 92 to get the number of upsets that this
methodology predicts. Those values are listed in the table below, along
with the actual number of upsets that have occurred during that time
period.
Matchup  | Mean Margin | Stdev | Upset Prob. | Proj. Upsets | Actual Upsets
1 vs. 16 | 24.56       | 12.24 | 0.0184      | 1.62         | 0
2 vs. 15 | 16.40       | 10.92 | 0.0556      | 4.89         | 4
3 vs. 14 | 11.03       | 10.63 | 0.1288      | 11.33        | 14
4 vs. 13 | 9.34        | 11.35 | 0.1811      | 15.94        | 17
5 vs. 12 | 4.76        | 10.13 | 0.2847      | 25.06        | 29
6 vs. 11 | 4.19        | 10.39 | 0.3086      | 27.15        | 26
7 vs. 10 | 2.48        | 10.86 | 0.3745      | 32.95        | 35
8 vs. 9  | -0.56       | 10.95 | 0.4839      | 42.58        | 47
I'll leave it as an exercise to the reader to show that the Projected
Upsets and Actual Upsets are close enough to validate the methodology.
Frankly, I was shocked to see what an accurate predictor this method
was!
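For readers who want a head start on that exercise, here is one possible check. It is a sketch of my own, not the original validation: it treats each matchup as a run of independent games with the table's upset probability, and compares the actual count to the projection in units of the binomial standard deviation. Calling anything within 2 standard deviations "close" is an assumed rule of thumb. One thing to note: the printed Proj. Upsets column matches multiplying by 88 games rather than 92 (0.0184 x 88 is about 1.62), so the number of games is left as a parameter.

```python
import math

# (matchup, upset probability, actual upsets), copied from the table
rows = [
    ("1 vs. 16", 0.0184, 0),
    ("2 vs. 15", 0.0556, 4),
    ("3 vs. 14", 0.1288, 14),
    ("4 vs. 13", 0.1811, 17),
    ("5 vs. 12", 0.2847, 29),
    ("6 vs. 11", 0.3086, 26),
    ("7 vs. 10", 0.3745, 35),
    ("8 vs. 9", 0.4839, 47),
]

def compare(rows, games):
    """Return (matchup, projected, actual, z) for each row.

    z is (actual - projected) divided by the binomial standard
    deviation sqrt(games * p * (1 - p)).
    """
    out = []
    for matchup, p_upset, actual in rows:
        projected = games * p_upset
        sd = math.sqrt(games * p_upset * (1 - p_upset))
        out.append((matchup, projected, actual, (actual - projected) / sd))
    return out

results = compare(rows, games=92)
for matchup, projected, actual, z in results:
    print(f"{matchup:9s} projected {projected:6.2f}  actual {actual:3d}  z {z:+.2f}")
```

Under these assumptions every matchup lands within 2 standard deviations of its projection, with 1 vs. 16 the furthest off.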
According to the above numbers, there's a much better than even chance
that at least one 2, 3, or 4 seed will lose in the first round. The
expected number of such upsets per tournament is 1.462
(4*0.0556 + 4*0.1288 + 4*0.1811), and if the twelve games are treated
as independent, the chance of at least one is about 79%. That's pretty
phenomenal. If you're filling out your bracket and you need a
different strategy, take a chance and pick a 2, 3, or 4 to lose. Or bet
somebody even-money on it. You may not win every year, but the odds are
with you.
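The sum 4*0.0556 + 4*0.1288 + 4*0.1811 is the expected number of 2, 3, and 4 seeds losing in a given first round. The chance that at least one of them loses is a separate figure. Here is a short sketch of both, assuming the twelve games are independent and each matchup is played once in each of the four regions.

```python
# First-round upset probabilities for the 2, 3, and 4 seeds (from the table)
upset_probs = {"2 vs. 15": 0.0556, "3 vs. 14": 0.1288, "4 vs. 13": 0.1811}
regions = 4  # each matchup is played once per region

# Expected number of these favorites that lose in one tournament
expected_upsets = regions * sum(upset_probs.values())

# Probability that all twelve favorites win, then its complement
p_all_favorites_win = 1.0
for prob in upset_probs.values():
    p_all_favorites_win *= (1 - prob) ** regions
p_at_least_one = 1 - p_all_favorites_win

print(f"expected upsets per tournament: {expected_upsets:.3f}")  # 1.462
print(f"P(at least one upset): {p_at_least_one:.3f}")            # about 0.79
```

Either way you read it, an even-money bet on at least one such upset favors the bettor.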
So what's it going to take for a 16 seed to make me look good?
How can a 16 seed do it?
In the past 23 years, 92 different coaches of 16-seed teams have tried
to be the Cinderella story of the tournament and all of them have
failed. Some have gotten close. In 1990 Michigan State and Murray
State went to overtime. In 1989 Princeton took the last two shots in
their 50-49 loss to Georgetown. In 2006, Albany
was up by 12 with 12 minutes to go until UConn woke up and went on
a run to win it. (Amazingly, I was interviewed prior to the game.)
Five games have been decided by 5 points or less. Eleven have been
decided by 9 points or less. What's it going to take?
Perfect Execution: This is the way most coaches want to make it happen.
Play smart. Minimize turnovers. Execute our game plan. A total team
effort. Believe in yourselves and don't be intimidated. Frustrate the
opponent early and get them out of their rhythm. When it's close late
in the game, the crowd gets on your side, adrenaline kicks in, and you
hope you can keep it close down the stretch. Failing that...
Bombs Away: Another possible scenario is that one or two players get
into a zone and can't miss from 3-point range. If a team is making 3
points every time down the floor, it's hard for any team to overcome
that.
The Gang's All Here: In 1983, NC State was hobbled by injuries all
year. When the ACC Tournament came around, everyone was finally
healthy. They won the ACC tournament and got a 6-seed in the NCAA
tournament. With everybody healthy they played much better than a
6-seed. Of course they got a little lucky too, but they ended up
being one of the lowest seeds ever to win the Big Dance. It's possible
that a 16-seed is really better than its seeding due to injuries or the
team finally gelling. That shifts the odds in their favor.
Busted!! The 1-seed comes into town, the players decide to visit a
strip club, a fight breaks out, and half of the starters end up in
jail. This is one of those random chance scenarios that no one can
"make" happen, but it is a possibility when discussing the odds. We
can't discount the chance that an off-court incident will somehow make
the impossible happen. Other variants of this involve injuries at the
beginning of the game or team illness for the 1-seed.
How it's getting less likely
Unfortunately the more recent numbers paint a bleaker picture for the
hopes of the 16-seed. When I did my analysis in 1997, I calculated the
odds of the 1 vs. 16 upset at 40 to 1. In 2007 it's 54 to 1. What has
happened in the last 10 years to make it less likely?
Play-in game: For the past few years there have actually been 65 teams
in the Big Dance, with two teams playing to win the "right" to be the
16-seed in one of the brackets. So 1/4 of the 16 seeds are playing a
game right before they play the 1-seed. That's not a rested team, which
is definitely needed before playing in what could be the toughest game
of their season. This is a permanent part of the tournament, and if the
NCAA hands out any more automatic bids to new conferences and doesn't
take any away, we'll have more play-in games to see who gets to be worn
out for the 1-seed.
Teams stay closer to home: The NCAA tries to get teams to play as
close to home as possible, with 1-seeds getting the highest priority.
There's less of a chance for a home-court disadvantage or teams being
weary from travel.
Everyone knows everyone: It's no longer the case that it's hard to find
information about a 16-seed. There is so much information on the
Internet and ESPN that the coach of a 1-seed will be able to find out a
good deal about the team he's playing. The 1-seed won't be shocked when
they see the team for the first time, which takes away from a possible
"element of surprise".
No sleeper players: 15 years ago you could say that there was a
chance that a future Hall of Famer would end up on a team that's a
potential 16-seed. With all of the scouting that's going on now, the
chances are smaller that a potential superstar will escape the sights
of any of the top college programs. Players are scouted from the 7th
grade and at countless basketball camps and rated on the Internet. The
teams in the lower ranked conferences just aren't getting the players
that they used to, and their chances of getting a "diamond in the
rough" go down as the high school players get more scrutiny.
No mercy...ever: No coach of a 1-seed wants the dubious distinction
of losing in the first round of the NCAAs. If they're a 1-seed they
have National Championship aspirations, and losing that early would
certainly not endear them to their schools or booster clubs. As the
years go by, the pressure to win increases. They know about the close
games and nothing scares a coach more than a potential upset by a team
that they should beat handily, especially during March Madness. Coaches
use this game to send a signal to everyone else in the tournament:
We're the best team in the country, so fear us. Also this is probably
the last game that the scrubs could potentially see any significant
playing time and coaches like to reward them with some minutes in the
tournament. That's not going to happen unless the game is well in hand.
Chivalry be damned. Blowout city, here we come.
Welcome to the Computer Age: I don't know how the seeding committee for
the NCAA Tournament did things 15 years ago, but now they use RPI. In
the past there were probably some 16-seeds that should have been
seeded higher, which increased the chance of an upset. With RPI ruling
the selections, there's no chance that someone like the Ivy League
champ will get only a 16-seed if they deserve higher.
Beyond Round 1...
So if a 16-seed pulls off the upset in the first round, what are their
chances after that? I'm sure that one of the starting points they use
in Las Vegas for determining a point spread is a formula concerning the
RPI of the two teams involved. An 8/9 seed will have an RPI of about 30
or 35, and the 16-seed will most likely have one between 100 and 200.
This would be an interesting study. My gut feeling is that the 8/9 seed
would be a 10 to 1 favorite to win the game.
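To put a rough number on winning both games, the two figures above can be chained together. This is only a sketch: it takes the 10 to 1 gut feeling at face value (a 1 in 11 chance for the 16-seed) and assumes the two games are independent, neither of which comes from measured data.

```python
p_round_one = 0.0184    # 1 vs. 16 upset probability from the calculation
p_round_two = 1 / 11    # 16-seed's chance if the 8/9 seed is a 10 to 1 favorite

# Assuming independence, the chance of winning both games is the product
p_two_wins = p_round_one * p_round_two
print(f"P(16-seed wins two games) = {p_two_wins:.5f}")
print(f"about 1 in {1 / p_two_wins:.0f}")  # about 1 in 600
```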
After the victory in Round 2? As analytical as I am, if this were
to happen, there's really no way to calculate the chances after that.
How many upsets have there been in the rest of the tournament? Are
Cinderella teams going to be playing each other? Did the seeding
committee really screw up and seed a team 16 when they are really a 10
or a 5? Perhaps the best thing going for them is that after winning two
games, the 16-seed is going to feel like they have a 1 out of 16 chance
to win the tournament, and who am I to argue with that? If they truly
believe in themselves, it won't be mere numbers that determine their
chances of winning, but their heart and determination.
Any comments? Criticism? Please feel free to contact me at graybill@mindspring.com. If
I get enough good comments, I may start a blog or have some other type
of feedback mechanism.
Thanks for reading.
Gil Graybill