Mastering March Madness through Mathematics
Last updated: Tuesday April 6th, 2021
With the selections for March Madness right around the corner, it's fitting to take a look at some linear algebra that can help you master your bracket this year! Perhaps the most popular method was created by Kenneth Massey in 1997 and is known as the Massey method. Though it was initially used for American football, it has since grown in popularity for producing brackets that are significantly more accurate than usual.
So what exactly did Massey do to create his rankings? Essentially, he takes the point differential from every game that each team played and puts them all into a matrix. If you aren't familiar, a point differential is how much higher (or lower) a team's score is than its opponent's. So if Team A scores 95 points and Team B scores 75 points, then the point differential for Team A is 20, while that of Team B is -20.
So each game can be represented by a linear equation. In our example above, we could say
A - B = 20.
Matrices are a great way to take a bunch of linear equations like this and combine them. Say that Team A also plays Team C and wins 102-30, and that in the game between Team B and Team C, Team B wins 80-55. Taking this data, we create the following equations:
A - B = 20
A - C = 72
B - C = 25
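The next part may be foreign to you if you have no background in linear algebra or matrices. A system of equations, like the one above, can be written as an equation of matrices, like this:
$$\begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} A \\ B \\ C \end{bmatrix} = \begin{bmatrix} 20 \\ 72 \\ 25 \end{bmatrix}$$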
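And if we can solve this equation for the matrix that has A, B, and C, then we would get the values for A, B, and C, which would essentially be ratings for each team, based on their point differentials from all the games they've played. Unfortunately, this system of equations is usually inconsistent, meaning no values of A, B, and C satisfy every equation at once. In our example, A beating B by 20 and B beating C by 25 would suggest that A beats C by 45, not 72. But a trick using what's called the transpose of a matrix (where you swap rows with columns) gets around this: multiplying both sides by the transpose of the first matrix gives a new system whose solutions fit the game results as closely as possible, in the least-squares sense. Some more algebra is needed to work out the details of this adjustment, but in the end we get
$$\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix} \begin{bmatrix} A \\ B \\ C \end{bmatrix} = \begin{bmatrix} 92 \\ 5 \\ -97 \end{bmatrix}$$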
Notice that each row represents one of the teams. The entry along the diagonal in each row is the number of games that team played, and a -1 means that team played 1 game against the team in that column. The entries on the right are each team's total point differential. Unfortunately, this is once again not solvable for a single answer: to isolate the matrix with A, B, and C in it, the big matrix before it would need an inverse matrix, but it is not invertible. (Its rows add up to all 0's, so adding the same number to every rating gives another solution.) Massey fixes this by changing the last row to all 1's and changing the last entry in the matrix on the right to 0. Mathematically, this is like saying that all of the ratings added up need to equal 0, which pins down one solution and fits the fact that the point differentials on the right already add up to 0. Then we get a system of equations,
$$\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} A \\ B \\ C \end{bmatrix} = \begin{bmatrix} 92 \\ 5 \\ 0 \end{bmatrix}$$
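which can finally be solved! So we can multiply both sides by the first matrix's inverse, solving our system of equations to ultimately get
$$\begin{bmatrix} A \\ B \\ C \end{bmatrix} \approx \begin{bmatrix} 30.7 \\ 1.7 \\ -32.3 \end{bmatrix}$$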
This means that Team A has a rating of 30.7, Team B has a rating of 1.7, and Team C has a rating of -32.3.
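If you'd like to check these numbers yourself, here is a minimal Python sketch of the steps described above, using only the three games from the example. The variable names and the small `solve` helper are my own illustration (not Massey's code), and exact fractions are used so that rounding only happens when printing.

```python
from fractions import Fraction

# Games from the example: (winner, loser, winner_score, loser_score)
games = [("A", "B", 95, 75), ("A", "C", 102, 30), ("B", "C", 80, 55)]
teams = ["A", "B", "C"]
idx = {team: i for i, team in enumerate(teams)}
n = len(teams)

# One row per game: +1 for the winner, -1 for the loser; y holds the margins.
X = [[0] * n for _ in games]
y = []
for row, (winner, loser, win_score, lose_score) in zip(X, games):
    row[idx[winner]], row[idx[loser]] = 1, -1
    y.append(win_score - lose_score)

# Multiply both sides by the transpose of X: M = X^T X and p = X^T y.
num_games = len(games)
M = [[sum(X[g][i] * X[g][j] for g in range(num_games)) for j in range(n)]
     for i in range(n)]
p = [sum(X[g][i] * y[g] for g in range(num_games)) for i in range(n)]

# Massey's adjustment: last row becomes all 1's, last right-hand entry becomes 0.
M[-1] = [1] * n
p[-1] = 0


def solve(A, b):
    """Gauss-Jordan elimination with exact fractions (assumes A is invertible)."""
    size = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(size):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * c for a, c in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


ratings = dict(zip(teams, solve(M, p)))
for team, rating in ratings.items():
    print(team, round(float(rating), 1))
```

Running this prints 30.7 for A, 1.7 for B, and -32.3 for C, matching the ratings above.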
The Big Picture
You might look at this and say "wait a minute... I could've calculated that without the linear algebra! It's just each team's total point differential divided by the number of teams!" While this is true, do you think it would be as easy to calculate with 68 teams? Plus, it only worked out nicely like that because every team played every other team exactly once. In a real scenario, the matrices look a lot more wild and are much, much bigger. Doing the calculations by hand isn't really an option. That's why computers can be very nifty with statistics problems like this one.
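To see the shortcut break down, here is a small Python sketch that adds a hypothetical Team D, which plays only one game and loses it to Team C by 10 points. That extra game is made up purely for illustration, and `massey_ratings` is my own helper name. The function builds the big matrix directly (games played on the diagonal, minus the number of head-to-head games elsewhere) and assumes every team is connected to the others through some chain of games, otherwise the system can't be solved.

```python
from fractions import Fraction


def massey_ratings(teams, games):
    """Sketch of the Massey method for games given as (winner, loser, margin)."""
    n = len(teams)
    idx = {team: i for i, team in enumerate(teams)}
    M = [[Fraction(0)] * n for _ in range(n)]
    p = [Fraction(0)] * n
    for winner, loser, margin in games:
        wi, li = idx[winner], idx[loser]
        M[wi][wi] += 1      # diagonal: number of games played
        M[li][li] += 1
        M[wi][li] -= 1      # off-diagonal: minus the games between the two teams
        M[li][wi] -= 1
        p[wi] += margin     # right side: total point differential
        p[li] -= margin
    M[-1] = [Fraction(1)] * n   # Massey's adjustment: ratings must add up to 0
    p[-1] = Fraction(0)
    # Gauss-Jordan elimination on the augmented matrix [M | p].
    aug = [row + [rhs] for row, rhs in zip(M, p)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * c for a, c in zip(aug[r], aug[col])]
    return {team: aug[idx[team]][-1] for team in teams}


teams = ["A", "B", "C", "D"]
games = [("A", "B", 20), ("A", "C", 72), ("B", "C", 25),
         ("C", "D", 10)]  # hypothetical extra game, not from the example above

ratings = massey_ratings(teams, games)

# The shortcut: total point differential divided by the number of teams.
totals = {team: 0 for team in teams}
for winner, loser, margin in games:
    totals[winner] += margin
    totals[loser] -= margin
shortcut = {team: Fraction(total, len(teams)) for team, total in totals.items()}

for team in teams:
    print(team, float(ratings[team]), float(shortcut[team]))
```

With this schedule the shortcut gives Team D a rating of -2.5, while the Massey system gives -31.75, because it accounts for D's only game being a loss to the lowest-rated of the other three teams.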
So now you're probably dying to use the Massey method to create a bracket superior to those of your friends and family. How can you do it? At this website you can find how the teams currently rank based on Massey's method.