Braves, Brewers, and Outliers
Yesterday's gambling picks went 3-3, which is the equivalent of a soft tap on the nuts when you bet mostly favorites. Something noteworthy did happen, though: the Braves and Brewers decided to completely embarrass their competition on national and local television. I thought this would be a good opportunity to discuss outliers and how they impact data.
Below is a dashboard that compares each team's hits to its runs scored in the 2020 season. The two graphs are identical so that two teams can be compared side by side.
So, for those who were not the biggest fans of statistics in high school, let me explain what is going on here. Both charts, which contain the same information, compare a team's hits on a given day to its runs scored on the same day. Intuitively, as hits increase, runs should also increase. If you hover over the orange trend line, you should see a box appear that looks something like this:
Example used: Milwaukee Brewers
In broad terms, the coefficient 0.820057 is the slope of the trend line: each additional hit the Brewers get is associated with about 0.82 more runs, on average. But what was that number before last night? Tableau is nice because you can filter data points in or out as needed. So, using the Brewers as an example again, if you highlight only the points that look to be related, leaving out last night's blowout, the trend line shifts drastically.
This would imply that, based on last night's performance alone, the Brewers' expected runs per additional hit rose from about 0.634 to about 0.820. That is an increase of nearly 0.19 runs per hit, or roughly 29% in relative terms. Does this mean that the Brewers are a much-improved team after last night? Certainly not. But going forward, when we look at metrics and modeling, the Brewers' numbers will look better than normal because of these blue-moon events.
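For anyone who wants to see the effect outside of Tableau, here is a minimal Python sketch of the same idea: fit a straight line to hits versus runs, then refit after adding one blowout game. The game log below is made up for illustration and is not the Brewers' real 2020 data, and the helper name fit_line is my own. I am also assuming the trend line is an ordinary least-squares fit. Only the two slopes at the bottom (0.63378 and 0.820057) come from the dashboard.

```python
def fit_line(hits, runs):
    """Ordinary least-squares fit of runs = intercept + slope * hits."""
    n = len(hits)
    mean_h = sum(hits) / n
    mean_r = sum(runs) / n
    # Covariance-style and variance-style sums (unnormalized).
    sxy = sum((h - mean_h) * (r - mean_r) for h, r in zip(hits, runs))
    sxx = sum((h - mean_h) ** 2 for h in hits)
    slope = sxy / sxx
    intercept = mean_r - slope * mean_h
    return slope, intercept


# Hypothetical (hits, runs) game log; NOT real Brewers data.
games = [(5, 2), (7, 3), (8, 4), (6, 2), (9, 5), (10, 5), (4, 1), (8, 3)]
blowout = (21, 19)  # one made-up outlier game

hits = [h for h, _ in games]
runs = [r for _, r in games]

# Slope from the "normal" games only.
slope_without, _ = fit_line(hits, runs)

# Slope after adding the single blowout game.
slope_with, _ = fit_line(hits + [blowout[0]], runs + [blowout[1]])

print(f"slope without blowout: {slope_without:.3f}")
print(f"slope with blowout:    {slope_with:.3f}")

# Slopes reported by the dashboard for the Brewers (from the post).
before, after = 0.63378, 0.820057
absolute_change = after - before    # in runs per hit
relative_change = absolute_change / before  # as a fraction of the old slope
print(f"absolute change: {absolute_change:.3f} runs per hit")
print(f"relative change: {relative_change:.1%}")
```

One point far to the right of the rest has a lot of leverage on a least-squares line, which is why a single game can move the slope this much.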
Teams I like Today:
St. Louis Cardinals (Game 1) (-220)
Texas Rangers (+145)
New York Yankees (-240)
San Diego Padres (-180)
Philadelphia Phillies (-112)