Baseball Analytics Are Broken

Opinion Article – Mark Wendling

Now that I have your attention, and may have my SABR membership card rescinded, I will explain why I make this bold statement. First, the analytics themselves are not broken. Statistics are statistics; they are what they are. The way they are used, interpreted and implemented is broken, and it continues to get worse in the game we love.

Let us take a quick look at WAR. So many fans today base everything on WAR. They also do not take into account that there are at least three different calculations of what is known as WAR. I have lost count of the number of times I have seen claims such as “Larry Walker was a better player than Tony Gwynn because Walker had a higher WAR.” Full disclosure: I am Canadian, and I love Larry Walker. He is the type of guy you would love to sit with over a few “pops” and shoot the breeze all day; in other words, a good Canadian guy. Tony Gwynn, though, was amazing. Some argue he was more worried about his stats than his team, but this is not about their personalities. Both are Hall of Famers, but let’s face it: if we were picking sides for a game, Tony Gwynn is going to be chosen over Walker most times.

WAR is one stat. It is not “the” stat, as many would believe it to be; it is a statistic that has to be considered among all the others. The very purpose of using analytics was to eliminate bias, yet bias creep has taken over baseball analytics. Hearing the reasons given for why Walker was better than Gwynn helps to demonstrate this. I could use this entire article to discuss what is wrong with WAR, but I will save that for a future article: WAR, What Is It Good For? Absolutely Nothing, Say It Again.

We need to remember that in the use of analytics there will always be outliers, and when a human being is playing, their performance on any given day could be one of those outliers. How many times can you predict a player will go 5 for 5 in a game? That would be an outlier. Two of the best examples come from pitching decisions in playoff games. The first one people point to is Game 6 of the 2020 World Series. According to many, Blake Snell was throwing the best game of his life. That is subjective in many ways, but it cannot be denied that he was pitching very well. Based on analytics, he was taken out in the sixth inning. After that, the runner Snell left on base when he exited the game scored, as did another against Anderson. What if this was his outlier game? We know he had an outlier later in his career, when he threw a complete-game no-hitter on 114 pitches. A second example is closer to my heart as a Blue Jays fan. In the Wild Card Series against the Minnesota Twins in 2023, they pulled Berrios in the 4th inning. Berrios was pitching lights out, many fans and even his teammates were surprised, and management had to explain its rationale for pulling him. It was all about the analytics of righties and lefties, and it backfired.

These arguments about WAR and outliers are nothing new; they are used all the time by people who are not fully aligned with analytics in baseball. This week, though, there was an interesting round table on MLB Radio where they were discussing playing first base. I believe it came out of the Red Sox and Devers soap opera, and although that could be a lot of fun to talk about, it is not the point that stuck with me.

The discussion was around how clubs are now using metrics to judge how good a player can be defensively at positions they have never played in their pro career. The argument is that because they are athletic, they can play that position. If we look at two players considered to be incredible athletes, Shohei Ohtani and Aaron Judge, are you going to play either of them at shortstop? Third base? How about second base? Most people would answer no to all those scenarios.

The discussion was about foot placement and how to play the ball. The people at this round table are much smarter than I am when it comes to baseball, but it makes sense. Think about the movie Moneyball and the scene where they sign Hatteberg. Billy Beane says playing first base is not that hard, right Wash? Ron Washington replies that it is incredibly hard. Yet we are using analytics to say a player can play there with little notice because they are athletic.

How does this relate to baseball simulations? That is easy. Many people have played MLB The Show. When you move a player to a secondary position, you get a warning that it is their secondary position and there are minor penalties. If you put them in a position they do not play, you get red out-of-position warnings and major penalties. Expect errors galore.

In OOTP games, when you draft a player, they are rarely at the full defensive level they will eventually reach at any position. There are also maximums if you have the player trained at other positions. It all makes sense. The use of analytics to determine how good a player will be at another position is very weak.

We have had multiple programmers as guests at our committee, and one thing is constant from all of them: there is a random factor used in every computer simulation. Dice bring in a random factor as well. People are not computers; we cannot project exactly what will happen in a specific situation. If we could, why then would we play the games?

Board and dice games also use a random factor. Whenever there is a player up to bat, dice are rolled, and it is very unlikely that any player is going to have the same amount of numbers in the range for each possible result. It just does not make sense that a player would have the same chance for an out as a home run. We all know that a .400 batting average is considered to be an incredible accomplishment, and yet that still means being out six times out of ten.
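For readers who like to see the idea spelled out, here is a minimal sketch in Python of how a dice-style game might turn a roll into a result. Everything in it is an assumption made for illustration: the card, the 1 to 100 roll, the names CARD, resolve and simulate, and the size of each range are hypothetical and do not come from any real board game or from the programmers who visited our committee.

```python
import random

# Hypothetical batter card: each outcome owns a range of rolls from 1 to 100.
# The range sizes are illustrative assumptions, not taken from any real game.
CARD = [
    ("out", 70),       # rolls 1-70
    ("single", 18),    # rolls 71-88
    ("double", 5),     # rolls 89-93
    ("triple", 1),     # roll 94
    ("home_run", 3),   # rolls 95-97
    ("walk", 3),       # rolls 98-100
]


def resolve(roll, card=CARD):
    """Return the outcome whose range contains the roll."""
    if not 1 <= roll <= 100:
        raise ValueError("roll must be between 1 and 100")
    upper = 0
    for outcome, width in card:
        upper += width  # upper edge of this outcome's range
        if roll <= upper:
            return outcome
    raise ValueError("card ranges do not cover 1 to 100")


def simulate(plate_appearances, seed=None, card=CARD):
    """Roll once per plate appearance and count how often each outcome occurs."""
    rng = random.Random(seed)  # the random factor; a seed makes a run repeatable
    counts = {outcome: 0 for outcome, _ in card}
    for _ in range(plate_appearances):
        counts[resolve(rng.randint(1, 100), card)] += 1
    return counts


if __name__ == "__main__":
    # Same card, two different "games": the totals will usually differ.
    print(simulate(5, seed=1))
    print(simulate(5, seed=2))
```

The out range is far wider than the home run range, which is the point made above, and the same card can still produce a very different line from one handful of plate appearances to the next. That is the outlier problem in miniature.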

The problem with analytics in baseball is not that they are broken; they cannot break. The way they are used is broken, and it just gets worse. Analytics, or numbers as the old-timers used to call them, since they kept track as well, are just one tool. Eyes are another tool, one that unfortunately has taken a beating. Over and over, I read and hear that the eye test is a failing test. Even I, who never played any real baseball growing up (the joy of being a Canadian), can see when a pitcher is struggling and when a batter is struggling. Yet what we hear all the time from announcers and coaches is that the player is doing the right things, his launch angles are right, his barrel rates are great. Guess what, results matter.

There is no problem with baseball analytics; the problem is with the people using them. The use of statistics in decision making was meant to take out the bias, yet now we have biases about which stats are the right ones and which ones tell us what we want to know. As we know, some try to boil it all down to that one number, WAR, and that is how the whole of a player gets evaluated. Bias hurt teams before, in the days of “when we know, we know,” when they didn’t (more Moneyball). Now they misuse the very thing they thought would help them know, and have filled it with bias. It is beyond bias creep.

Ultimately, baseball analytics are not broken; the people using them are. But we knew that already.