BU Today

Arts & Entertainment + Science & Tech

MLB Umpires Missed 34,294 Ball-Strike Calls in 2018. Bring on Robo-umps?

After studying four million game pitches, BU researcher suggests how to fix a broken baseball system

This article is based on 11 seasons of Major League Baseball data, almost 4 million pitches culled and analyzed over two months by Boston University Master Lecturer Mark T. Williams and a team of graduate students at the Questrom School of Business experienced in data mining, analytics, and statistics.

Baseball is here, another season of amazing catches, overpowering pitching, tape-measure home runs, overpriced beers, and, yes, television replays of every missed call by umpires, revealed in painful, high-definition slow motion.

It’s time for Major League Baseball to put an end to the agony caused by at least some of those blown calls—the balls and strikes.

Each season, MLB home plate umpires make tens of thousands of incorrect calls (read on for evidence backing up that assertion). These controllable errors impact players, managers, batters, pitchers, performance statistics, game outcomes, and even the big business of fantasy baseball. They shorten careers and diminish fan experience. Pace of play is also impeded.

In 2018, umpires made 34,294 incorrect ball and strike calls. That’s 14 per game.

But throughout its history, MLB has protected its error-prone umpires, resisted adopting strong performance measurements, and not taken advantage of available technology that could better the game. At a time of autonomous cars and machine learning, MLB needs to embrace useful change.

The duty of an umpire is complex: get the split-second call right. It is a mentally and physically demanding job. For 2018, there were 89 MLB umpires, all of them male, with an average age of 46 and 13 years of experience. Each season, umpires individually participate in an average of 112 games, one fourth of them (28) from behind home plate, calling about 4,200 pitches. A crew of four umpires is assigned to each game, with each umpire taking one of four designated field positions (except for the World Series, when seven umps are used).

To minimize the chance of undue influence, these game assignments are not publicly announced until 10 to 20 minutes prior to each scheduled start. The home plate umpire exerts the most influence in the game, making judgment calls on any pitch that is not hit. Currently umpires carry out this important role without technical assistance.

This human element of the game adds color, but it comes at a high cost: too many mistakes. In 2018, MLB umpires made 34,294 incorrect ball and strike calls, for an average of 14 per game or 1.6 per inning. Many umpires well exceeded this number. Some of these flubbed calls were game changing.

YouTube is flooded with videos of bad umpires in action. Watching these uploads has turned into a sport itself. Titles ranging from the worst ball, strike, and check-swing calls in baseball history to the biggest umpire blunders of all time have gained wide viewer attention. Blown calls only undermine the integrity of the game, slow down pace, hurt averages, and prevent the athletes from being able to maximize their potential performance.

Right after the 2018 All-Star break, the Colorado Rockies and Arizona Diamondbacks met at Chase Field for an important National League game. The Rockies were up 6-5 in the ninth, but Arizona, with two outs and two on base, was threatening a comeback. Wade Davis, the Rockies closer, got ahead with a 1-2 count on slugger Nick Ahmed. The next pitch, a 90-mph cutter thrown towards the right-handed batter’s box, landed significantly outside the strike zone. To the disbelief of Diamondbacks fans, umpire Paul Nauert called the stray ball a strike, ending the ballgame.

Yet when analyzing the data, this call should not have come as a total surprise given Nauert’s performance over the last 11 seasons, which landed him firmly on the Bottom 10 of MLB’s list of umpires (see chart). Moreover, MLB umpires have a pronounced bias that greatly increases the odds, on a two-strike count, that a true ball will incorrectly be called a strike. In 2018, a total of 55 games were ended when umpires made incorrect calls.

Umpires are at the heart of baseball: every single pitched ball requires at least one, and sometimes multiple, umpires to make some call. Yet, even though MLB has begun evaluating umpires with internal systems (such as Trackman), their performance statistics are not widely known, tracked, or readily shared. Fans can recite starting pitcher information, but when it comes to who is umpiring behind home plate and their error rate, these relevant statistics are not public.

To take the debate beyond YouTube videos, anecdotes, and fan emotion, we applied a clinical approach in assessing MLB umpire performance. Our goal was to let data-driven evidence determine strong, weak, and rising star performance. And to determine just how accurate umpires are in calling balls and strikes.

DATA DOESN’T LIE

For this research, we looked at game data from Baseball Savant, MLB.com, and Retrosheet. The time period chosen, the most recent 11 baseball regular seasons (2008-2018), presented nearly four million called pitches. Similar to players, MLB umpires were assigned numbers, so that games behind the plate could be easily tracked. All active umpires were included in this performance study, and their ability to accurately call balls and strikes was closely observed. All 30 major league parks are outfitted with triangulated tracking cameras that follow baseballs from the pitcher’s hand to across home plate. Ball location can be tracked up to 50 times during each pitch and accuracy is claimed to be within one inch. Statcast, an MLB subsidiary, is at the center of this system, the backbone of strike zone graphics used during televised and live-streamed games. Called pitches and strike zone overlay were populated from Baseball Savant, PITCHf/x (2008-16), and Statcast data (2017-18).

The top 10 performing umps averaged 2.7 years of experience. The bottom 10 averaged 20.6 years of experience.

Experience level and age of each umpire were also compiled. Once the data was assembled, our team of researchers used available technology that compares the strike zone to the actual calls umpires made on each pitch, separating the correct from the incorrect calls.
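The article does not spell out the exact rule used to separate correct from incorrect calls, so the following is only a minimal sketch of how such a comparison could work. The field names `px`, `pz`, `sz_top`, and `sz_bot` are borrowed from public pitch-tracking data, and the ball radius and the "any part of the ball touches the zone" rule are assumptions for illustration, not the study's documented method.

```python
from dataclasses import dataclass

# Assumed geometry (not taken from the study): home plate is 17 inches wide
# by rule; the ball radius of about 1.45 inches is an illustrative value.
PLATE_HALF_WIDTH_FT = 17 / 2 / 12
BALL_RADIUS_FT = 1.45 / 12


@dataclass
class CalledPitch:
    px: float      # horizontal location at the plate, feet from center
    pz: float      # height at the plate, feet
    sz_top: float  # top of this batter's strike zone, feet
    sz_bot: float  # bottom of this batter's strike zone, feet
    call: str      # "strike" or "ball", as called by the umpire


def true_call(p: CalledPitch) -> str:
    """Return 'strike' if any part of the ball touches the zone, else 'ball'."""
    in_width = abs(p.px) <= PLATE_HALF_WIDTH_FT + BALL_RADIUS_FT
    in_height = p.sz_bot - BALL_RADIUS_FT <= p.pz <= p.sz_top + BALL_RADIUS_FT
    return "strike" if in_width and in_height else "ball"


def is_incorrect(p: CalledPitch) -> bool:
    """True when the umpire's call disagrees with the tracked location."""
    return p.call != true_call(p)
```

A pitch over the middle of the plate that is called a ball, such as `CalledPitch(0.0, 2.5, 3.5, 1.5, "ball")`, would be flagged as incorrect. A real analysis would also need to decide how to treat pitches within the tracking system's claimed one-inch margin of error.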

After ball and strike accuracy performance was calculated by inning, game, month, and season, a bad call ratio (BCR) for each umpire was computed. This ratio was generated by dividing all incorrect calls by the total of judged pitches. The higher the BCR score, the more incorrect calls made. This rating process was repeated for each MLB umpire for each season. Once all umpire BCR scores were completed, groupings and trends emerged. Umpires were then rank ordered and separated into top, average, and bottom performers. Standard data mining, analytics and statistical methods were applied, and performance ratios studied. The results that emerged from this study were troubling.
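As a rough illustration of the ratio described above, the sketch below computes a BCR per umpire per season (incorrect calls divided by all judged pitches, expressed as a percentage) and then rank-orders the results. The input layout, a tuple of umpire, season, and a correct/incorrect flag, and the function names are hypothetical; the study's own data pipeline is not published in this article.

```python
from collections import defaultdict


def bad_call_ratio(calls):
    """Compute BCR per (umpire, season).

    calls: iterable of (umpire, season, incorrect) tuples, where `incorrect`
    is True when the call disagreed with the tracked pitch location.
    Returns a dict mapping (umpire, season) to BCR as a percentage.
    """
    wrong = defaultdict(int)
    total = defaultdict(int)
    for umpire, season, incorrect in calls:
        key = (umpire, season)
        total[key] += 1
        wrong[key] += int(incorrect)
    return {key: 100.0 * wrong[key] / total[key] for key in total}


def rank_umpires(bcr):
    """Return (umpire, season) keys ordered from lowest BCR (best) to highest."""
    return sorted(bcr, key=bcr.get)
```

For example, an umpire with 1 incorrect call in 4 judged pitches would get a BCR of 25.0, and would rank ahead of one with 2 incorrect calls in 4.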

SUMMARY FINDINGS

This deep-dive analysis demonstrated that MLB umpires make certain incorrect calls at least 20 percent of the time, or one in every five calls. Research results revealed clear two-strike bias and pronounced strike zone blind spots. Less-experienced younger umpires in their prime routinely outperformed veterans, and umpires selected in recent World Series were not the best performers. Results showed a declining but still unacceptably high BCR score, but on a positive note, only a marginal inter-inning call inconsistency. Findings also identified new and rising star umpires and highlighted the pressing need to recruit higher performers.

Given how MLB is heavily dependent on performance statistics when evaluating players, it is surprising the league has been sluggish to apply similar rigor to umpire hiring, promotion, and retention.

The following five sections explore our summary findings in greater detail.

1

Two-strike bias—balls called strikes

Research results demonstrate that umpires in certain circumstances overwhelmingly favored the pitcher over the batter. For a batter with a two-strike count, umpires were twice as likely to call a true ball a strike (29 percent of the time) as when the count was lower (15 percent). These error rates have declined since 2008 (35.20 percent), but still are too high. During the 2018 season, this two-strike count error rate was 21.50 percent and repeated 2,107 times. The impact of constant miscalls includes overinflated pitcher strikeout percentages and suppressed batting averages. Last season, umpires were three times more likely to incorrectly send a batter back to the dugout than to miss a ball-4 walk call (7 percent). Based on the 11 regular seasons’ worth of data analyzed, almost one third of batters called out looking at third strikes had good reason to be angry.

Such game-changing biases give a new meaning to the need for batters to aggressively protect the plate. They also provide pitchers with added incentive to gain an early two-strike advantage.
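One plausible way to measure the two-strike bias described above is to take only the pitches that were truly balls and ask how often they were called strikes, split by whether the batter already had two strikes. The sketch below assumes that denominator (true balls) and a hypothetical tuple layout; the article does not state exactly how the 29 percent and 15 percent figures were computed.

```python
def ball_called_strike_rate(pitches, two_strikes):
    """Percentage of true balls that were called strikes.

    pitches: iterable of (strikes_in_count, true_call, umpire_call) tuples,
    with calls given as "ball" or "strike".
    two_strikes: if True, use only two-strike counts; if False, use only
    counts with fewer than two strikes.
    Returns None when no true balls match the filter.
    """
    true_balls = [
        p for p in pitches
        if p[1] == "ball" and (p[0] == 2) == two_strikes
    ]
    if not true_balls:
        return None
    missed = sum(1 for p in true_balls if p[2] == "strike")
    return 100.0 * missed / len(true_balls)
```

Comparing `ball_called_strike_rate(pitches, True)` with `ball_called_strike_rate(pitches, False)` on the same data set gives the kind of side-by-side rates quoted in this section.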

Umpires’ biased judgment when a batter has a two-strike count:

Source: Mark T. Williams Boston University Study 2019

2

Strike-zone blind spots abound

Umpires from 2008 through 2018 also exhibited a pronounced and persistent blind spot, with a number of incorrect calls at the top of the strike zone. Remarkably, pitches thrown in the top right and left part of the strike zone were called incorrectly 26.99 percent of the time on the right side and 26.78 percent on the left. And while there was marked improvement in umpiring, the rate of incorrect calls around the bottom right of the strike zone in 2018 was still a mind-boggling 18.25 percent. Data results confirm that strike zone blind spots penalized certain pitchers more than others. This time, however, batters benefited from such flubbed calls, as strike zones shrank, forcing pitchers to throw fewer pitches up in the zone. High strikes are typically harder to hit than low strikes for most batters.

Umpire blind spots—top part of the strike zone (right and left):

Source: Mark T. Williams Boston University Study 2019

3

Less experienced and younger umpires outperformed veterans

Based on the research, professional umpires, similar to professional baseball players, have a standard peak. The study revealed that home plate umpires who made the Top 10 MLB performance list (2008-2018) had an average of 2.7 years of experience, and averaged 33 years of age with a BCR of 8.94 percent. None of these top performers had more than five years of experience or were older than 37.

Nic Lentz, the youngest umpire to make this list, was 29. Logically this should not be a surprise finding given the physical demands and reflexes required to adequately perform this challenging job.

Taking into account standard peaking, MLB should consider moving away from the traditional four-person crew rotation, which gives every umpire time behind the plate, no matter how young or old, experienced or not, or how strong or weak a performer they are. A better system would assign the top performers to the most physically and mentally demanding field positions. At some point, prime is reached, and surpassed, and the body and statistics do not lie.

Source: Mark T. Williams Boston University Study 2019

In contrast to the overall top performers, research uncovered that umpires on the Bottom 10 MLB performance list (2008-2018) had an average experience level of 20.6 years, were 56.1 years of age, and had an average BCR of 13.96 percent. This group’s error rate was a staggering 56 percent higher than that of the Top 10 MLB performers. Umpire Jerry Layne, with 29 years on the job and at age 61, sported the highest BCR, 14.18 percent. This performance research clearly indicates that more experience and age do not necessarily produce the best umpires.

Source: Mark T. Williams Boston University Study 2019

The counterargument to “younger is better” is that these umpires lack enough games under their belts to make many errors. However, there is another plausible reason why newer umpires tend to be stronger performers: they are more motivated to prove their worth. It also could be that they are beneficiaries of improved training and mentoring from older umpires. Regardless of the rationale, the data does not lie: younger MLB umpires are hitting the ball out of the park.

For the 2018 season, when compiling the Top 10 MLB umpires, only 2 on this all-star list had 10 or more years of experience. These exemplary umpires had an average of 6.3 years of experience, were 37.8 years of age, and enjoyed a BCR of only 7.78 percent.

Source: Mark T. Williams Boston University Study 2019

The 2018 season performance also helps to illustrate the tight grouping of top performers (low BCR scores). Notice how the table is markedly sloped in favor of this younger and less experienced group. These umpires relative to the second clustered group appear to be in their prime.

For the 2018 season, the Bottom 10 MLB list was entirely populated by veteran umpires with an average of 23.05 years of experience, who were 56.6 years of age, and earned a double-digit BCR of 10.88 percent. For the 2018 season, the Bottom 10 generated 40 percent more incorrect calls than the Top 10 MLB umpires.

Source: Mark T. Williams Boston University Study 2019

Graphing performance results also highlighted a natural divide: umpires with at least 20 years of experience made more incorrect calls than those with 10 years or less of experience. Within peer groups, there were also strong pockets of poor performance. In this 2018 bad call ratio (BCR) graph, the line marks average umpire performance at each level of experience. Umpires above the line performed worse than their peers. The tight cluster of higher errors upon reaching 20 years on the job is also telling.

For 2018, Ted Barrett and Joe West were the top poor performers, making 495 and 512 incorrect home plate calls, for an average of 17.7 and 16.5 errors per game, respectively. Such bad call numbers can produce an array of new outcomes. For example, incorrect calls can extend pitch count and impact pitcher rotation and the reliance on relievers. As a starter gets deeper into his pitch count, one or two more balls can change game outcome. Bad calls in favor of batters can extend innings, and increase scoring opportunities.


John Libka is 32, and with only 1.5 years of experience, has generated a BCR of 7.59 percent. Photo by Tom Szczerbowski/Getty Images

Interestingly enough, Angel Hernandez, while far from having a breakout year, performed better in 2018 than his average over the last 11 seasons. Hernandez is routinely derided by MLB players as one of the worst umpires.

Our data also showcases the 2018 performance of new umpires such as John Libka, who at only 32 and with only 1.5 years of experience, generated an impressive BCR of 7.59 percent. With that low BCR, he should win the “Rookie Umpire of the Year” award. On the more seasoned side, Mark Wegner should win the “Veteran Umpire of the Year” award. Both umpires are at the top of their game.

Anecdotally, veteran umpires such as Joe West (debuted 1978) have long earned the scorn of players and fans for their proclivity for bad calls. The statistics show that West made more incorrect calls than most. In fact, behind the plate, over the last 11 seasons he has averaged 21 incorrect calls a game, or 2.3 per inning. Angel Hernandez (debuted 1991) receives similar fan dislike, averaging 19 incorrect calls a game, or 2.2 per inning. Yet even with this high error rate, he performed better than some of his peers, escaping the 2018 Bottom 10 MLB list.

Season-by-season call variability is also a problem. Hernandez’s performance in 2017 was much worse than in 2018. In contrast, Joe West continued to produce a troubling amount of incorrect ball and strike calls.

Relying on gut feel and not armed with accessible and timely performance measurements, players and fans have little ability to objectively assess the league’s 89 umpires. Recently, Hernandez stated he only gets four calls wrong per game. His actual error rate, as evidenced in this research, was almost five times higher.

Unfortunately, while many fans are aware of the predilections of the poor performers, when it comes to the stellar 2017 season umpiring performances of Pat Hoberg and Eric Cooper or the 2018 season of John Libka and Mark Wegner, most fans are left in the dark.

And when it comes to the World Series, it’s official: the 2018 World Series umps were not the best.


Ted Barrett was a bottom 10 performer in 2018, yet was still selected as a World Series umpire.

After comparing the BCR performance of all umpires, the top performers were typically not the ones chosen for MLB’s most prestigious, most visible, and highly sought after assignment.

Of the seven umpires chosen for the 2018 World Series, no fewer than five exhibited a higher BCR than the overall league average. For the 2018 season, none of the MLB-selected umpires were on the Top 10 performer list. Ted Barrett, a 2018 Bottom 10 performer, nonetheless got the top job as crew chief. In his two decades of umpiring, this was his fourth time selected to officiate the postseason finale. This decision by MLB was not a fluke. In 2017, two Bottom 10 list umpires, Paul Nauert and Dan Iassogna, were also picked. For the 2016 World Series, Joe West was selected to umpire again, the sixth time in his career.

Source: Mark T. Williams Boston University Study 2019

In contrast, if MLB used a merit-based system, awarding the 2018 World Series assignment to the umpires with the lowest regular season BCR, a dream team of umps would have been fielded, one with much lower error rates and higher call consistency.

Source: Mark T. Williams Boston University Study 2019

Umpires picked for the 2018 World Series also tended to be considerably older than the league average. Given the apparent inverse relationship between age and top performance identified previously, this is problematic.

Source: Mark T. Williams Boston University Study 2019

MLB is simply ignoring valuable, available data.

Despite the hard evidence, each season, MLB continues to keep questionable performers, some past their prime, on the job. The past three World Series were only the most recent examples. Game by game, season by season, poor performing umpires remain on the field. When the error rate can vary as much as 56 percent between the bottom and top performers, who is behind home plate matters a lot. In 2018, 2 percent of all major league games (55) were ended by incorrect calls, an increase of 41 percent from the previous year (39).
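The percentage comparisons quoted in this article can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text, except for the 2,430-game total, which is the nominal regular-season schedule (30 teams, 162 games each, two teams per game) and is an assumption here rather than a number from the study.

```python
def percent_higher(a, b):
    """How much larger a is than b, as a percentage of b."""
    return 100.0 * (a - b) / b


# Bottom 10 vs. Top 10 average BCR, 2008-2018 (13.96 vs. 8.94 percent)
gap_11_seasons = percent_higher(13.96, 8.94)   # about 56 percent

# Bottom 10 vs. Top 10 average BCR, 2018 only (10.88 vs. 7.78 percent)
gap_2018 = percent_higher(10.88, 7.78)         # about 40 percent

# Games ended by an incorrect call: 55 in 2018 vs. 39 the previous year
game_ender_increase = percent_higher(55, 39)   # about 41 percent

# Share of games ended by an incorrect call, assuming a 2,430-game schedule
NOMINAL_GAMES = 30 * 162 // 2
game_ender_share = 100.0 * 55 / NOMINAL_GAMES  # a little over 2 percent
```

Each result rounds to the figure given in the text, which suggests the published percentages are relative differences between group averages rather than differences in percentage points.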

In 2018, 55 games ended when umpires made incorrect calls.

Given the importance of these games and of getting the calls correct, MLB must rethink the process it uses, including incorporating more performance-based measurements when determining hiring, retention, and assignments. If the league is truly committed to game improvement, its officials should aggressively recruit and retain high-performing umpires, as any smart industry does. Unfortunately, the way the current seniority system works, MLB typically has only one or two new umpiring slots open each season. Such a flawed system also prevents promising talent and rising stars from gaining proper recognition or access to best assignments.

Research results also point to the fact that umpire compensation is not closely aligned to performance. The World Umpires Association is the union that represents all MLB umpires. Those with seniority can earn salaries above $450,000 while rookies start at about $150,000. Umpires receive generous travel allowances, including flying first-class. There is also more pay for playoff games. Whether a game takes three or five hours to play, and regardless of whether 2 or 20 incorrect calls are made, umpires enjoy the same pay. The last labor contract was approved in January 2015 and expires at the end of 2019. Study findings support the need for MLB to make stronger performance-based measurements central to the upcoming contract renegotiation process. Longevity alone is hurting the game.

4

Umpire error rate inconsistency by innings was only marginal

Research results demonstrated that while there were high error rates on a per-game and season-wide basis, inter-inning inconsistency in calling balls and strikes remained only marginal. Data for the last 11 seasons showed a slight trend: higher error rates in early innings, lower in middle innings, and slightly higher by the critical ninth inning. When dissecting inning data on a per-ump basis, some exhibited even greater variability.

Umpires’ inconsistent performance by innings:

Source: Mark T. Williams Boston University Study 2019

5

Bad call ratio by year

The error rate for MLB umpires over the last 11 seasons (2008-2018) averaged 12.78 percent. For certain strike counts and pitch locations, as discussed earlier, the error rate was much higher. Some years, the incorrect call ratio exceeded 15 percent. In 2018, it was at 9.21 percent. MLB might attempt to highlight this trend as a sign of strong umpiring. But if there are ways to push error rates even lower, through better hiring practices and integrating useful technology, they should be adopted.

Source: Mark T. Williams Boston University Study 2019

POTENTIAL SOLUTIONS

Technology

As this research has demonstrated, poor umpiring persists. Despite years of data-driven evidence, MLB has been slow to expand the ranks of younger umpires, missing the opportunity to rapidly lower unacceptably high bad call rates. The league has also dragged its feet in putting strike zone–assisted technology behind the plate. In a thinly veiled attempt to silence vocal critics, MLB recently announced it will begin to test robot umpires, but only on a small scale, through the unaffiliated Atlantic League farm program. Instead of addressing this pervasive big-league problem now, MLB continues to stall.

Innovations such as the radar gun, instant replay, pitch graphics, Doppler radar, and strike zone evaluation systems have greatly improved baseball and fan experience. Yet umpires continue to call balls and strikes like they did 100 years ago when Babe Ruth reigned supreme and the Ford Model T ruled the roads. Technology does not have to mean the death of umpires. Rather it’s a tool to allow them to do a better job.

Adopting strike zone technology would free up umpires to remain focused on other aspects of the game and on making sure pace of play is maintained. Major League Baseball has been a follower and not a leader in adopting innovative technology. In contrast, other professional sports have increasingly relied on high-tech aids, rapid communication, and centralized control rooms to improve officiating. In European soccer, at the World Cup, and in professional tennis, Hawk-Eye technology is the standard. In the National Football League, tech-assisted verification is increasingly the norm. It is also customary for football referees, coaches, and quarterbacks to be wired for real-time communication. In international cricket, umpiring has also improved markedly through communicating calls via wireless technology.

Tech-assisted umpires

To dramatically improve behind-the-plate umpiring, the solution is not for baseball to bring in the robots and fire the umpires. Baseball has too many one-off situations and complexities to assume a bot could do everything ump-like. However, MLB has a unique opportunity to set a higher standard, apply performance measurements, and strengthen human-software collaboration. For this to move forward, the World Umpires Association would need to acknowledge existing umpiring deficiencies, accept stronger performance-based approaches, and support innovative tech solutions.

Umpires connected to central control could easily be fitted with headsets or earpieces, conveying real-time ball and strike information. These umpires could make calls correctly, quickly, and effortlessly. Time-honored and much beloved behind-the-plate signs, signals, and sounds would not be disrupted. Umpires would remain in control, having override ability under certain circumstances, such as if a ball hit the ground before crossing the plate or if a system outage occurred.

Biases would be eliminated. Strike zone subjectivity would be minimized, freeing up more of the plate for pitchers and allowing batters to focus more on hitting and less on guessing inconsistent strike zones. Pace of play would increase. It would also reduce the escalation of umpires blaming players and managers.

CONCLUSION

Major League Baseball’s goal should not be to resist change, but to adhere to the official strike zone that its own rules make clear, on every pitch. High-tech aids and greater recruitment of competent younger umpires are other important steps. Imagine player and fan experience and what baseball would look like if each year the more than 34,000 incorrect calls vanished. Fans could focus more on umpire standouts and rising stars and applaud the veterans who are able to withstand the test of time, just like the best aging ballplayers are appreciated.


Lead researcher Mark T. Williams (center), a Questrom master lecturer, with his research assistants. Photo by Jake Belcher

It is unrealistic to assume that home-plate umpires, unassisted, can collectively achieve the accuracy rates increasingly demanded by the sports industry and deserving fans. Given that umpires hit standard peaks, hiring and retention policies need to be adjusted accordingly. Adopting a stronger performance-based system coupled with readily available technology would allow the human aspect of the game to remain while respecting the benefits that can come with advancing technologies. At minimum, using a tech-assisted approach would produce results no worse than our existing band of MLB umpires.

Mark T. Williams (Questrom’93) is the James E. Freeman Lecturer in Management at Boston University Questrom School of Business, where he teaches courses in financial technology and innovation. He can be reached at Williams@bu.edu. A lifelong baseball fan and author of several sports books, he would like to acknowledge the strong contributions made by Tianyang Yang, Brandon Cohen, and the rest of the Boston University student team, all of them master’s in science and mathematical finance students.


48 Comments on MLB Umpires Missed 34,294 Ball-Strike Calls in 2018. Bring on Robo-umps?

  • Ed Casaccio on 04.08.2019 at 11:55 am

    This is a great article. I agree completely. When I coached my son’s team for many years, we would sometimes scrimmage in practice. I would ump and it was very difficult. I actually started standing by the mound so I could see the balls/strikes more clearly. It would not be safe at the Major League level but I saw first hand how difficult it is to be accurate from behind the plate. I agree with this article completely. Great job !

    • VintageVNvet on 04.27.2019 at 12:42 pm

      AGREE, LIKE TOTALLY DUDE!!
      Cringing at the bad calls while watching the Sox defeat the Rays last Saturday, sitting with my best pal from elementary school with whom i used to compete to catch the fouls over the right fence hit by Ted Williams at spring practice, We discussed this very topic after being total fans for over six decades…
      TIME AND ENOUGH MLB::: DO it, please, if for no other reason than to keep a couple of very old fans from having heart attacks by the badly missed calls…

      Thank you.

  • Robert T Flynn on 04.08.2019 at 1:49 pm

    Great job on this article Mark, you and your team put a lot of work into and it really shows. I consider myself a Baseball purist; I hate the DH, I yearn for the game where the catcher is called for a balk, I dislike the bat-flip and the wave belongs nowhere near a Baseball field. Having said all that I am totally for any type of equipment that can aid the umpire in making better calls… as long as they benefit my beloved Yankees! Seriously, if they can do that and not disrupt the pace of play then I’m all for it. I was against the replay for the longest time but now I like it, so I can change too. This reminds me of a story about a young rookie pitcher that was pitching against Rogers Hornsby and the pitcher threw the pitch and the umpire called it a ball, he threw the next one and the ump called a ball, the pitcher got upset so he finally said to ump, “Hey ump, those two pitches were perfect what gives?” The umpire took his mask off and said, “Young man, when you throw a strike, Mr. Hornsby will let you know.”

  • Gil Imber on 04.08.2019 at 2:13 pm

    Mark, what is the criteria you used to determine a call was incorrect? Specifically, what px values comprised your acceptable call range on the horizontal axis (all else equal), and what pz values relative to sz_bot/sz_top comprised your acceptable call range on the vertical axis? Was there any accounting for strike zone depth, e.g., a pitch, based on spin rate, that breaks back over the plate after crossing the front edge off the plate? Margin of error?

    • Mike on 04.10.2019 at 2:12 am

      Gil, you can question, be cynical even yet that doesn’t mean the study had flaws discrediting the findings. Clearly you are educated and intelligent yet it appears you have a bias in proving the assessment and data is incomplete and thus a poor study.

      • Paul Dimitre on 04.10.2019 at 1:10 pm

        That doesn’t answer his questions.

        • Charles on 04.19.2019 at 9:26 pm

          Indeed, Paul Dmitri, those seem like perfectly reasonable questions to me.

          • PJ on 05.11.2019 at 11:24 pm

            I agree totally with the article, it’s great. But I would still like to know the answers to these questions.

  • James Grimm on 04.08.2019 at 4:04 pm

    Awesome!

  • Scott on 04.08.2019 at 4:45 pm

    I am an obsessed baseball fan. This is a great report. I do think something is missing, though. The biggest problems I see with ball/strike calls is that many umpires seem to give a left-handed pitcher throwing to a right-handed batter a much bigger strike zone than a righty to righty matchup. The reverse is also true. It literally seems as though they are completely guessing about the outside edge of the plate. Also, very few umps will call a strike above the belt. If umpires were calling a complete strike zone, the game times would be much shorter. It would be drastically more effective than a pitch clock. Just call the strike zone the way it was called the first hundred years…

  • Robert Scotton on 04.08.2019 at 6:57 pm

    As a 24-year veteran of umpiring high school and Babe Ruth games, I thought the article was interesting.
    However, what is the strike zone you are looking at? Did you know that any ball above the letters is out of the strike zone? Did you ask how many of these pitches were curveballs, change-ups, fastballs, or screwballs? They all have different reactions when thrown. What about the batting stance, which also affects the strike zone?

  • Rickie E Butler on 04.08.2019 at 7:41 pm

    Is there any relation to more pitchers throwing harder, in mph, than ever before? Umpires with many years of calling balls and strikes were calling balls and strikes on much slower pitches. And I wonder if over time, the many-year umpires who were consistently calling balls and strikes on much slower pitches have muscle memory in their eyes. Where pitches’ normal course of flight was easier to track, I wonder if the experienced umpires realistically do not have the physical tools to track accurate flight of pitches that are mph faster than yesteryear. Where newer umpires don’t have that predisposition in their eye muscle memory??

    • Zac Grant on 04.09.2019 at 2:32 am

      I think you make a great observation Rickie. Speaking as an umpire myself, I think reflexes and having young eyes have a big impact on ball/strike accuracy and shouldn’t be overlooked, but I think you definitely could have a point.

      • David Kritzler on 04.16.2019 at 9:36 am

        What you are offering are reasons why MLB umpires are not capable of maintaining a K zone. Digitizing each player is simple enough…humans just cannot do this task acceptably.

  • Dale Russell on 04.09.2019 at 1:48 am

    Well.. let’s see…MLB players made a total of 2688 errors last year.
    They struck out approximately 43,698 times.
    Pitchers gave up a total of 16,736 walks.

    And.. while you can argue that some of those were intentional, that’s still over 62,000 times a player either made a fielding error… misjudged a pitch and struck out… or gave up a walk by not hitting the strike zone.

    Whether those were because of missed calls by umpires… or not… that’s a grievous amount of errors being made by the human ballplayers.

    Why don’t we simply program in abilities and probability statistics and play 162 games for each team on a computer… that way we can remove ALL human error.

    • Mike on 04.10.2019 at 2:15 am

      You’re creating a different debate. Do you want the “judges” in your life, government, police, judges to be accurate and get it right? I assumed you played sports growing up or in college. Did you want officials, referees, umpires to get calls right?

  • Bill Griffith on 04.09.2019 at 9:06 am

    Interesting piece and an amazing amount of man (& woman) hours. Kudos to all in making the case for ball-and-strike accuracy.
    1. Report glosses over that BCR dropped from 16.4 to 9.2 per game since 2008.
    2. Other than game-ending Bad Calls (BCs) and the two-strike tendency, there’s no tracking of how many BCs are in clutch or crucial situations.
    3. In umpire ratings, BCs on bases, HRs, foul balls aren’t factored in.
    4. Nor is Game Management.
    5. Or umps’ personality, ie, the combative quick-ejection folks.
    6. Do certain teams have more BCs against them?
    7. Are certain pitchers able to “work” umpires to get more BCs in their favor?
    8. Did tuition dollars fund this study?
    All-in-all fascinating stuff.

    • Curt W. Walz on 04.24.2019 at 2:21 pm

      3) BC’s on bases, HRs, foul balls are all reviewable, so that would be a different report that would measure accuracy of replay and number of overturns vs total number of calls made.
      6) I’d be curious if more players have BC’s against them (their stance leads to an oddly perceived zone) or catchers with high BC’s for them (they are great at framing a bad pitch; aka “great hands while receiving”)
      8) I’d hope so.

  • Debra Topham on 04.09.2019 at 10:31 am

    While the number of missed pitches seems high, it’s less than 1% of all pitches. The human error factor is 0.8%, which is fantastic given how many times humans make errors. How many pitchers throw 99.92% strikes? How many fielders have a 99.92% fielding percentage? How many batters make it on base 99.92% of the time? We’re all human and it’s the humanness that is part of the game. Controversy gives sportscasters and fans something to talk about. Adding more technology will lengthen an already too long game. We need adults to help others learn that we’re all human, we all make mistakes. It’s part of the game. It’s part of life. It’s up to the players to make sure the tiny number of errors isn’t a factor in the outcome – of the game or of life.

    • Tom D. on 04.09.2019 at 11:51 am

      You’re missing the point. We want to see all the “humanness that is part of the game”, played by the players on the field, add up to the result that it should, according to the rules of the game: not some random result, superimposed by an umpiring mistake.

      “It’s up to the players” – no, it’s not; good games played by good teams are often close. And if a third of all 2-strike balls are called strikes instead…many outcomes are going to be decided by umpiring mistakes. This makes no sense.

    • Jeremy on 04.18.2019 at 6:38 pm

      Where exactly are you getting your less than 1% of all pitches thrown?
      https://www.bu.edu/today/files/2019/04/Screen-Shot-2019-04-04-at-3.27.47-PM.png
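
A rough cross-check of the percentages traded in this thread, using only figures quoted on this page: the 34,294 missed calls in 2018, the data set of almost 4 million pitches over 11 seasons, and the roughly 91 percent accuracy another commenter cites for 2018. The 4,000,000 upper bound and the function name are illustrative assumptions, not numbers or methods taken from the study.

```python
# Figures quoted on this page.
MISSED_CALLS_2018 = 34_294
# "Almost 4 million" pitches over 11 seasons; 4,000,000 is used here
# as an assumed upper bound for the whole data set.
ALL_PITCHES_11_SEASONS = 4_000_000


def implied_called_pitches(missed_calls: int, error_rate: float) -> float:
    """Number of called pitches needed for `missed_calls` to equal `error_rate`."""
    return missed_calls / error_rate


# Called pitches that a 0.8% error rate would require in 2018 alone.
needed_at_0_8_percent = implied_called_pitches(MISSED_CALLS_2018, 0.008)
# Called pitches implied by roughly 91% accuracy (a 9% error rate).
needed_at_9_percent = implied_called_pitches(MISSED_CALLS_2018, 0.09)

print(f"0.8% error rate needs {needed_at_0_8_percent:,.0f} called pitches")
print(f"9% error rate needs {needed_at_9_percent:,.0f} called pitches")
print("0.8% exceeds the 11-season total:",
      needed_at_0_8_percent > ALL_PITCHES_11_SEASONS)
```

Under these assumptions, a 0.8 percent error rate would need more called pitches in 2018 alone than the entire 11-season data set contains, which suggests the "less than 1%" figure does not follow from the article's numbers. Note also that a 0.8 percent error rate corresponds to 99.2 percent accuracy, not 99.92 percent.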

  • Michael on 04.09.2019 at 11:02 am

    This article definitely seems to set out with an agenda. Data doesn’t lie, but those who interpret it do. Even your best attempts in this article seem biased.
    Also, there are some things that are not considered which anyone who plays or watches a lot of baseball would know. For one thing, the strike zone IS intentionally slightly larger with 2 strikes. I learned that when I was 10. Granted, it is an unwritten rule, but it is a known part of the game you ignore.
    And your statistic about strikeouts being more likely to be wrongly called than a ball 4 walk – have you properly controlled for the tendency for ball 4 to be very far from the strike zone? There are plenty of intentional walks, and plenty of pitchers just don’t want to throw a hit-able pitch to the other team’s best player when they have a 3-0 or 3-1 count.
    Also, the graph “Umpires’ inconsistent performance by innings” does seem to show strong correlation by inning. I’m not sure how you can immediately follow that graph with the opposite statement.

  • Kevin on 04.09.2019 at 12:19 pm

    These statistics are a great tool for helping umpires understand their inconsistencies and can show them that they are not perfect………but….

    This article is only addressing one aspect of the MLB Umpire “game”, Balls & Strikes. There are several references to the umpires selected for the World Series not being the “best” umpires. The World Series umpires are not specifically selected for having the “best BCR” strike zone. There are other areas of the umpiring game that come into consideration as well, such as game management, addressing and managing the egos of the players & coaches, having experience at handling many “tough” situations, being able to internally handle the pressure at the general expected level of a World Series, administering and judging the rules of the game as accurately as humanly possible, and getting the balls & strikes as correct as “humanly” possible (last year’s WS had some games above the 97% mark, which is well above the hitters’ averages).

    So, it is more than just balls & strikes. Many of these younger umpires, who seem to be well on their way to very successful careers in MLB, would likely falter a bit when it came to the enormity of the WS games and the intense pressure they are under.

  • Greg on 04.09.2019 at 1:38 pm

    Great report, Mark. You and your team should be very proud.

    When will MLB catch up to NBA when it comes to using technology to get the calls right as often as possible?

    Check out Michael Lewis’ last report on “This American Life”

    Hoop Reams
    https://www.thisamericanlife.org/672/no-fair/act-one-3

    The NBA is aggressively using technology to get the calls right.

    Here’s hoping the MLB will use technology to get the calls right in the future.

    Thanks again for this report.

  • Bunselpower on 04.09.2019 at 1:54 pm

    While I do get frustrated with bad calls and wouldn’t be too upset if robot umps were installed, there are some problems with this article.

    1. You don’t have the criteria used to determine what is wrong or right. I assume it’s a difference between the game record and the pitch f/x result. But the assumption that every pitch f/x is correct to the micron is a little much, which brings me to the next point…

    2. You say that pitch f/x “accuracy is claimed to be within one inch”. So let’s say there’s a margin for error of one inch. When there is a +/- of an inch on a (slightly less than) 3 inch ball traveling across a 17 inch plate, isn’t that somewhat high? You can’t have a margin of error measurable in inches for a game that is known as “a game of inches”.

    3. What’s more, where’s the verification that this is correct? I’m sure it’s there, but realistically, how is it verified? Again, the tolerances must be tight here if we’re going to upend the system.

    4. It totally disregards the situation. If Joe West has a left leaning strike zone, and the pitcher and hitters figure that out, then nobody has a problem. What I would love to see is a regression analysis run on each umpire’s strike zone to see how far out of their own zone they call. Because a consistent slightly off zone is not the problem. It’s an inconsistent zone.

    5. Does this use the official strikezone definition? Because there are going to be a lot of calls at the letter that are going to be called if we adhere to the official strikezone (that I think should be called) that the hitters won’t like.

    All in all, the “resisting change” indictment I see a lot and it is a problem. Not changing something when the mobs call for it is not “resisting change”, it’s exercising prudence, especially with things like DH, robo umps, and other things that are pretty core to the sport. Flowing in and out with the tide does not produce a lasting game.

    I understand that people have the onus to change things because they have the wrong idea that change is always progress. But those that exercise patience will ultimately be the ones that avoid unforeseen consequences and actually make progress, not just change.

    • Daniel on 04.29.2019 at 5:35 pm

      I agree on points 1 and 2. This is a legit concern that should be addressed. Because a baseball is more than 2x larger than the accuracy scale, perhaps the standard could be whether 50% of the ball crosses the strike-zone. That way, even if it’s off by a full inch, that still means 15% of the ball hit the zone. If they can pin it down to within a half inch, I think this issue can be ignored. No human being can confidently argue over a half inch for a thrown pitch.
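
The "15% of the ball" figure in the paragraph above can be checked with a short calculation. This is a minimal sketch that assumes a ball diameter of 2.9 inches (the thread only says "slightly less than 3 inch") and measures overlap along the ball's diameter rather than by area or volume; the function name is hypothetical.

```python
# Assumed ball diameter in inches; the thread says "slightly less than 3 inch".
BALL_DIAMETER_IN = 2.9


def worst_case_overlap(required_fraction: float, error_in: float,
                       diameter_in: float = BALL_DIAMETER_IN) -> float:
    """Fraction of the ball's diameter truly inside the zone in the worst case.

    `required_fraction` is the share of the diameter the tracking system must
    report inside the zone to call a strike; `error_in` is how far off the
    measurement may be, in inches, in the unfavorable direction.
    """
    measured_overlap_in = required_fraction * diameter_in
    true_overlap_in = max(0.0, measured_overlap_in - error_in)
    return true_overlap_in / diameter_in


full_inch = worst_case_overlap(0.5, 1.0)
half_inch = worst_case_overlap(0.5, 0.5)
print(f"1 inch of error: {full_inch:.1%} of the ball still in the zone")
print(f"1/2 inch of error: {half_inch:.1%} of the ball still in the zone")
```

With these assumptions, a 50 percent threshold and a full inch of error leave roughly 15 percent of the ball's diameter in the zone, and half an inch of error leaves roughly a third.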

      Disagree on point 4. “If Joe West has a left leaning strike zone, and the pitcher and hitters figure that out, then nobody has a problem.” Until strike 3 or ball 4 is called.

      On point 5, I hope they do match the rule book. If too many strikeouts occur, change the strike zone or move the pitcher’s mound backward. I’ve felt for a long time that the front edge of the pitcher’s plate should be around 63′ 7″. That’s the midpoint between the back of the plate and 2nd base. Though that hurts the pitcher, it does help him out in 2 ways: reaction time to hit ball and better view to 1st base, which would be at a 90° angle rather than 92.8° angle.
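
The 63′ 7″ and 92.8° figures above follow from the geometry of the diamond. This is a sketch under idealized assumptions: 90-foot base paths forming a perfect square, the pitcher's plate 60′ 6″ from the back point of home plate, and bases treated as points. The variable and function names are illustrative.

```python
import math

BASE_PATH_FT = 90.0      # distance between consecutive bases
CURRENT_RUBBER_FT = 60.5  # 60 ft 6 in from the back point of home plate

# Home plate to second base is the diagonal of the 90-foot square.
home_to_second_ft = BASE_PATH_FT * math.sqrt(2)
midpoint_ft = home_to_second_ft / 2

# Express the midpoint in feet and inches.
midpoint_feet = int(midpoint_ft)
midpoint_inches = (midpoint_ft - midpoint_feet) * 12


def angle_home_to_first_deg(rubber_ft: float) -> float:
    """Angle at the pitcher's plate between home plate and first base.

    Home is at the origin, the y-axis runs from home toward second base,
    and first base sits at (midpoint_ft, midpoint_ft).
    """
    to_home = (0.0, -rubber_ft)
    to_first = (midpoint_ft, midpoint_ft - rubber_ft)
    dot = to_home[0] * to_first[0] + to_home[1] * to_first[1]
    cos_angle = dot / (math.hypot(*to_home) * math.hypot(*to_first))
    return math.degrees(math.acos(cos_angle))


print(f"Midpoint: {midpoint_feet} ft {midpoint_inches:.1f} in")
print(f"Angle at 60 ft 6 in: {angle_home_to_first_deg(CURRENT_RUBBER_FT):.1f} deg")
print(f"Angle at midpoint: {angle_home_to_first_deg(midpoint_ft):.1f} deg")
```

Under these assumptions the midpoint lands a little past 63′ 7″, and moving the pitcher's plate there turns the roughly 92.8° look toward first base into a right angle.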

  • Barry Silverman on 04.09.2019 at 4:27 pm

    As an amateur umpire for over 30 years, knowing and being friends with current and former MLB and MiLB umpires, and more in NCAA Div 1 baseball….

    calling correctly balls and strikes is something that umpires in the highest levels are concerned about each and every year.

    Apparently data that MLB umpires get from MLB Umpire Supervisors and whatever tracking information is used at those levels of professional baseball indicate that their accuracy of calling balls and strikes has improved over the years from the low 90 percentile to about 97-98%.

    Your data indicates something entirely different. And while your statistical analysis may have some validity, I think it’s important that your data ‘walk in the other person’s shoes’!

    Maybe if your data is compared to what MLB Umpires get regarding calling pitches there would be a better understanding of how to use your data more effectively.

    Speaking from personal experience and umpiring knowledge any part of the ball that touches the 3 dimensional strike zone is generally called a strike.

    However, perception is part of the game, and the players know such. So when a pitch looks very close to being in that 3 dimensional zone and the catcher does something to make the umpire believe it’s a strike, many umpires will call such a pitch a ball.

    Maybe a discussion with catchers about how they rate MLB Umpires would be an interesting non-data analysis of how effectively any specific MLB Umpire calls pitches.

    There’s more to any analysis than statistics and data. As someone teaching and passing on your knowledge to your students the need is there to be as fair and impartial by looking into all the areas where either specific data is available or perceptions of those who are catching each of those thousands of pitches.

    On a positive note, I did really enjoy reading the complete analysis. I thought it looked into many different ideas of analysis.

    Do you think any of it would apply to our current representatives in Congress? LOL

    That would be a very interesting analysis for your students.

  • Dave Cacela on 04.09.2019 at 8:27 pm

    This research appears strong and convincing, however the report is lacking in at least a couple of ways.

    First, the description of exactly how the technology determines a correct call is deficient. The analysts assume the technology as a “perfect” reference so, as such, they ought to rigorously test that assumption.

    Second, and at least as important, is lack of analysis of consistency of individual umpires’ calls. It is well known at all levels of baseball that individual umps have certain biases. By and large, players and coaches adapt to those biases and they tend to judge umpires by their “internal consistency”, not consistency against the book definition of a strike. I believe internal consistency is something that can and should be measured with the available data.

    Finally, there are many other important nuances of tradition in the game that are wholly absent from this analysis, most notably the importance of how a catcher “frames” a pitch. For example, it is widely understood that a pitch that requires the catcher to reach to catch it will very rarely be a called strike regardless of location. As such, fans and players will tend to distrust a robot that calls a strike if the catcher must reach.

    I commend these data scientists for their work, and encourage them to refine it in future iterations.
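
The "internal consistency" idea raised in this thread can be sketched in a few lines. This is not the study's method and not a regression: it is a simple threshold fit, assuming each pitch is reduced to its horizontal distance from the center of the plate and the umpire's call. The pitch list is made-up toy data, the 8.5-inch rulebook half-width comes from the 17-inch plate mentioned above and ignores the ball's radius, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

# (horizontal distance from plate center in inches, called strike?)
Pitch = Tuple[float, bool]


def best_own_edge(pitches: List[Pitch]) -> Tuple[float, int]:
    """Find the half-width that best explains this umpire's own calls.

    Returns the edge and the number of calls that disagree with it.
    """
    best_edge, best_misses = 0.0, len(pitches) + 1
    for edge in sorted({abs(x) for x, _ in pitches}):
        misses = sum((abs(x) <= edge) != strike for x, strike in pitches)
        if misses < best_misses:
            best_edge, best_misses = edge, misses
    return best_edge, best_misses


def consistency_report(pitches: List[Pitch],
                       rulebook_half_width: float = 8.5) -> Dict[str, float]:
    """Compare disagreement with the rulebook edge and with the umpire's own edge."""
    own_edge, own_misses = best_own_edge(pitches)
    book_misses = sum((abs(x) <= rulebook_half_width) != strike
                      for x, strike in pitches)
    total = len(pitches)
    return {
        "own_edge": own_edge,
        "miss_rate_vs_own_zone": own_misses / total,
        "miss_rate_vs_rulebook": book_misses / total,
    }


# Toy data: an umpire who consistently calls strikes out to about 10 inches.
toy_pitches: List[Pitch] = [
    (2.0, True), (7.0, True), (9.0, True), (10.0, True),
    (11.0, False), (12.0, False), (-9.5, True), (-11.5, False),
]
report = consistency_report(toy_pitches)
print(report)
```

In this toy case the umpire disagrees with the rulebook edge on three of eight pitches yet never contradicts his own zone, which is the distinction between a wide zone and an inconsistent one.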

  • Sam on 04.09.2019 at 11:33 pm

    I believe you mean “inter-inning.” Intra-inning means within the same inning, while inter-inning means between different innings.

  • Rickie E Butler on 04.10.2019 at 3:34 pm

    Since the technology to assess ball and strike calls from MLB games of the 50’s, 60’s, 70’s, 80’s, and 90’s is apparently not available, are we sure the prior generations of MLB umpires were any more accurate in calling balls and strikes? Or has technology just improved to the point where it can be applied to any human judgment call, inadvertently casting a dim shadow on the current MLB umpires, from the 90’s until today, and making the current roster look less efficient than prior generations at calling balls and strikes? Maybe the prior generations of MLB umpires were less efficient in calling balls and strikes. It is easy to be a back seat driver and criticize current MLB umpires, but if you or I were standing behind any major league catcher trying to determine whether a thrown baseball fit into an imaginary, predetermined strike zone of invisible lines, would we be able to do any better? Understandably these are seasoned professionals, with many years of training, but are they really supposed to be perfect? What level of human error is acceptable regarding the calling of balls and strikes in MLB?

  • TigerDoc on 04.12.2019 at 10:40 am

    I generally enjoyed reading this, but it certainly comes at this from a particular bias. Your data does show that umpires are improving, and that is likely due to the use of technology to help grade their performance. Since this is a game played by humans, there will always be an error rate, why should that not be accepted?

    You also fail to realize one critical factor in all of this: the umpires have a very strong union, which has resisted change, resists punishing umpires publicly, and protects those with seniority. Even with their CBA expiring at the end of this season, do not expect that to change a whole lot.

    Lastly, with any technology, before you roll it out there needs to be testing. So MLB is not dragging its feet by having the Atlantic League test out technology assisted umpiring. It is doing the right thing. There are always bumps in the road, unforeseen problems when using new technology. So yes, it should have some actual on-field testing before being rolled out at the highest level.

  • Jackfruit on 04.12.2019 at 1:21 pm

    The only strike zone I’ve ever known is the one the ump, the catcher, and I created together. The best part about a zone is that it’s the umpire’s, affected by the batter’s stance and the catcher’s position, and can change and you have to change with it. It’s an integral part of a unique game to keep the game organic and unique. Change that and you might as well fully transform players to player factories only, with the only MLB players being 5 tool from a combine. Reality is you’d never have a Will Clark or an Edgar Martinez, or a Pete Rose ever again. None were 5 tool, and no ump will ever be perfect. It’s the beauty of the game.

  • Rickie E Butler on 04.12.2019 at 6:51 pm

    As a reference point, look at the NFL the year the season opened with “Fill-In” referees and umpires. It wasn’t too many games into the season before it was apparent that, although the Union Referees, Umpires, etc. were far from perfect, they were definitely more efficient in calling NFL games: determining valid catches, forward progress, interference, etc. Much the same thing would happen in MLB if “substitute” umpires were in charge of balls and strikes, double play calls, runner out calls, etc. I would almost bet the people who are slamming umpires’ balls and strikes would quickly change their tune if substitute MLB umpires were suddenly in charge of the MLB games. Your thoughts?

  • Steve Sherman on 04.13.2019 at 8:13 am

    This article assumes that the Statcast strike zone is 100% accurate. It ignores the fact that there is a human element in determining the vertical borders of that strike zone, which according to the rules of baseball should be recalibrated on every pitch as the hitter varies his stance. It ignores the fact that there is a different position for the radar devices in each ballpark.

    I don’t dispute that umpires make mistakes and I find it particularly interesting that the error rate increases with age (though I’m not sure why it doesn’t also increase toward the later innings, when fatigue should be a factor). I’m not convinced, however, that a ‘robot’ strike zone is automatically more accurate. Worse, since it’s the robot strike zone that is being used to judge the umpires, I don’t see any way to make an objective judgment as to which is more accurate.

    At this stage of the technology, a case can be made for using it as an aid to the umpire. But I do not agree that it should have the final call.

  • dominik on 04.13.2019 at 7:21 pm

    I think the 2 strike thing is bad science. The zone actually gets smaller with 2 strikes and larger with 3 balls
    https://blogs.fangraphs.com/the-size-of-the-strike-zone-by-count/

    You have to consider there is a selection bias, sure there are more incorrect strike than ball calls but balls within the zone are also rarer because they tend to get swung at, i.e there are way more takes out of the zone.

  • dominik on 04.13.2019 at 7:36 pm

    While I’m pro robo ump, your opinion that MLB is stalling is very wrong. It actually would be irresponsible to introduce a technology at the highest level without testing it.

    Testing it at the lower level is exactly the right thing to do.

    Also I don’t like how strongly you argue that mlb has to act now as if world integrity was in danger.

    Bad ump decisions have been around for decades but are no economic issue for mlb. In the end it is a zero sum game for mlb and some even argue that little human element of chance is good. I don’t agree with that assertion but mlb is still making record revenue, there is no crisis with this. Fans will go off on ump bad calls but they won’t stop watching baseball.

  • Mark Trolio on 04.21.2019 at 6:12 am

    This study makes me wonder if NY Yankee Don Larsen’s last pitch in his World Series perfect game was really a correctly “called strike”…

  • paul on 04.22.2019 at 4:08 am

    A brit perspective… we are trying out VAR (Video Assisted Refereeing) for football (soccer) at the moment… all that has happened is that the arguments have moved from the play to arguing about the VAR call – with ridiculous amounts of time being used up when challenges are made. Sometimes things are best left well alone and just accept human decision making is the best.

  • Sparks on 04.22.2019 at 7:27 am

    I play historic base ball, the kind where the game is spelled out with two words, with rules written in the 1860’s and ’70’s, when it was played by gentlemen. Our umpire calls balls and strikes only when the hurler and striker can’t get it together. Close plays are decided by the players themselves, and “judgement” called only when they can’t agree. And then, the umpire’s judgement is accepted. Return to a game of honor, of fair play and accept that the game is played by humans, and this, mistakes and all, is how the game should be played. Of course, I understand this is not possible in modern baseball, which is a business after all (note, the research was done by the business school) and there is a lot of money on the line. I am content and thankful to relive a bit of history and leave the current game to the “professionals”.

  • Jeffery Pijanowski on 04.24.2019 at 2:32 pm

    So umpires get it right about 91 percent of the time in 2018, yet you want to change it? I’m lucky in life if I get everything right 91 percent of the time.

  • RoboFan on 04.26.2019 at 12:28 am

    Let’s face it friends, there are some great ball/strike calls I see in MOST games. It’s a really tough job behind the plate. If we’re all honest, we’d mainly just have to tip our hats to the HPU’s. Having said that, boy, there really might not be anything that can make me jump out of my chair and scream at the TV screen quite like some of the gross bad calls from time to time behind the plate. The ones that are close you just kinda figure, buddy maybe you should think about protecting the plate when they are that close (to the batter). But it’s those gross game changing bad calls that have driven me to say, “enough is enough”, bring on the robot-caller. Use the technology, give the ump an in-ear device or vibrating device or something.

  • Glenn Norris on 05.02.2019 at 3:14 pm

    I favor the use of technology to help the umpires get it right. One of the most attractive aspects of baseball is its capacity to put athletes in competitive situations that reward excellence. There are so many things that are right about Baseball, it’s a true shame that something this wrong is allowed to diminish the game’s ability to facilitate the truth.

    We have the technology to tell us when the ball passes through the strike zone: a column with five corners, its footprint being home plate and its top and bottom defined by the rules and the player’s anatomy. The rectangle we see on TV shows us only a plane at the front of the zone and ignores the rest of the column. Breaking pitches can miss the front plane and still pass through the column to become a strike. Nobody can reasonably expect a human to get that right from a crouched position behind the catcher.

    For MLB to have access to that technology and not use it in service to the integrity of the game is irresponsible at best.
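
The "column with five corners" described above can be sketched as a point-in-pentagon test plus a height check. This is a minimal illustration, not how any tracking system is known to work: it treats the pitch as a straight line between two sampled points, tracks only the ball's center (ignoring its radius), and uses made-up zone heights of 18 and 42 inches. All names are hypothetical.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]  # (x, y, z) in inches


def in_plate_footprint(x: float, y: float) -> bool:
    """True if (x, y) lies on home plate.

    y = 0 is the 17-inch front edge and y = 17 is the back point;
    x is measured from the center line of the plate.
    """
    if not 0.0 <= y <= 17.0:
        return False
    # Straight sides for the first 8.5 inches, then a taper to the back point.
    half_width = 8.5 if y <= 8.5 else 17.0 - y
    return abs(x) <= half_width


def crosses_zone(entry: Point3D, exit_: Point3D,
                 z_bottom: float, z_top: float, steps: int = 100) -> bool:
    """Sample a straight path and report whether any sample is inside the column."""
    for i in range(steps + 1):
        t = i / steps
        x = entry[0] + t * (exit_[0] - entry[0])
        y = entry[1] + t * (exit_[1] - entry[1])
        z = entry[2] + t * (exit_[2] - entry[2])
        if in_plate_footprint(x, y) and z_bottom <= z <= z_top:
            return True
    return False


# A pitch that is an inch outside at the front plane but drifts back over the plate.
late_breaker_entry, late_breaker_exit = (9.5, 0.0, 30.0), (5.5, 17.0, 28.0)
print("On the plate at the front plane:", in_plate_footprint(9.5, 0.0))
print("Passes through the column:",
      crosses_zone(late_breaker_entry, late_breaker_exit, 18.0, 42.0))
```

In this toy example the front-plane rectangle would show the pitch off the plate, while the column test counts it as passing through the zone, which is the case the comment describes.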

  • Joe Wilhelm on 05.03.2019 at 6:13 pm

    Robo umps provide the consistency that human umps are just not capable of. It doesn’t matter if a pitch is 1/2 an inch off the plate and is called a strike or not, just as long as every player gets the exact same call. Machines provide that, a human does not. The outcome of a game should be determined by the performance of the athletes playing under the same rules without advantage and not by some random inconsistency or a mistake of an umpire. It is beyond logical comprehension to defend the idea of an umpire randomly making incorrect calls, sometimes egregiously, which may or may not determine the outcome as “Part of the Game”.

    Baseball has been suffering decreasing attendance and interest for several years and MLB should be looking at every way to improve the experience. I don’t watch games to see umpire mistakes determine the outcome, nor do I watch to see players argue with the umps. The accuracy and consistency of calling balls and strikes is such a major part of the game that it needs to be fixed since it readily can be.

  • Will Lana on 05.06.2019 at 4:35 pm

    I agree. Tech assisted balls & strikes is the #1 step MLB can take to improve the game. Bad calls are not a tradition worth protecting. The fact that the strike zone can be called extremely accurately with tech is a big advantage for baseball relative to football or basketball where ambiguous calls like holding, pass interference and fouls will continue to frustrate fans. Hope MLB realizes tech assisted strike zones is an opportunity not a threat.

  • Mark Chatterley on 05.18.2019 at 10:15 pm

    Do you happen to have the data that compares teams? Which teams got more favorable calls and which teams got less? I would almost put money that the lower smaller market teams got less favorable calls more often than not

  • steve on 05.23.2019 at 10:03 am

    Completely agree with comment on high strikes. However, people have been criticizing the paucity of “high” strike calls for decades, so this is nothing new.

    One remedy: Change the rules. This would be like recognizing that people routinely drive at least 5 mph over the posted speed limit (closer to 10, maybe) and so we should increase the limit accordingly.

  • Eric McMurray on 05.23.2019 at 10:54 pm

    The study is flawed, at least to the same level as the error rate it attributes to umpires, simply because the strike-zone measuring equipment imposes a static zone on each batter, from Aaron Judge, who is 6’7”, to Garcia on the White Sox at 5’7”. Unless the strike zone, as defined in the Official Rules of Baseball, can be programmed and adjusted for every batter – their height, and individual batting stances when ready to swing – then the analysis cannot be considered definitive in terms of percentages of accuracy or inaccuracy of a given umpire. That being said, is Angel still horrible? Yes. There are 1000 non-MLB umpires who are better.

  • Anthony Quaff on 05.24.2019 at 12:39 am

    Let them play.
