Data Analytics in Baseball

Brad Pitt playing Billy Beane in Moneyball (courtesy of Sony Pictures)

One of the best-known applications of data analytics in sports is Billy Beane’s unconventional approach to building a competitive professional baseball team. Later popularized by the book and movie Moneyball, the story exemplifies the power of data analytics and shows how the field went from a widely criticized amateur pursuit to an integral component of a team’s success.

The 2001 season for the Oakland Athletics ended with a disappointing loss to the New York Yankees in the playoffs. The contracts of some of the team’s star players, such as Jason Giambi and Johnny Damon, had expired, and the A’s were expected to lose them to free agency in the offseason. General manager Billy Beane suddenly had to rebuild a team of similar caliber, and the team’s budget made this task all the more daunting.

As the chart below shows, the Oakland Athletics had the third-lowest payroll in MLB at $41 million, far behind the Yankees’ $125 million. So how could Beane develop a roster that could compete with the Yankees and the Red Sox while limited to a third of the money that these big-market titans enjoyed? Enter data analytics.

Budget of each MLB team for the 2002 season (courtesy of Ridgeway Research)

During the offseason, Beane met Paul DePodesta, assistant general manager for the Cleveland Indians. DePodesta was a data analytics enthusiast whose theories and scouting reports were often disregarded. Beane, however, saw the value of this unorthodox approach.

Billy Beane (left; courtesy of the San Francisco Chronicle) and Paul DePodesta (right; courtesy of Awesome Stories)

Beane hired DePodesta as an assistant general manager, and they quickly got to work. After analyzing volumes of data, they found that certain undervalued statistics, such as on-base percentage and slugging percentage, correlated strongly with winning. Surprisingly, the players who excelled in these categories were cheaper to sign. The team’s more traditional scouts quickly disputed the value of these statistics, focusing instead on visibly apparent qualities like speed and power. Tension followed, creating a divide between the scouts and the analysts.
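For readers unfamiliar with the two statistics, here is a minimal Python sketch of how they are computed from a season batting line, using the standard definitions: on-base percentage is (hits + walks + hit-by-pitch) divided by (at-bats + walks + hit-by-pitch + sacrifice flies), and slugging percentage is total bases divided by at-bats. The `BattingLine` class, the function names, and the player’s numbers are all made up for illustration; this is not the Athletics’ actual model or data.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season batting totals for one player (hypothetical container)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    @property
    def hits(self) -> int:
        # A hit is any single, double, triple, or home run.
        return self.singles + self.doubles + self.triples + self.home_runs


def on_base_percentage(line: BattingLine) -> float:
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = line.hits + line.walks + line.hit_by_pitch
    opportunities = (line.at_bats + line.walks
                     + line.hit_by_pitch + line.sacrifice_flies)
    return times_on_base / opportunities if opportunities else 0.0


def slugging_percentage(line: BattingLine) -> float:
    """SLG = total bases / at-bats."""
    total_bases = (line.singles + 2 * line.doubles
                   + 3 * line.triples + 4 * line.home_runs)
    return total_bases / line.at_bats if line.at_bats else 0.0


# Invented example: a modest hitter who draws a lot of walks.
patient_hitter = BattingLine(at_bats=500, singles=90, doubles=25, triples=2,
                             home_runs=13, walks=95, hit_by_pitch=5,
                             sacrifice_flies=5)

batting_average = patient_hitter.hits / patient_hitter.at_bats
print(f"AVG {batting_average:.3f}  "
      f"OBP {on_base_percentage(patient_hitter):.3f}  "
      f"SLG {slugging_percentage(patient_hitter):.3f}")
# prints: AVG 0.260  OBP 0.380  SLG 0.396
```

The invented player hits only .260, a number that would not impress a traditional scout, yet reaches base 38 percent of the time because of his walks. That gap between batting average and on-base percentage is the kind of overlooked value the story describes.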

The 2002 season got off to a rough start for the Athletics, who showed few signs of making another postseason appearance. By midseason, however, the clubhouse was finally on board with the new emphasis on advanced statistics, known as sabermetrics. The new-look A’s won 20 games in a row, an American League record at the time, and won their division, earning a return to the postseason.

A shot of the team after winning their division and clinching another postseason appearance (courtesy of Awesome Stories)

Though they suffered another disappointing playoff loss, Billy Beane, Paul DePodesta, and the A’s created a success story much larger than an individual season.

An episode of the Simpsons parodying the Moneyball story (courtesy of Fandom)

After the season, numerous teams recognized the potential of sabermetrics and adopted elements of Beane and DePodesta’s strategy. The Boston Red Sox were one of these teams and offered Beane their general manager position. Although Beane declined, the Red Sox implemented versions of his sabermetric models and went on to win the World Series in 2004.

Today, it is increasingly common for professional and even some collegiate sports teams to have their own data analytics department. Teams are also collecting more data than ever, which opens up many more ways to analyze and interpret it. With an approach later popularized by Moneyball, Billy Beane and Paul DePodesta revolutionized baseball, sports as a whole, and the data analytics industry.
