ESPN's Bill Simmons (aka The Sports Guy) recently suggested that the primary cause of dwindling interest in Red Sox games by fans is that baseball games these days are too long. "It's not that fun to spend 30-45 minutes driving to a game, paying for parking, parking, waiting in line to get in, finding your seat ... and then, spend the next three-plus hours watching people play baseball", he says. Revolutions (New about R &c) offers a plot in ggplot2 to determine, anyway, whether the data support the claim that games are getting longer.
Erm, I always thought the reason I thought baseball games were too long was that I was not interested in baseball. Had not considered the possibility that a 3-hour game might put off people who actually liked the game.
Showing posts with label baseball. Show all posts
Thursday, August 12, 2010
Saturday, July 31, 2010
Sunday, July 29, 2007
Bivariate Baseball Plot
Rafe Donahue, a biostatistician at Vanderbilt University, has sent me a link to an interactive website that uses the statistical graphics program R to produce a bivariate baseball plot. Devised in collaboration with Tatsuki Koyama, Jeffrey Horner and Cole Beck (as Rafe has pointed out in the comments), it works like this:
The user selects the team and year in which s/he is interested, then goes on to select from: Day of the Week, Opponent Team, Opponent League, Day/Night, Starting Pitcher, Opponent Starting Pitcher, Home/Away, Pitcher with Decision, Opponent Pitcher with Decision, Month, or First/Second Half. (I know readers have seen drop-down menus before, but they are not usually this much fun.)
R then produces a bivariate plot displaying the results:
As you'll have noticed from the menus, you can then print out your graphic as a PDF.
The Baseball Scoreplot blog explains how to read a baseball bivariate score plot, discusses known issues and analyses the graphic Rafe generated for the Astros, with Roger Clemens as starting pitcher:
The Astros’ opponents’ marginal distribution (on the left) shows how teams fare against teams that beat them: their average rpg is just over 3.5 rpg compared with nearly 4.5 rpg for the Astros. Where the Astros were held to 1 run 27 times, their opponents were held to 1 or fewer on 42 occasions. Note that Clemens started 2 games that were shutouts and started 11 games where the opponents were held to fewer than 2 runs. He also started a game where the opponents scored 9 runs. (Graphic available on blog.)
The joint distribution reveals details of Clemens’ abysmal run support. The bottom-left corner of the distribution shows five games which Clemens started in which the Astros lost 1-0, a pitcher’s nightmare. So, of the 11 games that Clemens started and the opponents were held to one run, 5 of those games failed to produce a single Houston run. In fact, Clemens was the only Astros pitcher to start a game in which the team lost 1-0.
We never see this kind of thing in fiction.
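For anyone curious about what sits underneath a plot like this, here is a minimal sketch of the counting involved. It is in Python rather than R, it is not the code behind Rafe's site, and the scores are invented purely for illustration (they are not Astros results). The function name `score_distributions` is my own. The idea is simply that the joint distribution counts games by (team runs, opponent runs) pairs, while each marginal distribution counts games by one side's runs alone.

```python
from collections import Counter

# Hypothetical (team_runs, opponent_runs) pairs, invented for illustration only;
# these are NOT actual Astros results.
games = [(0, 1), (0, 1), (3, 2), (5, 4), (1, 0), (3, 2), (2, 9)]


def score_distributions(games):
    """Return joint counts and the two marginal counts of runs per game."""
    joint = Counter(games)                         # (team, opp) -> number of games
    team_marginal = Counter(t for t, _ in games)   # team runs -> number of games
    opp_marginal = Counter(o for _, o in games)    # opponent runs -> number of games
    return joint, team_marginal, opp_marginal


joint, team_marginal, opp_marginal = score_distributions(games)

# Bottom-left corner of the joint distribution: games lost 1-0.
one_nil_losses = joint[(0, 1)]

# Read off the opponents' marginal: games where they were held to 1 run or fewer.
held_to_one_or_fewer = sum(n for runs, n in opp_marginal.items() if runs <= 1)

print(one_nil_losses, held_to_one_or_fewer)
```

Filtering `games` to a single starting pitcher before counting would give the kind of overlay the Clemens example describes, assuming each game record also carried the starter's name.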