The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at

Friday, August 03, 2007

JSM handout available

Last Sunday, the 29th of July, I presented the BBSP at the Joint Statistical Meetings (JSM) in Salt Lake City as a 15-minute talk in a session run by the Statistics in Sports Section of the American Statistical Association. A copy of the handout that I, um, handed out at the talk is available (in pdf) for your viewing pleasure.

The handout contains a description of how to read the plots and the Roger Clemens / Houston Astros example. As a high-resolution pdf, it is suitable to be blown up from its normal size (8.5 x 11 inches), and might make a pretty print at 17 x 22. I'm not sure if it is 'suitable for framing', but it might make a unique and inexpensive Christmas (or your favorite gift-giving holiday!) present for your favorite baseball fan.


autosocratic said...

I was forwarded the pdf of your amazing display, and have since used the online version (a lot)!

One interesting item is how differently I can "see" differences when viewing this online versus in the pdf I was forwarded. In the pdf, I had trouble differentiating Clemens from the rest of the team by way of the markers. Would other colors or markers make this clearer?

Likely you played with different ways of displaying the distribution along the axes as well. I've been playing with simple line graphs for "Clemens" versus "Non-Clemens", and they show the distribution much better than the handout, for me at least. However, this too looks better on your website.

One thing that strikes me about the Clemens example is this: how might one catch illegal behavior in MLB? The fact that Clemens had all of the 1-0 games, 5 of them, seems such an outlier that it is, as Dr. Deming says, a special cause worthy of targeted investigation.

One final thought: is it possible to create the graphs for the entire league for a year? It would not only make an interesting way to check the integrity of the programming (it should look symmetric, right, as one team's 7-2 win is another's 7-2 loss), but I'd also be interested to see how the shading changes. Likely very light at the 1-0, 2-1, etc., very heavy at 5-2, 6-3, 6-5, etc., with it becoming light again at 8-1, etc.

An amazing and wonderful display of data. Thanks!

autosocratic said...

Two more interesting plots ...

Gibson in 1968 ... it's no wonder they lowered the mound!

Steve Carlton (Phillies) in 1972 versus the rest of the team ... This has always been a trivia question (which pitcher won the largest % of his team's games). Well, here it is!

Mike Round

beetama74 said...

We tried many colors and shapes before settling on the final version. Maybe these should be options available to the users?

It is possible to create the graph for the entire league for a year with a minor change to the original code. It may be a very crowded graph, but it should be doable. Stay tuned.

autosocratic said...

A fun exercise I found accidentally is keeping the focus on a year, then using the arrow keys to scroll through a team's history to see how they changed. Kansas City, for example, shows a remarkable shift from the early 80s through 2006.

A thought on the upper limit of the display: why was "15+" chosen? There seems to be little activity above "10", so reducing the upper limit to "10+" would eliminate about 55% of the dead space, while better highlighting where all the action is.

Mike Round