Thursday, April 24, 2008

Clay Buchholz Pitch F/X Data

This is ultimately an exercise in an untrained eye having too much data to play with and drawing some uninformed conclusions. So with that out of the way, we’re off. The following plots show some of the pitch f/x data for Clay Buchholz’s first four starts of 2008. The gamelogs for these games can be found here. Most would agree that of the four starts, he had two pretty good ones (the two Fenway starts) and two mediocre-at-best ones (@TOR, @NYA). I’ve used symbols accordingly (filled symbols for good, crosses/hatches for bad), and colors correspond to date. When looking at these plots, remember that the pitch tracking systems at different parks may not be calibrated consistently, so with this small a sample, some apparent outliers may come from systematic errors in the measurement system rather than from anything Buchholz did differently. For an explanation of the pitch f/x data and system, here is a good place to start.
This plot shows vertical break (relative to a pitch with no spin) vs. pitch speed. The pitch type assignments are mine. The pitch types separate very well here. Overall, it appears that Buchholz does better when he’s throwing harder, at least for his curveball and especially for his fastball. I guess that’s not too surprising. His curveball is interesting: he has (apparently) had more success when throwing it harder and with slightly less vertical break. What this plot doesn’t show is where these pitches are located. Taken abstractly, it would seem to me that a slightly slower curve with more break would be better, but if those pitches are going into the dirt while the harder, smaller-breaking curves are dropping in for strikes, then obviously he’ll do better with the latter. We’ve seen what happens when Beckett can and can’t throw his curve for strikes, so it’s probably not all that different here. The changeup is also worth a look: it almost seems like he’s throwing two distinct types of changeup, as the small separate cluster indicates. Those may be pitches that Buchholz is having trouble throwing (as John has suggested), though we really can’t assess that from this plot.
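
For anyone who wants to put rough numbers on the "harder is better" impression instead of eyeballing it, here is a minimal sketch that averages speed and vertical break per pitch type, split by good and bad starts. This is not how the plots were made. The rows are made-up values for illustration, and the tuple layout and the `summarize` helper are my own assumptions.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; these are NOT Buchholz's actual pitches.
# Each row: (pitch_type, start_quality, speed_mph, vertical_break_inches)
pitches = [
    ("fastball", "good", 93.0, 9.0),
    ("fastball", "good", 92.0, 10.0),
    ("fastball", "bad", 90.0, 8.0),
    ("fastball", "bad", 91.0, 9.0),
    ("curve", "good", 78.0, -6.0),
    ("curve", "bad", 76.0, -8.0),
]


def summarize(rows):
    """Return {(pitch_type, quality): (mean_speed, mean_vertical_break)}."""
    groups = defaultdict(list)
    for pitch_type, quality, speed, pfx_z in rows:
        groups[(pitch_type, quality)].append((speed, pfx_z))
    return {
        key: (mean(s for s, _ in vals), mean(z for _, z in vals))
        for key, vals in groups.items()
    }


summary = summarize(pitches)
for key in sorted(summary):
    speed, pfx_z = summary[key]
    print(key, round(speed, 1), round(pfx_z, 1))
```

With four starts the averages carry the same small-sample caveat as the plots, so a table like this is only a way to see the size of a gap, not evidence that the gap is real.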
This plot is similar to the previous plot, but shows horizontal break rather than vertical break. These plots are from the catcher’s point of view; positive horizontal break is to the catcher’s right. It’s pretty clear that Buchholz struggles when his fastball has too much negative horizontal break (in on right-handed hitters). Not sure why this would be unless he’s throwing it too much over the middle of the plate and letting it come too far in on righties. He appears to have more success when his fastball is essentially straight (while a straight fastball isn’t necessarily a good thing, it appears to be a good thing for Buchholz, relatively speaking). His curve has a very interesting spread in horizontal break. There are sort of two clusters, one around 0 to +5, and one from 0 to -8 or so. According to John, Kevin Cash calls Buchholz’s harder curveball a slider, but that doesn’t appear to make sense from this plot; the harder curveballs break in on right-handed hitters, which is the opposite of what a slider should do. Regardless, Buchholz appears to struggle when his curve does that. (Bear in mind that I’m oversimplifying to a large extent here by extrapolating from the fact that his April 16 start was “bad” to calling the results of all curveballs thrown in that start “bad”. There is obviously more to the story, but we’d need to get much more sophisticated in making plots like this to get into it at that level.) Another potential point here is that changeups from the April 16 start are (apparently) almost completely absent from the changeup cluster. This raises the question of whether that second cluster of “curveballs” is actually made up of “changeups” thrown slower and with less horizontal break. If so, this only appears to have happened in what was Buchholz's worst start, and might be one indicator of why it went so badly.
I’m somewhat confused by this plot, enough so to wonder if I’ve misclassified the pitches in it, as the breaks seem to be opposite what might be expected from typical curves and changes. I can’t really distinguish the clusters as well here as in the vertical break plot; what is really needed is a way to classify pitches by type in the pfx-z vs. speed plot and then plot pitch types by color in the pfx-x vs. speed plot. Stay tuned.
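
One possible shape for that two-step idea, as a minimal sketch: assign a type to each pitch from speed and vertical break only, then reuse that label to group the speed vs. horizontal break points, one color per type. The cutoffs in `classify`, the color names, and the sample pitches are all hypothetical; they are not the assignments used in the plots above.

```python
from collections import defaultdict


def classify(speed, pfx_z):
    """Assign a pitch type from speed (mph) and vertical break (inches).

    The cutoffs are illustrative guesses, not the assignments used in the post.
    """
    if pfx_z < 0:
        return "curve"
    if speed >= 88.0:
        return "fastball"
    return "changeup"


# Made-up pitches for illustration: (speed_mph, pfx_x_inches, pfx_z_inches)
pitches = [
    (93.0, -4.0, 9.0),
    (80.0, -7.0, 5.0),
    (77.0, 4.0, -6.0),
    (91.5, 0.5, 10.0),
]

# Hypothetical color choice, one per pitch type.
COLORS = {"fastball": "red", "curve": "blue", "changeup": "green"}


def horizontal_series(rows):
    """Group (speed, pfx_x) points by the type assigned from (speed, pfx_z)."""
    series = defaultdict(list)
    for speed, pfx_x, pfx_z in rows:
        series[classify(speed, pfx_z)].append((speed, pfx_x))
    return dict(series)


series = horizontal_series(pitches)
for pitch_type, points in series.items():
    print(COLORS[pitch_type], pitch_type, points)
```

Each entry in `series` could then be handed to whatever plotting tool is in use as its own scatter series in the matching color. Fixed cutoffs are the crudest option; they would need tuning by eye against the vertical break plot.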
Lastly, a plot of release point from the catcher’s perspective. The only obvious outlier is the April 16 start at Yankee Stadium. The interesting correlation is that in Buchholz’s worst start, his release point was apparently inconsistent with his other starts, appearing to be more of a ¾ release than his normal over-the-top delivery. This is the only game for which I can find pitch f/x data for Buchholz at Yankee Stadium, so given that the offset is so systematic, we have to consider that these results may be biased. This bears watching in his next start in the Bronx, to see whether the difference in release point between that start and the others is real or a calibration problem with the tracking system at Yankee Stadium. Again, stay tuned, but it’s potentially an interesting observation that his worst results (only 2 strikeouts) may have come in a game where his release point was off. Disregard the three outliers in the lower left; they are probably garbage data. Buchholz does not throw sidearm.
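
To make the release point comparison a little more concrete, here is a rough sketch that averages the release point for each start, throws out implausibly low points, and reports how far one start sits from the rest. The sample points, the start labels, the `MIN_HEIGHT` cutoff, and both helper functions are assumptions for illustration, not real tracking data.

```python
from collections import defaultdict
from statistics import mean

# Made-up release points in feet, catcher's view: (start_label, x, height)
releases = [
    ("home_1", -1.0, 6.5),
    ("home_1", -1.2, 6.3),
    ("away_1", -1.1, 6.4),
    ("bronx", -2.0, 6.0),
    ("bronx", -2.2, 5.8),
    ("bronx", -4.0, 3.0),  # implausibly low; treated as bad tracking data
]

MIN_HEIGHT = 4.0  # hypothetical cutoff for discarding garbage points


def mean_release(rows, min_height=MIN_HEIGHT):
    """Return {start: (mean_x, mean_height)}, ignoring points below min_height."""
    groups = defaultdict(list)
    for start, x, height in rows:
        if height >= min_height:
            groups[start].append((x, height))
    return {
        start: (mean(x for x, _ in pts), mean(h for _, h in pts))
        for start, pts in groups.items()
    }


def offset_from_others(means, start):
    """Offset of one start's mean release point from the unweighted
    average of the other starts' mean release points."""
    others = [value for key, value in means.items() if key != start]
    base_x = mean(x for x, _ in others)
    base_h = mean(h for _, h in others)
    x, h = means[start]
    return (x - base_x, h - base_h)


means = mean_release(releases)
offset = offset_from_others(means, "bronx")
print(means)
print(offset)
```

An offset like this cannot say whether the cause is the pitcher or the park's calibration. Comparing it with the same calculation for other pitchers in the same game would be one way to tell the two apart.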

So, remember the small sample size caveat. Stay tuned for more: once I figure out how to plot pitch types by color, we’ll be able to do a lot more, especially with the pfx-x vs. speed plots. What else do people see here?



Blogger John said...

i was confused too on the horizontal break since it seemed to be opposite of what i would have thought, so i just kind of assumed the negative sign was opposite when i was interpreting them. i haven't looked closely at how pitch f/x data is done yet.

it will be real nice to have a season worth of stats broken into "good start" and "bad start" plots.

12:55 AM  
Blogger John said...

gun readings only had buchh topping out at 91 in detroit, i think this trend has merit.

4:42 PM  
