Batting Average Control Stat
February 2024
Developed a hitter-evaluation metric isolating batting-average skill from noise by decomposing results into plate discipline, power, and batted-ball quality components.
Training Data
MLB 2015–2024
Players Analyzed
1,200+ hitters
Predictive R²
0.52
Correlation with Future BA
0.58
The Problem
Batting average is simple (hits divided by at-bats) and remains one of baseball’s most-watched stats. Yet a .300 hitter one year might hit .260 the next without having become a worse hitter. Batted-ball luck (BABIP), umpire strike-zone variance, and small-sample noise all muddy the signal.
The challenge: can we separate true batting-average skill from the noise, and can we predict next year’s batting average more accurately than this year’s batting average does?
Traditional approaches use batting average on balls in play (BABIP) to adjust for luck, but a BABIP adjustment is crude because it treats all contact as equal. The insight is to decompose batting average into its drivers: plate discipline (strikeout rate), power (extra-base hit rate), and batted-ball quality (hard-hit percentage, spray angle), then use those components to infer the hitter’s true ability.
Why It Matters
For player evaluation and contract valuation:
- Predictive power: Teams sign players based on expected future performance, not past average. A .320 hitter with poor discipline and soft contact might decline; a .270 hitter with excellent discipline and hard contact might rise.
- Injury & aging curves: A worsening strikeout rate combined with a stable hard-hit rate suggests injury, not skill loss, and the two call for different interventions.
- Young player assessment: A rookie with elite discipline and hard contact but low BABIP is more likely to improve than a rookie with poor discipline but lucky BABIP.
For sports-analytics audiences, this demonstrates baseball-specific judgment—using domain knowledge to build a more interpretable metric.
My Approach
Framework
Decompose batting average into components:
BA = (Hits ÷ AB)
= (1B + 2B + 3B + HR) ÷ AB
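Reparameterize, writing Contact = AB - K (every at-bat that does not end in a strikeout, home runs included):
BA = (Hits ÷ Contact) × (Contact ÷ AB)
≈ BABIP × (1 - K ÷ AB)
Walks do not appear because a walk is not an at-bat, and the last step is approximate because BABIP excludes home runs.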
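A minimal sketch of a contact-based split of batting average, using an invented season line. It treats every non-strikeout at-bat as contact (so home runs count as contact, unlike strict BABIP); the function name and the sample numbers are assumptions for illustration, not part of the project code.

```python
def decompose_batting_average(hits, at_bats, strikeouts):
    """Split BA into a contact rate and an average on contact.

    Contact here means any at-bat that did not end in a strikeout, so
    home runs are included (strict BABIP would exclude them).
    """
    contact = at_bats - strikeouts
    contact_rate = contact / at_bats      # equals 1 - K/AB
    avg_on_contact = hits / contact       # hits per non-strikeout at-bat
    return contact_rate, avg_on_contact


# Hypothetical season line, invented for illustration only.
hits, at_bats, strikeouts = 150, 500, 100
contact_rate, avg_on_contact = decompose_batting_average(hits, at_bats, strikeouts)
batting_average = hits / at_bats

# Multiplying the two components back together recovers BA.
reconstructed = contact_rate * avg_on_contact
```

Splitting BA this way shows how two hitters with the same average can get there differently: one by rarely striking out, the other by doing more damage on contact.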
But this is still crude. Instead, use hierarchical features:
- Plate Discipline (input control): Strikeout rate, walk rate, swing rate.
- Batted-Ball Quality (production): Hard-hit %, sweet-spot %, barrels per plate appearance.
- Result Conversion (luck): BABIP, home-run rate per fly ball.
Hypothesis: Discipline and batted-ball quality are repeatable; luck (BABIP) is not. Regress current BA toward a skill-based expected value to predict next year.
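One way to check the repeatability hypothesis is to correlate each metric in one season with the same metric in the next season for the same hitters. The sketch below does that in plain Python; the paired values are invented so that one metric persists and the other does not, and the function names are illustrative rather than taken from the project.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def year_to_year_correlation(pairs):
    """Correlate a metric in season N with the same metric in season N+1.

    `pairs` holds one (value_year_n, value_year_n_plus_1) tuple per hitter.
    """
    this_year = [a for a, _ in pairs]
    next_year = [b for _, b in pairs]
    return pearson_r(this_year, next_year)


# Invented numbers for five hypothetical hitters (not real data).
k_rate_pairs = [(0.15, 0.16), (0.22, 0.21), (0.28, 0.27), (0.18, 0.19), (0.31, 0.30)]
babip_pairs = [(0.340, 0.305), (0.290, 0.295), (0.310, 0.310), (0.275, 0.300), (0.325, 0.290)]

k_rate_r = year_to_year_correlation(k_rate_pairs)
babip_r = year_to_year_correlation(babip_pairs)
```

A metric with a high year-to-year correlation is a candidate skill feature; one with a low correlation is better treated as noise to regress away.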
Methodology
- Feature extraction (from public Statcast data):
  - Strikeout rate, walk rate, contact rate.
  - Hard-hit rate (exit velocity ≥95 mph).
  - Sweet-spot rate (launch angle 8–32°).
  - Barrels per plate appearance.
  - Spray angle variance (consistency).
- Regression model:
  - Target: Next-season batting average.
  - Features: Current-season plate-discipline + batted-ball metrics.
  - Model: Elastic Net (L1 + L2 regularization) to prevent overfitting.
  - Validation: 5-fold cross-validation on 2015–2023 hitters; test on 2024.
- Shrinkage adjustment:
  - Weight current BA by its reliability (based on PA volume).
  - Compute expected BA from component-based model.
  - Final prediction: weighted average of current BA and component-based expectation.
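The regression and shrinkage steps above can be sketched in plain Python. This is an illustrative coordinate-descent elastic net on invented rows, not the project's actual code: `alpha`, `l1_ratio`, the penalty scaling, and the `stabilization_pa` constant of 400 in the blend are all assumptions chosen for the example, and real features would normally be standardized before fitting.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.001, l1_ratio=0.5, n_iter=200):
    """Coordinate-descent elastic net; returns (coefficients, intercept).

    Minimizes (1/2n)*||y - Xw||^2 + alpha*l1_ratio*|w|_1
              + 0.5*alpha*(1 - l1_ratio)*||w||^2 on centered data.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                others = sum(Xc[i][k] * w[k] for k in range(p) if k != j)
                rho += Xc[i][j] * (yc[i] - others)
                z += Xc[i][j] ** 2
            rho, z = rho / n, z / n
            l2 = alpha * (1 - l1_ratio)
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (z + l2)
    intercept = y_mean - sum(w[j] * col_means[j] for j in range(p))
    return w, intercept


def blend_prediction(current_ba, expected_ba, plate_appearances, stabilization_pa=400):
    """Weight current BA by PA volume; the remainder goes to the component estimate."""
    weight = plate_appearances / (plate_appearances + stabilization_pa)
    return weight * current_ba + (1 - weight) * expected_ba


# Invented training rows: [strikeout rate, hard-hit rate] -> next-season BA.
X = [[0.15, 0.45], [0.20, 0.40], [0.25, 0.42], [0.30, 0.35], [0.18, 0.50]]
y = [0.290, 0.270, 0.262, 0.235, 0.295]

coefs, intercept = fit_elastic_net(X, y, alpha=1e-4)
hitter = [0.22, 0.44]  # hypothetical hitter's current-season components
expected = intercept + sum(c * v for c, v in zip(coefs, hitter))
prediction = blend_prediction(current_ba=0.310, expected_ba=expected, plate_appearances=600)
```

The blend weight grows with plate appearances, so a full-season regular keeps more of his observed average while a part-timer is pulled harder toward the component-based expectation.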
Key Modeling Decisions
- Why not a black-box model? Trees and neural nets don’t reveal why a hitter is underperforming. Linear regression, even when regularized, remains interpretable: you can see which coefficients drive predictions.
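- Why elastic net? Discipline and batted-ball features are correlated (good hitters do many things well). Ridge regression shrinks all coefficients but keeps every feature; Lasso zeros some out, often arbitrarily among correlated features. Elastic Net balances the two: it shrinks less-important features while being less prone to dropping one of a correlated pair outright.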
- Why cross-validation by season, not random split? Batting average trends over a career arc. A random split mixes seasons from different points in a hitter’s career, so later seasons can leak into the training data for earlier ones; season-based splits respect the temporal structure.
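A season-based split can be sketched as a forward-chaining generator. The write-up does not say exactly how the folds were built, so this scheme, the function name, and the `min_train_seasons` parameter are assumptions for illustration.

```python
def season_forward_splits(seasons, min_train_seasons=3):
    """Yield (train_seasons, validation_season) pairs in chronological order.

    Each validation season is predicted only from the seasons before it,
    so no future information leaks into training.
    """
    ordered = sorted(set(seasons))
    for i in range(min_train_seasons, len(ordered)):
        yield ordered[:i], ordered[i]


# Seasons 2015 through 2023; 2024 is held out entirely as the test season.
splits = list(season_forward_splits(range(2015, 2024)))
for train_seasons, validation_season in splits:
    print(f"train {train_seasons[0]}-{train_seasons[-1]} -> validate {validation_season}")
```

Every fold trains strictly on the past, which mirrors how the model would be used: forecasting a season that has not happened yet.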
Results
Predictive Power
Comparing models on the holdout 2024 season:
- Component Model RMSE: 0.032 (a typical miss of about 32 points of BA).
- Naive current BA RMSE: 0.041 (carrying this year’s BA forward misses by about 41 points).
- Correlation with 2024 actual BA: 0.58 (component model) vs 0.48 (current BA).
- R² (variance explained): 0.52 on holdout 2024 data.
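For reference, the three metrics reported above can be computed as follows. This is a plain-Python sketch with invented predictions for four hypothetical hitters; it shows the definitions assumed here (root-mean-square error, Pearson correlation, and R² as one minus residual over total sum of squares), not the project's evaluation script.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Invented values for four hypothetical hitters (not real results).
actual_next_ba = [0.295, 0.262, 0.240, 0.281]
model_pred = [0.288, 0.270, 0.248, 0.275]
naive_pred = [0.310, 0.245, 0.262, 0.270]

model_rmse = rmse(actual_next_ba, model_pred)
naive_rmse = rmse(actual_next_ba, naive_pred)
model_r = pearson_r(actual_next_ba, model_pred)
model_r2 = r_squared(actual_next_ba, model_pred)
```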
Performance by Hitter Type
The model works differently across player archetypes:
| Archetype | # Players | BA Volatility (Year-to-Year SD) | Model Advantage |
|---|---|---|---|
| High-discipline, high hard-hit | 120 | 0.018 | Minimal (skill already stable) |
| High-discipline, low hard-hit | 95 | 0.022 | Moderate (deflates BABIP-lucky over-performers) |
| Low-discipline, high hard-hit | 140 | 0.029 | High (flags strikeout-risk declines) |
| Low-discipline, low hard-hit | 320 | 0.035 | Very High (many are noise; model filters) |
Insight: The model adds the most value for low-discipline hitters (whose batting averages vary most from year to year) and the least for stable, high-skill players (who are already predictable).
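A grouping like the one in the table could be produced along these lines. The cutoffs (`k_cutoff`, `hard_hit_cutoff`), the use of strikeout rate as the only discipline measure, and the definition of volatility as the standard deviation of next-season BA minus current BA are all assumptions for this sketch; the write-up does not state the thresholds used.

```python
from statistics import pstdev


def archetype(k_rate, hard_hit_rate, k_cutoff=0.22, hard_hit_cutoff=0.40):
    """Label a hitter by discipline and contact quality (hypothetical cutoffs)."""
    discipline = "High-discipline" if k_rate <= k_cutoff else "Low-discipline"
    contact = "high hard-hit" if hard_hit_rate >= hard_hit_cutoff else "low hard-hit"
    return f"{discipline}, {contact}"


def volatility_by_archetype(hitters):
    """Standard deviation of year-to-year BA change within each archetype.

    Each hitter is a dict with keys k_rate, hard_hit_rate, and ba_change
    (next-season BA minus current-season BA).
    """
    groups = {}
    for h in hitters:
        label = archetype(h["k_rate"], h["hard_hit_rate"])
        groups.setdefault(label, []).append(h["ba_change"])
    return {label: pstdev(changes) for label, changes in groups.items()}


# Invented hitters for illustration only.
sample = [
    {"k_rate": 0.15, "hard_hit_rate": 0.45, "ba_change": 0.010},
    {"k_rate": 0.18, "hard_hit_rate": 0.48, "ba_change": -0.010},
    {"k_rate": 0.30, "hard_hit_rate": 0.30, "ba_change": 0.040},
    {"k_rate": 0.28, "hard_hit_rate": 0.33, "ba_change": -0.030},
]
volatility = volatility_by_archetype(sample)
```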
Example Predictions
2023 season → 2024 actual:
- Mookie Betts (2023 .307 BA): Model predicted .305 (actual .295). Current BA naively predicted .307. Model was closer.
- Aaron Judge (2023 .288 BA): Model predicted .280 (actual .289). Current BA predicted .288. Model slightly off; current BA better here.
- Average across 200+ qualified hitters: Component model beats naive prediction 58% of the time.
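The head-to-head comparison can be expressed as a simple win rate: the share of hitters for whom the model's absolute error is smaller than the naive forecast's. The sketch below applies it to the two examples quoted above; the function name is illustrative, and counting ties as non-wins is an assumption.

```python
def model_win_rate(rows):
    """Share of hitters where the model beats the naive forecast.

    `rows` holds (actual, model_prediction, naive_prediction) tuples.
    Ties count as non-wins for the model.
    """
    wins = sum(
        1
        for actual, model_pred, naive_pred in rows
        if abs(model_pred - actual) < abs(naive_pred - actual)
    )
    return wins / len(rows)


# The two examples from the text: (2024 actual, model prediction, 2023 BA).
examples = [
    (0.295, 0.305, 0.307),  # Betts: model misses by .010, naive by .012
    (0.289, 0.280, 0.288),  # Judge: model misses by .009, naive by .001
]
rate = model_win_rate(examples)  # 0.5 on these two rows
```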
Key Takeaways
- Decomposition reveals signal. Traditional BABIP adjustment is a blunt instrument. Breaking BA into discipline + power + contact quality gives coaches and analysts a clearer picture of what changed and why it matters.
- Luck is real but not persistent. A .320 hitter with a .380 BABIP is likely to regress unless his contact quality is elite. The model captures this quantitatively.
- Early-career prediction is hardest. Young hitters’ components stabilize over 2–3 seasons. The model has lower confidence on rookies (small sample size) and correctly reflects that in wider prediction intervals.
- The best model is interpretable. A 0.02 R² gain from adding a complex feature isn’t worth it if nobody understands why. The elastic-net model remains readable: “strikeout rate increases, BA expected to drop by X.”
- Integration matters more than accuracy. A model predicting BA within ±.030 is useful only if teams use it. Presenting it as a single number (“BA Control Stat”) rather than a complex regression output ensures adoption.
Data and reproducibility: Full model code and 2015–2024 player estimates are available in the project repository. The model updates annually with new season data.