Batting Average Control Stat

February 2024

Developed a hitter-evaluation metric isolating batting-average skill from noise by decomposing results into plate discipline, power, and batted-ball quality components.

Training Data

MLB 2015–2024

Players Analyzed

1,200+ hitters

Predictive R²

0.52

Correlation with Future BA

0.58

The Problem

Batting average is simple (hits divided by at-bats) and remains one of baseball’s most-watched stats. Yet a .300 hitter one year might hit .260 the next without having become a worse hitter. Batted-ball luck (reflected in BABIP), umpire strike-zone variance, and small-sample noise all muddy the signal.

The challenge: can we separate true batting-average skill from the noise, and can we predict next year’s batting average more accurately than this year’s batting average does?

Traditional approaches use batting average on balls in play (BABIP) to adjust for luck, but the BABIP adjustment is crude because it treats all contact as equal. The insight is to decompose batting average into its drivers, namely plate discipline (strikeout rate), power (extra-base hit rate), and batted-ball quality (hard-hit percentage, spray angle), and to use these components to infer the hitter’s true ability.

Why It Matters

For player evaluation and contract valuation:

  • Predictive power: Teams sign players based on expected future performance, not past average. A .320 hitter with poor discipline and soft contact might decline; a .270 hitter with excellent discipline and hard contact might rise.
  • Injury & aging curves: A rise in strikeout rate combined with a stable hard-hit rate suggests a possible injury or timing problem rather than lost skill. Different interventions apply.
  • Young player assessment: A rookie with elite discipline and hard contact but low BABIP is more likely to improve than a rookie with poor discipline but lucky BABIP.

For sports-analytics audiences, this demonstrates baseball-specific judgment—using domain knowledge to build a more interpretable metric.

My Approach

Framework

Decompose batting average into components:

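BA = Hits ÷ AB
   = (1B + 2B + 3B + HR) ÷ AB

Reparameterize, using BABIP = (Hits − HR) ÷ (AB − K − HR + SF):
BA = [HR + BABIP × (AB − K − HR + SF)] ÷ AB
   ≈ HR ÷ AB + BABIP × (1 − K ÷ AB − HR ÷ AB)   (ignoring sacrifice flies)

Walks do not appear here because they are excluded from at-bats.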
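As a quick check on the reparameterization, the sketch below rebuilds batting average from home runs plus BABIP applied to balls in play, using the standard BABIP definition, (H − HR) ÷ (AB − K − HR + SF). The `BattingLine` type and the season line are made up for illustration and are not taken from the project data.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    ab: int  # at-bats
    h: int   # hits
    hr: int  # home runs
    k: int   # strikeouts
    sf: int  # sacrifice flies


def balls_in_play(line: BattingLine) -> int:
    """Denominator of BABIP: at-bats that ended with a ball in play, plus sac flies."""
    return line.ab - line.k - line.hr + line.sf


def babip(line: BattingLine) -> float:
    """Standard BABIP: (H - HR) / (AB - K - HR + SF)."""
    return (line.h - line.hr) / balls_in_play(line)


def ba_from_components(line: BattingLine) -> float:
    """Rebuild BA as home runs plus BABIP applied to balls in play, over at-bats."""
    return (line.hr + babip(line) * balls_in_play(line)) / line.ab


# Hypothetical season line, for illustration only.
season = BattingLine(ab=550, h=160, hr=25, k=120, sf=5)
direct_ba = season.h / season.ab
rebuilt_ba = ba_from_components(season)
```

The two values agree up to floating-point error, which confirms that the decomposition is an exact identity rather than an approximation: all of the "luck" sits in the BABIP term, while strikeouts and home runs set how many at-bats are exposed to it.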

But this is still crude. Instead, use hierarchical features:

  1. Plate Discipline (input control): Strikeout rate, walk rate, swing rate.
  2. Batted-Ball Quality (production): Hard-hit %, sweet-spot %, barrels per plate appearance.
  3. Result Conversion (luck): BABIP, home-run rate per fly ball.

Hypothesis: Discipline and batted-ball quality are repeatable; luck (BABIP) is not. Regress current BA toward a skill-based expected value to predict next year.
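The discipline and batted-ball features reduce to simple rates over plate appearances and batted balls. The following is a minimal sketch, not the project's actual extraction code: the `BattedBall` type and function names are hypothetical, and the thresholds follow Statcast's published definitions (hard-hit at 95 mph or more, sweet spot at a launch angle of 8 to 32 degrees).

```python
from dataclasses import dataclass
from typing import Sequence

HARD_HIT_MPH = 95.0            # Statcast hard-hit cutoff (exit velocity)
SWEET_SPOT_DEG = (8.0, 32.0)   # Statcast sweet-spot launch-angle band, inclusive


@dataclass
class BattedBall:
    exit_velocity_mph: float
    launch_angle_deg: float


def hard_hit_rate(balls: Sequence[BattedBall]) -> float:
    """Share of batted balls at or above the hard-hit exit-velocity cutoff."""
    if not balls:
        raise ValueError("need at least one batted ball")
    return sum(b.exit_velocity_mph >= HARD_HIT_MPH for b in balls) / len(balls)


def sweet_spot_rate(balls: Sequence[BattedBall]) -> float:
    """Share of batted balls whose launch angle falls in the sweet-spot band."""
    if not balls:
        raise ValueError("need at least one batted ball")
    low, high = SWEET_SPOT_DEG
    return sum(low <= b.launch_angle_deg <= high for b in balls) / len(balls)


def per_pa_rate(events: int, plate_appearances: int) -> float:
    """Generic per-plate-appearance rate, e.g. strikeout rate or walk rate."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return events / plate_appearances


# Hypothetical batted balls, for illustration only.
sample = [
    BattedBall(101.0, 15.0),
    BattedBall(88.0, 40.0),
    BattedBall(96.0, -5.0),
    BattedBall(70.0, 20.0),
]
sample_hard_hit = hard_hit_rate(sample)
sample_sweet_spot = sweet_spot_rate(sample)
sample_k_rate = per_pa_rate(events=120, plate_appearances=600)
```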

Methodology

  1. Feature extraction (from public Statcast data):

    • Strikeout rate, walk rate, contact rate.
    • Hard-hit rate (exit velocity ≥95 mph, the Statcast definition).
    • Sweet-spot rate (launch angle 8–32°).
    • Barrels per plate appearance.
    • Spray angle variance (consistency).
  2. Regression model:

    • Target: Next-season batting average.
    • Features: Current-season plate-discipline + batted-ball metrics.
    • Model: Elastic Net (L1 + L2 regularization) to prevent overfitting.
    • Validation: 5-fold cross-validation on 2015–2023 hitters, with folds split by season; test on 2024.
  3. Shrinkage adjustment:

    • Weight current BA by its reliability, based on PA volume (more plate appearances, more weight).
    • Compute expected BA from component-based model.
    • Final prediction: weighted average of current BA and component-based expectation.
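The shrinkage step can be sketched as a reliability-weighted blend. This is one common way to do it, not necessarily the exact weighting used in the project: the weight form PA ÷ (PA + k) and the constant `stabilization_pa=400` are assumptions chosen for illustration.

```python
def reliability_weight(pa: int, stabilization_pa: float = 400.0) -> float:
    """Weight on observed BA; rises toward 1 as plate appearances accumulate.

    stabilization_pa is an assumed placeholder: the PA count at which
    observed BA and the component-based expectation get equal weight.
    """
    if pa < 0:
        raise ValueError("pa must be non-negative")
    return pa / (pa + stabilization_pa)


def blended_prediction(
    current_ba: float,
    expected_ba: float,
    pa: int,
    stabilization_pa: float = 400.0,
) -> float:
    """Weighted average of current BA and the component-based expected BA."""
    w = reliability_weight(pa, stabilization_pa)
    return w * current_ba + (1.0 - w) * expected_ba


# Hypothetical hitter: .320 observed, .280 expected from components.
half_season = blended_prediction(0.320, 0.280, pa=400)
cup_of_coffee = blended_prediction(0.320, 0.280, pa=50)
```

With few plate appearances the prediction stays close to the component-based expectation. As the sample grows, the observed average earns more weight.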

Key Modeling Decisions

  • Why not a black-box model? Trees and neural nets don’t reveal why a hitter is underperforming. Linear regression, even when regularized, remains interpretable: you can see which coefficients drive predictions.
  • Why elastic net? Discipline and batted-ball features are correlated (good hitters do many things well). Ridge regression shrinks all coefficients but keeps every feature; Lasso zeroes some of them out and can pick arbitrarily among correlated features. Elastic Net balances the two: it shrinks less-important features without forcing hard exclusion.
  • Why cross-validation by season, not random split? Batting average trends over a career arc. A random split mixes seasons across training and validation folds; season-based splits respect the temporal structure.
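These decisions can be sketched with scikit-learn as follows. The data here is synthetic and the feature names are illustrative. `GroupKFold` is used as one way to keep each season entirely inside a single fold, which is an assumption about how the season-based splits were built.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import GroupKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for hitter-seasons (not real data).
rng = np.random.default_rng(0)
n = 900
seasons = rng.integers(2015, 2024, size=n)  # seasons 2015..2023
feature_names = ["k_rate", "bb_rate", "hard_hit_rate", "sweet_spot_rate"]
X = rng.normal(size=(n, len(feature_names)))
true_coef = np.array([-0.012, 0.002, 0.008, 0.006])  # made-up effects
y = 0.250 + X @ true_coef + rng.normal(scale=0.02, size=n)

# Season-grouped folds: no season appears in both train and validation.
splits = list(GroupKFold(n_splits=5).split(X, y, groups=seasons))

# Standardize features so the penalty treats them on a common scale,
# then let cross-validation pick alpha and the L1/L2 mix.
model = make_pipeline(
    StandardScaler(),
    ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=splits),
)
model.fit(X, y)

# Coefficients stay readable: one signed weight per standardized feature.
coefs = dict(zip(feature_names, model[-1].coef_))
```

Because the features are standardized, each coefficient reads as the expected change in next-season BA per one standard deviation of that feature, which is what keeps the model interpretable.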

Results

Predictive Power

Comparing models on the holdout 2024 season:

  • Component Model RMSE: 0.032 (a typical miss of about 32 points of BA).
  • Naive current BA RMSE: 0.041 (carrying this year’s BA forward misses by about 41 points).
  • Correlation with 2024 actual BA: 0.58 (component model) vs 0.48 (current BA).
  • R² (variance explained): 0.52 on holdout 2024 data.
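For reference, the holdout metrics above can be computed as in the sketch below. The function names and the toy numbers are hypothetical, and the "closer share" helper is one assumed way to count how often the model beats the naive forecast.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between actual and predicted BA."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def closer_share(
    actual: Sequence[float], model: Sequence[float], naive: Sequence[float]
) -> float:
    """Share of hitters for whom the model's error is strictly smaller than the naive error."""
    wins = sum(abs(a - m) < abs(a - n) for a, m, n in zip(actual, model, naive))
    return wins / len(actual)


# Toy values, for illustration only.
actual_ba = [0.300, 0.250, 0.280, 0.260]
model_ba = [0.290, 0.255, 0.275, 0.265]
naive_ba = [0.320, 0.240, 0.281, 0.250]

model_rmse = rmse(actual_ba, model_ba)
naive_rmse = rmse(actual_ba, naive_ba)
model_share = closer_share(actual_ba, model_ba, naive_ba)
```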

Performance by Hitter Type

The model works differently across player archetypes:

Archetype                      | # Players | BA Volatility (Year-to-Year SD) | Model Advantage
High-discipline, high hard-hit | 120       | 0.018                           | Minimal (skill already stable)
High-discipline, low hard-hit  | 95        | 0.022                           | Moderate (deflates BABIP-lucky over-performers)
Low-discipline, high hard-hit  | 140       | 0.029                           | High (flags strikeout-risk declines)
Low-discipline, low hard-hit   | 320       | 0.035                           | Very High (many are noise; model filters)

Insight: The model adds the most value for low-discipline hitters (who experience high variance) and the least for stable, high-skill players (who are already predictable).

Example Predictions

2023 season → 2024 actual:

  • Mookie Betts (2023 .307 BA): Model predicted .305 (actual .295). Current BA naively predicted .307. Model was closer.
  • Aaron Judge (2023 .288 BA): Model predicted .280 (actual .289). Current BA predicted .288. Model slightly off; current BA better here.
  • Across 200+ qualified hitters: the component model is closer than the naive prediction 58% of the time.

Key Takeaways

  1. Decomposition reveals signal. Traditional BABIP adjustment is a blunt instrument. Breaking BA into discipline + power + contact quality gives coaches and analysts a clearer picture of what changed and why it matters.

  2. Luck is real but not persistent. A .320 hitter with a .380 BABIP is likely to regress unless his contact quality is elite. The model captures this quantitatively.

  3. Early-career prediction is hardest. Young hitters’ components stabilize over 2–3 seasons. The model has lower confidence on rookies (small sample size) and correctly reflects that in wider prediction intervals.

  4. The best model is interpretable. A 0.02 R² gain from adding a complex feature isn’t worth it if nobody understands why. The elastic-net model remains readable: “strikeout rate increases, BA expected to drop by X.”

  5. Integration matters more than accuracy. A model predicting BA within ±.030 is useful only if teams use it. Presenting it as a single number (“BA Control Stat”) rather than a complex regression output makes adoption more likely.
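To put the small-sample point in takeaway 3 in numbers, a rough sketch: if each at-bat is treated as an independent trial (a simplifying assumption, and not necessarily how the model builds its prediction intervals), the standard error of an observed batting average shrinks with the square root of at-bats.

```python
import math


def ba_standard_error(true_ba: float, at_bats: int) -> float:
    """Binomial standard error of observed BA, assuming independent at-bats."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return math.sqrt(true_ba * (1.0 - true_ba) / at_bats)


# Hypothetical .270 true-talent hitter over a short call-up and a full season.
se_rookie = ba_standard_error(0.270, at_bats=150)
se_regular = ba_standard_error(0.270, at_bats=600)
```

Under these assumptions the 150-at-bat sample carries about 36 points of standard error against about 18 points for 600 at-bats. That gap is why a rookie's observed average deserves a wider interval.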


Data and reproducibility: Full model code and 2015–2024 player estimates are available in the project repository. The model updates annually with new season data.