Architectural excerpt

Fight attribution comparative axes

The central challenge in building fight-level features is the symmetry problem. The model takes a matchup as input and produces a six-way probability distribution as output. Swapping Fighter A and Fighter B should flip that distribution accordingly. Raw statistics do not generally behave that way by construction. Percentage changes show the same problem: a 50% decline needs a 100% gain to cancel it out, so the features needed mathematical adjustments to stay symmetrical.
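The percentage asymmetry can be checked with a small numeric sketch. The log ratio shown here is one common symmetric alternative, used only for illustration; the text does not say which adjustment the model actually applies, and the function names are hypothetical.

```python
import math


def pct_change(old: float, new: float) -> float:
    """Ordinary percent change from old to new, as a fraction."""
    return (new - old) / old


def log_ratio(old: float, new: float) -> float:
    """Log ratio: swapping the arguments flips the sign exactly."""
    return math.log(new / old)


# A 50% drop needs a 100% gain to get back to the starting value.
drop = pct_change(100.0, 50.0)
recovery = pct_change(50.0, 100.0)

# The log ratio is antisymmetric, so the two moves cancel out.
down = log_ratio(100.0, 50.0)
up = log_ratio(50.0, 100.0)
```

The two percent changes differ in magnitude, while the two log ratios differ only in sign, which is the property a swap-safe feature needs.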

The forecast model uses exactly twelve inputs: head-to-head differences, clash-of-styles terms, and four physical inputs. There are no share-of-total ratio columns in the shipped version. That is an intentional simplification worth stating upfront.

Why style axes rather than raw stats

Raw striking accuracy against mixed opposition is noisy in a way that is hard to correct after the fact. Instead, the model builds four constructed axes: striker, grappler, finish threat, and finish vulnerability. Related signals are pooled and normalized for opponent quality before the forecast ever sees them. The striker and grappler axes require full UFC stat tables (hence ESPN MMA Fight Center, not Sherdog), whereas the finish axes can be computed from outcome-only history.

Striker score summarizes net striking pressure per minute. Grappler score blends takedown rate, control time, and submission attempts. Because both are adjusted for opponent quality, they measure what a fighter does against whom, not just what a fighter does.

Axes are built for offense and defense separately. KO finish rate as an attacker and KO victim rate as a defender are different signals. A fighter can be an elite finisher and also carry a chin vulnerability. Both matter.

How the striker score is built

For each prior UFC bout in the same weight class, we take net significant strikes per minute: strikes landed minus strikes absorbed, divided by fight time. That raw rate is squashed onto a 0-to-1 scale with a logistic curve centered at zero net pressure. Positive net striking pushes the score up; getting out-struck pushes it down.
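The per-bout signal described above can be sketched as follows. This is a minimal illustration, not the shipped code: the function names and the `steepness` parameter are assumptions, since the text only says the curve is logistic and centered at zero net pressure.

```python
import math


def logistic(x: float, steepness: float = 1.0) -> float:
    """Squash a real number onto (0, 1) so that 0 maps to 0.5."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def striker_signal(landed: int, absorbed: int, fight_minutes: float,
                   steepness: float = 1.0) -> float:
    """Per-bout striker signal from net significant strikes per minute.

    `steepness` is a hypothetical tuning parameter, not a documented value.
    """
    if fight_minutes <= 0:
        raise ValueError("fight_minutes must be positive")
    net_per_minute = (landed - absorbed) / fight_minutes
    return logistic(net_per_minute, steepness)


even = striker_signal(landed=30, absorbed=30, fight_minutes=15.0)
ahead = striker_signal(landed=60, absorbed=30, fight_minutes=15.0)
behind = striker_signal(landed=30, absorbed=60, fight_minutes=15.0)
```

An even exchange lands at the midpoint, and mirrored bouts land at mirrored distances from it.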

Each bout then gets two weights before it enters the average. Quality weight scales with opponent ELO at fight time: beating or performing well against stronger opposition counts more; padding stats against weak opposition counts less. Recency weight decays exponentially with calendar age, using roughly three fights per year as the time unit. A fight from last month weighs far more than a fight from several years ago.

The striker score is the quality-and-recency-weighted average of those per-fight signals across Tier-1 UFC bouts with usable strike tables. If a fighter has no qualifying strike history in the division, the data estimate defaults to 0.50 until real bouts accumulate.
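A rough sketch of the weighted average follows. The text gives the shape of the weights (quality scales with opponent ELO, recency decays exponentially in fight-units of about three per year) but not their constants, so `DECAY_PER_FIGHT`, `BASELINE_ELO`, and the linear quality scaling are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class BoutSignal:
    signal: float        # per-bout score on a 0-to-1 scale
    opponent_elo: float  # opponent rating at fight time
    years_ago: float     # calendar age of the bout in years


FIGHTS_PER_YEAR = 3.0   # time unit named in the text
DECAY_PER_FIGHT = 0.15  # hypothetical decay rate per fight-unit
BASELINE_ELO = 1500.0   # hypothetical reference rating


def quality_weight(opponent_elo: float) -> float:
    """Hypothetical linear scaling: stronger opponents get more weight."""
    return max(opponent_elo / BASELINE_ELO, 0.0)


def recency_weight(years_ago: float) -> float:
    """Exponential decay measured in fight-units, not raw years."""
    return math.exp(-DECAY_PER_FIGHT * years_ago * FIGHTS_PER_YEAR)


def weighted_axis_score(bouts: list[BoutSignal], default: float) -> float:
    """Quality-and-recency-weighted mean, or the default with no history."""
    weights = [quality_weight(b.opponent_elo) * recency_weight(b.years_ago)
               for b in bouts]
    total = sum(weights)
    if total == 0.0:
        return default
    return sum(w * b.signal for w, b in zip(weights, bouts)) / total


history = [BoutSignal(0.8, 1500.0, 0.1), BoutSignal(0.2, 1500.0, 5.0)]
score = weighted_axis_score(history, default=0.50)
empty_score = weighted_axis_score([], default=0.50)
```

With these assumed constants the recent strong showing dominates the five-year-old weak one, and an empty history falls back to the stated default.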

How the grappler score is built

Grappler score uses the same weighting scheme, but the per-fight signal is a composite of up to three grappling sub-scores, averaged over whichever components are available for that bout:

Takedown rate — takedowns landed per minute, squashed to 0-to-1.
Control share — control time as a fraction of total fight time.
Submission attempt rate — submission attempts per minute, squashed to 0-to-1.

A bout with full grappling tables contributes all three; a bout with partial tables contributes whatever is there. The grappler score is again a quality-and-recency-weighted average across Tier-1 UFC history in the division. With no qualifying grappling history, the data estimate defaults to 0.40.
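The "average over whichever components are available" rule can be sketched like this. The squashing function and its `scale` are assumptions, since the text only says the two rates are squashed to 0-to-1; all names are hypothetical.

```python
import math
from typing import Optional


def squash_rate(rate_per_minute: float, scale: float = 1.0) -> float:
    """Map a non-negative rate onto [0, 1); the saturating form is assumed."""
    return 1.0 - math.exp(-rate_per_minute / scale)


def grappler_signal(fight_minutes: float,
                    takedowns: Optional[int] = None,
                    control_minutes: Optional[float] = None,
                    sub_attempts: Optional[int] = None) -> Optional[float]:
    """Mean of the available grappling sub-scores, or None if none exist."""
    if fight_minutes <= 0:
        raise ValueError("fight_minutes must be positive")
    components = []
    if takedowns is not None:
        components.append(squash_rate(takedowns / fight_minutes))
    if control_minutes is not None:
        components.append(min(control_minutes / fight_minutes, 1.0))
    if sub_attempts is not None:
        components.append(squash_rate(sub_attempts / fight_minutes))
    if not components:
        return None  # bout contributes nothing to the grappler average
    return sum(components) / len(components)


full = grappler_signal(15.0, takedowns=3, control_minutes=6.0, sub_attempts=1)
partial = grappler_signal(15.0, control_minutes=7.5)
missing = grappler_signal(15.0)
```

Missing components are skipped instead of being counted as zero, so a bout with partial tables is not penalized for what was never recorded.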

How thin samples blend toward priors

When the effective quality-weighted fight count is below two bouts, the data estimate is not trusted on its own. Currently, when insufficient information exists, striker and grappler scores are linearly blended toward 0.50 and 0.40 respectively. Pedigree-based priors are not implemented yet, but the architecture supports adjusting those priors later. A D1 national wrestling champion, for example, would be expected to start significantly above average as a grappler.

Striker prior comes from boxing or muay thai pedigree when set; otherwise 0.50.
Grappler prior comes from the stronger of wrestling or BJJ pedigree when set; otherwise 0.40.

At zero effective fights the score is all prior. At two or more effective fights the blend is all data. In between, the mix is proportional.
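The proportional mix can be written as a short linear blend. This is a sketch under the stated two-bout threshold; the function and parameter names are hypothetical, and how the effective fight count itself is computed is not specified here, so it is taken as an input.

```python
def blend_with_prior(data_estimate: float, prior: float,
                     effective_fights: float,
                     full_trust_at: float = 2.0) -> float:
    """Linear blend: all prior at 0 effective fights, all data at 2 or more."""
    if effective_fights < 0:
        raise ValueError("effective_fights must be non-negative")
    trust = min(effective_fights / full_trust_at, 1.0)
    return trust * data_estimate + (1.0 - trust) * prior


no_history = blend_with_prior(0.80, prior=0.50, effective_fights=0.0)
halfway = blend_with_prior(0.80, prior=0.50, effective_fights=1.0)
established = blend_with_prior(0.80, prior=0.50, effective_fights=5.0)
```

With one effective fight the score sits midway between the prior and the data estimate.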

Why head-to-head differences and clash-of-styles terms

Once each fighter has axis scores, the forecast still needs matchup-level inputs. The model uses exactly twelve, all built from Fighter A's point of view. Positive values mean A has the edge on that dimension.

Five simple gaps

Subtract B from A:

ELO gap — who rates higher in this division right now.
Striker gap — whose striking-axis score is higher.
Grappler gap — whose grappling-axis score is higher.
Finish-threat gap — who finishes opponents more often.
Finish-vulnerability gap — who gets finished more often.

If you swap A and B, each gap changes sign. That is the basic antisymmetry guarantee: the model is always asked "what does A have over B?" not "what are two isolated résumés?"
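The sign flip can be verified directly. The sketch below assumes a hypothetical `FighterProfile` container and feature names; only the five A-minus-B subtractions come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FighterProfile:
    elo: float
    striker: float
    grappler: float
    finish_threat: float
    finish_vulnerability: float


def simple_gaps(a: FighterProfile, b: FighterProfile) -> dict[str, float]:
    """Five head-to-head gaps, each computed as A minus B."""
    return {
        "elo_gap": a.elo - b.elo,
        "striker_gap": a.striker - b.striker,
        "grappler_gap": a.grappler - b.grappler,
        "finish_threat_gap": a.finish_threat - b.finish_threat,
        "finish_vulnerability_gap":
            a.finish_vulnerability - b.finish_vulnerability,
    }


fighter_a = FighterProfile(1620.0, 0.70, 0.35, 0.55, 0.20)
fighter_b = FighterProfile(1540.0, 0.45, 0.60, 0.30, 0.40)

a_view = simple_gaps(fighter_a, fighter_b)
b_view = simple_gaps(fighter_b, fighter_a)
```

Every entry in `b_view` is the negation of the matching entry in `a_view`, which is the swap behavior described above.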

Three clash-of-styles terms

Gaps alone only say who is stronger on an axis. They do not say how A's strength meets B's weakness in the same domain. Three extra terms encode that explicitly, with fixed formulas chosen before the regression runs:

Striking matchup = A's striker score × (1 − B's striker score). High when A is a strong striker and B is weak in the striking domain.
Grappling matchup = A's grappler score × (1 − B's grappler score). Same idea on the mat.
Finish matchup = A's finish threat × B's finish vulnerability. High when A tends to finish people and B tends to get finished.
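The three formulas translate directly into code. The function and key names below are hypothetical; the arithmetic follows the definitions as written.

```python
def clash_terms(a_striker: float, b_striker: float,
                a_grappler: float, b_grappler: float,
                a_finish_threat: float,
                b_finish_vulnerability: float) -> dict[str, float]:
    """Clash-of-styles terms from Fighter A's point of view."""
    return {
        "striking_matchup": a_striker * (1.0 - b_striker),
        "grappling_matchup": a_grappler * (1.0 - b_grappler),
        "finish_matchup": a_finish_threat * b_finish_vulnerability,
    }


terms = clash_terms(a_striker=0.80, b_striker=0.30,
                    a_grappler=0.40, b_grappler=0.60,
                    a_finish_threat=0.50, b_finish_vulnerability=0.40)
```

Because every input is a score between 0 and 1, each product also stays between 0 and 1.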

Example: two matchups can both show a near-zero striker gap yet produce different striking-matchup values, because the product depends on where the two scores sit on the scale, not only on their difference. Two fighters at 0.50 each give a striking matchup of 0.25; two fighters at 0.90 each give 0.09. The clash terms capture that leverage; gaps alone do not.

Four physical inputs

Reach gap, height gap, stance mismatch (1 when orthodox meets southpaw, else 0), and age gap in days. These are fixed pre-fight facts, not rolling performance stats.

If reach or height is missing for either fighter, that gap is imputed as zero: no measured advantage either way. That is a conservative default, not proof that neither fighter actually has an edge.
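A minimal sketch of the four physical inputs, including the zero imputation, might look like this. The function names, the stance labels as lowercase strings, and the sign convention for the age gap (positive when A is older) are assumptions.

```python
from datetime import date
from typing import Optional


def gap_or_zero(a: Optional[float], b: Optional[float]) -> float:
    """A minus B, or 0.0 when either measurement is missing."""
    if a is None or b is None:
        return 0.0
    return a - b


def stance_mismatch(a_stance: str, b_stance: str) -> int:
    """1 when orthodox meets southpaw (in either order), else 0."""
    pair = {a_stance.lower(), b_stance.lower()}
    return int(pair == {"orthodox", "southpaw"})


def age_gap_days(a_birth: date, b_birth: date) -> int:
    """A's age minus B's age in days; positive when A is older (assumed)."""
    return (b_birth - a_birth).days


reach_gap = gap_or_zero(188.0, 180.0)
missing_height_gap = gap_or_zero(None, 180.0)
mismatch = stance_mismatch("Orthodox", "Southpaw")
age_gap = age_gap_days(date(1990, 1, 1), date(1990, 1, 11))
```

The fight date cancels out of the age difference, so only the two birth dates are needed.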

Why recency weighting within career aggregates

Eight of the twelve matchup inputs come from rolling career summaries: the four style-axis scores, then the gaps and clash terms built from them. Those summaries are not lifetime averages. The fighter who fought in 2015 is not necessarily the same athlete in the cage tonight. Camps change, injuries heal, skills sharpen or fade. The recency weights in the striker and grappler formulas above keep the estimate pointed at who is showing up on fight night.

If a decade-old bout counted the same as last month's, outdated striking and grappling tendencies would leak into the forecast. Exponential decay, with roughly three fights per year as the time unit, is the compromise: recent cards dominate, but older history still matters when it is all we have.

Why physical features are kept separate

Reach, height, stance mismatch, and age differential are fixed attributes. They do not decay with time and should not be mixed into rolling performance aggregates. They belong in their own group because they do not need recency weighting.

These are not the most important inputs in the model. They are among the most reliable because they cannot be gamed by sample selection or era effects.
