Standard recruitment metrics — completion%, progressive passes, key passes — measure execution outcomes. A player completing 87% in a low-intensity league and 61% in a high-pressing one may be making identical decisions. Your metric cannot tell them apart.
The surface-metrics trap: high completion% inside a protective tactical system inflates value assessments, and the collapse only appears after signing, when the system changes. This framework makes it visible beforehand.
All metrics derived exclusively from Opta Vision feeds — no external data required. Every metric traces to an inspectable pass event. Pull any DQI score and watch the underlying decision. Pairwise Spearman correlations confirm each metric captures a genuinely independent cognitive dimension.
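The pairwise orthogonality screen can be sketched in a few lines. This is a minimal from-scratch Spearman rank correlation; the per-player metric vectors are illustrative, and the 0.40 cutoff mirrors the DQI ↔ JSD redundancy target used in the validation table.

```python
# Minimal sketch of the pairwise orthogonality screen: Spearman rank
# correlation between two per-player metric vectors, checked against a
# |rho| < 0.40 redundancy target. Metric values are illustrative.

def rankdata(xs):
    # Average ranks (1-based), ties share a rank.
    order = sorted(range(len(xs)), key=xs.__getitem__)
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(a, b):
    # Pearson correlation of the rank vectors.
    ra, rb = rankdata(a), rankdata(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

# Illustrative per-player DQI and JSD values (not from the dataset).
dqi = [0.74, 0.81, 0.66, 0.79, 0.71]
jsd = [0.21, 0.34, 0.29, 0.18, 0.40]

rho = spearman(dqi, jsd)          # -0.2 for these toy vectors
independent = abs(rho) < 0.40     # passes the orthogonality target
```

Run this pair-by-pair over the metric set; any pair breaching the cutoff indicates the two metrics are measuring the same dimension.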
- OVS weights: w_S = 0.47 (safety), w_T = 0.53 (threat). Additive formulation validated across 470K options: a multiplicative form collapses to 12% event survival (xT near zero), while the additive form achieves 94%. This is a sample-size constraint, not a preference.
- LBP boost (+0.15/+0.25) applied to the chosen pass, for POR only.
- OVS_COMPETITOR_TAU removes reflex decisions. OVS range [0,1]; 1.0 = chose the optimal option. Softmax temperature T = 0.10.
- Run opportunity detection: activeRun=True from passTargets.
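The option scoring above can be sketched end to end. The additive value w_S·xP + w_T·xT, the softmax temperature T = 0.10, and the [0,1] range with 1.0 = chose optimal come from the text; the `ovs_score` helper, the option values, the chosen-pass index, and the ratio-to-best normalisation are illustrative assumptions, not the report's exact implementation.

```python
import math

# Hypothetical pass options for one decision moment. xP = completion
# probability (safety), xT = expected threat if completed. Values are
# illustrative, not from the dataset.
W_S, W_T = 0.47, 0.53   # safety / threat weights from the report
TAU = 0.10              # softmax temperature

def ovs_score(xp, xt):
    # Additive formulation: remains informative even when xT is near
    # zero, unlike a multiplicative xp * xt form.
    return W_S * xp + W_T * xt

options = [
    {"xp": 0.92, "xt": 0.01},   # safe sideways ball
    {"xp": 0.61, "xt": 0.18},   # line-breaking pass
    {"xp": 0.34, "xt": 0.30},   # speculative through ball
]

scores = [ovs_score(o["xp"], o["xt"]) for o in options]

# Softmax over option scores; the low temperature sharpens the
# distribution so near-ties resolve toward the best-valued option.
exps = [math.exp(s / TAU) for s in scores]
probs = [e / sum(exps) for e in exps]

best = max(range(len(scores)), key=scores.__getitem__)
chosen = 1  # suppose the player played the line-breaking pass

# One plausible reading of "1.0 = chose optimal": score the chosen
# pass relative to the best-valued option.
ovs = scores[chosen] / scores[best]
```

With these toy values the safe ball scores highest, so the line-breaking choice lands below 1.0 but well inside the [0,1] range.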
All four hypotheses pre-registered before Stage 2 analysis. Pre-registration prevents post-hoc recalibration and preserves submission credibility. Every validation test ran against its pre-specified target and passed.
Results reported as-computed with no post-hoc recalibration.
| Validation test | Result | Target | Status |
|---|---|---|---|
| DQI split-half reliability | r = 0.748 / SB = 0.856 | ≥ 0.70 | ✓ |
| DQI temporal stability | r = 0.766 | ≥ 0.65 | ✓ |
| DQI ΔR² over completion% | 0.071 | ≥ 0.05 | ✓ |
| External criterion (turnover under pressure) | ρ = −0.642, r_rb = −0.858, p < 0.0001 | ρ < 0, large effect | ✓ |
| Turnover dose-response Q1→Q4 | 33.8% → 26.0% → 20.9% → 17.5% | Monotonic | ✓ |
| OVS weight sensitivity (rank stability) | ρ = 0.984–0.997 | Stable | ✓ |
| OVS collinearity (xP ↔ xT) | Scale ratio within safe range | Non-redundant | ✓ |
| JSD split-half / SB-corrected | r = 0.560 / SB = 0.718 | ≥ 0.50 | ✓ |
| JSD cross-match stability | Stable across match sample | Stable | ✓ |
| JSD position sensitivity (Kruskal-Wallis) | C.Mids 0.218 vs Wide Fwds 0.310 | Significant p < 0.05 | ✓ |
| ROER positional gradient | 0.226 → 0.330 → 0.380–0.452 | Gradient confirmed | ✓ |
| ROER ↔ DQI orthogonality (partial r given completion%) | r = −0.052, p = 0.487 | Non-significant | ✓ |
| DQI ↔ JSD orthogonality | \|r\| = 0.131 | < 0.40 | ✓ |
| DQI ↔ POR_z orthogonality | r = −0.036, p = 0.91 | Non-significant | ✓ |
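The split-half rows in the table use the Spearman-Brown correction, which can be sketched as follows. The per-player odd/even DQI vectors here are illustrative; applying the formula to the reported half-correlation of 0.748 reproduces the table's 0.856.

```python
# Split-half reliability sketch: split each player's passes into odd and
# even halves, score DQI on each half, correlate across players, then
# apply the Spearman-Brown prophecy formula to estimate full-length
# reliability. Data values are illustrative.

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def spearman_brown(r_half):
    # Step-up correction: estimated reliability of the full-length
    # measure from the correlation between its two halves.
    return 2 * r_half / (1 + r_half)

# Illustrative per-player DQI scored on odd vs even pass events.
dqi_odd  = [0.71, 0.64, 0.80, 0.58, 0.75, 0.69]
dqi_even = [0.68, 0.61, 0.77, 0.62, 0.73, 0.70]

r = pearson(dqi_odd, dqi_even)
sb = spearman_brown(r)   # always above r for 0 < r < 1
```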
A structured multi-layer filter, not a ranking algorithm. Apply layers in sequence — each eliminates noise before the next. The output is a ranked shortlist with auditable evidence for every flag. Every recommendation traces to a logged pass event.
A complete recruitment cycle applied to 131 qualifying midfielders, from population landscape to individual profiles to final scouting recommendations. Every step uses only glass-box metrics; every flag traces to a logged pass event.
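The layered filter can be sketched minimally. The layer order, the cutoffs (UPDD < 0.12, JSD > 0.30, DQI < 0.70), and the player values are illustrative assumptions; only the shape (sequential predicates that append auditable flags, then a ranked shortlist of clean profiles) reflects the description above.

```python
from dataclasses import dataclass, field

@dataclass
class Player:
    pid: str
    dqi: float
    updd: float
    jsd: float
    roer: float
    flags: list = field(default_factory=list)

LAYERS = [
    # (flag name, predicate that fails a player at this layer);
    # all cutoffs are assumed, not the report's calibrated values.
    ("pressure_fragile",     lambda p: p.updd < 0.12),
    ("style_misaligned",     lambda p: p.jsd > 0.30),
    ("low_decision_quality", lambda p: p.dqi < 0.70),
]

def run_filter(players):
    shortlist = []
    for p in players:
        for name, failed in LAYERS:
            if failed(p):
                p.flags.append(name)  # auditable evidence trail
        if not p.flags:
            shortlist.append(p)
    # Rank the surviving profiles by decision quality.
    return sorted(shortlist, key=lambda p: p.dqi, reverse=True)

pool = [
    Player("P-A", dqi=0.82, updd=0.19, jsd=0.21, roer=0.40),
    Player("P-B", dqi=0.76, updd=0.08, jsd=0.33, roer=0.00),  # double flag
    Player("P-C", dqi=0.74, updd=0.15, jsd=0.24, roer=0.30),
]
ranked = run_filter(pool)
```

Flagged players are not silently dropped: their `flags` list is the evidence trail behind each risk-table entry.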
The most operationally dangerous finding in the dataset. Several players pass standard completion% screens but fail the framework. P-544 is the canonical example of the trap.
| Player | Tier | DQI | UPDD | JSD | ROER | Primary risk reason |
|---|---|---|---|---|---|---|
| P-28 | Flag | 0.637 | 0.167 | 0.342 | 0.400 | Triple flag: pressure fragile + style misalignment + below-threshold DQI |
| P-340 | Flag | 0.761 | 0.078 | 0.328 | 0.000 | Style misalignment (JSD=0.328) + zero run exploitation |
| P-338 | Flag | 0.726 | 0.083 | 0.390 | 0.400 | High JSD (0.390) + below-threshold DQI |
| P-544 | Flag | 0.758 | 0.216 | 0.162 | 0.500 | The surface trap: completion% looks safe, but UPDD = 0.216 shows severe collapse under pressure |
| P-251 | Flag | 0.751 | 0.107 | 0.238 | 0.000 | Pressure fragile + zero run exploitation |
| P-629 | Flag | 0.823 | 0.106 | 0.207 | 0.000 | High DQI but pressure fragile + zero run exploitation |
Download the complete technical report, launch the interactive app, or review the analysis pipeline. Every claim is reproducible from the source notebook.
Scope: One competition, 50 matches — cross-league robustness is the priority extension.
Coverage: ROER reliable at ≥10 opportunities (41% provisional); UPDD uncomputable for 24.9% — reported as Unknown, not neutral.
By design: No reception quality, ball-carrying, or off-ball positioning. Analytical focus enables validation depth that broader frameworks cannot achieve.