Overview
The 6-factor model measures satisfaction and frustration across three domains: Ambition, Belonging, and Craft.
Validation Gates
These thresholds were verified with an R/lavaan CFA and the Python scoring pipeline.
Subscale Summary
Mean scores on the 0-10 normalised scale. Error bars show ±1 SD.
Archetypes
The scoring pipeline assigns each participant one of 8 motivational archetypes based on which domains are activated (satisfaction ≥ 5.5). Frustration is reported as a continuous score per domain, not as a categorical modifier.
Type Distribution
Each type has a fixed colour matching its strong-domain pattern. Colours are consistent with the Type Guide below.
Type Guide
Click a type to see its profile, strengths, typical subscale pattern, and growth edge.
Domain States
Crossing satisfaction and frustration at 5.5 classifies each domain into one of four states: Thriving, Vulnerable, Mild, or Distressed.
State Distribution
| Domain | State | Count | % |
|---|---|---|---|
Satisfaction vs Frustration
Each dot represents one participant. Dashed lines at 5.5 divide the four state quadrants.
The Four States
Satisfaction (horizontal) and frustration (vertical) cross at 5.5 to form four quadrants. Click any cell for the full profile including mental fatigue risk.
Belbin Roles
Subscale score patterns map each participant to one or more Belbin team roles.
Role Distribution
One participant can hold several roles, so percentages can sum to more than 100%.
| Role (Qualifier) | Count | % of Participants |
|---|---|---|
Role Guide
Click a role to see its definition, ABC alignment, and the types most likely to hold it.
Risk Signals
Frustration signatures flag participants whose score patterns signal psychological strain.
Frustration Signatures
High frustration paired with either high or low satisfaction reveals six distinct strain patterns.
| Signature | Risk | Count | % of Participants |
|---|---|---|---|
Score Distributions
How subscale scores (0-10) and Big Five percentiles spread across the population.
Subscale Statistics
| Subscale | Mean | SD | Min | Max |
|---|---|---|---|---|
Individual Participants
Click a row to see the full profile. Search, filter, or click column headers to sort.
| # ▲ | Type ▲ | Domain ▲ | A-State ▲ | B-State ▲ | C-State ▲ | A-Sat ▲ | A-Frust ▲ | B-Sat ▲ | B-Frust ▲ | C-Sat ▲ | C-Frust ▲ |
|---|---|---|---|---|---|---|---|---|---|---|---|
Take an Assessment
Experience the ABC Assessment as a user would. Choose a tier, answer the questions, and see your motivational profile.
Onboarding
Quick first impression. One satisfaction and one frustration item per domain gives an early signal of your motivational shape.
Standard
Stronger signal within the first two weeks. Two items per subscale sharpens the profile and adds confidence to the archetype.
Full Assessment
The complete validated instrument. Six items per subscale produce reliable subscale means, type assignment, and risk signals.
Trajectory: Variance from Baseline
The baseline is the athlete's full 36-item assessment at the start of the season. Weekly check-ins track how far all six subscales drift from that starting point. If any subscale exceeds the reassessment threshold, a new full assessment is needed.
Select Participant
Choose a simulated participant to use as the baseline. Their full assessment scores become the starting point for weekly check-ins.
Classification Stability
Each participant is re-measured 100 times with realistic noise (±1 Likert point per item). Stability = % of trials where the same classification is assigned.
Test-Retest Distribution
Distribution of per-participant type stability scores. Higher is more reliable.
Big Five Round-Trip Validation
Ground-truth Big Five scores are generated, mapped to ABC responses via reverse weights, then inferred back. Correlation measures recovery fidelity.
Boundary Participants
Participants closest to classification thresholds, ranked by instability. These are the profiles most likely to flip under measurement error.
| # | Type | Type Stability | Nearest Threshold | Distance |
|---|---|---|---|---|
Methodology
This section documents the complete scoring pipeline, all mathematical formulas, weight matrices, classification thresholds, and psychometric evidence for auditor review. All evidence is from synthetic simulation; empirical validation is pending.
1. Theoretical Foundation
The ABC assessment operationalises Self-Determination Theory (Deci & Ryan, 2000) through three psychological need domains, each measured by two subscales:
| Domain | SDT Construct | Satisfaction Subscale | Frustration Subscale |
|---|---|---|---|
| Ambition | Autonomy-directed goal pursuit | A-Sat (AS1-AS6) | A-Frust (AF1-AF6) |
| Belonging | Relational connection | B-Sat (BS1-BS6) | B-Frust (BF1-BF6) |
| Craft | Skill-directed competence | C-Sat (CS1-CS6) | C-Frust (CF1-CF6) |
Satisfaction and frustration are measured as independent constructs, not opposite ends of a single continuum (Vansteenkiste & Ryan, 2013; Chen et al., 2015). This allows detection of the "Vulnerable" state (high satisfaction + high frustration), which SDT identifies as a burnout precursor. Murphy et al. (2023) raised concerns that this distinction may be a method artifact from item-keying direction; the simulation includes bifactor and method-factor models (Phase 3) to test this directly.
2. Scoring Pipeline
From raw responses to typed profiles with team roles and risk flags.
- Ingest raw responses. 36 items on a 1-7 Likert scale. 6 items per subscale, 6 subscales. Items 4 and 6 of each subscale are reverse-coded (AS4, AS6, AF4, AF6, BS4, BS6, BF4, BF6, CS4, CS6, CF4, CF6).
- Reverse-score. For reverse-coded items: scored_value = 8 - raw_value.
- Compute subscale means. mean = (item_1 + ... + item_6) / 6. Range: [1.0, 7.0].
- Normalise to 0-10. score = ((mean - 1) / 6) × 10. Range: [0.0, 10.0].
- Classify domain states. Two thresholds per domain create four states. See Section 3.
- Find dominant domain. argmax of satisfaction scores. Ties broken: ambition > belonging > craft.
- Infer Big Five percentiles. Internal only. See Section 4 for the weight matrix.
- Assign base pattern. 8 types from binary satisfaction threshold. See Section 5.
- Map Belbin roles. See Section 6 for the cluster mapping and scoring formula.
- Flag frustration signatures. See Section 7 for the risk classification matrix.
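Steps 1-4 of the pipeline reduce to a few lines of Python. This is an illustrative sketch, not the production code; the function and constant names are assumptions:

```python
# Positions (0-based) of the reverse-keyed items 4 and 6 within each subscale.
REVERSE_POSITIONS = {3, 5}

def score_subscale(raw_items):
    """Score one subscale: reverse-code, average, normalise to 0-10.

    raw_items: six responses on the 1-7 Likert scale, in item order.
    """
    scored = [8 - v if i in REVERSE_POSITIONS else v
              for i, v in enumerate(raw_items)]
    mean = sum(scored) / 6          # subscale mean, range [1.0, 7.0]
    return (mean - 1) / 6 * 10      # normalised score, range [0.0, 10.0]
```

Feeding in the A-Sat items from the worked example in Section 10 (6, 5, 6, 2, 6, 3) returns 7.78 after rounding.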
3. Domain State Classification
Each domain is classified into one of four mutually exclusive states using two thresholds:
| | Frustration < 4.38 | Frustration ≥ 4.38 |
|---|---|---|
| Satisfaction ≥ 6.46 | Thriving | Vulnerable |
| Satisfaction < 6.46 | Mild | Distressed |
Threshold derivation status: The thresholds 6.46 and 4.38 were positioned between discrete score boundaries on the normalised 0-10 scale to prevent classification artifacts from the 7-point Likert resolution. Phase 2 ROC analysis on synthetic criterion data produced empirical thresholds of 6.09 (satisfaction) and 4.82 (frustration), both within the bootstrap 95% CIs of the fixed values. Empirical thresholds from ABQ criterion data will replace these when N ≥ 200 paired datasets are available.
Decision consistency (Phase 2b): With 6 items per subscale, per-domain classification agreement across two independent administrations is approximately 75%. Joint agreement across all three domains is approximately 42%. This means classifications should be interpreted with confidence bands, not as definitive labels.
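The four-state rule is a two-threshold lookup. A minimal sketch, assuming the fixed thresholds above:

```python
SAT_CUT, FRUST_CUT = 6.46, 4.38  # fixed thresholds from Section 3

def classify_domain(sat, frust):
    """Classify one domain into its four-state quadrant (normalised 0-10 scores)."""
    if sat >= SAT_CUT:
        return "Thriving" if frust < FRUST_CUT else "Vulnerable"
    return "Mild" if frust < FRUST_CUT else "Distressed"
```

With the worked-example scores from Section 10, this returns Thriving for Ambition, Vulnerable for Belonging, and Mild for Craft.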
4. Big Five Weight Matrix
Big Five percentiles are inferred from centred subscale scores. This is internal only, used for Belbin role inference, not displayed to athletes.
Step 1: Centre. z_i = (score_i - 5.0) / 5.0 for each of the 6 subscales.
Step 2: Dot product. raw_trait_j = Σ_i (w_ij × z_i) for each trait j.
Step 3: Percentile. percentile_j = clamp(50 + raw_trait_j × 30, 1, 99).
Weight matrix W (6 subscales × 5 traits):
| | Openness | Conscientiousness | Extraversion | Agreeableness | Neuroticism |
|---|---|---|---|---|---|
| A-Sat | 0.12 | 0.03 | 0.47 | -0.23 | 0.00 |
| A-Frust | 0.16 | 0.13 | 0.02 | 0.19 | 0.24 |
| B-Sat | -0.36 | 0.20 | 0.27 | 0.43 | 0.05 |
| B-Frust | -0.35 | 0.30 | 0.19 | -0.13 | 0.41 |
| C-Sat | 0.52 | 0.18 | -0.12 | 0.08 | -0.03 |
| C-Frust | 0.33 | -0.45 | 0.11 | 0.18 | 0.05 |
Design constraints: Weights optimised to produce inter-trait correlations |r| < 0.02 and primary-trait distribution of ~20% each, validated against Gosling et al. (2003) benchmarks. No validated Big Five instrument underpins these weights; they are inferential approximations from ABC subscale patterns.
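The three steps plus the weight matrix translate directly to code. A minimal sketch (the matrix is transcribed from the table above; function and variable names are illustrative):

```python
# Weight matrix W: rows = subscales in order A-Sat, A-Frust, B-Sat, B-Frust,
# C-Sat, C-Frust; columns = Openness, Conscientiousness, Extraversion,
# Agreeableness, Neuroticism.
W = [
    [ 0.12,  0.03,  0.47, -0.23,  0.00],  # A-Sat
    [ 0.16,  0.13,  0.02,  0.19,  0.24],  # A-Frust
    [-0.36,  0.20,  0.27,  0.43,  0.05],  # B-Sat
    [-0.35,  0.30,  0.19, -0.13,  0.41],  # B-Frust
    [ 0.52,  0.18, -0.12,  0.08, -0.03],  # C-Sat
    [ 0.33, -0.45,  0.11,  0.18,  0.05],  # C-Frust
]

def big_five_percentiles(scores):
    """scores: six normalised 0-10 subscale scores, in the row order of W."""
    z = [(s - 5.0) / 5.0 for s in scores]               # step 1: centre
    percentiles = []
    for j in range(5):                                  # one pass per trait
        raw = sum(W[i][j] * z[i] for i in range(6))     # step 2: dot product
        percentiles.append(max(1, min(99, 50 + raw * 30)))  # step 3: clamp
    return percentiles
```

A fully neutral profile (all subscales at 5.0) maps to the 50th percentile on every trait, which is the intended anchor of the centring step.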
5. Type System (8 Base Patterns)
Types use a binary satisfaction threshold (sat ≥ 5.5 = Strong, sat < 5.5 = Developing) per domain, producing 2³ = 8 base patterns. Frustration is reported as a continuous score per domain with confidence bands, not as a categorical modifier.
| Pattern | Ambition | Belonging | Craft |
|---|---|---|---|
| Integrator | Strong | Strong | Strong |
| Captain | Strong | Strong | Developing |
| Architect | Strong | Developing | Strong |
| Mentor | Developing | Strong | Strong |
| Pioneer | Strong | Developing | Developing |
| Anchor | Developing | Strong | Developing |
| Artisan | Developing | Developing | Strong |
| Seeker | Developing | Developing | Developing |
Why 8, not 24: The previous 24-type system (8 patterns × 3 frustration modifiers: Steady/Striving/Resolute) produced only ~31% type agreement across readministrations (Phase 2b). Reducing to 8 base patterns nearly doubles agreement to ~50-55%. Frustration information is preserved as continuous values with no binning loss. The categorical modifier may return if empirical decision consistency (kappa ≥ 0.60) supports it with the current 6 items per subscale.
Activation threshold (5.5): Positioned below the domain state threshold (6.46) because activation precedes full satisfaction. A need can be engaged and energising without yet reaching "Thriving." This threshold is assumed, not empirically derived. ROC analysis against coach engagement ratings will produce domain-specific empirical thresholds when criterion data is available.
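Type assignment is a lookup on three boolean activation flags. A sketch, transcribing the pattern table above:

```python
TYPES = {  # (Ambition strong, Belonging strong, Craft strong) → type name
    (True,  True,  True):  "Integrator",
    (True,  True,  False): "Captain",
    (True,  False, True):  "Architect",
    (False, True,  True):  "Mentor",
    (True,  False, False): "Pioneer",
    (False, True,  False): "Anchor",
    (False, False, True):  "Artisan",
    (False, False, False): "Seeker",
}

def assign_type(a_sat, b_sat, c_sat, threshold=5.5):
    """Binary activation per domain: satisfaction >= threshold means Strong."""
    return TYPES[(a_sat >= threshold, b_sat >= threshold, c_sat >= threshold)]
```

The worked example in Section 10 (A-Sat 7.78, B-Sat 9.17, C-Sat 5.42) resolves to Captain.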
6. Belbin Role Inference
Cluster mapping:
| Domain | Cluster | Roles | Differentiating Trait |
|---|---|---|---|
| Craft | Thinking | Plant, Specialist, Monitor-Evaluator | Openness, Conscientiousness, Neuroticism |
| Belonging | People | Teamworker, Resource Investigator, Coordinator | Agreeableness, Extraversion, Conscientiousness |
| Ambition | Action | Shaper, Implementer, Completer-Finisher | Extraversion, Conscientiousness, Neuroticism |
Scoring formula: role_score = domain_affinity × (trait_percentile / 100).
Affinity weights: Primary domain = 1.0, Secondary domain = 0.5, Tertiary domain = 0.0 (excluded).
Firing threshold: Roles with score ≥ 0.30 fire. The highest-scoring role always fires regardless of threshold.
Qualifier: "Natural" if trait percentile ≥ 60, "Manageable" if below 60.
Caveat: This is a heuristic mapping, not a validated Belbin instrument. Role scores are continuous approximations.
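The scoring rule above can be sketched as follows. The cluster table lists three differentiating traits per cluster; for brevity this example pairs each role with a single hypothetical trait, which is a simplification of the actual mapping:

```python
def belbin_roles(affinity, percentiles, role_map, threshold=0.30):
    """affinity: domain → 1.0 / 0.5 / 0.0; percentiles: trait → 1-99;
    role_map: role → (domain, differentiating trait), a simplified mapping."""
    scores = {role: affinity[dom] * percentiles[trait] / 100
              for role, (dom, trait) in role_map.items()}
    top = max(scores, key=scores.get)  # highest-scoring role always fires
    return {role: (score,
                   "Natural" if percentiles[role_map[role][1]] >= 60
                   else "Manageable")
            for role, score in scores.items()
            if score >= threshold or role == top}
```

With a Craft-primary participant (affinity 1.0) at the 72nd Openness percentile, a hypothetical Plant role scores 0.72 and fires as "Natural"; a Belonging-secondary Teamworker at the 40th Agreeableness percentile scores 0.20 and does not fire.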
7. Frustration Signatures
Frustration signatures fire when a domain's frustration score ≥ 4.38. The risk level depends on whether satisfaction is also high:
| Domain | Sat ≥ 6.46 + Frust ≥ 4.38 | Sat < 6.46 + Frust ≥ 4.38 |
|---|---|---|
| Ambition | Blocked Drive (medium risk) | Controlled Motivation (high risk) |
| Belonging | Conditional Belonging (medium risk) | Active Exclusion (high risk) |
| Craft | Evaluated Mastery (medium risk) | Competence Threat (high risk) |
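The signature matrix reduces to a small lookup. A sketch, assuming the thresholds from Section 3:

```python
SIGNATURES = {  # domain → (medium-risk label, high-risk label)
    "Ambition":  ("Blocked Drive", "Controlled Motivation"),
    "Belonging": ("Conditional Belonging", "Active Exclusion"),
    "Craft":     ("Evaluated Mastery", "Competence Threat"),
}

def frustration_signature(domain, sat, frust):
    """Return (signature, risk) for one domain, or None below the 4.38 cut."""
    if frust < 4.38:
        return None
    medium, high = SIGNATURES[domain]
    return (medium, "medium") if sat >= 6.46 else (high, "high")
```

The Section 10 participant (Belonging sat 9.17, frust 5.00) triggers Conditional Belonging at medium risk.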
8. Psychometric Properties (Simulation)
All values below are from synthetic data. Empirical validation will replace them.
| Property | Method | Result | Basis |
|---|---|---|---|
| IRT theta recovery | EAP scoring vs true theta | r > 0.90 | Bock & Mislevy (1982)¹ |
| Bifactor omega-h | Bifactor model | 0.246 (subscales independent) | Reise (2012); Reise, Bonifay & Haviland (2013)² |
| ECV | Explained Common Variance | 0.061 (specific factors dominate) | Reise, Bonifay & Haviland (2013)² |
| Per-domain classification agreement | Simulated readministration | ~75% | APA Standards (2014), Standard 2.16³ |
| Type agreement (8 patterns) | Simulated readministration | ~50-55% | Cohen (1960)⁴ |
| Difference score reliability | r_diff = (r_x + r_y - 2r_xy) / (2 - 2r_xy) | 0.70-0.90 | APA Standards (2014), Standard 2.4³ |
| Conditional SEM at thresholds | 1 / √I(θ) at cutpoints | ~0.20 theta units | Baker & Kim (2004)⁵ |
| 36-item tier reliability | IRT marginal reliability | 0.943 | Samejima (1969)⁶; APA Standards (2014), Standard 2.9³ |
| 18-item tier reliability | IRT marginal reliability | 0.870 | Samejima (1969)⁶ |
| 6-item tier reliability | IRT marginal reliability | 0.714 | Samejima (1969)⁶ |
| Cascade detection lag | Vulnerable-to-Distressed simulation | 1.5 timepoints | Vansteenkiste & Ryan (2013); Lonsdale & Hodge (2011)⁷ |
| Alert sensitivity | RCI-based detection | 81.1% at optimal threshold | Jacobson & Truax (1991); Youden (1950)⁸ |
Footnotes: ¹ EAP estimation method. ² Bifactor model specification and omega coefficient definitions. ³ APA/AERA/NCME Standards for Educational and Psychological Testing govern decision consistency (2.16), difference score reliability (2.4), and tier-specific reliability (2.9). ⁴ Cohen's kappa for chance-corrected classification agreement. ⁵ IRT information function and conditional SEM derivation. ⁶ Graded Response Model for polytomous IRT. ⁷ SDT theoretical basis for the Vulnerable-to-Distressed cascade (need frustration as burnout precursor). ⁸ Jacobson-Truax Reliable Change Index for detecting meaningful change; Youden Index for optimal diagnostic threshold selection.
9. How a Synthetic Participant Is Created
Each simulated participant is generated through this exact sequence. Every step is deterministic given a seed, which ensures reproducibility across runs.
- Draw 6 latent subscale z-scores. For each of the 6 subscales (A-Sat, A-Frust, B-Sat, B-Frust, C-Sat, C-Frust), sample a value from N(z_mean, 1). The z-means are tuned offsets: satisfaction z-mean = +0.30, frustration z-mean = -0.31. These offsets compensate for structural biases in the type distribution (see Section 12). The 6 draws are independent (identity correlation matrix).
- Generate 36 item responses from the z-scores. For each subscale, generate 6 item-level responses by adding independent Gaussian noise (SD = 0.3 × noise_multiplier) to the subscale z-score, then transform to the 1-7 Likert scale: item = clamp(round(z × scale + midpoint), 1, 7). This simulates the measurement noise that real items would introduce.
- Apply reverse coding. Items 4 and 6 of each subscale are reverse-scored: scored = 8 - raw (AS4, AS6, AF4, AF6, BS4, BS6, BF4, BF6, CS4, CS6, CF4, CF6). This mirrors the actual instrument design where two items per subscale are reverse-keyed to detect acquiescence bias.
- Compute 6 subscale means. Average the 6 scored items per subscale. Range: [1.0, 7.0].
- Normalise to 0-10 scale. score = ((mean - 1) / 6) × 10. This maps the Likert midpoint (4.0) to 5.0 on the normalised scale.
- Run the full scoring pipeline. The 6 normalised scores enter the 10-step pipeline (Section 2), producing domain states, dominant domain, Big Five percentiles, base pattern type, Belbin roles, and frustration signatures.
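A condensed sketch of steps 1-2 of this sequence. The `scale` and `midpoint` constants for the Likert transform are illustrative assumptions, and the reverse-keying of items 4 and 6 is omitted for brevity:

```python
import random

SUBSCALES = [("A-Sat", True), ("A-Frust", False), ("B-Sat", True),
             ("B-Frust", False), ("C-Sat", True), ("C-Frust", False)]

def generate_participant(seed, sat_z_mean=0.30, frust_z_mean=-0.31,
                         noise_sd=0.3, scale=1.2, midpoint=4.0):
    """Draw one latent z-score per subscale, then six noisy 1-7 items each."""
    rng = random.Random(seed)  # deterministic given the seed
    items = {}
    for name, is_sat in SUBSCALES:
        z = rng.gauss(sat_z_mean if is_sat else frust_z_mean, 1)  # step 1
        items[name] = [
            min(7, max(1, round((z + rng.gauss(0, noise_sd)) * scale + midpoint)))
            for _ in range(6)                                      # step 2
        ]
    return items
```

Because all randomness flows through a single seeded generator, the same seed always reproduces the same 36 item responses, which is the reproducibility property the sequence above requires.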
What this means: A participant's type is not assigned directly. It emerges from the interaction of 6 latent z-scores, 36 noisy item responses, and 5 classification thresholds. Two participants with similar z-scores may receive different types because item-level noise pushes their subscale means above or below a threshold. This is the source of the classification instability documented in Section 8.
10. Worked Example: One Participant
This traces a single simulated participant through every computation. All values are illustrative.
| Step | Computation | Result |
|---|---|---|
| 1. Latent z-scores | Draw from N(+0.30, 1) for sat, N(-0.31, 1) for frust | A-Sat z=0.82, A-Frust z=-0.55, B-Sat z=1.14, B-Frust z=0.03, C-Sat z=-0.21, C-Frust z=-0.89 |
| 2. Item responses | z × scale + midpoint + noise, clamp to [1,7] | AS1=6, AS2=5, AS3=6, AS4=2 (reverse), AS5=6, AS6=3 (reverse), AF1=3, AF2=2, AF3=3, AF4=5 (reverse), AF5=2, AF6=4 (reverse), ... |
| 3. Reverse coding | AS4: 8-2=6, AS6: 8-3=5, AF4: 8-5=3, AF6: 8-4=4 | AS4 scored=6, AS6 scored=5, AF4 scored=3, AF6 scored=4 |
| 4. Subscale means | (6+5+6+6+6+5)/6 = 5.67 | A-Sat mean=5.67, A-Frust mean=2.83, B-Sat mean=6.50, B-Frust mean=4.00, C-Sat mean=4.25, C-Frust mean=2.25 |
| 5. Normalise | ((5.67-1)/6)×10 = 7.78 | A-Sat=7.78, A-Frust=3.05, B-Sat=9.17, B-Frust=5.00, C-Sat=5.42, C-Frust=2.08 |
| 6. Domain states | A: sat 7.78≥6.46, frust 3.05<4.38 | Ambition: Thriving, Belonging: Vulnerable (sat 9.17≥6.46, frust 5.00≥4.38), Craft: Mild (sat 5.42<6.46, frust 2.08<4.38) |
| 7. Dominant domain | argmax(7.78, 9.17, 5.42) | Belonging (9.17) |
| 8. Base pattern | A-Sat 7.78≥5.5: Strong. B-Sat 9.17≥5.5: Strong. C-Sat 5.42<5.5: Developing. | Captain (A Strong, B Strong, C Developing) |
| 9. Frustration levels | Continuous per domain | Ambition: 3.05 (low), Belonging: 5.00 (elevated), Craft: 2.08 (low) |
| 10. Frustration signatures | Belonging: sat 9.17≥6.46, frust 5.00≥4.38 | Conditional Belonging (medium risk) |
This participant is a Captain with elevated belonging frustration. The Vulnerable state in Belonging, combined with the Conditional Belonging signature, signals that relational connection is strong but under strain. In a longitudinal context, if belonging frustration continues to rise while satisfaction remains high, the trajectory engine would detect this as the early phase of a Vulnerable-to-Distressed cascade.
11. Trajectory Analysis
The trajectory system monitors how an athlete's scores change over repeated administrations. It operates on continuous theta scores with standard errors, not on categorical domain state labels, because domain state classifications flip on ~33% of readministrations (Section 8). The trajectory system answers three questions:
What is reliable change?
Not every score change is meaningful. Some change is measurement noise. The Jacobson-Truax Reliable Change Index (RCI) [14] distinguishes signal from noise:
SE_diff = √(SE_t² + SE_(t+1)²)
RCI = (score_(t+1) - score_t) / SE_diff
|RCI| > 1.96 → reliable change (95% confidence)
If the RCI exceeds 1.96, the change is larger than what measurement error alone could produce. If it does not, the change may be noise.
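The RCI formulas transcribe directly (a minimal sketch; names are illustrative):

```python
import math

def reliable_change(score_t, score_t1, se_t, se_t1, z_crit=1.96):
    """Jacobson-Truax RCI: is the change larger than measurement error allows?

    Returns (rci, is_reliable). se_t and se_t1 are the standard errors of the
    two administrations.
    """
    se_diff = math.sqrt(se_t ** 2 + se_t1 ** 2)
    rci = (score_t1 - score_t) / se_diff
    return rci, abs(rci) > z_crit
```

For example, a 2.0-point rise with SE = 0.5 at both timepoints gives RCI ≈ 2.83, a reliable change; a 0.5-point rise gives RCI ≈ 0.71, indistinguishable from noise.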
What is the trajectory pattern?
Over multiple timepoints, the trajectory engine classifies each athlete's score history into one of five patterns [15]:
| Pattern | Definition | Simulated Prevalence |
|---|---|---|
| Stable | No significant trend, low variability | 40% |
| Gradual Decline | Significant negative slope over the observation window | 20% |
| Gradual Rise | Significant positive slope over the observation window | 20% |
| Acute Event | Single large reliable drop (RCI < -3.0) between consecutive points | 10% |
| Volatile | Many direction changes with large amplitude (> 60% of timepoints change direction) | 10% |
Pattern detection uses a linear trend test over a sliding window. The slope is tested against an SE-adjusted null: slope is significant if |slope| > 1.96 × SE_slope, where SE_slope incorporates both measurement error and sampling variability.
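A sketch of the trend test: an ordinary least-squares slope over the window, compared against the SE-adjusted null. Here `se_slope` is supplied externally because it folds in measurement error:

```python
def slope_is_significant(scores, se_slope, z_crit=1.96):
    """OLS slope of scores over timepoints 0..n-1, tested against z_crit × SE."""
    n = len(scores)
    t_mean = (n - 1) / 2
    s_mean = sum(scores) / n
    slope = (sum((t - t_mean) * (s - s_mean) for t, s in enumerate(scores))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return slope, abs(slope) > z_crit * se_slope
```

A steady half-point decline per timepoint is flagged against a small SE, while flat scores with tiny wobble are not.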
What is the Vulnerable-to-Distressed cascade?
SDT predicts that need frustration rises before need satisfaction drops [26]. The cascade model simulates this: frustration increases at rate r_frust per timepoint starting at T=0, while satisfaction decreases at rate r_sat per timepoint starting at T=lag. The lag is the detection window: the time during which frustration is elevated but satisfaction has not yet declined.
frustration(t) = initial_frust + r_frust × t + noise
satisfaction(t) = initial_sat - r_sat × max(0, t - lag) + noise
Mean simulated lag: 1.5 timepoints
In the simulation, this produces the leading indicator signal: a practitioner monitoring frustration would see it rising 1.5 measurement points before satisfaction begins to drop. The alert system fires when a reliable decline (RCI < threshold) is detected. At the optimal threshold (RCI = -3.00), the alert achieves 81% sensitivity with 16% false positive rate [14, 27].
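The cascade equations above can be sketched with the noise terms omitted, so the lag structure is visible:

```python
def cascade(initial_sat, initial_frust, r_sat, r_frust, lag, timepoints):
    """Noise-free cascade: frustration rises from T=0, satisfaction falls from T=lag.

    Returns a list of (satisfaction, frustration) per timepoint.
    """
    trajectory = []
    for t in range(timepoints):
        frust = initial_frust + r_frust * t            # rises immediately
        sat = initial_sat - r_sat * max(0, t - lag)    # falls only after the lag
        trajectory.append((sat, frust))
    return trajectory
```

With a lag of 2, frustration climbs for two timepoints while satisfaction holds flat; that window is exactly the leading-indicator signal the monitoring system is built to catch.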
How trajectories are simulated
For each simulated athlete, the trajectory generator:
- Assigns a trajectory type by random draw from the prevalence distribution (40% stable, 20% decline, 20% rise, 10% acute, 10% volatile).
- Generates a base score trajectory using the type-specific formula (e.g., for decline: score(t) = base - rate × t + noise).
- Adds measurement noise at each timepoint (SD = 0.4) to simulate the imprecision of real item responses.
- Detects burnout onset when the score falls below 3.5 on the 0-10 scale (for decline and acute trajectories).
- Computes RCI between consecutive timepoints to flag reliable changes.
- Classifies the pattern based on volatility, acute drops, and trend significance.
What this does not capture: Real athlete trajectories are shaped by events (injury, selection, exam periods, relationship changes) that the simulation cannot model. The simulation tests whether the detection infrastructure works under controlled noise conditions. Whether the patterns it detects correspond to real burnout transitions can only be determined with empirical longitudinal data paired with a criterion measure such as the ABQ [20].
12. Simulation Parameters
The dashboard generates synthetic participants using these fixed parameters:
- Distribution: Six subscale scores drawn from independent normal distributions. Satisfaction z-mean = +0.24, frustration z-mean = -0.31 (tuned to flatten type distribution).
- Correlation matrix: Identity (zero inter-subscale correlation). This is a conservative baseline; real data will show within-domain negative correlations (sat vs frust) and cross-domain positive correlations.
- Item-level noise: Independent Gaussian noise (SD = 0.3 × noise multiplier), clamped to [1, 7], rounded to integers.
- Convergence: At scale, every run converges to the same population shape. Variability comes from item-level noise, not population-level parameter changes.
13. Limitations and Transparency
- Synthetic data only. Every participant was generated from parametric distributions. No real athletes contributed data.
- No empirical validation yet. The pipeline validates against its own generative model. CFA fit of 1.000 on synthetic data is circular. Empirical CFA on real responses is the true test.
- This is an AI-developed instrument. No licensed SDT materials were used. The ABC assessment is a purpose-built instrument developed through AI-assisted research and iterative simulation. We are transparent about what has confidence (the analytic infrastructure, the theoretical framework) and where further research is needed (empirical criterion validation, item-level calibration on real data).
- Big Five estimates are inferential. The weight matrix approximates Big Five percentiles from subscale patterns. No validated Big Five instrument underpins it.
- Belbin roles rest on heuristics. Domain satisfaction selects a cluster and Big Five percentiles differentiate within it. This is not a Belbin instrument.
- Thresholds will shift with empirical data. All thresholds (6.46, 4.38, 5.5) are calibrated to the simulation. ROC analysis against the Athlete Burnout Questionnaire (Raedeke & Smith, 2001; Grugan et al., 2024) will produce empirical thresholds when paired data is available.
- Classification instability is a known limitation. Per-domain agreement is ~75% with 6 items per subscale (36 total). All classifications should carry confidence bands. Empirical validation may warrant further item pool expansion if agreement falls below this estimate.
14. References
- [1] APA, AERA, & NCME. (2014). Standards for Educational and Psychological Testing.
- [2] Baker, F. B., & Kim, S.-H. (2004). Item Response Theory: Parameter Estimation Techniques (2nd ed.). Marcel Dekker.
- [3] Bartholomew, K. J., et al. (2011). Self-determination theory and diminished functioning. Personality and Social Psychology Bulletin, 37(11), 1459-1473.
- [4] Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability. Applied Psychological Measurement, 6(4), 433-444.
- [5] Chen, B., et al. (2015). Basic psychological need satisfaction, need frustration, and need strength across four cultures. Motivation and Emotion, 39, 216-236.
- [6] Chen, F. F. (2007). Sensitivity of goodness of fit indexes. Structural Equation Modeling, 14(3), 464-504.
- [7] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
- [8] Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory. Psychological Assessment Resources.
- [9] Deci, E. L., & Ryan, R. M. (2000). The "what" and "why" of goal pursuits. Psychological Inquiry, 11(4), 227-268.
- [10] Efron, B., & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman & Hall.
- [11] Gosling, S. D., et al. (2003). A very brief measure of the Big-Five personality domains. Journal of Research in Personality, 37(6), 504-528.
- [12] Grugan, M. C., et al. (2024). Factorial validity and measurement invariance of the ABQ. Psychology of Sport and Exercise, 73, 102638.
- [13] Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes. Structural Equation Modeling, 6(1), 1-55.
- [14] Jacobson, N. S., & Truax, P. (1991). Clinical significance. Journal of Consulting and Clinical Psychology, 59(1), 12-19.
- [15] Lonsdale, C., & Hodge, K. (2011). Temporal ordering of motivational quality and athlete burnout. Medicine & Science in Sports & Exercise, 43(5), 913-921.
- [16] Lonsdale, C., et al. (2009). Athlete burnout in elite sport: A self-determination perspective. Journal of Sports Sciences, 27(8), 785-795.
- [17] Lovibond, S. H., & Lovibond, P. F. (1995). Manual for the Depression Anxiety Stress Scales. Psychology Foundation.
- [18] McDonald, R. P. (1999). Test Theory: A Unified Treatment. Erlbaum.
- [19] Murphy, J., et al. (2023). The BPNSFS probably does not validly measure need frustration. Motivation and Emotion, 47, 899-919.
- [20] Raedeke, T. D., & Smith, A. L. (2001). Development and preliminary validation of an athlete burnout measure. Journal of Sport & Exercise Psychology, 23(4), 281-306.
- [21] Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47(5), 667-696.
- [22] Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2013). Scoring and modeling issues in bifactor analysis. Psychological Assessment, 25(2), 404-415.
- [23] Samejima, F. (1969). Estimation of Latent Ability Using a Response Pattern of Graded Scores. Psychometrika Monograph No. 17.
- [24] Swets, J. A. (1988). Measuring the accuracy of diagnostic systems. Science, 240(4857), 1285-1293.
- [25] Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature. Organizational Research Methods, 3(1), 4-70.
- [26] Vansteenkiste, M., & Ryan, R. M. (2013). On psychological growth and vulnerability. Journal of Psychotherapy Integration, 23(3), 263-280.
- [27] Youden, W. J. (1950). Index for rating diagnostic tests. Cancer, 3(1), 32-35.