Overview

The 6-factor model measures satisfaction and frustration across three domains: Ambition, Belonging, and Craft.

Participants
Archetypes Observed
Vulnerable %
Avg Roles / Person
With Risk Signals
Type Stability

Validation Gates

An R/lavaan CFA and the Python scoring pipeline were used to verify the gates below.

Scoring Correlation
r = 1.000
Target: ≥ 0.85 per subscale
Domain State Accuracy
100.0%
Target: ≥ 80%
Vulnerable Sensitivity
100.0%
Target: ≥ 75%
Max Type Frequency
Target: ≤ 15%
CFA Model Fit (CFI)
1.000
Target: ≥ 0.95
Code Coverage
97.5%
Target: ≥ 85%

Subscale Summary

Mean scores on the 0-10 normalised scale. Error bars show ±1 SD.

Archetypes

The scoring pipeline assigns each participant one of 8 motivational archetypes based on which domains are activated (satisfaction ≥ 5.5). Frustration is reported as a continuous score per domain, not as a categorical modifier.

Type Distribution

Each type has a fixed colour matching its strong-domain pattern. Colours are consistent with the Type Guide below.

Type Guide

Click a type to see its profile, strengths, typical subscale pattern, and growth edge.

Domain States

Crossing satisfaction and frustration at 5.5 classifies each domain into one of four states: Thriving, Vulnerable, Mild, or Distressed.

State Distribution

Domain | State | Count | %

Satisfaction vs Frustration

Each dot represents one participant. Dashed lines at 5.5 divide the four state quadrants.

The Four States

Satisfaction (horizontal) and frustration (vertical) cross at 5.5 to form four quadrants. Click any cell for the full profile including mental fatigue risk.


Belbin Roles

Subscale score patterns map each participant to one or more Belbin team roles.

Role Distribution

One participant can hold several roles, so percentages can sum to more than 100%.

Role (Qualifier) | Count | % of Participants

Role Guide

Click a role to see its definition, ABC alignment, and the types most likely to hold it.

Risk Signals

Frustration signatures flag participants whose score patterns signal psychological strain.

Frustration Signatures

High frustration paired with extreme satisfaction scores reveals six distinct strain patterns.

Signature | Risk | Count | % of Participants

Score Distributions

How subscale scores (0-10) and Big Five percentiles spread across the population.

Subscale Statistics

Subscale | Mean | SD | Min | Max

Individual Participants

Click a row to see the full profile. Search, filter, or click column headers to sort.

# | Type | Domain | A-State | B-State | C-State | A-Sat | A-Frust | B-Sat | B-Frust | C-Sat | C-Frust

Take an Assessment

Experience the ABC Assessment as a user would. Choose a tier, answer the questions, and see your motivational profile.

Onboarding

6 items · ~2 minutes

Quick first impression. One satisfaction and one frustration item per domain gives an early signal of your motivational shape.

Standard

12 items · ~4 minutes

Stronger signal within the first two weeks. Two items per subscale sharpens the profile and adds confidence to the archetype.

Full Assessment

36 items · ~12 minutes

The complete instrument. Six items per subscale produce reliable subscale means, type assignment, and risk signals.

Trajectory: Variance from Baseline

The baseline is the athlete's full 36-item assessment at the start of the season. Weekly check-ins track how far all six subscales drift from that starting point. If any subscale exceeds the reassessment threshold, a new full assessment is needed.

Select Participant

Choose a simulated participant to use as the baseline. Their full assessment scores become the starting point for weekly check-ins.

Classification Stability

Each participant is re-measured 100 times with realistic noise (±1 Likert point per item). Stability = % of trials where the same classification is assigned.

Type Stability
State Stability
Signature Stability
Belbin Combos Observed

Test-Retest Distribution

Distribution of per-participant type stability scores. Higher is more reliable.

Big Five Round-Trip Validation

Ground-truth Big Five scores are generated, mapped to ABC responses via reverse weights, then inferred back. Correlation measures recovery fidelity.

Boundary Participants

Participants closest to classification thresholds, ranked by instability. These are the profiles most likely to flip under measurement error.

# | Type | Type Stability | Nearest Threshold | Distance

Methodology

This section documents the complete scoring pipeline, all mathematical formulas, weight matrices, classification thresholds, and psychometric evidence for auditor review. All evidence is from synthetic simulation; empirical validation is pending.

Contents: 1. Theory · 2. Pipeline · 3. Domain States · 4. Big Five · 5. Types · 6. Belbin · 7. Signatures · 8. Psychometrics · 9. Population · 10. Worked Example · 11. Trajectory · 12. Sim Params · 13. Limitations · 14. References

1. Theoretical Foundation

The ABC assessment operationalises Self-Determination Theory (Deci & Ryan, 2000) through three psychological need domains, each measured by two subscales:

Domain | SDT Construct | Satisfaction Subscale | Frustration Subscale
Ambition | Autonomy-directed goal pursuit | A-Sat (AS1-AS6) | A-Frust (AF1-AF6)
Belonging | Relational connection | B-Sat (BS1-BS6) | B-Frust (BF1-BF6)
Craft | Skill-directed competence | C-Sat (CS1-CS6) | C-Frust (CF1-CF6)

Satisfaction and frustration are measured as independent constructs, not opposite ends of a single continuum (Vansteenkiste & Ryan, 2013; Chen et al., 2015). This allows detection of the "Vulnerable" state (high satisfaction + high frustration), which SDT identifies as a burnout precursor. Murphy et al. (2023) raised concerns that this distinction may be a method artifact from item-keying direction; the simulation includes bifactor and method-factor models (Phase 3) to test this directly.

2. Scoring Pipeline

From raw responses to typed profiles with team roles and risk flags.

  1. Ingest raw responses. 36 items on a 1-7 Likert scale. 6 items per subscale, 6 subscales. Items 4 and 6 of each subscale are reverse-coded (AS4, AS6, AF4, AF6, BS4, BS6, BF4, BF6, CS4, CS6, CF4, CF6).
  2. Reverse-score. For reverse-coded items: scored_value = 8 - raw_value.
  3. Compute subscale means. mean = (item_1 + ... + item_6) / 6. Range: [1.0, 7.0].
  4. Normalise to 0-10. score = ((mean - 1) / 6) × 10. Range: [0.0, 10.0].
  5. Classify domain states. Two thresholds per domain create four states. See Section 3.
  6. Find dominant domain. argmax of satisfaction scores. Ties broken: ambition > belonging > craft.
  7. Infer Big Five percentiles. Internal only. See Section 4 for the weight matrix.
  8. Assign base pattern. 8 types from binary satisfaction threshold. See Section 5.
  9. Map Belbin roles. See Section 6 for the cluster mapping and scoring formula.
  10. Flag frustration signatures. See Section 7 for the risk classification matrix.
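The arithmetic of steps 2-6 can be sketched in a few lines of Python. The function names (`score_subscales`, `dominant_domain`) and the response-dict shape are illustrative, not the production API:

```python
SUBSCALES = ["A-Sat", "A-Frust", "B-Sat", "B-Frust", "C-Sat", "C-Frust"]
PREFIXES = {"A-Sat": "AS", "A-Frust": "AF", "B-Sat": "BS",
            "B-Frust": "BF", "C-Sat": "CS", "C-Frust": "CF"}
REVERSE_POSITIONS = {4, 6}  # items 4 and 6 of every subscale are reverse-keyed

def score_subscales(responses: dict[str, int]) -> dict[str, float]:
    """Raw 1-7 responses (keyed 'AS1'..'CF6') -> normalised 0-10 scores."""
    scores = {}
    for sub in SUBSCALES:
        items = []
        for i in range(1, 7):
            raw = responses[f"{PREFIXES[sub]}{i}"]
            items.append(8 - raw if i in REVERSE_POSITIONS else raw)  # step 2
        mean = sum(items) / 6                       # step 3, range [1.0, 7.0]
        scores[sub] = (mean - 1) / 6 * 10           # step 4, range [0.0, 10.0]
    return scores

def dominant_domain(scores: dict[str, float]) -> str:
    """Step 6: argmax of satisfaction; ties broken ambition > belonging > craft."""
    order = ["A-Sat", "B-Sat", "C-Sat"]  # list position encodes tie-break priority
    best = max(order, key=lambda k: (scores[k], -order.index(k)))
    return {"A-Sat": "ambition", "B-Sat": "belonging", "C-Sat": "craft"}[best]
```

A flat response set (all items at the Likert midpoint of 4) lands every subscale on 5.0, and the three-way satisfaction tie resolves to ambition.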

3. Domain State Classification

Each domain is classified into one of four mutually exclusive states using two thresholds:

  | Frustration < 4.38 | Frustration ≥ 4.38
Satisfaction ≥ 6.46 | Thriving | Vulnerable
Satisfaction < 6.46 | Mild | Distressed
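A minimal sketch of this classification, using the fixed thresholds from the table:

```python
SAT_THRESHOLD = 6.46
FRUST_THRESHOLD = 4.38

def classify_domain_state(sat: float, frust: float) -> str:
    """Map one domain's (satisfaction, frustration) pair to its state."""
    high_sat = sat >= SAT_THRESHOLD
    high_frust = frust >= FRUST_THRESHOLD
    if high_sat and not high_frust:
        return "Thriving"
    if high_sat and high_frust:
        return "Vulnerable"
    if not high_sat and not high_frust:
        return "Mild"
    return "Distressed"
```

Running the worked-example scores from Section 10 through this function reproduces the Thriving / Vulnerable / Mild states reported there.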

Threshold derivation status: The thresholds 6.46 and 4.38 were positioned between discrete score boundaries on the normalised 0-10 scale to prevent classification artifacts from the 7-point Likert resolution. Phase 2 ROC analysis on synthetic criterion data produced empirical thresholds of 6.09 (satisfaction) and 4.82 (frustration), both within the bootstrap 95% CIs of the fixed values. Empirical thresholds from ABQ criterion data will replace these when N ≥ 200 paired datasets are available.

Decision consistency (Phase 2b): With 6 items per subscale, per-domain classification agreement across two independent administrations is approximately 75%. Joint agreement across all three domains is approximately 42%. This means classifications should be interpreted with confidence bands, not as definitive labels.

4. Big Five Weight Matrix

Big Five percentiles are inferred from centred subscale scores. This is internal only, used for Belbin role inference, not displayed to athletes.

Step 1: Centre. z_i = (score_i - 5.0) / 5.0 for each of the 6 subscales.

Step 2: Dot product. raw_trait_j = Σ_i (w_ij × z_i) for each trait j.

Step 3: Percentile. percentile = clamp(50 + raw_trait × 30, 1, 99).

Weight matrix W (6 subscales × 5 traits):

Subscale | Openness | Conscientiousness | Extraversion | Agreeableness | Neuroticism
A-Sat | 0.12 | 0.03 | 0.47 | -0.23 | 0.00
A-Frust | 0.16 | 0.13 | 0.02 | 0.19 | 0.24
B-Sat | -0.36 | 0.20 | 0.27 | 0.43 | 0.05
B-Frust | -0.35 | 0.30 | 0.19 | -0.13 | 0.41
C-Sat | 0.52 | 0.18 | -0.12 | 0.08 | -0.03
C-Frust | 0.33 | -0.45 | 0.11 | 0.18 | 0.05
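The three-step inference can be sketched directly from the matrix. The code restates W from the table; names are illustrative:

```python
W = {  # subscale -> (O, C, E, A, N) weights, from the matrix above
    "A-Sat":   ( 0.12,  0.03,  0.47, -0.23,  0.00),
    "A-Frust": ( 0.16,  0.13,  0.02,  0.19,  0.24),
    "B-Sat":   (-0.36,  0.20,  0.27,  0.43,  0.05),
    "B-Frust": (-0.35,  0.30,  0.19, -0.13,  0.41),
    "C-Sat":   ( 0.52,  0.18, -0.12,  0.08, -0.03),
    "C-Frust": ( 0.33, -0.45,  0.11,  0.18,  0.05),
}
TRAITS = ["Openness", "Conscientiousness", "Extraversion",
          "Agreeableness", "Neuroticism"]

def infer_big_five(scores: dict[str, float]) -> dict[str, float]:
    """Normalised 0-10 subscale scores -> Big Five percentiles (internal only)."""
    z = {sub: (s - 5.0) / 5.0 for sub, s in scores.items()}       # step 1: centre
    out = {}
    for j, trait in enumerate(TRAITS):
        raw = sum(W[sub][j] * z[sub] for sub in W)                # step 2: dot product
        out[trait] = min(max(50 + raw * 30, 1), 99)               # step 3: percentile
    return out
```

A fully neutral profile (all subscales at 5.0) centres to zero and yields the 50th percentile on every trait, which is a useful sanity check on any reimplementation.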

Design constraints: Weights optimised to produce inter-trait correlations |r| < 0.02 and primary-trait distribution of ~20% each, validated against Gosling et al. (2003) benchmarks. No validated Big Five instrument underpins these weights; they are inferential approximations from ABC subscale patterns.

5. Type System (8 Base Patterns)

Types use a binary satisfaction threshold (sat ≥ 5.5 = Strong, sat < 5.5 = Developing) per domain, producing 2³ = 8 base patterns. Frustration is reported as a continuous score per domain with confidence bands, not as a categorical modifier.

Pattern | Ambition | Belonging | Craft
Integrator | Strong | Strong | Strong
Captain | Strong | Strong | Developing
Architect | Strong | Developing | Strong
Mentor | Developing | Strong | Strong
Pioneer | Strong | Developing | Developing
Anchor | Developing | Strong | Developing
Artisan | Developing | Developing | Strong
Seeker | Developing | Developing | Developing
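A sketch of the pattern lookup, assuming only the 5.5 activation threshold on the three satisfaction scores:

```python
ACTIVATION = 5.5  # Strong if sat >= 5.5, Developing otherwise

TYPE_BY_PATTERN = {  # (Ambition, Belonging, Craft) activation flags -> type
    (True, True, True): "Integrator",   (True, True, False): "Captain",
    (True, False, True): "Architect",   (False, True, True): "Mentor",
    (True, False, False): "Pioneer",    (False, True, False): "Anchor",
    (False, False, True): "Artisan",    (False, False, False): "Seeker",
}

def assign_type(a_sat: float, b_sat: float, c_sat: float) -> str:
    pattern = (a_sat >= ACTIVATION, b_sat >= ACTIVATION, c_sat >= ACTIVATION)
    return TYPE_BY_PATTERN[pattern]
```

The Section 10 worked example (A-Sat 7.78, B-Sat 9.17, C-Sat 5.42) resolves to Captain under this lookup.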

Why 8, not 24: The previous 24-type system (8 patterns × 3 frustration modifiers: Steady/Striving/Resolute) produced only ~31% type agreement across readministrations (Phase 2b). Reducing to 8 base patterns nearly doubles agreement to ~50-55%. Frustration information is preserved as continuous values with no binning loss. The categorical modifier may return if empirical decision consistency (kappa ≥ 0.60) supports it with the current 6 items per subscale.

Activation threshold (5.5): Positioned below the domain state threshold (6.46) because activation precedes full satisfaction. A need can be engaged and energising without yet reaching "Thriving." This threshold is assumed, not empirically derived. ROC analysis against coach engagement ratings will produce domain-specific empirical thresholds when criterion data is available.

6. Belbin Role Inference

Cluster mapping:

Domain | Cluster | Roles | Differentiating Traits
Craft | Thinking | Plant, Specialist, Monitor-Evaluator | Openness, Conscientiousness, Neuroticism
Belonging | People | Teamworker, Resource Investigator, Coordinator | Agreeableness, Extraversion, Conscientiousness
Ambition | Action | Shaper, Implementer, Completer-Finisher | Extraversion, Conscientiousness, Neuroticism

Scoring formula: role_score = domain_affinity × (trait_percentile / 100).

Affinity weights: Primary domain = 1.0, Secondary domain = 0.5, Tertiary domain = 0.0 (excluded).

Firing threshold: Roles with score ≥ 0.30 fire. The highest-scoring role always fires regardless of threshold.

Qualifier: "Natural" if trait percentile ≥ 60, "Manageable" if below 60.

Caveat: This is a heuristic mapping, not a validated Belbin instrument. Role scores are continuous approximations.

7. Frustration Signatures

Frustration signatures fire when a domain's frustration score ≥ 4.38. The risk level depends on whether satisfaction is also high:

Domain | Sat ≥ 6.46 + Frust ≥ 4.38 | Sat < 6.46 + Frust ≥ 4.38
Ambition | Blocked Drive (medium risk) | Controlled Motivation (high risk)
Belonging | Conditional Belonging (medium risk) | Active Exclusion (high risk)
Craft | Evaluated Mastery (medium risk) | Competence Threat (high risk)
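The signature matrix reduces to a small lookup; the thresholds are the Section 3 constants:

```python
SIGNATURES = {  # domain -> (high-sat signature, low-sat signature)
    "ambition":  ("Blocked Drive", "Controlled Motivation"),
    "belonging": ("Conditional Belonging", "Active Exclusion"),
    "craft":     ("Evaluated Mastery", "Competence Threat"),
}

def flag_signature(domain: str, sat: float, frust: float):
    """Return (signature, risk) if the domain's frustration fires, else None."""
    if frust < 4.38:
        return None
    high_sat = sat >= 6.46
    name = SIGNATURES[domain][0 if high_sat else 1]
    return (name, "medium" if high_sat else "high")
```

The Section 10 participant (Belonging sat 9.17, frust 5.00) fires Conditional Belonging at medium risk under this lookup.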

8. Psychometric Properties (Simulation)

All values below are from synthetic data. Empirical validation will replace them.

Property | Method | Result | Basis
IRT theta recovery | EAP scoring vs true theta | r > 0.90 | Bock & Mislevy (1982)¹
Bifactor omega-h | Bifactor model | 0.246 (subscales independent) | Reise (2012); Reise, Bonifay & Haviland (2013)²
ECV | Explained Common Variance | 0.061 (specific factors dominate) | Reise, Bonifay & Haviland (2013)²
Per-domain classification agreement | Simulated readministration | ~75% | APA Standards (2014), Standard 2.16³
Type agreement (8 patterns) | Simulated readministration | ~50-55% | Cohen (1960)⁴
Difference score reliability | r_diff = (r_x + r_y - 2r_xy) / (2 - 2r_xy) | 0.70-0.90 | APA Standards (2014), Standard 2.4³
Conditional SEM at thresholds | 1 / √I(θ) at cutpoints | ~0.20 theta units | Baker & Kim (2004)⁵
36-item tier reliability | IRT marginal reliability | 0.943 | Samejima (1969)⁶; APA Standards (2014), Standard 2.9³
18-item tier reliability | IRT marginal reliability | 0.870 | Samejima (1969)⁶
6-item tier reliability | IRT marginal reliability | 0.714 | Samejima (1969)⁶
Cascade detection lag | Vulnerable-to-Distressed simulation | 1.5 timepoints | Vansteenkiste & Ryan (2013); Lonsdale & Hodge (2011)⁷
Alert sensitivity | RCI-based detection | 81.1% at optimal threshold | Jacobson & Truax (1991); Youden (1950)⁸
Footnotes: ¹ EAP estimation method. ² Bifactor model specification and omega coefficient definitions. ³ APA/AERA/NCME Standards for Educational and Psychological Testing govern decision consistency (2.16), difference score reliability (2.4), and tier-specific reliability (2.9). ⁴ Cohen's kappa for chance-corrected classification agreement. ⁵ IRT information function and conditional SEM derivation. ⁶ Graded Response Model for polytomous IRT. ⁷ SDT theoretical basis for the Vulnerable-to-Distressed cascade (need frustration as burnout precursor). ⁸ Jacobson-Truax Reliable Change Index for detecting meaningful change; Youden Index for optimal diagnostic threshold selection.

9. How a Synthetic Participant Is Created

Each simulated participant is generated through this exact sequence. Every step is deterministic given a seed, which ensures reproducibility across runs.

  1. Draw 6 latent subscale z-scores. For each of the 6 subscales (A-Sat, A-Frust, B-Sat, B-Frust, C-Sat, C-Frust), sample a value from N(z_mean, 1). The z-means are tuned offsets: satisfaction z-mean = +0.30, frustration z-mean = -0.31. These offsets compensate for structural biases in the type distribution (see Section 12). The 6 draws are independent (identity correlation matrix).
  2. Generate 36 item responses from the z-scores. For each subscale, generate 6 item-level responses by adding independent Gaussian noise (SD = 0.3 × noise_multiplier) to the subscale z-score, then transform to the 1-7 Likert scale: item = clamp(round(z × scale + midpoint), 1, 7). This simulates the measurement noise that real items would introduce.
  3. Apply reverse coding. Items 4 and 6 of each subscale are reverse-scored: scored = 8 - raw (AS4, AS6, AF4, AF6, BS4, BS6, BF4, BF6, CS4, CS6, CF4, CF6). This mirrors the actual instrument design where two items per subscale are reverse-keyed to detect acquiescence bias.
  4. Compute 6 subscale means. Average the 6 scored items per subscale. Range: [1.0, 7.0].
  5. Normalise to 0-10 scale. score = ((mean - 1) / 6) × 10. This maps the Likert midpoint (4.0) to 5.0 on the normalised scale.
  6. Run the full scoring pipeline. The 6 normalised scores enter the 10-step pipeline (Section 2), producing domain states, dominant domain, Big Five percentiles, base pattern type, Belbin roles, and frustration signatures.
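The generation sequence above can be sketched as follows. The z → Likert transform constants (`SCALE`, `MIDPOINT`) are assumed values for illustration (the source states the transform's shape but not its constants), and reverse-keyed items are emitted inverted so that step 3's reverse-scoring recovers them:

```python
import random

SAT_ZMEAN, FRUST_ZMEAN = 0.30, -0.31   # tuned offsets from step 1
SCALE, MIDPOINT = 1.5, 4.0             # assumed z -> Likert transform constants
NOISE_SD = 0.3                         # step 2 item noise (noise_multiplier = 1)

def generate_participant(seed: int) -> dict[str, float]:
    """One synthetic participant: 6 normalised 0-10 subscale scores."""
    rng = random.Random(seed)          # deterministic given the seed
    scores = {}
    for sub in ["A-Sat", "A-Frust", "B-Sat", "B-Frust", "C-Sat", "C-Frust"]:
        z = rng.gauss(SAT_ZMEAN if "Sat" in sub else FRUST_ZMEAN, 1.0)  # step 1
        items = []
        for i in range(1, 7):
            noisy = z + rng.gauss(0, NOISE_SD)                          # step 2
            target = min(max(round(noisy * SCALE + MIDPOINT), 1), 7)
            # reverse-keyed items (positions 4 and 6) are stored inverted,
            # mirroring reverse-worded items on the real instrument...
            raw = 8 - target if i in (4, 6) else target
            # ...then reverse-scored back in step 3: scored = 8 - raw
            items.append(8 - raw if i in (4, 6) else raw)
        mean = sum(items) / 6                                           # step 4
        scores[sub] = (mean - 1) / 6 * 10                               # step 5
    return scores
```

Because the RNG is seeded, two calls with the same seed produce byte-identical participants, which is the reproducibility property the section describes.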

What this means: A participant's type is not assigned directly. It emerges from the interaction of 6 latent z-scores, 36 noisy item responses, and 5 classification thresholds. Two participants with similar z-scores may receive different types because item-level noise pushes their subscale means above or below a threshold. This is the source of the classification instability documented in Section 8.

10. Worked Example: One Participant

This traces a single simulated participant through every computation. All values are illustrative.

Step | Computation | Result
1. Latent z-scores | Draw from N(+0.30, 1) for sat, N(-0.31, 1) for frust | A-Sat z=0.82, A-Frust z=-0.55, B-Sat z=1.14, B-Frust z=0.03, C-Sat z=-0.21, C-Frust z=-0.89
2. Item responses | z × scale + midpoint + noise, clamp to [1, 7] | AS1=6, AS2=5, AS3=6, AS4=2 (reverse), AS5=6, AS6=3 (reverse), AF1=3, AF2=2, AF3=3, AF4=5 (reverse), AF5=2, AF6=4 (reverse), ...
3. Reverse coding | AS4: 8-2=6, AS6: 8-3=5, AF4: 8-5=3, AF6: 8-4=4 | AS4 scored=6, AS6 scored=5, AF4 scored=3, AF6 scored=4
4. Subscale means | (6+5+6+6+6+5)/6 = 5.67 | A-Sat mean=5.67, A-Frust mean=2.83, B-Sat mean=6.50, B-Frust mean=4.00, C-Sat mean=4.25, C-Frust mean=2.25
5. Normalise | ((5.67-1)/6)×10 = 7.78 | A-Sat=7.78, A-Frust=3.05, B-Sat=9.17, B-Frust=5.00, C-Sat=5.42, C-Frust=2.08
6. Domain states | A: sat 7.78 ≥ 6.46, frust 3.05 < 4.38 | Ambition: Thriving; Belonging: Vulnerable (sat 9.17 ≥ 6.46, frust 5.00 ≥ 4.38); Craft: Mild (sat 5.42 < 6.46, frust 2.08 < 4.38)
7. Dominant domain | argmax(7.78, 9.17, 5.42) | Belonging (9.17)
8. Base pattern | A-Sat 7.78 ≥ 5.5: Strong; B-Sat 9.17 ≥ 5.5: Strong; C-Sat 5.42 < 5.5: Developing | Captain (A Strong, B Strong, C Developing)
9. Frustration levels | Continuous per domain | Ambition: 3.05 (low), Belonging: 5.00 (elevated), Craft: 2.08 (low)
10. Frustration signatures | Belonging: sat 9.17 ≥ 6.46, frust 5.00 ≥ 4.38 | Conditional Belonging (medium risk)

This participant is a Captain with elevated belonging frustration. The Vulnerable state in Belonging, combined with the Conditional Belonging signature, signals that relational connection is strong but under strain. In a longitudinal context, if belonging frustration continues to rise while satisfaction remains high, the trajectory engine would detect this as the early phase of a Vulnerable-to-Distressed cascade.

11. Trajectory Analysis

The trajectory system monitors how an athlete's scores change over repeated administrations. It operates on continuous theta scores with standard errors, not on categorical domain state labels, because domain state classifications flip on ~33% of readministrations (Section 8). The trajectory system answers three questions:

What is reliable change?

Not every score change is meaningful. Some change is measurement noise. The Jacobson-Truax Reliable Change Index (RCI) [14] distinguishes signal from noise:

SE_diff = √(SE_t² + SE_(t+1)²)
RCI = (score_(t+1) - score_t) / SE_diff
|RCI| > 1.96 → reliable change (95% confidence)

If the RCI exceeds 1.96, the change is larger than what measurement error alone could produce. If it does not, the change may be noise.
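This is a direct transcription of the RCI formulas, assuming each score arrives with its own standard error:

```python
import math

def reliable_change(score_t: float, score_t1: float,
                    se_t: float, se_t1: float) -> tuple[float, bool]:
    """Jacobson-Truax RCI between two administrations.
    Returns (rci, reliable at the 95% level)."""
    se_diff = math.sqrt(se_t ** 2 + se_t1 ** 2)
    rci = (score_t1 - score_t) / se_diff
    return rci, abs(rci) > 1.96
```

For example, a 2-point change with per-administration SEs of 0.5 gives RCI ≈ 2.83 (reliable), while a 0.5-point change under the same SEs gives RCI ≈ 0.71 (not distinguishable from noise).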

What is the trajectory pattern?

Over multiple timepoints, the trajectory engine classifies each athlete's score history into one of five patterns [15]:

Pattern | Definition | Simulated Prevalence
Stable | No significant trend, low variability | 40%
Gradual Decline | Significant negative slope over the observation window | 20%
Gradual Rise | Significant positive slope over the observation window | 20%
Acute Event | Single large reliable drop (RCI < -3.0) between consecutive points | 10%
Volatile | Many direction changes with large amplitude (> 60% of timepoints change direction) | 10%

Pattern detection uses a linear trend test over a sliding window. The slope is tested against an SE-adjusted null: the slope is significant if |slope| > 1.96 × SE_slope, where SE_slope incorporates both measurement error and sampling variability.

What is the Vulnerable-to-Distressed cascade?

SDT predicts that need frustration rises before need satisfaction drops [26]. The cascade model simulates this: frustration increases at rate r_frust per timepoint starting at T=0, while satisfaction decreases at rate r_sat per timepoint starting at T=lag. The lag is the detection window: the time during which frustration is elevated but satisfaction has not yet declined.

frustration(t) = initial_frust + r_frust × t + noise
satisfaction(t) = initial_sat - r_sat × max(0, t - lag) + noise
Mean simulated lag: 1.5 timepoints

In the simulation, this produces the leading indicator signal: a practitioner monitoring frustration would see it rising 1.5 measurement points before satisfaction begins to drop. The alert system fires when a reliable decline (RCI < threshold) is detected. At the optimal threshold (RCI = -3.00), the alert achieves 81% sensitivity with 16% false positive rate [14, 27].
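The cascade equations can be simulated in a few lines. The rates, initial levels, and noise SD below are illustrative defaults, not parameters stated in this section:

```python
import random

def simulate_cascade(timepoints: int = 10, lag: float = 1.5,
                     r_frust: float = 0.5, r_sat: float = 0.5,
                     initial_frust: float = 3.0, initial_sat: float = 8.0,
                     noise_sd: float = 0.2, seed: int = 0):
    """Vulnerable-to-Distressed cascade: frustration rises from T=0,
    satisfaction declines only after the lag. Returns (t, sat, frust) tuples."""
    rng = random.Random(seed)
    series = []
    for t in range(timepoints):
        frust = initial_frust + r_frust * t + rng.gauss(0, noise_sd)
        sat = initial_sat - r_sat * max(0, t - lag) + rng.gauss(0, noise_sd)
        series.append((t, sat, frust))
    return series
```

With noise switched off, the leading-indicator structure is visible directly: frustration climbs from the first timepoint while satisfaction stays flat until t exceeds the lag.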

How trajectories are simulated

For each simulated athlete, the trajectory generator:

  1. Assigns a trajectory type by random draw from the prevalence distribution (40% stable, 20% decline, 20% rise, 10% acute, 10% volatile).
  2. Generates a base score trajectory using the type-specific formula (e.g., for decline: score(t) = base - rate × t + noise).
  3. Adds measurement noise at each timepoint (SD = 0.4) to simulate the imprecision of real item responses.
  4. Detects burnout onset when the score falls below 3.5 on the 0-10 scale (for decline and acute trajectories).
  5. Computes RCI between consecutive timepoints to flag reliable changes.
  6. Classifies the pattern based on volatility, acute drops, and trend significance.

What this does not capture: Real athlete trajectories are shaped by events (injury, selection, exam periods, relationship changes) that the simulation cannot model. The simulation tests whether the detection infrastructure works under controlled noise conditions. Whether the patterns it detects correspond to real burnout transitions can only be determined with empirical longitudinal data paired with a criterion measure such as the ABQ [20].

12. Simulation Parameters

The dashboard generates synthetic participants using these fixed parameters:

  • Distribution: Six subscale scores drawn from independent normal distributions. Satisfaction z-mean = +0.24, frustration z-mean = -0.31 (tuned to flatten type distribution).
  • Correlation matrix: Identity (zero inter-subscale correlation). This is a conservative baseline; real data will show within-domain negative correlations (sat vs frust) and cross-domain positive correlations.
  • Item-level noise: Independent Gaussian noise (SD = 0.3 × noise multiplier), clamped to [1, 7], rounded to integers.
  • Convergence: At scale, every run converges to the same population shape. Variability comes from item-level noise, not population-level parameter changes.

13. Limitations and Transparency

  • Synthetic data only. Every participant was generated from parametric distributions. No real athletes contributed data.
  • No empirical validation yet. The pipeline validates against its own generative model. CFA fit of 1.000 on synthetic data is circular. Empirical CFA on real responses is the true test.
  • This is an AI-developed instrument. No licensed SDT materials were used. The ABC assessment is a purpose-built instrument developed through AI-assisted research and iterative simulation. We are transparent about what has confidence (the analytic infrastructure, the theoretical framework) and where further research is needed (empirical criterion validation, item-level calibration on real data).
  • Big Five estimates are inferential. The weight matrix approximates Big Five percentiles from subscale patterns. No validated Big Five instrument underpins it.
  • Belbin roles rest on heuristics. Domain satisfaction selects a cluster and Big Five percentiles differentiate within it. This is not a Belbin instrument.
  • Thresholds will shift with empirical data. All thresholds (6.46, 4.38, 5.5) are calibrated to the simulation. ROC analysis against the Athlete Burnout Questionnaire (Raedeke & Smith, 2001; Grugan et al., 2024) will produce empirical thresholds when paired data is available.
  • Classification instability is a known limitation. Per-domain agreement is ~75% with 6 items per subscale (36 total). All classifications should carry confidence bands. Empirical validation may warrant further item pool expansion if agreement falls below this estimate.

14. References

  [1] APA, AERA, & NCME. (2014). Standards for Educational and Psychological Testing.
  [2] Baker, F. B., & Kim, S.-H. (2004). Item Response Theory: Parameter Estimation Techniques (2nd ed.). Marcel Dekker.
  [3] Bartholomew, K. J., et al. (2011). Self-determination theory and diminished functioning. Personality and Social Psychology Bulletin, 37(11), 1459-1473.
  [4] Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability. Applied Psychological Measurement, 6(4), 433-444.
  [5] Chen, B., et al. (2015). Basic psychological need satisfaction, need frustration, and need strength across four cultures. Motivation and Emotion, 39, 216-236.
  [6] Chen, F. F. (2007). Sensitivity of goodness of fit indexes. Structural Equation Modeling, 14(3), 464-504.
  [7] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
  [8] Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory. Psychological Assessment Resources.
  [9] Deci, E. L., & Ryan, R. M. (2000). The "what" and "why" of goal pursuits. Psychological Inquiry, 11(4), 227-268.
  [10] Efron, B., & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman & Hall.
  [11] Gosling, S. D., et al. (2003). A very brief measure of the Big-Five personality domains. Journal of Research in Personality, 37(6), 504-528.
  [12] Grugan, M. C., et al. (2024). Factorial validity and measurement invariance of the ABQ. Psychology of Sport and Exercise, 73, 102638.
  [13] Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes. Structural Equation Modeling, 6(1), 1-55.
  [14] Jacobson, N. S., & Truax, P. (1991). Clinical significance. Journal of Consulting and Clinical Psychology, 59(1), 12-19.
  [15] Lonsdale, C., & Hodge, K. (2011). Temporal ordering of motivational quality and athlete burnout. Medicine & Science in Sports & Exercise, 43(5), 913-921.
  [16] Lonsdale, C., et al. (2009). Athlete burnout in elite sport: A self-determination perspective. Journal of Sports Sciences, 27(8), 785-795.
  [17] Lovibond, S. H., & Lovibond, P. F. (1995). Manual for the Depression Anxiety Stress Scales. Psychology Foundation.
  [18] McDonald, R. P. (1999). Test Theory: A Unified Treatment. Erlbaum.
  [19] Murphy, J., et al. (2023). The BPNSFS probably does not validly measure need frustration. Motivation and Emotion, 47, 899-919.
  [20] Raedeke, T. D., & Smith, A. L. (2001). Development and preliminary validation of an athlete burnout measure. Journal of Sport & Exercise Psychology, 23(4), 281-306.
  [21] Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47(5), 667-696.
  [22] Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2013). Scoring and modeling issues in bifactor analysis. Psychological Assessment, 25(2), 404-415.
  [23] Samejima, F. (1969). Estimation of Latent Ability Using a Response Pattern of Graded Scores. Psychometrika Monograph No. 17.
  [24] Swets, J. A. (1988). Measuring the accuracy of diagnostic systems. Science, 240(4857), 1285-1293.
  [25] Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature. Organizational Research Methods, 3(1), 4-70.
  [26] Vansteenkiste, M., & Ryan, R. M. (2013). On psychological growth and vulnerability. Journal of Psychotherapy Integration, 23(3), 263-280.
  [27] Youden, W. J. (1950). Index for rating diagnostic tests. Cancer, 3(1), 32-35.