Help & Documentation
Learn how to use the NFL Simulation Engine. This guide covers everything from basic concepts to advanced workflows.
Overview: What Does This System Do?
The NFL Simulation Engine predicts NFL game outcomes using a Bayesian statistical model. Here's what it does:
- Trains models on historical play-by-play data to learn team strengths, QB effects, and situational factors
- Simulates games thousands of times to generate win probabilities and score distributions
- Backtests model performance on historical games to evaluate accuracy
- Calibrates predictions with betting market data (optional)
Example Use Case
You want to predict the outcome of Chiefs vs Bills. The system:
- Uses trained model to estimate each team's offensive/defensive strength
- Accounts for QB quality (Mahomes vs Allen)
- Simulates the game 10,000 times
- Reports: "Chiefs win 58% of simulations, average score 27-24"
Key Concepts
1. EPA (Expected Points Added)
EPA measures how many points a play is worth on average. A +3 EPA play means the team gained about 3 points of expected value.
The model predicts EPA for each play based on:
- Team offensive/defensive strength
- QB quality
- Down and distance
- Field position
- Game situation (score, time remaining)
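For intuition, here is a minimal sketch of the EPA calculation. The expected-points values below are made up for illustration; the engine derives them from its trained model.

```python
# Minimal sketch of the EPA idea (expected-points values here are
# illustrative, not the engine's actual estimates).
# EPA = expected points after the play minus expected points before it.

ep_before = 2.4  # e.g., 1st & 10 at midfield
ep_after = 5.1   # e.g., 1st & 10 at the opponent's 20 after a long completion

epa = ep_after - ep_before
print(f"EPA for this play: {epa:+.1f}")  # +2.7 expected points added
```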
2. Model Training
Training teaches the model patterns from historical data. You specify:
- Seasons: Which years of data to use (e.g., 2016-2024)
- Profile: How thorough the training should be
  - Dev: Quick test (~5 min, 50k samples)
  - Fast: Standard training (~15 min, 120k samples)
  - Full: Comprehensive (~2+ hours, all data)
- Inference Method: How the model learns
  - ADVI: Faster, approximate (good for dev/fast)
  - NUTS: Slower, exact MCMC (best for full training)
3. Simulation
Simulation runs the game thousands of times using the trained model. Each simulation:
- Generates drives using EPA predictions
- Converts drive EPA to points
- Alternates possessions between teams
- Records final score
After 10,000 simulations, you get distributions: "Chiefs win 58% of the time, average score 27-24."
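The sketch below shows the Monte Carlo loop in miniature. It draws final scores from hypothetical distributions rather than generating drives from EPA predictions, so treat it as an illustration of the idea, not the engine's actual drive model.

```python
import numpy as np

# Toy Monte Carlo loop: draw final scores from hypothetical distributions
# (the real engine builds each game drive-by-drive from EPA predictions).
rng = np.random.default_rng(42)
n_sims = 10_000

home = rng.normal(loc=27.0, scale=7.0, size=n_sims).clip(min=0).round()
away = rng.normal(loc=24.0, scale=7.0, size=n_sims).clip(min=0).round()

print(f"Home win probability: {(home > away).mean():.1%}")  # ties ignored here
print(f"Average score: {home.mean():.1f}-{away.mean():.1f}")
```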
4. Model Artifacts
Each trained model is saved as an "artifact" with:
- Model Hash: Unique identifier (e.g., "a3f2b1c9")
- Seasons: Training data used
- Metrics: Performance statistics
- Active Status: Whether it's used by default
You can have multiple models and switch between them. Only one can be "active" at a time.
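Conceptually, an artifact record looks something like the dictionary below. The field names and values are hypothetical; check the Model Registry for the actual schema.

```python
# Hypothetical artifact record (field names are illustrative only).
artifact = {
    "model_hash": "a3f2b1c9",
    "seasons": [2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024],
    "metrics": {"brier_score": 0.21, "log_loss": 0.62},
    "active": True,  # only one model can be active at a time
}
```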
Basic Workflow
Here's the typical workflow for using the system:
Step 1: Train a Model
First, you need a trained model. Go to Train Model and:
- Select seasons (e.g., "2016,2017,2018,2019,2020,2021,2022,2023,2024")
- Choose profile (start with "Fast" for ~15 minutes)
- Click "Start Training"
- Wait for completion (check job status)
Note: Training can take 15 minutes to several hours depending on profile.
Step 2: Run a Simulation
Once you have an active model:
- Go to New Simulation
- Enter teams (e.g., Home: "KC", Away: "BUF")
- Enter QBs (e.g., "Patrick Mahomes", "Josh Allen")
- Set number of simulations (default 10,000 is good)
- Click "Submit Simulation"
- View results on the job detail page
Step 3: Interpret Results
The results show:
- Win Probability: % chance home team wins
- Score Distribution: Average scores and ranges
- Spread/Total: Predicted point spread and total points
- Quantiles: 5th, 25th, 50th, 75th, 95th percentiles
Training a Model: Detailed Guide
When to Train
- First time using the system
- After new season data becomes available
- When you want to test different training configurations
- When you want to use different season ranges
Training Parameters Explained
Seasons
Format: Comma-separated list (e.g., "2016,2017,2018,2019,2020,2021,2022,2023,2024")
Recommendation: Use recent seasons (last 5-8 years). Older data may be less relevant due to rule changes.
Training Profile
| Profile | Time | Samples | Chains | Use Case |
|---------|------|---------|--------|----------|
| Dev | ~5 min | 50k | 2 | Quick testing, development |
| Fast | ~15 min | 120k | 2 | Standard use, good balance |
| Full | ~2+ hours | All data | 4 | Production, best accuracy |
| Overnight | ~4+ hours | All data | 4 (NUTS) | Highest quality, run overnight |
Inference Method
- Auto: Uses ADVI for dev/fast, NUTS for full (recommended)
- ADVI: Faster, approximate Bayesian inference
- NUTS: Slower, exact MCMC sampling (best quality)
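If you are curious what the two methods look like in code, here is a toy contrast assuming a PyMC-style workflow; the engine's actual model and training code are not shown in this guide.

```python
import pymc as pm

# Toy model contrasting ADVI and NUTS, assuming a PyMC-style workflow.
with pm.Model():
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("obs", mu, 1.0, observed=[0.1, -0.3, 0.2])

    # ADVI: fast variational approximation (what dev/fast profiles favor)
    approx = pm.fit(n=10_000, method="advi")
    advi_draws = approx.sample(1_000)

    # NUTS: slower but higher-quality MCMC (what full/overnight favor)
    nuts_draws = pm.sample(draws=1_000, chains=2)
```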
Example: Your First Training
For Beginners:
- Go to Train Model
- Seasons: 2016,2017,2018,2019,2020,2021,2022,2023,2024
- Profile: Fast (good balance of speed and quality)
- Inference: Auto
- Click "Start Training"
- Wait ~15 minutes (check job status page)
- Model will be set as active automatically
Monitoring Training Progress
After starting training:
- You'll be redirected to the job detail page
- Watch the log for progress updates
- Status will change: Queued → Running → Succeeded
- When complete, model appears in Model Registry
Running Simulations: Detailed Guide
Required Information
- Home Team: 2-3 letter abbreviation (e.g., "KC", "BUF", "SF")
- Away Team: 2-3 letter abbreviation
- Home QB: Full name (e.g., "Patrick Mahomes")
- Away QB: Full name (e.g., "Josh Allen")
Optional Parameters
- Number of Simulations: Default 10,000 is good. More simulations reduce random noise in the results but take longer to run.
- Random Seed: For reproducibility (default 42 is fine)
- Model: Use active model (default) or select specific model
- Market CSV: Path to betting market data (advanced)
- Market Blend Weight: 0-1, how much to blend with market (0 = model only)
Example: Chiefs vs Bills
Scenario: You want to simulate Chiefs (home) vs Bills (away)
- Go to New Simulation
- Home Team: KC
- Away Team: BUF
- Home QB: Patrick Mahomes
- Away QB: Josh Allen
- Simulations: 10000 (default)
- Click "Submit Simulation"
Result: You'll see win probability, score distributions, and more.
Understanding QB Names
QB names should match how they appear in NFL data. Common formats:
- "Patrick Mahomes" (not "P. Mahomes" or "Mahomes")
- "Josh Allen" (not "J. Allen")
- "Lamar Jackson" (full name)
Tip: If unsure, check recent game logs or use the QB's full name as it appears in official NFL stats.
Market Calibration (Advanced)
Market calibration blends your model's predictions with betting market consensus:
- w = 0.0: Use only model predictions (default)
- w = 0.5: Equal blend of model and market
- w = 1.0: Use only market predictions
When to use: If you have betting market data and want to incorporate market wisdom.
Requires a CSV file with columns: game_id, spread_home, total, home_team, away_team.
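The blend itself is a simple weighted average. A minimal sketch, assuming a linear mix on win probability (the engine may also blend other quantities such as spread and total):

```python
# Minimal sketch of market blending, assuming a linear mix on win
# probability; w matches the Market Blend Weight parameter above.
def blend(model_prob: float, market_prob: float, w: float) -> float:
    """w = 0.0 -> model only; w = 1.0 -> market only."""
    return (1 - w) * model_prob + w * market_prob

print(blend(model_prob=0.582, market_prob=0.550, w=0.3))  # 0.5724
```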
Running Backtests: Detailed Guide
What is a Backtest?
A backtest evaluates model accuracy by:
- Running simulations on historical games
- Comparing predictions to actual outcomes
- Calculating metrics (Brier score, log loss, calibration)
When to Run Backtests
- After training a new model (validate performance)
- Comparing different models
- Evaluating model improvements
- Understanding model strengths/weaknesses
Backtest Parameters
- Seasons: Which seasons to test on (e.g., "2023,2024")
- Simulations per Game: Default 5,000 is good
- Model: Which model to test (default: active model)
- Market CSV: Optional, for comparing to market
Example: Testing 2023 Season
- Go to Run Backtest
- Seasons: 2023
- Simulations: 5000 (default)
- Model: Use active model
- Click "Submit Backtest"
- Wait for completion (can take 30+ minutes)
- View results: accuracy metrics, calibration curves, etc.
Understanding Backtest Results
The backtest report includes:
- Brier Score: Lower is better (measures prediction accuracy)
- Log Loss: Lower is better (measures probability calibration)
- Calibration Curve: Shows if probabilities match actual frequencies
- Game-by-Game Results: Predictions vs actual for each game
Step-by-Step Examples
Example 1: Complete Beginner Workflow
Goal: Predict Chiefs vs Bills game
Step 1: Train Your First Model
- Click "Train Model" in navigation
- Seasons field: Enter 2016,2017,2018,2019,2020,2021,2022,2023,2024
- Profile dropdown: Select "Fast"
- Inference: Leave as "Auto"
- Click "Start Training"
- Wait ~15 minutes (watch the job status page)
Step 2: Run Your First Simulation
- Once training completes, click "New Simulation"
- Home Team: KC
- Away Team: BUF
- Home QB: Patrick Mahomes
- Away QB: Josh Allen
- Leave other fields as defaults
- Click "Submit Simulation"
Step 3: Interpret Results
On the results page, you'll see:
- Win Probability: e.g., "58.2%" means Chiefs win 58.2% of simulations
- Score Distribution: Average scores and ranges
- Spread: Predicted point spread (e.g., "Chiefs by 3.2 points")
Example 2: Comparing Two Models
Goal: See if a model trained on recent seasons performs better
Step 1: Train Model A (All Seasons)
- Train with seasons: 2016,2017,2018,2019,2020,2021,2022,2023,2024
- Profile: Fast
- Note the model hash (e.g., "a3f2b1c9")
Step 2: Train Model B (Recent Seasons Only)
- Train with seasons: 2020,2021,2022,2023,2024
- Profile: Fast
- Note the model hash (e.g., "b4e3c2d1")
Step 3: Run Same Simulation with Both Models
- Run simulation: Chiefs vs Bills
- First time: Use Model A (select from dropdown)
- Second time: Use Model B (select from dropdown)
- Compare results: Which gives more realistic predictions?
Step 4: Backtest Both Models
- Run backtest on 2023 season with Model A
- Run backtest on 2023 season with Model B
- Compare Brier scores: Lower is better
Example 3: Using Market Calibration
Goal: Blend model predictions with betting market data
Step 1: Prepare Market Data
Create a CSV file with columns:
game_id,spread_home,total,home_team,away_team
2024_1_KC_BUF,-3.5,52.5,KC,BUF
2024_1_SF_DAL,7.0,48.0,SF,DAL
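Before pointing the simulator at the file, you can sanity-check it. A quick example, assuming pandas is available on the server:

```python
import pandas as pd

# Quick sanity check of the market CSV (assumes pandas is installed).
df = pd.read_csv("market_data.csv")
expected = {"game_id", "spread_home", "total", "home_team", "away_team"}
missing = expected - set(df.columns)
assert not missing, f"Missing columns: {missing}"
print(df.head())
```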
Step 2: Upload CSV
Place CSV file on server (e.g., /home/azureuser/nfl-sim/market_data.csv)
Step 3: Run Simulation with Market Blend
- Go to New Simulation
- Enter teams and QBs
- Market CSV Path: /home/azureuser/nfl-sim/market_data.csv
- Market Blend Weight: 0.3 (30% market, 70% model)
- Submit simulation
Result: Predictions blend your model (70%) with market consensus (30%)
Interpreting Results
Simulation Results
Win Probability
Example: "58.2%" means the home team wins in 58.2% of simulations.
- 50% = toss-up
- >60% = strong favorite
- <40% = strong underdog
Score Distribution
Example: "Home: 27.3 (std: 7.2), Away: 24.1 (std: 6.8)"
- Mean: Average score across all simulations
- Std: Standard deviation (higher = more uncertainty)
Quantiles
Example: "Home Score 5th percentile: 15, 95th percentile: 38"
- 5th percentile: Score exceeded in 95% of simulations (low end)
- 50th percentile (median): Middle score
- 95th percentile: Score exceeded in only 5% of simulations (high end)
Interpretation: "There's a 90% chance the home team scores between 15 and 38 points."
Spread
Example: "Spread: 3.2 (std: 10.1)"
- Mean: Average point difference (home - away)
- Positive = home team favored
- Negative = away team favored
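To see how these statistics relate to raw simulation output, here is a sketch that uses synthetic draws in place of the engine's per-simulation scores:

```python
import numpy as np

# Synthetic draws standing in for the engine's per-simulation scores.
rng = np.random.default_rng(42)
home = rng.normal(27.3, 7.2, size=10_000)
away = rng.normal(24.1, 6.8, size=10_000)

spread = home - away  # positive = home team favored
print(f"Win probability: {(spread > 0).mean():.1%}")
print("Home quantiles:", np.quantile(home, [0.05, 0.25, 0.5, 0.75, 0.95]).round(1))
print(f"Spread: {spread.mean():.1f} (std: {spread.std():.1f})")
```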
Backtest Results
Brier Score
Measures prediction accuracy. Range: 0 (perfect) to 1 (worst).
- <0.20 = Excellent
- 0.20-0.25 = Good
- >0.25 = Needs improvement
Log Loss
Measures probability calibration. Lower is better.
- <0.50 = Excellent
- 0.50-0.70 = Good
- >0.70 = Needs improvement
Calibration Curve
Shows whether predicted probabilities match actual win frequencies, with predicted probability on the x-axis and observed frequency on the y-axis.
- On the diagonal: Perfect calibration
- Below the diagonal: Observed frequency lower than predicted (model is overconfident)
- Above the diagonal: Observed frequency higher than predicted (model is underconfident)
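For reference, the two headline metrics can be computed in a few lines. A sketch on hypothetical predictions p and outcomes y (1 = home win):

```python
import numpy as np

# Backtest metrics on hypothetical predictions and outcomes.
p = np.array([0.58, 0.72, 0.31, 0.65, 0.45])  # predicted home win probability
y = np.array([1, 1, 0, 0, 1])                  # actual outcome (1 = home win)

brier = np.mean((p - y) ** 2)  # lower is better
eps = 1e-12                    # guard against log(0)
log_loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
print(f"Brier: {brier:.3f}, Log loss: {log_loss:.3f}")
```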
Troubleshooting
Common Issues
Training Job Stuck on "Queued"
Problem: Job never starts running
Solutions:
- Check Celery workers are running: sudo systemctl status gamesim-celery-train
- Restart workers if needed: sudo systemctl restart gamesim-celery-train
- Check logs: sudo journalctl -u gamesim-celery-train -f
Simulation Returns Unexpected Results
Problem: Win probabilities seem wrong
Solutions:
- Verify QB names match NFL data exactly
- Check team abbreviations are correct (2-3 letter codes, e.g., KC, BUF)
- Ensure model is trained on recent seasons
- Try increasing number of simulations (10,000+ recommended)
Model Training Fails
Problem: Training job fails with error
Solutions:
- Check job logs for specific error message
- Verify data is available for selected seasons
- Try smaller season range first
- Check disk space: df -h
- Check memory: free -h
Can't Find QB in Model
Problem: QB name not recognized
Solutions:
- Use full name as it appears in NFL stats (e.g., "Patrick Mahomes" not "P. Mahomes")
- Check if QB played in training seasons
- Try "Unknown" if QB is not in training data (model will use average QB effect)
Getting Help
- Check job logs for detailed error messages
- Review model metrics in Model Registry
- Compare with known-good examples
- Check system resources (CPU, memory, disk)
Quick Reference
Training Profiles
- Dev: ~5 min, quick test
- Fast: ~15 min, standard
- Full: ~2+ hours, comprehensive
- Overnight: ~4+ hours, highest quality
Recommended Settings
- Seasons: Last 5-8 years
- Simulations: 10,000
- Profile: Fast (for most users)
- Inference: Auto
Team Abbreviations
- Use 2-3 letter codes
- Examples: KC, BUF, SF, DAL
- Case doesn't matter