NFL Simulation Engine

Help & Documentation

Learn how to use the NFL Simulation Engine. This guide covers everything from basic concepts to advanced workflows.

Overview: What Does This System Do?

The NFL Simulation Engine predicts NFL game outcomes using a Bayesian statistical model. Here's what it does:

Example Use Case

You want to predict the outcome of Chiefs vs Bills. The system:

  1. Uses the trained model to estimate each team's offensive and defensive strength
  2. Accounts for QB quality (Mahomes vs Allen)
  3. Simulates the game 10,000 times
  4. Reports: "Chiefs win 58% of simulations, average score 27-24"

Key Concepts

1. EPA (Expected Points Added)

EPA measures how much a play changes a team's expected points: the expected points after the play minus the expected points before it. A +3 EPA play means the team gained about 3 points of expected value. The model predicts EPA for each play based on:

2. Model Training

Training teaches the model patterns from historical data. You specify:

3. Simulation

Simulation runs the game thousands of times using the trained model. Each simulation:

  1. Generates drives using EPA predictions
  2. Converts drive EPA to points
  3. Alternates possessions between teams
  4. Records final score

After 10,000 simulations, you get distributions: "Chiefs win 58% of the time, average score 27-24."
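The four steps above can be sketched as a small Monte Carlo loop. This is only an illustration, not the engine's actual code: the normal distribution for drive EPA, the thresholds in `drive_points`, the 11 drives per team, and all function names are assumptions made for the example.

```python
import random

def drive_points(drive_epa):
    # Hypothetical conversion from drive EPA to points:
    # a strong drive is a touchdown, a decent one a field goal.
    if drive_epa >= 4.0:
        return 7
    if drive_epa >= 1.5:
        return 3
    return 0

def simulate_game(home_epa_mean, away_epa_mean, rng, drives_per_team=11, epa_sd=3.0):
    """Simulate one game and return (home_points, away_points)."""
    home, away = 0, 0
    for _ in range(drives_per_team):
        # Alternate possessions: one home drive, then one away drive.
        home += drive_points(rng.gauss(home_epa_mean, epa_sd))
        away += drive_points(rng.gauss(away_epa_mean, epa_sd))
    return home, away

def run_simulations(home_epa_mean, away_epa_mean, n_sims=10_000, seed=0):
    """Repeat the game n_sims times and summarize the recorded scores."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    results = [simulate_game(home_epa_mean, away_epa_mean, rng) for _ in range(n_sims)]
    home_wins = sum(1 for h, a in results if h > a)
    return {
        "home_win_pct": home_wins / n_sims,
        "avg_home": sum(h for h, _ in results) / n_sims,
        "avg_away": sum(a for _, a in results) / n_sims,
    }

summary = run_simulations(home_epa_mean=1.0, away_epa_mean=0.5)
print(summary)
```

Fixing the seed makes two runs with the same inputs return identical summaries, which is useful when comparing settings.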

4. Model Artifacts

Each trained model is saved as an "artifact" with:

You can have multiple models and switch between them. Only one can be "active" at a time.

Basic Workflow

Here's the typical workflow for using the system:

Step 1: Train a Model

First, you need a trained model. Go to Train Model and:

  1. Select seasons (e.g., "2016,2017,2018,2019,2020,2021,2022,2023,2024")
  2. Choose a profile (start with "Fast", which takes about 15 minutes)
  3. Click "Start Training"
  4. Wait for completion (check job status)

Note: Training takes from about 5 minutes (Dev) to 4+ hours (Overnight), depending on the profile.

Step 2: Run a Simulation

Once you have an active model:

  1. Go to New Simulation
  2. Enter teams (e.g., Home: "KC", Away: "BUF")
  3. Enter QBs (e.g., "Patrick Mahomes", "Josh Allen")
  4. Set the number of simulations (the default of 10,000 is a good choice)
  5. Click "Submit Simulation"
  6. View results on the job detail page

Step 3: Interpret Results

The results show:

  • Win Probability: Percentage of simulations in which the home team wins
  • Score Distribution: Average scores and ranges
  • Spread/Total: Predicted point spread and total points
  • Quantiles: 5th, 25th, 50th, 75th, 95th percentiles
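As a rough illustration of how these four outputs could be derived from a list of simulated final scores, here is a minimal sketch. The function names, the nearest-rank percentile method, and the sign convention (spread as home score minus away score, with ties not counted as home wins) are assumptions for the example, not a description of the engine's internals.

```python
import math
import statistics

def percentile(sorted_values, p):
    """Nearest-rank percentile for p in (0, 100]."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def summarize(home_scores, away_scores):
    """Summarize paired simulated scores (one entry per simulation)."""
    n = len(home_scores)
    margins = [h - a for h, a in zip(home_scores, away_scores)]
    totals = [h + a for h, a in zip(home_scores, away_scores)]
    home_sorted = sorted(home_scores)
    return {
        # Share of simulations the home team wins outright.
        "home_win_prob": sum(m > 0 for m in margins) / n,
        "avg_home": statistics.fmean(home_scores),
        "avg_away": statistics.fmean(away_scores),
        # Assumed convention: positive spread means the home team is ahead.
        "spread": statistics.fmean(margins),
        "total": statistics.fmean(totals),
        "home_quantiles": {p: percentile(home_sorted, p) for p in (5, 25, 50, 75, 95)},
    }

report = summarize([20, 24, 27, 31, 17], [17, 27, 24, 28, 20])
print(report)
```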

Training a Model: Detailed Guide

When to Train

Training Parameters Explained

Seasons

Format: Comma-separated list (e.g., "2016,2017,2018,2019,2020,2021,2022,2023,2024")
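If you build the seasons string in a script, a small check like the following can catch typos before a job is submitted. This is a sketch only; `parse_seasons` is a hypothetical helper, and the engine's own validation rules may differ.

```python
def parse_seasons(text):
    """Parse a comma-separated seasons string into a sorted list of unique years."""
    seasons = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas and stray spaces
        if not part.isdigit() or len(part) != 4:
            raise ValueError(f"not a four-digit season: {part!r}")
        seasons.add(int(part))
    if not seasons:
        raise ValueError("no seasons given")
    return sorted(seasons)

print(parse_seasons("2020,2021,2022,2023,2024"))
```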

Recommendation: Use recent seasons (last 5-8 years). Older data may be less relevant due to rule changes.

Training Profile

Profile     Time        Samples    Chains      Use Case
Dev         ~5 min      50k        2           Quick testing, development
Fast        ~15 min     120k       2           Standard use, good balance
Full        ~2+ hours   All data   4           Production, best accuracy
Overnight   ~4+ hours   All data   4 (NUTS)    Highest quality, run overnight

Inference Method

Example: Your First Training

For Beginners:

  1. Go to Train Model
  2. Seasons: 2016,2017,2018,2019,2020,2021,2022,2023,2024
  3. Profile: Fast (good balance of speed and quality)
  4. Inference: Auto
  5. Click "Start Training"
  6. Wait ~15 minutes (check job status page)
  7. The model is set as active automatically

Monitoring Training Progress

After starting training:

  1. You'll be redirected to the job detail page
  2. Watch the log for progress updates
  3. Status will change: Queued → Running → Succeeded
  4. When training completes, the model appears in the Model Registry

Running Simulations: Detailed Guide

Required Information

Optional Parameters

Example: Chiefs vs Bills

Scenario: You want to simulate Chiefs (home) vs Bills (away)

  1. Go to New Simulation
  2. Home Team: KC
  3. Away Team: BUF
  4. Home QB: Patrick Mahomes
  5. Away QB: Josh Allen
  6. Simulations: 10000 (default)
  7. Click "Submit Simulation"

Result: You'll see win probability, score distributions, and more.

Understanding QB Names

QB names should match how they appear in NFL data. Common formats:

Tip: If unsure, check recent game logs or use the QB's full name as it appears in official NFL stats.

Market Calibration (Advanced)

Market calibration blends your model's predictions with betting market consensus:

When to use: If you have betting market data and want to incorporate market wisdom. Requires a CSV file with columns: game_id, spread_home, total, home_team, away_team.

Running Backtests: Detailed Guide

What is a Backtest?

A backtest evaluates model accuracy by:

  1. Running simulations on historical games
  2. Comparing predictions to actual outcomes
  3. Calculating metrics (Brier score, log loss, calibration)

When to Run Backtests

Backtest Parameters

Example: Testing 2023 Season

  1. Go to Run Backtest
  2. Seasons: 2023
  3. Simulations: 5000 (default)
  4. Model: Use active model
  5. Click "Submit Backtest"
  6. Wait for completion (can take 30+ minutes)
  7. View results: accuracy metrics, calibration curves, etc.

Understanding Backtest Results

The backtest report includes:

Step-by-Step Examples

Example 1: Complete Beginner Workflow

Goal: Predict Chiefs vs Bills game

Step 1: Train Your First Model

  1. Click "Train Model" in navigation
  2. Seasons field: Enter 2016,2017,2018,2019,2020,2021,2022,2023,2024
  3. Profile dropdown: Select "Fast"
  4. Inference: Leave as "Auto"
  5. Click "Start Training"
  6. Wait ~15 minutes (watch the job status page)

Step 2: Run Your First Simulation

  1. Once training completes, click "New Simulation"
  2. Home Team: KC
  3. Away Team: BUF
  4. Home QB: Patrick Mahomes
  5. Away QB: Josh Allen
  6. Leave other fields as defaults
  7. Click "Submit Simulation"

Step 3: Interpret Results

On the results page, you'll see:

  • Win Probability: e.g., "58.2%" means Chiefs win 58.2% of simulations
  • Score Distribution: Average scores and ranges
  • Spread: Predicted point spread (e.g., "Chiefs by 3.2 points")

Example 2: Comparing Two Models

Goal: See if a model trained on recent seasons performs better

Step 1: Train Model A (All Seasons)

  1. Train with seasons: 2016,2017,2018,2019,2020,2021,2022,2023,2024
  2. Profile: Fast
  3. Note the model hash (e.g., "a3f2b1c9")

Step 2: Train Model B (Recent Seasons Only)

  1. Train with seasons: 2020,2021,2022,2023,2024
  2. Profile: Fast
  3. Note the model hash (e.g., "b4e3c2d1")

Step 3: Run Same Simulation with Both Models

  1. Run simulation: Chiefs vs Bills
  2. First time: Use Model A (select from dropdown)
  3. Second time: Use Model B (select from dropdown)
  4. Compare results: Which gives more realistic predictions?

Step 4: Backtest Both Models

  1. Run backtest on 2023 season with Model A
  2. Run backtest on 2023 season with Model B
  3. Compare Brier scores: Lower is better

Example 3: Using Market Calibration

Goal: Blend model predictions with betting market data

Step 1: Prepare Market Data

Create a CSV file with columns:

game_id,spread_home,total,home_team,away_team
2024_1_KC_BUF,-3.5,52.5,KC,BUF
2024_1_SF_DAL,7.0,48.0,SF,DAL
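Before placing the file on the server, it can help to confirm that it has the expected columns and that the numeric fields parse. The sketch below uses only Python's standard `csv` module; `load_market_rows` is a hypothetical helper, not part of the engine, and upper-casing the team codes is an assumption based on team abbreviations being case-insensitive.

```python
import csv
import io

REQUIRED_COLUMNS = ["game_id", "spread_home", "total", "home_team", "away_team"]

SAMPLE_CSV = """game_id,spread_home,total,home_team,away_team
2024_1_KC_BUF,-3.5,52.5,KC,BUF
2024_1_SF_DAL,7.0,48.0,SF,DAL
"""

def load_market_rows(csv_text):
    """Parse market CSV text, checking the header and numeric fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        rows.append({
            "game_id": row["game_id"],
            "spread_home": float(row["spread_home"]),  # raises ValueError if not numeric
            "total": float(row["total"]),
            "home_team": row["home_team"].upper(),
            "away_team": row["away_team"].upper(),
        })
    return rows

for row in load_market_rows(SAMPLE_CSV):
    print(row)
```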

Step 2: Upload CSV

Place the CSV file on the server (e.g., /home/azureuser/nfl-sim/market_data.csv)

Step 3: Run Simulation with Market Blend

  1. Go to New Simulation
  2. Enter teams and QBs
  3. Market CSV Path: /home/azureuser/nfl-sim/market_data.csv
  4. Market Blend Weight: 0.3 (30% market, 70% model)
  5. Submit simulation

Result: Predictions blend your model (70%) with market consensus (30%)
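The weighting can be read as a simple linear blend. The sketch below assumes that is how the weight is applied and that both inputs use the same convention (for example, both expressed as expected home margin in points); the engine may blend a different quantity, and `blend` is a hypothetical name.

```python
def blend(model_value, market_value, market_weight):
    """Linear blend: market_weight on the market, the remainder on the model."""
    if not 0.0 <= market_weight <= 1.0:
        raise ValueError("market_weight must be between 0 and 1")
    return market_weight * market_value + (1.0 - market_weight) * model_value

# Model says home by 3.2, market says home by 3.5, weight 0.3 on the market.
print(blend(model_value=3.2, market_value=3.5, market_weight=0.3))
```

With a weight of 0 the result is the model's value unchanged, and with a weight of 1 it is the market's value.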

Interpreting Results

Simulation Results

Win Probability

Example: "58.2%" means the home team wins in 58.2% of simulations.

Score Distribution

Example: "Home: 27.3 (std: 7.2), Away: 24.1 (std: 6.8)"

Quantiles

Example: "Home Score 5th percentile: 15, 95th percentile: 38"

Interpretation: "There's a 90% chance the home team scores between 15 and 38 points."

Spread

Example: "Spread: 3.2 (std: 10.1)"

Backtest Results

Brier Score

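Measures prediction accuracy as the mean squared difference between the predicted win probability and the actual outcome (1 for a win, 0 for a loss). Range: 0 (perfect) to 1 (worst). Always predicting 50% scores 0.25, so a useful model should score below that.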
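For reference, the standard Brier score for win/loss predictions can be computed as below. This is a generic sketch of the metric, with a hypothetical function name, and not the engine's reporting code.

```python
def brier_score(predicted, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not predicted or len(predicted) != len(outcomes):
        raise ValueError("need two non-empty lists of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(predicted)

# Two games: predicted 80% (home won) and 30% (home lost).
print(brier_score([0.8, 0.3], [1, 0]))
```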

Log Loss

Measures the quality of predicted probabilities, penalizing confident wrong predictions most heavily. Lower is better.
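A generic sketch of binary log loss follows. The clipping constant `eps` and the function name are choices made for this example; clipping avoids taking the logarithm of zero when a prediction is exactly 0 or 1.

```python
import math

def log_loss(predicted, outcomes, eps=1e-15):
    """Average negative log-likelihood of 0/1 outcomes under the predictions."""
    total = 0.0
    for p, y in zip(predicted, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(predicted)

# A confident correct prediction costs less than a hesitant one.
print(log_loss([0.9], [1]), log_loss([0.6], [1]))
```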

Calibration Curve

Shows whether predicted probabilities match observed frequencies. For example, games predicted at 70% should be won about 70% of the time.
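One common way to build such a curve is to group predictions into equal-width probability bins and compare each bin's average prediction with the share of games actually won. The sketch below assumes that approach; the bin count and the function name are illustrative only.

```python
def calibration_bins(predicted, outcomes, n_bins=10):
    """Return one row per non-empty bin: mean prediction, observed win rate, count."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue  # skip bins with no predictions
        count = len(members)
        rows.append({
            "mean_predicted": sum(p for p, _ in members) / count,
            "observed_rate": sum(y for _, y in members) / count,
            "count": count,
        })
    return rows

for row in calibration_bins([0.1, 0.15, 0.9, 0.95], [0, 0, 1, 1], n_bins=2):
    print(row)
```

In a well-calibrated model, `mean_predicted` and `observed_rate` are close in every bin that has enough games to be meaningful.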

Troubleshooting

Common Issues

Training Job Stuck on "Queued"

Problem: Job never starts running

Solutions:

Simulation Returns Unexpected Results

Problem: Win probabilities seem wrong

Solutions:

Model Training Fails

Problem: Training job fails with error

Solutions:

Can't Find QB in Model

Problem: QB name not recognized

Solutions:

Getting Help

Quick Reference

Training Profiles

  • Dev: ~5 min, quick test
  • Fast: ~15 min, standard
  • Full: ~2+ hours, comprehensive
  • Overnight: ~4+ hours, highest quality

Recommended Settings

  • Seasons: Last 5-8 years
  • Simulations: 10,000
  • Profile: Fast (for most users)
  • Inference: Auto

Team Abbreviations

  • Use 2-3 letter codes
  • Examples: KC, BUF, SF, DAL
  • Case doesn't matter