F1 Race Review

How We Score Formula 1 Races

Our scoring system rates how worthwhile a race is to watch on a 1-10 scale by analyzing ten distinct dimensions of racing quality. Each dimension contributes to the final score according to its assigned weight, which reflects how much that dimension typically improves the spectator experience.

Key innovation: Dimension weights sum to 140% (not 100%), which opens multiple pathways to a high score. A race can excel through different characteristics, such as chaotic weather-affected running, strategic multi-stop battles, or pure wheel-to-wheel racing, and each of these can earn an excellent score on its own merits. Final scores are capped at 10.0.
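
As a quick arithmetic check, the five category weights given in the section headings on this page (20%, 23%, 40%, 22%, 35%) do add up to 140%. The sketch below is illustrative only: the dictionary name and its keys are assumptions made for this example, not identifiers from the scoring model.

```python
# Category weights in percent, as stated in the section headings of this page.
# The dictionary name and keys are illustrative, not taken from the real model.
CATEGORY_WEIGHTS_PCT = {
    "weather": 20,
    "interruptions": 23,
    "racing_quality": 40,
    "strategy": 22,
    "competition_and_finishing_order": 35,
}

total_pct = sum(CATEGORY_WEIGHTS_PCT.values())

# If every dimension were normalized to 10.0, the uncapped raw score
# would be 10 * total_pct / 100.
max_raw_score = 10 * total_pct / 100

print(total_pct)      # 140
print(max_raw_score)  # 14.0
```

This is why the cap at 10.0 is needed: without it, a race that maxed out every dimension would score 14.0.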

The scoring model processes race data from external sources, normalizes each metric to a 0-10 scale, applies dimension weights, and produces a composite score that represents overall race quality.

Weather (20% of total score)

Rain Factor

What it measures:

Weather conditions during the race, indicating whether it remained dry or featured rain and mixed wet-dry periods.

How it's calculated:

Binary measurement. Races with rainfall score 10.0, and dry races score 0.0.

Why it matters:

Rain introduces unpredictability by reducing grip, increasing braking distances, and forcing strategic decisions about tyre selection and timing. Wet conditions typically produce position changes and highlight driver skill in car control. The high weight reflects rain's consistent impact on race quality across the sport's history.

Interruptions (23% of total score)

Race Interruptions

What it measures:

Combined total of safety car deployments, virtual safety car periods, and red flag stoppages that interrupted the race.

How it's calculated:

Linear scale from 0 to the maximum observed value. Higher counts produce higher scores.

Why it matters:

Interruptions bunch the field, eliminate time gaps, and create restart scenarios that enable position changes. They signal incidents or mechanical failures that reshape race strategy and create uncertainty about the final outcome.

DNF Factor

What it measures:

Combined count of drivers who did not finish (DNF), did not start (DNS), or were disqualified (DSQ) from the race.

How it's calculated:

Linear scale from 0 to the maximum observed value. Higher counts produce higher scores.

Why it matters:

Retirements indicate mechanical unreliability, driver errors, or contact incidents. High DNF counts suggest challenging conditions or an error-inducing circuit layout. Multiple retirements can promote lower-placed drivers into points positions and affect championship implications.
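
The combined count described above could be computed along these lines. This is a minimal sketch: the status strings and the function name are assumptions made for illustration, and a real data source may label results differently.

```python
# Hypothetical status labels; a real data source may use different strings.
NON_FINISH_STATUSES = {"DNF", "DNS", "DSQ"}


def count_non_finishers(statuses):
    """Count drivers whose result status is DNF, DNS, or DSQ."""
    return sum(1 for status in statuses if status in NON_FINISH_STATUSES)


# Made-up example: seven entries, four of which did not finish,
# did not start, or were disqualified.
statuses = ["Finished", "Finished", "DNF", "DSQ", "Finished", "DNS", "DNF"]
print(count_non_finishers(statuses))  # 4
```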

Racing Quality (40% of total score)

Overtakes Top10

What it measures:

Number of on-track position changes between drivers who finished in the top ten classified positions.

How it's calculated:

Linear scale based on the overtake count.

Why it matters:

Overtakes within the top ten directly affect race outcome and podium positions. These passes typically involve championship contenders and indicate competitive racing where position isn't determined solely by qualifying performance or car advantage.

Overtakes Total

What it measures:

Total number of on-track position changes recorded throughout the entire race across all positions.

How it's calculated:

Linear scale based on the overtake count.

Why it matters:

High overtake numbers indicate racing where position changes are achievable, suggesting good circuit design for wheel-to-wheel combat or performance parity between cars. Low overtake counts often correlate with processional races where grid order determines finishing order.

Strategy (22% of total score)

Tyre Strategy Variety

What it measures:

Number of distinct tyre compound sequences used by the top 6 finishers, where different pit stop orders count as unique strategies.

How it's calculated:

Linear scale based on the count of unique strategies.

Why it matters:

Strategy diversity indicates teams took different approaches to the race, creating varied pit stop timing and different performance windows throughout the race distance. This produces natural pace differentials and battles between cars on different strategic paths.
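
One way to count distinct strategies while treating a different order of the same compounds as a different strategy is to compare ordered tuples. The sketch below assumes each finisher's race is available as a list of compound names in the order used; the function name and compound labels are hypothetical.

```python
def strategy_variety(sequences):
    """Count distinct ordered compound sequences among the given finishers."""
    # Tuples are order-sensitive, so ("MEDIUM", "HARD") and
    # ("HARD", "MEDIUM") are counted as two different strategies.
    return len({tuple(seq) for seq in sequences})


# Made-up compound sequences for six finishers.
top6 = [
    ["MEDIUM", "HARD"],
    ["MEDIUM", "HARD"],
    ["HARD", "MEDIUM"],
    ["SOFT", "MEDIUM", "HARD"],
    ["MEDIUM", "HARD"],
    ["SOFT", "HARD"],
]
print(strategy_variety(top6))  # 4
```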

Unique Tyre Compounds

What it measures:

Number of different tyre compound types (dry, intermediate, wet) used by five or more drivers during the race.

How it's calculated:

Linear scale based on the count of unique compound types.

Why it matters:

Multiple compound types in play suggest evolving conditions or wide strategic choices. More compounds create performance differentials as cars on different tyres have different grip levels and degradation rates.
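
The "five or more drivers" threshold can be sketched as a per-type tally in which each driver counts at most once per compound type. The input shape (a mapping from driver to the compound types they ran), the function name, and the driver labels are assumptions for illustration.

```python
from collections import Counter


def widely_used_compound_types(driver_compounds, min_drivers=5):
    """Count compound types used by at least `min_drivers` drivers."""
    usage = Counter()
    for compounds in driver_compounds.values():
        # set() makes a driver count once per type, however many stints.
        usage.update(set(compounds))
    return sum(1 for drivers in usage.values() if drivers >= min_drivers)


# Made-up race: six drivers ran dry tyres, five ran intermediates,
# and only two ran full wets.
race = {
    "driver_1": ["dry", "intermediate", "dry"],
    "driver_2": ["dry", "intermediate"],
    "driver_3": ["dry", "intermediate", "wet"],
    "driver_4": ["dry", "intermediate"],
    "driver_5": ["dry", "intermediate", "wet"],
    "driver_6": ["dry"],
}
print(widely_used_compound_types(race))  # 2
```

In this example the wet tyre does not count, because only two drivers used it.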

Competition & Finishing Order (35% of total score)

Top3 Gap

What it measures:

Time gap in seconds between the race winner and the second-place finisher at the checkered flag.

How it's calculated:

Inverse scale where smaller gaps score higher. 0 seconds = 10.0, and gaps of 30 seconds or more = 0.0.

Why it matters:

Close finishes indicate genuine competition for the win rather than dominant performance by a single car or driver. Small gaps suggest the leader was under pressure throughout the race distance.

Grid Chaos

What it measures:

Weighted score measuring dramatic position changes from qualifying grid to race finish, with higher scores for top-five starters who drop back or backmarkers who finish on the podium.

How it's calculated:

Linear scale based on the observed value relative to the maximum.

Why it matters:

High volatility indicates the race shuffled the competitive order, meaning qualifying didn't determine the outcome. This suggests overtaking was possible, strategy played a role, or race incidents affected the natural pecking order.

Team Variety

What it measures:

Number of different constructor teams represented among the top five finishing positions.

How it's calculated:

Linear scale based on the observed value relative to the maximum.

Why it matters:

Greater team variety in top positions indicates competitive parity and reduces dominance by a single team. Five different teams in the top five suggests open competition; one or two teams suggests performance concentration.
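
Counting the teams represented in the top five amounts to counting distinct names. A minimal sketch, with a hypothetical function name and placeholder team names:

```python
def team_variety(top5_teams):
    """Count distinct constructor teams among the given finishers."""
    return len(set(top5_teams))


# Placeholder team names in finishing order, first to fifth.
print(team_variety(["Team A", "Team A", "Team B", "Team C", "Team B"]))  # 3
```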

Normalization Rules

Boolean normalization

Applied to binary conditions like rainfall. True conditions score 10.0, false conditions score 0.0.

Linear normalization

Applied to counting metrics like overtakes or interruptions. Scales the observed value against the maximum value in the dataset: (value ÷ maximum) × 10, capped at 10.0.
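
The formula above translates directly into code. In this sketch the function name is hypothetical, and returning 0.0 when the maximum is not positive is an assumption, since the page does not say how an empty or all-zero dataset is handled.

```python
def normalize_linear(value, maximum):
    """Scale `value` against `maximum` onto a 0-10 range, capped at 10.0."""
    if maximum <= 0:
        # Assumption: with no positive maximum there is nothing to scale against.
        return 0.0
    return min(10.0, (value / maximum) * 10)


print(normalize_linear(3, 6))  # 5.0
print(normalize_linear(8, 6))  # 10.0 (capped)
```

The cap only comes into play when a value exceeds the stored maximum, for example a new race that sets a record before the maximum is refreshed.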

Inverse gap normalization

Applied to time gaps where smaller is better. Maps gaps from 0 seconds (a perfect 10.0) to 30 seconds or more (0.0) using the formula: 10 - (gap ÷ 30) × 10, with the result floored at 0.0 so that gaps above 30 seconds do not go negative.
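
A sketch of this mapping follows. The function and parameter names are hypothetical, and the explicit clamp to the 0-10 range is how this example keeps gaps above 30 seconds at 0.0.

```python
def normalize_inverse_gap(gap_seconds, max_gap=30.0):
    """Map a time gap to a 0-10 score where smaller gaps score higher."""
    score = 10 - (gap_seconds / max_gap) * 10
    # Clamp so gaps beyond max_gap score 0.0 instead of going negative.
    return max(0.0, min(10.0, score))


print(normalize_inverse_gap(0))   # 10.0
print(normalize_inverse_gap(15))  # 5.0
print(normalize_inverse_gap(45))  # 0.0
```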

Final Score Calculation

Each dimension's normalized score (0-10) is multiplied by its weight, and all weighted contributions are summed to produce the raw race score. Since dimension weights sum to 140%, raw scores can reach as high as 14.0 before the cap is applied.

Multiple pathways to excellence:

  • Chaos pathway (40%): Rain + interruptions can contribute up to 4.0 points
  • Racing pathway (80%): Overtakes + close finishes + strategy variety can contribute up to 8.0 points
  • Supporting factors (20%): Grid chaos + team variety + DNFs can contribute up to 2.0 points

This design means a race does not need to excel in every dimension to score highly. The racing pathway alone can contribute up to 8.0 points, and the chaos and supporting pathways can add up to 4.0 and 2.0 points on top of that. The final result is capped at 10.0, and truly exceptional races hit this ceiling by excelling in several pathways at once.
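
The weighted sum and cap can be sketched as follows. This page gives weights per category rather than per individual dimension, so the example simplifies by treating each category as one normalized score. That simplification, the function name, the dictionary keys, and the example scores are all assumptions for illustration, not the model's actual implementation.

```python
# Category weights in percent, taken from the section headings on this page.
CATEGORY_WEIGHTS_PCT = {
    "weather": 20,
    "interruptions": 23,
    "racing_quality": 40,
    "strategy": 22,
    "competition_and_finishing_order": 35,
}


def composite_score(normalized_scores, weights_pct=CATEGORY_WEIGHTS_PCT, cap=10.0):
    """Sum weight * normalized score over all categories, then apply the cap."""
    raw = sum(normalized_scores[name] * pct for name, pct in weights_pct.items()) / 100
    return min(cap, raw)


# Made-up race: wet, moderately interrupted, with decent racing.
example = {
    "weather": 10.0,
    "interruptions": 5.0,
    "racing_quality": 6.0,
    "strategy": 5.0,
    "competition_and_finishing_order": 4.0,
}
print(composite_score(example))  # 8.05

# A race that maxed out every category would total 14.0 raw, capped to 10.0.
perfect = {name: 10.0 for name in CATEGORY_WEIGHTS_PCT}
print(composite_score(perfect))  # 10.0
```

The example race reaches 8.05 without a top score in any category except weather, which illustrates how contributions from several categories add up.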