Ratings & Models postmortem

RonJohn

Helluva Engineer
Messages
5,648
Yeah, Yards/Drive doesn't line up with both teams' total yards and total drives, and you can't make the numbers work by tossing out drives either. Something's amiss here. Also, there are other apparent errors in the raw data that call into question the overall integrity of the advanced stats for this game. The ESPN stats don't include King's interception, and the Game on Paper advanced stats that @slugboy reproduced here (which use ESPN data) show only 3 TFLs, yet ESPN counts 6. I thought perhaps ESPN was overcounting TFLs when multiple players participated in the tackle, but that doesn't line up either.
Even more than that, if each team had ten drives, 56.87 yards/drive is mathematically impossible. If the denominator is 10 and the numerator is a whole number, there will never be anything in the hundredths place. 56.8 is possible, but not anything of the form XX.X7. Penalties can't explain it either. GT had 8 penalties for 65 yards, and that includes both offense and defense. Even if you attribute ALL of those to the offense, it still isn't 568 yards (463 + 65 = 528). Something is wrong with the calculations, or with the description of the stat.
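The arithmetic in that argument can be checked with a short sketch. It only uses the numbers quoted in this post (10 drives, 56.87 yards/drive, 463 total yards, 65 penalty yards); the variable names are made up for illustration, and nothing here reflects how Game on Paper actually computes the stat.

```python
from decimal import Decimal

DRIVES = 10                        # drive count listed for each team
REPORTED_AVG = Decimal("56.87")    # posted Yards/Drive for GT
BOX_SCORE_YARDS = 463              # GT total yards from the box score
PENALTY_YARDS = 65                 # all GT penalty yards, offense and defense

# Total yardage the posted average would imply over 10 drives.
implied_total = REPORTED_AVG * DRIVES

# A real yardage total has to be a whole number.
is_whole = implied_total == implied_total.to_integral_value()

# Most generous case: credit every penalty yard to the offense.
best_case_total = BOX_SCORE_YARDS + PENALTY_YARDS

print(f"implied total: {implied_total} (whole number: {is_whole})")
print(f"box score plus all penalties: {best_case_total}")
```

Exact decimal arithmetic is used so the check is not muddied by floating-point noise.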
 

slugboy

Moderator
Staff member
Messages
13,979
How is the Yards/Drive calculated? It says each team had 10 drives, and GT-56.87, CO-40.92 Yards/Drive. However GT had 463 yards, not 568.

I looked at the drive data. It's still different than what ESPN has as total numbers but it matches their drive information (458 yards for GT, 305 for Colorado). So, either ESPN's totals are wrong, or their drive information is wrong, I guess.

GT:

Total Yards: 458
Average Drive Yards: 45.8

Start | Offense | Result | Description | Yards | EPA | Start WP% | End WP% | Score
Q1 15:00 | GT | Fumble | 2 plays, 11 yards, 0:25 | 11 | -4.85 | 62.30% | 54.70% | 0-0
Q1 12:08 | GT | Fumble | 4 plays, 27 yards, 1:46 | 27 | -3.29 | 44.60% | 40.50% | 0-7
Q1 9:48 | GT | Interception | 6 plays, 38 yards, 2:51 | 38 | 2.57 | 43.80% | 38.80% | 0-7
Q1 4:55 | GT | Field Goal | 13 plays, 84 yards, 6:30 | 84 | 4.55 | 42.40% | 53.00% | 3-7
Q2 10:25 | GT | Touchdown | 10 plays, 80 yards, 5:57 | 80 | 6.33 | 54.20% | 70.80% | 10-7
Q2 1:39 | GT | Field Goal | 9 plays, 49 yards, 1:39 | 49 | 2.05 | 70.40% | 70.10% | 13-7
Q3 12:54 | GT | Punt | 6 plays, 14 yards, 2:39 | 14 | -3.42 | 70.90% | 62.90% | 13-7
Q3 6:00 | GT | Touchdown | 11 plays, 75 yards, 5:46 | 75 | 8.16 | 61.20% | 84.30% | 20-13
Q4 8:04 | GT | Punt | 4 plays, 19 yards, 2:45 | 19 | -1.56 | 61.10% | 50.20% | 20-20
Q4 2:43 | GT | Touchdown | 5 plays, 61 yards, 1:44 | 61 | 4.87 | 73.50% | 97.70% | 27-20

Colorado:

Total Yards: 305
Average Drive Yards: 30.5

Start | Offense | Result | Description | Yards | EPA | Start WP% | End WP% | Score
Q1 14:20 | COL | Touchdown | 5 plays, 36 yards, 2:27 | 36 | 3.23 | 45.30% | 54.10% | 0-7
Q1 10:18 | COL | Punt | 3 plays, 0 yards, 0:27 | 0 | -3.24 | 59.50% | 56.20% | 0-7
Q1 6:51 | COL | Punt | 3 plays, 9 yards, 2:03 | 9 | -1.24 | 57.80% | 57.60% | 0-7
Q2 13:31 | COL | Punt | 5 plays, 23 yards, 3:01 | 23 | -2.2 | 52.50% | 45.80% | 3-7
Q2 4:33 | COL | Field Goal | 8 plays, 51 yards, 2:54 | 51 | 2.38 | 31.80% | 41.10% | 10-10
Q3 15:00 | COL | Punt | 4 plays, 25 yards, 2:06 | 25 | -1.34 | 30.20% | 29.10% | 13-10
Q3 10:02 | COL | Field Goal | 8 plays, 39 yards, 4:15 | 39 | 0.6 | 37.10% | 43.40% | 13-13
Q3 0:14 | COL | Touchdown | 15 plays, 75 yards, 6:49 | 75 | 6.81 | 18.80% | 46.60% | 20-20
Q4 5:04 | COL | Punt | 5 plays, 22 yards, 2:28 | 22 | -2.59 | 49.80% | 26.50% | 20-20
Q4 1:07 | COL | End of Game | 6 plays, 25 yards, 1:07 | 25 | -0.7 | 26.30% | 0.00% | 27-20
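For anyone who wants to re-check the totals, here is a minimal sketch that recomputes them from the Yards column of the two drive charts above. The list and function names are illustrative only; the yardage values are copied straight from the charts.

```python
# Yards gained on each drive, in game order, copied from the drive charts.
gt_drive_yards = [11, 27, 38, 84, 80, 49, 14, 75, 19, 61]
col_drive_yards = [36, 0, 9, 23, 51, 25, 39, 75, 22, 25]


def summarize(drive_yards):
    """Return (total yards, average yards per drive) for one team."""
    total = sum(drive_yards)
    return total, total / len(drive_yards)


for team, yards in (("GT", gt_drive_yards), ("Colorado", col_drive_yards)):
    total, average = summarize(yards)
    print(f"{team}: {len(yards)} drives, {total} yards, {average:.1f} yards/drive")
```

Simple division over the listed drives reproduces the 45.8 and 30.5 averages shown with the charts, not the 56.87 and 40.92 from the original post.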
 

RonJohn

Helluva Engineer
Messages
5,648
But the question still remains: From your original post, where does the 56.87 Yards/Drive come from?
 

slugboy

Moderator
Staff member
Messages
13,979
I’m not sure, but it’s almost 458/9. I would guess that somewhere in the data feed the total yards and the number of drives are off.
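A quick way to test a guess like that is to divide the drive-chart total by a few candidate drive counts and compare each result with the posted 56.87. This is only a sketch: the candidate counts (8, 9, 10) are assumptions picked for illustration, and the function name is made up.

```python
DRIVE_CHART_YARDS = 458   # GT total from the drive chart
REPORTED_AVG = 56.87      # posted Yards/Drive for GT


def averages_for(total_yards, drive_counts):
    """Map each candidate drive count to the yards/drive it would produce."""
    return {drives: total_yards / drives for drives in drive_counts}


candidates = averages_for(DRIVE_CHART_YARDS, (8, 9, 10))
for drives, avg in candidates.items():
    gap = avg - REPORTED_AVG
    print(f"{drives} drives: {avg:.2f} yards/drive (gap vs posted: {gap:+.2f})")
```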
 

roadkill

Helluva Engineer
Messages
2,769
To see if the stats for the Colorado game were just a data anomaly, I checked a couple of last year’s games. Same issues with yards/drive and plays/drive. You can’t make the numbers work, and to the point about the two decimal places, working backwards never results in an integer for plays or yards.

Game on Paper’s glossary defines less intuitive terms like EPA, but does not provide much insight into their yards/drive or plays/drive numbers, likely because the math should be obvious. Problem is, their results cannot be obtained by simple arithmetic. It seems that there may be some more complex math used where the results don’t line up with typical assumptions.

I feel like the overall stats are still directionally correct, but it makes one wonder why they can’t get simple things like Yards/Drive right.
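The "working backwards" check can be sketched like this: given a posted two-decimal average and a drive count, list every whole-number total that would round to that average. The rounding rule (round half up to two decimals) is an assumption, since the thread does not say how Game on Paper rounds, and `consistent_totals` is a hypothetical helper, not anything from their site.

```python
import math
from decimal import Decimal

HALF_STEP = Decimal("0.005")  # half of the last displayed digit (assumed rounding)


def consistent_totals(reported: str, drives: int) -> list[int]:
    """Whole-number totals whose per-drive average would display as `reported`."""
    avg = Decimal(reported)
    low = (avg - HALF_STEP) * drives    # smallest total that rounds up to `reported`
    high = (avg + HALF_STEP) * drives   # first total that would round past it
    # Integers t with low <= t < high.
    return list(range(math.ceil(low), math.ceil(high)))


print("GT 56.87 over 10 drives:", consistent_totals("56.87", 10))
print("COL 40.92 over 10 drives:", consistent_totals("40.92", 10))
print("GT 45.80 over 10 drives:", consistent_totals("45.80", 10))
```

With 10 drives, the two posted averages produce empty lists, while the drive-chart average of 45.80 maps back to 458 as expected.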
 

slugboy

Moderator
Staff member
Messages
13,979
Shrugs GIF
 

GTRhino24

GT Athlete
Messages
353
Location
Birmingham AL
There are tiny fractions of a yard left at the end of every play. We just round those down and deposit them into an account. Kinda like Superman 3. It’s kinda foolproof.
 

RonJohn

Helluva Engineer
Messages
5,648
That is my concern. If simple calculations are wrong, it calls all of the calculated stats into question, which makes the source unreliable.
 

Heisman's Ghost

Helluva Engineer
Messages
6,106
Location
Albany Georgia
I think it actually moves the needle as much as ever since they won 9 last year. It definitely didn’t for most of 2023 since they got waxed by Oregon four games into Deion’s tenure.
That was with a Heisman winner and an NFL quarterback. Winning 9 games in the Big 12 does not mean that much. Colorado got its head handed to it by both Kansas and BYU. We spotted them a 3-0 turnover margin and they still lost. Teams with a minus-3 turnover margin, like Tech had, win less than 1% of the time.
 


slugboy

Moderator
Staff member
Messages
13,979
SP+ is no longer paywalled.


We moved from 43rd to 36th. We are two spots behind Duke, who moved up 11 spots due to their win over Elon. Our offense is top 20. Our defense is ranked 59th, and special teams is 100th. There is a ton of bias from last year in this model, and it will be there for weeks.
 

slugboy

Moderator
Staff member
Messages
13,979
With the current amount of roster turnover seen across the sport, bias from last year's data makes SP+ an even less reliable predictor in the early season. IIWII.
I'm posting it more so people can watch it move and know where we started.

(It’s kind of like watching the stock market at work. Unless you’re day trading, it’s a distraction. Hopefully, it’s a fun distraction.)
 

stinger78

Helluva Engineer
Messages
10,594
Thank you for pointing this out. Bias is epic in predictive sports statistics, IMPO.
 