[Homebrew] College Basketball rankings

smathis30

Ramblin' Wreck
Messages
732
Normally I'll post about my rankings toward March Madness when the Kaggle competition opens up, but this year I wanted to drop the KenPom rankings and use entirely my own. So with that, here is the first batch of my homebrewed rankings.

Methodology:
Based off the Kaggle competition, it looked like the most important things for determining success were the following 5 factors:
1. Points/Possession
2. Rebound ratio
3. Players that scored 10pts/game
4. Free Throw Rate
5. Turnover ratio

I use linear regression to assign weights to each category to come up with a RAW offensive and defensive score/efficiency.
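As a rough sketch of what "weights on each category" means, the RAW score can be read as a weighted sum of the five factors. The factor names, weights, and stat values below are all made up for illustration; the real weights come out of the regression.

```python
# Hypothetical factor names; they mirror the five factors listed above.
FACTORS = [
    "points_per_possession",
    "rebound_ratio",
    "double_digit_scorers",
    "free_throw_rate",
    "turnover_ratio",
]


def raw_score(stats, weights):
    """Weighted sum of the five factors (an assumed form of the RAW score)."""
    return sum(weights[name] * stats[name] for name in FACTORS)


# Placeholder numbers only, not fitted weights or real team stats.
example_weights = {
    "points_per_possession": 60.0,
    "rebound_ratio": 10.0,
    "double_digit_scorers": 1.5,
    "free_throw_rate": 5.0,
    "turnover_ratio": -20.0,
}
example_stats = {
    "points_per_possession": 1.05,
    "rebound_ratio": 0.52,
    "double_digit_scorers": 3,
    "free_throw_rate": 0.35,
    "turnover_ratio": 0.18,
}

print(raw_score(example_stats, example_weights))
```

The same function would be run twice per team, once with offensive stats and once with defensive stats.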

From there, an adjustment factor is added to the raw average, and then, based on the average SoS of each opponent's offense and defense, the RAW score is multiplied by a standardization factor:
Real Score = (RAW Score + adjustment factor) * (1 - SoS multiplier * standard deviations above mean)

A higher standard deviation above the mean means an easier schedule, so that's why there is a negative sign.
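Here is the Real Score formula as a small function, just to show the direction of the adjustment. The parameter names and the sample numbers are assumptions for illustration, not fitted values.

```python
def real_score(raw, adjustment, sos_multiplier, sos_z):
    """Real Score = (RAW + adjustment) * (1 - SoS multiplier * z).

    sos_z is the number of standard deviations above the mean; per the
    post, a higher value means an easier schedule, so the score shrinks.
    """
    return (raw + adjustment) * (1 - sos_multiplier * sos_z)


# Placeholder inputs: same team, easy schedule vs. hard schedule.
easy = real_score(raw=100.0, adjustment=5.0, sos_multiplier=0.05, sos_z=1.0)
hard = real_score(raw=100.0, adjustment=5.0, sos_multiplier=0.05, sos_z=-1.0)
print(easy, hard)  # the easy-schedule score comes out lower
```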

Determining weights:

I use linear regression. Each game is painstakingly added in, and a score is plugged into a formula using least squares regression to determine a score based off of

1. (Visiting O - Home D) * m1 + b1
2. (Home O - Visiting D) * m2 + b2
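The two lines above can be sketched as one function that returns both projected scores and the differential. The slopes, intercepts, and team ratings in the example call are placeholders, not the fitted values.

```python
def expected_scores(home_o, home_d, away_o, away_d, m1, b1, m2, b2):
    """Projected points for each side from the two linear fits.

    m1, b1: slope and intercept for the visiting team's score.
    m2, b2: slope and intercept for the home team's score.
    """
    away_pts = (away_o - home_d) * m1 + b1
    home_pts = (home_o - away_d) * m2 + b2
    return home_pts, away_pts, home_pts - away_pts


# Placeholder ratings and coefficients, for illustration only.
home_pts, away_pts, diff = expected_scores(
    home_o=10.0, home_d=5.0, away_o=6.0, away_d=4.0,
    m1=1.0, b1=70.0, m2=1.0, b2=73.0,
)
print(home_pts, away_pts, diff)
```

A positive differential means the home team is the projected winner.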

From there, I have an expected score differential. I plug that into finding the probability the home team wins, using a standard deviation of 10, per what I found online for the average score deviation in NCAAMB.
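One way to turn the differential into a probability is the normal CDF, assuming the actual margin is normally distributed around the expected differential with a standard deviation of 10. The normal assumption and the function name are mine; the sketch only needs the standard library.

```python
import math


def home_win_probability(expected_diff, sd=10.0):
    """P(home margin > 0) if margin ~ Normal(expected_diff, sd)."""
    z = expected_diff / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(home_win_probability(0.0))   # even game
print(home_win_probability(10.0))  # home favored by one standard deviation
print(home_win_probability(-5.0))  # home is a 5-point underdog
```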

If the projected winner is the same as the actual winner, the "game score" is equal to -1*LN(probability).
If it's wrong, the game score is equal to -1*LN(1-probability).

Due to the nature of natural logs, games that are picked with high confidence and turn out wrong hurt far more than games that are barely right help. Games with 95% confidence (like Delaware St beating La Tech... they didn't) end up giving the formula 2.553 points, whereas Duke beating East Central Western State University will give a score of essentially 0. The average of all games is computed, and linear regression is then conducted to minimize the average game score and assign weights to each of the seven variables (5 for score, 2 for standardization).
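The game score is the usual log loss, and a minimal sketch of it looks like this. The function names and the sample games are made up for illustration. For reference, a missed pick at exactly 95% works out to -ln(0.05), about 3.0, while a score of 2.553 corresponds to a confidence of roughly 92%.

```python
import math


def game_score(p_projected_winner, projected_winner_won):
    """-ln(p) when the pick was right, -ln(1 - p) when it was wrong."""
    if not 0.0 < p_projected_winner < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if projected_winner_won:
        return -math.log(p_projected_winner)
    return -math.log(1.0 - p_projected_winner)


def average_game_score(games):
    """Mean game score over (probability, pick_was_right) pairs."""
    return sum(game_score(p, won) for p, won in games) / len(games)


# Placeholder games: a confident miss, a near-lock hit, a barely-right pick.
sample_games = [(0.95, False), (0.999, True), (0.55, True)]
for p, won in sample_games:
    print(p, won, round(game_score(p, won), 3))
print(round(average_game_score(sample_games), 3))
```

The average is the number the weights are tuned to push down, so lower is better.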

So with that, here are the top 25 through today and the ACC rankings.

Projected Scores for the week for good guys in gold:
GT 71 NW 81
St Johns 78 GT 77
 

MikeJackets1967

Helluva Engineer
Messages
14,844
Location
Lovely Ducktown,Tennessee
Here's my Men's Basketball Top 25

1)Kansas
2)Gonzaga
3)Duke
4)Nevada
5)Tennessee
6)Virginia
7)Michigan
8)Kentucky
9)Michigan State
10)North Carolina
11)Auburn
12)Kansas State
13)Virginia Tech
14)Florida State
15)Ohio State
16)Iowa
17)Texas
18)Purdue
19)Texas Tech
20)Oregon
21)Villanova
22)Maryland
23)Mississippi State
24)Arizona State
25)Wisconsin
 

smathis30

Ramblin' Wreck
Messages
732
121218.png

Updated rankings. Tiers have started to emerge in the ACC.
Contenders:
Duke, Virginia, Virginia Tech
Should easily make Tournament if they keep it up:
UNC, Louisville, FSU, NC State, Notre Dame
Bubble watch:
Syracuse
NIT looks fun:
GT, Clemson, Miami, Pitt
At least there is next year:
BC, Wake
 

YellowJacketFan2018

Helluva Engineer
Messages
9,022
Location
Southeast Tennessee
Here's my 2019-2020 Top 25 Preseason Men's College Basketball Rankings

1)Kansas
2)Michigan State
3)Kentucky
4)Florida
5)Duke
6)Louisville
7)North Carolina
8)Gonzaga
9)Villanova
10)Texas Tech
11)Maryland
12)Ohio State
13)Virginia
14)Seton Hall
15)Baylor
16)Oregon
17)Utah State
18)Tennessee
19)St Mary's
20)Xavier
21)Marquette
22)Cincinnati
23)Houston
24)Mississippi State
25)LSU
 

ibeattetris

Helluva Engineer
Messages
3,604
I need to up my Excel game. This is great stuff. Can you elaborate on this?
"Based off the Kaggle competition, it looked like the most important things for determining success were the following 5 factors"
 