Show HN: I made a machine learning model to predict 66.45% of NBA games

17 points| francio445 | 10 months ago |github.com

Introducing DeepShot: An NBA Game Prediction Model

Hey devs, sports fans, and data nerds!

After weeks of work, I'm excited to share DeepShot – an advanced NBA game predictor powered by historical data from Basketball Reference, machine learning, and a clean NiceGUI-powered web interface.

What it does: DeepShot uses team-level rolling averages (including Exponentially Weighted Moving Averages) and an Elo rating system to accurately predict NBA game outcomes. All predictions are visualized in real time through a sleek, responsive UI.
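For readers unfamiliar with Elo, here is a minimal sketch of a standard rating update in Python. It is not taken from the DeepShot repository: the function names, the K-factor of 20, and the 400-point scale are illustrative assumptions, and the project may well use different values or extra adjustments such as home-court advantage.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Win probability for team A under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 20.0):
    """Return the new (rating_a, rating_b) after one game.

    k is an assumed K-factor; it controls how fast ratings react to results.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two evenly rated teams: the winner gains k / 2 points.
    print(update_elo(1500.0, 1500.0, a_won=True))
```

A prediction from ratings alone would simply pick the team whose `expected_score` is above 0.5.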

Key Features:
- Data-Driven Predictions using past performance & rolling trends
- EWMA-based Weighted Stats Engine
- Elo Ratings for contextual team strength
- Cross-platform interface built with NiceGUI
- Key stats highlight to visualize matchup advantages at a glance

Tech Stack:
- Python
- Pandas, Scikit-learn, XGBoost
- BeautifulSoup, Requests
- NiceGUI for the frontend
- Hosted locally, runs on Windows/macOS/Linux

Clone it here → github.com/saccofrancesco/deepshot

Want to see how predictive modeling and sports analytics come together? This is for you.

Feedback, stars, forks, and PRs are more than welcome!

Let me know what you think, or drop your ideas for improvements — always open to suggestions!

#NBA #Python #MachineLearning #SportsAnalytics #OpenSource #NiceGUI #PredictiveModeling #GitHub #XGBoost #EWMA #EloRating #Basketball

14 comments

bonzini|10 months ago

How accurate is "same result as last time the teams played"?

francio445|10 months ago

Using the outcome of the last head-to-head matchup as a predictor can be misleading without proper context. The time elapsed since that game matters significantly—it could be from just a few games ago or over 20 games back. In that time, both teams may have undergone considerable changes in form, strategy, injuries, rotations, or momentum.

My model accounts for each team's evolution by incorporating trends from recent performances against all opponents, not just head-to-head matchups. This includes rolling averages and exponentially weighted metrics over the last 25 games, which help capture current form, streaks, and regressions.

As a result, the most recent head-to-head result only holds substantial predictive weight if it occurred recently and aligns with both teams’ current trajectories. Otherwise, it's treated as just one small piece of a much larger picture.
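To make the two kinds of averages mentioned above concrete, here is a rough pure-Python sketch. Only the 25-game window comes from the comment. Treating it as a `span` parameter (so that alpha = 2 / (span + 1)), the function names, and the sample numbers are assumptions, not the project's actual code.

```python
def rolling_mean(values, window=25):
    """Plain average over the most recent `window` values at each step."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def ewma(values, span=25):
    """Exponentially weighted moving average; recent games count more.

    Uses the recursive form y_t = alpha * x_t + (1 - alpha) * y_{t-1},
    seeded with the first value.
    """
    alpha = 2.0 / (span + 1)
    out = []
    avg = None
    for v in values:
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        out.append(avg)
    return out


if __name__ == "__main__":
    points = [102, 110, 98, 121, 115]  # made-up points per game
    print(rolling_mean(points, window=3))
    print(ewma(points, span=3))
```

With pandas, the same recursion should be available as `Series.ewm(span=25, adjust=False).mean()`. When building features for a game, only values from earlier games should be used, so the game being predicted does not leak into its own inputs.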

skeptrune|10 months ago

This is the first time I've seen the term emailware. I love the concept lol.

Are you hosting the full application somewhere? I would love to try it without having to run the code myself.

anfractuosity|10 months ago

Yeah, I'd not heard of that either, I recall postcardware though - https://en.wikipedia.org/wiki/Shareware#Postcardware

francio445|10 months ago

Unfortunately I'm not hosting this anywhere. If you are used to programming in general, the steps to reproduce the outcome are very easy. Maybe I'll try to deploy a Docker container so everyone can try this easily ;)

stefanfis|10 months ago

I don’t know how 66% compares to other prediction models, like the ones based on classical Machine Learning algorithms. Do you have any comparison data?

tianqi|10 months ago

For reference, a hybrid fuzzy-SVM has a prediction accuracy rate of 88.26%. https://ieeexplore.ieee.org/document/8344688

(This shows why it is almost impossible to keep a long-term edge betting on sports: it's already a very efficient market, and the odds usually reflect the win rate very well.)

As another reference, an earlier predictor that also uses the Elo rating system has an accuracy of 65.3%, which is very close to the result in this post, and I guess this may be a typical range for Elo-based predictors. https://github.com/luke-lite/NBA-Prediction-Modeling

By the way, I really like the interface of this "emailware". It's really fun to play with.

francio445|10 months ago

The 66.45% is pretty average and can sometimes be misleading. The model lacks many features and has been in development for only around 3 weeks. I'm only 20, in my first year of studying Software Engineering, and the project was a lot of fun to create and to see in action, sometimes even outperforming the odds (not consistently enough to give an edge, obviously, as that is practically impossible).

paipa|10 months ago

Here's how 66% compares to the trivial predictor: "pick the team with the higher win rate and flip a coin if they're equal" gives you 64% for the 2024 season I just tried it on.

I'm not even considering home/away, let alone win margins, recent form, strength of schedule etc. I'm almost amazed this model couldn't make any use of them.
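A trivial baseline like the one described can be sketched in a few lines of Python. The comment does not say how the win rate was computed, so this version assumes it is the running win rate from games played before the one being predicted, and that a team with no games yet counts as 0.5. The input format and function name are also assumptions.

```python
import random
from collections import defaultdict


def baseline_accuracy(games, seed=0):
    """Accuracy of 'pick the team with the higher win rate so far'.

    games: list of (home, away, home_won) tuples in chronological order.
    Ties, including the first game of each team, are decided by a coin flip.
    """
    if not games:
        return 0.0
    rng = random.Random(seed)
    wins = defaultdict(int)
    played = defaultdict(int)
    correct = 0
    for home, away, home_won in games:
        rate_home = wins[home] / played[home] if played[home] else 0.5
        rate_away = wins[away] / played[away] if played[away] else 0.5
        if rate_home > rate_away:
            pick_home = True
        elif rate_home < rate_away:
            pick_home = False
        else:
            pick_home = rng.random() < 0.5  # coin flip on equal win rates
        correct += int(pick_home == home_won)
        # Update the records only after predicting, to avoid leakage.
        played[home] += 1
        played[away] += 1
        wins[home if home_won else away] += 1
    return correct / len(games)


if __name__ == "__main__":
    toy_season = [("A", "B", True), ("A", "B", True), ("B", "A", False)]
    print(baseline_accuracy(toy_season))
```

Running any model and this baseline over the same set of games gives a fairer comparison than quoting the model's accuracy on its own.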

bangaladore|10 months ago

Excuse my lack of knowledge here.

To what extent is 65% impressive? Naively, I imagine someone very familiar with teams and players could probably achieve similar results. I say this because I assume it's obvious that Team A is better than Team B to some extent. Team A might still lose to Team B for whatever reason, but that's why it's only 65%. And Team C vs Team D might be a tossup.

francio445|10 months ago

66.45% is at the lower edge of the 66% to 72% range typical for almost any model. This follows from the fact that the favored team loses between 28% and 34% of the games it is supposed to win: 100% - 28% = 72% and 100% - 34% = 66%. So yeah, the model mostly predicts the favored team, and it was sometimes able to predict winners that the odds missed, but it's a pretty average accuracy ;) The model sits inside that range of picking the obvious winner, while about 1/3 of the time game outcomes are very "random" / "unpredictable". Also, professional bettors who watch almost every game and play, and who follow almost all the news, trades, injuries, and external factors, are accurate around 68% of the time. For me it's pretty amazing that a model knowing nothing of that could do this well sometimes, and it was very fun creating and working on this for around 3 weeks ;)

3vidence|10 months ago

Don't want to be mean to OP but it is a completely useless stat by someone who doesn't know what they are doing.

That isn't said out of spite, and I don't think OP is trying to be deceptive; it just shows a lack of understanding of the task at hand.

OKC is going to win around 70 of 82 games this year.

If I just naively say OKC will win every game I'm going to be 85% accurate no models required.
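The arithmetic behind that figure, as a tiny Python check. The 70-of-82 record is the commenter's projection, and the function name is made up for illustration.

```python
def always_pick_accuracy(wins: int, games: int) -> float:
    """Accuracy of always predicting one team to win its own games."""
    return wins / games


if __name__ == "__main__":
    print(f"{always_pick_accuracy(70, 82):.1%}")  # 85.4%
```

Note that this number is measured only over the 82 games involving that one team, not over every game in the league.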

RockRobotRock|10 months ago

How does this compare to Vegas lines?