In 2012, Nate Silver and his team built a sophisticated statistical model that predicted the presidential election incredibly accurately.
My roommate and I would like to do something similar with MMA fighting. Like a presidential election, an MMA fight is a contest between two opponents. Unlike team sports, elections and MMA matches are highly suitable targets for Bayesian statistical modelling, which is the method Mr. Silver used. Bayesian modelling is discussed here: http://en.wikipedia.org/wiki/Bayesian_inference; I'm trying to avoid TL;DR.
While admittedly not as skilled as Mr. Silver, my roommate and I are graduate students in psychology with advanced training in statistical analysis. His expertise is in neural network modelling in MATLAB, and his formal work is with machine learning. Mine is in frequentist and Bayesian inferential statistics and particularly logistic regression. We've both taken several graduate and undergraduate classes and I've been TA for graduate statistics. I'm fluent in SPSS and R statistical packages, and I've also coauthored several scientific papers in which my primary task was to run the statistics and create results sections. This isn't to brag, but rather to establish that I'm no schmuck with this stuff.
We're looking for several knowledgeable (about MMA) and reliable folks to help us build a database of fighter statistics that we could use as data for our model. This would include collecting statistics for your favorite fighters and entering them into a shared, premade spreadsheet (on Google Drive). You don't need to know how to do statistical analysis; you just need to have some free time and be comfortable entering data. We'd make the skeleton of the database, and you'd just add data into the cells. Then we'd use our mad skillz to make that data our bitch and tell us who's gonna win fights.
After testing several models against each other, we'd like to use the best model to predict fights and bet on them. Even if we were write only 70% of the time, we'd make out like bandits. I bet we'd do better than that, and if you'd made a contribution to the database we'd share the model's predictions for your own use.
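As a rough sanity check on that 70% figure, here is a small sketch of the betting arithmetic. It assumes flat stakes and decimal odds, and the odds values are made up for illustration rather than taken from any real card. The point is that a 70% hit rate only pays when the odds on offer are longer than about 1.43 (that is, 1 / 0.70).

```python
def expected_profit(p_win: float, decimal_odds: float, stake: float = 1.0) -> float:
    """Expected profit of one bet: win (odds - 1) * stake with probability p_win,
    otherwise lose the stake."""
    return p_win * (decimal_odds - 1.0) * stake - (1.0 - p_win) * stake


def break_even_probability(decimal_odds: float) -> float:
    """Win probability at which a bet at these odds has zero expected profit."""
    return 1.0 / decimal_odds


# Hypothetical decimal odds: a heavy favorite, a mild favorite, and even money.
for odds in (1.25, 1.50, 2.00):
    ev = expected_profit(0.70, odds)
    needed = break_even_probability(odds)
    print(f"odds {odds:.2f}: need {needed:.0%} to break even, "
          f"expected profit per unit staked at 70% = {ev:+.3f}")
```

So the picks we'd actually bet are the ones where the model's probability beats the break-even probability implied by the line, not just the ones where it names a winner.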
I'd be happy to discuss details. Supercalo's not invited...he'd enter in fake data and watch the world burn.
You are a graduate student and you hope to be "write" 70 percent of the time ? You better do better than that. Over the past 10 years or so, I'm at about 68 picking based on knowledge.
clintboxe - You are a graduate student and you hope to be "write" 70 percent of the time ? You better do better than that. Over the past 10 years or so, I'm at about 68 picking based on knowledge.
Your write, wat I rote rong ruined everything I said.
This actually sounds sick
It's the UG, what do you expect ?!? :)
Why don't you just buy a copy of the MMA database from the UG?
Statistics aren't going to predict who's going to be judging a fight.
Massa - Why don't you just buy a copy of the MMA database from the UG?
We will, but some of the statistics we'd want are weirder than the ones in that database.
MdGeist - Statistics aren't going to predict who's going to be judging a fight.
Judges and their errors add to error variance, or unpredictability, but they do not make prediction impossible. They simply constrain the predictability of the system, and probably not as much as you'd think. Bad decisions stick out in your mind because you're a human, but most judging is pretty good.
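To put a rough number on that, here is a minimal sketch. It assumes judges flip the "deserved" result with some fixed probability, independently of our picks, and it treats every fight as if it went to a decision, which overstates the damage. Both rates below are hypothetical.

```python
def observed_accuracy(true_accuracy: float, judge_error_rate: float) -> float:
    """Chance a pick matches the official result when judges flip the deserved
    outcome with probability judge_error_rate (assumed independent of the pick)."""
    right_and_upheld = true_accuracy * (1.0 - judge_error_rate)
    wrong_but_flipped = (1.0 - true_accuracy) * judge_error_rate
    return right_and_upheld + wrong_but_flipped


# Hypothetical: a model that picks the deserving winner 75% of the time,
# with judges getting 10% of results wrong.
print(f"{observed_accuracy(0.75, 0.10):.2f}")
```

Under those assumed numbers, bad judging costs about five points of accuracy. It lowers the ceiling; it doesn't remove it.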
Not enough statistics in MMA to predict outcomes IMO, unlike other sports that already have established stats.
Kneeblock - Fightmetric.com and CompuStrike should have the level of granularity you're looking for rather than having to crowdsource. I'd be interested in helping out. Kneeblock@Yahoo.com
Cool, and good info. You're first on the list...check your email.
Email me Kirik@MMA.tv
a regression with variables for size, speed, grappling, striking, cardio, juice and bread starvation factor.
i pee blood - a regression with variables for size, speed, grappling, striking, cardio, juice and bread starvation factor.
Don't forget to factor in the outlier: Dat TRT!
PM sent
PM sent
i pee blood - a regression with variables for size, speed, grappling, striking, cardio, juice and bread starvation factor.
Haha, exactly. TRT much? British Fighter? Wrestle in College? Dolce Diet? Greg Jackson fighter?
Standard logistic regression and regular neural networks run into the problem of not being able to incorporate priors. In other words, they can predict whether Michael Bisping is a "winner", but not whether Michael Bisping will win given that he's facing Vitor Belfort. If you instead make the model answer "Will the linear model of Michael Bisping be a 'winner', given the linear model of Vitor Belfort, TRT champion?", then you're on to something. That's Bayesian stats in a nutshell. And like I said, we can build lots of models and compare their predictive capacities.
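One way to sketch the "fighter A given fighter B" idea is a Bradley-Terry style model: each fighter gets a latent skill, the chance A beats B is a logistic function of the skill difference, and a Normal prior pulls every skill toward zero until the data says otherwise. This is a stand-in for illustration, not necessarily the model we'd end up with. The fighter labels and results are made up, `prior_sd`, `lr`, and `steps` are arbitrary choices, and the fit is a MAP point estimate rather than a full posterior.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fit_skills(fights, prior_sd=1.0, lr=0.1, steps=2000):
    """MAP skill estimates under P(winner beats loser) = sigmoid(s_w - s_l)
    with an independent Normal(0, prior_sd^2) prior on each skill.
    `fights` is a list of (winner, loser) pairs; fit by gradient ascent."""
    skills = {name: 0.0 for fight in fights for name in fight}
    for _ in range(steps):
        # Gradient of the log prior shrinks each skill toward zero.
        grad = {name: -s / prior_sd ** 2 for name, s in skills.items()}
        # Gradient of the log likelihood rewards winners and penalizes losers.
        for winner, loser in fights:
            p = sigmoid(skills[winner] - skills[loser])
            grad[winner] += 1.0 - p
            grad[loser] -= 1.0 - p
        for name in skills:
            skills[name] += lr * grad[name]
    return skills


def win_probability(skills, a, b) -> float:
    """Model's probability that fighter a beats fighter b."""
    return sigmoid(skills[a] - skills[b])


# Made-up results as (winner, loser); not real fight data.
example_fights = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
skills = fit_skills(example_fights)
print(f"P(A beats C) = {win_probability(skills, 'A', 'C'):.2f}")
```

The prediction depends on both fighters at once, and the prior keeps a fighter with one or two fights from being rated as unbeatable or hopeless. Fighter statistics from the database could later replace the single skill number with a weighted sum of features.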
Both of our advisors are making us learn Bayesian modelling for our dissertations anyway, so we might as well make some cash.
I like the sound of this.
Down