https://www.eraider.com/nfl-picks In this paper he outlines a factor-based model that helps him do reasonably well in the NFL season. My takeaway was that adding a few relatively weak predictors can create a decent system. I also enjoyed his commentary on how strong predictors are hard to come by, or rather too easy to come by. Worth the read imo.

Aaron Brown is one of my early-career quant heroes, along with Paul Wilmott. I still have every single book written by either of them.

Why are there not more ETers doing sports betting... A friend's kid has been pulling 7 figures for 3 years straight already doing NBA lines.

I have a stats question related to this article. Aaron's predictors individually were right only about 52% of the time. When he added them all together he had a model that was over 56% accurate. Most stats courses use datasets like iris or mtcars, which have predictors with an r^2 around .5. We are not usually shown examples where the predictors have an r^2 of .02 and are still meaningful. In fact, I have not come across a course that shows an example of gathering 4 or 5 predictors with an r^2 of .02 each to create a model that can estimate reasonably well. Does anyone know of a quant finance course or social science course that gives examples of combining multiple weak predictors? Thanks
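To make the question concrete, here is a minimal sketch of how weak binary predictors can add up. It is not Aaron's model: it assumes each predictor is right 52% of the time independently of the others (a strong assumption that real factors will not satisfy exactly), and the function names are made up for illustration. It compares betting every game on a majority vote with betting only when all predictors agree.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Chance that the majority of n independent predictors is right (n odd)."""
    q = 1.0 - p
    return sum(comb(n, k) * p**k * q ** (n - k) for k in range(n // 2 + 1, n + 1))


def unanimous_accuracy(p: float, n: int) -> float:
    """Chance of being right, given that all n independent predictors agree."""
    q = 1.0 - p
    # All agree only if all are right (p**n) or all are wrong (q**n).
    return p**n / (p**n + q**n)


def unanimous_frequency(p: float, n: int) -> float:
    """Fraction of games on which all n predictors pick the same side."""
    return p**n + (1.0 - p) ** n


if __name__ == "__main__":
    p = 0.52  # hypothetical accuracy of each individual predictor
    for n in (1, 3, 5):
        print(
            f"n={n}: majority={majority_vote_accuracy(p, n):.4f}  "
            f"unanimous={unanimous_accuracy(p, n):.4f}  "
            f"bet on {unanimous_frequency(p, n):.1%} of games"
        )
```

Under these assumptions, three agreeing 52% predictors are right roughly 56% of the time, but only on about a quarter of the games. The accuracy comes from being selective, not from any single strong predictor.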

From the website linked by OP: "The hard part about getting rich from sports betting is not finding a system with a 55% win rate. It's getting people to give you money. Retail betting sites will quickly notice that you're making smart picks. The good ones will cancel your account if you're a net winner, the bad ones will keep your money. There are other venues for placing bets, like matching sites, casinos, human bookies and others; but none are anxious to give large amounts of money away; all of them have defenses you must consider. If your goal is to make a living on sports betting, you're going to need to work. Simple does not mean easy. You're going to want at least a 57% win rate, and that's maybe ten times the work of building a 55% system. And you're going to want several of them so you can make a lot more bets throughout the year. You're also going to need to manage your capital and risk. The reason most people aren't rich from sports betting, and have such unjustified confidence in unsupported beliefs, is that they aren't willing to do the work, or to admit how slight an edge we have when predicting the future even if you do all the work."
The struggles in successful sports betting sound a lot like the struggles in successful trading... except the sports bookies are worse than the stock brokerages.
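For a sense of what the 55% and 57% figures are worth, here is a rough sketch of the arithmetic. It assumes standard retail pricing of risking 11 to win 10 (the 10% vigorish mentioned elsewhere in this thread); the quote itself does not state the odds, and the function names are only illustrative.

```python
from fractions import Fraction


def break_even_win_rate(risk: int = 11, win: int = 10) -> Fraction:
    """Win rate at which expected profit is zero when risking `risk` to win `win`."""
    return Fraction(risk, risk + win)


def roi_per_bet(p: float, risk: int = 11, win: int = 10) -> float:
    """Expected profit per unit risked at win rate p (assumed 11-to-10 pricing)."""
    return (p * win - (1.0 - p) * risk) / risk


if __name__ == "__main__":
    be = break_even_win_rate()
    print(f"break-even win rate: {be} = {float(be):.4f}")
    for p in (0.52, 0.55, 0.57):
        print(f"win rate {p:.0%}: expected return per unit risked = {roi_per_bet(p):+.4f}")
```

At these assumed odds a 52% win rate loses money, 55% returns about 5% per unit risked, and 57% returns close to 9%, which is before any of the account-closure and capital-management problems the quote describes.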

If I were just trying to predict NFL games, it would be easy to get high R^2; after all, the favorite wins about 2/3 of the time in the NFL. But I'm trying to predict versus the consensus of the betting markets. If there were any simple indicator that worked more than about 52% of the time, it would be quickly discovered. There are 256 regular season NFL games in a season. If the spread were set perfectly so that every bet had exactly a 50% chance of winning, the standard deviation of the number of wins by the home team, or any other simple indicator, would be 8 games. A 52% indicator would produce about 5 extra wins on average. So if it worked for even a few seasons, it would become statistically obvious, and get corrected. Moreover, any winning percentage above 11/21 (52.38%) produces profit even for a retail bettor paying full 10% vigorish. But edges below that figure are not profitable to exploit on their own. So there's not a lot of market pressure on them directly.
In my opinion, statistics courses tend to concentrate on problems for which you don't need statistics. They spend more time splitting hairs about high R^2 fits that are obvious without formal statistics than on doing the hard work of teasing out the extra information careful quantitative methods can produce beyond simple observation. They compare to irrelevant null hypotheses rather than to the best previous guess at the answer. Binary factor models, like the one I use for NFL betting, are probably the most popular and useful model for people who are trying to be right--not to get published or win lawsuits or get regulatory approvals or win policy arguments--and are absent from all statistics textbooks I know. The textbooks are concerned with using statistics to influence other people--editors, judges, regulators, voters--not to find truth.
This is why in the 1970s, along with some like-minded people, I gave up on formal academic statistics: people were teaching methods no one would bet a nickel on. I went to Las Vegas to try out ideas where you don't win by being clever or impressing other statisticians--people on the other side are trying as hard as they can not to give you money. This is where you learn reality. I did learn more respect for some of the academic methods, but only when I had tested them--really tested them, not some paper trial--for myself. I then moved on to finance because it's possible to bet for much larger stakes.
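The 8-game standard deviation and the 5 extra wins above follow from the binomial distribution. A small sketch of that arithmetic, assuming independent games and a fair 50% baseline (the helper names are hypothetical), also shows how far an indicator's expected excess wins sit from zero, in standard deviations, after a given number of seasons:

```python
from math import sqrt

GAMES_PER_SEASON = 256  # regular season games, as stated in the post


def win_count_sd(n_games: int, p: float = 0.5) -> float:
    """Standard deviation of the number of wins in n_games independent bets."""
    return sqrt(n_games * p * (1.0 - p))


def expected_extra_wins(win_rate: float, n_games: int) -> float:
    """Expected wins above the 50% baseline."""
    return (win_rate - 0.5) * n_games


def z_score(win_rate: float, seasons: int) -> float:
    """Expected excess wins divided by the standard deviation under a 50% null."""
    n = GAMES_PER_SEASON * seasons
    return expected_extra_wins(win_rate, n) / win_count_sd(n)


if __name__ == "__main__":
    print("sd of wins in one season:", win_count_sd(GAMES_PER_SEASON))
    print("extra wins per season at 52%:", expected_extra_wins(0.52, GAMES_PER_SEASON))
    for rate in (0.52, 0.55):
        for seasons in (1, 3, 10):
            print(f"win rate {rate:.0%}, {seasons} season(s): z = {z_score(rate, seasons):.2f}")
```

The z-score grows with the square root of the number of seasons, so the larger the edge of a simple indicator, the fewer seasons it takes to stand out from chance.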

This isn't too different from many of the greats and my own experience in academia vs. reality. Ed Thorp (one of the people I idolize) comes to mind. I started my journey on this road predicated on becoming good at table games. I had ready access to casinos and enough capital to test ideas with. I did well and wanted to quantify a lot of it. Sports books were always around, but I never really wrapped my head around how odds are calculated, and the fact that bookies can basically mulligan and completely fuck you arbitrarily kept me away from casual sports betting (despite the better leverage per unit dollar I got there). Of course you can get around this by having friends place bets for you, but still... it rubbed me wrong.
There are certainly plenty of applications for mathematical statistics in betting, but much of the literature is tainted by tautology: "the proof says it's not possible, so therefore it is not", despite statistics being a very fuzzy field in general (if you want to talk religion, get a Bayesian statistician and an inferential statistician in the same room). Probability theory was founded in gambling, and you have people in universities (some I've met) who've never set foot in a casino! You'll find that p-hacking, terrible inference and spurious correlation are alive and well in modern financial mathematics. It's a depressing field sometimes. Take a trip to the strategy development subforum and you'll see plenty of claims of 90% win rates without any mathematical evidence.
I disagree with you on the reason for moving to finance. I found it difficult to play tournaments against grinders, casinos are prepared for card counters, and bookies will just invalidate your bet. The market has none of this (at least if you're not betting against subprime mortgages), making it the optimal place to put your money. Finance just makes sense - it's far more "gentlemanly" these days.
I am interested in your application of R^2 despite the autocorrelated nature of the data in both sports betting and finance. Linear regression provides an at-best poor predictor (theoretically) for data that is serially correlated (good teams tend to keep doing well, etc.), and yet you seem to have cracked the code without touching any of the "mathematically pure" techniques such as ARIMA. Would you mind giving me a taste of your use of R^2 with this data and how you were able to use it despite the math telling you autocorrelation is going to destroy your predictors?
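As a side note on the autocorrelation question, one simple check is to measure the lag-1 autocorrelation of whatever series is being modeled (for example, regression residuals or results against the spread) before assuming it is a problem. The sketch below is only an illustration on simulated data, not anything from the article: it compares white noise with an AR(1) series whose coefficient of 0.8 is an arbitrary, hypothetical choice.

```python
import random


def lag1_autocorr(xs: list[float]) -> float:
    """Sample lag-1 autocorrelation of a series."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i - 1] - mean) for i in range(1, n))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


def simulate_ar1(phi: float, n: int, rng: random.Random) -> list[float]:
    """Simulate x[t] = phi * x[t-1] + standard normal noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    white = simulate_ar1(0.0, 5000, rng)   # no serial correlation
    sticky = simulate_ar1(0.8, 5000, rng)  # strongly serially correlated
    print(f"white noise lag-1 autocorrelation: {lag1_autocorr(white):+.3f}")
    print(f"AR(1) phi=0.8 lag-1 autocorrelation: {lag1_autocorr(sticky):+.3f}")
```

If the series being predicted is the outcome against an efficient spread rather than raw team strength, the measured value may turn out close to the white-noise case, but that has to be checked on real data.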

Would someone mind helping me break down the concept of shrinkage? It might have been a little more intuitive for me if it was not about football/sports betting.
"For our first factor we will look at one of the most generally reliable predictors, overreaction. We will bet on the team the line moved against versus what we would have expected the prior week. For example, suppose Green Bay plays Chicago in week 7. Before the games of week 6, we would have expected Green Bay to be a 3.5 point favorite, but after the week six results, Green Bay is only a 1.5 point favorite. We’re going to guess that this is an overreaction, and bet on Green Bay. There are three reasons we think this should work. First is the mathematical concept of shrinkage, a very powerful real world tool that is usually taught as a minor mathematical curiosity in statistics texts, if it is mentioned at all. The line move is a combination of signal plus noise. In the example above, we know the line move was 2, so we can call it the sum of S + N, where S is the move that would have made the game exactly a 50% proposition..."
Aaron states that if N is positive, the move was too big. Since we know that the move was 2 points (3.5 - 1.5), then it is highly likely that N is positive. I do not understand this. What is stopping N from equaling -2? In our case S + N = 2. What is wrong with 4 + (-2)? Is it possible to have a negative signal or noise value? Some clarity would be great! I find I learn best through examples, so if you have the time to write a few made-up or real-world examples, that would be greatly appreciated.
I have a few follow-up questions re shrinkage. Actually, I would like to give an example using stock earnings. Before the last earnings event the implied move was 10%. The event was a 25% move. This earnings the implied move is 20%. There was an increase of 10%, which is equal to S + N. How do I measure S and N? If the prior event had not happened, I would expect the implied move this quarter to be 10%. Does that mean the increase to 20% is all noise? In reality S should have a positive value!
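One way to see why an observed move of 2 makes a positive N likely is a toy model. The sketch below is an illustration under stated assumptions, not the article's calculation: it treats S and N as independent, zero-mean normal variables, and the standard deviations of 1.5 points each are hypothetical. Under that model nothing forbids S = 4 and N = -2, but combinations like that are much rarer than ones where both S and N are positive.

```python
import random
from statistics import NormalDist


def noise_given_move(move: float, sd_signal: float, sd_noise: float) -> NormalDist:
    """Distribution of N given S + N = move, for independent zero-mean normals."""
    vs, vn = sd_signal**2, sd_noise**2
    mean = move * vn / (vs + vn)        # noise's expected share of the move
    sd = (vs * vn / (vs + vn)) ** 0.5   # remaining uncertainty about N
    return NormalDist(mean, sd)


MOVE, SD_SIGNAL, SD_NOISE = 2.0, 1.5, 1.5  # hypothetical values for illustration

posterior = noise_given_move(MOVE, SD_SIGNAL, SD_NOISE)
p_noise_positive = 1.0 - posterior.cdf(0.0)
p_noise_below_minus_2 = posterior.cdf(-2.0)

# Monte Carlo check: draw (S, N) pairs and keep those whose sum is close to 2.
rng = random.Random(1)
kept = []
for _ in range(200_000):
    s = rng.gauss(0.0, SD_SIGNAL)
    n = rng.gauss(0.0, SD_NOISE)
    if abs(s + n - MOVE) < 0.1:
        kept.append(n)
mc_mean = sum(kept) / len(kept)

if __name__ == "__main__":
    print(f"expected N given a move of {MOVE}: {posterior.mean:.2f}")
    print(f"P(N > 0 | move): {p_noise_positive:.3f}")
    print(f"P(N <= -2 | move): {p_noise_below_minus_2:.4f}")
    print(f"simulated mean of N for moves near {MOVE}: {mc_mean:.2f}")
```

With equal variances, half of the observed move is attributed to noise on average, so the best estimate of S is 1 rather than 2. That pull of the estimate back toward zero is the shrinkage. In this toy model, measuring S and N in a real case such as the earnings example comes down to estimating the two variances, which the sketch simply assumes.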