Modeling risk with Catalan numbers

Discussion in 'Automated Trading' started by maxdama, Oct 5, 2008.

  1. maxdama


    I'm interested in using Catalan numbers (Wikipedia) to model the risk of maximum drawdown for the sake of optimization. Basically, what I've found so far is that they can be used to find the probability of a large drawdown with certain assumptions. I'd like to know if anyone has found extensions of the theory to eliminate most of the assumptions. It seems like a very interesting theory that has been useful in many fields but not yet practically adapted to money management.

    A bit more of a write-up is in my online notes. If anyone could point me to a paper, book, or webpage going more in depth on Catalan numbers being applied to betting processes I'd appreciate it.


    __________________ - The log of my research on and implementation of automated trading strategies
  2. While it doesn't specifically refer to Catalan numbers per se, there is an older book that talks quite a bit about these types of scenarios (Stirling's approximation, etc.) and risk of ruin in binary games with a random walk.

    The Theory of Gambling and Statistical Logic.

    The problem with some of the interpretations of these types of binary models (and I see a lot of these discussions on here), is that they have very little relevance to continuous outcomes (i.e. reality).

    However, Ralph Vince does touch on this a bit, and there is a paper that uses parametric modeling with geometric Brownian motion to discuss some of these ideas (particularly with emphasis on fractional betting). Even these types of models suffer from the normality assumption, but they are worth looking at IMO.
  3. On your website you state:

    "Solving, P=429/5040=8.5%. So with a 50/50 chance of a winning trade, you will almost certainly be down at some point during a 14 trade period."

    Not correct IMO. First, an 8.5% probability of survival is quite high. Second, I think you confuse probability of survival with drawdown levels. The probability of survival has little to do with whether or not you were down at some point during a 14-trade period. There are many alternative outcomes if you toss a coin 14 times, yet the probability of survival is the same even in the case that heads comes up 14 times in a row.

    Actually, the probability of survival is calculated by considering all possible outcomes (events) and using that number to divide the number of "survived" events.

    In trading you need to specify what "survival" means and this has different interpretations and measurement for different people.

    Some can sustain a 90% drawdown because they invest using cash and can wait for many years to recover.

    Some consider survival to mean staying within 25% of the high-water mark while not having three consecutive losing months.
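To make the counting definition concrete, here is a minimal sketch that enumerates every equally likely win/loss sequence and divides the number of "survived" sequences by the total. It assumes unit bets and a 50/50 win rate; the two survival rules (`never_below_start` and `max_drawdown_at_most`) are hypothetical illustrations, not definitions taken from the thread.

```python
from itertools import product

def survival_probability(n_trades, survived):
    """Fraction of all 2**n_trades equally likely +/-1 sequences that survive."""
    outcomes = list(product((1, -1), repeat=n_trades))
    n_survived = sum(1 for path in outcomes if survived(path))
    return n_survived / len(outcomes)

def never_below_start(path):
    """Survival rule A (assumed): equity never drops below its starting level."""
    equity = 0
    for step in path:
        equity += step
        if equity < 0:
            return False
    return True

def max_drawdown_at_most(limit):
    """Survival rule B (assumed): peak-to-valley drawdown never exceeds `limit` units."""
    def rule(path):
        equity = peak = 0
        for step in path:
            equity += step
            peak = max(peak, equity)
            if peak - equity > limit:
                return False
        return True
    return rule

if __name__ == "__main__":
    print(survival_probability(14, never_below_start))
    print(survival_probability(14, max_drawdown_at_most(3)))
```

The same 14 trades give different survival probabilities under the two rules, which is why the definition has to be fixed before any number is quoted.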

    Michael Harris in his new book (Chapter 5) uses Bernoulli trials to address this issue and to show that there is always a small but finite probability of ruin for every possible trading strategy.

    Thus, to have an accurate measure of the probability of survival or ruin of a specific trading strategy you must know in advance its success rate and odds of winning.

    The paradox is that you can know whether you will survive only after you have survived.

    You see, in your example you used "a 50/50 chance of a winning trade", so you already assumed that the success rate was known and constant.

    If the success rate is a random variable, nothing can be determined a priori.

    Too bad, that is another reason hedge funds fail although they employ Nobel Prize winners.
  4. It's getting a bit late, so hopefully this comes out coherent and somewhat intelligible. However, I was a bit intrigued by the Catalan number idea, so I looked over it a little bit.

    1) Not sure where you obtained the formula's application, but I think you have misinterpreted this particular application. On your blog, you made the assumption that it applies for all vertices above the abscissa. However, according to an example I'll link, it only applies for vertices that lie on or above the diagonal line that starts at the origin and ends at the final vertical point (corresponding to xf=14 in your example). So, it actually excludes 1/2 of the possible vertices that should be included above the abscissa.
    (search for the city lawyer problem).

    The correct formula for the probability of always being above water (or survival, as you called it), or more technically the likelihood of 2t vertices occurring on the positive side of the abscissa, is

        p(2t, 2n) = C(2t, t) * C(2n - 2t, n - t) / 2^(2n)

    where C(a, b) means "a choose b". ** In the case of your example, with the final trial at 2n = 14 (so n = 7 in the formula, and t = n), the resulting probability of all vertices being positive is 20.9%.
    Notice it's roughly double your result (i.e. it includes the area below the box diagonal that your application excluded). It's not exactly a linear relationship, as the probabilities follow something called an arcsine law.
    Another corollary of the arcsine law (regarding your 50/50 assumption) is that spending exactly 50% of the time above the abscissa, i.e. in the lead, is the least likely outcome (so the balance you refer to is flawed). It becomes increasingly likely that one side leads for any fraction of the time further above 50%.
    Regarding intradaybill's reply, notice the correct formula includes one of the points made there: the divisor is made up of the 2^(2n) equally likely outcomes (each has probability 1/2^(2n) with a 50/50 chance per event), rather than the 7 factorial and other coefficients used in maxdama's blog example.
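As a sanity check on the numbers in this post, here is a small sketch that evaluates the arcsine-law formula for 2n = 14 and compares it with a brute-force count over all 2^14 coin-toss paths. The function names are made up for illustration; the only assumptions are a fair coin and unit steps.

```python
from itertools import product
from math import comb, factorial

def arcsine_prob(t, n):
    """P(2t of the 2n steps lie on the positive side) for a fair +/-1 walk."""
    return comb(2 * t, t) * comb(2 * n - 2 * t, n - t) / 2 ** (2 * n)

def count_never_negative(steps):
    """Brute force: count +/-1 paths whose running sum never drops below 0."""
    count = 0
    for path in product((1, -1), repeat=steps):
        total = 0
        for step in path:
            total += step
            if total < 0:
                break
        else:  # loop finished without going negative
            count += 1
    return count

n = 7
catalan_7 = comb(2 * n, n) // (n + 1)  # 7th Catalan number, 429

if __name__ == "__main__":
    # Formula vs. brute force for "never below the axis" over 14 tosses
    print(arcsine_prob(n, n), count_never_negative(2 * n) / 2 ** (2 * n))
    # Full distribution over t = 0..7: largest at the ends, smallest in the middle
    print([round(arcsine_prob(t, n), 4) for t in range(n + 1)])
    # Catalan count divided by 7! versus divided by all 2^14 outcomes
    print(catalan_7 / factorial(n), catalan_7 / 2 ** (2 * n))
```

Both routes give 3432/16384, about 20.9%. The Catalan number counts only the paths that stay on or above the axis and also end exactly at break-even, so dividing it by 2^14 gives about 2.6%, while dividing by 7! = 5040 reproduces the 8.5% figure quoted earlier in the thread.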

    Regarding your interpretation of survival, it's sort of what I mentioned earlier: there is a great divide between some of the binary theories being postulated on some of these threads and reality. These types of simple binary problems assume fixed betting with only two possible outcomes.
    If trading were really like that, there might be relative drawdowns along the way, but once you cross the horizontal axis you have blown up -- capital depleted. In this sense, the model is correct -- survival implies you have not crossed below the horizontal axis, or else your capital is gone.

    Reality often deals more along the lines of non-parametric distributions with a large range of continuous outcomes, and often they have fat tails, which completely nullify many of these types of binary theories. It also deals with compounding, unlike the example application here, as well as fractional betting. All of these pragmatic issues change the dynamics of this simple model quite a bit.

    Probability of survival in maxdama's model (or even any simple binomial model) is not the same under 14 coin tosses being all heads compared to other possible outcomes. The likelihood of all heads is quite different from that of, say, half heads and half tails -- meaning the probability of exceeding the horizontal threshold is likewise different. In two coin tosses, two heads are not as likely to show up as one head and one tail. Basic binomial distribution.
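The binomial point can be checked in a couple of lines. This is just the standard binomial probability mass function, assuming a fair coin; `binom_pmf` is an illustrative helper name.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n independent tosses with P(head) = p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

if __name__ == "__main__":
    # Two tosses: two heads vs. one head and one tail (in either order)
    print(binom_pmf(2, 2), binom_pmf(1, 2))
    # Fourteen tosses: all heads vs. exactly seven heads
    print(binom_pmf(14, 14), binom_pmf(7, 14))
```

With two tosses the values are 0.25 and 0.5; with fourteen tosses, all heads has probability 1/16384 while exactly seven heads has probability 3432/16384.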

    **The formula is shown in the 1st book I referenced earlier.
  5. maxdama


    dtrader and intradaybill,

    Thanks for the criticism. You both brought up variations of the three criticisms/assumptions, and added a few more, that I was trying to eliminate to make it practical. Looks like this is not going to be a very fruitful model.

    What do you think is the best way to model risk for the sake of position-size optimization? That's my main goal. Do you think Monte Carlo is the best?
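For context, a Monte Carlo approach to this question might look like the sketch below. Everything in it is an assumption for illustration: a fixed win rate and payoff ratio, a fixed fraction of equity risked per trade with compounding, and "risk" defined as the chance that maximum drawdown exceeds a chosen limit. The function and parameter names are hypothetical.

```python
import random

def simulate_max_drawdown(n_trades, win_rate, payoff_ratio, fraction, rng):
    """Max peak-to-valley drawdown (as a fraction of peak) along one simulated path."""
    equity = peak = 1.0
    max_dd = 0.0
    for _ in range(n_trades):
        if rng.random() < win_rate:
            equity *= 1 + fraction * payoff_ratio  # win: gain payoff_ratio times the stake
        else:
            equity *= 1 - fraction                 # loss: lose the stake
        peak = max(peak, equity)
        max_dd = max(max_dd, 1 - equity / peak)
    return max_dd

def drawdown_risk(fraction, limit, n_paths=2000, n_trades=100,
                  win_rate=0.5, payoff_ratio=2.0, seed=0):
    """Estimated probability that max drawdown exceeds `limit` at this bet fraction."""
    rng = random.Random(seed)
    hits = sum(
        simulate_max_drawdown(n_trades, win_rate, payoff_ratio, fraction, rng) > limit
        for _ in range(n_paths)
    )
    return hits / n_paths

if __name__ == "__main__":
    for f in (0.02, 0.05, 0.10, 0.25):
        print(f, drawdown_risk(f, limit=0.20))
```

Sweeping the bet fraction like this gives a risk curve that can be traded off against growth. In practice the fixed win rate and payoff would be replaced by resampling the strategy's own trade history, which avoids the binary-outcome assumption criticized above.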


  6. aonelite


    Sound cool
  7. Good question. There are many answers depending on the objective function you use in the optimization.

    (1) Do you want to minimize risk and maximize return?

    (2) Do you want to minimize variance of returns?

    (3) Are you looking for geometric equity growth?

    (4) ......

    I think you should read the paper by Michael Harris where he shows that geometric growth is accomplished when position size is calculated using the ratio of expectancy to average winning trade, also known as the Kelly formula. Maybe you could expand on that paper; I am looking forward to seeing if you come up with something new along these lines.
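The ratio described above can be written out as a short sketch. This is the usual textbook form of the Kelly fraction for a win/loss game, not code taken from the paper; `kelly_fraction` and its parameters are illustrative names, and the win rate and average win/loss are assumed to be known and constant.

```python
def kelly_fraction(win_rate, avg_win, avg_loss):
    """Fraction of capital to risk per trade: expectancy / average winning trade.

    avg_win and avg_loss are positive amounts per unit risked.
    Algebraically equal to W - (1 - W) / R, with R = avg_win / avg_loss.
    """
    expectancy = win_rate * avg_win - (1 - win_rate) * avg_loss
    return expectancy / avg_win

if __name__ == "__main__":
    # 50% winners, average win twice the average loss
    print(kelly_fraction(0.5, 2.0, 1.0))
    # Zero-expectancy game: nothing should be risked
    print(kelly_fraction(0.5, 1.0, 1.0))
```

A negative result means the expectancy is negative and the formula says not to take the trade at all.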