Help needed with probability math

Discussion in 'Automated Trading' started by braincell, Nov 18, 2011.

  1. To be fair, braincell, it's frustrating. There are people (very few) here who know what they are talking about and don't mind joining a discussion once or twice with beginners on actual technical subjects (why bother here when you can do it on wilmot or nuclearphynance where the people are FAR FAR better).

    Then, inevitably, some strange loon shows up and starts polluting the thread with nonsense. Nonsense worded well enough that anyone who doesn't really know the subject (ie, probabilities, expectation operators, etc) thinks it's a legit opinion - but it's not - just equations copied from wikipedia, misapplied and misused.

    Which, of course, drives the first group of people insane as they now find themselves trying to defend basic concepts against cranks - and not winning - because there are no qualified judges and because of the awful ratio of cranks to experts on ET.

    So after some name calling (because it's not only a waste of time, but also beneath the dignity of the members of the first group), the members of the first group vow - never again - and leave for sunnier pastures (or go to read nypost or other, more interesting, nonsense) in those gaps when they have some free time in their jobs in actual, non-pretend, finance.

    Good morning. :)

     
    #41     Nov 20, 2011
  2. You are wrong. He is right. The expectation operator E[.] is NOT the same as the mean, which is a sample estimate of the population expectation E[.]

    You might find it interesting that there are OTHER sample estimates of expectation that are NOT the arithmetic mean.

    Finally, random DOES NOT equal cannot-model. In fact, the ENTIRE field of statistics deals with modeling values with random components.

    ... please don't call others ignorant when you are completely clueless as to what you are talking about...

     
    #42     Nov 20, 2011
  3. neke

    If the intention is strictly finding the probability of hitting a large figure, and assuming the small sample size is not a problem, it is quite obvious system 1 has the higher probability (it doesn't make it the better system though). The std dev for system 1 is 207, while for system 2 it is 16. Due to the very small sample size, I think there is no harm in assuming a normal distribution. In this case the probability of 10 (mean) or more for both systems is 50%, but if you are thinking of +20, +30, etc, the probability for system 1 is higher (just as it is higher for -20, -30, etc or worse). Which system is better? For the same return, it is obviously better to go with a system with less volatility. Why crank up frightening risk if your expectation is the same?
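
A minimal sketch of that comparison, assuming (as the post does) that each system's result is normally distributed with mean 10 and the quoted standard deviations of 207 and 16. The function name `prob_at_least` is made up for illustration, and the normality assumption is questionable with so few samples.

```python
import math

def prob_at_least(threshold, mean, std_dev):
    """P(X >= threshold) for X ~ Normal(mean, std_dev)."""
    z = (threshold - mean) / std_dev
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

MEAN = 10.0                                      # same mean for both systems
STD_DEV = {"system 1": 207.0, "system 2": 16.0}  # values quoted in the post

for target in (10, 20, 30):
    for name, sd in STD_DEV.items():
        p = prob_at_least(target, MEAN, sd)
        print(f"P({name} >= {target:+d}) = {p:.3f}")
```

Under these assumptions both systems sit at 50% for +10, and system 1 has the larger probability for every target above the mean. By symmetry it also has the larger probability of ending below -20, -30 and so on.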

     
    #43     Nov 20, 2011
  4. Good morning! Yes, i read that post while still having my morning coffee and i thought it didn't make that much sense. I realized earlier that it is in fact what you say, so i didn't make much of it.

    Further, i used your advice for sharpe ratio, but i'm worried i can't quite make sense out of it in extreme cases either, probably because i'm a biased human being. I'm talking about one specific case.

    here is an example:

    Say we have system3 added to the mix. In this system the outputs are
    6 x +1
    5 x -1

    Meaning 6 days with a +1 profit and 5 with a -1.

    The sharpe ratios are as follows:

    System1 SR = 0.851
    System2 SR = 10.425
    System3 SR = 1.44
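
The thread does not say how these ratios were computed. As a rough check, here is a sketch for System3 only, assuming the population standard deviation and an annualization factor of sqrt(250) trading days. Both assumptions are guesses that happen to land close to the posted 1.44; they are not something the post confirms.

```python
import math
import statistics

# System3 from the post: 6 days of +1 and 5 days of -1.
system3 = [1.0] * 6 + [-1.0] * 5

avg = statistics.mean(system3)
std_dev = statistics.pstdev(system3)   # population std dev (assumed)
sharpe_per_day = avg / std_dev

# Annualization by sqrt(250) is an assumption, not stated in the thread.
sharpe_annualized = sharpe_per_day * math.sqrt(250)

print(f"avg = {avg:.4f}, std dev = {std_dev:.4f}")
print(f"per-day Sharpe = {sharpe_per_day:.4f}")
print(f"annualized Sharpe = {sharpe_annualized:.2f}")
```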

    So, this is the non-linearity i'm worried about. System1 is more profitable than System3 with a slightly greater risk. However, the difference in profitability is huge: System1 avg = +10 and System3 avg = +0.09

    Using SR only, System3 would be picked before System1 (ignoring System2). This probably means that if using SR for finding the right systems, it's quite possible i would end up with only those that trade the spread, and make micro movements over a long time. It's very hard to model scaling up a strategy into the testing model itself (plus it would last 100x as long to compute), but I still agree with the notion that SR is more important and can later be "somehow" scaled up. I'm just not sure if i should follow it to these extremes.

    Are my worries founded?

    PS.
    If we used Score = SR * Avg = Avg^2 / StdDev(series) , then these systems would be scored as i would think using "common sense". Putting an exponent on the avg price balances out the importance of Avg and StdDev. Maybe. I'm not sure it's the right thing to do though.
    In that case we would get:

    S1 = 8.51
    S2 = 104.25
    S3 = 0.1296
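
A small sketch that reproduces these scores from the posted Sharpe ratios and averages, and compares the two rankings. The averages of 10 for System1 and System2 and 0.09 for System3 are taken from the thread (System2's is implied by 104.25 / 10.425); the dictionary layout is just for illustration.

```python
# Posted Sharpe ratios and average returns for each system.
systems = {
    "System1": {"sr": 0.851, "avg": 10.0},
    "System2": {"sr": 10.425, "avg": 10.0},
    "System3": {"sr": 1.44, "avg": 0.09},
}

# Score = SR * Avg, as proposed in the post.
scores = {name: s["sr"] * s["avg"] for name, s in systems.items()}

ranked_by_sr = sorted(systems, key=lambda n: systems[n]["sr"], reverse=True)
ranked_by_score = sorted(scores, key=scores.get, reverse=True)

for name in systems:
    print(f"{name}: SR = {systems[name]['sr']}, Score = {scores[name]:.4f}")
print("ranked by SR:   ", ranked_by_sr)
print("ranked by Score:", ranked_by_score)
```

The only change between the two rankings is that System1 and System3 swap places. System2 comes first either way.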
     
    #44     Nov 20, 2011
  5. If the problem you have is that the system with the higher sharpe ratio has a lower average return, then you can simply multiply the positions of that system with a scalar.

    That is, if r* is your desired average return and r is the average return of the highest sharpe ratio system, then simply multiply all the positions of that system by r*/r. The sharpe ratio is invariant to scalar multiplication (isn't that a nice property :) <- sharpe ratio is invariant to leverage).
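
A minimal sketch of the r*/r scaling, using the System3 series from earlier in the thread and a hypothetical desired average of 10. The `sharpe` helper uses the plain mean over population standard deviation with no annualization, which is an assumption; the invariance holds for any positive scale factor either way.

```python
import statistics

def sharpe(returns):
    """Mean divided by population std dev (no annualization)."""
    return statistics.mean(returns) / statistics.pstdev(returns)

returns = [1.0] * 6 + [-1.0] * 5   # System3: 6 days of +1, 5 days of -1
target_avg = 10.0                  # r*, a hypothetical desired average return

scale = target_avg / statistics.mean(returns)   # r*/r
levered = [scale * x for x in returns]

print(f"scale factor r*/r = {scale:.1f}")
print(f"average: {statistics.mean(returns):.4f} -> {statistics.mean(levered):.4f}")
print(f"sharpe:  {sharpe(returns):.4f} -> {sharpe(levered):.4f}")
```

Scaling multiplies the mean and the standard deviation by the same factor, so the ratio is unchanged while the average moves to the target.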

    Now, if the problem then is that you can't lever the high sharpe ratio system to the desired expected return (because of margin, etc), then the problem gets a little complicated, but still solvable.

    It then becomes a problem of mixing a set of leverage over each system such that the total sharpe of that mixture is maximized while attaining your desired return (I hope this statement reads like something else you have read about).
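
One textbook way to write that mixing problem down, as a sketch only. The symbols are not from the thread: w is the vector of leverage applied to each system, μ the vector of average returns, Σ the covariance matrix of the systems' returns, and r* the desired return. For a fixed return, maximizing the Sharpe ratio is the same as minimizing the variance:

```latex
\begin{aligned}
\min_{w}\quad & w^{\top} \Sigma\, w \\
\text{subject to}\quad & w^{\top} \mu = r^{*}
\end{aligned}
\qquad\Longrightarrow\qquad
w = r^{*}\,\frac{\Sigma^{-1}\mu}{\mu^{\top}\Sigma^{-1}\mu}
```

The closed form assumes no other constraints. Once margin or per-system leverage limits are added, it no longer applies in general and the problem needs a constrained quadratic solver.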

    I still don't think Avg*SR is the right thing to do (I didn't bother working this part out entirely, so it's probably true up to some multiple + constant), but I suspect Avg*SR is actually the solution to a degenerate optimization problem similar in spirit to what I hinted at in the paragraph above, but with some assumptions you probably didn't want to make.

    Which brings us full circle - intuition is really important because it guards you against doing silly things with probabilities (much like our friend who polluted this thread), but you must check everything with work.

     
    #45     Nov 20, 2011

  6. I see what you mean, but the hard part then becomes finding "desired average return" which depends on correlation of system vs other systems, leverage decrease, the actual sharpe ratios, etc. That then becomes the hard part, and obviously something i shouldn't be expecting anyone to give me answers to on ET (or anywhere for that matter). I'll figure it out, but thanks for the idea of r*/r. It makes perfect sense.

    Yes that statement reads like something i have read about, and am trying to implement.

    I think my approach at first will be to limit strategies with constraints (internal and external combinations) so that they don't tend to end up being low Avg. I think you can guess what i'm doing when i say they will have a small sample size. ;) If that fails, i will try to introduce a scalar, plus with some studies on scalability, and then it will probably be easier knowing what kind of ranges i can expect for the "desired average return" input.

    Went to nuclearphynance btw, looks interesting!

    cheers
     
    #46     Nov 20, 2011
  7. Desired average return is just whatever you want your systems to return. That number comes entirely from you, and it's the easiest number to come up with of the whole thing.

    But I see you are getting the hint... the rest of the problem does require correlations, etc.

    P.S.: if you post your general problem, and the ideas you've explored so far (using sharpe ratio ranking; difficulties of mixing systems with very different returns), etc, on the more learned forums, I'm sure there'll be a lot more people who are willing to give you relevant avenues of research to head down.

     
    #47     Nov 20, 2011
  8. Sounds like a quadratic optimization for allocation, i.e. Markowitz.
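
A small numeric sketch of that kind of allocation for two systems, assuming no leverage or margin limits so the minimum-variance weights have a closed form. The means of 10 and standard deviations of 207 and 16 are the figures quoted earlier in the thread; the zero correlation and the function name `min_variance_weights` are assumptions made for illustration.

```python
import math

def min_variance_weights(mu, cov, target):
    """Weights w minimizing w' cov w subject to w' mu == target (2 systems)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]   # inverse of the 2x2 covariance
    inv_mu = [inv[0][0] * mu[0] + inv[0][1] * mu[1],
              inv[1][0] * mu[0] + inv[1][1] * mu[1]]
    denom = mu[0] * inv_mu[0] + mu[1] * inv_mu[1]
    return [target * x / denom for x in inv_mu]

mu = [10.0, 10.0]    # average returns quoted in the thread
sd = [207.0, 16.0]   # standard deviations quoted in the thread
rho = 0.0            # correlation between the systems (assumed)
cov = [[sd[0] ** 2, rho * sd[0] * sd[1]],
       [rho * sd[0] * sd[1], sd[1] ** 2]]

w = min_variance_weights(mu, cov, target=10.0)
port_mean = w[0] * mu[0] + w[1] * mu[1]
port_var = (w[0] ** 2 * cov[0][0] + 2 * w[0] * w[1] * cov[0][1]
            + w[1] ** 2 * cov[1][1])
port_sd = math.sqrt(port_var)

print(f"weights = {w[0]:.4f}, {w[1]:.4f}")
print(f"mixture mean = {port_mean:.2f}, std dev = {port_sd:.2f}")
```

With these inputs nearly all of the weight goes to the low-volatility system, and the mixture's standard deviation comes out slightly below 16.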
     
    #48     Nov 20, 2011
  9. kut2k2

    You're welcome. The problem is that some people can't admit when they're wrong. I have a history with ronblack, who probably wouldn't have said a word if almost anybody else but me had posted what I posted. All I did was post a very common empirical formula for calculating a trader's edge. Given the full order of magnitude difference, that's significant even for a sample size of 5.

    And notice the gyrations that top-posting mouthbreather DontMissTheShortBus has to twist himself into trying to "prove" that all I did was restate his Sharpe nonsense. Since when is E[x²] equal to the variance? Oh yeah, it's when E[x] is zero. But E[x] is emphatically not zero here. See how ridiculous these little math dilettantes are? Like undisciplined puppies soiling your carpet, they don't realize it unless you rub their noses in it.
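
For reference, the identity behind the E[x²] remark, with the System3 series from earlier in the thread (6 x +1, 5 x -1) as a worked case:

```latex
\operatorname{Var}(x) = E[x^{2}] - \bigl(E[x]\bigr)^{2}
\qquad\text{so}\qquad
\operatorname{Var}(x) = E[x^{2}] \iff E[x] = 0

% System3: E[x] = 1/11, E[x^2] = 1
\operatorname{Var}(x) = 1 - \tfrac{1}{121} = \tfrac{120}{121} \approx 0.9917
```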

    Impossible to have a civil conversation with arrogant fools. My 2¢.
     
    #49     Nov 20, 2011
  10. ssrrkk

    I agree with many posters: I think the Sharpe ratio is the answer. It tells you the trade-off between mean profitability versus std dev (risk). This is similar to the t-statistic which is mean / std err, but if you are comparing t-statistics between systems, then one way to think of it is you could omit the sqrt(n) factor in the std err. Another way to quantify the "luck" part of this is to calculate the 95% confidence interval of the mean profitability estimate. System 1 will have a much wider confidence interval, meaning that there is a larger chance that the mean profitability estimate (10) is way off in system 1. Therefore, you would consider it a "riskier" system. Unfortunately we cannot really calculate that CI without assuming something about the distribution, and with just 5 samples, one probably could not assume a normal distribution as many have mentioned.
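
A rough sketch of the t-statistic and 95% confidence interval for the two systems, using the mean of 10, the standard deviations of 207 and 16, and the sample size of 5 mentioned in the thread. It assumes normally distributed results, which, as noted above, is hard to justify with 5 samples. The critical value 2.776 is the two-sided 95% Student's t value for 4 degrees of freedom, hard-coded here instead of pulled from a statistics library.

```python
import math

T_CRIT_95_DF4 = 2.776   # two-sided 95% Student's t critical value, 4 degrees of freedom

def mean_ci(mean, std_dev, n, t_crit):
    """Return (lower, upper, t_statistic) for the estimated mean."""
    std_err = std_dev / math.sqrt(n)   # standard error of the mean
    half_width = t_crit * std_err
    return mean - half_width, mean + half_width, mean / std_err

N = 5
lo1, hi1, t1 = mean_ci(10.0, 207.0, N, T_CRIT_95_DF4)
lo2, hi2, t2 = mean_ci(10.0, 16.0, N, T_CRIT_95_DF4)

print(f"system 1: t = {t1:.3f}, 95% CI = [{lo1:.1f}, {hi1:.1f}]")
print(f"system 2: t = {t2:.3f}, 95% CI = [{lo2:.1f}, {hi2:.1f}]")
```

Both intervals are centered on 10, but system 1's is far wider. Dropping the sqrt(n) factor from the standard error turns the t-statistic back into the per-period mean / std dev ratio.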

    I think the question a poster asked earlier (regarding performance of sys1 + sys2 etc) may be related to the additivity of mean and variance (but not std dev).
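
For reference, the additivity in question, where X and Y stand for the results of two systems (symbols not used in the thread). The numeric line assumes the two systems are uncorrelated and uses the standard deviations of 207 and 16 quoted earlier:

```latex
E[X + Y] = E[X] + E[Y]
\qquad
\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y)

% Uncorrelated case, Cov(X, Y) = 0:
\sigma_{X+Y} = \sqrt{207^{2} + 16^{2}} = \sqrt{43105} \approx 207.6 \neq 207 + 16
```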
     
    #50     Nov 21, 2011