Thoughts on over-filtering...

Discussion in 'Strategy Building' started by fulliautomatix, Nov 11, 2005.

  1. New to elitetrader- have traded the last 2 years. Now focusing on a mechanical strategy I've developed following intra-day trends of the ETFs. I wrote this a while back while doing some research on trading strategies and putting some ideas on paper, and wanted to share to see what you guys think.

    Basically on over-filtering a trading strategy, why I'd rather have 100 trade signals over 10 (if I'm following a mechanical system)- and why so many back tested strategies that look great fail due to curve fitting. Probably nothing new to those of you who've been trading a while though.

    Note- While I am good w/ stats, etc. I am certainly not an expert! These are basically just my thoughts on the matter.

    ---------

    Insurance companies do not want to simply hand out insurance to the entire population. Excluding obvious, easily proven outliers- such as people already suffering from cancer or AIDS- drastically reduces the insurer's exposure to claims it already knows are coming. However, once a group of similar risks is lumped together (such as the workforce of a large financial firm), the Law of Large Numbers comes into play; this is why insurance companies demand that, in the case of group insurance, 100% of the workers sign up for the plan- it reduces adverse selection.

    Why would insuring the entire working population of a company, rather than trying to cherry-pick a group of people with a less-than-average chance of getting sick, be more desirable from an insurance standpoint? The Law of Large Numbers is the reason. This law of statistics states that:

    “The average of a large number of independent measurements of a random quantity tends toward the theoretical average of that quantity.”

    The more samples you take from a population with a similar risk profile, the more likely you are to come close to the average value of the entire population. For example, say that out of a population of working professionals, 100 out of every 1000 will develop a grave disease over the course of the upcoming year. Picture what could happen if an insurance company tried to hand-pick the workers it would insure and, instead of insuring the entire population of 1000, narrowed the number of people receiving insurance down to only 100. Given that 100 people (10% of the 1000) are statistically destined to become gravely ill, there is a chance (albeit a small one) that all 100 (or 100%) of the hand-picked group will be the unfortunate statistical victims!

    Given this scenario, there are a number of possible outcomes.
    1. 100% of the 100 chosen people are disease free
    2. Only a fraction of the chosen population will be stricken with a disease
    3. 100% of the 100 chosen are also members of the group statistically destined to fall gravely ill.

    Out of these three scenarios, only the first one would really be acceptable to an insurance company. The second scenario could still yield disastrous results, since each member of that small group who falls ill makes up a much larger percentage of the insured than if the entire population had been covered. Even if only ten of the cherry-picked group became sick, the insured pool would already be running at 10%- the same illness rate as the entire population of 1000. What happens if 20 become ill? More?

    It is because of these risks that insurance companies seek to include as many people of similar risk profiles as possible in their group plans. Typically it is mandatory that 100% of a company seeking such a plan be covered under the blanket policy. Narrowing down the population to be insured any further only increases the risk of ending up with a disproportionate number of insured who become ill. The fewer people included from a population of similar risk profiles, the greater the risk of adverse selections (in this case, gravely ill workers) within the group- better to spread the risk out over the entire population of similar risks. The larger the number of insured, all with similar risk profiles, the further the risk is distributed.

    It is only when an entire population of similar risk profiles is taken into consideration that these numbers become acceptable. As the Law of Large Numbers states, the larger your sample size, the more accurate your predictions will be. In the above example, if the insurance company covered the entire population of 1000 workers and the statistical 100 became ill, the impact of those illnesses would be drastically reduced. While there is now an almost guaranteed statistical loss on 10% of the workers, there is close to a 0% chance that 100% of the workers will become ill, keeping the overall risk profile of the insured group to a minimum.
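    (To put rough numbers on this, here's a quick toy simulation- the flat 10% illness rate and purely random selection are assumed for illustration only, not any real actuarial model. It just shows how the spread of outcomes shrinks as the group grows:)

```python
import random

POP_SIZE = 1000       # total workers in the example
SICK_RATE = 0.10      # 100 of every 1000 fall gravely ill
TRIALS = 10_000       # how many hand-picked groups to simulate

# 1 = falls ill this year, 0 = stays healthy
population = [1] * int(POP_SIZE * SICK_RATE) + [0] * int(POP_SIZE * (1 - SICK_RATE))

def sick_fractions(sample_size):
    """Fraction of ill workers across many randomly hand-picked groups."""
    return [sum(random.sample(population, sample_size)) / sample_size
            for _ in range(TRIALS)]

for n in (10, 100, 1000):
    fracs = sick_fractions(n)
    print(f"group of {n:4d}: average sick rate {sum(fracs) / TRIALS:.3f}, "
          f"worst group {max(fracs):.2%}")
```

    The average comes out near 10% no matter what, but the worst-case group of 10 or 100 is far uglier than the worst-case group of 1000- which is exactly the insurer's problem with small pools.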

    Take the same theory- that a larger number of samples from a similar risk profile will produce more stable, predictable results than a smaller sample- and move the focus away from insurance and towards trading the markets. The results are very much the same.

    While analyzing a certain market, an investor/trader can attempt any number of strategies- most utilizing certain filters in an attempt to narrow down the total universe of stocks/futures/etc. into a smaller population that meet certain predefined criteria. These criteria are chosen with the belief that they will produce a list of potential trading/investing opportunities that will be profitable.

    -need to split here to keep post length down...
     
  2. There are many who do not feel it is possible to ‘beat’ the market, and who promote instead the long-term holding of index funds- funds which include many, if not all, of the stocks in a certain market. While this approach certainly has its merits, why would an experienced investor/trader not take the initiative to try to remove the obvious outliers from the larger population? This would be very similar to how insurance companies offer their blanket insurance policies- they cull the obvious anomalies out of the entire population (for example, all of North America) and focus on a grouping of similar risk profiles (working professionals). As explained above, by removing extreme risks (i.e., the homeless, the terminally ill, etc.) they are able to increase their odds of having a portfolio that is generally ‘healthy’.

    This of course isn’t limited to stock picking; even a strategy that trades only one instrument at a time fits this example perfectly. Say a certain technical indicator gives, over the course of a month, a thousand ‘buy’ signals. Filtering out obvious outliers, whether by intuition or by using other indicators as confirmation, narrows the population down to signals with similar risk profiles- in this case meaning they all have similar odds of being profitable. Taking the entire remaining population into consideration, you now have a universe of signals that, theoretically, should all carry the same ‘risk profile’. Trading every one of these signals should lead to a situation similar to the insurance examples above- your population of potential trades will include all of the winners, losers, and in-between trades. If you have a strategy that generally has more winners than losers (or that statistically comes out ahead thanks to tight stop-losses or some other risk-management system), you will ultimately have a winning strategy. Not taking the entire universe of statistically similar trades is over-filtering.
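    (A quick sketch of that last point- the win rate and payoff sizes below are made up purely for illustration, not taken from any real strategy. The per-trade expectancy is what matters, and it only adds up when enough of the statistically similar signals are actually taken:)

```python
# Toy expectancy check- numbers assumed for illustration only
win_rate = 0.45      # fewer winners than losers...
avg_win  = 2.0       # ...but the average winner is twice the size
avg_loss = 1.0       # of the average loser (e.g. thanks to tight stops)

expectancy = win_rate * avg_win - (1 - win_rate) * avg_loss
print(f"expectancy per trade: {expectancy:+.2f}R")      # +0.35R per signal

# The edge is only reliable over a large number of statistically similar signals:
for n in (10, 100, 1000):
    print(f"expected result over {n:4d} trades: {expectancy * n:+.0f}R")
```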

    Of course the risk of over-filtering is as relevant to trading as it is to the insurance industry. Whereas severely limiting the number of people insured under a plan increases the risk that a greater proportion of them will become ill, limiting the number of trades taken (all with similar risk profiles, of course) increases the risk that there will be too many unprofitable trades!
    Take this example into consideration. Say there is a statistical expectation that out of 1000 trading signals, a certain strategy will produce 100 drastic losses, but also 10 gains which more than make up for those losses. The rest would consist of an even mix of winning and losing trades (keep in mind that since this is one trading strategy, all of these trades already have similar risk profiles). A trader who doesn’t take each of these signals runs the risk that the trades they do take will have a greater percentage of losses. Consider these possible outcomes that could occur with this strategy if only 100 out of the 1000 trades are taken:

    1. All 100 trades result in an even mix of wins/losses.
    2. All 100 trades turn into losses.
    3. 10 of the trades are amazingly profitable, the rest a mix- the best possible outcome.
    4. There are too few very profitable trades, and too many drastic losses.

    As with the insurance example, limiting the number of trades could result in great financial gains- but it also greatly increases the chances of a devastating loss. If all of the trades in this example strategy are taken, 100 of them will almost certainly result in a loss- but those losses will be more than offset by the 10 big winners which are also included. Not taking all of the trades not only greatly reduces the chance of catching those winners, it also raises the weight of each loss that does occur: a single severe losing trade makes up a much higher percentage of a small sample than of the full population of potential trades (i.e., 1 losing trade out of 10 carries the same weight as 10 losing trades out of 100)!
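    (Here's a rough Monte Carlo sketch of that exact example. The payoff sizes are invented just to match the description- 100 drastic losses, 10 big winners that more than cover them, and 890 in-between trades that roughly cancel out- so treat the numbers as illustrative only:)

```python
import random

# Hypothetical payoff profile for the 1000 signals described above (in R units)
signals = [-3.0] * 100 + [+40.0] * 10 + [+1.0] * 445 + [-1.0] * 445
random.shuffle(signals)

take_all = sum(signals)                      # taking every signal, every time

TRIALS = 10_000
subsets = [sum(random.sample(signals, 100)) for _ in range(TRIALS)]
losing = sum(1 for r in subsets if r < 0) / TRIALS

print(f"all 1000 signals:   {take_all:+.0f}R (guaranteed, by construction)")
print(f"random 100 signals: avg {sum(subsets) / TRIALS:+.1f}R, "
      f"negative in {losing:.0%} of runs")
```

    Taking everything locks in the statistical result; taking a cherry-picked tenth of the signals leaves the average positive but turns a meaningful fraction of the runs into net losers- and that's before any bias in which signals get skipped.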



    Consistency is Key in Automated Trading:

    Going back to the above examples, it must be understood that consistency in trading an automated strategy is key to its success. Given that every signal in the population has the same probability of being right or wrong, every trade must be made consistently- same size of capital traded, same risk-management rules, etc.- in order for that statistical edge to come through as a profit.

    Sizing up or down from signal to signal is almost certain to cause disaster (unless the sizing rules are themselves part of the mechanical strategy, which is beyond the scope of this paper)- how is one to know which trade is going to be the next big loss? If a trader suddenly begins to randomly increase the size of certain trades, they increase the risk that a sized-up trade will bring a loss that is not comparable to the winners taken at smaller size. The results will be skewed, with the larger losses pulling down the winning averages.
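    (A toy comparison- the 55% win rate and 1R wins/losses are assumed for illustration, not taken from any actual strategy. Putting the same total size to work, but concentrated on a handful of signals instead of spread evenly over all of them, leaves the expected profit alone and just piles on the risk of a losing run- the over-filtering problem in a different costume:)

```python
import random

TRIALS = 10_000
N_SIGNALS = 200
WIN_RATE = 0.55                      # assumed modest per-signal edge

def run(sizes):
    """Total P&L for one pass over the signals with the given size per signal."""
    return sum(size * (1.0 if random.random() < WIN_RATE else -1.0)
               for size in sizes)

flat  = [1.0] * N_SIGNALS            # same size on every signal (200 units total)
lumpy = [10.0] * 20 + [0.0] * 180    # same 200 units, concentrated on 20 signals

for label, sizes in (("flat sizing ", flat), ("lumpy sizing", lumpy)):
    results = [run(sizes) for _ in range(TRIALS)]
    avg = sum(results) / TRIALS
    losing = sum(1 for r in results if r < 0) / TRIALS
    print(f"{label}: avg {avg:+.1f}R, losing runs {losing:.0%}")
```

    Same average profit, several times as many losing runs- the size-ups effectively shrink the sample of trades that actually determines the result.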

    Trading a mechanical strategy with discretion can also prove disastrous, since it runs the risk of reducing the population of statistically similar trades- and reducing the population increases the risk of larger losses.
    ----------

    Well there ya go- this is just the way I go about developing strategies (which I like to see work over multiple markets and multiple time periods- I won't trade things that are only effective in one issue) and looking at the market!