Multi-Asset Class Realtime Cointegration Modeling Platform

Discussion in 'Trading Software' started by dirtybrown, Jun 2, 2012.

  1. Z score chart of nzdusd // nokjpy.

    Here are some terms for those new to this form of quantitative analysis, I am still learning:

    Glossary
    Autocorrelation
    Autocorrelation is used to identify an appropriate model in studying or analysing a time series. Autocorrelation answers the question, “Is there a (linear) relationship
    between the value of a series now and the value of the same series one or more time periods in the past?”
    Positive autocorrelation is the tendency for a given condition to persist – for example, during rainy season tomorrow is likely to be raining and wet if today is raining and
    wet. In equity markets a parallel might be price momentum – especially in the days after financial results. Positive reaction is followed by positive reaction; and negative
    by negative.
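
    As a rough illustration of the definition above, here is a minimal sketch of a sample autocorrelation at a chosen lag. The function name and the toy series are made up for the example; this is not how ArbMakerPlus computes it.

```python
def autocorr(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    # Covariance between each value and the value `lag` periods earlier.
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var

# Toy data (hypothetical): a persistent series and an alternating one.
persistent = [1, 2, 3, 4, 5, 6, 7, 8]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]

print(autocorr(persistent, 1))   # 0.625  -> positive: conditions tend to persist
print(autocorr(alternating, 1))  # -0.875 -> negative: each value tends to reverse
```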


    Cointegrating regression
    A cointegrating regression, like a single linear regression, regresses two or more series or variables and uses output statistics from the regression to determine if they are
    cointegrated. There are, however, important considerations when interpreting outputs.
    Among the outputs are the mean reversion coefficient, beta (β), the coefficient of determination (R²/R-squared) and t-values, all of which are used to interpret the
    degree and nature of the cointegration, if any.

    Cointegration
    Cointegration is a statistical property of two or more time series that indicates whether the series have a long term relationship. When two series are cointegrated their
    prices may move away from each other in the short term, but they will in time revert to a mean value. This mean value is identified by the cointegrating regression of
    the two variables.
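
    To make the regression step concrete, here is a minimal sketch on simulated data, assuming a plain least-squares fit of one series on the other. The data, the `ols` helper and the hedge ratio of 2 are all invented for the example, and no formal stationarity test is run on the residuals, so this shows only the first step of a cointegration check and not ArbMakerPlus's procedure.

```python
import random

def ols(x, y):
    """Ordinary least squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

random.seed(0)
# Hypothetical data: x is a random walk, y tracks 2*x plus stationary noise.
x = [0.0]
for _ in range(499):
    x.append(x[-1] + random.gauss(0, 1))
y = [5.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

alpha, beta = ols(x, y)
# The residuals are the short-term deviations from the fitted long-term relationship.
residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]

print(beta)                        # should land near the simulated hedge ratio of 2
print(abs(sum(residuals)) < 1e-6)  # residuals of a fit with an intercept sum to ~0
```

    If the two series really are cointegrated, the residual series is the thing that should look stationary and mean-reverting.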

    Correlation
    A less robust measure than cointegration, correlation measures the strength of the relationship between two sets of data. A high correlation between a pair of
    equities indicates that the two move together. Knowledge from the statistical tests is combined with knowledge of the market, industry or sector to take a market
    position on the two securities.

    Half-life
    The mean reversion coefficient is the basis for the half-life calculation in ArbMakerPlus:
    t½ = ln(2) / |mean reversion coefficient|
    where ln is the natural logarithm, so ln(2) ≈ 0.693. Half-life is not a magic number, and numbers derived from mean reversion coefficients with an absolute value greater than 1 deserve scepticism. ArbMakerPlus
    outputs the half-lives of pairs but the quantity and timing of the cross-overs on the residual chart, in combination with ArbMakerPlus’s proprietary normality measures,
    should not be ignored and are frequently more helpful.
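
    A small worked example of the half-life formula, as a sketch only. The absolute value is taken so that a negative coefficient gives a positive half-life; that choice and the function name are assumptions for the example.

```python
import math

def half_life(mean_reversion_coefficient):
    """Half-life in time periods, computed as ln(2) / |coefficient|."""
    if mean_reversion_coefficient == 0:
        raise ValueError("no mean reversion: coefficient is zero")
    return math.log(2) / abs(mean_reversion_coefficient)

print(round(half_life(-0.45), 2))  # 1.54 periods
print(round(half_life(-0.10), 2))  # 6.93 periods: slower reversion, longer half-life
```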

    Mean reversion coefficient
    Also called an ‘adjustment coefficient’ it indicates the expected ‘speed’ of mean reversion per time period. Thus a mean reversion coefficient of -0.45 suggests 45%
    movement towards equilibrium one time period from now.
    Absolute mean reversion coefficient values between 0 and 1 suggest stable systems; absolute values greater than one are less stable and also suggest over-shoot. For a
    cointegrating equation to be valid the mean reversion coefficient should be between zero and -1. The negative sign indicates adjustment back towards equilibrium.
    Mean reversion coefficients between 0 and 1 in a stationary time series indicate that the series contains autocorrelation.
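
    One common way to estimate such a coefficient is to regress the change in the residual on its previous level. The sketch below assumes that approach (it is not confirmed as ArbMakerPlus's method) and uses a made-up residual path that closes 45% of its gap to zero each period.

```python
def mean_reversion_coefficient(residuals):
    """Slope of a no-intercept regression of the residual's change on its previous level."""
    prev = residuals[:-1]
    delta = [b - a for a, b in zip(residuals[:-1], residuals[1:])]
    return sum(d * p for d, p in zip(delta, prev)) / sum(p * p for p in prev)

# Hypothetical residual path: each period it keeps 55% of its distance from zero.
residuals = [1.0]
for _ in range(10):
    residuals.append(residuals[-1] * 0.55)

print(round(mean_reversion_coefficient(residuals), 4))  # -0.45
```

    The negative sign is the adjustment back towards equilibrium described above.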

    Modelling time series – lags, autocorrelation and partial autocorrelation
    Most data on economic and financial variables are time series. A time series is a sequence of observations measured at successive points in time – daily, weekly, monthly,
    quarterly or annual, for example. A first step in the statistical analysis of equity prices for pairs is selecting an appropriate model of the series generated over time by the
    economic or financial phenomenon.

    Normality
    Since the sum total of all residual values equals zero they display, by definition, mean-reverting characteristics and oscillate around zero. For pairs traders they will also,
    ideally, be normally distributed both through time and around zero. ArbMakerPlus has configurable tools designed to assess the degree to which this is so – forecasting
    and trading become easier when the most promising such cases can be identified.

    Analysing residuals for normality and mean-reversion is a crucial aspect of assessing the tradability of a pair. Simply identifying the presence of cointegration alone is not
    enough to achieve consistent results.

    Pairs trading
    A market-neutral trading strategy used by arbitrageurs to make money from the divergence and re-convergence of prices of a pair of equities.
    Key to the strategy is an analysis of the historical movements of the prices of the two stocks: the historical data of the pair are statistically tested to see if a relationship
    exists between them.

    Partial Autocorrelation
    ArbMakerPlus uses a partial autocorrelation calculation to determine the appropriate lag order of the AR model. If the sample autocorrelation plot indicates that an AR
    model may be appropriate, then the sample partial autocorrelation is calculated and plotted to identify the order of the model. The order is the number of lags to
    include as regressors in the model and is shown on a PACF plot at the point where the partial autocorrelations essentially become zero. Lagged pairs are generally
    tougher to trade because normalising their residuals means offsetting data series in the recalculation of the regressions necessary to test for cointegration. This in turn
    means losing observation points from the graphical representations of the relationship. For trading purposes that is not very helpful. ArbMakerPlus thus automatically
    filters out higher lagged pairs from its returns.
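
    For those curious how partial autocorrelations can be obtained, here is a minimal sketch using the Durbin-Levinson recursion, which turns autocorrelations into partial autocorrelations. This is a standard textbook method and is swapped in for illustration; the glossary does not say which calculation ArbMakerPlus uses. The input is the theoretical autocorrelation of an AR(1) series with coefficient 0.6.

```python
def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..K, given autocorrelations for lags 1..K
    (acf[0] is lag 1), computed with the Durbin-Levinson recursion."""
    pacf = []
    prev = []  # AR coefficients of the fit of order k-1
    for k in range(1, len(acf) + 1):
        if k == 1:
            phi_kk = acf[0]
            cur = [phi_kk]
        else:
            num = acf[k - 1] - sum(prev[j] * acf[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * acf[j] for j in range(k - 1))
            phi_kk = num / den
            # Update the lower-order coefficients, then append the new last one.
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        prev = cur
    return pacf

# Theoretical autocorrelations of an AR(1) series: 0.6, 0.6^2, 0.6^3.
acf = [0.6 ** k for k in (1, 2, 3)]
pacf = pacf_from_acf(acf)
print(pacf)  # lag 1 is 0.6; lags 2 and 3 are essentially zero, pointing to order 1
```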

    Residuals
    Suppose a sample of investment bankers is taken and the average bonus of the group is calculated. The amount by which each individual banker’s bonus differs from
    this average is called a residual or prediction error. Summing these residuals necessarily equals zero.
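
    The banker example in a few lines, with made-up bonus figures:

```python
bonuses = [120.0, 80.0, 100.0, 150.0, 50.0]  # hypothetical figures
average = sum(bonuses) / len(bonuses)

# Each residual is how far one banker's bonus sits from the group average.
residuals = [b - average for b in bonuses]

print(residuals)       # [20.0, -20.0, 0.0, 50.0, -50.0]
print(sum(residuals))  # 0.0
```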

    Stationarity
    Prerequisite to running a cointegrating regression is testing that the time series are integrated of the same order. The order of integration of a time series tells us about
    the stochastic properties of the series over time. In simple terms, a stationary series has a constant mean and constant variance over time.

    Standard deviation
    A measure of spread. A rough rule of thumb is that around two thirds of observations in a given distribution fall within a single standard deviation of the mean of the
    distribution; 95% fall within 2 standard deviations; and just about everything within 3 standard deviations.
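
    The rule of thumb can be checked for a normal distribution with the error function from Python's standard library. A quick sketch (the function name is made up):

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```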

    Z scores
    A Z score is simply the number of standard deviations by which an observation lies from the mean. This measure combined with the hunt for normal distributions (see the entry on normality above) is key to
    using ArbMakerPlus successfully.
    Why? If, for example, the residuals chart of a pair of equities currently reads ±2 Z scores from the mean value of the series - and the trader has checked that the
    distribution is normal or near normal - he knows that it is an exceptional reading. From the graphic in the standard deviation entry above only about 2.3% of the distribution
    will fall beyond 2 standard deviations on a given side of the mean.

    This suggests, if the pair underlying the residuals is truly cointegrated, that future observations will trend the graph back toward the mean value – because that is where
    most observations must lie in a normally distributed series of data.
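
    A minimal sketch of the two calculations involved: the Z score of a reading against its history, and the share of a normal distribution lying beyond a given Z score on one side. The residual values and function names are hypothetical.

```python
import math
import statistics

def z_score(value, history):
    """Number of standard deviations by which `value` sits above the mean of `history`."""
    return (value - statistics.mean(history)) / statistics.pstdev(history)

def upper_tail_probability(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

residual_history = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical residuals
print(round(z_score(2.0, residual_history), 3))  # 1.414
print(round(upper_tail_probability(2.0), 4))     # 0.0228, i.e. roughly 2.3% beyond +2
```

    The tail figure only means something if the residuals are close to normally distributed, which is why the normality check comes first.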

    Tradability
    In the application of cointegration to pair trading, the residuals are analysed to make the best determination of when the prices of the pair have deviated from their
    long-term mean; and to estimate the time of mean reversion of the series. This is, in essence, the process of gauging tradability. To assist this judgement ArbMakerPlus
    uses normality filters during the scanning process that help return the most tradable pairs in addition to reporting mean reversion statistics.



    ///

    Hopefully not too heavy of a reading for those used to standard technical analysis, haha just kidding.
     
    • z.png
  2. How does ArbMaker analyse/forecast the "residuals" since they are random/normally distributed?
    It is generally mean reverting but how do you predict what is the next value/range between the upper/lower limits?
     
  3. Residuals are the difference between the predicted and the actual values from the market.
     
  4. Yes, residuals must always tend back to 0.

    This is what makes it different from a spread calculation, such as the arbomat implementation done by 7bit that sends forex data from metatrader directly through R, then plots a mean reversion.

    Because it's a spread, there's no guarantee it will go back to 0, because it's a more direct reflection of price. The scalings can shift, effectively repainting.

     
  5. Yes, but when? How can you tell it is ready to go back to zero?
     
  6. No quick answer on that as I need to go back to work. There are several other charts, including MACD and stochastics done on the spread ratio, that seem to make it pretty intuitive when to enter.
    After the initial cointegration test, you look at all of the chart studies together.

    Will post more later as I use it more.
     
  7. Ok, good post.