Mean reversion basket

ajensen · Apr 13, 2017

For a group of M related stocks that I specify, I would like to backtest the strategy of buying at the close, each day, equal dollar amounts of the N stocks that are most oversold, defined as being the most below (on a percentage basis) their P-day moving averages. (For example, I could own the 5 stocks out of 50 that are most below their 21-day MA, in which case M = 50, N = 5, and P = 21.)

I would like to be able to test two versions: one that assumes you can use today's closing price to determine the basket, and a more realistic version that uses yesterday's closing price.
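
To make the selection rule concrete, here is a minimal sketch in plain Python. The function names, the `prices` layout (ticker mapped to a list of daily closes), and the `use_yesterday` flag are all made up for illustration; ranking on a simple moving average is an assumption, since the post does not say which kind of average to use.

```python
from statistics import mean


def pct_below_ma(closes, t, p):
    """Fraction by which closes[t] sits below its p-day simple moving average.

    Positive values mean the close is below the average (oversold).
    """
    window = closes[t - p + 1:t + 1]
    return 1.0 - closes[t] / mean(window)


def pick_basket(prices, t, n, p, use_yesterday=False):
    """Return the n tickers most below their p-day MA for a buy at the close of day t.

    use_yesterday=True ranks on the closes of day t-1, so no information
    from day t is used to pick the basket.
    """
    ref = t - 1 if use_yesterday else t
    if ref - p + 1 < 0:
        return []  # not enough history to compute the average
    ranked = sorted(prices,
                    key=lambda tk: pct_below_ma(prices[tk], ref, p),
                    reverse=True)
    return ranked[:n]


# Toy data (hypothetical tickers): M = 3, N = 1, P = 3
prices = {
    'AAA': [10.0, 10.0, 10.0, 9.0],
    'BBB': [20.0, 20.0, 19.0, 20.0],
    'CCC': [30.0, 30.0, 30.0, 30.0],
}
same_day = pick_basket(prices, t=3, n=1, p=3)
lagged = pick_basket(prices, t=3, n=1, p=3, use_yesterday=True)
```

With this toy data the two versions pick different stocks on the last day, which is the difference I want to measure in the backtest.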

What software would you use for this?

nooby_mcnoob · Apr 13, 2017

Hey, I'm messing around with backtrader which could handle this pretty easily. Combine this with quandl's free closing data, you could probably get a long ways towards it.

Edit: And of course, the 800 pound gorilla: Quantopian could do this easily.

ajensen · Apr 13, 2017

nooby_mcnoob said:
Hey, I'm messing around with backtrader which could handle this pretty easily. Combine this with quandl's free closing data, you could probably get a long ways towards it.

Edit: And of course, the 800 pound gorilla: Quantopian could do this easily.

Thanks. Backtrader https://github.com/mementum/backtrader looks especially useful to me since it's in Python, which I use.

stevegee58 · Apr 13, 2017

I'll second Quantopian. Python based, good data. I've used it in the past for an ETF rotation strategy I was developing.

Zzzz1 · Apr 13, 2017

The finest granularity data are minute bars? And only US securities? Just curious. Thanks.

stevegee58 said:
I'll second Quantopian. Python based, good data. I've used it in the past for an ETF rotation strategy I was developing.

i960 · Apr 13, 2017

Zzzz1 said:
The finest granularity data are minute bars? And only US securities? Just curious. Thanks.

If you want sub-minute granularity on US and things outside the US, I think it's a given you're gonna either pay $$$ up or have to come up with something on your own.

Why would you need sub-minute anyway? It's not like it's possible to even compete within that space. If it's for precision reasons, I'd imagine it's very much following the rule of diminishing returns.

Zzzz1 · Apr 13, 2017

Thanks. My expertise lies in order book analytics and higher-frequency algorithm analysis and trading. I want to potentially expand into different asset classes and test existing strategies on data sets, but I am neither prepared nor willing to shell out close to 100,000 USD for data sets at the moment. Should the economics work out, I have no problem investing in colocation. My primary interest at the moment, though, lies in finding cloud-based solutions to rent rather than own the data sets I need for my backtest environment.

i960 said:
If you want sub-minute granularity on US and things outside the US, I think it's a given you're gonna either pay $$$ up or have to come up with something on your own.

Why would you need sub-minute anyway? It's not like it's possible to even compete within that space. If it's for precision reasons, I'd imagine it's very much following the rule of diminishing returns.

nooby_mcnoob · Apr 13, 2017

Zzzz1 said:
Thanks. My expertise lies in order book analytics and higher-frequency algorithm analysis and trading. I want to potentially expand into different asset classes and test existing strategies on data sets, but I am neither prepared nor willing to shell out close to 100,000 USD for data sets at the moment. Should the economics work out, I have no problem investing in colocation. My primary interest at the moment, though, lies in finding cloud-based solutions to rent rather than own the data sets I need for my backtest environment.

Affordable rental for exactly your purpose here: https://www.quantgo.com/

Only problem is you can't download the data, but you should be at least able to determine whether the data is worthwhile for you to purchase.

Zzzz1 · Apr 13, 2017

Thanks a lot. I spoke with the guy at QuantGo and it did not meet my needs in terms of available data and contract specifics, though pricing seemed OK. I don't intend to download data; I only want to upload my testing environment and run it against the data on their AWS instances. So the QuantGo setup seems very interesting at first in terms of technology implementation, but the data they advertise is either not available or not in the granularity advertised. That was a bit of a disappointment.

nooby_mcnoob said:
Affordable rental for exactly your purpose here: https://www.quantgo.com/

Only problem is you can't download the data, but you should be at least able to determine whether the data is worthwhile for you to purchase.

backtrader · Apr 14, 2017

This is the strategy logic in *backtrader* to do it. There aren't many requirements in the original post (for example there is no exit criterion), so the assumption is that there is an unlimited supply of cash, which can be added to the broker for the new acquisitions.

Cheat-on-Close: when a *Market* order is sent (default), *backtrader* tries to buy with the next incoming price, which for a daily bar situation is the next opening price. By activating *cheat-on-close*, one can buy with the current closing price.
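
As a toy illustration of the two fill rules (plain Python, not *backtrader*'s broker code; the function name and the `(open, close)` bar layout are made up for this sketch):

```python
def market_fill_price(bars, t, cheat_on_close=False):
    """Price at which a market order issued while processing bar t is filled.

    bars is a list of (open, close) tuples, one per day (hypothetical layout).
    Returns None if the fill would need a bar that is not in the data.
    """
    if cheat_on_close:
        return bars[t][1]        # filled at the close of the current bar
    if t + 1 < len(bars):
        return bars[t + 1][0]    # filled at the open of the next bar
    return None


bars = [(100.0, 101.0), (103.0, 102.0)]
fill_default = market_fill_price(bars, 0)                    # next open
fill_coc = market_fill_price(bars, 0, cheat_on_close=True)   # current close
```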

In all-in strategies the following can happen when cheat-on-close is false and the next day open price is used:

The size, which is calculated with the last known price (the close), may be too large for the broker's current cash reserves if there is an opening gap to the upside, and the order will then be rejected. In this case up to 5 orders can be sent each day, and if opening gaps take place it is the last of them that can be rejected this way.
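
The arithmetic behind that rejection can be sketched as follows. This is a simplified model with made-up names; it ignores commissions and everything else a real broker checks.

```python
def sized_order_fits(cash, budget, last_close, next_open):
    """Size an order from the last close, then price it at the next open.

    Returns (size, cost, accepted), where accepted is False when the
    cost at the opening price exceeds the available cash.
    """
    size = int(budget / last_close)   # shares affordable at the last close
    cost = size * next_open           # what they actually cost at the open
    return size, cost, cost <= cash


# 1000 in cash, close at 50, gap up to 51 at the open
size, cost, accepted = sized_order_fits(cash=1000.0, budget=1000.0,
                                        last_close=50.0, next_open=51.0)
```

Here the order is sized at 20 shares, which cost 1020 at the open, more than the 1000 in cash, so it would be rejected. Without the gap (open at 50) the same order fits.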

Of course the strategy below needs to be executed inside a cerebro instance, which loads the data feeds.

Code:

class BuyMNP(bt.Strategy):
    params = dict(
        P=21,    # Average Period
        N=5,     # Number To buy
        A=1000,  # Monetary units to buy
        movav=bt.ind.SMA,  # Moving Average to apply
        cheat_on_close=False,
    )

    def __init__(self):
        # Decide if buy on today's close or next incoming price (open next)
        self.broker.set_coc(self.p.cheat_on_close)

        # Put the smas in a temporary dict indexed by data
        smas = {d: self.p.movav(d, period=self.p.P) for d in self.datas}

        # dict with the most oversold (%) indexed by data
        self.mos = {d: 1.0 - d.close / smas[d] for d in self.datas}

    def next(self):
        # Get a (data, %) list sorted from most to least oversold today
        mos = sorted(self.mos.items(), key=lambda x: x[1], reverse=True)

        # Buy a max of N and only if below the moving average
        mosN = [d for d, perc in mos[:self.p.N] if perc > 0.0]

        # Unlimited cash supply ... add to the broker - to keep on buying
        self.broker.cash += self.p.A * len(mosN)

        # Execute the buy - no need to keep the order
        for d in mosN:
            self.buy(data=d, size=int(self.p.A / d.close))
            print(self.datetime.date(), d._name, d.close[0])  # some feedback

The complete code to make this run (there is a placeholder for the strategy), with a starting cash of 100,000 to avoid running into the opening gap problem:

Code:

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import argparse
import datetime

import backtrader as bt

TICKERS = ['IBM', 'MSFT', 'YHOO', 'ORCL', 'NVDA',
           'GOOG', 'AAPL', 'AMZN', 'INTC', 'MCD']

########################
# PUT YOUR STRATEGY HERE
########################

def runstrat(args=None):
    args = parse_args(args)

    cerebro = bt.Cerebro()

    # Data feed kwargs
    kwargs = dict()

    # Parse from/to-date
    dtfmt, tmfmt = '%Y-%m-%d', 'T%H:%M:%S'
    for a, d in ((getattr(args, x), x) for x in ['fromdate', 'todate']):
        if a:
            strpfmt = dtfmt + tmfmt * ('T' in a)
            kwargs[d] = datetime.datetime.strptime(a, strpfmt)

    # Data feed
    for dticker in args.datas:
        data = bt.feeds.YahooFinanceData(dataname=dticker, **kwargs)
        cerebro.adddata(data)

    # Broker
    cerebro.broker = bt.brokers.BackBroker(**eval('dict(' + args.broker + ')'))

    # Strategy
    cerebro.addstrategy(BuyMNP, **eval('dict(' + args.strat + ')'))

    # Execute
    cerebro.run(**eval('dict(' + args.cerebro + ')'))

    if args.plot:  # Plot if requested to
        cerebro.plot(**eval('dict(' + args.plot + ')'))


def parse_args(pargs=None):
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description=('Mean Reversion Basket'))

    # nargs='+' so that tickers given on the command line arrive as a list
    parser.add_argument('--datas', default=TICKERS, nargs='+',
                        required=False, help='Data tickers to read in')

    # Defaults for dates
    parser.add_argument('--fromdate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')

    parser.add_argument('--todate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')

    parser.add_argument('--cerebro', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--broker', required=False, default='cash=100000',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--strat', required=False, default='',
                        metavar='kwargs', help='kwargs in key=value format')

    parser.add_argument('--plot', required=False, default='',
                        nargs='?', const='{}',
                        metavar='kwargs', help='kwargs in key=value format')

    return parser.parse_args(pargs)


if __name__ == '__main__':
    runstrat()