Zen and the art of ATS design...

Discussion in 'Automated Trading' started by TraderMojo, Nov 29, 2006.

  1. As I cover each class and perhaps document with UML, responsibilities should become clearer.

    As per earlier post, signal generation could be encapsulated elsewhere outside of the class if needed.

    I haven't covered the money management/position sizing component yet or indeed other possible order filters so that is yet to come.

    I have found that breaking things up into too many components can lead to practical difficulties for strategy development. The main issue is sharing common data between components.

    It's interesting to note that the SmartQuant documentation I was referred to shows that the componentized model, with separate components for entry, exit, money, risk, exposure etc., has effectively been superseded by an architecture that uses only a single component for all of those functions. I can only guess at the reasons why. I think there is a middle ground though.

    I don't have any particularly strong emphasis on pure OO design. I'm just going about things in a way that seems logical to me and is conducive to a certain level of flexibility.

    Let's just say it was mis-named. SignalGenerator has been renamed to Trader.

    I'll be covering the PositionManager and money management at some point in the future.

    Yes, OrderManager simply keeps a list of orders and tracks their status, fill amounts etc. and executes orders submitted to it. It's a fairly dumb component with the only intelligence perhaps being to simulate complex order types that are not supported by the broker.

    I'm aware that different people attach different semantics and concepts to common names so there is a lot of room for confusion. I'm totally flexible on naming so any suggestions are welcome.
     
    #101     Dec 18, 2006
  2. OK Guys. For the three or four of you that are following along, just thought I'd let you know: I'm on a break for a few weeks as of tomorrow so it's not likely I'll be updating this thread till I'm back.

    I will be monitoring if anyone has some ideas to contribute.

    I'll leave with a brief overview of some of the main ATS components I haven't yet covered:

    1) Tick Compressor - this is responsible for generating the Bar events that trigger onBar(). It listens for Tick events and then broadcasts Bar events. It can be a Strategy-level component whereby the bar types required for the Strategy are defined and the Tick Compressor for the Strategy generates those Bars. Bars can be a variety of types and periodicity. So for example you can have 5 and 10 minute bars being generated for the same Instrument if you use signals on one timeframe and trade on another.
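
    A minimal sketch of the Tick Compressor idea for fixed-period time bars. The Tick, Bar and BarListener types and the method names are hypothetical stand-ins, not part of any framework described in this thread, and only time-based bars are covered.

```java
import java.util.ArrayList;
import java.util.List;

/** One trade tick: time in epoch milliseconds, price and size (hypothetical type). */
final class Tick {
    final long timeMillis;
    final double price;
    final long size;

    Tick(long timeMillis, double price, long size) {
        this.timeMillis = timeMillis;
        this.price = price;
        this.size = size;
    }
}

/** A completed OHLCV bar covering [startMillis, startMillis + period). */
final class Bar {
    final long startMillis;
    final double open, high, low, close;
    final long volume;

    Bar(long startMillis, double open, double high, double low, double close, long volume) {
        this.startMillis = startMillis;
        this.open = open;
        this.high = high;
        this.low = low;
        this.close = close;
        this.volume = volume;
    }
}

/** Receives completed bars; a Trader's onBar() would be wired in here. */
interface BarListener {
    void onBar(Bar bar);
}

/** Listens for ticks and broadcasts fixed-period time bars. */
final class TickCompressor {
    private final long periodMillis;
    private final List<BarListener> listeners = new ArrayList<>();
    private boolean barOpen;   // true once the bar in progress has at least one tick
    private long barStart;
    private double open, high, low, close;
    private long volume;

    TickCompressor(long periodMillis) {
        if (periodMillis <= 0) throw new IllegalArgumentException("period must be positive");
        this.periodMillis = periodMillis;
    }

    void addListener(BarListener listener) {
        listeners.add(listener);
    }

    void onTick(Tick tick) {
        // Start time of the period this tick belongs to.
        long bucket = Math.floorDiv(tick.timeMillis, periodMillis) * periodMillis;
        if (barOpen && bucket != barStart) {
            flush();           // first tick of a new period closes the previous bar
        }
        if (!barOpen) {
            barOpen = true;
            barStart = bucket;
            open = high = low = tick.price;
            volume = 0;
        }
        high = Math.max(high, tick.price);
        low = Math.min(low, tick.price);
        close = tick.price;
        volume += tick.size;
    }

    /** Emits the bar in progress, if any (for example at the end of a session). */
    void flush() {
        if (!barOpen) return;
        Bar bar = new Bar(barStart, open, high, low, close, volume);
        barOpen = false;
        for (BarListener listener : listeners) {
            listener.onBar(bar);
        }
    }
}
```

    In this sketch a bar only closes when a tick from a later period arrives or flush() is called, so a thinly traded Instrument would also need a timer. Getting 5 and 10 minute bars for the same Instrument would mean feeding the same ticks to two compressors with different periods.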

    2) Tick Filter(s) - these are placed in front of the Trader and monitor the stream of Tick events as they come in. They can alter the Tick events, remove Tick events or just look at the Tick events to generate other kinds of related information that can be made accessible to the Strategy. For example you could have a Tick Filter that constructed a virtual Order Book from level 2 quotes. Or, perhaps you could have a Tick Filter that removes stray ticks or maybe you are only interested in ticks of certain sizes and want to filter out the noise.

    The idea I have in mind is to implement Tick Filters using a modified Chain of Responsibility pattern so that you can have zero or more Tick Filters configured per Strategy.
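
    One possible shape for such a chain, as a sketch only. The "modification" assumed here is that every filter sees the tick in turn and may pass it on, replace it, or drop it, rather than a single handler claiming it. All type names are hypothetical, and the stray-tick rule is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Minimal tick for this sketch (hypothetical type). */
final class Tick {
    final double price;
    final long size;

    Tick(double price, long size) {
        this.price = price;
        this.size = size;
    }
}

interface TickFilter {
    /** Returns the tick to pass on (possibly a modified copy), or empty to drop it. */
    Optional<Tick> filter(Tick tick);
}

/** Runs a tick through zero or more filters, in the order they were added. */
final class TickFilterChain {
    private final List<TickFilter> filters = new ArrayList<>();

    TickFilterChain add(TickFilter filter) {
        filters.add(filter);
        return this;
    }

    Optional<Tick> apply(Tick tick) {
        Optional<Tick> current = Optional.of(tick);
        for (TickFilter filter : filters) {
            current = filter.filter(current.get());
            if (!current.isPresent()) {
                break;         // a filter dropped the tick; later filters never see it
            }
        }
        return current;
    }
}

/** Drops ticks smaller than a minimum size. */
final class MinSizeFilter implements TickFilter {
    private final long minSize;

    MinSizeFilter(long minSize) {
        this.minSize = minSize;
    }

    public Optional<Tick> filter(Tick tick) {
        return tick.size >= minSize ? Optional.of(tick) : Optional.empty();
    }
}

/** Naive stray-tick filter: drops ticks too far from the last accepted price. */
final class StrayTickFilter implements TickFilter {
    private final double maxRelativeJump;    // e.g. 0.05 for 5%
    private double lastPrice = Double.NaN;   // NaN until the first tick is accepted

    StrayTickFilter(double maxRelativeJump) {
        this.maxRelativeJump = maxRelativeJump;
    }

    public Optional<Tick> filter(Tick tick) {
        if (!Double.isNaN(lastPrice)
                && Math.abs(tick.price - lastPrice) > maxRelativeJump * lastPrice) {
            return Optional.empty();
        }
        lastPrice = tick.price;
        return Optional.of(tick);
    }
}
```

    An empty chain passes every tick through untouched, which covers the "zero or more" case. A filter that only observes, such as one building a virtual Order Book, would return the tick unchanged and expose what it collected through its own accessors. A real stray-tick filter would also need a way to recover after a genuine price gap, which this one lacks.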

    3) Order Filter(s) - these are placed behind the Trader and monitor Orders as they come out of Trader. Order Filters can either modify an Order e.g. set position size or they can block Orders from reaching the Order manager based on trading rules.

    Again, the idea is to use a modified (though different from above) Chain of Responsibility pattern to implement the Filter Chain and Filters. Again, this provides the flexibility to have zero or more Order Filters on a per strategy basis and have the ability to reuse/swap in/out Order filters across Strategies.
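
    The post does not say how the Order Filter chain would differ from the Tick Filter one. The sketch below assumes one possibility: each filter is handed the Order plus the rest of the chain, and decides for itself whether to call proceed(). All names are hypothetical, and the sink stands in for the Order Manager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Minimal order for this sketch (hypothetical type). */
final class Order {
    final String symbol;
    final boolean buy;
    int quantity;              // mutable so that a sizing filter can set it

    Order(String symbol, boolean buy, int quantity) {
        this.symbol = symbol;
        this.buy = buy;
        this.quantity = quantity;
    }
}

interface OrderFilter {
    /** Call chain.proceed(order) to pass the order on; return without calling it to block. */
    void filter(Order order, OrderFilterChain chain);
}

/** Immutable view of "the filters from this index on, then the sink". */
final class OrderFilterChain {
    private final List<OrderFilter> filters;
    private final Consumer<Order> sink;    // stands in for the Order Manager
    private final int index;

    OrderFilterChain(List<OrderFilter> filters, Consumer<Order> sink) {
        this(new ArrayList<>(filters), sink, 0);
    }

    private OrderFilterChain(List<OrderFilter> filters, Consumer<Order> sink, int index) {
        this.filters = filters;
        this.sink = sink;
        this.index = index;
    }

    void proceed(Order order) {
        if (index < filters.size()) {
            OrderFilterChain rest = new OrderFilterChain(filters, sink, index + 1);
            filters.get(index).filter(order, rest);
        } else {
            sink.accept(order);            // every filter let the order through
        }
    }
}

/** Position sizing as a filter: overwrites the quantity, then passes the order on. */
final class FixedSizeFilter implements OrderFilter {
    private final int quantity;

    FixedSizeFilter(int quantity) {
        this.quantity = quantity;
    }

    public void filter(Order order, OrderFilterChain chain) {
        order.quantity = quantity;
        chain.proceed(order);
    }
}

/** Trading rule as a filter: sell orders are blocked by never calling proceed(). */
final class LongOnlyFilter implements OrderFilter {
    public void filter(Order order, OrderFilterChain chain) {
        if (order.buy) {
            chain.proceed(order);
        }
    }
}
```

    Because each Strategy builds its own list of filters, the same filter classes can be reused or swapped between Strategies, and an empty list sends every Order straight to the sink.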

    4) Market Data Recorder - self-explanatory. The ability to accurately record ticks in chronological order across Instruments so that when playing back multi-instrument tick streams, ticks arrive in order for realistic backtesting. Looking again at possible tick database implementations or tick flat-file compression possibilities.
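
    For the playback side, one way to get ticks from several Instruments back into chronological order is a k-way merge over the per-Instrument recordings. The sketch below assumes each recording is already sorted by timestamp. RecordedTick and TickReplay are hypothetical names, and ties are broken by stream order, which is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** A recorded tick (hypothetical type). */
final class RecordedTick {
    final String symbol;
    final long timeMillis;
    final double price;

    RecordedTick(String symbol, long timeMillis, double price) {
        this.symbol = symbol;
        this.timeMillis = timeMillis;
        this.price = price;
    }
}

final class TickReplay {
    /** Cursor over one instrument's recording, holding its next unread tick. */
    private static final class Cursor {
        final Iterator<RecordedTick> source;
        final int order;       // index of the stream, used only to break timestamp ties
        RecordedTick head;

        Cursor(Iterator<RecordedTick> source, int order) {
            this.source = source;
            this.order = order;
            this.head = source.next();
        }
    }

    /** Merges per-instrument recordings, each sorted by time, into one time-ordered list. */
    static List<RecordedTick> merge(List<List<RecordedTick>> streams) {
        PriorityQueue<Cursor> queue = new PriorityQueue<>(
                Comparator.<Cursor>comparingLong(c -> c.head.timeMillis)
                        .thenComparingInt(c -> c.order));
        for (int i = 0; i < streams.size(); i++) {
            Iterator<RecordedTick> it = streams.get(i).iterator();
            if (it.hasNext()) {
                queue.add(new Cursor(it, i));
            }
        }
        List<RecordedTick> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor cursor = queue.poll();     // stream whose next tick is the earliest
            merged.add(cursor.head);
            if (cursor.source.hasNext()) {
                cursor.head = cursor.source.next();
                queue.add(cursor);            // re-queue under its new head timestamp
            }
        }
        return merged;
    }
}
```

    A real player would stream from disk rather than build a list in memory, but the ordering logic would stay the same. Recording every Instrument into a single file in arrival order would avoid the merge altogether.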

    5) Strategy-level administrative listeners. Arbitrary listeners that work at the strategy level but are not related to decision making logic. These listeners are capable of receiving the same events as Trader. Example uses for these listeners include: Send an e-mail/IM/SMS when the Strategy makes a trade. Not suitable for high-frequency trading systems obviously!

    6) Backtesting and Optimization in more detail.

    A lot to think about for anyone interested. As usual, thoughts and comments welcome.
     
    #102     Dec 20, 2006
  3. Java is a wonderful language, but I have a question about real-time efficiency. Some instruments generate 80,000 to 100,000 quotes in a 6.5 hour trading window, 30,000 to 70,000 trades, and maybe upwards of 800,000 level II changes. If the system monitors more than one or two of these types of symbols, will Java be able to keep up? Especially when you include various and sundry real-time indicator calculations? Perhaps a real, compiled language might be more appropriate? (yes, I know, I'm going to get nailed on that 'real, compiled' remark).

    I know IB's Java trade API can throw fits if you try to do too much with it, and it is only passing info, and not doing anything with it.
     
    #103     Dec 20, 2006
  4. I think it was done that way due to feedback from users. Strategy development is a 'holistic' thing, and is hard to do from a module perspective. When you have indexes influencing the direction of trading, when you have trade momentum influencing money management and risk (or vice versa), and when you use multiple instruments at once, using a modularized system may have been unwieldy.

    The flow of your design thoughts seems good so far. The infrastructure you are designing needs to support the 'inputs' and 'outputs' required of the strategy, which effectively becomes a black box, with said black box 'subscribing' to the various services offered by the development environment.

    There were a number of words regarding 'Indicator' and 'TradingSystem'. They may in fact take second row seats to some of the other possibly more important support services needed from the infrastructure. The IProvider is definitely one. Some sort of instrument management is necessary. Some sort of order management is necessary. If you're attempting to provide back testing facilities, then the whole world of ordered trade/quote/depth submission and execution simulation will be required.

    Have you laid out a basic development timeline (relative of course) in order to prioritize development in order to get something basic in place upon which all other stuff can be bolted and enhanced?

    And somewhere in your design notes, I think you were anticipating running multiple strategies at the same time. Which I think is a good thing, because, for example, I may want to massage index data differently from what I'll be doing with regular equity/etc type data, but yet the index data will be used to influence the strategy used for the equity/etc type of data.

    I also recall mention of design for storage of historical back-testing data. You may want to get that nailed down up front rather than later on. Storing eod/trades/quotes for a number of symbols over a period of time can quickly consume mega (or even giga) drive space. So some form of, dare I say it, custom compressed/indexed format may be of value.

    I once tried to start a Perl based infrastructure, but after a few months of design and research, I realized that there was ever so much to do to make it work. I decided I'd rather focus on strategy development rather than infrastructure development. So I can only applaud your efforts in this regard.

    I think you've made good progress and good design decisions. I look forward to seeing what can be implemented.
     
    #104     Dec 20, 2006
  5. IMHO Java performance is adequate. Modern JVMs really are amazing in the performance they achieve - beating C++ sometimes.

    I've written a real time scanner in Java that evaluates expressions like

    RSI > 65 AND SMA20 > SMA50 OR .........

    etc, on a tick by tick basis.
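
    The scanner's code isn't shown in this thread. As a rough illustration of why this kind of condition is cheap to re-evaluate on every tick, here is a sketch of a rolling SMA with constant work per update and a check of the "SMA20 > SMA50" style term. RollingSma and SmaScan are hypothetical names, and RSI is left out to keep it short.

```java
/** Simple moving average over the last `period` prices, constant work per update. */
final class RollingSma {
    private final double[] window;
    private int count;         // number of prices held, capped at window.length
    private int next;          // slot that the next price overwrites
    private double sum;        // running sum; may drift slightly over very long runs

    RollingSma(int period) {
        if (period <= 0) throw new IllegalArgumentException("period must be positive");
        window = new double[period];
    }

    void add(double price) {
        if (count == window.length) {
            sum -= window[next];   // the oldest price leaves the window
        } else {
            count++;
        }
        window[next] = price;
        sum += price;
        next = (next + 1) % window.length;
    }

    boolean isReady() {
        return count == window.length;
    }

    double value() {
        return sum / count;        // NaN before the first price arrives
    }
}

/** Evaluates "fast SMA > slow SMA" after every tick. */
final class SmaScan {
    private final RollingSma fast;
    private final RollingSma slow;

    SmaScan(int fastPeriod, int slowPeriod) {
        fast = new RollingSma(fastPeriod);
        slow = new RollingSma(slowPeriod);
    }

    /** Feeds one price and returns true if the condition holds on this tick. */
    boolean onTick(double price) {
        fast.add(price);
        slow.add(price);
        // No alert until the slower average has a full window of data.
        return slow.isReady() && fast.value() > slow.value();
    }
}
```

    With new SmaScan(20, 50), each tick costs a handful of additions per symbol regardless of the window length. Further terms would be combined with && and || in the same way.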

    I've done a little performance testing with IB data on 50 reasonably high volume stocks. On a Linux box - Athlon 2800 (Socket A !), using SUN 1.5 JVM, you can hardly see the CPU utilization using top. Even replaying the data at 100x market speed causes the CPU to barely raise a sweat. The only performance issue is displaying the alerts in a Swing JTable.

    Even computations that are a bit more demanding than calculating SMAs are not an issue eg

    price crossed above the developing market profile upper value area

    is not an issue.

    To be honest, I was more than a little surprised at how truly excellent the performance is. My initial thoughts were to develop the code and then get a dual core box and run it in a 64 bit JVM, but it seems the old banger I'm using is quite sufficient.

    There seem to be a number of reports floating around that 1.6 is 20%+ faster than 1.5.
     
    #105     Dec 21, 2006


  6. To put a figure on storage requirements, 3 months of all book changes delivered by IB for the DAX takes about 450 Mbyte in a MySQL table. The format is pretty much the same as that delivered over the wire from IB with the addition of a timestamp. Recording all IB 'ticks' for DAX for 3 months takes about 55 Mbyte.

    I agree that MySQL (or similar) is not the best tool for the job and some custom implementation is possibly the way to go - for performance reasons as much as for storage space which, after all, is pretty cheap these days.
     
    #106     Dec 21, 2006
  7. OK, I'm back.

    It's going to be difficult to get re-motivated on this project after having been away from technology and the markets for a few weeks.

    Let's see how it goes. I'll probably kick-start things again properly tomorrow or Friday so, if anyone has come up with some questions, queries, suggestions or ideas in the last few weeks, now is the time to post!
     
    #107     Jan 10, 2007
  8. Sorry for the delay in response. I must have just missed your post before I left ET. I should probably subscribe to my own thread.

    I have no reservations just yet about Java's ability to cope with high volume throughput. My volume needs may be more modest than others though. Luckily, anyone following along can implement their ATS framework in their desired language.

    Java is compiled...just-in-time :D Actually, if required, it is possible to do a full ahead-of-time compilation to native code using one of the commercial or GNU tools out there. I have never done this myself and don't plan to.

    Perhaps my remarks on Java's suitability for real-time were misinterpreted. The real-time issue I was referring to is this one:

    http://en.wikipedia.org/wiki/Real-time

    This is one of the primary reasons why I'm going to great lengths to make the data feed and broker aspects swappable via different adapters. I'm wary of too tight a coupling to IB's API or indeed any other broker. The cost of this flexibility is of course a small performance hit.
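
    A sketch of what the adapter layer might look like on the data feed side. MarketDataAdapter is a hypothetical framework interface, and FakeVendorFeed is an entirely made-up vendor API with id-based callbacks. It does not model IB's API or any other real one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Vendor-neutral tick: the only market data type the rest of the framework sees. */
final class Tick {
    final String symbol;
    final double price;

    Tick(String symbol, double price) {
        this.symbol = symbol;
        this.price = price;
    }
}

/** What the framework expects from any data feed; each vendor gets its own adapter. */
interface MarketDataAdapter {
    void subscribe(String symbol, Consumer<Tick> handler);
}

/** Made-up vendor API that identifies instruments by numeric ticker id. */
final class FakeVendorFeed {
    interface Callback {
        void priceUpdate(int tickerId, double price);
    }

    private final List<Callback> callbacks = new ArrayList<>();

    void register(Callback callback) {
        callbacks.add(callback);
    }

    /** Simulates the vendor pushing a price update. */
    void push(int tickerId, double price) {
        for (Callback callback : callbacks) {
            callback.priceUpdate(tickerId, price);
        }
    }
}

/** Translates the vendor's id-based callbacks into vendor-neutral Ticks. */
final class FakeVendorAdapter implements MarketDataAdapter {
    private final Map<Integer, String> symbolsById = new HashMap<>();
    private final Map<String, List<Consumer<Tick>>> handlers = new HashMap<>();
    private int nextId = 1;    // ids are handed out in subscription order

    FakeVendorAdapter(FakeVendorFeed feed) {
        feed.register((tickerId, price) -> {
            String symbol = symbolsById.get(tickerId);
            if (symbol == null) {
                return;        // update for an instrument nobody subscribed to
            }
            for (Consumer<Tick> handler : handlers.get(symbol)) {
                handler.accept(new Tick(symbol, price));
            }
        });
    }

    public void subscribe(String symbol, Consumer<Tick> handler) {
        if (!handlers.containsKey(symbol)) {
            handlers.put(symbol, new ArrayList<>());
            symbolsById.put(nextId++, symbol);
            // A real adapter would also ask the vendor to start streaming here.
        }
        handlers.get(symbol).add(handler);
    }
}
```

    Strategy code only ever depends on MarketDataAdapter and Tick, so swapping vendors means writing another adapter. The performance cost is the extra map lookup and object allocation per update.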

    Furthermore, although I've noticed it's prevalent here on ET, it's best not to equate IB's TWS performance with that of Java's performance and capabilities in general.
     
    #108     Jan 10, 2007
  9. Interesting. At least it lends some further credibility to the approach I have outlined which is good to know.

    Essentially, the infrastructure as described is getting to a point where it dictates the dependency paths for development itself.

    Off the top of my head the list could be something like this:

    1) Instrument Registry/Manager/Server
    2) Provider Subsystem
    3) Tick Compression
    4) Data Storage/Retrieval
    5) Order Manager
    6) Strategy Container etc.

    I'm struggling to remember everything I've covered in this thread so I'll need to re-read things and will probably at some point put together a diagram detailing how the layers are built upon and depend upon each other.

    At the top of the pyramid is the Strategy Container and hence probably the last thing to be built.

    Agree. It's something I'm hoping I can extract information from other members about. Having a pluggable storage mechanism should be feasible so that a basic storage scheme can be used initially and then replaced with a more performant solution if required.

    Certainly, a custom linear memory mapped file approach is a possibility.
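
    To give a feel for the memory mapped option, here is a bare-bones sketch using java.nio. The fixed 20-byte record layout (timestamp, price, size) and the MappedTickFile name are assumptions made for illustration. There is no header, so the record count is not persisted, and nothing is compressed or indexed.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Fixed-width tick records stored back to back in a memory mapped file. */
final class MappedTickFile {
    /** 8-byte timestamp + 8-byte price + 4-byte size. */
    static final int RECORD_BYTES = Long.BYTES + Double.BYTES + Integer.BYTES;

    private final MappedByteBuffer buffer;
    private int count;         // records appended in this session only

    private MappedTickFile(MappedByteBuffer buffer) {
        this.buffer = buffer;
    }

    /** Maps (and if necessary grows) the file to hold `capacity` records. */
    static MappedTickFile open(Path file, int capacity) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            long bytes = (long) capacity * RECORD_BYTES;
            // The mapping stays valid after the channel is closed.
            return new MappedTickFile(channel.map(FileChannel.MapMode.READ_WRITE, 0, bytes));
        }
    }

    /** Appends one record; throws IndexOutOfBoundsException once capacity is exceeded. */
    void append(long timeMillis, double price, int size) {
        int offset = count * RECORD_BYTES;
        buffer.putLong(offset, timeMillis);
        buffer.putDouble(offset + Long.BYTES, price);
        buffer.putInt(offset + Long.BYTES + Double.BYTES, size);
        count++;
    }

    long timeAt(int index) {
        return buffer.getLong(index * RECORD_BYTES);
    }

    double priceAt(int index) {
        return buffer.getDouble(index * RECORD_BYTES + Long.BYTES);
    }

    int sizeAt(int index) {
        return buffer.getInt(index * RECORD_BYTES + Long.BYTES + Double.BYTES);
    }

    int count() {
        return count;
    }

    /** Asks the OS to write any changed pages back to the file. */
    void force() {
        buffer.force();
    }
}
```

    Random access to record i is plain offset arithmetic, which is the main attraction over a database table. A single mapping is limited to 2 GB in Java, so a large recording would have to be split across several mappings or files.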

    However, HDF:
    http://www.hdfgroup.org/whatishdf5.html

    Seems to be a good choice given that it supports linear contiguous and compressed data storage etc. It's already optimized for IO efficiency.

    I would like to hear how others like ktmexc20 have implemented their solutions using HDF.

    I need to research this area more.

    Thanks for the input and remarks.

    I still put my chances of success with this project at about 5%. I will continue to plod along at a very slow pace.
     
    #109     Jan 10, 2007
  10. Hi TM,

    What in particular would you like to know?
     
    #110     Jan 10, 2007