Zen and the art of ATS design...

Discussion in 'Automated Trading' started by TraderMojo, Nov 29, 2006.

  1. Hopefully I've sucked you in with the pretentious thread title.

    What I'd like to do here is discuss ATS system architecture best practices. To be clear, I'm talking about custom software built from scratch rather than systems coded up in Tradestation or Amibroker etc.

    In other words, I'm talking about the ATS framework.

    I've come to the conclusion that an open source solution will most likely not arrive on the scene any time soon (actually there are some but none that I deem suitable so far). This is not surprising given the nature of the industry we are in. It's very much "every man for himself".

    Instead, I wonder if we can compromise and adopt a "Collaborate on ideas, compete on implementation" approach to ATS development.

    This thread is here to test that idea.

    Those of you with genuine experience of ATS development are encouraged to share your ideas, concepts, lessons learned (learnt?), architecture overview etc. Naturally, anyone, regardless of experience, is welcome to contribute to the thread.

    If you have said experience and prefer not to share your ideas due to the large amount of effort expended getting to where you are, I fully respect your position. However, please try and refrain from coming on to this thread and naysaying or poo-poo'ing ideas and discussions that I hope will take place herein.

    I am quite familiar with the frustration that can come about when people start discussing things you have already done several years ago!

    If you have criticisms to make, please make alternative suggestions as clearly as possible. Cryptic one-liners just serve to annoy people and don't really take things forward.

    I apologize for the lecture; I'm just trying to preempt the thread deterioration that I've witnessed on many forums.

    Lastly, in the spirit of collaborating on ideas, we'll try and avoid getting bogged down by debates on languages i.e. no Java-bashing, Python cheerleading, C++ religious extremism etc.

    Many high-level architectural discussions can be done language agnostic. However, having some Java experience myself I will most likely be discussing various Java data structures etc. in the future if any other Java developers care to participate.

    Let's keep it on topic!
     
  2. fatrat

    I approve of the spirit of this thread, as I am involved in such an endeavor myself. As for my technology, I use C++ (ATL/COM), SQL Server 2005, and Genesis as my broker.

    Right now, my primary problem with my intraday trading system is that while I'm recording NASDAQ Level1 and Level2 quotes, I have not been able to determine an edge. When I traded by hand on the NYSE, I had some degree of insight on the patterns that arose from the way the specialist managed the books. With NASDAQ, I'm currently puzzled at what sorts of variables I should be using as inputs to any sort of trading system.

    Let's take NTAP, for example. I've decided to use this stock as the basis for my intraday trading system because of its moderate volume, reasonable amount of liquidity, and average strength of its intraday moves.

    Look at the Level2 information I collected from just yesterday's trades. The following is a list of MPIDs from NASDAQ and the number of book-modifications that came with a particular MPID in question. These are the numbers I collected, and they're probably slightly off because I started data collection before 9:30AM. What I'm trying to determine is whether this Level2 information has any value whatsoever [and would love to hear others' views].

    For the most part, most of the transactions and market making going on for NTAP happens over ISB, NSDQ, and ARB. It's common to see, for example.

    SELECT MMID, COUNT(*) FROM L2TABLE WHERE TICKER='NTAP' GROUP BY MMID ORDER BY COUNT(*) DESC

    Results:

    ISB 154662
    NSDQt 144141
    ARB 123350
    EDGA 3924
    LEHMt 1029
    SBSHt 752
    UBSSt 668
    TMBRt 457
    BTRDt 411
    NITEt 381
    EDGX 265
    PERTt 178
    AUTOt 163
    FBCOt 138
    MAXMt 134
    CDRGt 115
    NFSCt 104
    RBCMt 96
    BOFAt 70
    MADFt 52
    TRAC 46
    COWNt 44
    BARDt 40
    CIBCt 37
    ETRDt 33
    MSCOt 31
    GSCOt 30
    PRUSt 30
    AGEDt 28
    WCHVt 28
    JEFFt 28
    STFLt 25
    MOKEt 22
    SUSQt 20
    CRTCt 18
    HDSNt 18
    CEUTt 16
    RAJAt 16
    JPMSt 16
    FBRCt 14
    MLCOt 14
    HSBCt 14
    EFGIt 14
    EKNSt 14
    CANTt 14
    RHCOt 14
    LYONt 14
    STCSt 14
    TWPTt 14
    WBLRt 14
    GARCt 14
    DBABt 14
    FACTt 14
    LAZAt 14
    MWREt 14
    HILLt 14
    BESTt 14
    PIPRt 14
    GNLNt 14
    BERNt 14
    KINGt 14
    THNKt 14
    CHDNt 14
    BNCHt 14
    ADAMt 14
    GROWt 14
    PACSt 14
    NEEDt 14
    WEEDt 14
    FRANt 14
    BMOCt 14
    VNDMt 14
    DOMSt 14
    KBROt 7
    NACIt 7
    FAGIt 7
    ALLNt 7

    If you look at the size on the bid, the average size posted on a Level 1 quote is 8 and the standard deviation is roughly 8. If you assume sizes are normally distributed on the bid, then a size of 32 or more sits nearly three standard deviations above the mean and should only show up on the inside well under 1% of the time. Chart analysis of moves after the display of size shows nothing special.

    Here are the actual numbers and SQL query for 2 days of Level 1:

    SELECT STDEV( BSIZE ), AVG( BSIZE ), MIN( BSIZE ), MAX( BSIZE ) FROM L1Table WHERE TICKER='NTAP'

    8.51408077193801 8 1 108
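
    To put a number on the normal-distribution assumption, here is a rough Java sketch that computes how often a bid size of 32 or more would appear, given the AVG and STDEV returned by the query above. The class and method names are made up for illustration, and the error function is an approximation (Abramowitz and Stegun formula 7.1.26) because the Java standard library does not ship one.

```java
// Sketch only: NormalTail and its methods are hypothetical names for this post.
class NormalTail {

    /** erf(x) via Abramowitz & Stegun 7.1.26 (absolute error around 1.5e-7). */
    static double erf(double x) {
        double sign = x < 0 ? -1.0 : 1.0;
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * ax);
        // Horner evaluation of the five-term polynomial in t.
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        return sign * (1.0 - poly * Math.exp(-ax * ax));
    }

    /** P(X >= x) for X ~ Normal(mean, sd). */
    static double upperTail(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return 0.5 * (1.0 - erf(z / Math.sqrt(2.0)));
    }

    public static void main(String[] args) {
        double mean = 8.0;              // AVG(BSIZE) from the query above
        double sd = 8.51408077193801;   // STDEV(BSIZE) from the query above
        System.out.printf("z-score of size 32: %.2f%n", (32 - mean) / sd);
        System.out.printf("P(size >= 32):      %.4f%n", upperTail(32, mean, sd));
    }
}
```

    With those inputs the tail probability comes out at roughly 0.0024, i.e. about a quarter of one percent. Bid sizes cannot go below 1 and are probably skewed, so the normal assumption itself is only a crude yardstick; counting how often BSIZE >= 32 actually occurs in L1Table would be the more direct check.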

    I suppose my question to the trading system developing public is whether they believe information that's useful is present here. I'm trying to construct a sort of understanding of how shares move on NASDAQ.
     
  3. Actually fatrat, the intention of the thread was to discuss software architecture for ATS frameworks rather than strategy implementations or the trading logic of an ATS.

    I can't help you with the specific question you posted. Perhaps others can!

    On the topic of software architecture, I believe you have some experience in that domain so your contributions on that matter are most welcome.
     
  4. fatrat

    The reason I posted that information is simply this: How much information does an ATS need from the usual market sources to be successful? We know that the AMIBroker/Tradestation users have access to all the features of their respective languages, but which of those features are they actually using? What are the most used? Where do they go with that?

    Where are we looking for information? News sources? The degree of customizability for a raw system developer is so high, this thread can go in any direction.

    Consider that, in the past, hand-based NYSE traders didn't even need a chart, only the tape. They didn't need OpenBook. So the question in my mind is whether an ATS developer has to expend time and energy collecting specific forms of information.

    So, really, are we going to duplicate what AMIBroker and Tradestation have, or are we looking for a different edge in terms of information processing? Really, why are we doing this? So, I think questioning the validity and usefulness of our data sources is very relevant to the design of an ATS.
     
  5. The point with respect to information/data sources in your first post was lost on me. Now I see what you are getting at.

    Why are we doing this? That's a good question to start this thread off.

    Developing custom software gives the developer/owner of the software levels of control and flexibility that are simply not possible with third-party commercial software. The closest I'm aware of is QuantDeveloper, which provided modules and APIs supplying the plumbing and infrastructure necessary to put together an ATS.

    Further features that are not found in some/all of the more popular commercial offerings that I would like to explore include:

    1) Pluggable data feeds (acknowledging performance hit taken by having the flexibility)
    2) Pluggable broker execution and support for FIX.
    3) Truly scaleable and robust. Support for clustering and failover.
    4) Following on from that, server based with thin (web) and/or thick client access to the server engine.
    5) Support for backtesting and optimization over large datasets by leveraging grid computing.
    6) Support for genetic algorithm optimization.
    7) Support for pluggable neural nets.
    8) Support for retrieving and incorporating non-standard market data e.g. for semantic news analysis etc.
    9) Tick-based algorithms.
    10) Support for options trading and pluggable option pricers.
    11) Scriptable functionality where appropriate.
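
    To make items 1) and 2) a little more concrete, here is a minimal Java sketch of what "pluggable" could mean: the strategy only ever sees a feed interface and a broker interface, so a vendor feed adapter or a FIX execution adapter can be swapped in behind them. Every type name below is hypothetical (none of it comes from an existing library), and the prices in the demo are invented.

```java
import java.util.ArrayList;
import java.util.List;

// All type names here are hypothetical, made up for this sketch.

/** A single trade or quote update, reduced to the bare minimum. */
final class Tick {
    final String symbol;
    final double price;
    final int size;

    Tick(String symbol, double price, int size) {
        this.symbol = symbol;
        this.price = price;
        this.size = size;
    }
}

/** Anything that wants to receive ticks (a strategy, a recorder, ...). */
interface TickListener {
    void onTick(Tick tick);
}

/** Pluggable data feed: a vendor adapter would implement this. */
interface DataFeed {
    void subscribe(String symbol, TickListener listener);
}

/** Pluggable execution: a FIX or broker API adapter would implement this. */
interface Broker {
    void submitMarketOrder(String symbol, int signedQuantity);
}

/** In-memory feed for tests and backtests; ticks are pushed in by hand. */
final class ReplayFeed implements DataFeed {
    private final List<String> symbols = new ArrayList<>();
    private final List<TickListener> listeners = new ArrayList<>();

    public void subscribe(String symbol, TickListener listener) {
        symbols.add(symbol);
        listeners.add(listener);
    }

    /** Deliver a tick to every listener subscribed to its symbol. */
    void publish(Tick tick) {
        for (int i = 0; i < symbols.size(); i++) {
            if (symbols.get(i).equals(tick.symbol)) {
                listeners.get(i).onTick(tick);
            }
        }
    }
}

/** Broker stub that only records what it was asked to do. */
final class PaperBroker implements Broker {
    final List<String> orders = new ArrayList<>();

    public void submitMarketOrder(String symbol, int signedQuantity) {
        orders.add(symbol + ":" + signedQuantity);
    }
}

/** Toy strategy: buys 100 shares the first time price is at or below a limit. */
final class BuyBelowStrategy implements TickListener {
    private final Broker broker;
    private final double limit;
    private boolean done;

    BuyBelowStrategy(Broker broker, double limit) {
        this.broker = broker;
        this.limit = limit;
    }

    public void onTick(Tick tick) {
        if (!done && tick.price <= limit) {
            broker.submitMarketOrder(tick.symbol, 100);
            done = true;
        }
    }
}

class PluggableDemo {
    public static void main(String[] args) {
        ReplayFeed feed = new ReplayFeed();
        PaperBroker broker = new PaperBroker();
        feed.subscribe("NTAP", new BuyBelowStrategy(broker, 39.00));
        feed.publish(new Tick("NTAP", 39.50, 8));   // above the limit, ignored
        feed.publish(new Tick("NTAP", 38.95, 12));  // triggers the single buy
        feed.publish(new Tick("NTAP", 38.90, 5));   // strategy is already done
        System.out.println(broker.orders);          // prints [NTAP:100]
    }
}
```

    The same strategy class would run unchanged against a live feed and a real broker adapter, which is also where the performance cost mentioned in item 1) comes from: every tick crosses an interface boundary instead of being handled inline.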

    Granted, some/most of these features may seem highly ambitious but if the software is planned initially with these features in mind then I believe it will have greater potential for extensibility later on.

    Furthermore, at least in the Java world, many of these features have become more feasible than ever over the last few years due to the commoditization of the relevant technologies.

    e.g.

    Scaleable, High availability messaging Implementations:
    Too many to list but e.g.
    http://labs.jboss.com/portal/jbossmessaging/

    Grid Computing (C + Java):
    http://www.globus.org/

    Genetic Algorithms:
    http://jgap.sourceforge.net/

    Neural Networks:
    http://www.jooneworld.com/

    Options pricing etc.
    www.quantlib.org

    Scripting engines:
    Too many to list but my favorite:
    http://groovy.codehaus.org/


    Hopefully, this has given some insight into the scope of the project I have in mind. Yes, it's much larger and more sophisticated than some commercial applications. Yes, it's not going to get implemented any time soon. Indeed, it may be overkill for well over 95% of people. Those people can use the Amibrokers, Tradestations, Wealthlabs and Rightedges of the world...

    I give my personal development efforts on this project about a 5% chance of success. However, I'm more than happy to provide any source code I develop freely in an open source fashion as this thread develops.

    I'm curious to examine the features and capabilities of Frosty's Frostengine that has been promised for release at some point. At this juncture I have dismissed JSystemTrader as 1) I am not convinced by the core architecture, 2) it is growing organically and 3) it has a much narrower focus and end goal.

    However, I would more than welcome the respective developers of those systems to participate here and perhaps they may gain a couple of ideas for incorporation into their tools.
     
  6. Fatrat - I see you started your own thread on ATS framework development! Congratulations and good luck with that. I look forward to reading it and seeing the system unfold.

    This is great. The more discussions that take place on this topic, the more ideas are bound to come out.
     
  7. Great thread Rufus. Thanks for that. Pity it didn't stay on track for too long.

    You guys seem to be quite hardcore low level programmers. Like yourself, I hadn't written any code till recently for many many years. Even then, I stay quite high level.

    The question of timeframe e.g. high frequency or low frequency and how that affects the implementation of an ATS framework is interesting and something to think about.

    If this thread fails, the most likely reasons were summed up nicely by Stephen Crowley:

    "What is the incentive for the ATS developer to share the countless hours of work he has put into it?"

    "Also, why help the competition? We really are fighting over the same trades a lot of time."

    On the plus side, that thread reminded me of some other great open source Java components that would be useful in an ATS:

    Quartz for scheduling (all OpenSymphony components are great):
    http://www.opensymphony.com/quartz/

    Jive Software's SMACK API for Jabber/IM integration
    http://www.jivesoftware.org/smack/

    Commons Logging:
    http://jakarta.apache.org/commons/logging/

    Jakarta Commons CLI for command line processing:
    http://jakarta.apache.org/commons/cli/

    Plus a bunch of others....
     
  8. If anyone else has opinions on why we are doing this, either for or against, please feel free to share your thoughts here.
     
  9. I just happen to use something that I am familiar with. I was a part of the ANSI C++ V3 committee, so there are a lot of features of C++ that I thought were poorly thought out. When Gosling first released Java V1.0, there were some language features that I didn't like either.

    I dealt with high performance systems my entire life, so I tend to care a great deal about how much control I have over the internals, that's all.
     
    #10     Nov 30, 2006