Trading Catechism

Discussion in 'Trading' started by nitro, Oct 19, 2015.

  1. nitro

    I think all math does is shed light on what is often hidden by the sheer complexity of the [inter]relations and the (in|de)duction needed to find them. Then, coded, the computer does most of the heavy lifting.

    I agree though, I have never seen anything like human intuition. Let me give you an example. Someone recently asked me how I came up with my model. I tell you, all I did was intuit my way to it, and it came almost naturally to me. Granted, I have been doing this for a very long time, so it is not like I just drank some OJ and poof out the top of my head it came. I estimate that it would take a computer probably something like 500 trillion simulations to find it. And then, it might not understand how to trade it.

    What is strange is, I don't even know why I was drawn to this model. It is as if it found me instead of me finding it. I tell you it is very strange. No question in my mind, the brain is not running some serial algorithm called creativity. Intuition and creativity are somehow exploring truly colossal search spaces way beyond even the most powerful computers, all almost effortlessly. Still, I wonder how Watson would do.

    To me, human intuition is close to being voodoo and is the one thing that makes me a closet mystic.
     
    Last edited: Nov 29, 2015
    #161     Nov 29, 2015
    brisvegas likes this.
  2. nitro

  3. nitro

    All very reasonable tries. In the end, it has to stand up to testing on actual bid/ask quotes.

    The hardest part of getting started is to get a database. Start collecting tick data now on as many instruments as possible. It won't be effort wasted. Many of those time series will be useful at one time or another.
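
    A minimal sketch of a tick store, assuming SQLite from the Python standard library as a stand-in for whatever database is eventually chosen. The table layout and the helper names (open_store, record, load) are hypothetical, chosen only for illustration:

```python
import sqlite3

# Hypothetical schema: one row per bid/ask quote, timestamp in nanoseconds.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ticks (
    ts_ns  INTEGER NOT NULL,
    symbol TEXT    NOT NULL,
    bid    REAL    NOT NULL,
    ask    REAL    NOT NULL
)
"""

def open_store(path=":memory:"):
    """Open (or create) the tick database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_ticks_symbol_ts ON ticks (symbol, ts_ns)"
    )
    return conn

def record(conn, ticks):
    """Append a batch of (ts_ns, symbol, bid, ask) tuples in one transaction."""
    with conn:
        conn.executemany("INSERT INTO ticks VALUES (?, ?, ?, ?)", ticks)

def load(conn, symbol):
    """Return all stored quotes for one symbol, ordered by time."""
    cur = conn.execute(
        "SELECT ts_ns, bid, ask FROM ticks WHERE symbol = ? ORDER BY ts_ns",
        (symbol,),
    )
    return cur.fetchall()
```

    Passing a file path instead of ":memory:" to open_store keeps the data on disk between runs.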

    If there is one exercise that I would call step 0 in all of this, it is learning to align data correctly. Say you are collecting B/A data at millisecond resolution or better on GE and SPY (to start with; ideally you want all the data in existence being stored in real time). You are going to get many times more SPY ticks than GE ticks. Learn the different mechanisms for aligning data. They all have their pluses and minuses. Start with, say, resampling GE to have the same number of data points as SPY. Use fill with last, fill with next, linear interpolation, etc. Then plot the two series so that the x-axis is time and the y-axis is price. This is the "Hello World" of data science.
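
    A minimal sketch of that alignment exercise in pandas. The prices and timestamps are made up, and ms_index and align are hypothetical helper names, not part of any library:

```python
import pandas as pd

def ms_index(offsets_ms):
    """Build a DatetimeIndex from millisecond offsets after a made-up start time."""
    start = pd.Timestamp("2015-12-01 09:30:00")
    return pd.DatetimeIndex(
        [start + pd.Timedelta(milliseconds=ms) for ms in offsets_ms]
    )

# Synthetic mid prices: SPY ticks often, GE ticks rarely.
spy = pd.Series(
    [200.00, 200.01, 200.02, 200.01, 200.03, 200.04],
    index=ms_index([0, 100, 200, 300, 400, 500]),
    name="SPY",
)
ge = pd.Series([30.00, 30.03], index=ms_index([100, 400]), name="GE")

def align(sparse, dense_index, how):
    """Resample a sparse series onto the timestamps of a denser one."""
    if how == "last":    # carry the most recent known value forward
        return sparse.reindex(dense_index, method="ffill")
    if how == "next":    # use the next known value (looks into the future)
        return sparse.reindex(dense_index, method="bfill")
    if how == "linear":  # interpolate in time between neighbouring ticks
        merged = sparse.reindex(sparse.index.union(dense_index))
        return merged.interpolate(method="time").reindex(dense_index)
    raise ValueError(f"unknown alignment method: {how}")

aligned = pd.DataFrame({
    "SPY": spy,
    "GE_last": align(ge, spy.index, "last"),
    "GE_next": align(ge, spy.index, "next"),
    "GE_linear": align(ge, spy.index, "linear"),
})
print(aligned)
```

    One of the minuses shows up right away: fill with last leaves gaps before the first GE tick, while fill with next and linear interpolation both use values that were not yet known at that timestamp, which matters as soon as the aligned series feeds a backtest.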

    Use whatever language you want for research, but I recommend Python. With pandas, scipy, numpy and scikits, you are going to get very far.

    N.B., you are going to need a disk (persistence) structure that won't fail. Already this is getting expensive. However, in a pinch just use a small RAID array. Pay close attention to the strategies CERN uses to store data.
     
    Last edited: Dec 1, 2015
    #163     Dec 1, 2015
    Gambit likes this.
  4. nitro

    Someone asked me what I use. Here is a short (not exhaustive) list of the software I use daily:

    Real-time
    =======
    C++
    C#
    Python
    Clojure
    Scala
    F#
    Erlang

    Research
    =======
    Python
    Haskell
    R
    Matlab (If I had the compiler I would probably use some models in realtime)
    Mathematica

    Everything sits on top of Apache everything:
    ===========================
    Spark
    Mesos/Marathon
    Docker
    Hadoop
    Hive
    Storm
    Cassandra
    HDF5

    I hold everything together with:
    ==============
    Git
    GlusterFS

    OS
    ===
    Linux
    Windows
     
    Last edited: Dec 1, 2015
    #164     Dec 1, 2015
  5. noddyboy

    Sounds impressive, but why would one need two OSes or more than one programming language? I used to use C++ and now I use Python, but it baffles me why I would use two at the same time. Sometimes I might be forced to, e.g. by the Bloomberg or IB APIs, but Bloomberg now has Python too.
     
    #165     Dec 1, 2015
  6. Gambit

    Can we step back for a second and start with a few assumptions about a noob's resources and capacity:
    1) He/she will not be trading past 15-30 min time frame
    2) He will execute by paying the spread
    3) He will not have an ability to automate equities trading except by using a broker's pair algo, off the shelf equities spreader or by hand. That means no custom execution or legging in at good prices.

    And one last thing, what about starting with futures spreads, which have readily available data through the TT API?
     
    #166     Dec 1, 2015
  7. nitro

    • As to two OSes: F# and C# run best on Windows, where most of my GUI lives. I also use Mono on Linux; it is still a little clunky, but it is getting better every day. That explains C#, F# and Windows.
    • As to why use more than one language: some of the underlying Apache technologies, for example, are best used through the JVM. That explains Clojure and Scala.
    • Erlang is there mostly because there is some legacy stuff that I am too lazy to port.
    • C++ is there because one of my models tries to get to zero latency, which also explains Linux: I compile and handcraft all my kernels myself and use either low-latency or realtime kernels. Plus I often have to build the InfiniBand drivers myself, and 100% of them are in C++. To say nothing of the fact that my FIX engine is in C++. There is no other language in existence that can do this for me, save assembler.
    • Python because I try to write all of my models so that they don't know whether they are running in sim mode or in realtime, and I write and research 90% of my models in Python and Matlab. For the slow stuff, Python works fine in realtime.
    • There are several ways for these language objects to communicate. In the past I used ZeroMQ for messaging. Today I am thinking more about microservices in Docker (which in turn uses ZMQ underneath) that are totally language and OS agnostic. This works fine for anything that is not arbitrage or HFT.
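
    One way to read the "models don't know if they are in sim or realtime" point is to have the model consume quotes through a single callback and swap only the feed behind it. A minimal sketch under that assumption; Quote, SpreadModel, replay and run are hypothetical names, and the model itself is a toy:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List

@dataclass(frozen=True)
class Quote:
    symbol: str
    bid: float
    ask: float

class SpreadModel:
    """Toy model: remembers symbols whose bid/ask spread exceeds a threshold."""

    def __init__(self, max_spread: float) -> None:
        self.max_spread = max_spread
        self.signals: List[str] = []

    def on_quote(self, quote: Quote) -> None:
        # The model only ever sees quotes; it has no idea where they come from.
        if quote.ask - quote.bid > self.max_spread:
            self.signals.append(quote.symbol)

def replay(history: Iterable[Quote]) -> Iterator[Quote]:
    """Simulation feed: yields stored quotes in their recorded order."""
    yield from history

def run(model: SpreadModel, feed: Iterable[Quote]) -> None:
    """Drive the model from any feed, simulated or live."""
    for quote in feed:
        model.on_quote(quote)

history = [
    Quote("SPY", 200.00, 200.01),
    Quote("GE", 30.00, 30.05),
]
model = SpreadModel(max_spread=0.02)
run(model, replay(history))
print(model.signals)
```

    A live feed would be another iterable of Quote objects, for instance one that decodes messages arriving from a message queue, and the model code would not change.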
     
    Last edited: Dec 1, 2015
    #167     Dec 1, 2015
  8. nitro

    That is all probably true. I am just saying this is how I do it. Every time I try to use some canned program I get annoyed within ten minutes of using it that I can't do a or b. There is always friction. So I build things the right way for me from scratch. Over time, things get easier since I already have a code-base.

    Sure, TT is fine. I just claim people should be saving every conceivable piece of data they can get their hands on in real time. Why limit yourself to futures?
     
    #168     Dec 1, 2015
    Gambit likes this.
  9. Gambit

    Got it.
     
    #169     Dec 1, 2015
  10. noddyboy

    Ahh I see, that makes sense. You are trading at a higher frequency than I am.
     
    #170     Dec 1, 2015