GPU accelerated high-frequency trading

Discussion in 'Automated Trading' started by alpha_tokyo, Oct 12, 2009.

  1. Thanks for the insight! Interesting article you posted too, fairly tough to understand though, hah. Brings me back to my computer architecture class days. Is that (math or physics) your background? Is there a good place to start for someone without a lot of experience (books or anything)?

    I've spent some time with C++ in the past; it'd be kind of neat to toy around with it, since it looks like that's where we're heading.
     
    #21     Oct 13, 2009
  2. Nvidia's new GT300 chip, due out in December, has apparently been designed to natively support C++.
     
    #22     Oct 13, 2009
  3. I think threads on GPUs/CUDA have about the highest proportion of nonsense, buzzwords, posturing, chest-beating, etc. of any on E'trader. I would be surprised if anyone on such threads has made a nickel using a GPU solution.
     
    #23     Oct 13, 2009
  4. in basic terms, why?

    when we debated using supercomputers to crunch numbers and handle the strains of program trading a few generations ago, we looked at secondary processors and the additional support they would provide, and found them wanting.

    in short, offloading display or logic processing to the GPUs on display cards comes at a bandwidth cost that pales in comparison to the significant recent achievements and improvements in:

    1) Intel's motherboard architectures as of late
    2) the capabilities of the Xeon and Extreme Edition processors
    3) the Core i7 and Core i7 Extreme Edition processors

    simply put, the GPUs on display cards run at lower clock frequencies than the mobo/chipset combinations...

    also, if you offload logic processing then you are essentially using a gaming technique that usually would not be supported at major firms (just boutique shops), much as gamers offload data, passwords, code and hack notes to their data-capable mice instead of putting them on USB keys or their HDDs.

    all you have done is create a maintenance and support nightmare that is essentially locked to that specific release and version of that hardware.

    imagine six months from now, with completely new staff and no reference or documentation notes (whole-shop firings or aborted projects happen often), trying to debug or reprogram your essential business logic...

    it's just not going to happen (properly)
     
    #24     Oct 13, 2009
  5. nitro

    :eek: :eek:

    http://www.guru3d.com/news/nvidia-gt300s-fermi-architecture-unveiled/

    :eek: :eek:

    If nVIDIA put two network cards in this GPU, I don't think I would ever need to buy a standard computer again. Just give me a bank of these. No wonder INTC is crapping its pants.
     
    #25     Oct 13, 2009
  6. Corey

    Not all gains are monetary. I ported my Monte Carlo simulations, with only pretty basic knowledge, and got the new system to run in about 7% of the time of the old one. I am planning on moving my genetic algorithms over as well. I would also like to see if I can speed up my pairs analysis by porting some cointegration tests.

    No, this doesn't make me 'money' per se (in terms of high-frequency trading), but it sure does save me a hell of a lot of time.

    The point is less about the specific implementation: perhaps CUDA won't be around much longer -- but what I have learned is directly portable to OpenCL, which is a much broader standard, and may very well be the next generation. Why not at least take a peek at what is coming down the road?
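    For anyone curious what such a port can look like, here is a minimal CUDA Monte Carlo sketch. It estimates pi rather than anything finance-specific (the actual simulations above aren't shown), and every kernel name, grid size, and parameter here is illustrative, not anyone's production code:

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Each thread runs its own stream of random trials and counts how many
// points land inside the unit quarter-circle; the host sums the tallies.
__global__ void mc_pi(unsigned long long *hits, int trials_per_thread,
                      unsigned long long seed) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curandState rng;
    curand_init(seed, tid, 0, &rng);   // independent sequence per thread
    unsigned long long local = 0;
    for (int i = 0; i < trials_per_thread; ++i) {
        float x = curand_uniform(&rng);
        float y = curand_uniform(&rng);
        if (x * x + y * y <= 1.0f) ++local;
    }
    hits[tid] = local;
}

int main() {
    const int threads = 256, blocks = 64, trials = 4096;
    const int n = threads * blocks;
    unsigned long long *d_hits;
    cudaMalloc(&d_hits, n * sizeof(unsigned long long));
    mc_pi<<<blocks, threads>>>(d_hits, trials, 1234ULL);

    std::vector<unsigned long long> h_hits(n);
    cudaMemcpy(h_hits.data(), d_hits, n * sizeof(unsigned long long),
               cudaMemcpyDeviceToHost);
    unsigned long long total = 0;
    for (int i = 0; i < n; ++i) total += h_hits[i];

    double pi = 4.0 * (double)total / ((double)n * trials);
    printf("pi ~= %f\n", pi);
    cudaFree(d_hits);
    return 0;
}
```

    The embarrassingly parallel shape of this workload (no communication between trials) is why Monte Carlo is usually the first thing people port.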
     
    #26     Oct 13, 2009
  7. dloyer

    I did some work with CUDA to speed up my backtests.

    It reduced my backtest time from 15s/pass to 5ms/pass for 1 year of 5 minute bars over 800 symbols.

    I don't use it anymore. It takes too much time to develop and debug everything in C/C++.

    I used a low-end card that I happened to have. Newer cards are much, much faster.

    The CUDA architecture is optimized for vector processing rather than general-purpose processing. The main tradeoff is spending the transistor budget on floating-point execution units rather than a big cache. A normal CPU will stall for many, many cycles on a memory access, so it has a big cache to avoid the delay. The CUDA approach is to keep many, many threads in flight, so there is always a thread ready to execute while others wait on memory. It only works for problems that are small enough to fit into memory and big enough to keep all those cores busy.
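    The pattern described here -- one independent thread per symbol, with enough threads in flight to hide memory latency -- can be sketched as a toy CUDA backtest. The moving-average strategy, the data layout, and all names below are illustrative assumptions, not the actual backtester from this post:

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// One thread per symbol: each thread walks its own bar series with a
// simple SMA-crossover strategy. With hundreds of symbols there are
// always warps ready to run while others wait on memory loads.
__global__ void sma_backtest(const float *closes, float *pnl,
                             int n_symbols, int n_bars, int window) {
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= n_symbols) return;
    const float *px = closes + (size_t)s * n_bars;  // this symbol's bars
    float sum = 0.0f, result = 0.0f, entry = 0.0f;
    bool in_pos = false;
    for (int i = 0; i < n_bars; ++i) {
        sum += px[i];                               // rolling window sum
        if (i >= window) sum -= px[i - window];
        if (i < window - 1) continue;
        float sma = sum / window;
        if (!in_pos && px[i] > sma)     { in_pos = true;  entry = px[i]; }
        else if (in_pos && px[i] < sma) { in_pos = false; result += px[i] - entry; }
    }
    if (in_pos) result += px[n_bars - 1] - entry;   // close any open position
    pnl[s] = result;
}

int main() {
    // Roughly a year of 5-minute bars over 800 symbols, as in the post.
    const int n_symbols = 800, n_bars = 78 * 252, window = 20;
    std::vector<float> closes((size_t)n_symbols * n_bars);
    for (size_t i = 0; i < closes.size(); ++i)
        closes[i] = 100.0f + 5.0f * sinf(i * 0.01f);  // synthetic prices

    float *d_closes, *d_pnl;
    cudaMalloc(&d_closes, closes.size() * sizeof(float));
    cudaMalloc(&d_pnl, n_symbols * sizeof(float));
    cudaMemcpy(d_closes, closes.data(), closes.size() * sizeof(float),
               cudaMemcpyHostToDevice);

    sma_backtest<<<(n_symbols + 255) / 256, 256>>>(d_closes, d_pnl,
                                                   n_symbols, n_bars, window);

    std::vector<float> pnl(n_symbols);
    cudaMemcpy(pnl.data(), d_pnl, n_symbols * sizeof(float),
               cudaMemcpyDeviceToHost);
    printf("symbol 0 pnl: %f\n", pnl[0]);
    cudaFree(d_closes); cudaFree(d_pnl);
    return 0;
}
```

    Note that each thread's bar loop is sequential; the parallelism comes entirely from running the 800 symbols at once, which is why the problem has to be "big enough to keep all those cores busy."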

    Another way to solve even bigger problems is to use map/reduce, but the hardware cost is much higher.
     
    #27     Oct 13, 2009
  8. nitro

    I went for a long walk with my pooch, and I started to think about the implications of this. If nVIDIA takes this to its logical conclusion fearlessly, I am going to go out on a limb and say that nVIDIA would be the next Apple, and the stock could easily go to 100.

    This is a disruptive technology, or it can easily be made one by adding a solid-state drive to the card, a keyboard and mouse port, and a network port (it already has a graphics port :) ). Then implement an add-in for the Visual Studio compiler to support the card directly, port the .NET virtual machine, and you have a winner.
     
    #28     Oct 13, 2009
  9. Nvidia CUDA and Tesla give you a personal supercomputer.

    http://www.youtube.com/v/l8FUmS1h-5U

    http://www.youtube.com/v/5qyhLAsVxlU
     
    #29     Nov 13, 2009
  10. http://www.nvidia.com/object/tesla_computing_solutions.html

    The NVIDIA® Tesla™ 20-series is designed from the ground up for high-performance computing. Based on the next-generation CUDA GPU architecture, codenamed “Fermi”, it supports many “must have” features for technical and enterprise computing. These include ECC memory for uncompromised accuracy and scalability, support for C++, and 8X the double-precision performance compared to Tesla 10-series GPU computing products. Compared to the latest quad-core CPU, Tesla 20-series GPU computing processors deliver equivalent performance at 1/20th the power consumption and 1/10th the cost.
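    The marketing numbers can't be checked without the hardware, but the double-precision support the blurb refers to is exercised by ordinary kernels such as DAXPY (y = a*x + y). A minimal sketch, with sizes chosen purely for illustration:

```cuda
#include <cstdio>
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// y = a*x + y in double precision -- the workload class behind the
// "8X double precision" claim for Fermi-generation parts.
__global__ void daxpy(int n, double a, const double *x, double *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<double> x(n, 1.0), y(n, 2.0);
    double *d_x, *d_y;
    cudaMalloc(&d_x, n * sizeof(double));
    cudaMalloc(&d_y, n * sizeof(double));
    cudaMemcpy(d_x, x.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), n * sizeof(double), cudaMemcpyHostToDevice);

    daxpy<<<(n + 255) / 256, 256>>>(n, 3.0, d_x, d_y);

    cudaMemcpy(y.data(), d_y, n * sizeof(double), cudaMemcpyDeviceToHost);
    assert(y[0] == 5.0);   // 3.0 * 1.0 + 2.0, exact in double precision
    printf("ok\n");
    cudaFree(d_x); cudaFree(d_y);
    return 0;
}
```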
     
    #30     Nov 17, 2009