SQL for trading...?

Discussion in 'Trading Software' started by CloroxCowboy, Jul 1, 2009.

How do you feel about SQL?

  1. Good tool for trading.

    27 vote(s)
    47.4%
  2. Awful tool for trading.

    11 vote(s)
    19.3%
  3. Never tried it / no opinion.

    18 vote(s)
    31.6%
  4. Heard so many bad things I'd never want to try it.

    1 vote(s)
    1.8%
  1. nitro

    nitro

    #41     Jul 3, 2009
  2. HDF5
     
    #42     Jul 3, 2009
  3. nitro

    nitro

HDF5!? Really? It is just a datastore; it doesn't support queries.
     
    #43     Jul 3, 2009
BTW, in fairness to Vertica, I should mention their "$500 a month" cloud solution. http://www.vertica.com/_pdf/verticacloudpricing

    Whether it's useful for your purposes, and whether your needs are sufficiently encompassed by this "bargain" offering, I wouldn't know. And I haven't tried it myself.
     
    #44     Jul 3, 2009
  5. thstart

    thstart

    1) Amazon EC2-hosted
    2) 500GB/$500/month/1 node
    1TB/~$4,000/month/3 nodes

    1) It is not the best idea for high-performance computing.
    1.1) You have the context-switching overhead.
    1.2) Moving 500GB-5TB back and forth to Amazon's servers - I don't see how, at current transfer speeds, this would be a practical solution.
    2) That is just for the DB. Once you add the other costs - development, etc. - getting an ROI is a long shot.
     
    #45     Jul 3, 2009
  6. thstart

    thstart

    By warehouse-oriented I mean large, TB-size DBs. I don't believe many interested parties have access to TB-size financial data - probably only the exchanges, very large financial firms, clearing houses, etc.

    Being TB-size oriented, their focus is mostly on storage and management. For Vertica plus computations you need a 3rd-party solution.

    I believe the products and the brains are excellent for large warehouse DBs. Mr. Stonebraker has >30 years of experience - that is undisputed. He is also well connected to the above-mentioned large companies. As of last year's data, Vertica had about 35 paying customers; I don't know if this is a lot.

    I would add that the whole column-oriented DB story is way over-hyped, like some kinds of stocks. This is understandable because they have investors.

    The Sybase vs. Vertica lawsuit is exactly about the column-oriented way of storing data. We analysed the Sybase patent - it is ~20 years old, so if Vertica is doing things this way it is already old technology. As I mentioned before, column orientation is not the whole story - it helps only with disk I/O performance. The disk being the slowest component makes it the obvious low-hanging fruit, as they say today. This is why their focus is on multi-TB, warehouse-oriented applications.

    From my records, last year Vertica's pricing was ~$150K per terabyte of raw, uncompressed user data. This again is a warehouse-oriented approach targeted at large inventory databases.

    If we need maximum performance, the best results are achieved when the DB is specifically tailored to the data of the problem domain. That is part of the reason why general COTS DBs are so slow - something like a jack of all trades but master of none.

    Looking at the way the so-called column-oriented DBs are going, it is similar - having solved the I/O problem, they try to fit their solution to all markets. This is understandable because they have investors, but it is not exactly best for the customers.
     
    #46     Jul 3, 2009
  7. thstart

    thstart

    The LSE TradElect platform has been abandoned.
    http://blogs.computerworld.com/london_stock_exchange_to_abandon_failed_windows_platform

    "TradElect runs on HP ProLiant servers running, in turn, Windows Server 2003. The TradElect software itself is a custom blend of C# and .NET programs, which was created by Microsoft and Accenture, the global consulting firm. On the back-end, it relied on Microsoft SQL Server 2000. Its goal was to maintain sub-ten millisecond response times, real-time system speeds, for stock trades.

    It never, ever came close to achieving these performance goals."

    Here is an example of a high-profile joint Accenture/outsourcing/Microsoft project for the London Stock Exchange. Microsoft was involved at all levels of development, testing, deployment, and support.

    1) the cost: ~40M GBP
    2) only used for two years, now abandoned
    3) 10,000 orders/second (this is low for an exchange)
    4) latency 10 ms+ (this is a lot)

    This is an example of using the wrong tools for the job. The MS server OS is good enough and MS tools are good for UI, but why did they use them for a real-time application?

    I believe the main reason for this failure is because .NET is at the heart of the platform.

    If the MS OS is configured with minimal overhead - not using .NET, programming to the bare metal with C++/assembler, using SIMD, formatting and aligning the disk properly, elevating the application to high execution priority, minimizing context switching, and using your desktop or server only for trading - it is possible to get good performance from current machines.
     
    #47     Jul 3, 2009
  8. I think this is absolutely not a .NET issue. Properly handled, .NET can outperform most C++ installs.

    It probably comes down to bad management of the whole thing, as well as totally crap programming. Seeing MS was "involved", and having been in some projects like that, this does not necessarily mean MS was responsible. There is so damn much politics involved in projects like that.

    Given the quality of development and information flow I have often seen at outsourcers, it is likely MS did a great job in its involvement, but Accenture fed them wrong data and the whole thing was so tangled up in politics that it just hit a wall.

    One has to divide the number of transactions they handle per second by the number of servers involved, and will realize this is a really low number coming out ;)

    Not saying .NET is perfect or anything, but whenever I see a name like "Accenture" in a project list, I personally make sure I do not get involved. Ever. Period.

    Add to that some likely stupid decisions by the architect team (which most likely has no clue how an exchange works etc.) and you end up with a stupid system design that kills performance.

    Except all of that is useless / not needed. OS overhead is minimal. SIMD will not get you anything in clearing applications - you simply miss the problem there. No data-level parallelism in price determination, sorry. Disk alignment should also be irrelevant... In fact, an exchange clearing application should not be disk-bound or do more to a database than logging under normal operations - it should keep everything in memory. High priority is another irrelevant issue. Point is, what is the sense of putting an application at high priority if NOTHING ELSE GOES ON ON THE SERVER? Yeah, you are first in line. Oh, and the only one. And minimal context switching is not something to optimize, because, again, one application = minimum context switching.

    In the end, we will never know. But there are non-trivial chances that this was either written as single threaded application running in a windows form that was never shown to the end user or it was written as managed code in stored procedures running in the database (with the "application" just calling those). Both are awkward stupid design decisions I have seen in real world projects, in both cases asked to get involved and in both cases some outsourcing team had kicked up major crap.

    Exchange clearing applications are inherently simple from the interfacing side (i.e. nothing utterly complicated like 3D involved). Properly tuned MS I/O stacks are not bad in performance, memory is memory, CPU is CPU (and in fact, C# is not that bad performance-wise). So in the end it comes down to programming skill. And that is not exactly something companies like Accenture are known for, sadly.
     
    #48     Jul 4, 2009
  9. thstart

    thstart

    I have been working with .NET from the very beginning, ~5 years ago with version 1.0. In the beginning I was very excited. It speeds up development a lot, UI programming is quick, Visual Studio is probably the best development environment ever, and SSIS under MS SQL is very good and parallelizes when you have >1 CPU. What can I say - the best development environment ever, all functions available with IntelliSense ready for you.

    I developed probably >20 web services using SOAP, XML, and REST for a very important company. So I do know .NET very well; I had direct contact with Microsoft developers and we solved many problems together - they are very helpful and know a lot.

    But when it comes to performance, it is a different picture.
    If you read some of my earlier posts in this SQL topic you would see I made extensive benchmarks, and I now know that .NET was the main cause of the lack of performance. Performance degrades exponentially in my tests - 20,212 data points, regression, etc., with ~1,000 columns generated from computations. With MS SQL 2008, MS Server 2008, .NET 3.5, and VB.NET it was impossible to handle >10,000 data points - at that point the computations take >1 hour on ~150MB of data. I suspect .NET's garbage collection slows things down a lot. Also, because .NET runs on a virtual machine, there is always an overhead no matter what you do.
    No vectorizing SIMD optimizations either.

    Sybase - the same computations - 5 minutes. That was still unacceptable. But you can still see the big difference between the .NET environment and non-.NET.

    On top of that, the performance monitor shows the .NET version at max utilization of both CPUs, while Sybase outperforms .NET ~10x using only 1 CPU - but it is very expensive and priced per CPU.

    Environment: Dell PowerEdge 2850, 2x Xeon 3 GHz CPUs, 4 GB RAM, RAID with 2x 75 GB HDD.

    See my earlier posts, or try making extensive benchmarks yourself.
     
    #49     Jul 4, 2009
  10. thstart

    thstart

    I believe they wanted to store this data somewhere, and for that they used MS SQL Server, right?
     
    #50     Jul 4, 2009