SQL for trading...?

Discussion in 'Trading Software' started by CloroxCowboy, Jul 1, 2009.

How do you feel about SQL?

  1. Good tool for trading.

    27 vote(s)
    47.4%
  2. Awful tool for trading.

    11 vote(s)
    19.3%
  3. Never tried it / no opinion.

    18 vote(s)
    31.6%
  4. Heard so many bad things I'd never want to try it.

    1 vote(s)
    1.8%
  1. nitro

    nitro

    Exactomundo.
     
    #111     Sep 24, 2009
  2. thstart

    thstart

    ;)

    General purpose solutions are always a compromise. From other side a custom tailored solution is not easy.

    But after weighting the options we came to conclusion the time learning and testing all these products would be better spend if we begin coding from scratch a custom tailored solution.
     
    #112     Sep 24, 2009
  3. rdg

    rdg

    I'm glad to see this thread is still going...

    The timestamp issue is a problem, but it happens to not be much of a problem for me. A lot of the data I'm working with is from iqfeed which only has second resolution. The data does have implied order within a second but I threw that out. I really should have added a column to store the relative ordering, or a subsecond column for my data that has accurate time stamps. That would at least solve the problems of being able to retain order using ORDER BY, but you would have to work to get it into a datetime type in C#, for example. I can't imagine MySql will ignore milliseconds forever, so maybe there will be a better solution in the future.

    One thing I learned with this is that I had to stop thinking in the row-based mindset and include as many columns as I think could ever be useful. I really didn't go far enough when designing the schema, even with 14 columns total. After using this more, I would have added more columns (dollar value of the trade, where the trade occurred relative to the book, etc). For most types of data it is essentially free to add columns, so don't hold back if you are starting from scratch! Queries that only operate on a single column are always going to be faster than queries that operate on multiple columns because of the way it stores data.

    At one point I put together a binary flat format that was extremely fast for the type of query I was running, but in the end, I've quit using it and have moved completely over to ib. My main use case so far is to take a (relatively) large amount of data, perform a transformation on it that returns a tiny fraction of the data, and analyze it offline. I've ended up storing the transformed data in both ib and sql server for analysis, and I imagine that I will do the same as I find more transformations that I want to apply.

    I have really come to appreciate that IB isn't a time series database. Adding computed columns to hold commonly used data is a partial solution, but the bottom line is that it sucks at things that time series databases really excel at. It will return results in a heartbeat for all sorts of complex (and useful) queries, but try to get it to compute a moving average or tell you at what price MSFT was trading at every time the S&P down ticked yesterday and you will cripple it. If I ever have the type of resources to invest in kdb, I'll ditch ib and switch without thinking twice. Needless to say, I'm not at that point yet, and I haven't come across any better solutions so far.
     
    #113     Oct 7, 2009
  4. thstart

    thstart

    I am glad too it is so important.

    Column oriented is a part of the picture - it is still too general.
    Time series oriented comes closer to what we need. Still there is one more component - computations. kdb has its own language. If we have to use a computer language the first thing I am looking for is it popular and if it is live. kdb is very proprietary and practically dead.
     
    #114     Oct 7, 2009
  5. nitro

    nitro

    I am analysing a large database of historical options data. I am storing it in a MySQL ENGINE=MyISAM.

    It is horribly slow.
     
    #115     Nov 4, 2009
  6. Occam

    Occam

    My experience with RDBMS's with large amounts of price data was similar -- slow and large memory/disk footprint.
     
    #116     Nov 4, 2009
  7. I hope you're at least using a dimensional model to store the data
     
    #117     Nov 4, 2009
  8. nitro

    nitro

    You mean using a Star Schema?
     
    #118     Nov 5, 2009
  9. Yea.
     
    #119     Nov 5, 2009
  10. For analysis, you might wanna check out SQL Server Analysis Services, or a database solution made for analysis/olap
     
    #120     Nov 5, 2009