Data storage for backtesting

Discussion in 'App Development' started by calhawk01, Aug 24, 2012.

  1. Read up on SQL views.


     
    #11     Aug 27, 2012
  2. Great thread!! I find it very interesting to read, and PocketChange's reply was very useful!

    Thanks guys!

    Kind regards
    Espen :D
     
    #12     Aug 28, 2012
  3. You don't want to use views on a large table. You want to transform the data once and be done with it.

    Also keep in mind that views only work well for simple tasks, such as combining data from different tables or reshuffling columns a bit.

    Again, you want an ETL script that runs and pours your higher-frequency table into a lower-frequency table.
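    A minimal sketch of that ETL step, using Python's sqlite3 as a stand-in for whatever database ends up being used; the table names, schema, and 5-minute bucket size are all assumptions here, not anything from the thread:

```python
import sqlite3

# Hypothetical schema: 1-minute OHLCV bars, downsampled once by an
# ETL step into a 5-minute table (instead of recomputing via a view).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bars_1min (ts INTEGER, open REAL, high REAL, low REAL,
                        close REAL, volume INTEGER);
CREATE TABLE bars_5min (ts INTEGER, open REAL, high REAL, low REAL,
                        close REAL, volume INTEGER);
""")

# Two 5-minute buckets of toy data (ts = epoch seconds, one bar per minute).
rows = [(t, 100 + i, 101 + i, 99 + i, 100.5 + i, 10)
        for i, t in enumerate(range(0, 600, 60))]
conn.executemany("INSERT INTO bars_1min VALUES (?,?,?,?,?,?)", rows)

# The ETL query: bucket by ts/300, take first open, max high, min low,
# last close, summed volume. Runs once; the website then reads bars_5min.
conn.execute("""
INSERT INTO bars_5min
SELECT (ts / 300) * 300,
       (SELECT open  FROM bars_1min b2
         WHERE b2.ts / 300 = b1.ts / 300 ORDER BY b2.ts       LIMIT 1),
       MAX(high), MIN(low),
       (SELECT close FROM bars_1min b3
         WHERE b3.ts / 300 = b1.ts / 300 ORDER BY b3.ts DESC  LIMIT 1),
       SUM(volume)
FROM bars_1min b1
GROUP BY ts / 300
""")
print(conn.execute("SELECT COUNT(*) FROM bars_5min").fetchone()[0])  # 2 buckets
```

    Scheduling this to run after each trading session keeps the serving table small and the queries trivial.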
     
    #13     Aug 31, 2012
  4. My advice: use PHP and MySQL if you're going to be displaying data via a website. If you use those two technologies you'll be light years ahead for anything else you want to do related to the site, e.g. content management (WordPress, Drupal, etc.) or e-commerce (Magento).
    You can easily get things developed in PHP cheaply, and nothing you're doing with the actual data is anything extraordinary, so I would urge you to think more about what else you're doing with the site. You'll need content management; I always use WordPress, Drupal or Magento, but there are a bunch out there. At minimum, develop using a framework like Zend so every developer who works on the site has a standard to work from, organized files, etc. What you're actually talking about for data storage for backtesting is nothing for any SQL server, which is why I say forget that line of thinking.
     
    #14     Aug 31, 2012
  5. Especially on MySQL, which added views only relatively recently and still doesn't handle them very well. It's even worse when you have views referencing views. Use EXPLAIN to see what is going on with them; you will be shocked at the inefficiency under the hood, at least compared to a highly optimized commercial database.
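    A quick illustration of the EXPLAIN habit, using SQLite's EXPLAIN QUERY PLAN as a stand-in (MySQL's EXPLAIN output looks quite different, but the idea is the same); the table and view names are made up:

```python
import sqlite3

# A view over a GROUP BY expression hides a full table scan: the query
# text looks cheap, but the plan shows what actually executes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bars (ts INTEGER PRIMARY KEY, close REAL);
CREATE VIEW v_bars AS
    SELECT ts / 300 AS bucket, AVG(close) AS avg_close
    FROM bars GROUP BY ts / 300;
""")

# Inspect how the engine really executes a query against the view.
plan = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM v_bars").fetchall()
for row in plan:
    print(row[-1])   # includes 'SCAN bars': the view is a scan every call
```

    Running the plan before shipping a view is cheap insurance; if it shows a scan per call, that's the case for materializing the result with an ETL step instead.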
     
    #15     Aug 31, 2012
  6. Not that it really makes a difference, because the OP likely lacks the funds to put a database server worth calling one in place ;) Dealing with those amounts of data requires a dedicated, planned infrastructure of some sort. I am personally considering getting a nice 60-disc SAS subsystem (i.e. a 4U rack case that vertically houses up to 60 SAS discs, or SATA discs with interposer cards) to store and handle those amounts of data.
     
    #16     Aug 31, 2012
  7. WHOA... Please share the details :D? I modified my Backblaze units so they are 3x 16 drives (48 drives total), so I'm interested to learn what motherboard and chassis fit 4x 15 backplanes. I'd try to modify that up to 64 drives, assuming the motherboard could support that many.

    And what would I do with that size array? Probably run 64x 3TB drives and get about 180TB in RAID 60 (yes sixty) for an archive box.

    This would be a really cool build so if any of the chassis or specs are public or open-source please let us know.
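    For what it's worth, the ~180TB figure works out if the array stripes across two 32-drive RAID 6 groups; the group layout below is an assumption for the arithmetic, not anything stated in the thread:

```python
# Back-of-envelope check of "64x 3TB drives -> ~180TB in RAID 60".
# RAID 60 stripes several RAID 6 groups; each group gives up 2 drives
# to parity, so usable space depends on how the groups are carved up.

def raid60_usable_tb(drives: int, drive_tb: float, groups: int) -> float:
    """Usable capacity: per group, (drives_in_group - 2) data drives."""
    per_group = drives // groups
    return groups * (per_group - 2) * drive_tb

print(raid60_usable_tb(64, 3, groups=2))  # 180.0 TB with 2x 32-drive groups
print(raid60_usable_tb(64, 3, groups=4))  # 168.0 TB with 4x 16-drive groups
```

    Fewer, larger groups waste less space on parity but take longer to rebuild; that trade-off is the real design decision for an archive box.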
     
    #17     Aug 31, 2012
  8. None. There is neither a motherboard nor a processor in the classical sense in there. This is a SAS subsystem: it has two integrated SAS controllers, and you are supposed to plug it into a SAS network of some sort (a switch, or directly into a RAID card).

    You will not like the price: around 10k USD ;) Check out the DataON 1660D on the internet ;)
     
    #18     Aug 31, 2012
  9. This is one of those exception-to-the-"rules" cases. The views are simple and the result columns are all contained in the index; only the GROUP BY clause (1 min, 5 min, 10 min, etc.) changes.

    When doing an indexed lookup of a row, the usual procedure is to do a binary search on the index to find the index entry, then extract the rowid from the index and use that rowid to do a binary search on the original table. Thus a typical indexed lookup involves two binary searches.

    If, however, all columns that were to be fetched from the table are already available in the index itself, SQL will use the values contained in the index and will never look up the original table row. This saves one binary search for each row and can make many queries run twice as fast.

    For this specific case, calling a 1 min view, 2 min view, 5 min view, etc., performance is already optimized, with single-table simplicity. K.I.S.S.
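    A small sketch of the covering-index effect described above, using SQLite (the table and index names are illustrative): when every selected column lives in the index, the plan reports a covering index and the second binary search into the base table is skipped.

```python
import sqlite3

# Index carries (symbol, ts, close); a query touching only those columns
# is answered from the index alone -- one search instead of two.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bars (ts INTEGER, symbol TEXT, close REAL, volume INTEGER);
CREATE INDEX idx_cover ON bars (symbol, ts, close);
""")

plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT ts, close FROM bars WHERE symbol = 'ES' ORDER BY ts
""").fetchall()
print(plan[0][-1])   # mentions COVERING INDEX: no base-table lookup
```

    Add `volume` to the SELECT list and the covering property is lost, since that column is not in the index; that is why the views here keep their result columns inside the index.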



     
    #19     Aug 31, 2012
  10. Too much information can be poison... the guy only wants 1 min data to serve to a website. Why does everything have to be taken to an exponent of 10 in complexity? No need.
     
    #20     Aug 31, 2012