data base construction

Discussion in 'Data Sets and Feeds' started by sevenlaws, Jul 1, 2010.

  1. edbar

    1. If the system is currently being programmed by someone, then they must have the data stored somewhere; otherwise, how are they retrieving it now for the programming they are doing?

    2. If you are not a programmer and your programmers do not know how to store data, maybe the project is way bigger than you realize and is going to cost a fortune and take years to complete.

    My recommendation is to use an existing system. Some require programming but some are just point and click so you can do what you want without being a programmer.

    To answer your question, I use SQL for everything prebuilt, like pre-created indicators, and flat files for historical data, for fast retrieval.
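
    A minimal sketch of that split, assuming SQLite as the SQL store and a fixed-width binary file for the price history. The table name, record layout, and helper names are made up for illustration and are not the poster's actual setup:

```python
import sqlite3
import struct

# Hypothetical bar layout: epoch seconds (int64) followed by OHLC as doubles.
BAR = struct.Struct("<q4d")

def write_bars(path, bars):
    """Write (timestamp, open, high, low, close) tuples as fixed-width records."""
    with open(path, "wb") as f:
        for bar in bars:
            f.write(BAR.pack(*bar))

def read_bars(path):
    """Read the whole flat file in one call and unpack every record."""
    with open(path, "rb") as f:
        data = f.read()
    return [BAR.unpack_from(data, i) for i in range(0, len(data), BAR.size)]

def save_indicator(conn, symbol, name, ts, value):
    """Store one precomputed indicator value on the SQL side."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS indicators ("
        "symbol TEXT, name TEXT, ts INTEGER, value REAL, "
        "PRIMARY KEY (symbol, name, ts))"
    )
    conn.execute(
        "INSERT OR REPLACE INTO indicators VALUES (?, ?, ?, ?)",
        (symbol, name, ts, value),
    )

def load_indicator(conn, symbol, name):
    """Return (ts, value) rows for one indicator, oldest first."""
    return conn.execute(
        "SELECT ts, value FROM indicators "
        "WHERE symbol = ? AND name = ? ORDER BY ts",
        (symbol, name),
    ).fetchall()
```

    The flat file is read front to back with no query layer in between, while the indicator table can still be filtered by symbol and name.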

    Ed
     
    #11     Jul 4, 2010
  2. When you say fast retrieval, do you mean optimize-on-the-fly fast, or just fast enough that simulation isn't slow?

    Optimize on the fly would mean SSD or RAID, very high I/O, fast CPU(s) with software designed for hyper-threading, and lots of RAM (possibly a DDR3 RAM drive or the like).

    Fast access for simulation can be a Drobo or even a tower box with a decent RAID array over a gigabit network, and you will be plenty fine/fast. I can push a moderate strategy with 500-750 tickers through simulation in about 2-5 minutes.
     
    #12     Jul 4, 2010
  3. edbar

    When referring to "fast" I was not even talking about the file system used or the hardware. I just meant fast retrieval when you compare opening a file and reading the data with the overhead of going through a DBMS.
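
    One way to check that comparison on your own data is to time both paths. A rough sketch, assuming SQLite stands in for the DBMS and a packed binary file stands in for the flat file. The record layout and function names are hypothetical, and the numbers will vary with hardware, caching, and the DBMS you actually use:

```python
import os
import sqlite3
import struct
import tempfile
import time

# Hypothetical tick record: epoch milliseconds (int64) and price (double).
REC = struct.Struct("<qd")

def build_stores(ticks, directory):
    """Write the same ticks to a flat binary file and to a SQLite table."""
    flat_path = os.path.join(directory, "ticks.bin")
    db_path = os.path.join(directory, "ticks.db")
    with open(flat_path, "wb") as f:
        f.write(b"".join(REC.pack(ts, px) for ts, px in ticks))
    conn = sqlite3.connect(db_path)
    with conn:  # commits the inserts on success
        conn.execute("CREATE TABLE ticks (ts INTEGER, price REAL)")
        conn.executemany("INSERT INTO ticks VALUES (?, ?)", ticks)
    conn.close()
    return flat_path, db_path

def read_flat(path):
    """Open the file, read it in one call, unpack every record."""
    with open(path, "rb") as f:
        return list(REC.iter_unpack(f.read()))

def read_db(path):
    """Connect, query every row in insertion order, disconnect."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute(
            "SELECT ts, price FROM ticks ORDER BY rowid"
        ).fetchall()
    finally:
        conn.close()

def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    ticks = [(i, 100.0 + i * 0.25) for i in range(100_000)]
    with tempfile.TemporaryDirectory() as d:
        flat_path, db_path = build_stores(ticks, d)
        flat_rows, flat_s = timed(read_flat, flat_path)
        db_rows, db_s = timed(read_db, db_path)
        assert flat_rows == db_rows
        print(f"flat file: {flat_s:.3f}s  sqlite: {db_s:.3f}s")
```

    A single run mostly measures a warm cache, so repeat it a few times and with realistic file sizes before drawing conclusions.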

    Ed
     
    #13     Jul 4, 2010
  4. I will try to answer everyone. The programmers are mostly stock guys and we are commodity guys. They wanted to build the system on their server, host it there, and use Silverlight for the program to be accessed at first. They do have a data base, but it's all stocks and futures, realtime only (REALWORLD SYSTEMS in Panama). They had futures data but no options. So we decided to host the program here, and I think their programmers were not used to that setup.

    Would anyone be willing to email me a snapshot of a setup? Just something for me to see. I understand if it's best not to in some cases, but if there is anything you could share that's common knowledge, I think it could help us a lot.

    Thanks for all the help so far. It really is helping me and my partner decide whether or not we should buy it.
    He's trying to pull Excel tick data and 1-minute data right now, but it's going slow. I'll read about the HDF5 stuff, thanks.
     
    #14     Jul 6, 2010
  5. Our computer is sick fast: quad core, 12 gigs of RAM. We didn't skimp on the CPU at all. We have 2 TB drives for now, one for the OS and one for storage. We have a case that can hold 5 pull-out drives, so I think we are OK on the computer for now. Maybe we will need another one for the data base itself?
     
    #15     Jul 6, 2010
  6. GTS

    The first thing you need to know is database is one word.
     
    #16     Jul 6, 2010
  7. ET99

    Most kids have a better and faster computer than you do.
     
    #17     Jul 6, 2010
  8. ET99

    What kind of joke is that?
    Either you use C# or C++.
     
    #18     Jul 6, 2010
  9. Hm, sorry.

    * Outdated CPU
    * WAY too little RAM for the CPU to play with
    * Slow drives.

    Congratulations ;)

    Not a bad computer, but I would not call that "sick fast".
     
    #19     Jul 6, 2010
  10. Many people use .NET as an engine and C++/C# for connectivity. Programmers use whatever they know best, and it really does not matter.

    OP, I pull data on one machine with simple 1TB 7,200rpm HDDs (cheap retail). I then (daily) transfer the data over to my database, which is a modified Backblaze. The Backblaze is a 4U rackmount 48-HDD enclosure driven by an Intel E8400 3.0GHz CPU and 4.0GB DDR2 RAM. The Backblaze is ONLY for storage. The 48 2TB HDDs are arranged into 3 separate RAID6 arrays.

    http://www.grijpink.eu/tools/raid/index.php
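
    RAID6 usable capacity can also be sanity-checked by hand, since two drives' worth of space in each array goes to parity. A minimal sketch, assuming equal-size drives and ignoring filesystem overhead and TB/TiB differences; the function name is illustrative:

```python
def raid6_usable_tb(drives: int, drive_tb: float) -> float:
    """Usable capacity of one RAID6 array: two drives' worth goes to parity."""
    if drives < 4:
        raise ValueError("RAID6 needs at least 4 drives")
    return (drives - 2) * drive_tb

if __name__ == "__main__":
    for n in (15, 16):
        per_array = raid6_usable_tb(n, 2.0)
        print(f"{n} x 2 TB drives per array: {per_array:.0f} TB usable, "
              f"{3 * per_array:.0f} TB across three arrays")
```

    With 2 TB drives, 15 drives per array works out to 26 TB (78 TB across three arrays), while 16 drives per array would give 28 TB.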

    I have three 26TB arrays, giving me a total of 78TB in RAID6 on retail 2TB HDDs. This is all controlled by one E8400 CPU with 4GB RAM. All of this is over a gigabit network; to run simulations I query the data to a local machine and then run it locally. The database itself looks like a giant sequence of folders: I have one main directory (TickData), three subs (2010, 2009 & 2008), and then each yearly folder has one folder per trading day. Inside the daily folders there is one Access database file for each ticker.
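
    That folder layout maps directly onto a path-building helper. A small sketch, assuming daily folders are named by ISO date and files carry an `.mdb` extension; neither detail is stated above, and both function names are hypothetical:

```python
from datetime import date
from pathlib import Path

def tick_file(root: Path, day: date, ticker: str, ext: str = ".mdb") -> Path:
    """Build <root>/<year>/<trading day>/<ticker><ext>."""
    # Assumed naming: ISO date for the daily folder, upper-case ticker file.
    return root / f"{day.year}" / day.isoformat() / f"{ticker.upper()}{ext}"

def files_for_range(root: Path, ticker: str, days):
    """Return the per-day files that actually exist for one ticker."""
    candidates = (tick_file(root, d, ticker) for d in days)
    return [p for p in candidates if p.exists()]
```

    A simulation box could resolve the files for the days it needs this way and copy them locally before a run.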

    We use a C# & Java application to pull/record/write the tick data into the Access files. Simulation machines are mostly dual quad core Xeon CPU boxes with W7x64 and 8-12GB RAM. Execution boxes are single Q9650 (quad core non-hyper thread) with 8GB RAM running W7x64 or XPx64.

    A quad-core Q9xxx CPU isn't the fastest chip out there, but it is plenty for a RAID array and storage only, and it is certainly plenty for data recording/pulling. I've worked in the futures space a little bit but mostly equities. They're two different animals, so take your time with decision making.

    Good luck!
     
    #20     Jul 6, 2010