best way to hack up an rdbms to approximate something like kdb for poor/dumb people

Discussion in 'Automated Trading' started by garchbrooks, Mar 23, 2010.

  1. look into infobright.

    Also, there are lots of other suggestions in the different tick database threads out there.

    Eric
     
    #11     Mar 24, 2010
  2. I tried infobright, but it didn't work. I loaded everything from a CSV, and then did:

    select ticker from ticktable where ticker = 'foo';

    And I get an empty set. Is it supposed to work that way? It did fetch data quickly, but none of the queries were working.

    Tried reading the forums, couldn't find anyone who had the same problem. Used to same exact query that would work in MySQL, and infobright was having none of it.
     
    #12     Mar 24, 2010
  3. I was hopeful I was hoping for a magic bullet, , but figure you are probably right. (Although, to be fair, I made my request for poor and dumb people for a reason.)

    KDB usually takes a few seconds to fetch queries, but it has its own data limits. The hardware required was quite a bit. I was hoping to not to have to spend too much money getting new hardware/software.

    If I could get the query time down into the range of maybe 10-15 seconds then that might be workable.
     
    #13     Mar 24, 2010
  4. Ok, here is what I would go for on a budget. Still going to be a lot:

    * AMD X6 when it comes in a month
    * 16gb on a mini atx board
    * SAS controller
    * There is a SuperMicro SAS Cage for 8x2.5" discs. Costs around 400 USD ;)
    * Velociraptors. 8. Alternatively some SSD.

    On a non-budget... well... Supermicro case SAS backplane place for 24 discs. Dual Opteron with 1x6 cores, 32gb ram, doublind as needed. Raid controller, Velociraptors. Many. Or SSD.

    If that is not long term you could try renting capacity on a server like that.

    10-15 seconds are doable, but it needs hardware. Basically lots of RAM, fast discs. Especially if you want to store, for example, the results in the database.
     
    #14     Mar 24, 2010
  5. Did you verify that your data loaded properly? (http://www.infobright.org/wiki/Data_Loading/) I dont remember the command, but there is one to show the table statistics to make sure your data loaded properly

    Did you make sure your query was written properly? (http://www.infobright.org/wiki/Optimized_SQL_Functions_and_Operators/)

    I've never had any problems like that before. I'd suspect the issue is on your data load.
     
    #15     Mar 24, 2010
  6. Like posted above, sounds like a data load problem. I've played with infobright with a few billion rows of tick data. Its performance profile isn't all it's hyped up to be, but blows the other MySQL engines out of the water for the right use case.
     
    #16     Mar 24, 2010
  7. This works:

    select ticker from tickdata;

    This, however, doesn't:

    select ticker from tickdata where ticker='A';

    And that's on a table with 10 rows having nothing but rows where ticker = 'A''.

    If I shut down and restart the database, the data is there. It just has some kind of issue with the table. Could be a weird .deb package I installed, that's a possibility.
     
    #17     Mar 24, 2010
  8. either one of these works for me:
    USE Futures_Data; SET @bh_dataformat = 'txt_variable';
    select Symbol from Futures_Data where Symbol = "AD" limit 10;

    or


    USE Futures_Data; SET @bh_dataformat = 'txt_variable';
    select Symbol from Futures_Data where Symbol = 'AD' limit 10;
     
    #18     Mar 24, 2010
  9. Give SQlite a shot, it supports in memory database for optimizing performance. Your disk I/O will be a killer for the initial data load. Wrap your writes inside of transactions.

    It is a very fast ACID compliant DB in a single file.



     
    #19     Mar 24, 2010
  10. I really am a database newbie, so please pardon me if what I'm saying sounds absolutely dumb.

    Here's a sample slice of 1-minute data, in a file called 'slice.csv':

    Code:
    '2008-12-01', 10:21:00, 'A', 17.820000, 17.830000, 17.780000, 17.800000, 1600, 1400
    '2008-12-01', 10:22:00, 'A', 17.800000, 17.830000, 17.790000, 17.830000, 3200, 2100
    
    When I log into infobright, this is the string I get (so I know it's infobright):

    Code:
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 2
    Server version: 5.1.40-log build number (revision)=IB_3.3.1_r6997_7017(ice)
    
    I then do this (there's a backslash in here that doesn't show up right on ET; it's correct):

    Code:
    mysql> load data infile '/tmp/slice.csv' into table tickdata fields terminated by ',' enclosed by '\'';
    Query OK, 50910948 rows affected (3 min 10.44 sec)
    
    So the results of this query:

    Code:
    mysql> use tickdb;
    Database changed
    mysql> select ticker from tickdata limit 10;
    +--------+
    | ticker |
    +--------+
    |  'A'   |
    |  'A'   |
    |  'A'   |
    |  'A'   |
    |  'A'   |
    |  'A'   |
    |  'A'   |
    |  'A'   |
    |  'A'   |
    |  'A'   |
    +--------+
    10 rows in set (0.09 sec)
    
    But then I do:

    Code:
    mysql> select ticker from tickdata where ticker='A' limit 10;
    Empty set (0.02 sec)
    
    ... empty set??

    The test table layout I'm using for tickdata is this:

    Code:
    mysql> describe tickdata;
    +------------+----------+------+-----+---------+-------+
    | Field      | Type     | Null | Key | Default | Extra |
    +------------+----------+------+-----+---------+-------+
    | tickdate   | date     | NO   |     | NULL    |       |
    | tstamp     | time     | NO   |     | NULL    |       |
    | ticker     | char(16) | NO   |     | NULL    |       |
    | openprice  | float    | YES  |     | NULL    |       |
    | highprice  | float    | YES  |     | NULL    |       |
    | lowprice   | float    | YES  |     | NULL    |       |
    | closeprice | float    | YES  |     | NULL    |       |
    | upvolume   | int(11)  | YES  |     | NULL    |       |
    | downvolume | int(11)  | YES  |     | NULL    |       |
    +------------+----------+------+-----+---------+-------+
    9 rows in set (0.03 sec)
    
    What am I doing wrong here?
     
    #20     Mar 24, 2010