importing vast set of financial data into mysql

Discussion in 'Data Sets and Feeds' started by silveredge, Nov 5, 2018.

  1. I am looking to upload a lot of financial data from bloomberg into mysql to start constructing a proprietary financial db.

    For now only basic stock data, but will expand options and fundamental data.
    -1000+ tickers
    -1-min price (bid, ask, last), volumn

    For someone who has gone through building his own db, can you please shed some light on an efficient way to upload the data? I just plan to download in csv files 1-by-1 then upload to mysql using a script.
     
  2. newwurldmn

    newwurldmn

    Two issues:

    1. Bloomberg has rules on how much data you can download a month. It’s a lot for normal usage but not a lot of you want to build a comphrensive database.

    2. With a vast amount of data you will need a proper database design. A few flat tables will make querying annoying.

    With sqlserver you can use a bulk upload bat file which is crazy fast (as it doesn’t affect the database logs). I don’t know if mysql has the same utility.
     
    silveredge likes this.
  3. Robert Morse

    Robert Morse Sponsor

    I think you will find for your $2000/month your sales person at Bloomberg will be very helpful. You should start there. They will also provide a sample format of the data.
     
    VPhantom likes this.
  4. tommcginnis

    tommcginnis

    If you design a "complete" (in your estimation) data set first, with a minimum of observations but all the fields ("and more!") that you could desire, AND THEN scale it up, you'll be in a better position to judge efficiencies.

    But the way you're going about it now (methods first, data sources second) -- you're going to end up with a great tool looking for the right problem to fit.

    My vote would be to get the sources first, and the template right, and then scale up. (BTDT...)
     
    Last edited: Nov 5, 2018
  5. I am lost in how to batch download stock data instead of a ticker-by-ticker which will take forever. Any pointer?
     
  6. Thanks. Actually, something I suspected but not yet bother to dig into. I eventually wanna have an 'offline' bloomberg for most market data I use (as the bloomberg account is tied to company and not sure how long it lasts).

    This project looks very useful. Anyone implemented
    https://www.codeproject.com/Articles/1078609/Market-Recorder-for-Interactive-Brokers-TWS
     
    Last edited: Nov 5, 2018
  7. DaveV

    DaveV

    To insert the records in to mysql, there is a "LOAD DATA INFILE" commad that is very fast. Google it for lots of examples.

    A few hints:

    Assuming you only want records from 09:30AM to 4PM, for 1000 symbols that works out to:
    6.5 hours * 60 minutes * 1000 symbols = 390,000 records per day Or 99,450,000 per year (assuming 255 trading days). Let's round that to 100M records per year.

    Assuming that you create a MySql table such as
    create table `Bloomberg_One_MInute_Quotes` (
    `Symbol` varchar(12) not null,
    `DateTime` Timestamp not null,
    `Open` float default null,
    `High` float default null,
    `Low` float default null,
    `Close` float default null,
    `Volume` int default null,
    primary key (`Symbol`,`DateTime`)
    );

    and, that the average symbol is 4 characters long, you will need 1+4 bytes per symbol, and 4 bytes each for all other columns. That works out to (1 + 4 + ( 4 * 6 )) = 29 bytes per record.

    Let's assume another 11 bytes per record for the Primary Key and Indices. So now we have 40 bytes per record. So in each year of trading, your MySql table will be 40 * 100M = 4GB.

    Assuming that you are using the MySql Community Server (i.e. the free version), the limit of any single table is 4GB, so the most you can store is one year. In that case, I suggest switching from MySql Community server to MariaDB, which is 100% upward compatible will support all your MySql scripts and commands. MariaDB is free, and in the public domain. Even Google uses it.
     
    Last edited: Nov 5, 2018
    helgen_1 and silveredge like this.
  8. RedDuke

    RedDuke

    here is sample how to load data. Do not use wizards will never finish on large data:

    LOAD DATA LOCAL INFILE 'D:\\Documents\\Options Data\\CBOEData\\UnderlyingOptionsEODQuotes_2018.csv'
    INTO TABLE options.spxeod
    FIELDS TERMINATED by ','
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES
     
  9. DaveV

    DaveV

    #10     Nov 5, 2018
    silveredge likes this.