I am looking to upload a lot of financial data from Bloomberg into MySQL to start constructing a proprietary financial db. For now only basic stock data, but I will expand to options and fundamental data:
- 1000+ tickers
- 1-min prices (bid, ask, last), volume
For someone who has gone through building his own db, can you please shed some light on an efficient way to upload the data? I just plan to download CSV files 1-by-1 then upload to MySQL using a script.
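To make the "download CSVs, then upload via script" plan concrete, here is a minimal sketch of the loader side in Python. Everything here is an assumption for illustration: the table name `minute_quotes`, the column list, and one-CSV-per-ticker layout are hypothetical, and the statements are only built as strings (you would execute each one through your MySQL client or connector):

```python
from pathlib import Path

def load_statement(csv_path: Path, table: str = "minute_quotes") -> str:
    # Build a LOAD DATA LOCAL INFILE statement for one downloaded CSV.
    # Table name and CSV column layout are hypothetical -- adjust to your schema.
    return (
        f"LOAD DATA LOCAL INFILE '{csv_path.as_posix()}' "
        f"INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' "
        "IGNORE 1 LINES "  # skip the header row Bloomberg exports
        "(Symbol, DateTime, Bid, Ask, Last, Volume)"
    )

# One statement per ticker file; run each via mysql or a connector library.
stmts = [load_statement(Path(f"{ticker}.csv")) for ticker in ["IBM", "MSFT"]]
```

The point of generating one bulk-load statement per file (rather than row-by-row INSERTs) is speed: a single LOAD DATA call on a million-row CSV is orders of magnitude faster than a million INSERTs.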
Two issues:
1. Bloomberg has rules on how much data you can download a month. It's a lot for normal usage, but not a lot if you want to build a comprehensive database.
2. With a vast amount of data you will need a proper database design. A few flat tables will make querying annoying.
With SQL Server you can use a bulk-upload bat file which is crazy fast (as it doesn't go through the database logs). I don't know if MySQL has the same utility.
I think you will find that, for your $2000/month, your salesperson at Bloomberg will be very helpful. You should start there. They will also provide a sample format of the data.
If you design a "complete" (in your estimation) data set first, with a minimum of observations but all the fields ("and more!") that you could desire, AND THEN scale it up, you'll be in a better position to judge efficiencies. But the way you're going about it now (methods first, data sources second) -- you're going to end up with a great tool looking for the right problem to fit. My vote would be to get the sources first, and the template right, and then scale up. (BTDT...)
I am lost on how to batch-download stock data instead of going ticker-by-ticker, which will take forever. Any pointers?
Thanks. Actually, something I suspected but hadn't yet bothered to dig into. I eventually want to have an 'offline' Bloomberg for most market data I use (as the Bloomberg account is tied to the company and I'm not sure how long it will last). This project looks very useful. Has anyone implemented it? https://www.codeproject.com/Articles/1078609/Market-Recorder-for-Interactive-Brokers-TWS
To insert the records into MySQL, there is a "LOAD DATA INFILE" command that is very fast. Google it for lots of examples.

A few hints: assuming you only want records from 9:30 AM to 4:00 PM, for 1000 symbols that works out to:

6.5 hours * 60 minutes * 1000 symbols = 390,000 records per day

or 99,450,000 per year (assuming 255 trading days). Let's round that to 100M records per year.

Assuming that you create a MySQL table such as

create table `Bloomberg_One_Minute_Quotes` (
  `Symbol` varchar(12) not null,
  `DateTime` timestamp not null,
  `Open` float default null,
  `High` float default null,
  `Low` float default null,
  `Close` float default null,
  `Volume` int default null,
  primary key (`Symbol`, `DateTime`)
);

and that the average symbol is 4 characters long, you will need 1+4 bytes per symbol and 4 bytes each for all other columns. That works out to (1 + 4 + (4 * 6)) = 29 bytes per record. Let's assume another 11 bytes per record for the primary key and indices, so now we have 40 bytes per record. Each year of trading will therefore add 40 * 100M = 4GB to your MySQL table.

Assuming that you are using MySQL Community Server (i.e. the free version), the limit of any single table is 4GB, so the most you can store is one year. In that case, I suggest switching from MySQL Community Server to MariaDB, which is 100% upward compatible and will support all your MySQL scripts and commands. MariaDB is free and open source. Even Google uses it.
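The sizing arithmetic above, written out as a quick sanity check you can rerun with your own symbol count or trading calendar:

```python
# Back-of-envelope sizing from the post above.
records_per_day = int(6.5 * 60) * 1000          # 390 trading minutes * 1000 symbols
records_per_year = records_per_day * 255        # 255 trading days per year
bytes_per_record = (1 + 4) + 4 * 6              # varchar symbol (len byte + 4 chars) + 6 four-byte columns
bytes_with_index = bytes_per_record + 11        # rough allowance for primary key / indices
yearly_bytes = bytes_with_index * 100_000_000   # rounded up to 100M records/year

print(records_per_day)        # 390000
print(records_per_year)       # 99450000
print(bytes_with_index)       # 40
print(yearly_bytes / 1e9)     # 4.0 (GB per year)
```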
Here is a sample of how to load data. Do not use the import wizards; they will never finish on large data sets:

LOAD DATA LOCAL INFILE 'D:\\Documents\\Options Data\\CBOEData\\UnderlyingOptionsEODQuotes_2018.csv'
INTO TABLE options.spxeod
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
You are correct. It appears that MySQL got rid of the 4GB Community Server table limitation in the 5.* versions.