Log in or Sign up

ET News & Sponsor Info

General Topics

Markets

Technical Topics

Brokerage Firms

Company Specific

Interactive Brokers

Tools of the Trade

Trading for a Living

Community Lounge

Site Support

Feedback

importing vast set of financial data into mysql

Discussion in 'Data Sets and Feeds' started by silveredge, Nov 5, 2018.

silveredge
- 182
  Posts
- 29
  Likes
I am looking to upload a lot of financial data from bloomberg into mysql to start constructing a proprietary financial db.

For now only basic stock data, but will expand options and fundamental data.
-1000+ tickers
-1-min price (bid, ask, last), volumn

For someone who has gone through building his own db, can you please shed some light on an efficient way to upload the data? I just plan to download in csv files 1-by-1 then upload to mysql using a script.

#1 Nov 5, 2018

Share
newwurldmn
- 18,066
  Posts
- 9,972
  Likes
Two issues:

1. Bloomberg has rules on how much data you can download a month. It’s a lot for normal usage but not a lot of you want to build a comphrensive database.

2. With a vast amount of data you will need a proper database design. A few flat tables will make querying annoying.

With sqlserver you can use a bulk upload bat file which is crazy fast (as it doesn’t affect the database logs). I don’t know if mysql has the same utility.

#2 Nov 5, 2018

Share

silveredge likes this.
Robert Morse Sponsor
- 7,330
  Posts
- 6,173
  Likes
silveredge said:
I am looking to upload a lot of financial data from bloomberg into mysql to start constructing a proprietary financial db.

For now only basic stock data, but will expand options and fundamental data.
-1000+ tickers
-1-min price (bid, ask, last), volumn

For someone who has gone through building his own db, can you please shed some light on an efficient way to upload the data? I just plan to download in csv files 1-by-1 then upload to mysql using a script.
More...

I think you will find for your $2000/month your sales person at Bloomberg will be very helpful. You should start there. They will also provide a sample format of the data.

Robert Morse |VP, Institutional Sales
Lightspeed Financial Services Group
Phone: 646-393-4806 | rmorse@lightspeed.com
www.lightspeed.com

#3 Nov 5, 2018

Share

VPhantom likes this.
tommcginnis
- 4,081
  Posts
- 3,033
  Likes
If you design a "complete" (in your estimation) data set first, with a minimum of observations but all the fields ("and more!") that you could desire, AND THEN scale it up, you'll be in a better position to judge efficiencies.

But the way you're going about it now (methods first, data sources second) -- you're going to end up with a great tool looking for the right problem to fit.

My vote would be to get the sources first, and the template right, and then scale up. (BTDT...)

Last edited: Nov 5, 2018

#4 Nov 5, 2018

Share
silveredge
- 182
  Posts
- 29
  Likes
I am lost in how to batch download stock data instead of a ticker-by-ticker which will take forever. Any pointer?

#5 Nov 5, 2018

Share
silveredge
- 182
  Posts
- 29
  Likes
newwurldmn said:
Two issues:

1. Bloomberg has rules on how much data you can download a month. It’s a lot for normal usage but not a lot of you want to build a comphrensive database.
More...

Thanks. Actually, something I suspected but not yet bother to dig into. I eventually wanna have an 'offline' bloomberg for most market data I use (as the bloomberg account is tied to company and not sure how long it lasts).

This project looks very useful. Anyone implemented
https://www.codeproject.com/Articles/1078609/Market-Recorder-for-Interactive-Brokers-TWS

Last edited: Nov 5, 2018

#6 Nov 5, 2018

Share
DaveV
- 912
  Posts
- 848
  Likes
To insert the records in to mysql, there is a "LOAD DATA INFILE" commad that is very fast. Google it for lots of examples.

A few hints:

Assuming you only want records from 09:30AM to 4PM, for 1000 symbols that works out to:
6.5 hours * 60 minutes * 1000 symbols = 390,000 records per day Or 99,450,000 per year (assuming 255 trading days). Let's round that to 100M records per year.

Assuming that you create a MySql table such as
create table `Bloomberg_One_MInute_Quotes` (
`Symbol` varchar(12) not null,
`DateTime` Timestamp not null,
`Open` float default null,
`High` float default null,
`Low` float default null,
`Close` float default null,
`Volume` int default null,
primary key (`Symbol`,`DateTime`)
);

and, that the average symbol is 4 characters long, you will need 1+4 bytes per symbol, and 4 bytes each for all other columns. That works out to (1 + 4 + ( 4 * 6 )) = 29 bytes per record.

Let's assume another 11 bytes per record for the Primary Key and Indices. So now we have 40 bytes per record. So in each year of trading, your MySql table will be 40 * 100M = 4GB.

Assuming that you are using the MySql Community Server (i.e. the free version), the limit of any single table is 4GB, so the most you can store is one year. In that case, I suggest switching from MySql Community server to MariaDB, which is 100% upward compatible will support all your MySql scripts and commands. MariaDB is free, and in the public domain. Even Google uses it.

Last edited: Nov 5, 2018

#7 Nov 5, 2018

Share

helgen_1 and silveredge like this.
RedDuke
- 7,401
  Posts
- 2,827
  Likes
here is sample how to load data. Do not use wizards will never finish on large data:

LOAD DATA LOCAL INFILE 'D:\\Documents\\Options Data\\CBOEData\\UnderlyingOptionsEODQuotes_2018.csv'
INTO TABLE options.spxeod
FIELDS TERMINATED by ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES

#8 Nov 5, 2018

Share
silveredge
- 182
  Posts
- 29
  Likes
On size, this says otherwise?
http://ftp.nchu.edu.tw/MySQL/doc/refman/5.0/en/full-table.html

#9 Nov 5, 2018

Share
DaveV
- 912
  Posts
- 848
  Likes
silveredge said:
On size, this says otherwise?
http://ftp.nchu.edu.tw/MySQL/doc/refman/5.0/en/full-table.html
More...

You are correct. It appears that MySql got rid of the 4GB Community Server limitation in the 5.* versions.

#10 Nov 5, 2018

Share

silveredge likes this.

(You must log in or sign up to reply here.)

Search