I forgot to say (I should add this to the OP): 2) 1) For the 1000 by 500 matrix, it takes at least 6 hours, or overnight, to get results from the big matrix. So it cannot be run between 9:30 and 4. Usually it runs every weekend (Saturday night to Sunday morning, while I am sleeping). The computer estimates the big picture to find the parameter (shape) for each of the 500 stocks. The computer never sleeps; it works hard like a slave.
6 hours? Wow... that's a long time for a few million records. What kind of ML are you running on that data? Is it running on a single core, or in parallel?
Under 32-bit XP (or Windows 7), there are many loops, such as over 2000 days and 500 stocks, and it calculates many, many statistics. If one changes from 2000 to 4000 days, or from 500 to 1000 stocks, then the total time changes from 12 hours to 24 hours. But we had better program for a maximum of 8 hours, since it is good to start at 10 PM when I go to sleep and get the result at 6 AM when I get up. If I change the code from R (S) to something C-like for efficiency, there will be a speed improvement. Note that R is not compiled but interpreted. But I do NOT need to spend my time on C coding, since the overnight computing is satisfactory for my daily work. BTW, what is ML?
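The arithmetic in the post above can be written out as a small sketch. It assumes, as the post suggests, that runtime is roughly proportional to days times stocks, and it takes 12 hours for 2000 days and 500 stocks as the baseline. The function names are hypothetical, and the numbers are only as good as the proportionality assumption.

```python
# Baseline taken from the post: 2000 days x 500 stocks takes about 12 hours.
BASE_DAYS = 2000
BASE_STOCKS = 500
BASE_HOURS = 12.0


def estimated_hours(days, stocks):
    """Project runtime, ASSUMING it is proportional to days * stocks."""
    return BASE_HOURS * (days * stocks) / (BASE_DAYS * BASE_STOCKS)


def max_days_within(budget_hours, stocks):
    """Largest whole number of days that fits the budget under the same assumption."""
    return int(budget_hours * BASE_DAYS * BASE_STOCKS / (BASE_HOURS * stocks))


if __name__ == "__main__":
    print(estimated_hours(4000, 500))   # doubling the days doubles the hours
    print(max_days_within(8.0, 500))    # days that fit a 10 PM to 6 AM window
```

Under these assumptions, an 8-hour window holds about two thirds of the 12-hour baseline workload.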
Are you sure the increase is linear (double the data = double the time) and not more? An easier test is to reduce the sample to, say, 100 stocks and see whether the decrease is linear or faster. ML stands for machine learning.
Yes, roughly linear, proportional to the number of rows (number of observations). However, in many cases, if we increase the number of columns (variables), it may not be linear. Anyway, it doesn't matter whether it takes 2 hours or 8 hours. What matters is keeping it to about 8 hours so that we can sleep while the computer is running.
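The subset test suggested above (run on fewer stocks and compare times) can be sketched as follows. This is only an illustration: `toy_job` is a hypothetical stand-in for the real overnight job, and the exponent is estimated with an ordinary least-squares fit on a log-log scale, which is one common way to check scaling, not necessarily what anyone in this thread uses.

```python
import math
import time


def time_run(job, n_stocks):
    """Return wall-clock seconds taken by job(n_stocks)."""
    start = time.perf_counter()
    job(n_stocks)
    return time.perf_counter() - start


def scaling_exponent(sizes, seconds):
    """Least-squares slope of log(seconds) against log(size).

    A value near 1 suggests linear scaling, near 2 quadratic.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def toy_job(n_stocks, n_days=200):
    """Hypothetical stand-in for the real job: one statistic per stock per day."""
    total = 0.0
    for stock in range(n_stocks):
        for day in range(n_days):
            total += math.sqrt(stock + day + 1)
    return total


if __name__ == "__main__":
    sizes = [50, 100, 200]
    seconds = [time_run(toy_job, n) for n in sizes]
    print(scaling_exponent(sizes, seconds))
```

Timings on a busy machine are noisy, so it may help to repeat each run a few times and use the smallest time for each size.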
Programming languages are irrelevant to your trading strategies; the choice doesn't really matter. The questions you need to ask yourself are the following: 1. What level of automation do you want to achieve? Light human monitoring, or completely blind monitoring in a systematic trading system? 2. What kind of data do you want to process? Real-time last trade? Real-time bid/ask feed? 3. What kind of trading strategies do you want to build? Price-event based? Price arbitrage? Long-term fundamental holdings? Day trading? HFT? Charting? You need to answer those 3 questions first to define yourself and where you are heading. From there, it will be easier to judge whether building a strategy that trades directly through the IB APIs is a feasible project. Hope this helps,
Here is an outline of the Excel programs that are working fine now. Unlike in the US, many people outside the US still use Excel. Please note that some letters are not English, but that is not a problem at all. (Attached in Pic 1) Pic 1 shows the daily close for 100 days for my personal watch list (roughly 200 stocks). This is NOT dynamic; it is needed only once every night. (Attached in Pic 2) Pic 2 shows the intraday (dynamic) prices, which change every minute in the Excel file between 9:30 and 4 PM. It shows 4 rows (OHLC) and 200 columns for my personal watch list. (Attached in Pic 3) For example, the first line means that a BUY order for MSFT (=Daum) will be sent to the NYSE AS SOON AS THE CURRENT PRICE HITS $16.377 or lower. The order is a BUY at $15.9. Usually there are a couple of hundred waiting orders on my local PC, which I create myself using my own logic. But on most days only roughly 10% of them become REAL orders (on the NYSE). If you are interested, please send me an email so that I can explain precisely how those programs work. My friend wrote all 3 with his EXCELlent EXCEL skills using COM (not DLL). I paid him $200 for Pic 1, $200 for Pic 2 and $500 for Pic 3. -Jay
Please forgive me if I inserted the 3 pictures incorrectly. This is my first time. I have been familiar with R and Python for many years. Although those 3 programs were written in Excel by my friend, I am NOT used to Excel and VB.
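Since the poster mentions Python, the waiting-order logic described for Pic 3 can be sketched roughly like this. It is a minimal illustration, not the actual Excel/COM program: the class, field, and function names are hypothetical, and actually submitting the order to a broker is left out entirely. A waiting order stays on the local PC until the current price is at or below its trigger price, and only then is it released as a real limit BUY.

```python
from dataclasses import dataclass


@dataclass
class WaitingOrder:
    """A BUY order held locally until its trigger price is hit (hypothetical model)."""
    symbol: str
    trigger_price: float  # release when the current price is at or below this
    limit_price: float    # price of the BUY order that is actually sent
    sent: bool = False


def orders_to_send(waiting_orders, last_prices):
    """Return the waiting orders whose trigger has just been hit.

    last_prices maps symbol -> current price. Each order is released once only.
    """
    released = []
    for order in waiting_orders:
        price = last_prices.get(order.symbol)
        if price is None or order.sent:
            continue
        if price <= order.trigger_price:
            order.sent = True
            released.append(order)
    return released


if __name__ == "__main__":
    book = [WaitingOrder("MSFT", trigger_price=16.377, limit_price=15.9)]
    print(orders_to_send(book, {"MSFT": 16.50}))  # above the trigger: nothing released
    print(orders_to_send(book, {"MSFT": 16.30}))  # at or below the trigger: released
```

Calling `orders_to_send` once per price update (every minute, in the setup described) would leave most orders waiting, consistent with only a fraction becoming real orders on a given day.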
Is that for Korean securities on the KSE? Again, I am not sure what it is you are looking for. It seems that you have this under control in Excel, so why change it? Also, can you explain your strategy in plain English, that is, in terms of trading logic flow, so we can understand what you are trying to accomplish? Describing your data model is not explaining your trading strategy, unless there is no trading strategy and only a data model that you are trying to exploit for pattern discovery.