Calling all C++ programmers

Discussion in 'App Development' started by Maverick1, Jul 26, 2011.

  1. gtor514

    No boost needed. See...

    http://www.cplusplus.com/reference/clibrary/ctime/

    As an alternative, if you're using Windows, there is a COleDateTime class that stores the date and time as a double. You would use the same steps to parse the date-time string into a COleDateTime object, and from there write the double value into your "chart objects(time, op, hi, lo, cl, vol)".
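
    For the plain <ctime> route, here is a minimal sketch. It assumes the stamps look like "09/11/2009 9.30" (MM/DD/YYYY H.MM, as in the sample rows posted later in this thread) and that local time is acceptable. The function name parse_stamp is made up for illustration.

```cpp
#include <cstdio>
#include <ctime>

// Parses a stamp such as "09/11/2009 9.30" (assumed layout: MM/DD/YYYY H.MM)
// into a time_t expressed in local time. Returns false if the text does not
// match the layout or mktime cannot represent the result.
bool parse_stamp(const char* text, std::time_t& out)
{
    int month, day, year, hour, minute;
    if (std::sscanf(text, "%d/%d/%d %d.%d",
                    &month, &day, &year, &hour, &minute) != 5)
        return false;

    std::tm t = std::tm();    // zero-initialise every field
    t.tm_year  = year - 1900; // tm_year counts from 1900
    t.tm_mon   = month - 1;   // tm_mon is 0-based
    t.tm_mday  = day;
    t.tm_hour  = hour;
    t.tm_min   = minute;
    t.tm_isdst = -1;          // let mktime work out whether DST applies

    out = std::mktime(&t);
    return out != static_cast<std::time_t>(-1);
}
```

    The resulting time_t values sort in date order, so they can serve directly as the "time" field of a bar. Note that sscanf does no range checking here; mktime will silently normalise something like month 13.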
     
    #11     Jul 27, 2011
  2. LeeD

    That's the most efficient solution!

    Alternatively, you can use the OLE date-time format, where the date-time is a floating-point number and an increase of 1 means exactly 24 hours (one day) later. The best part of the OLE format is that dates and times in a CSV file can be read, understood and saved by Excel.

    Regarding the best format, as others suggested, a vector of OHLC structures is better than 4 vectors.

    I assume you won't be inserting new dates in the middle; instead you'll just read the input file from the start. So, I would avoid list containers like the plague.
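
    To make the "one vector of structures" layout concrete, here is a small sketch. The struct name Bar, its field names and the helper widest_bar are placeholders of mine, and the time key is left as a plain integer on purpose.

```cpp
#include <cstddef>
#include <vector>

// One record per bar, so all fields of a bar sit next to each other in memory.
struct Bar {
    long   stamp;                    // any sortable time key (e.g. a time_t value)
    double open, high, low, close;
    long   volume;
};

// Index of the bar with the widest high-low range, or bars.size() if empty.
// Reads two fields of the same record per step, which is where a single
// vector of structs is more convenient than parallel vectors.
std::size_t widest_bar(const std::vector<Bar>& bars)
{
    std::size_t best = bars.size();
    for (std::size_t i = 0; i < bars.size(); ++i) {
        if (best == bars.size() ||
            bars[i].high - bars[i].low > bars[best].high - bars[best].low)
            best = i;
    }
    return best;
}
```

    When loading, calling bars.reserve() with a rough bar count before the push_back loop avoids repeated reallocation, and appending at the end of a vector is all a file read from start to finish needs.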

    A few ideas:
    1) If you want to use strings for dates and times anyway, use the correct format: year, month, day, hour, minute with leading zeros. Something like yyyy-MM-dd hh:mm or yy/MM/dd hh:mm. Then the strings will sort in the same order as the corresponding dates do.
    2) Consider adding to the OHLC structure a boolean value with the meaning "price exists". Then a bar will be present for every minute and there will be no need to search for the right time. Instead you will know that a bar 60 minutes later is exactly 60 positions later in the vector/list/array.
    Note that if you implement typical indicators like a moving average in a naive way, this storage method comes with a performance penalty. Say, for a 30-period moving average you need 30 valid prices. So, you will have to iterate back through any gaps in the price series till you find 30 bars where the price is not missing.
    3) To handle bank holidays and weekends better, consider storing a list of days when the security is traded, with intraday data arranged as arrays. This way each day's data can start at market open and not midnight... and it will still be very easy to find the right time every day.
    4) If you want to find the right date in an ordered array of days, the binary search method is much faster than iterating through all days.
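
    Ideas 1, 2 and 4 can be sketched roughly as below. This is only an illustration under my own assumptions: the Slot struct, the function names and the use of "yyyy-MM-dd" strings as day keys are not from any particular library.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// One slot per minute of the session; has_price is false when nothing traded.
struct Slot {
    bool   has_price;
    double close;
};

// Simple moving average over the last `period` valid closes at or before
// index `end`, walking backwards across gaps. Returns false if there are
// fewer than `period` valid closes available.
bool sma_skipping_gaps(const std::vector<Slot>& slots, std::size_t end,
                       std::size_t period, double& out)
{
    if (period == 0 || end >= slots.size()) return false;
    double sum = 0.0;
    std::size_t found = 0;
    for (std::size_t i = end + 1; i-- > 0 && found < period; ) {
        if (slots[i].has_price) { sum += slots[i].close; ++found; }
    }
    if (found < period) return false;
    out = sum / static_cast<double>(period);
    return true;
}

// Binary search for a day key such as "2009-09-11" in a sorted list of
// trading days. Zero-padded year-first strings sort in date order, so plain
// string comparison is enough. Returns days.size() if the day is absent.
std::size_t find_day(const std::vector<std::string>& days, const std::string& day)
{
    std::vector<std::string>::const_iterator it =
        std::lower_bound(days.begin(), days.end(), day);
    if (it == days.end() || *it != day) return days.size();
    return static_cast<std::size_t>(it - days.begin());
}
```

    The backward walk in sma_skipping_gaps is the performance penalty mentioned in idea 2: the more gaps there are, the more slots each average has to visit.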

    If you want speed at the cost of some flexibility, consider implementing backtesting as matrix operations. This way you can use one of the popular linear algebra implementations such as ATLAS, MKL or GotoBLAS. For a case study of how linear algebra subroutines can be used to massively speed up backtesting, see Amibroker and its Amibroker Formula Language (AFL).
     
    #12     Jul 27, 2011
  3. rosy2

    You said you have a CSV file of data. Isn't that file already sorted by date? Then you don't need to worry about sorting: you just read the data, put it in an object and add that to a list. This is trivial. You probably spent more time typing on this forum thread than it would take to code it. If you do need a sort, then put the objects in a tree.
     
    #13     Jul 27, 2011
  4. dcvtss

    What he said... I've set up a MySQL DB for EOD OHLC data from Excel in about 25 minutes, including installing the MySQL software, defining the tables, and importing the CSV data. Use a free front end like HeidiSQL or phpMyAdmin and it does the imports for you. The best thing is that you can use Excel to format the date string in MySQL's format before you import, and once it's in there as a native timestamp it's easy to query, as keyser said.
     
    #14     Jul 27, 2011
  5. Maverick1

    Yes, the file is sorted by date. Here's the issue though: each date has ~390 minutes of price data associated with it. For example:

    09/11/2009 9.30 o, h, l, c
    09/11/2009 9.31 o, h, l, c

    and so on to
    09/11/2009 16.14 o, h, l, c

    That completes one day, and then the next one starts, which could be 9/14/2009 because of the weekend, say.

    What I want to do is simple analysis for now: for example, find the time at which the low was made for each date in my data file. That requires that I be able to iterate over each date in sequence, and because of weekends and holidays, that sequence is not as simple as +1 each time.

    From the responses so far, it looks like the best bet is to convert the date string into a long using strptime()?
     
    #15     Jul 27, 2011
  6. If this is all you want to do, then just stick it into a map where the key is the timestamp. Using minute data you can reasonably store years' worth of data in memory on a current system.

    strptime and mktime will work, although I myself used the ICU libraries (ICU4C) because I was dealing with various time zones, trading data from Hong Kong, Germany, and the US. It's more complex, but doing things like date addition is a bit easier.

    If you start getting into tick data, then you might need to go with something more complex because some data feeds are adding millisecond components to the timestamp.
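
    As a rough sketch of the map idea applied to the "time of the low for each date" question: the code below assumes the keys are time_t values already produced by mktime (local time), and the type and function names (Ohlc, DayLow, daily_lows) are invented for the example.

```cpp
#include <ctime>
#include <map>
#include <vector>

struct Ohlc { double open, high, low, close; };

struct DayLow {
    int year, month, day;   // calendar date in local time
    std::time_t when;       // timestamp of the bar that made the day's low
    double low;
};

// Walks a timestamp-keyed map in key order and records, for each calendar
// day that actually appears in the data, the first bar with the lowest low.
// Weekends and holidays need no special handling because they simply have
// no keys in the map.
std::vector<DayLow> daily_lows(const std::map<std::time_t, Ohlc>& bars)
{
    std::vector<DayLow> result;
    for (std::map<std::time_t, Ohlc>::const_iterator it = bars.begin();
         it != bars.end(); ++it) {
        // Copy the result: localtime reuses an internal static buffer.
        const std::tm lt = *std::localtime(&it->first);
        const int y = lt.tm_year + 1900, m = lt.tm_mon + 1, d = lt.tm_mday;

        const bool new_day = result.empty() || result.back().year != y ||
                             result.back().month != m || result.back().day != d;
        if (new_day) {
            DayLow entry = { y, m, d, it->first, it->second.low };
            result.push_back(entry);
        } else if (it->second.low < result.back().low) {
            result.back().when = it->first;
            result.back().low  = it->second.low;
        }
    }
    return result;
}
```

    Because std::map iterates in key order, one pass is enough and the date sequence never has to be computed as "+1 day". For tick data with sub-second stamps the time_t key would need replacing with something finer-grained.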
     
    #16     Jul 27, 2011
  7. Mr_You

    FYI, PostgreSQL is also an excellent "enterprise class" quality database that is open source and free. It is easy to install and use on Windows and other OSs.
     
    #17     Jul 31, 2011