Coding a TPO chart with C++

Discussion in 'App Development' started by gregorybishop, Aug 29, 2021.

  1. Hello all! I want to create an algorithm that translates OHLC data into a TPO chart. Can anyone help me with an algorithm or pseudocode for this problem?

    picture for example:

    [​IMG]


    My data is in this format (start, end, open, high, low, close and volume):

    Code:
    1443182400,1443185999,239.99,239.99,237.36,237.45,11501
    1443186000,1443189599,237.45,237.05,236.08,236.08,22625
    1443189600,1443193199,236.08,236.52,236.1,236.34,17434
    1443193200,1443196799,236.34,236.13,235.44,235.71,26900
    1443196800,1443200399,235.71,236.01,235.46,235.75,29200

    I read them into a vector of structures (my examples use C++, but I think they are simple enough that you can follow them and use your own language too):

    Code:
    #include <iostream>
    #include <chrono>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>
    #include <map>
    
    using namespace std;
    
    struct TOHCLV
    {
        string symbol;
        double open;
        double high;
        double low;
        double close;
        double volume;
    };
    
    const vector<string> TPO_Chars = { "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X" };
    
    vector<TOHCLV> fromSourceDataToVec(string fileName)
    {
        vector<TOHCLV> data;
        string symbol = "BTC";
    
        ifstream inFile(fileName);
    
        if (inFile.is_open())
        {
            string line;
            while (getline(inFile, line))
            {
                stringstream ss(line);
    
                string timestart, timeend;
                string open, high, low, close, volume;
    
                getline(ss, timestart, ',');
                getline(ss, timeend, ',');
                getline(ss, open, ',');
                getline(ss, high, ',');
                getline(ss, low, ',');
                getline(ss, close, ',');
                getline(ss, volume, ',');
               
    
                TOHCLV nData = { symbol , stod(open), stod(high), stod(low), stod(close), stod(volume) };
    
                data.push_back(nData);
    
            }
        }
    
        inFile.close();
    
        return data;
    }
    

    Now that I have the data, I can start translating it into a TPO chart, and this is where I ran into a lot of problems. The first was which structure to use to store the TPO graph (for example, a daily one). I decided that map<double, string> would work for me: the key stores the price and the value stores the symbols. The map is also very easy to output; here is the output code:


    Code:
    void showTPOInConsole(const map<double, string>& tpo)
    {
    
        cout << "\n\t TPO_CHART \n\t" << endl;
        for (const auto& p : tpo)
        {
            cout << p.first << " \t" << p.second << endl;
        }
    }
    The second problem I encountered was what to do about the price. For example, if the price in the first hour went:

    Code:
    {219, "A"},
    {220, "A"},
    {221, "A"},
    And in the second hour the price went:
    Code:
    {219.5, "B"},
    {220, "B"},
    {220.8, "B"},
    {222, "B"},
    {223, "B"},
    How do I combine them?
    We know that 219.5 is very close to 219, so they could be combined. How do I solve this? Should I use a quantizer?


    Thx!
     
  2. I haven't actually coded this myself, but would approach it in the following way:
    It is important to notice that while you are starting with data which is time-based, your end result is price-based. The time information gets lost in the conversion. Your software needs to emulate this.
    (*) The daily bar has OHLC. Based on (High - Low) I get the price range for the day. Based on this I would decide what price buckets to use. Suppose that I want to use 10 buckets, then each bucket has a price range of 0.1*(High-Low). For each bucket I determine the lower price value and upper price value. As a software data format a Bucket would now have a lower price, upper price, and a list to represent which hourly bars are in this bucket.
    (*) Now I look at the hourly OHLC bars within that one day. This OHLC bar can be identified by a number (e.g. the x'th hour of the day), or a letter as in your drawing. I go to each of the defined buckets and add the bar's indicator to the bucket's list if this OHLC bar fills this bucket. This is easy to do: you compare the bar's High and Low versus the upper and lower price boundaries for each bucket.
    (*) To display on the screen you'll have to loop through the buckets, starting with the highest one and print the list of bar indicators. That will lead to what is on the right hand side of your drawing.
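    The three steps above could be sketched in C++ roughly as follows. This is only a minimal sketch under stated assumptions: the names Bar, Bucket, buildProfile and printProfile are hypothetical, the bucket count is chosen by the caller, each bar is labelled with a letter starting at 'A', and the day's high and low are already known.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical types for illustration; they are not from the thread's code.
struct Bar { double high; double low; };

struct Bucket {
    double lower;        // lower price bound (inclusive)
    double upper;        // upper price bound (exclusive, except for the top bucket)
    std::string letters; // one letter per bar that traded inside this bucket
};

// Splits [dayLow, dayHigh] into numBuckets equal buckets and marks every
// bucket that each bar's low..high range overlaps.
std::vector<Bucket> buildProfile(const std::vector<Bar>& bars,
                                 double dayLow, double dayHigh,
                                 std::size_t numBuckets)
{
    assert(numBuckets > 0 && dayHigh > dayLow);
    const double width = (dayHigh - dayLow) / static_cast<double>(numBuckets);

    std::vector<Bucket> buckets(numBuckets);
    for (std::size_t i = 0; i < numBuckets; ++i) {
        buckets[i].lower = dayLow + static_cast<double>(i) * width;
        buckets[i].upper = dayLow + static_cast<double>(i + 1) * width;
    }

    for (std::size_t h = 0; h < bars.size(); ++h) {
        // Assumption: at most 26 bars per day, labelled 'A', 'B', ...
        const char letter = static_cast<char>('A' + h % 26);
        for (std::size_t i = 0; i < numBuckets; ++i) {
            const bool isTop = (i + 1 == numBuckets);
            const bool overlaps = bars[h].high >= buckets[i].lower &&
                                  (bars[h].low < buckets[i].upper || isTop);
            if (overlaps) buckets[i].letters += letter;
        }
    }
    return buckets;
}

// Prints the buckets from the highest price down to the lowest.
void printProfile(const std::vector<Bucket>& buckets)
{
    for (std::size_t i = buckets.size(); i-- > 0;) {
        std::cout << buckets[i].lower << "\t" << buckets[i].letters << "\n";
    }
}
```

    Calling buildProfile with the bars of one day and then printProfile on the result gives one row per bucket, highest price first, with the letters of all bars that touched that bucket.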
     
  3. ph1l


    The basic data structure I used to simulate Market Profile is a deque of deque of int where each int is a sequence number of the bar input.
    Code:
        T priceBoxSize;     // granularity of price used in time price opportunities (TPOs); priceBoxSize > 0
            // NOTE: T would be a type to represent an asset price (e.g., float or double)
        int numBarsToStore; // number of ohlc price bars (or other data) stored in the instance
    
        typedef std::deque<int> TPOs_t;     // time price opportunities
    
        int seqNum; // current sequence number
        std::deque<TPOs_t> histogram;           // histogram of prices for different sequence numbers
                                                // list of lists.  each element has one or more bar indexes already stored.  these bar indexes are time price opportunities (TPOs).
                                                // histogram[0] has bar indexes >= (baseNumBoxes + 0) * priceBoxSize and < (baseNumBoxes + 1) * priceBoxSize
                                                // histogram[1] has bar indexes >= (baseNumBoxes + 1) * priceBoxSize and < (baseNumBoxes + 2) * priceBoxSize
                                                // ...
    
        int baseNumBoxes;                       // multiple of priceBoxSize that represents the beginning of histogram[0]
        int numStoredTPOs;                      // number of stored time price opportunities
    
        std::deque<int> highBoxes;              // numBarsToStore bar multiples of priceBoxSize high prices
        std::deque<int> lowBoxes;               // numBarsToStore bar multiples of priceBoxSize low prices
    

    When adding a bar, its high and low prices are each converted to a whole number of boxes by dividing by the box size and rounding to the nearest integer. The box size depends on the asset being modeled:
    Code:
        int highNumBoxes = roundVal (highPrice / priceBoxSize);
        int lowNumBoxes = roundVal (lowPrice / priceBoxSize);
    
    Then, this histogram gets expanded and/or contracted as needed. My implementation didn't map sequence numbers to hours ('A', 'B', etc.), but this could be done if the time interval between bars is known.
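    As a rough illustration of the rounding and the expansion step, here is a simplified C++ sketch. It is not the implementation described above: the BoxHistogram type and its member functions are hypothetical, it stores one bar index per touched box, it never evicts old bars, and std::lround stands in for roundVal.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <deque>

// Simplified, hypothetical sketch: one bar index per touched price box,
// with no eviction of old bars and no tie-breaking sequence numbers.
struct BoxHistogram {
    double priceBoxSize;                   // price granularity, must be > 0
    bool hasBase = false;                  // true once baseNumBoxes is set
    long baseNumBoxes = 0;                 // box number represented by histogram[0]
    std::deque<std::deque<int>> histogram; // histogram[i] holds bar indexes for box baseNumBoxes + i

    explicit BoxHistogram(double boxSize) : priceBoxSize(boxSize)
    {
        assert(boxSize > 0);
    }

    // Converts a price to the nearest whole number of boxes.
    long toBox(double price) const
    {
        return std::lround(price / priceBoxSize);
    }

    void addBar(int barIndex, double high, double low)
    {
        const long highBox = toBox(high);
        const long lowBox = toBox(low);
        assert(lowBox <= highBox);

        if (!hasBase) {
            baseNumBoxes = lowBox;
            hasBase = true;
        }
        // expand downward if the bar trades below the current lowest box
        while (baseNumBoxes > lowBox) {
            histogram.emplace_front();
            --baseNumBoxes;
        }
        // expand upward if the bar trades above the current highest box
        while (static_cast<long>(histogram.size()) < highBox - baseNumBoxes + 1) {
            histogram.emplace_back();
        }
        // record the bar in every box between its low and its high
        for (long b = lowBox; b <= highBox; ++b) {
            histogram[static_cast<std::size_t>(b - baseNumBoxes)].push_back(barIndex);
        }
    }
};
```

    With a box size of 1.0, the prices 219 and 219.5 from the opening post land in boxes 219 and 220 respectively, because std::lround rounds halfway cases away from zero. A different rounding rule or box size would group them differently.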
     
  4. Thank you for your answers, ph1l and HobbyTrading!

    I would like to point out that I want to use my code in real-time mode.


    So I have the data (a vector of OHLC structures holding 24 hours of bars), and I can find the day's high and low:

    Code:
    vector<TOHCLV> data = fromSourceDataToVec("C:\\cpp\\data.txt");

    double day_high = find_max_day_range(data);
    double day_low = find_min_day_range(data);
    cout << "high and low" << endl;
    cout << day_high << endl;
    cout << day_low << endl;
    Now I have two numbers that contain the high and low of the day.
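    The two helper functions are not shown in the post. A minimal sketch of what they might look like is given here, assuming they return the highest high and the lowest low over all bars in the vector. The update_day_range helper is a hypothetical addition that keeps a running high and low as bars arrive one at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Same layout as the TOHCLV struct from the first post.
struct TOHCLV {
    std::string symbol;
    double open;
    double high;
    double low;
    double close;
    double volume;
};

// Assumed behavior: the highest high over all bars.
double find_max_day_range(const std::vector<TOHCLV>& data)
{
    assert(!data.empty());
    return std::max_element(data.begin(), data.end(),
                            [](const TOHCLV& a, const TOHCLV& b) { return a.high < b.high; })->high;
}

// Assumed behavior: the lowest low over all bars.
double find_min_day_range(const std::vector<TOHCLV>& data)
{
    assert(!data.empty());
    return std::min_element(data.begin(), data.end(),
                            [](const TOHCLV& a, const TOHCLV& b) { return a.low < b.low; })->low;
}

// Hypothetical helper: widens a running high/low with one newly arrived bar.
void update_day_range(double& dayHigh, double& dayLow, const TOHCLV& bar)
{
    dayHigh = std::max(dayHigh, bar.high);
    dayLow = std::min(dayLow, bar.low);
}
```

    In a batch run the first two functions are enough. With incoming bars, the running range can be seeded from the first bar of the day and then passed through update_day_range for each later bar.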

    I don't quite understand this phrase. How can we determine that?

    Right now I have 24 hour bars in my vector. I could of course write 0.24*(high-low), but this solution won't work in real-time processing, because we haven't gotten all of the hour bars for the day yet.

    Also, I don't understand this phrase.

    Perhaps it would have been clearer if you had sent elements of the code.

    Sorry for my intrusiveness, I just want to understand in detail how it works so I can program it.


    Could you add an approximate pseudocode of how the addition is done?
     
  5. ph1l


    In my previous post, I wrote "each int is a sequence number of the bar input." I was mistaken, and each input bar can have multiple sequence numbers (see pseudocode below).

    In this pseudocode for adding a bar with high and low price to the histogram,
    • incremented and decremented actions are done before the rest of the statement (same as ++var in C++)
    • front and back refer to the beginning and end of the deques, respectively (same meaning as front and back in the C++ standard library)
    Code:
        highNumBoxes = highPrice / priceBoxSize rounded to nearest integer
        lowNumBoxes = lowPrice / priceBoxSize rounded to nearest integer
    
        if baseNumBoxes has not been set
        {
            baseNumBoxes = lowNumBoxes
        }
    
        // expand histogram downward if needed
        while ( baseNumBoxes > lowNumBoxes )
        {
            add empty deque to front of histogram
            decrement baseNumBoxes
        }
    
        // expand histogram upward if needed
        while (histogram size < highNumBoxes - baseNumBoxes + 1)
        {
            add empty deque to back of histogram
        }
    
        // remove data for oldest bar prices which are no longer needed
        if (highBoxes size >= numBarsToStore)
        {
            remove first value from highBoxes and lowBoxes
    
            for i in previous lowNumBoxes - baseNumBoxes through previous highNumBoxes - baseNumBoxes
            {
                remove front element of histogram [i]
                decrement numStoredTPOs
            }
        }
    
        // add TPOs to histogram.  to help settle ties on histogram heights, keep middle TPOs in a single bar to
        // have higher values (sequence numbers).
        // for bars with an odd number of price boxes, there is one middle price box.
        // for bars with an even number of price boxes, make the even-valued absolute number of price boxes have the higher value.
        int lowOffsetNumBoxes = lowNumBoxes - baseNumBoxes
        int highOffsetNumBoxes = highNumBoxes - baseNumBoxes
        while (true)
        {
            add incremented seqNum to back of histogram [lowOffsetNumBoxes]
            increment numStoredTPOs
            if (lowOffsetNumBoxes >= highOffsetNumBoxes)
            {
                // odd number of boxes, so lowOffsetNumBoxes is the true middle one
                break loop
            }
            add seqNum (same -- not incremented) to back of histogram [highOffsetNumBoxes]
            increment numStoredTPOs
    
            increment lowOffsetNumBoxes
            decrement highOffsetNumBoxes
    
            if (lowOffsetNumBoxes > highOffsetNumBoxes)
            {
                // even number of boxes, and all boxes have been filled in.  the two middle boxes have the same sequence number.
                // give the highest even-valued absolute number of boxes a higher sequence number (i.e., rounding to avoid bias)
                // to break ties when finding points of control in getIndexes()
                if (lowOffsetNumBoxes + baseNumBoxes is even)
                {
                    // highTPOs represents an absolute even number of price boxes, so give it a higher sequence number to break a potential tie later
                    replace most recently added seqNum at the back end of histogram [highOffsetNumBoxes] with incremented seqNum
                }
                else
                {
                    // lowTPOs represents an absolute even number of price boxes, so give it a higher sequence number to break a potential tie later
                    replace most recently added seqNum at the back end of histogram [lowOffsetNumBoxes] with incremented seqNum
                }
                break loop
    
            }
        }
    
        add highNumBoxes to back of highBoxes
        add lowNumBoxes to back of lowBoxes
    
        // remove empty data from the high end of the histogram
        while the back element of histogram is empty
        {
            remove the back element of histogram
        }
    
        //  remove empty data from the low end of the histogram
        while the front element of histogram is empty
        {
            remove the front element of histogram
        }
    
     
  6. Looking at the example in your opening post, I see on the right-hand side that the total price range has been split into 19 buckets, counting from the highest letter to the lowest letter. So if you have a daily bar with a price range of 9.5 USD (e.g. daily low 89 USD, daily high 98.5 USD), each bucket gets a width of 0.5 USD (= 9.5/19). The lowest price bucket goes from 89 to 89.49, the second from 89.5 to 89.99, and so on, until the highest goes from 98.0 to 98.5 USD.
    Instead of defining how many price buckets you want, you could also specify the width of each price bucket. The number of buckets is then determined by the daily price range. In my example: if you would rather have buckets with a price range of 1 USD instead of 0.5 USD, the number of buckets becomes ten (9.5 rounded up), in order to capture all prices that were recorded during the day.
    I don't know how to code in C, only in Java. And I didn't want to cause confusion, which is why I didn't provide pseudocode.
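    The fixed-width variant described above could be sketched in C++ like this. The function names bucketCount and bucketIndex are hypothetical, and the sketch assumes the buckets start at the daily low and that the daily high is placed in the top bucket.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Number of buckets of the given width needed to cover [dayLow, dayHigh].
std::size_t bucketCount(double dayLow, double dayHigh, double width)
{
    assert(width > 0 && dayHigh >= dayLow);
    const double n = std::ceil((dayHigh - dayLow) / width);
    return static_cast<std::size_t>(std::max(n, 1.0)); // at least one bucket
}

// Index of the bucket that a price falls into, counting up from dayLow.
// The result is clamped so that the daily high lands in the top bucket.
std::size_t bucketIndex(double price, double dayLow, double width, std::size_t count)
{
    assert(width > 0 && count > 0 && price >= dayLow);
    const std::size_t idx = static_cast<std::size_t>(std::floor((price - dayLow) / width));
    return std::min(idx, count - 1);
}
```

    For the numbers in this post, a low of 89 and a high of 98.5 give 19 buckets at a width of 0.5 and 10 buckets at a width of 1.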
     
  7. I decided to give it a try in Java:
    First step is to define the buckets, define the range for each of them. I decided that I want 20 price buckets, and that the Highest and Lowest price of the day are known:
    Code:
    import java.util.ArrayList;

    class Bucket {
        double LowerPriceBoundary;
        double HigherPriceBoundary;
        ArrayList<String> HourIndicators;
    }

    // DailyHigh and DailyLow are assumed to be known already
    int NumberOfBuckets = 20;
    double BucketRange = (DailyHigh - DailyLow) / NumberOfBuckets;
    ArrayList<Bucket> HourlyBuckets = new ArrayList<>();
    // define and initialize hourly buckets
    for (int i = 0; i < NumberOfBuckets; i++) {
        Bucket b = new Bucket();
        b.LowerPriceBoundary = DailyLow + i * BucketRange;
        // subtract one cent so adjacent buckets do not share a boundary price
        b.HigherPriceBoundary = b.LowerPriceBoundary + BucketRange - 0.01;
        b.HourIndicators = new ArrayList<>();
        HourlyBuckets.add(b);
    }
    
    The next step is to transfer the hourly price bar information to the buckets. I assume here that each of the hourly bars has Low value and High value. Then, for each bucket I evaluate whether it is within the High and Low range of the hourly price bar. If it is in range, then the hour's number is registered.
    Code:
    // iterate over the hourly price bars to transfer price bar data to buckets
    for (int i = 0; i < HourlyBars.size(); i++) {
        double Low = HourlyBars.get(i).Low;
        double High = HourlyBars.get(i).High;
        for (int j = 0; j < HourlyBuckets.size(); j++) {
            Bucket b = HourlyBuckets.get(j);
            // the bar touches the bucket when their price ranges overlap
            boolean InRange = (b.LowerPriceBoundary <= High) && (b.HigherPriceBoundary >= Low);
            if (InRange) {
                // b refers to the object stored in the list, so no set() call is needed
                b.HourIndicators.add(String.valueOf(i));
            }
        }
    }
    
    The last step is to print out the result, from high to low:
    Code:
    // now print for each bucket which hours are in this bucket's range
    for (int i = HourlyBuckets.size() - 1; i >= 0; i--) {
        Bucket b = HourlyBuckets.get(i);
        String Hours = "";
        for (int j = 0; j < b.HourIndicators.size(); j++) Hours = Hours + b.HourIndicators.get(j);
        System.out.println(b.HigherPriceBoundary + " " + Hours);
    }
    
    I haven't tried this code with real data, so I'm not sure whether there are some bugs.
     
  8. Thanks for the share. Did you solve this case?