Hi fellow developers,

here's some source code I wrote for generating time series using Geometric Brownian Motion (GBM):

Code:
/*
GBM.cpp - Geom. Brownian Motion (GBM)

2016-01-23-Sa: v0.99: init

Author: U.M. in Germany (user botpro at www.elitetrader.com)

What-it-does:
  Create timeseries data using Geom. Brownian Motion (GBM)
  Can generate bars of any size in time.
  By default it generates 30-sec bars (ie. 780 bars/day @ 23400 seconds/day)
  You can modify it easily to create OHLC-data (intraday and EOD)
  to be used in trading systems like AmiBroker etc.

Compile using a C++11 conformant compiler like GNU g++:
  g++ -Wall -O2 -std=gnu++11 GBM.cpp

Run (here on Linux):
  ./a.out >file.csv

Analyse:
  Import file.csv into Excel or LibreOffice-Calc and do some analysis (calcs, charts etc).
  Remember: the changes are normally distributed, but the resulting timeseries
  is log-normally distributed because there are no negative stock prices.
  By default it uses the normal-distribution.
  t-distribution can be used optionally (see "#if 1" in code).

  For normally distributed changes the quality of the data can be verified
  with the following formula:
    ObservedAnnualVolaPct = BarVolaPct * sqrt(252 * nBarsPerDay)
  ie. in Excel/LibreOffice-Calc for the sample data the pgm creates do this:
    =STDEV(D2:D16381)*100*SQRT(252*780)
  It should give approximately the same VolaPct as was specified
  as the input volatility (ie. here 30).

  FYI: Doing the same calc over data that was created using the t-distribution
  gives a higher number (about 38), so using the default normal-distribution
  is the stochastically correct method for research. For the difference see [2], sections
  "Normally Distributed Model of Asset Returns" and "Leptokurtic Model of Asset Returns".

Misc:
  - You can modify the code easily to create OHLC-data (intraday and EOD)
    to be used in trading systems like AmiBroker etc.
  - It works with trading days instead of calendar days,
    and a year is defined as 252 trading days (can be chgd in ctor)
  - This code is a stripped down standalone usable version of my TCIntradaySpotGenerator

See also / References:
  [1] https://en.wikipedia.org/wiki/Geometric_Brownian_motion
  [2] https://mhittesdorf.wordpress.com/2013/12/29/introducing-quantlib-modeling-asset-prices-with-geometric-brownian-motion/
  [3] https://people.sc.fsu.edu/~jburkardt/cpp_src/brownian_motion_simulation/brownian_motion_simulation.html
  [4] http://www.javaquant.net/books/MCBook-1.2.pdf
  [5] http://investexcel.net/geometric-brownian-motion-excel
*/

#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <random>
#include <chrono>

using namespace std;

default_random_engine randgen(chrono::system_clock::now().time_since_epoch().count());
normal_distribution<double> n_dist(0.0, 1.0);  // mu=0, s=1
student_t_distribution<double> t_dist(5);      // 5 degrees of freedom

class GBM
{
  private:
    double sigma, r, q, u, t, dt, R, SD, S_t, S_tPrev;
    size_t cGen;

  public:
    const size_t uDays, uDailyBars;
    const double dbSpot0, dbAnnDriftPct, dbAnnDividPct, dbAnnVolaPct, dbTradeDaysInYear;

    GBM(const double AdbSpot0           = 100.0,
        const double AdbAnnVolaPct      = 30.0,
        const size_t AuDailyBars        = 780,
        const size_t AuDays             = 252,
        const double AdbTradeDaysInYear = 252.0,
        const double AdbAnnDriftPct     = 0.0,
        const double AdbAnnDividPct     = 0.0
       )
      : uDays(AuDays), uDailyBars(AuDailyBars),
        dbSpot0(AdbSpot0), dbAnnDriftPct(AdbAnnDriftPct), dbAnnDividPct(AdbAnnDividPct),
        dbAnnVolaPct(AdbAnnVolaPct), dbTradeDaysInYear(AdbTradeDaysInYear)
    {
      r     = dbAnnDriftPct / 100.0;
      q     = dbAnnDividPct / 100.0;   // dividend yield
      u     = r - q;
      t     = double(uDays) / dbTradeDaysInYear;
      dt    = t / double(uDays * uDailyBars);
      sigma = AdbAnnVolaPct / 100.0;
      SD    = sigma * sqrt(dt);
      R     = (u - 0.5 * sigma * sigma) * dt;   // Ito's lemma
      S_t   = S_tPrev = dbSpot0;
      cGen  = 0;
    }

    double generate()
    { // convention: the very first spot is the initial spot
      S_tPrev = S_t;
#if 1
      // normal distribution (gauss)
      if (cGen++) S_t *= exp(R + SD * n_dist(randgen));
#else
      // t-distribution ("fat tails")
      if (cGen++) S_t *= exp(R + SD * t_dist(randgen));
#endif
      return S_t;
    }

    double get_cur()  const { return S_t; }
    double get_prev() const { return S_tPrev; }
};

int main()
{
  // define the input params to use in GBM:
  const double dbSpot0     = 100;
  const double dbVolaPct   = 30;
  const size_t nBarsPerDay = 780;   // ie. 30-sec bars @ 23400 trading seconds per day

  GBM G(dbSpot0, dbVolaPct, nBarsPerDay);

  // create bar data for 21 days (= 1 trading month) and print as CSV:
  printf("Day,Bar,Spot,logOfChg\n");
  for (size_t d = 1; d <= 21; ++d)
    for (size_t b = 1; b <= nBarsPerDay; ++b)
    {
      const double dbCur  = G.generate();
      const double dbPrev = G.get_prev();
      printf("%zu,%zu,%.5f,%f\n", d, b, dbCur, log(dbCur / dbPrev));
    }

  return 0;
}

You can find it also in the attached zip file together with a sample output.
Enjoy!

PS: let me know pls if you find any bugs. Thx.
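For anyone who would rather run the volatility check from the header comment in code instead of a spreadsheet, here is a minimal sketch. The function names sampleStdDev and observedAnnualVolaPct are made up for this example and are not part of GBM.cpp. It assumes you have already collected the per-bar log changes (the fourth CSV column) into a vector.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sample standard deviation with n-1 in the denominator,
// which is what STDEV() computes in Excel/LibreOffice-Calc.
double sampleStdDev(const std::vector<double>& x)
{
    const std::size_t n = x.size();
    if (n < 2) return 0.0;                 // not enough data for a sample stdev

    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= double(n);

    double ss = 0.0;                       // sum of squared deviations
    for (double v : x) ss += (v - mean) * (v - mean);
    return std::sqrt(ss / double(n - 1));
}

// ObservedAnnualVolaPct = BarVolaPct * sqrt(tradeDaysInYear * barsPerDay)
// (hypothetical helper; parameter names are assumptions of this sketch)
double observedAnnualVolaPct(const std::vector<double>& logChanges,
                             double barsPerDay,
                             double tradeDaysInYear = 252.0)
{
    return sampleStdDev(logChanges) * 100.0
         * std::sqrt(tradeDaysInYear * barsPerDay);
}
```

For the sample data you would call observedAnnualVolaPct(logChanges, 780) and expect a value near the input volatility of 30, with some sampling noise.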
I compiled it using Code::Blocks under Windows. In the code I see the comment

// create bar data for 21 days (= 1 trading month) and print as CSV

Did you omit the statement that writes the output to a CSV file? MyFile?
Please read the section "Run" in the comment header of the code. The program prints the CSV data to standard output, and the ">" redirects that output into a file. Under Windows you would simply do this in a command window:

pgmname >filename.csv

If Windows does not like a name like "file.csv" then simply use a different filename.
I have no experience with Code::Blocks. Just ensure that your C++ compiler generates the *.EXE file, then execute it as shown above...
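If you prefer the program to write the file itself rather than redirecting, something like the following could work. This is only a sketch: writeSpotsCsv is a hypothetical helper, not part of the posted code, and it assumes the spots have been collected into a vector first and that only the Day, Bar and Spot columns are wanted.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: writes one "Day,Bar,Spot" row per spot to `path`.
// Day and bar numbers are 1-based and derived from the position in `spots`.
// Returns false if barsPerDay is 0 or the file cannot be opened/closed.
bool writeSpotsCsv(const std::string& path,
                   const std::vector<double>& spots,
                   std::size_t barsPerDay)
{
    if (barsPerDay == 0) return false;

    std::FILE* f = std::fopen(path.c_str(), "w");
    if (!f) return false;

    std::fprintf(f, "Day,Bar,Spot\n");
    for (std::size_t i = 0; i < spots.size(); ++i) {
        const std::size_t day = i / barsPerDay + 1;
        const std::size_t bar = i % barsPerDay + 1;
        std::fprintf(f, "%zu,%zu,%.5f\n", day, bar, spots[i]);
    }
    return std::fclose(f) == 0;
}
```

The redirect approach from the post stays the simpler option. This variant is mainly useful when you start the program from an IDE and have no command window at hand.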
Here's an updated version (v1.00).
Now t-distribution can be enabled via the last constructor parameter.
Now also the verification formula works for both cases, n-dist and t-dist.

Code:
/*
GBM.cpp - Geom. Brownian Motion (GBM)

2016-01-23-Sa: v0.99: initial version
2016-01-26-Tu: v1.00: now t-distribution can be activated via the last param in ctor

Author: U.M. in Germany (user botpro at www.elitetrader.com)

What-it-does:
  Create timeseries data using Geom. Brownian Motion (GBM).
  Can generate bars of any size in time.
  By default it generates 30-sec bars (ie. 780 bars/day @ 23400 seconds/day).
  You can modify it easily to create OHLC-data (intraday and EOD)
  to be used in trading platforms like AmiBroker etc.

Compile using a C++11 conformant compiler like GNU g++:
  g++ -Wall -O2 -std=gnu++11 GBM.cpp -o GBM.exe

Run:
  Linux/Unix: ./GBM.exe >data.csv
  Windows:    GBM.exe >data.csv

Analyse:
  Import data.csv into Excel or LibreOffice-Calc and do some analysis (calcs, charts etc).
  Remember: the stock returns (ie. the logarithmic changes) are normally distributed,
  but the resulting timeseries is log-normally distributed because there are no
  negative stock prices.
  By default it uses the normal-distribution.
  t-distribution can be used optionally (see last param of the ctor).
  Using the default normal-distribution is the stochastically correct method for research.
  For the difference see [2], sections
  "Normally Distributed Model of Asset Returns" and "Leptokurtic Model of Asset Returns".

  The quality of the generated data can be verified with the following formula:
    ObservedAnnualVolaPct = BarVolaPct * sqrt(252 * nBarsPerDay)
  ie. in Excel/LibreOffice-Calc for the sample data the pgm creates do this:
    =STDEV(D2:D16381)*100*SQRT(252*780)
  It should give approximately the same VolaPct as was specified
  as the input volatility (ie. here 30).

Misc:
  - You can modify the code easily to create OHLC-data (intraday and EOD)
    to be used in trading platforms like AmiBroker etc.
  - It works with trading days instead of calendar days,
    and a year is defined as 252 trading days (can be chgd in ctor)
  - This code is a stripped down standalone usable version of my TCIntradaySpotGenerator

See also / References:
  [1] https://en.wikipedia.org/wiki/Geometric_Brownian_motion
  [2] https://mhittesdorf.wordpress.com/2013/12/29/introducing-quantlib-modeling-asset-prices-with-geometric-brownian-motion/
  [3] https://people.sc.fsu.edu/~jburkardt/cpp_src/brownian_motion_simulation/brownian_motion_simulation.html
  [4] http://www.javaquant.net/books/MCBook-1.2.pdf
  [5] http://investexcel.net/geometric-brownian-motion-excel
  [6] https://en.wikipedia.org/wiki/Volatility_(finance)
*/

#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <random>
#include <chrono>

using namespace std;

default_random_engine randgen(chrono::system_clock::now().time_since_epoch().count());
normal_distribution<double> n_dist(0.0, 1.0);  // mu=0, s=1
student_t_distribution<double> t_dist(5);      // 5 degrees of freedom

class GBM
{
  public:
    const size_t uDays, uDailyBars;
    const double dbSpot0, dbAnnDriftPct, dbAnnDividPct, dbAnnVolaPct, dbTradeDaysInYear;
    const bool   fUseTdistribution;

  private:
    double r, q, u, t, dt, sigma, SD, R;
    double S_t, S_tPrev;
    size_t cGen;

  public:
    GBM(const double AdbSpot0           = 100.0,
        const double AdbAnnVolaPct      = 30.0,
        const size_t AuDailyBars        = 780,
        const size_t AuDays             = 252,
        const double AdbTradeDaysInYear = 252.0,
        const double AdbAnnDriftPct     = 0.0,
        const double AdbAnnDividPct     = 0.0,
        const bool   AfUseTdistribution = false)
      : uDays(AuDays), uDailyBars(AuDailyBars),
        dbSpot0(AdbSpot0), dbAnnDriftPct(AdbAnnDriftPct), dbAnnDividPct(AdbAnnDividPct),
        dbAnnVolaPct(AdbAnnVolaPct), dbTradeDaysInYear(AdbTradeDaysInYear),
        fUseTdistribution(AfUseTdistribution)
    {
      r     = dbAnnDriftPct / 100.0;
      q     = dbAnnDividPct / 100.0;   // dividend yield
      u     = r - q;
      t     = double(uDays) / dbTradeDaysInYear;
      dt    = t / double(uDays * uDailyBars);
      sigma = AdbAnnVolaPct / 100.0;
      if (fUseTdistribution)
        sigma = sqrt(sigma * sigma * 3 / 5);   // scaled by reciprocal of Student T variance (v/(v-2))
      SD    = sigma * sqrt(dt);
      R     = (u - 0.5 * sigma * sigma) * dt;   // Ito's lemma
      S_t   = S_tPrev = dbSpot0;
      cGen  = 0;
    }

    double generate()
    { // convention: the very first spot is the initial spot
      S_tPrev = S_t;
      if (!cGen++) return S_t;
      if (!fUseTdistribution)
        S_t *= exp(R + SD * n_dist(randgen));   // normal distribution (Gauss)
      else
        S_t *= exp(R + SD * t_dist(randgen));   // t-distribution ("fat tails")
      return S_t;
    }

    double get_cur()  const { return S_t; }
    double get_prev() const { return S_tPrev; }
};

int main()
{
  // define the input params to use in GBM:
  const double dbSpot0     = 100;   // start with this stock price
  const double dbVolaPct   = 30;    // historic volatility
  const size_t nBarsPerDay = 780;   // ie. 30-sec bars @ 23400 trading seconds per day

  GBM G(dbSpot0, dbVolaPct, nBarsPerDay, 252, 252.0, 0.0, 0.0, false);

  // fprintf(stderr, "Using: %s\n", G.fUseTdistribution ? "t-distribution" : "normal-distribution");

  // create bar data for 21 days (= 1 trading month) and print as CSV:
  printf("Day,Bar,Spot,lnOfChg\n");
  for (size_t d = 1; d <= 21; ++d)
    for (size_t b = 1; b <= nBarsPerDay; ++b)
    {
      const double dbCur  = G.generate();
      const double dbPrev = G.get_prev();
      printf("%zu,%zu,%.5f,%.10f\n", d, b, dbCur, log(dbCur / dbPrev));
    }

  return 0;
}
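The header comment mentions that the code can be modified to produce OHLC data. One possible way to do that, sketched under my own assumptions: collect the generated spots in a vector and fold every N consecutive bars into one candle. The struct Ohlc and the function aggregateOhlc are hypothetical names for this example, and keeping a trailing partial candle is just one choice among several.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One candle: first, highest, lowest and last spot of a group of bars.
struct Ohlc { double open, high, low, close; };

// Hypothetical helper: folds every `barsPerCandle` consecutive spots into
// one candle. A trailing group with fewer spots becomes a partial candle.
std::vector<Ohlc> aggregateOhlc(const std::vector<double>& spots,
                                std::size_t barsPerCandle)
{
    std::vector<Ohlc> out;
    if (barsPerCandle == 0) return out;

    for (std::size_t i = 0; i < spots.size(); i += barsPerCandle) {
        const std::size_t end = std::min(i + barsPerCandle, spots.size());
        Ohlc c{spots[i], spots[i], spots[i], spots[end - 1]};
        for (std::size_t k = i; k < end; ++k) {
            c.high = std::max(c.high, spots[k]);
            c.low  = std::min(c.low,  spots[k]);
        }
        out.push_back(c);
    }
    return out;
}
```

With the default 780 bars per day, aggregateOhlc(spots, 780) would give one EOD candle per trading day, and aggregateOhlc(spots, 10) would give 5-minute candles.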
BTW, the "AnnDriftPct" parameter is the expected annual return of the stock in percent (don't ask me how to know that in advance, but one can take last year's value to simulate a "continuation" using GBM...). Normally one can use 0 for it, especially if that value is unknown in advance.
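To make the role of the drift explicit, here is the per-bar update that the code performs, written as a formula. The symbols follow the variable names in the code (u = r - q, sigma, dt). Z is my notation for a standard normal draw.

```latex
S_{t+\Delta t} = S_t \exp\!\Big( \big(u - \tfrac{1}{2}\sigma^2\big)\,\Delta t
                 + \sigma \sqrt{\Delta t}\; Z \Big),
\qquad Z \sim \mathcal{N}(0,1), \quad u = r - q

\mathbb{E}[S_T] = S_0 \, e^{uT},
\qquad
\operatorname{median}(S_T) = S_0 \, e^{(u - \sigma^2/2)\,T}
```

So with AnnDriftPct = 0 (and no dividend yield) the expected price stays at the initial spot, while the median path drifts slightly downward because of the -sigma^2/2 term.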
It seems the good thing about GBM is that you don't have to worry much about a significant market crash or black swan showing up in the simulation when you don't want one:

https://www.stat.berkeley.edu/~aldous/Research/Ugrad/ZY3.pdf

"We see this is a pretty nice simulation of future value of some commodity. The time is adjusted in terms of days. In the above graph, we see that after 1000 days, this commodity have a current price value of about 160. Unlike the our usual Brownian motion, Geometric Brownian motion is controlled by the "trend". That is if we do hundreds of simulation of Geometric Brownian Motion simulation, most of the graph will "heading toward a direction" with some deviation. To verify this, let's plot several graph of Geometric Brownian Motion with the same constant in a same graph. For example, We can use the following code to plot several GBM in the same graph to see the trend, where I choose to plot 5 GBMs with interest rate u to be 0.2, and volatility rate to be 0.005 in the graph. As we see from the graph, the trend of those GBM are roughly matches, it is the volatility factor and the internal random noise of Wiener Process cause these grpah to have different shape."
GBM is not random data, it is simulated market data (some call it "synthetic market data").
Where can you get, say, 5-second bar data for hundreds of stocks for, say, 1000 years?
With GBM it is possible to test 1000s of market situations fast and at no cost. It is very good for market modelling and forward testing, much like BSM is for modelling options.

And: you don't even need to store the mass data in databases; just store the "seed number", which is a single 32-bit or 64-bit value depending on the platform and standard library...
Ie. in the above code one would do something like this at program start in main():

Code:
randgen.seed(123456);

It means that you can replay/repeat the same sequence it generates. This is important during system design and development (ie. for debugging the system).
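A small sketch of the replay idea, with a few assumptions that differ from the posted code: it uses std::mt19937 instead of default_random_engine (whose underlying engine is implementation-defined), keeps the engine and distribution local to the function, and uses zero drift. samplePath is a hypothetical name for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: generates nBars spots of a zero-drift GBM path
// from a given seed. The first element is the initial spot, as in GBM.cpp.
std::vector<double> samplePath(unsigned seed,
                               std::size_t nBars,
                               double spot0       = 100.0,
                               double annVolaPct  = 30.0,
                               double barsPerYear = 252.0 * 780.0)
{
    std::mt19937 gen(seed);                          // engine fixed by the seed
    std::normal_distribution<double> dist(0.0, 1.0); // fresh, no cached state

    const double sigma = annVolaPct / 100.0;
    const double dt    = 1.0 / barsPerYear;
    const double SD    = sigma * std::sqrt(dt);
    const double R     = -0.5 * sigma * sigma * dt;  // drift u = 0 assumed

    std::vector<double> path;
    path.reserve(nBars);
    double s = spot0;
    for (std::size_t i = 0; i < nBars; ++i) {
        if (i > 0) s *= std::exp(R + SD * dist(gen));
        path.push_back(s);
    }
    return path;
}
```

Calling samplePath(123456, n) twice in the same build returns the same path. Two caveats as far as I know: the distribution classes are not guaranteed to produce identical numbers across different standard library implementations, so a seed replays reliably only with the same compiler and library. And if you re-seed a global engine in the middle of a run, as opposed to at program start, also call reset() on the distribution object, because normal_distribution may hold a cached value.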