Your experience is common. Successful usage in this very complex area often requires an MS in Computer Science. Kicking around a few data files seldom does it. To your point on data: A large training set is required along with smaller sets for test and validation. For hourly EUR/USD about 24,000 bars work very well for a trade window of up to 12 hours into the future.
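For what it's worth, here is a minimal sketch of how a set of roughly 24,000 hourly bars might be carved into training, test and validation sets. The 70/15/15 proportions and the function name `chronological_split` are my own assumptions for illustration, not something specified above. The one design choice worth noting is that the split is chronological (no shuffling), so the later sets only contain bars the network has never seen.

```python
def chronological_split(bars, train_frac=0.70, test_frac=0.15):
    """Split a time-ordered sequence into (train, test, validation) slices.

    The fractions are illustrative assumptions; whatever remains after the
    training and test slices becomes the validation slice.
    """
    if train_frac <= 0 or test_frac <= 0 or train_frac + test_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(bars)
    train_end = round(n * train_frac)
    test_end = train_end + round(n * test_frac)
    # No shuffling: earlier bars train the network, later bars evaluate it.
    return bars[:train_end], bars[train_end:test_end], bars[test_end:]


# Stand-in for 24,000 hourly EUR/USD bars (here just bar indices).
bars = list(range(24_000))
train, test, validation = chronological_split(bars)
print(len(train), len(test), len(validation))
```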
To be fair, I do have a BS in computer engineering. I was using NEAT since it's what I could readily find without having to come up with a neuroevolving network from scratch. Even so, after I ironed out all the bugs, it tended to saturate. On each training pass it generally decided that it was better to just saturate and take the hit rather than try to guess and potentially score worse. I intend to try again when I have indicators that have either proven to work well together or are profitable on their own. So it would be optimization more so than discovery.
Yep, and regression as implemented in a NN is recursive adaptation, or curve fitting, to map the input neurons to the dependent (output) variable. Philosophically, if we ignore the hidden nodes, weight functions and a few other things, this is exactly what happens in most trading system optimizations. You can do it in Excel in 20 minutes with Solver, etc. if you want: find the perfect fit between your goal and the data elements in your trading system. Before you run off to hide, could you offer an opinion on why so many people do this on ET? In most other groups in technical or academic areas, folks at least have the courtesy to say, "oops, I see your point" when you cite source material like Wikipedia or journal sources which deflates the other party's balloon. Here folks run away like kids on the playground in junior high. Why is ET like that?
I've found that in some cases a NN will learn more from the mathematical components used to calculate an indicator, fed in along with the indicator itself. Keep in mind that indicators were invented to dumb down complex market dynamics so they can be shown on a 2 dimensional chart. NNs are inherently N dimensional and don't have this limitation. For example, the components of the RSI model much better than the RSI itself.
RSI Calculation

RSI = 100 - 100 / (1 + RS)
RS = Average Gain / Average Loss
Average Gain = [(previous Average Gain) x 13 + current Gain] / 14
First Average Gain = Total of Gains during past 14 periods / 14
Average Loss = [(previous Average Loss) x 13 + current Loss] / 14
First Average Loss = Total of Losses during past 14 periods / 14

Note: "Losses" are reported as positive values.

To simplify our explanation of the formula, the RSI has been broken down into its basic components, which are the RS, the Average Gain, and the Average Loss.

To calculate RSI values for a given dataset, first find the magnitude of all gains and losses for the 14 periods prior to the time where you wish to start the calculation. (Note: 14 is the standard number of periods used when calculating the RSI. If a different number is specified, just substitute that number in for "14" throughout this discussion.)

It is important to understand that the RSI is a "running" calculation and the accuracy of the calculation depends on how long ago the calculations started. The first RSI value is an estimate - subsequent values improve on that estimate. You should calculate at least 14 values prior to the start of any values that you will rely on - going back 28+ periods is even better.

To start the running calculation, the First Average Gain is calculated as the total of all gains during the past 14 periods divided by 14. Similarly, the First Average Loss is calculated as the total magnitude of all losses during the past 14 periods divided by 14. The next values for the "averages" are calculated by taking the previous value, multiplying it by 13, adding in the next Gain (or Loss), and then dividing by 14. This is Wilder's modified "smoothing" technique in action.

The RS value is simply the Average Gain divided by the Average Loss for each period. Finally, the RSI is simply the RS converted into an oscillator that goes between zero and 100 using this formula: 100 - (100 / (1 + RS)).
Here's an Excel Spreadsheet that shows the start of an RSI calculation in action. When the Average Gain is greater than the Average Loss, the RSI rises because RS will be greater than 1. Conversely, when the Average Loss is greater than the Average Gain, the RSI declines because RS will be less than 1. The last part of the formula ensures that the indicator oscillates between 0 and 100. Note: If the Average Loss ever becomes zero, RSI becomes 100 by definition.
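For anyone who would rather see the running calculation in code than in a spreadsheet, here is a rough Python sketch of it. It follows the description in this post (simple average for the first value, then the x 13 / 14 smoothing, with RSI set to 100 when the Average Loss is zero). The function names `wilder_rsi` and `rsi_from_averages` are my own, and the output layout (one tuple per bar) is just an assumption that makes the components easy to feed to a NN.

```python
def rsi_from_averages(avg_gain, avg_loss):
    """Convert Average Gain / Average Loss into an RSI value in [0, 100]."""
    if avg_loss == 0:
        return 100.0  # by definition when there are no losses
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def wilder_rsi(closes, period=14):
    """Return a list of (avg_gain, avg_loss, rsi) tuples.

    The first tuple corresponds to closes[period]; each later tuple
    corresponds to the next close in the series.
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]  # losses kept as positive values

    # First averages: plain totals over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [(avg_gain, avg_loss, rsi_from_averages(avg_gain, avg_loss))]

    # Subsequent averages: previous value x (period - 1), plus the new
    # gain or loss, divided by period.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out.append((avg_gain, avg_loss, rsi_from_averages(avg_gain, avg_loss)))
    return out


# Tiny example with a 2-period RSI so the numbers are easy to follow by hand.
for row in wilder_rsi([1.0, 2.0, 1.0, 2.0], period=2):
    print(row)
```

As the post says, the early values are only estimates, so in practice you would discard the first stretch of output before relying on it.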
The reason is that you are using linear math in an attempt to model a non-linear system. Try taking your current logical design and running it through a non-linear predictive analytics application. See: http://www.kdnuggets.com/software/index.html
So the two components are the close and an exponential smoother.

Components of a 14 day RSI:
1 - 14: daily gains, one for each day
15 - 28: daily losses, one for each day
29: Average Gain
30: Average Loss
Total components = 30
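If it helps, here is a rough sketch of how that 30-component input vector for a 14 day RSI might be assembled for the most recent bar. This is only one reading of the breakdown (14 daily gains, 14 daily losses, then the smoothed Average Gain and Average Loss); the function name `rsi_feature_vector` and the ordering of the elements are assumptions on my part.

```python
def rsi_feature_vector(closes, period=14):
    """Build [gains..., losses..., avg_gain, avg_loss] for the latest bar.

    Returns 2 * period + 2 values (30 for the standard 14-period RSI).
    """
    changes = [b - a for a, b in zip(closes, closes[1:])]
    if len(changes) < period:
        raise ValueError("need at least period + 1 closes")
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]  # losses kept as positive values

    # Wilder smoothing run over the full history so the averages have
    # settled by the time the latest bar is reached.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period

    # Components 1-14: latest gains, 15-28: latest losses, 29-30: averages.
    return gains[-period:] + losses[-period:] + [avg_gain, avg_loss]


# Synthetic closes (a steady rise) purely to show the shape of the output.
closes = [float(i) for i in range(1, 31)]
features = rsi_feature_vector(closes)
print(len(features))
```

In a real setup you would probably scale these inputs before handing them to the network, since raw gains and losses depend on the price level of the instrument.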