I am currently messing around with various NN models, and one thing I can't seem to find a clear answer on is how to normalize input data for NNs. I can see how it is possible to scale input data retrospectively, but what about unseen data? How does one normalize that (e.g. scaling an indicator input during forward testing)?

Normalize it based on the data you're feeding it. For example, if your neural net takes 10 inputs, then before you pass those values along to the network, normalize them based on those 10 inputs.

If I normalize all data at the inputs (at each time interval), won't any data that exists in tighter ranges get squashed by the larger inputs?

Regarding neural networks, there is a good FAQ at ftp://ftp.sas.com/pub/neural/FAQ.html. As for normalizing inputs, you can normalize them in your existing data set by multiplying them by some value. Obviously there is no way to calculate that multiplier for unseen data.

You could try this:
1) Apply a median filter. This will help remove some outliers, which are almost always present unless the data has already been processed. Take all of the data within each domain, take the absolute value, sort it, and remove or clip anything larger than, say, the 95th percentile (taking care to honor the sign when dealing with negative values).
2) Take the average and sdev of the remaining distribution, and rescale the data using v' = (v - average) / sdev.
This way, as long as your data is somewhat stationary, you shouldn't have too many surprises moving forward with data you haven't yet seen. If your inputs are orthogonal, you will need to treat each input node's data separately.
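Here is a rough sketch of the two steps above in plain Python, for a single input. It covers only the percentile clipping and the rescaling, not a rolling median filter. The function names, the nearest-rank percentile rule, and the made-up history are my own assumptions for illustration, not something from the post.

```python
import statistics

def fit_clip_and_scale(train_values, pct=95):
    """Derive the clip limit, mean and sdev for ONE input from its history."""
    magnitudes = sorted(abs(v) for v in train_values)
    # Nearest-rank percentile of the magnitudes, using integer arithmetic.
    rank = (pct * len(magnitudes) + 99) // 100      # ceil(pct/100 * n)
    limit = magnitudes[max(rank, 1) - 1]
    # Clip symmetrically so the sign of each value is preserved.
    clipped = [max(-limit, min(limit, v)) for v in train_values]
    return limit, statistics.fmean(clipped), statistics.pstdev(clipped)

def rescale(v, limit, mean, sdev):
    """Apply v' = (v - average) / sdev after clipping to the stored limit."""
    clipped = max(-limit, min(limit, v))
    return (clipped - mean) / sdev

# Hypothetical history: 19 ordinary values plus one obvious outlier.
history = [0.1 * i - 1.0 for i in range(19)] + [50.0]
limit, mean, sdev = fit_clip_and_scale(history)

scaled_history = [rescale(v, limit, mean, sdev) for v in history]
# A wild unseen value is clipped to the same limit before scaling.
spike = rescale(1000.0, limit, mean, sdev)
```

The three stored numbers (limit, mean, sdev) are all you need to carry forward, so a spike in new data can never land further out than the clip limit did in the history. This assumes sdev is not zero, i.e. the input is not constant.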

The correct way to normalize data for this purpose is to do it the same way that you did during training. For example, if the data is normalized by subtracting the mean and then dividing by the standard deviation, then the mean and standard deviation are calculated on the training data, and those same mean and standard deviation values are used for future data.
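To make that concrete, here is a minimal sketch in plain Python. It computes one mean and standard deviation per input column from the training rows, then reuses those frozen values on a row that was not in the training set. The names fit and apply_scaling and the toy numbers are hypothetical, and it assumes no column is constant.

```python
import statistics

def fit(train_rows):
    """Return one (mean, std) pair per input column, from training rows only."""
    columns = list(zip(*train_rows))
    return [(statistics.fmean(c), statistics.pstdev(c)) for c in columns]

def apply_scaling(row, params):
    """Scale one row (seen or unseen) with the frozen training parameters."""
    return [(v - m) / s for v, (m, s) in zip(row, params)]

# Hypothetical training set: a wide-range input and a tight-range input.
train_rows = [
    [200.0, -3.0], [400.0, -1.0], [400.0, -1.0], [400.0, -1.0],
    [500.0, 0.0], [500.0, 0.0], [700.0, 2.0], [900.0, 4.0],
]
params = fit(train_rows)          # computed once, then stored with the model

# An unseen row during forward testing reuses the same parameters.
unseen = apply_scaling([1100.0, 6.0], params)
```

Because each column gets its own parameters, the tight-range input is not squashed by the wide-range one: both end up on a comparable scale.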

That's right. Be very careful to use ONLY training data to derive the normalization parameters (usually mean and std). Otherwise you are peeking at future data, and you'd be surprised how profitable your system would appear.

Interesting that the thread is a year old. However, considering the renewed interest in two threads, perhaps some of our experienced users can chime in on some of the following areas.
1) What training and validation sample sizes have you found to be useful?
1a) What resolution are you using (tick, minute, daily, etc.)?
1b) How often do you retrain?
2) What specific factors have you found to be useful in your inputs?
3) What type of hit rate are you achieving OOS (assuming binary classification)?
3a) Over what validation sample size are you achieving this hit rate?
--------------------------------------------
"Truth is much too complicated to allow anything but approximations." Von Neumann

This will depend on any number of factors, but a significance test or confidence interval can be used to assess the quality of performance measurements on the validation/test data. The Usenet comp.ai.neural-nets FAQ includes some guidance on this, but I'd recommend picking up a book, such as Computer Systems That Learn, by Weiss and Kulikowski.
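As one possible illustration of the confidence-interval idea, the sketch below puts a Wilson score interval around an out-of-sample hit rate. The choice of the Wilson interval, the function name, and the hit counts are my own assumptions. It also treats each prediction as an independent trial, which overlapping trades may violate.

```python
import math
from statistics import NormalDist

def wilson_interval(hits, n, confidence=0.95):
    """Wilson score interval for a hit rate of `hits` out of `n` trials."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)   # two-sided z value
    p = hits / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half

# The same 55% hit rate on two hypothetical validation sample sizes.
small = wilson_interval(55, 100)
large = wilson_interval(5500, 10000)

print("n=100:   %.3f to %.3f" % small)
print("n=10000: %.3f to %.3f" % large)
```

With 100 samples the interval still contains 0.5, so a 55% hit rate there cannot be told apart from coin flipping. With 10,000 samples the lower bound sits above 0.5. That is why the validation sample size matters as much as the hit rate itself.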

Thanks, although I'm pretty familiar with the basics. I am more interested in experience-based specifics that work for you, such as architecture, OOS hit rate, input factors, etc. (as explicitly mentioned in the 1st post). Or, almost as useful, what didn't work for you. It sounds like you have mentioned machine learning in some other threads. Feel free to comment on specifics of other types of learners you have worked on or built.