I want to compare my results against "random" results to test a null hypothesis. Since random methods rarely work at all once commissions etc. are included, what is the best way to construct the random dataset that I compare my results to?

Which null hypothesis? There are infinitely many. You've answered your own question: "Because random methods rarely work at all", so why bother comparing them to a test strategy? If you want to compare your strategy's results to something, your benchmark should be the buy-and-hold results.

The null hypothesis for Monte Carlo tests is that the pairing of long/short/neutral positions with raw returns is random. The null hypothesis for bootstrapping tests is that the mean return is zero. These are standard null hypotheses. They certainly involve assumptions. Obviously the buy-and-hold return is not a test of any kind. As in the case of the probability of ruin, you are flat wrong here too.
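
To make the two null hypotheses above concrete, here is a minimal sketch using only the Python standard library. The function names, the per-period position coding (+1 long, -1 short, 0 neutral), the iteration count and the seed are all my own assumptions for illustration, and commissions and slippage are ignored; to include them you would subtract the same costs from the observed and the resampled returns.

```python
import random

def permutation_pvalue(positions, returns, n_iter=2000, seed=0):
    """Monte Carlo permutation test (hypothetical helper).

    Null hypothesis: the pairing of positions with raw returns is random.
    positions: +1 long, -1 short, 0 neutral, one per period (assumed coding).
    """
    rng = random.Random(seed)
    n = len(returns)
    observed = sum(p * r for p, r in zip(positions, returns)) / n
    shuffled = list(positions)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # break the pairing, keep the mix of positions
        mean = sum(p * r for p, r in zip(shuffled, returns)) / n
        if mean >= observed:
            hits += 1
    # One-tailed p-value with the usual +1 correction
    return (hits + 1) / (n_iter + 1)

def bootstrap_pvalue(strategy_returns, n_iter=2000, seed=0):
    """Bootstrap test (hypothetical helper).

    Null hypothesis: the mean strategy return is zero.
    """
    rng = random.Random(seed)
    n = len(strategy_returns)
    observed = sum(strategy_returns) / n
    # Shift the sample so that its mean is exactly zero, as the null assumes
    centered = [r - observed for r in strategy_returns]
    hits = 0
    for _ in range(n_iter):
        sample = rng.choices(centered, k=n)  # resample with replacement
        if sum(sample) / n >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)
```

In both cases a small p-value says the observed mean return would be unusual under that particular null; it says nothing about other nulls, such as beating buy and hold.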

Actually, I think both of you have good points. I think (and I could be wrong) that using the S&P 500 as a benchmark is common and serves as a stand-in for buy and hold. The other benchmark would be zero earnings minus commission and slippage (net loss = commission + slippage), which is based on a more MC-like or statistical null hypothesis.

Perhaps you can tabulate a few one-tailed hypotheses, "your mean return > S&P return", for various timeframes (3 months, 6 months, 1 year, 3 years, 5 years, 10 years), like a time-series forecast, to compare the validity of your hypothesis results with the actual ones for each timeframe. As described in http://en.wikipedia.org/wiki/Null_hypothesis , a hypothesis with a directional benchmark number could be more meaningful than a general-statement type of hypothesis. I wonder if probabilities could be used/assigned instead of always comparing mean values... how is Bayesian inference actually applied? Say I have a series of 100 random numbers between 1 and 10. How do I test the hypothesis that "there is a bias for the numbers 5, 6 & 7"?
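
One possible way to approach the 5, 6 & 7 question, as a rough sketch rather than the definitive answer: assume the numbers are integers from 1 to 10, so under the "no bias" null each draw lands in {5, 6, 7} with probability 0.3, and the count of such draws is binomial. The frequentist version is a one-tailed exact binomial test; a simple Bayesian version puts a uniform Beta(1, 1) prior on that probability and asks how much posterior mass lies above 0.3. The function names, the prior, the number of draws and the seed are my own choices for illustration.

```python
import math
import random

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

def bias_pvalue(samples, favored=(5, 6, 7), n_values=10):
    """One-tailed exact binomial test of 'no bias toward the favored values'.

    Assumes integer samples that are uniform on 1..n_values under the null.
    Returns the favored count and the p-value.
    """
    k = sum(1 for s in samples if s in favored)
    p0 = len(favored) / n_values  # 0.3 for {5, 6, 7} out of 1..10
    return k, binomial_upper_tail(k, len(samples), p0)

def posterior_prob_biased(k, n, p0, draws=20000, seed=0):
    """Bayesian take: uniform Beta(1, 1) prior on the favored probability.

    The posterior is Beta(1 + k, 1 + n - k); P(theta > p0) is estimated
    by sampling from it.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(draws)
               if rng.betavariate(1 + k, 1 + n - k) > p0)
    return hits / draws
```

With 100 samples you would call bias_pvalue(samples), then pass the returned count to posterior_prob_biased(k, 100, 0.3). The first gives a p-value against the uniform null; the second gives a probability for the bias itself, which is the kind of "probability number" asked about, though it depends on the chosen prior.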