Gotta vectorize. I did an initial backtest in Python over lots of large dataframes using loops. It took almost an hour to complete. Then I vectorized everything and it ran in 4 minutes.
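For the OP, here's a minimal sketch of the kind of change that makes the difference (toy data and column names are mine, not the poster's): computing simple returns row by row versus one vectorized pandas call.

```python
import numpy as np
import pandas as pd

# Toy price series -- the column name is illustrative, not from the post.
df = pd.DataFrame({"close": np.random.rand(100_000) * 100})

# Looping: compute simple returns one row at a time (painfully slow).
returns_loop = [np.nan]
for i in range(1, len(df)):
    returns_loop.append(df["close"].iloc[i] / df["close"].iloc[i - 1] - 1)
df["ret_loop"] = returns_loop

# Vectorized: one call over the whole column, orders of magnitude faster.
df["ret_vec"] = df["close"].pct_change()
```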
Depends how you loop. Sounds like you were using iterrows instead of itertuples or iteritems. iterrows boxes every row into a Series, converting each value along the way, and that's ridiculously slow.
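A quick illustration of the two row-iteration styles on a toy dataframe (column names are made up for the example):

```python
import pandas as pd

df = pd.DataFrame({"bid": [99.5, 99.6], "ask": [99.7, 99.8]})

# iterrows builds a Series for every row, boxing each value -- slow.
for _, row in df.iterrows():
    mid = (row["bid"] + row["ask"]) / 2

# itertuples yields lightweight namedtuples and preserves dtypes -- much faster.
for row in df.itertuples(index=False):
    mid = (row.bid + row.ask) / 2
```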
I don't have anything to add that is Python specific. I find optimizing code one of the great joys of programming. First, I never optimize unless I find myself saying..."This is slow. This is painful!" So first, there must be pain. Next, I will look for obvious performance bottlenecks. If after that I am still feeling pain, I will start to question my original design, sometimes puzzling for days over the problem. The key is you have to be willing to scrap code already written. I once had to optimize some JavaScript (Node.js) code and I told the product owner we needed to rewrite the entire application in GoLang and it would take 9 months. Of course I was joking...he was not amused.
See how you feel after backtesting a strategy on a day's worth of raw NASDAQ feed. OP, I use pandas extensively for backtesting and it certainly helps circumvent the shortcomings of Python. The process you outlined above is a simple and effective workflow.
I've been thinking of keeping bid/ask data around for experimentation. It grows at a rate of about 1 GB per week. I should do it...
If speed is your biggest concern...check out Dask and/or PyTorch. Vectorization on crack...if you have an NVIDIA CUDA-enabled GPU. Dask Distributed is also nice if you want to split your work across multiple machines.
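A minimal Dask sketch, assuming tick data spread across CSV files (the file pattern and column names here are hypothetical):

```python
import dask.dataframe as dd

# Hypothetical glob pattern and columns -- substitute your own tick data.
ddf = dd.read_csv("ticks_*.csv", parse_dates=["timestamp"])
ddf["date"] = ddf["timestamp"].dt.date

# Everything above is lazy; .compute() triggers parallel execution across cores.
daily_volume = ddf.groupby("date")["size"].sum().compute()

# To spread the same work across machines, point a distributed Client
# at a running scheduler (8786 is Dask's default scheduler port):
# from dask.distributed import Client
# client = Client("scheduler-address:8786")
```

The nice part is the code barely changes between one core, many cores, and many machines; Dask partitions the dataframe and schedules the work for you.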