Python Multi-processing for Backtesting

Discussion in 'App Development' started by wfeagin3, Jan 19, 2015.

  1. wfeagin3

    I am currently having trouble converting my "embarrassingly parallel" backtesting code into code that runs the backtests in parallel.

    # Code Start
    import numpy as np

    input1 = ['AAPL', 'GOOG', 'IBM', 'XOM']

    # Enumerate every [i, j, k, m] allocation whose weights sum to 10
    alloc = []
    for i in range(0, 10):
        for j in range(0, 10):
            for k in range(0, 10):
                for m in range(0, 10):
                    if i + j + k + m == 10:
                        temp = [i, j, k, m]
                        alloc.append(temp)  # list append keeps rows intact (np.append would flatten)

    input2 = np.array(alloc)

    def function(input1, input2):
        # My function code
        pass

    # Insert multiprocessing code here
    # Code End

    I have scoured internet resources and tried using their techniques. However, I do not have a programming background, so interpreting the examples has proven difficult for me.

    Can someone help me through the rest of this code?

    Thanks in advance!
     
  2. The question is not whether you have a coding background, but whether you are proficient enough in a programming language to justify investing your precious time in coding this yourself. If you are not fully proficient, then I would strongly discourage you from touching anything that remotely smells like "async code execution": parallelization of algorithms, GPU computing...

    btw, Stack Overflow is a much better place to ask this kind of question.

    Just my 2 cents.


     
  3. import multiprocessing

    def backteststuff():
        """Worker function: one independent backtest run."""
        print('Make me money')

    if __name__ == '__main__':
        jobs = []
        for i in range(5):
            p = multiprocessing.Process(target=backteststuff)
            jobs.append(p)
            p.start()
        for p in jobs:
            p.join()  # wait for every worker to finish
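
    Adapted to your allocation problem, here is a minimal sketch using multiprocessing.Pool to fan the candidate weights out across CPU cores. backtest_one and its dummy score are placeholders for your own function:

    import itertools
    import multiprocessing

    def backtest_one(weights):
        """Placeholder: run one backtest for a single [i, j, k, m] allocation."""
        return sum(weights)  # dummy score; plug your real logic in here

    if __name__ == '__main__':
        # Every allocation of 10 units across four tickers; range(11) allows a
        # full 10 in one name (use range(10) to match your original loops)
        allocs = [w for w in itertools.product(range(11), repeat=4)
                  if sum(w) == 10]
        pool = multiprocessing.Pool()  # defaults to one worker per CPU core
        results = pool.map(backtest_one, allocs)
        pool.close()
        pool.join()
        print(max(zip(results, allocs)))  # best score and its allocation

    Note the __main__ guard: on Windows each worker re-imports the module, so the Pool setup must not run at import time.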
     
  4. Or do it all in SQL, like an OP in another thread does.
    :D
     
  5. ouch :)

     
  6. wfeagin3

    volpunter:
    I am not at a high level of proficiency in programming, but I am managing to do some basic things. All I am trying to do is analyze some data and crunch some numbers. The run time takes forever, so parallel processing would speed things up. Thanks for the warning, though.
     
  7. oly

    I agree with volpunter - don't mess with multithreading if you aren't very comfortable with it already.

    You can often speed things up by profiling and optimizing instead. Indeed, you should do this before multithreading anyway; it usually leads to bigger gains as well.

    For instance, if your data is in an SQL database and you forgot to index it, profiling will tell you the DB call takes all the time. Then you find a way to fix that (e.g., add the index).
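
    For example, a minimal sketch with Python's built-in cProfile (run_backtest is a stand-in for your own entry point):

    import cProfile
    import pstats

    def run_backtest():
        """Stand-in for the backtest entry point."""
        pass

    cProfile.run('run_backtest()', 'backtest.prof')  # profile and dump stats
    stats = pstats.Stats('backtest.prof')
    stats.sort_stats('cumulative').print_stats(10)   # ten biggest time sinks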

    I have a non-threaded backtest of a straightforward ETF swing trading strategy: a basket of about 12 ETFs with common indicators, over 10 years of historical Yahoo daily data, runs in a few seconds for me.
     