Fully automated futures trading

Discussion in 'Journals' started by globalarbtrader, Feb 11, 2015.

  1. wopr

    wopr

    I had one contract each in soybeans and corn, and yesterday after the close the system computed optimal positions of 0.791 and 0.984, so I didn't sell any. However, I *strongly* considered interfering yesterday evening when the grains markets opened and selling. I read somewhere that this was a 7-sigma down day in soybeans, which happens once every many years or whatever, and it got me thinking, but I managed not to touch it.
     
    #2771     Jun 18, 2021
  2. KevinBB

    KevinBB

    Well, that was a nice week!

    My small account size makes me (relatively) a little less exposed to commodities / overexposed to currency and equities than others reading this post may be. Monday to Thursday weren't too bad, but when the Friday statement comes in it will show the biggest trend-following down day since my system started last October or so.

    I can't complain, though. Over the total portfolio (which is made up mainly of Aussie large caps and an index-based ETF), unless the short-term trend changes, this June looks like it is heading for the first down month since last September.

    These volatile weeks are good for me, because they expose all the design faults / potential design faults in the way I have implemented Rob's framework. It's not the framework, but my implementation of it. The biggest lesson for me from this week is that I need to go back and look at how I've implemented buffering. The portfolio experienced quite a few whipsaw events during the latter part of the week, and I've put that down to buffering (or the lack thereof) with a smaller number of contracts for each security.
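
    For concreteness, a minimal sketch of the sort of buffering rule I mean (the 10% band and the rounding choices are my own simplification, not the exact framework code). With only one or two contracts the band can round down to nothing, which is where the whipsaw comes from:

    Code:
        
    def buffered_position(optimal_position: float,
                          current_position: int,
                          buffer_fraction: float = 0.10) -> int:
        # Band around the optimal position; with 1-2 contracts it can round
        # to zero width, so every small move in the optimum triggers a trade.
        buffer_width = buffer_fraction * abs(optimal_position)
        lower = round(optimal_position - buffer_width)
        upper = round(optimal_position + buffer_width)
        if current_position < lower:
            return lower   # trade up, but only to the bottom edge of the band
        if current_position > upper:
            return upper   # trade down, but only to the top edge of the band
        return current_position  # inside the band: do nothing

    # e.g. optimal 0.98 while holding 1 contract -> stay put
    # buffered_position(0.98, 1) == 1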

    Still working on that.

    KH
     
    #2772     Jun 18, 2021
Quick update on this; I found that the process pool actually slowed this code down! (by about an order of magnitude)

    So basically I'm doing:

    loop over dates
    generate possible grid points for a given date

    And then this code:

    Code:
        
    from concurrent.futures import ProcessPoolExecutor
    import itertools

    grid_possibles = list(itertools.product(*grid_points))

    if use_process_pool:
        with ProcessPoolExecutor() as pool:
            # default chunksize=1: one pickled task per grid point
            results = pool.map(
                neg_return_with_risk_penalty_and_costs,
                grid_possibles,
                itertools.repeat(optimisation_parameters),
            )
    else:
        results = map(neg_return_with_risk_penalty_and_costs,
                      grid_possibles,
                      itertools.repeat(optimisation_parameters))
    
    
    Since each evaluation is doing something fairly short and simple, I think the overhead of dispatching each grid point to a worker process far exceeds the benefit of parallel execution.
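
    A toy timing sketch of what I mean (not the actual objective function, and the numbers will vary by machine): when each call is trivial, the pickling and shipping of every argument and result between processes dominates, and plain map wins easily.

    Code:
        
    import time
    from concurrent.futures import ProcessPoolExecutor

    def cheap_evaluation(x):
        # stands in for one grid-point evaluation: trivial arithmetic only
        return x * x - 3 * x

    if __name__ == "__main__":
        points = list(range(50_000))

        start = time.perf_counter()
        serial = list(map(cheap_evaluation, points))
        print("serial map:  ", time.perf_counter() - start)

        start = time.perf_counter()
        with ProcessPoolExecutor() as pool:
            parallel = list(pool.map(cheap_evaluation, points))  # default chunksize=1
        print("process pool:", time.perf_counter() - start)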

    Process pool would probably be faster if I did the pool.map on each individual date, but because I'm using a trading cost penalty I need to evaluate them in date order, knowing what the positions were yesterday. So I can't do:


    Code:
        
    with ProcessPoolExecutor() as pool:
        results = pool.map(
            find_optimal_portfolio_for_date,
            list_of_dates,
            itertools.repeat(optimisation_parameters),
        )
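
    What I'm stuck with instead looks roughly like this sketch (the helper generate_grid_points_for_date and the way yesterday's positions get threaded into optimisation_parameters are illustrative guesses, not the real research code):

    Code:
        
    import itertools

    previous_positions = None  # no position before the first date
    optimal_by_date = {}
    for date in list_of_dates:
        grid_points = generate_grid_points_for_date(date)
        grid_possibles = list(itertools.product(*grid_points))

        # the trading cost penalty is measured against yesterday's chosen positions
        optimisation_parameters.previous_positions = previous_positions

        scores = list(map(neg_return_with_risk_penalty_and_costs,
                          grid_possibles,
                          itertools.repeat(optimisation_parameters)))
        best = grid_possibles[scores.index(min(scores))]

        optimal_by_date[date] = best
        previous_positions = best  # feeds tomorrow's cost penalty, hence no date-level pool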
    
    What I did find *much* faster was ensuring the code that ran the evaluation of each point was in a single file all by itself, rather than clumped together with loads of other stuff. I'm guessing that map has to copy the entire namespace around neg_return_with_risk_penalty_and_costs for each call, which will obviously be much smaller if that function is in a file by itself. I found this out entirely by accident: I had written all my research code in a single massive file and only noticed the speed-up when refactoring it into smaller files.... which goes to prove "Refactor then optimise" is the way to go.
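
    For what it's worth, a sketch of the layout I ended up with (file names illustrative, and the mechanism is my reading: the pool pickles the function by reference, i.e. module name plus function name, so the workers only need to import the small module the function lives in, not the whole research codebase):

    Code:
        
    # objective.py -- deliberately tiny: just the function handed to pool.map
    # and whatever it directly needs, nothing else imported at module level.
    def neg_return_with_risk_penalty_and_costs(weights, optimisation_parameters):
        """Score one candidate set of positions (real body lives in the research code)."""
        ...

    # research.py -- the big, slow-to-import research code stays out of
    # objective.py; the parent process just does:
    #
    #   from objective import neg_return_with_risk_penalty_and_costs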

    GAT
     
    #2773     Jun 22, 2021
  4. Elder

    Elder

    Yes, agreed. I have very limited use cases for overlaying multiprocessing onto my code and I use it sparingly. However, it has proven to be quite useful, as you have observed, when each process has to do some serious lifting.

    The overhead of multiprocessing is explained very well here:

    https://stackoverflow.com/questions...ool-slower-than-just-using-ordinary-functions
     
    #2774     Jun 22, 2021
  5. djames

    djames

    Hey Rob, I think what you are looking for is the chunksize param to "pool.map"

    Code:
      
    grid_possibles = list(itertools.product(*grid_points))

    if use_process_pool:
        with ProcessPoolExecutor() as pool:
            results = pool.map(
                neg_return_with_risk_penalty_and_costs,
                grid_possibles,
                itertools.repeat(optimisation_parameters),
                # chunksize must be an integer, and at least 1
                chunksize=max(1, len(grid_possibles) // num_processes),
            )
    
    
    Then, as you say, each worker will be fed a large batch of grid points at a time, rather than one pickled task per grid point. Weirdly the default is chunksize=1, which is surely pants.
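
    A hand-rolled illustration of what a bigger chunksize changes (the real batching lives inside concurrent.futures; num_processes here is just whatever worker count you settle on):

    Code:
        
    import math

    def chunks(seq, chunksize):
        for i in range(0, len(seq), chunksize):
            yield seq[i:i + chunksize]

    num_processes = 8
    work = list(range(100))                # stand-in for grid_possibles
    size = max(1, math.ceil(len(work) / num_processes))
    print([len(c) for c in chunks(work, size)])   # e.g. [13, 13, ..., 9] -> 8 batches instead of 100 tasks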
     
    #2775     Jun 22, 2021
  6. Thanks will try this.

    Out of practice with this stuff! It's been nearly 8 years since I was playing with AHL's massive research cluster....

    GAT
     
    #2776     Jun 22, 2021
Yeah, that works well; I experimented with num_processes and anything between 4 and 16 performs about the same; I used 8 since that is the number of cores I've got.

    GAT
     
    #2777     Jun 22, 2021
  8. Kernfusion

    Kernfusion

    Not an expert on how to do it in Python, but in general, yes: if it spins up / kills a new process every time, that sounds expensive, and copying large amounts of data to/from the workers might overwhelm the benefits of parallel processing.
    So if it's possible to pre-load some static data into every process, pass only the small changing parameters for each subsequent computation (perhaps also in bulk), and keep reusing the same running workers, that might help.
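
    One way to do that in Python (names below are illustrative, not from the actual code here): ProcessPoolExecutor accepts an initializer that runs once in each worker process, so the static data crosses the pipe once and only the small per-task arguments get pickled afterwards.

    Code:
        
    from concurrent.futures import ProcessPoolExecutor

    _static_params = None  # set once inside each worker process

    def _init_worker(params):
        global _static_params
        _static_params = params          # copied into the worker once, then reused

    def evaluate_point(point):
        # only `point` is sent per task; the big static stuff is already local
        return sum(point) * _static_params["scale"]

    def run_all(points, params):
        with ProcessPoolExecutor(initializer=_init_worker, initargs=(params,)) as pool:
            return list(pool.map(evaluate_point, points, chunksize=256))

    if __name__ == "__main__":
        print(run_all([(1, 2), (3, 4)], {"scale": 2.0}))  # -> [6.0, 14.0]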
     
    Last edited: Jun 22, 2021
    #2778     Jun 22, 2021
  9. Elder

    Elder

    If it's working well you probably don't need to change anything, but FWIW you can also pass the max_workers param at instantiation to avoid the overhead of spinning up too many workers, as in:
    with ProcessPoolExecutor(max_workers=cpu_count) as pool:

    The optimal setting is trial and error, though; if there is a lot of waiting to read/write data, your optimum may be slightly higher than cpu_count.
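
    (Assuming cpu_count above is just a variable, one common way to set it is from the standard library:)

    Code:
        
    import os
    from concurrent.futures import ProcessPoolExecutor

    max_workers = os.cpu_count() or 1   # os.cpu_count() can return None
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        ...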
     
    #2779     Jun 23, 2021
  10. Back to the drawing board

    https://qoppac.blogspot.com/2021/06/optimising-portfolios-for-small.html

    "This was a cool idea! And I enjoyed writing the code, and learning a few things about doing more efficient grid searches in Python.

    But it doesn't seem to add any value compared to the much simpler approach of just trading everything and rounding the positions. And for such a hugely complex additional process, it needed to add significant value to make it worth doing.

    In the next post I'll try another approach: using a formal 'static' optimisation to select the best group of instruments to trade for a given amount of capital."

    GAT
     
    #2780     Jun 25, 2021
    Kernfusion and wopr like this.