Hide and Seek AI shows Emergent Tool Use

Discussion in 'Automated Trading' started by Sprout, Jan 25, 2021.

  1. Sprout


    “Machine learning has come a long way in the last decade, as it turned out that throwing huge wads of computing power at piles of linear algebra actually made creating artificial intelligence relatively easy. OpenAI have been working in the field for a while now, and recently observed some exciting behaviour in a hide-and-seek game they built.

    The game itself is simple; two teams of AI bots play a game of hide-and-seek, with the red bots being rewarded for spotting the blue ones, and the blue ones being rewarded for avoiding their gaze. Initially, nothing of note happens, but as the bots randomly run around, they slowly learn. Over millions of trials, the seekers first learn to find the hiders, while the hiders respond by building barriers to hide behind. The seekers then learn to use ramps to loft over them, while the blue bots learn to bend the game’s physics and throw them out of the playfield. It ends with the seekers learning to skate around on blocks and the hiders building tight little barriers. It’s a continual arms race of techniques between the two sides, organically developed as the bots play against each other over time.

    It’s a great study, and particularly interesting to note how much longer it takes behaviours to develop when the team switches from a basic fixed scenario to a changeable world with more variables.”

    https://hackaday.com/2021/01/25/hide-and-seek-ai-shows-emergent-tool-use/
     
  2. Sprout


    The video in the article is fascinating to watch.

    Perhaps it’s just a matter of time before this gets adapted to trading.
     
  3. userque


    Saw this a year or two ago. Looks like standard genetic programming...not new to market algos.
     
  4. If I liked coding more, I would already be on it.
     
  5. Coursera has had a class from NYU on reinforcement learning in finance for a while. In trading, though, a reinforcement learning algorithm faces the same problem a human does: how much to discount new information when deciding whether to change policy.
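
     To make the discounting point concrete, here is a minimal sketch of the standard incremental update used in many reinforcement learning methods, where a step size controls how far each new observation moves the running estimate. The reward stream, the step sizes, and the function names are all made up for illustration. None of it comes from the Coursera course or the OpenAI study.

```python
def update_estimate(estimate, observation, step_size):
    """Move the running estimate a fraction `step_size` toward the new observation."""
    return estimate + step_size * (observation - estimate)


def track(observations, step_size, initial=0.0):
    """Return the running estimate after each observation."""
    estimates = []
    estimate = initial
    for obs in observations:
        estimate = update_estimate(estimate, obs, step_size)
        estimates.append(estimate)
    return estimates


# Hypothetical reward stream: a regime paying +1.0 flips to -1.0 halfway through.
rewards = [1.0] * 20 + [-1.0] * 20

slow = track(rewards, step_size=0.05)  # heavily discounts new information
fast = track(rewards, step_size=0.5)   # reacts quickly, but would also chase noise

# Index 24 is the fifth observation after the flip.
print(f"5 steps after the flip: slow={slow[24]:+.2f}, fast={fast[24]:+.2f}")
```

     Five steps after the flip, the small step size still leaves the estimate positive, while the large one has already turned it negative. On a noisy stream the large step size would also overreact to one-off observations, which is the trade-off described above.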
     
  6. New information as in news? That is, I believe, the hardest part.