RPS.bot: Opponent Modeling Tutorial

For the tutorial, suppose that you are entering 200-game matches in various scenarios. The scenarios lead to building a progressively more complex bot as you encounter progressively more difficult challenges.

Beat a fixed simple unknown opponent

In practice, a lot of value will come from exploiting weak bots or other opponents.

If you know that your opponent is fixed and has a simple strategy, then you can:

  • Start by playing randomly
  • Observe what the opponent plays
  • Once you see what they’re doing in the first few games, play the move that beats it

Automate this

from collections import Counter

def detect_fixed_strategy(history, n_games=10):
    """
    Helper function to detect if opponent is using a fixed strategy.
    history: list of (our_move, opponent_move) tuples, oldest first (assumed layout)
    Returns: (is_fixed, dominant_move) or (False, None)
    """
    if len(history) < n_games:
        return False, None

    # Opponent's moves over the last n_games
    recent_moves = [move[1] for move in history[-n_games:]]
    move_counts = Counter(recent_moves)

    # If any move appears > 90% of the time
    for move, count in move_counts.items():
        if count / n_games > 0.9:
            return True, move

    return False, None
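As a rough sketch of how the helper could drive a bot, the function below plays randomly until a fixed strategy is detected and then plays the counter. The names `BEATS` and `choose_move`, the single-letter move encoding ("R", "P", "S"), and the idea of passing the detector in as a parameter are assumptions made for illustration, not part of the original bot.

```python
import random

# BEATS[x] is the move that beats x (assumed single-letter encoding).
BEATS = {"R": "P", "P": "S", "S": "R"}

def choose_move(history, detector, rng=random):
    """Pick our next move from a history of (our_move, opponent_move) pairs.

    `detector` is any callable with the same return shape as
    detect_fixed_strategy: (is_fixed, dominant_move).
    """
    is_fixed, dominant_move = detector(history)
    if is_fixed:
        # The opponent looks fixed: play the move that beats their dominant move.
        return BEATS[dominant_move]
    # Not enough evidence yet: play uniformly at random.
    return rng.choice("RPS")
```

In use, you would call `choose_move(history, detect_fixed_strategy)` once per game and append the resulting pair of moves to `history`.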

Beat multiple opponents with one bot

Suppose that you have three sequential opponents:

  1. Rock only
  2. Beat last move
  3. Beat last move

What happens if we run our last bot in this scenario? We’d crush the Rock only bot, but then get crushed ourselves by the Beat last move bots. Our EV would be 100 - 99 - 99 = -98!

We could do much better by targeting the Beat last move bots instead of the Rock only bot. To beat them, we’d play:

  • If our last move was Paper, the opponent plays Scissors, so we play Rock
  • If our last move was Rock, the opponent plays Paper, so we play Scissors
  • If our last move was Scissors, the opponent plays Rock, so we play Paper

This means that if we play Rock on the first move, we’d forever play RSPRSPRSP… (similarly if we played Paper on the first move, we’d forever play PRSPRSPRS…)

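If we cycle like this, we break even vs. Rock only: each three-game cycle gives one tie, one loss, and one win. So this bot exploits the Beat last move bots and is not exploited by Rock only, but it does not fully exploit Rock only either.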
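The cycling argument above can be checked with a small simulation. This is a minimal sketch under stated assumptions: moves are encoded as "R", "P", "S"; the Beat last move bot opens with Rock (the tutorial does not say what it plays first); and the names `score`, `rock_only`, `beat_last_move`, `cycle_bot`, and `play_match` are invented for illustration.

```python
BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that beats x

def score(ours, theirs):
    """Return +1 if we win, -1 if we lose, 0 on a tie."""
    if ours == theirs:
        return 0
    return 1 if BEATS[theirs] == ours else -1

def rock_only(our_history):
    return "R"

def beat_last_move(our_history):
    # Assumed opening move: Rock. Afterwards, beat our previous move.
    return BEATS[our_history[-1]] if our_history else "R"

def cycle_bot(our_history):
    # Open with Rock, then beat the move that beats our own last move.
    if not our_history:
        return "R"
    return BEATS[BEATS[our_history[-1]]]

def play_match(opponent, n_games=200):
    """Total score for cycle_bot against `opponent` over n_games."""
    our_history, total = [], 0
    for _ in range(n_games):
        theirs = opponent(our_history)  # opponent sees only our past moves
        ours = cycle_bot(our_history)
        total += score(ours, theirs)
        our_history.append(ours)
    return total
```

Under these assumptions, `play_match(beat_last_move)` wins every game after the first, while `play_match(rock_only)` stays within one point of zero because 200 is not a multiple of the three-game cycle.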

Beat multiple opponents with one bot

  1. Rock only
  2. Beat last move
  3. Equilibrium bot

That seems difficult to beat! Although worth thinking about, let’s prioritize beating the simpler bots that we already know how to beat.
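To see why the Equilibrium bot is hard to beat, it may help to compute the expected value of each of our moves against it. The sketch below assumes the Equilibrium bot plays Rock, Paper, and Scissors with probability 1/3 each (the equilibrium strategy for RPS); `ev_against` and `uniform` are illustrative names.

```python
from fractions import Fraction

BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that beats x

def ev_against(our_move, opp_probs):
    """Expected score of our_move (+1 win, -1 loss, 0 tie) against a mixed strategy."""
    ev = Fraction(0)
    for opp_move, p in opp_probs.items():
        if BEATS[opp_move] == our_move:
            ev += p  # we beat this opponent move
        elif BEATS[our_move] == opp_move:
            ev -= p  # this opponent move beats us
    return ev

# Assumed equilibrium strategy: each move with probability 1/3.
uniform = {m: Fraction(1, 3) for m in "RPS"}
evs = {m: ev_against(m, uniform) for m in "RPS"}
```

Every entry of `evs` is zero, so no choice of moves gains anything against this opponent; the value of a bot comes from the other, exploitable opponents.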

Beat multiple opponents with one bot

  1. Rock only
  2. Beat last move
  3. Final bossbot

Assume that the final bossbot keeps a dictionary recording which action you played after each of the following action pairs from the previous game, and then plays the move that beats the action it expects from you next.

RR PP SS RP RS PR PS SR SP
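One possible reading of this description is sketched below. Several details are assumptions rather than facts from the tutorial: each pair is taken to be (your move, bossbot’s move) from the previous game, the bossbot predicts your most frequent follow-up for that pair, and it plays randomly when it has no data. The class name `BossBot` and its methods are hypothetical.

```python
import random
from collections import Counter, defaultdict

BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that beats x

class BossBot:
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        # Maps a previous-game pair such as "RP" to counts of the player's next move.
        self.follow_ups = defaultdict(Counter)
        self.last_pair = None

    def move(self):
        counts = self.follow_ups.get(self.last_pair)
        if not counts:
            # No data for this pair yet (assumed fallback): play randomly.
            return self.rng.choice("RPS")
        predicted = counts.most_common(1)[0][0]
        return BEATS[predicted]

    def observe(self, player_move, boss_move):
        # Record what the player did after the previous pair, then update the pair.
        if self.last_pair is not None:
            self.follow_ups[self.last_pair][player_move] += 1
        self.last_pair = player_move + boss_move
```

A bot like this punishes any player whose next move depends predictably on the previous game, which includes the cycling strategy used against Beat last move.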

That seems difficult to beat! Although worth thinking about, let’s prioritize beating the simpler bots that we already know how to beat.

Read which opponent you’re playing and switch to ensemble selection

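Read the history, look at the last 10 games, and determine whether you are playing against opponent A, B, or C. Then switch to the strategy that beats that opponent.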
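A minimal sketch of this identification step follows. It assumes the history is a list of (our_move, opponent_move) pairs, that the known opponents are Rock only and Beat last move, and that a candidate must explain at least 90% of the last 10 games to be accepted. The names `CANDIDATES` and `identify_opponent` and the 90% threshold are illustrative choices.

```python
BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that beats x

def predict_rock_only(prev_game):
    return "R"

def predict_beat_last_move(prev_game):
    # prev_game is (our_move, opponent_move) from the previous game.
    return BEATS[prev_game[0]]

CANDIDATES = {
    "rock_only": predict_rock_only,
    "beat_last_move": predict_beat_last_move,
}

def identify_opponent(history, window=10, threshold=0.9):
    """Name the candidate that best explains the last `window` opponent moves."""
    if len(history) < window + 1:
        return "unknown"
    recent = history[-(window + 1):]  # one extra game to supply the first "previous game"
    best_name, best_rate = "unknown", 0.0
    for name, predict in CANDIDATES.items():
        hits = sum(
            predict(recent[i - 1]) == recent[i][1]
            for i in range(1, len(recent))
        )
        rate = hits / window
        if rate > best_rate:
            best_name, best_rate = name, rate
    return best_name if best_rate >= threshold else "unknown"
```

When the result is "unknown", a reasonable default is to keep playing randomly, since that cannot be exploited while you gather more evidence.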

GTO deviation

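Assume by default that the opponent is playing approximately GTO (game theory optimal, which in RPS means each move with probability 1/3), then detect deviations and play accordingly. One way to track the deviations is a Dirichlet prior over the opponent’s move frequencies.

What’s the weakness here? This only detects probabilities, not patterns.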
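The Dirichlet idea can be sketched as follows: start from a symmetric prior centered on 1/3 for each move, update it with observed counts, and deviate from random play only when the estimated edge is large enough. The prior strength of 10 pseudo-counts per move, the 0.05 margin, and the function names are assumptions chosen for illustration.

```python
import random
from collections import Counter

BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that beats x

def posterior_mean(opp_moves, prior_strength=10.0):
    """Mean of a Dirichlet posterior with a symmetric prior over R, P, S."""
    counts = Counter(opp_moves)
    total = len(opp_moves) + 3 * prior_strength
    return {m: (counts[m] + prior_strength) / total for m in "RPS"}

def expected_value(our_move, probs):
    # We win against the move we beat and lose to the move that beats us.
    beaten = next(m for m in "RPS" if BEATS[m] == our_move)
    return probs[beaten] - probs[BEATS[our_move]]

def gto_deviation_move(opp_moves, margin=0.05, rng=random):
    """Exploit only when the estimated edge exceeds `margin`; otherwise play randomly."""
    probs = posterior_mean(opp_moves)
    best = max("RPS", key=lambda m: expected_value(m, probs))
    if expected_value(best, probs) > margin:
        return best
    return rng.choice("RPS")
```

A stronger prior makes the bot slower to deviate but harder to bait with a short run of biased moves. As the text notes, this only models frequencies: an opponent cycling R, P, S looks perfectly uniform to it.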

N-gram

  • Build N-gram models of the opponent: predict their next move from what they played after the same recent history before
  • Always ask: does this model work? If it does, play to beat its prediction
  • Look at a bunch of models, see which has the best predictive power against this opponent, and play that one
  • How to parametrize the opponent
  • Keep multiple hypotheses, run them counterfactually, and pick the best one
  • Pattern hypotheses, compare to general pattern about something else
  • Iocaine Powder
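A minimal sketch of an N-gram predictor with counterfactual model selection is shown below. It looks only at the opponent’s own moves, tries a few N-gram orders, scores each by how well it would have predicted the past, and plays to beat the best one. The function names and the candidate orders (1, 2, 3) are assumptions; this is not the Iocaine Powder algorithm.

```python
from collections import Counter, defaultdict

BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that beats x

def ngram_predict(opp_moves, n=3):
    """Predict the next move from what followed the same last n-1 moves before."""
    context_len = n - 1
    if len(opp_moves) <= context_len:
        return None
    table = defaultdict(Counter)
    for i in range(context_len, len(opp_moves)):
        context = tuple(opp_moves[i - context_len:i])
        table[context][opp_moves[i]] += 1
    current = tuple(opp_moves[len(opp_moves) - context_len:])
    if current not in table:
        return None  # this context has never been seen before
    return table[current].most_common(1)[0][0]

def hit_rate(opp_moves, n):
    """Fraction of past moves the model would have predicted from earlier moves only."""
    hits = trials = 0
    for t in range(1, len(opp_moves)):
        guess = ngram_predict(opp_moves[:t], n)
        if guess is not None:
            trials += 1
            hits += guess == opp_moves[t]
    return hits / trials if trials else 0.0

def best_response(opp_moves, orders=(1, 2, 3)):
    """Beat the prediction of the order with the best hit rate, or return None."""
    best_n = max(orders, key=lambda n: hit_rate(opp_moves, n))
    guess = ngram_predict(opp_moves, best_n)
    return BEATS[guess] if guess is not None else None
```

When `best_response` returns None, or when the best hit rate is close to 1/3, the model has no real predictive power and random play is the safer choice.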

Challenges

See our Challenge page to get your bots into action in our:

  • Rock Practice Gym
  • Daily Challenge
  • Weekly Challenge
  • One-off Hackathons

There’s a special Weekly Challenge for first-timers.