Summary Students build their own regret matching algorithm to learn the Nash equilibrium of Rock, Paper, Scissors through self-play.
Topics Regret matching, game theory, self-play, Nash equilibrium.
Audience Undergraduate students and above in STEM fields.
Difficulty Relatively challenging. Good students can complete the notebook within one hour during a seminar. Most students will make some progress in that time
Strengths Students experience writing their own algorithm, which should deepen their understanding of how regret matching works, and more broadly how to go from AI theory to application. Students will also learn about game theory.
Weaknesses May be difficult for students without python programming experience. Students do not get the full experience of algorithm design because the exercise is scaffolded by telling the students which functions they will need and supplying some pseudocode for those functions.
Dependencies Some basic python programming experience is needed. Light familiarity with game theory is preferred. Students will need python (3.6 or above) and the ability to run Jupyter notebooks.
Variants Within the assignment we suggest a follow-on project of generalising the algorithm to more games. The assignment could be adapted to make it more difficult by removing the pseudocode and/or the function definitions. And we welcome other instructors implementing the same exercise in different languages. The design follows test driven development, which we feel is a good design for student exercises as they get immediate feedback on whether their code is correct.