In psychology, game theory, statistics, and machine learning, win-stay, lose-switch (also win-stay, lose-shift) is a heuristic learning strategy used to model learning in decision situations. It was first devised as an improvement over randomization in bandit problems, and was later applied to the prisoner's dilemma to model the evolution of altruism.
The learning rule bases its decision only on the outcome of the previous play. Outcomes are divided into successes (wins) and failures (losses). If the play in the previous round resulted in a success, the agent plays the same action in the next round. If instead the play resulted in a failure, the agent switches to another action.
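The rule can be illustrated with a minimal Python sketch. The names `next_action` and `simulate`, the Bernoulli bandit setting, the random first action, and the uniform choice among the remaining actions after a loss are all assumptions made for illustration; the rule itself only requires that the agent stay after a win and switch after a loss.

```python
import random


def next_action(action, won, n_actions, rng):
    """Return the action for the next round under win-stay, lose-switch.

    Requires n_actions >= 2, otherwise there is nothing to switch to.
    """
    if won:
        return action  # win: stay with the same action
    # lose: switch; choosing uniformly among the others is an assumption
    others = [a for a in range(n_actions) if a != action]
    return rng.choice(others)


def simulate(win_probs, rounds, seed=0):
    """Play a hypothetical Bernoulli bandit; return (action, won) pairs."""
    rng = random.Random(seed)
    action = rng.randrange(len(win_probs))  # assumed: random first action
    history = []
    for _ in range(rounds):
        # A round is a win with the probability assigned to the chosen arm.
        won = rng.random() < win_probs[action]
        history.append((action, won))
        action = next_action(action, won, len(win_probs), rng)
    return history


if __name__ == "__main__":
    history = simulate(win_probs=[0.2, 0.8], rounds=1000)
    wins = sum(won for _, won in history)
    print(f"win rate: {wins / len(history):.2f}")
```

In this sketch the agent keeps no memory beyond the last action and its outcome. With two arms it tends to spend more rounds on the arm that wins more often, simply because it leaves that arm less frequently.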
A large-scale empirical study of players of the game rock, paper, scissors suggests that variations of this strategy are adopted by real-world players, rather than the Nash equilibrium strategy of choosing entirely at random among the three options.
See also
- Bounded rationality
Source of the article: Wikipedia