Specification gaming: the flip side of AI ingenuity

EdS · April 23, 2020, 2:02pm

Training machine learning systems can give rise to unexpected results: the system can find cheats, unexpected approaches which score well but miss the point. There’s a spreadsheet listing 60 examples linked in the article.

Two examples which could have implications for AI safety:

Agent kills itself at the end of level 1 to avoid losing in level 2
Self-driving car rewarded for speed learns to spin in circles
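
To make the second example concrete, here is a toy sketch in Python. It is not the actual experiment from the spreadsheet: the point-mass "car", the two hand-written policies, and all names and numbers are made up for illustration. It also assumes the spinning policy can sustain a higher speed than the cautious straight one. The reward pays for speed only, so the policy that goes nowhere scores higher.

```python
import math

def rollout(policy, steps=100, dt=0.1):
    """Simulate a point 'car'; return (speed reward, net distance from start)."""
    x = y = heading = 0.0
    reward = 0.0
    for _ in range(steps):
        speed, turn_rate = policy()
        heading += turn_rate * dt
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        reward += speed * dt  # proxy reward: pays for speed, not for getting anywhere
    return reward, math.hypot(x, y)

def drive_straight():
    # Hypothetical "intended" behaviour: moderate speed, no turning.
    return 1.0, 0.0

def spin_in_circles():
    # Hypothetical "cheat": higher speed (an assumption) with constant turning.
    return 2.0, 2 * math.pi

straight_reward, straight_progress = rollout(drive_straight)
spin_reward, spin_progress = rollout(spin_in_circles)

print(f"straight: reward={straight_reward:.1f}, progress={straight_progress:.1f}")
print(f"spinning: reward={spin_reward:.1f}, progress={spin_progress:.1f}")
```

In this sketch the spinning policy collects about twice the reward while ending up essentially where it started. Rewarding net progress toward a goal instead of raw speed would remove this particular cheat, though it may open up others.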