Representation-Induced Algorithmic Bias: an empirical assessment of behavioural equivalence over 14 reinforcement learning algorithms across 4 Isomorphic Gameform representations
Version 2 2023-08-18, 03:54
Version 1 2023-05-22, 20:28
In conceiving of autonomous agents able to employ adaptive cooperative behaviours, we identify the need to effectively assess the equivalence of agent behaviour under conditions of external change. Reinforcement learning algorithms rely on input from the environment as the sole means of informing, and so reifying, internal state. This paper investigates the assumption that isomorphic representations of an environment will lead to equivalent behaviour. To test this equivalence assumption, we analyse the variance between behavioural profiles in a set of agents using fourteen foundational reinforcement learning algorithms across four isomorphic representations of the classical Prisoner’s Dilemma gameform. A behavioural profile consists of the aggregated episode-mean distributions of the game outcomes CC, CD, DC, and DD generated from the symmetric self-play repeated stage game across a two-axis sweep of input parameters: the principal learning rate, α, and the discount factor, γ. The sweep provides 100 observations of the frequency of the four game outcomes, per algorithm, per gameform representation. Equivalence is indicated by low variance between any two behavioural profiles generated by a single algorithm. Despite the representations being theoretically equivalent, analysis reveals significant variance in the behavioural profiles of the tested algorithms at both aggregate and individual outcome scales. Given this result, we infer that the isomorphic representations tested in this study are not necessarily equivalent with respect to the induced reachable space made available to any particular algorithm, which in turn can lead to unexpected agent behaviour. Therefore, we conclude that structure-preserving operations applied to environmental reward signals may introduce a vector for algorithmic bias.
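The profiling procedure described in the abstract can be illustrated with a minimal sketch. It is not the study's code, and every specific in it is an assumption made for illustration: the payoff values (T=5, R=3, P=1, S=0), the stateless epsilon-greedy Q-learner standing in for one of the fourteen algorithms, the 10 × 10 grid used to obtain 100 (α, γ) observations, the positive affine transform used as a hypothetical isomorphic representation, and the mean squared difference used as a simple stand-in for the variance measure. All function names are hypothetical.

```python
import random
from statistics import mean

# Assumed canonical Prisoner's Dilemma payoffs (row player, column player).
BASE = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
        ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
OUTCOMES = ["CC", "CD", "DC", "DD"]


def affine(payoffs, scale, shift):
    """Hypothetical isomorphic representation: an order-preserving
    (scale > 0) affine transform applied to every payoff."""
    return {k: tuple(scale * v + shift for v in pay)
            for k, pay in payoffs.items()}


def run_episode(payoffs, alpha, gamma, rng, steps=200, epsilon=0.1):
    """Two stateless epsilon-greedy Q-learners in symmetric self-play.
    Returns the frequency of each joint outcome over the episode."""
    tables = [{"C": 0.0, "D": 0.0}, {"C": 0.0, "D": 0.0}]
    counts = dict.fromkeys(OUTCOMES, 0)
    for _ in range(steps):
        acts = []
        for table in tables:
            if rng.random() < epsilon:
                acts.append(rng.choice("CD"))              # explore
            else:
                acts.append(max("CD", key=table.get))      # exploit
        rewards = payoffs[tuple(acts)]
        for table, act, reward in zip(tables, acts, rewards):
            target = reward + gamma * max(table.values())
            table[act] += alpha * (target - table[act])
        counts["".join(acts)] += 1
    return {o: counts[o] / steps for o in OUTCOMES}


def behavioural_profile(payoffs, grid=10, episodes=5, seed=0):
    """Episode-mean outcome frequencies at each (alpha, gamma) grid point;
    a 10 x 10 grid gives 100 observations (grid layout is assumed)."""
    profile = []
    for i in range(1, grid + 1):
        for j in range(grid):
            alpha, gamma = i / grid, j / grid
            runs = [run_episode(payoffs, alpha, gamma, random.Random(seed + e))
                    for e in range(episodes)]
            profile.append({o: mean(r[o] for r in runs) for o in OUTCOMES})
    return profile


def profile_gap(p, q):
    """Mean squared difference per outcome between two profiles."""
    return {o: mean((a[o] - b[o]) ** 2 for a, b in zip(p, q))
            for o in OUTCOMES}


base = behavioural_profile(BASE)
shifted = behavioural_profile(affine(BASE, scale=1.0, shift=-5.0))
print(profile_gap(base, shifted))
```

Both profiles are generated with the same random seeds, so any gap between them in this sketch arises from the change of payoff representation rather than from sampling noise. A gap near zero for every outcome would indicate behavioural equivalence under this stand-in measure.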
History
Publication title
AI 2021: Advances in Artificial Intelligence. 34th Australasian Joint Conference, AI 2021, Sydney, NSW, Australia, February 2–4, 2022, Proceedings