University of Tasmania

Cooperative intent: an exploration of computational learning in a discrete preference space

Thesis
Posted on 2024-06-28, authored by Simon Stanton

This thesis is an exploration of cooperation, game theory, and machine learning. The aim is to develop a method for profiling cooperative intent as found in the behaviour of learning algorithms over highly constrained games of cooperation, as a step towards the future use of cooperation-as-policy in the reinforcement learning domain and a contribution to the theory of grounding emergent software systems.
The fundamental question explored is whether an agent can construct an online model of its behaviour, derived from a strictly ordinal 2 × 2 preference space, to bridge numerical reward with the symbolism inherent to descriptive game theory models. Identification of a game model has utility in assessing the current state of a system, the current state of an agent, and an agent's cooperative intent, i.e., the cooperatively-contextualised reification of an agent's internal state representation, or state-of-mind congruent to action. In doing this, the benefits and risks inherent in the anthropomorphisation of machine systems are placed within a mitigatory scope: intent is defined, ultimately, as the reification of a trajectory between states that an agent may experience, where a state is found by mapping the agent's behaviour to one of the game models in the strictly ordinal preference space. This leads to the hypothesis that, by nominating target states and rewarding agents for reaching those targets, the application (via reflection) of cooperation as a policy instrument in machine learning algorithms may deliver beneficial outcomes to the agent and to the system as a whole.
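To illustrate what a strictly ordinal 2 × 2 preference space looks like, the Python sketch below represents a symmetric game by a strict ranking of the four joint outcomes and names the rankings that correspond to familiar game models. The canonical orderings and labels are standard in the game theory literature; the mapping function itself is a hypothetical illustration and not the method developed in the thesis.

from itertools import permutations

# A strictly ordinal symmetric 2 x 2 game is determined by a strict ranking
# of the four joint outcomes: R (mutual cooperation), T (temptation),
# P (mutual defection), and S (sucker's payoff).
# Canonical best-to-worst orderings for three well-known game models:
CANONICAL = {
    ("T", "R", "P", "S"): "Prisoner's Dilemma",  # T > R > P > S
    ("R", "T", "P", "S"): "Stag Hunt",           # R > T > P > S
    ("T", "R", "S", "P"): "Chicken",             # T > R > S > P
}

def game_model(ranks: dict) -> str:
    """Map a strict ordinal ranking (4 = best .. 1 = worst) of the four
    outcomes to a named game model, where one exists."""
    ordering = tuple(sorted(ranks, key=ranks.get, reverse=True))
    return CANONICAL.get(ordering, f"unnamed ordinal game {ordering}")

if __name__ == "__main__":
    # The full strictly ordinal space for a symmetric game comprises the
    # 4! = 24 rankings of {R, T, P, S}; enumerate them, naming familiar ones.
    for perm in permutations(("R", "T", "P", "S")):
        ranks = {outcome: 4 - i for i, outcome in enumerate(perm)}
        print(perm, "->", game_model(ranks))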
This thesis develops through three sets of experiments that progressively undergo constraint devolution, such that the rules and bounds of imperfect- and incomplete-information are (in part) relaxed while remaining based in Markov formalism. The initial set of experiments conducts a multi-model tournament to calculate a mutual cooperation rate metric on the performance of algorithms drawn from the game theory, bandit, and foundational reinforcement learning literature. The second set of experiments examines variance in the behaviour of agents under specific conditions of change and confirms that foundational reinforcement learning algorithms are sensitive to isomorphic changes in input; this finding guides the third set of experiments, which develops a method for reflective-mapping, i.e., the identification of game models in sequential 2 × 2 matrix games in which agents are constrained to perfect- and incomplete-information visibility. For single agents, this method maps into the strictly ordinal preference space with a one-in-twelve resolution of the correct game model. An external observer, with the relaxed constraint of complete-information, achieves a singleton mapping into the strictly ordinal space. A singleton mapping is the recognition that an agent's actions correspond to the strategic dynamics of the game model constituting the environment.
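The abstract does not spell out how the mutual cooperation rate is computed; the sketch below shows one plausible reading of such a metric, the fraction of rounds in a repeated 2 × 2 game that end in mutual cooperation, offered purely as an illustration of the kind of measurement involved rather than the tournament's exact definition.

from typing import Iterable, Tuple

C, D = "C", "D"  # cooperate / defect

def mutual_cooperation_rate(history: Iterable[Tuple[str, str]]) -> float:
    """Fraction of rounds in which both agents chose the cooperative action."""
    rounds = list(history)
    if not rounds:
        return 0.0
    mutual = sum(1 for a, b in rounds if a == C and b == C)
    return mutual / len(rounds)

if __name__ == "__main__":
    # A ten-round history containing four rounds of mutual cooperation.
    history = [(C, C), (C, D), (D, D), (C, C), (D, C),
               (C, C), (D, D), (C, C), (C, D), (D, D)]
    print(f"mutual cooperation rate: {mutual_cooperation_rate(history):.2f}")  # 0.40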
These experiments are conducted in the context of concepts of cooperation as developed in the biological sciences, and of their use in computational learning. This thesis attempts to place machine cooperation as a cooperative mode that is at once like, and unlike, biological cooperation, framed as the latter is by theories of evolution by which machines need not abide. Constraints that apply to biological modes of cooperation may therefore not apply in an algorithmic context, as machine behaviour can manifest in unexpected, and seemingly irrational, ways. To gain traction on these problems, a descriptivist approach to game theory is pursued in an ethological manner, in order to empirically assess agent behaviour with respect to cooperative dynamics.

History

Sub-type

  • PhD Thesis

Pagination

xv, 239 pages

Department/School

School of Information and Communication Technology

Publisher

University of Tasmania

Event title

Graduation

Date of Event (Start Date)

2024-03-20

Rights statement

Copyright 2024 the author
