File(s) under permanent embargo
Concurrent Q-Learning: Reinforcement Learning for Dynamic Goals and Environments
Journal contribution, posted on 2023-05-16 17:30, authored by Robert Ollington and P. W. Vamplew
This article presents a powerful new algorithm for reinforcement learning in problems where both the goals and the environment may change. The algorithm is completely goal independent, allowing the mechanics of the environment to be learned independently of the task being undertaken. Conventional reinforcement learning techniques, such as Q-learning, are goal dependent: when the goal or reward conditions change, previous learning interferes with the new task being learned, resulting in very poor performance. Previously, the Concurrent Q-Learning algorithm was developed, based on Watkins' Q-learning, which learns the relative proximity of all states simultaneously. This learning is completely independent of the reward experienced at those states and, through a simple action selection strategy, may be applied to any given reward structure. Here it is shown that the extra information obtained may be used to replace the eligibility traces of Watkins' Q-learning, allowing many more value updates to be made at each time step. The new algorithm is compared to the previous version and also to DG-learning in tasks involving changing goals and environments. The new algorithm is shown to perform significantly better than these alternatives, especially in situations involving novel obstructions. The algorithm adapts quickly and intelligently to changes in both the environment and the reward structure, and does not suffer interference from training undertaken prior to those changes.
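The core idea of goal-independent learning can be illustrated with a minimal tabular sketch. This is not the authors' published algorithm; it is an illustrative toy (all names and parameters are assumptions) showing how a single transition can update discounted-proximity estimates toward every possible goal state at once, so that a change of goal requires no retraining:

```python
import random

# Illustrative sketch only: a tabular, goal-independent learner on a
# 1-D corridor. For every observed transition (s, a, s') we update the
# estimated discounted proximity Q[s][a][g] toward EVERY candidate goal
# state g concurrently, so no learning is tied to one reward structure.

N = 8               # corridor states 0..N-1 (hypothetical toy environment)
GAMMA = 0.9         # discount; Q[s][a][g] approaches GAMMA**(steps_to_g - 1)
ACTIONS = (-1, +1)  # move left / move right

Q = [[[0.0] * N for _ in ACTIONS] for _ in range(N)]

def step(s, a):
    """Deterministic corridor dynamics with walls at both ends."""
    return min(N - 1, max(0, s + ACTIONS[a]))

random.seed(0)
s = random.randrange(N)
for _ in range(20000):
    a = random.randrange(len(ACTIONS))        # pure random exploration
    s2 = step(s, a)
    for g in range(N):                        # concurrent update, all goals
        if s2 == g:
            target = 1.0                      # this action reaches goal g
        else:
            target = GAMMA * max(Q[s2][b][g] for b in range(len(ACTIONS)))
        Q[s][a][g] = max(Q[s][a][g], target)  # deterministic world: keep best
    s = s2

def greedy_action(s, g):
    """Action selection for an arbitrary goal g, with no goal-specific training."""
    return max(range(len(ACTIONS)), key=lambda a: Q[s][a][g])

# Navigate from state 0 to goal N-1 purely from the proximity table.
path = [0]
while path[-1] != N - 1:
    path.append(step(path[-1], greedy_action(path[-1], N - 1)))
print(path)
```

Because the table stores proximity to all states rather than value under one reward, swapping the goal (e.g. calling `greedy_action(s, 2)` instead) works immediately, which is the property the abstract attributes to goal-independent learning; the paper's actual method additionally exploits this information to replace eligibility traces.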
Publication title: International Journal of Intelligent Systems
Department/School: School of Information and Communication Technology
Publisher: John Wiley & Sons, Inc.
Place of publication: United States
Rights statement: The definitive published version is available online at: http://www3.interscience.wiley.com/