== Abstract ==
In this paper, the authors describe how a distributed reinforcement learning problem, in which the returns of many agents are simultaneously updating a single shared policy, is addressed by applying novel reinforcement learning techniques. A traffic simulator is used in the learning process. Two new algorithms are introduced: a value function-based algorithm and one that uses a direct policy evaluation approach. Both algorithms are shown to perform comparably well.
Published on 01/01/2000
Volume 2000, 2000
DOI: 10.1145/336595.337554
Licence: CC BY-NC-SA license