Distributed reinforcement learning for a traffic engineering application

Collection of open conferences in research transport

M. Pendrith

Abstract

This paper addresses a distributed reinforcement learning problem in which the returns of many agents simultaneously update a single shared policy, using novel reinforcement learning techniques. A traffic simulator is used in the learning process. Two new algorithms are introduced: one based on a value function and one that uses direct policy evaluation. Both algorithms are shown to perform comparably well.
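
To make the setting in the abstract concrete, the following is a minimal sketch of several agents writing their updates into one shared value table. It is not either of the paper's two algorithms: it uses ordinary tabular Q-learning, replaces the traffic simulator with a hypothetical five-state corridor, and approximates "simultaneous" updates with a round-robin loop. All names and constants (`train_shared`, `step`, `N_STATES`, `ALPHA`, and so on) are assumptions made for illustration.

```python
import random

N_STATES = 5                 # toy corridor: states 0..4, goal at state 4 (assumed)
ACTIONS = (0, 1)             # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # illustrative constants, not from the paper


def step(state, action):
    """Hypothetical toy dynamics standing in for the traffic simulator."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train_shared(n_agents=4, n_rounds=500, seed=0):
    """Run several agents that all read from and write to one Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # the single shared table
    states = [0] * n_agents                     # each agent has its own position
    for _ in range(n_rounds):
        for i in range(n_agents):               # round-robin stands in for concurrency
            s = states[i]
            if rng.random() < EPSILON:
                a = rng.choice(ACTIONS)         # explore
            else:
                best = max(q[s])                # exploit, breaking ties at random
                a = rng.choice([x for x in ACTIONS if q[s][x] == best])
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += ALPHA * (target - q[s][a])   # standard Q-learning update
            states[i] = 0 if done else s2           # restart the agent at the goal
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state of the shared table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train_shared()))
```

Because every agent updates the same table, experience gathered by one agent immediately changes the action choices of the others, which is the property the abstract highlights.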

Original document

The different versions of the original document can be found in:

http://dx.doi.org/10.1145/336595.337554

http://www.sci.brooklyn.cuny.edu/~parsons/courses/790-spring-2004/notes/pendrith.pdf

https://dblp.uni-trier.de/db/conf/agents/agents2000.html#Pendrith00

http://portal.acm.org/citation.cfm?doid=336595.337554

https://dl.acm.org/citation.cfm?id=336595.337554

https://doi.org/10.1145/336595.337554

https://trid.trb.org/view/715027

https://core.ac.uk/display/101466319

https://academic.microsoft.com/#/detail/2023790196

http://dl.acm.org/ft_gateway.cfm?id=337554&ftid=7539&dwn=1


Document information

Published on 01/01/2000

Volume 2000, 2000
DOI: 10.1145/336595.337554
Licence: CC BY-NC-SA license


