Tumer et al 2011a
 
Latest revision as of 08:00, 29 September 2020

Abstract

Traffic management problems provide a unique environment to study how multi-agent systems promote desired system-level behavior. In particular, they represent a special class of problems where the individual actions of the agents are neither intrinsically “good” nor “bad” for the system. Instead, it is the combinations of actions among agents that lead to desirable or undesirable outcomes. As a consequence, agents need to learn how to coordinate their actions with those of other agents, rather than learn a particular set of “good” actions. In this chapter, the authors focus on problems where there is no communication among the drivers, which puts the burden of coordination on the principled selection of the agent reward functions. They explore the impact of agent reward functions on two types of traffic problems. In the first problem, the authors study how agents learn the best departure times in a daily commuting environment and how following those departure times alleviates congestion. In the second problem, the authors study how agents learn to select desirable lanes to improve traffic flow and minimize delays for all drivers. In both cases, they focus on having an agent select the most suitable action for each driver using reinforcement learning, and explore the impact of different reward functions on system behavior. Their results show that agent rewards that are both aligned with, and sensitive to, the system reward lead to significantly better results than purely local or global agent rewards. They conclude this chapter by discussing how changing the way in which the system performance is measured affects the relative performance of these reward functions, and how agent rewards derived for one setting (timely arrivals) can be modified to meet a new system setting (maximizing throughput).

Document type: Part of book or chapter of book

Original document

The different versions of the original document can be found in:

http://dx.doi.org/10.4018/9781605662268.ch012

http://web.engr.oregonstate.edu/~ktumer/publications/files/tumer-welch_maatte09.pdf

https://www.igi-global.com/chapter/traffic-congestion-management-learning-agent/26942

http://jmvidal.cse.sc.edu/lib/tumer09b.html

https://academic.microsoft.com/#/detail/2104736887

https://www.igi-global.com/viewtitle.aspx?TitleId=26942

http://dx.doi.org/10.4018/978-1-60566-226-8.ch012


DOIs: 10.4018/9781605662268.ch012, 10.4018/978-1-60566-226-8.ch012

Document information

Published on 17/01/11
Accepted on 17/01/11
Submitted on 17/01/11

Volume 2011, 2011
DOI: 10.4018/9781605662268.ch012
Licence: CC BY-NC-SA license
