== Abstract ==
Traffic management problems provide a unique environment to study how multi-agent systems promote desired system-level behavior. In particular, they represent a special class of problems where the individual actions of the agents are neither intrinsically “good” nor “bad” for the system. Instead, it is the combination of actions among agents that leads to desirable or undesirable outcomes. As a consequence, agents need to learn how to coordinate their actions with those of other agents, rather than learn a particular set of “good” actions. In this chapter, the authors focus on problems where there is no communication among the drivers, which puts the burden of coordination on the principled selection of the agent reward functions. They explore the impact of agent reward functions on two types of traffic problems. In the first problem, the authors study how agents learn the best departure times in a daily commuting environment and how following those departure times alleviates congestion. In the second problem, the authors study how agents learn to select desirable lanes to improve traffic flow and minimize delays for all drivers. In both cases, they focus on having an agent select the most suitable action for each driver using reinforcement learning, and explore the impact of different reward functions on system behavior. Their results show that agent rewards that are both aligned with, and sensitive to, the system reward lead to significantly better results than purely local or global agent rewards. They conclude the chapter by discussing how changing the way in which system performance is measured affects the relative performance of these reward functions, and how agent rewards derived for one setting (timely arrivals) can be modified to meet a new system setting (maximizing throughput).
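The chapter itself does not include code, but the reward-shaping idea summarized above (agent rewards that are aligned with, and sensitive to, the system reward, compared against a purely global reward) can be sketched on a toy version of the departure-time problem. The sketch below is illustrative only: the slot capacities, learning parameters, and the specific form of the system reward are assumptions for this example, not values taken from the chapter.

```python
import random

# Toy model of the daily-commute problem discussed in the abstract: each agent
# picks a departure slot, and the system reward counts how many drivers each
# slot serves without exceeding an assumed capacity. All numbers here are
# illustrative assumptions, not values from the chapter.

N_AGENTS = 100
N_SLOTS = 8
CAPACITY = [15] * N_SLOTS    # assumed per-slot capacity
EPISODES = 2000
EPS, ALPHA = 0.1, 0.1        # exploration rate and learning rate (assumed)

def system_reward(counts):
    """Global reward G: total throughput, capped by each slot's capacity."""
    return sum(min(c, cap) for c, cap in zip(counts, CAPACITY))

def difference_reward(counts, slot):
    """Difference-style reward: G with the agent present minus G with it removed."""
    counts_without = counts.copy()
    counts_without[slot] -= 1
    return system_reward(counts) - system_reward(counts_without)

def run(reward_fn):
    """Independent value learners with epsilon-greedy slot selection."""
    q = [[0.0] * N_SLOTS for _ in range(N_AGENTS)]
    counts = [0] * N_SLOTS
    for _ in range(EPISODES):
        actions = []
        for i in range(N_AGENTS):
            if random.random() < EPS:
                actions.append(random.randrange(N_SLOTS))
            else:
                actions.append(max(range(N_SLOTS), key=lambda s: q[i][s]))
        counts = [actions.count(s) for s in range(N_SLOTS)]
        for i, a in enumerate(actions):
            q[i][a] += ALPHA * (reward_fn(counts, a) - q[i][a])
    return system_reward(counts)

if __name__ == "__main__":
    random.seed(0)
    print("global-reward learners    :", run(lambda counts, slot: system_reward(counts)))
    print("difference-reward learners:", run(difference_reward))
```

In this sketch the difference-style reward credits each agent with its marginal contribution to the global reward, which keeps it aligned with the system objective while remaining sensitive to that agent's own choice of departure slot; the global-reward learners receive the same signal regardless of their individual actions, which is the credit-assignment problem the abstract alludes to.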
Document type: Book chapter
The different versions of the original document can be found in:
http://dx.doi.org/10.4018/9781605662268.ch012
http://web.engr.oregonstate.edu/~ktumer/publications/files/tumer-welch_maatte09.pdf
https://www.igi-global.com/chapter/traffic-congestion-management-learning-agent/26942
http://jmvidal.cse.sc.edu/lib/tumer09b.html
https://academic.microsoft.com/#/detail/2104736887
https://www.igi-global.com/viewtitle.aspx?TitleId=26942
http://dx.doi.org/10.4018/978-1-60566-226-8.ch012
DOIs: 10.4018/9781605662268.ch012, 10.4018/978-1-60566-226-8.ch012
Published on 17/01/11
Accepted on 17/01/11
Submitted on 17/01/11
Volume 2011, 2011
DOI: 10.4018/9781605662268.ch012
Licence: CC BY-NC-SA