Projects: Learning in Multi-Agent Systems
Many real-world problems, such as traffic control, network routing,
elevator dispatching, stock supply, transportation problems, and pollution
detection, can be naturally modeled as multi-agent systems (MASs).
We are interested in the properties of such MASs. A MAS consists of (adaptive)
agents that use a decision policy to select actions based on their inputs.
The global behavior of a MAS depends on the behavior of all agents and their
interactions. We are interested in using Machine Learning (ML) to optimize
the behavior of each agent. Each agent's goal is to maximize the rewards it
receives through its reward function, which maps changes of the environmental
state to scalar reward signals. Since each agent tries to maximize its own
rewards, agents behave in their own interests.
Therefore conflicts between agents may arise and the need to coordinate the
agents emerges. Coordinating agents in a MAS can be done in different ways.
We can allow communication between agents, we can create management agencies,
or we can have systems in which reward functions are shared among agents.
In our project we identify different types of MASs and study for which kinds
of problems each of them can be used efficiently.
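The difference between self-interested agents and agents that share a reward function can be illustrated with a small sketch. Everything in it is a hypothetical toy assumption made for illustration: the two action names, the payoff values, and the function names are not taken from the project.

```python
from itertools import product

ACTIONS = ("greedy", "yield")

# Hypothetical payoff table (an assumption for this sketch):
# PAYOFF[(own, other)] is the scalar reward one agent receives.
PAYOFF = {
    ("greedy", "greedy"): -1.0,
    ("greedy", "yield"): 3.0,
    ("yield", "greedy"): -2.0,
    ("yield", "yield"): 2.0,
}


def individual_reward(own, other):
    """Reward of a single agent, given its own and the other agent's action."""
    return PAYOFF[(own, other)]


def selfish_choice(other):
    """Action that maximizes an agent's own reward for a fixed other action."""
    return max(ACTIONS, key=lambda own: individual_reward(own, other))


def shared_reward(a, b):
    """Shared reward function: the sum of both agents' individual rewards."""
    return individual_reward(a, b) + individual_reward(b, a)


def best_joint_action():
    """Joint action that maximizes the shared reward."""
    return max(product(ACTIONS, repeat=2), key=lambda joint: shared_reward(*joint))
```

With these assumed payoffs, a self-interested agent prefers "greedy" whatever the other agent does, so both end up with a shared reward of -2, whereas maximizing the shared reward selects "yield" for both agents, for a shared reward of 4.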
Traffic light control
In one of our projects we study traffic light optimization in cities.
Our traffic light controllers use reinforcement learning techniques
to track the waiting times of infrastructure users, such as cars and
buses, at red lights. The expected waiting time when a light is set to red
is of course higher than when the light is set to green. First of all,
a car facing a red light cannot move closer to the intersection or cross it,
but has to wait at the same position in the traffic queue. Furthermore,
once a car has passed an intersection, it no longer has to wait for
that traffic light, whereas a car held at a red light might have to stay
there for a while. Therefore each car gains at least a single time step
when its light is set to green. If the current lane almost
never has its light set to green, then the advantage of a green light for this
car may be very large. By monitoring cars driving through the city and
interacting with traffic lights, reinforcement learning can estimate the
expected waiting times online. The traffic light controllers use these
expected waiting times to compute gains for each car waiting for the
traffic light. After this, each traffic light controller either sets the
allowed configuration with maximal gain to green or explores.
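The control loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the GLD implementation: waiting times are averaged per lane rather than per car position, the gain of a configuration is taken to be the sum of the gains of all waiting cars, and exploration is epsilon-greedy. The names WaitEstimator, choose_configuration, and epsilon are hypothetical.

```python
import random
from collections import defaultdict


class WaitEstimator:
    """Online running average of observed waiting times per (lane, light)."""

    def __init__(self):
        self.mean = defaultdict(float)
        self.count = defaultdict(int)

    def update(self, lane, light, observed_wait):
        """Fold one observed waiting time into the running average."""
        key = (lane, light)
        self.count[key] += 1
        self.mean[key] += (observed_wait - self.mean[key]) / self.count[key]

    def gain(self, lane):
        """Expected waiting time saved per car if the lane's light is green."""
        return self.mean[(lane, "red")] - self.mean[(lane, "green")]


def choose_configuration(configs, queue_lengths, estimator, epsilon=0.1, rng=random):
    """Pick the allowed configuration with maximal summed gain, or explore.

    configs: list of tuples of lanes that may be green at the same time.
    queue_lengths: dict mapping lane -> number of waiting cars.
    epsilon: assumed probability of choosing a random configuration.
    """
    if rng.random() < epsilon:
        return rng.choice(configs)  # exploration step (assumed epsilon-greedy)

    def total_gain(config):
        # Every waiting car in a lane contributes that lane's estimated gain.
        return sum(queue_lengths.get(lane, 0) * estimator.gain(lane) for lane in config)

    return max(configs, key=total_gain)
```

As a usage example, after observing red-light waits of 10 and 14 time steps and a green-light wait of 2 on a lane, the estimated gain for that lane is 12 - 2 = 10 time steps per waiting car.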
We have developed a user-friendly simulator for editing infrastructures using
the mouse. The simulator can track a variety of statistical measures to
compare different traffic light controllers. The simulator software,
Green Light District (GLD), is written in Java 1.4 and can be found here: GLD.