Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in partially observable, stochastic, multi-agent environments. I-POMDPs augment POMDP beliefs with nested, hierarchical belief structures. To plan optimally with I-POMDPs, we propose symbolic and neural approaches that learn other agents' intentional models, which ascribe to them beliefs, preferences, and rationality in action selection.
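To make the nested belief structure concrete, the sketch below shows one way a level-l interactive state might be represented: a physical state paired with an intentional model of the other agent, whose own belief is itself a distribution over level-(l-1) interactive states. This is an illustrative assumption, not code from the thesis; the names InteractiveState and IntentionalModel are hypothetical.

```python
# Hypothetical sketch of a level-l interactive state. At level 0 the other
# agent is not modeled and the structure reduces to an ordinary POMDP belief.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class IntentionalModel:
    level: int                                       # nesting level of the modeled agent
    belief: List[Tuple["InteractiveState", float]]   # weighted particles over its own states
    frame: dict = field(default_factory=dict)        # its transition/observation/reward parameters


@dataclass
class InteractiveState:
    physical_state: int                              # state of the physical environment
    other_model: Optional[IntentionalModel]          # None at level 0 (plain POMDP)
```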
In the symbolic Bayesian approach, agents maintain beliefs over intentional models of other agents and update them sequentially from observations using Bayes' rule. To cope with the complexity of the hierarchical belief space, we devise a customized interactive particle filter (I-PF) that descends the belief hierarchy, parametrizes the other agents' models, and samples all model parameters at each nesting level. We also devise a neural network approximation of the I-POMDP framework, in which the belief update, value function, and policy function are each implemented by neural networks (NNs). We then combine the same network architecture with the QMDP planner and train it end-to-end in a reinforcement learning fashion.
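For concreteness, the following is a minimal, hedged sketch of one I-PF belief-update step, not the thesis implementation. The helper callables (other_policy, transition, obs_prob, update_nested), the particle layout, and the reuse of the hypothetical InteractiveState container from the previous sketch are all assumptions made for illustration.

```python
# Hedged sketch of one interactive particle filter (I-PF) update step. The
# model components are passed in as callables because the concrete domain
# and parametrization are not shown here; all names are illustrative.
import random


def ipf_step(particles, my_action, my_obs, n_particles,
             other_policy, transition, obs_prob, update_nested):
    """particles: list of (InteractiveState, weight) pairs with weights summing to 1."""
    propagated = []
    for istate, w in particles:
        a_other = other_policy(istate.other_model)                      # predict the other agent's action
        s_next = transition(istate.physical_state, my_action, a_other)  # sample the next physical state
        m_next = update_nested(istate.other_model, a_other, s_next)     # recurse down the belief hierarchy
        w_next = w * obs_prob(my_obs, s_next, my_action, a_other)       # weight by my observation likelihood
        propagated.append((InteractiveState(s_next, m_next), w_next))

    total = sum(w for _, w in propagated) or 1.0                        # normalize the weights
    states = [s for s, _ in propagated]
    weights = [w / total for _, w in propagated]
    resampled = random.choices(states, weights=weights, k=n_particles)  # resample to a fixed particle count
    return [(s, 1.0 / n_particles) for s in resampled]
```

The QMDP planner mentioned above can be summarized, in plain tabular form rather than the thesis's end-to-end trained network, as computing Q(b, a) = sum_s b(s) * Q_MDP(s, a) and acting greedily; the sketch below assumes dictionary-based inputs purely for illustration.

```python
def qmdp_action(belief, q_mdp):
    """belief: dict state -> probability; q_mdp: dict (state, action) -> value."""
    actions = {a for _, a in q_mdp}
    q_b = {a: sum(p * q_mdp[(s, a)] for s, p in belief.items()) for a in actions}
    return max(q_b, key=q_b.get)
```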
Empirical results show that our Bayesian learning approach accurately learns models of the other agent. It serves as a generalized Bayesian learning algorithm that recovers other agents' beliefs, nesting levels, and transition, observation, and reward functions. Moreover, we show that the model-based network, which learns to plan, outperforms the model-free network, which only learns reactive policies. The learned policy generalizes directly to a larger, unseen setting.
Advisor
Gmytrasiewicz, Piotr
Chair
Gmytrasiewicz, Piotr
Department
Computer Science
Degree Grantor
University of Illinois at Chicago
Degree Level
Doctoral
Committee Member
Liu, Bing
Ziebart, Brian
Zhang, Xinhua
Koyuncu, Erdem