# ba_thesis

Code implemented for my BSc thesis "Optimal planning under Model Uncertainty" at TU Darmstadt (2018).

## Abstract

Bayesian model-based reinforcement learning is an elegant formulation for learning and planning optimal behavior under model uncertainty. In this work, an extension to the Markov decision process (MDP) model, used throughout the field of reinforcement learning, is studied. The formalism of Bayes-Adaptive Markov decision processes (BAMDPs) allows an intrinsic representation of model uncertainty and of the information gathered for action selection. Solving a BAMDP is thus equivalent to finding an optimal exploration/exploitation tradeoff in the underlying MDP. I reviewed two approaches to solving BAMDPs: an offline approach based on policy improvement for stochastic finite-state controllers, and an online approach using sample-based tree search with given heuristics. I applied both approaches to two problems well known in the literature: the chain problem and the four-dimensional queueing network.
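
The core Bayes-adaptive mechanic can be illustrated with a short sketch. The Python below is not the thesis code: it maintains a Dirichlet belief over the unknown transition model of a simplified chain problem and uses posterior sampling as a minimal stand-in for the planners studied in the thesis (FSC policy improvement and sample-based tree search). All function names, reward values, and chain parameters here are illustrative assumptions.

```python
import numpy as np

def make_chain(n=5, slip=0.2):
    """Simplified chain problem: action 0 tries to move right along the
    chain (large reward at the last state), action 1 returns to state 0
    (small immediate reward); with probability `slip` the other action's
    effect occurs. Returns transitions P[a, s, s'] and rewards R[a, s]."""
    P = np.zeros((2, n, n))
    R = np.zeros((2, n))
    for s in range(n):
        fwd = min(s + 1, n - 1)
        P[0, s, fwd] += 1.0 - slip
        P[0, s, 0] += slip
        P[1, s, 0] += 1.0 - slip
        P[1, s, fwd] += slip
        R[0, s] = 10.0 if s == n - 1 else 0.0  # illustrative reward values
        R[1, s] = 2.0
    return P, R

def value_iteration(P, R, gamma=0.95, iters=200):
    """Solve a fully known MDP; applied to MDPs sampled from the posterior."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        Q = R + gamma * np.einsum('asn,n->as', P, V)
        V = Q.max(axis=0)
    return Q.argmax(axis=0)  # greedy policy per state

def run_posterior_sampling(episodes=50, horizon=100, seed=0):
    """Interact with the true chain while maintaining a Dirichlet belief
    over its unknown transition model; each episode plans in one MDP
    sampled from the current posterior."""
    rng = np.random.default_rng(seed)
    P_true, R = make_chain()
    n_a, n_s, _ = P_true.shape
    counts = np.ones((n_a, n_s, n_s))  # uniform Dirichlet prior per (a, s)
    total_reward = 0.0
    for _ in range(episodes):
        # Sample one plausible transition model from the posterior and plan in it.
        P_sampled = np.array([[rng.dirichlet(counts[a, s])
                               for s in range(n_s)] for a in range(n_a)])
        policy = value_iteration(P_sampled, R)
        s = 0
        for _ in range(horizon):
            a = policy[s]
            total_reward += R[a, s]
            s_next = rng.choice(n_s, p=P_true[a, s])
            counts[a, s, s_next] += 1.0  # Bayesian posterior update
            s = s_next
    return total_reward

if __name__ == "__main__":
    print("accumulated reward:", run_posterior_sampling())
```

Because the Dirichlet counts summarize everything observed so far, planning against models sampled from them trades off exploration and exploitation implicitly, which is the same tradeoff a BAMDP solver resolves exactly over the augmented (state, belief) space.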