layout: post
title: "Reinforcement Learning Week 14 Course Notes"
date: "2015-11-19 13:12:22"
categories: Computer Science
excerpt: "This week should watch CCC. The readings are: Ziebart et al. (2008). Bab..."

auth: conge

This week: watch CCC.

CCC

Coordinating and communicating

The decentralized partially observable Markov decision process (Dec-POMDP)

Dec-POMDP properties

Dec-POMDP example

Communicating and Coaching

Inverse Reinforcement Learning

Inverse Reinforcement Learning

MLIRL: Maximum Likelihood Inverse Reinforcement Learning

MLIRL result

CCC

Policy Shaping

Policy Shaping

Quiz 1: Policy Shaping

Quiz 2: Policy Shaping

Policy Shaping probability calculation
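
The slide itself is not reproduced here, so the following is only a minimal sketch of the calculation, assuming the usual policy shaping model: the human labels an action correctly with probability C (the feedback consistency), and delta is the number of positive labels minus the number of negative labels the action has received. The function name `prob_action_optimal` and its parameters are illustrative, not taken from the lecture.

```python
def prob_action_optimal(consistency: float, delta: int) -> float:
    """Estimate P(action is optimal | human feedback).

    Assumed model: each label is correct with probability `consistency` (C),
    and `delta` = (# positive labels) - (# negative labels) for the action.
    The estimate is C^delta / (C^delta + (1 - C)^delta).
    """
    if not 0.0 < consistency < 1.0:
        raise ValueError("consistency must be strictly between 0 and 1")
    good = consistency ** delta          # evidence that the action is optimal
    bad = (1.0 - consistency) ** delta   # evidence that it is not
    return good / (good + bad)


if __name__ == "__main__":
    # One net positive label from a human who is right 80% of the time.
    print(prob_action_optimal(0.8, 1))
    # No net feedback: the estimate stays at 0.5.
    print(prob_action_optimal(0.8, 0))
```

Under this model, C = 0.5 means the feedback carries no information, so the estimate stays at 0.5 no matter how many labels arrive.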

Quiz 3: How to combine info from multiple sources in policy shaping?
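
One common answer, sketched here under the assumption that each source provides a probability distribution over the same set of actions, is to multiply the distributions elementwise and renormalize. The helper name `combine_policies` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def combine_policies(p_agent: Sequence[float], p_human: Sequence[float]) -> List[float]:
    """Combine two action distributions by elementwise product, then normalize.

    Assumes both inputs are distributions over the same ordered action set.
    """
    if len(p_agent) != len(p_human):
        raise ValueError("both distributions must cover the same actions")
    products = [a * h for a, h in zip(p_agent, p_human)]
    total = sum(products)
    if total == 0.0:
        raise ValueError("the sources assign zero joint probability to every action")
    return [p / total for p in products]


if __name__ == "__main__":
    # The agent mildly prefers action 0; the human feedback strongly prefers it.
    print(combine_policies([0.6, 0.4], [0.8, 0.2]))
```

With this rule a uniform source leaves the other distribution unchanged, and two sources that agree reinforce each other.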

Drama Management

Drama management world

Drama Management: what's a story

Trajectories as MDP

TTD-MDP: Targeted Trajectory Distribution MDPs

What have we learned

Recap

2015-11-18: first draft completed