
This week 

CCC

Coordinating and communicating

The decentralized partially observable Markov decision process (Dec-POMDP)

Dec-POMDP properties

Dec-POMDP example

Communicating and Coaching

Inverse Reinforcement Learning

Inverse Reinforcement Learning

MLIRL: Maximum Likelihood Inverse Reinforcement Learning

MLIRL results

CCC

Policy Shaping

Policy Shaping

Quiz 1: Policy Shaping

Quiz 2: Policy Shaping

Policy Shaping probability calculation

Quiz 3: How to combine information from multiple sources in Policy Shaping?
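
A minimal sketch of one common answer to this question, assuming the formulation usually given for policy shaping: each source supplies, per action, a probability that the action is optimal, and the sources are combined by multiplying those probabilities and renormalizing over the actions. The function names, the `consistency` parameter (the probability that a feedback label is correct), and the sample numbers are illustrative assumptions, not taken from these notes.

```python
def feedback_probability(delta, consistency):
    """Probability that an action is optimal, given human feedback.

    delta: (# positive labels) - (# negative labels) for the action.
    consistency: assumed probability that a single label is correct (0 < C < 1).
    """
    right = consistency ** delta
    wrong = (1.0 - consistency) ** delta
    return right / (right + wrong)


def combine_sources(*sources):
    """Combine per-action probabilities from several sources.

    Each source is a dict mapping action -> probability that the action
    is optimal. The sources are multiplied action by action, then
    normalized so the result sums to 1.
    """
    actions = list(sources[0].keys())
    product = {}
    for action in actions:
        value = 1.0
        for source in sources:
            value *= source[action]
        product[action] = value
    total = sum(product.values())
    return {action: value / total for action, value in product.items()}


# Hypothetical example: the learner's own estimate plus a human-feedback estimate.
agent_estimate = {"x": 0.5, "y": 0.5}
human_estimate = {
    "x": feedback_probability(delta=1, consistency=0.8),
    "y": feedback_probability(delta=-1, consistency=0.8),
}
combined = combine_sources(agent_estimate, human_estimate)
```

Multiplying rather than averaging means a source that is indifferent (equal probability on every action) leaves the other source's ranking unchanged, while two sources that agree reinforce each other.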

Drama Management

Drama management world

Drama Management: what's a story

Trajectories as MDP

TTD-MDP: Targeted Trajectory Distribution MDPs

What have we learned

Recap

2015-11-18: first draft completed