layout: post
title: "Reinforcement Learning Week 9 Course Notes"
date: "2015-10-18 02:41:40"
categories: Computer Science
excerpt: "This week Watch Generalization. The readings are Gordon (1995) and Baird..."

auth: conge

This week

Generalization

Example

Generalization idea

Basic Update Rule
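
As a hedged sketch of what such an update rule usually looks like, assuming the standard gradient-style Q-learning update for a parametric value function (the symbols $\alpha$ for the learning rate, $\gamma$ for the discount factor, and $w_i^a$ for the weight of feature $i$ under action $a$ are assumptions, not taken from these notes):

```latex
\Delta w_i^a = \alpha \left( r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right) \frac{\partial Q(s, a)}{\partial w_i^a}
```

The bracketed term is the temporal-difference error; the partial derivative says how much weight $w_i^a$ contributed to the current estimate, so weights that mattered more are moved more.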

Linear Value Function Approximation

Linear value function
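
A minimal sketch of a linear value function, assuming one weight vector per action and a feature vector per state, so that Q(s, a) is the dot product of the two. The names `q_value` and `q_update`, the step size `alpha`, and the discount `gamma` are hypothetical choices for illustration, not taken from the lectures.

```python
import numpy as np

def q_value(weights, features, action):
    """Linear Q: dot product of the action's weight vector and the state features."""
    return float(weights[action] @ features)

def q_update(weights, features, action, reward, next_features, alpha=0.1, gamma=0.9):
    """One gradient-style Q-learning step; returns an updated copy of the weights."""
    n_actions = weights.shape[0]
    # Bootstrapped target: reward plus discounted best next-state value.
    target = reward + gamma * max(q_value(weights, next_features, a) for a in range(n_actions))
    td_error = target - q_value(weights, features, action)
    new_weights = weights.copy()
    # For a linear Q, the gradient with respect to the weights is the feature vector.
    new_weights[action] += alpha * td_error * features
    return new_weights

# Two actions, three features, all weights initialized to 0 (assumed toy setup).
weights = np.zeros((2, 3))
s = np.array([1.0, 0.0, 2.0])
s_next = np.array([0.0, 1.0, 0.0])
weights = q_update(weights, s, action=0, reward=1.0, next_features=s_next)
print(weights)
```

Only the weights of the action that was taken change, and each one moves in proportion to its feature value.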

Quiz 1

Success and failure stories

Does it work?

Baird's Counterexample

Quiz 2: Bad Update Sequence

Quiz 2: What if we initialize all the weights to 0?

Averagers: looks like Linear Value Function Approximation but is non-linear
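
A hedged sketch of the averager idea, assuming the usual definition in which the value of a state is a convex combination of the values stored at a few anchor states (weights non-negative and summing to 1). The 1-D setting, the linear-interpolation weighting, and the name `averager_weights` are illustrative assumptions, not the construction used in the lecture.

```python
import numpy as np

def averager_weights(state, anchors):
    """Convex weights over sorted, strictly increasing 1-D anchor states."""
    anchors = np.asarray(anchors, dtype=float)
    # States outside the anchor range are clamped to the nearest end anchor.
    state = float(np.clip(state, anchors[0], anchors[-1]))
    weights = np.zeros(len(anchors))
    hi = int(np.searchsorted(anchors, state))  # first anchor >= state
    if anchors[hi] == state:
        weights[hi] = 1.0  # exactly on an anchor: copy its value
        return weights
    lo = hi - 1
    frac = (state - anchors[lo]) / (anchors[hi] - anchors[lo])
    weights[lo], weights[hi] = 1.0 - frac, frac
    return weights

anchors = [0.0, 1.0, 3.0]
anchor_values = np.array([0.0, 10.0, 4.0])
w = averager_weights(2.0, anchors)
value = float(w @ anchor_values)
print(w, value)
```

Because the weights form a convex combination, the interpolated value can never fall outside the range of the anchor values.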

Quiz 3

Connection to MDPs

What have we learned

Recap

2015-10-13 first draft, up to Quiz 1
2015-10-17 completed
2015-12-04 reviewed and revisited