---
layout: post
title: "AI 笔记 Week 15 Planning under uncertainty"
date: "2018-12-02 17:11:33"
categories: 计算机科学
excerpt: "Link to the videosLink to the video transcripts Introduction This lectur..."
---
This lecture focuses on marrying planning and uncertainty, so that robots can operate in the actual physical world and find good plans to execute.
Planning methods are categorized by the characteristics of the world: whether it is deterministic or stochastic, and whether it is fully or partially observable.
All of these robots need to deal with uncertainty and partial observability to do their jobs (e.g., as a tour guide or a mine explorer).
Absorbing states: the process ends once the agent reaches an absorbing state. A policy assigns an action to each state the agent may be in.
Question: what is the best action to take when the agent is in state a1, c1, c4, or b3?
The reason the agent should avoid the b4 state is its large negative reward (cost).
Quiz: calculate the value of each state.
The calculation is involved because the expected reward of every action available in each state must be evaluated.
Once the value of each cell has been calculated, the value function defines the policy: in every state, choose the action that leads to the highest expected value.
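The value calculation and the greedy policy above can be sketched with value iteration. The grid layout is my reconstruction from these notes (+100 at a4, -100 at b4, a wall at b2, a step cost of -3, and 80/10/10 stochastic motion), so treat the specific numbers as assumptions rather than the lecture's exact setup:

```python
# Value iteration on a 3x4 grid world (rows a/b/c = 0/1/2, columns 1-4 = 0-3).
# Assumed setup: +100 exit at a4, -100 exit at b4, wall at b2, step cost -3,
# and actions that succeed 80% of the time and slip sideways 10% each way.
ROWS, COLS = 3, 4
WALL = {(1, 1)}                                 # b2
TERMINALS = {(0, 3): 100.0, (1, 3): -100.0}     # a4 and b4
ACTIONS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}
PERP = {"N": ("E", "W"), "S": ("E", "W"), "E": ("N", "S"), "W": ("N", "S")}

def move(s, a):
    """Deterministic outcome of action a from state s; bumping a wall stays put."""
    r, c = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if 0 <= r < ROWS and 0 <= c < COLS and (r, c) not in WALL:
        return (r, c)
    return s

def outcomes(s, a):
    """80% intended direction, 10% each perpendicular slip."""
    return [(move(s, a), 0.8),
            (move(s, PERP[a][0]), 0.1),
            (move(s, PERP[a][1]), 0.1)]

def value_iteration(step_cost=-3.0, eps=1e-6):
    """Sweep Bellman backups (no discounting) until values stop changing."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) not in WALL]
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in TERMINALS:
                new_v = TERMINALS[s]
            else:
                new_v = step_cost + max(
                    sum(p * V[s2] for s2, p in outcomes(s, a))
                    for a in ACTIONS)
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < eps:
            return V

def greedy_policy(V):
    """In each non-terminal state, pick the action with the best expected value."""
    return {s: max(ACTIONS,
                   key=lambda a: sum(p * V[s2] for s2, p in outcomes(s, a)))
            for s in V if s not in TERMINALS}
```

For example, the state a3 next to the +100 exit ends up with a high value and a greedy action of East, since 80% of the time the move succeeds and neither slip direction reaches the -100 exit.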
If the per-step reward of each state is positive, the policy will encourage the agent to stay where it is and keep collecting reward.
If the per-step reward is too negative, the value of every state drops so low that the agent tries to end the episode as soon as possible, even through a bad exit, instead of looking for an optimal path.
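Both effects can be demonstrated on a tiny deterministic corridor MDP (my own toy construction, not from the lecture): states 0..4 with a -100 exit on the left, a +100 exit on the right, and deterministic left/right moves.

```python
# How the per-step reward shapes the policy, on a 1-D corridor.
# States 0..4: state 0 is an absorbing -100 exit, state 4 an absorbing +100
# exit, and interior states allow deterministic L/R moves. All numbers here
# are illustrative assumptions.
GAMMA = 0.9
EXITS = {0: -100.0, 4: 100.0}

def corridor_values(step_reward, eps=1e-9):
    """Discounted value iteration over the five corridor states."""
    V = {s: 0.0 for s in range(5)}
    while True:
        delta = 0.0
        for s in range(5):
            if s in EXITS:
                new_v = EXITS[s]
            else:
                new_v = step_reward + GAMMA * max(V[s - 1], V[s + 1])
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < eps:
            return V

def corridor_policy(V):
    """Greedy direction in each interior state."""
    return {s: "L" if V[s - 1] >= V[s + 1] else "R" for s in range(1, 4)}
```

With a mild step cost of -3 every interior state heads right toward +100. With a very negative step cost of -150, state 1 instead dives into the nearby -100 exit, because ending quickly beats paying three more steps. With a positive per-step reward of +20, the greedy policy bounces between interior states forever and never exits, since collecting +20 indefinitely is worth more than either terminal.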
Why solving each world as a separate MDP does not work: the agent might be in either of two different worlds and does not know which. Solving the problem for both cases and then averaging the two solutions fails, because the averaged policy will never go south to gather information; in each fully known world, reading the sign adds nothing.
Solving the POMDP over belief states does work. If the agent goes south and reads the sign, with 50% probability it transitions to the right-side belief state; performing MDP planning there reaches the +100 exit. The same happens in the left-side belief state (the other 50%), so either way the agent collects +100.
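Using the notes' numbers (±100 exits, 50/50 prior), a short belief-state computation shows why gathering information wins; the -1 detour cost for reading the sign is my own assumption:

```python
# Two-worlds example on belief states. In world "left" the +100 exit is on
# the left and -100 on the right; world "right" is mirrored. The sign is
# assumed noiseless, and the sensing detour cost (-1) is an assumption.

def update_belief(p_left, obs):
    """A noiseless sign reading collapses the belief to one world."""
    return 1.0 if obs == "left" else 0.0

def expected_value(p_left, sense_cost=-1.0):
    # Acting immediately: commit to the side with the higher expected reward.
    act_now = max(p_left * 100 + (1 - p_left) * -100,
                  p_left * -100 + (1 - p_left) * 100)
    # Sensing first: the belief collapses to one world (update_belief),
    # after which the agent always collects the +100 exit.
    sense_first = sense_cost + 100
    return act_now, sense_first
```

With a 50/50 prior, acting immediately is worth 0 in expectation while sensing first is worth 99. Averaging the two per-world MDP policies can never discover this, because the value of information only appears in belief space.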
Readings on Planning under Uncertainty
AIMA: Chapter 17
Further Study
Charles Isbell and Michael Littman’s ML course:
- Markov Decision Processes
Peter Norvig and Sebastian Thrun’s AI course:
2018-12-01 First draft