Tuesday Jan 15 2013 - Lecture 3


University of Guelph
Computing and Information Science
CIS 3700
Yang Xiang

Review
• A fully observable environment is one in which an agent's sensors give it access to the complete state of the environment at any time (for example, a game board)
• How do we determine that the agent is doing a good job?
  ◦ Observability depends on the performance measure
  ◦ Suppose, in the case of the vacuum, we wanted to add to the performance measure that the vacuum must minimize disturbance to the human occupants
    ▪ We want the vacuum to be aware of where humans are so it does not work in spaces that humans occupy
    ▪ We need to add a variable to track this new information: Occupied ∈ {yes, no}
    ▪ Observability now depends on how reliably the sensors can determine whether there is a human in a space

Properties of Task Environment – 2
• Deterministic vs. stochastic
  ◦ A state is a given set of values for all of the environmental variables
    ▪ Any change in any of the values denotes a change in state
  ◦ Deterministic: the next state of the environment is completely determined by the current state and the agent's action
    ▪ Examples: 8-puzzle (a slider puzzle with 9 squares and 8 pieces), chess, checkers
    ▪ Each state represents a specific configuration of the board
    ▪ If one of the players makes a move, the environment moves from one state to another
    ▪ Performing action X in state 1 will definitely cause the environment to change to state 2
  ◦ An agent in a stochastic environment has only partial control over the environmental state and must prepare for failure
    ▪ Example: a diagnostic agent
      • Some patients will respond to a treatment, while others with the same symptoms will not respond to the same treatment
      • Performing action X in state 1 could cause the environment to change to state 2 or state 3
  ◦ A partially observable, deterministic environment has to be treated as stochastic
    ▪ Example: weather forecasting
      • We do not have the ability to observe every point in the atmosphere (we cannot afford to observe every point in the atmosphere)
      • Because the environment is only partially observable, the agent cannot control or predict changes in the environment with accuracy

Properties of Task Environment – 3
• Episodic vs. sequential
  ◦ In an episodic environment, the agent's experience is divided into episodes that are independent of each other
    ▪ Example: detecting defective parts on an assembly line
    ▪ A decision in the current episode has no consequence for future episodes
      • Whether or not the first part was defective has no bearing on whether or not the next part will be defective
  ◦ In sequential environments, current actions have long-term consequences
    ▪ Example: opening plays in a board game
      • If you make a good move, you will gain advantages later in the game; if you make mistakes in your first moves, you will (possibly) be at a disadvantage later in the game
      • Each decision has far-reaching consequences for future decisions in the series
    ▪ It is easier to make decisions in episodic environments because you do not have to consider possible future outcomes of the decision

Properties of Task Environment – 4
• Static vs. dynamic
  ◦ Static: the environment does not change while the agent is deliberating
    ▪ Example: board games without a clock
    ▪ The agent does not need to worry about the passage of time or about monitoring the environment while deliberating
    ▪ Indecision does not count as doing nothing, because there is no penalty for taking time to decide
  ◦ Dynamic: the environment could change while the agent is deliberating
    ▪ Example: a blackout crisis
      • While the hydro workers are working to restore electricity, the environment is changing:
        ◦ Food is spoiling in fridges
        ◦ Patients dependent on light or electric devices are getting worse without electricity
        ◦ Houses are getting hotter in the summer (or colder in the winter)
    ▪ Indecision counts as deciding to do nothing, which has a negative consequence
    ▪ In a dynamic environment, it is better to generate an acceptable move within the given time frame than an amazing move that takes too long to compute
    ▪ Making decisions in a dynamic environment is harder
    ▪ Dynamic environments are not necessarily stochastic

Properties of Task Environment – 5
• Discrete vs. continuous
  ◦ Discrete: each environmental variable typically has a finite number of possible values
    ▪ This has implications for the number of environmental states
    ▪ Example: checkers vs. heating control
      • A checkers board has a fixed number of squares
      • A heater must be able to handle any value of the temperature, which is continuous (e.g., the range [-10, 40])
        ◦ Infinitely many possible values
  ◦ A discrete representation of a continuous environment can be accurate to any desired degree, but is inherently approximate
  ◦ If every variable within the environment state is discrete, then the environment is discrete
    ▪ If even a single variable is continuous, the environment is continuous

Properties of Task Environment – 6
• Single agent vs. multiagent
  ◦ Example: printer troubleshooting
    ▪ Single agent
  ◦ Example: a board game
    ▪ Two agents, therefore multiagent
    ▪ The two agents are working against each other (if I win, you lose, and vice versa)
    ▪ A competitive multiagent environment
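The deterministic/stochastic distinction can be sketched as two kinds of transition function. This is a minimal illustration, not anything from the lecture slides; the state labels, action name, and probabilities are invented:

```python
import random

# Sketch: deterministic vs. stochastic transitions.
# States ("state1", ...), action "X", and the 0.7/0.3 split are invented.

def deterministic_step(state, action):
    """Performing action X in state 1 always yields state 2."""
    transitions = {("state1", "X"): "state2"}
    return transitions[(state, action)]

def stochastic_step(state, action):
    """Performing action X in state 1 yields state 2 or state 3;
    the agent has only partial control over which."""
    outcomes = {("state1", "X"): [("state2", 0.7), ("state3", 0.3)]}
    states, probs = zip(*outcomes[(state, action)])
    return random.choices(states, weights=probs)[0]

print(deterministic_step("state1", "X"))  # always "state2"
print(stochastic_step("state1", "X"))     # "state2" or "state3"
```

Calling `deterministic_step` repeatedly with the same inputs always returns the same next state; `stochastic_step` does not, which is exactly why a stochastic agent must prepare for failure.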
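The point that a discrete representation of a continuous environment is "accurate to any desired degree, but inherently approximate" can be illustrated with the heating example's temperature range. The 0.5-degree bin width here is an arbitrary choice for the sketch:

```python
# Sketch: discretizing the continuous temperature range [-10, 40]
# from the heating-control example. The step size is arbitrary; a finer
# step gives any desired accuracy, but the result is still approximate.

def discretize(temp, low=-10.0, high=40.0, step=0.5):
    """Clamp a temperature to [low, high] and snap it to the nearest bin."""
    clamped = max(low, min(high, temp))
    return low + round((clamped - low) / step) * step

print(discretize(22.37))            # 22.5 -- within step/2 of the true value
print(discretize(22.37, step=0.1))  # finer bins, smaller approximation error
```

Halving `step` halves the worst-case error, but no finite `step` represents the continuous value exactly.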
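The dynamic-environment advice above (an acceptable move within the time frame beats an amazing move that arrives too late) can be sketched as a deadline-bounded search that always has an answer ready. The moves and scoring function are invented stand-ins:

```python
import time

# Sketch: deliberating under a deadline in a dynamic environment.
# Keep improving the current best move, but stop when time runs out,
# so an acceptable answer is always available. Moves/scores are invented.

def choose_move(candidates, score, time_limit=0.05):
    """Return the best move found before the deadline expires."""
    deadline = time.monotonic() + time_limit
    best, best_score = None, float("-inf")
    for move in candidates:
        if best is not None and time.monotonic() >= deadline:
            break  # out of time: return the acceptable move we already have
        s = score(move)
        if s > best_score:
            best, best_score = move, s
    return best

moves = ["a", "b", "c"]
print(choose_move(moves, score=lambda m: {"a": 1, "b": 3, "c": 2}[m]))  # "b"
```

In a static environment the loop could run to completion regardless of elapsed time; the deadline check is what the dynamic setting forces on the agent.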
