PSY236 Lecture Notes - Lecture 4: Obedience Training, Contiguity, Psychiatric Hospital
PSY236: Week 4 – Operant Conditioning
→ operant conditioning = the relationship between behaviour and its consequences
Lecture outline:
• Definitions of consequences of behaviour, i.e. reinforcement and punishment
• Schedules of reinforcement
• Schedules for reducing frequency of unwanted behaviours
• Operationalizing the use of reinforcers and discrimination training
o Magnitude of RF
o Delay of RF
o Contingency
• Secondary reinforcement e.g. money, tokens, clickers
• Activity reinforcers – Premack’s principle
• Complex sequence of behaviours – chaining
Puzzle boxes → runway mazes: instrumental learning
• Instrumental conditioning
o The behaviour in which the ‘organism’ engages is instrumental to achieving some
desirable outcome
Runway mazes → Skinner box: operant conditioning
• Operant conditioning
o The organism operates on its environment in some way to achieve some desirable
outcome
Operant = bell press; measure the rate of bell rings made to gain food (the consequence)
• The consequence is pleasant → so probability of repeating the action increases thus POSITIVE
REINFORCEMENT
Consequences of our behaviour
• If something is ADDED to the environment and increases the rate of activity then it is POSITIVE
REINFORCEMENT
• If something is ADDED but decreases rate of activity then it is POSITIVE PUNISHMENT
• If something is REMOVED from the environment and increases activity then it is NEGATIVE
REINFORCEMENT (avoidance learning)
• If something is REMOVED but decreases activity then it is NEGATIVE PUNISHMENT
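The four cases above form a 2 x 2 grid: what happens to the stimulus (added or removed) crossed with what happens to the behaviour (increases or decreases). A minimal Python sketch of that grid follows; the function name `classify_consequence` and its string arguments are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus: str, behaviour: str) -> str:
    """Name the consequence type from the 2 x 2 grid in the notes.

    stimulus:  "added" or "removed" (what happens in the environment)
    behaviour: "increases" or "decreases" (what happens to the rate of activity)
    """
    if stimulus not in ("added", "removed"):
        raise ValueError("stimulus must be 'added' or 'removed'")
    if behaviour not in ("increases", "decreases"):
        raise ValueError("behaviour must be 'increases' or 'decreases'")

    # ADDED -> positive, REMOVED -> negative
    sign = "positive" if stimulus == "added" else "negative"
    # Behaviour goes up -> reinforcement, goes down -> punishment
    effect = "reinforcement" if behaviour == "increases" else "punishment"
    return f"{sign} {effect}"


print(classify_consequence("removed", "increases"))  # negative reinforcement
```

Note that "positive" and "negative" describe only whether something is added or removed, not whether the consequence is pleasant.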
Positive reinforcement (behaviour → consequence → outcome)
• Texting during class → receive reply that is more interesting than the lecture → keep on texting
• Checking email → receive desired message → check email more often
• Drinking coffee in the morning → feel more awake → have coffee every morning
Positive punishment
• Something is added to the environment that causes the behaviour to decrease in frequency →
that something must have been unpleasant
• Depending on whether one has any control over these unpleasant consequences
Negative punishment
• Something is removed from environment, causes behaviour to decrease in frequency thus that
something must have been pleasant
• AKA response cost or omission training → but regardless of the name, they all involve the
removal of a stimulus that the person values/desires/enjoys, following the targeted behaviour
• i.e. if the person makes the ‘wrong’ response then they will lose something of value
• So they should learn to inhibit or omit the ‘wrong’ behaviour (omission learning)
• To facilitate the process, they may be reinforced for exhibiting another more desirable
behaviour (DRO: differential reinforcement of other behaviour)
• Wrong response then → lose something of value
• E.g. implement the threat and reward the child for tidying toys
Negative reinforcement
• Something is removed from environment that causes behaviour to increase in frequency → that
something must have been unpleasant
• This is usually about avoidance learning – learning how to avoid unpleasant situations
One way to distinguish between all 4 is to look at the emotions:
• Negative punishment = e.g. fined for speeding → removal of pleasant stimulus → anger
• Positive punishment = application of unpleasant stimulus e.g. apprehension, terror → fear
• Positive reinforcement = application of pleasant stimulus e.g. ecstasy, elation, pleasure →
happiness
• Negative reinforcement = removal of unpleasant stimulus → relief
Schedules of reinforcement
• Continuous schedule
o Behaviour is followed by a consequence each time it occurs
o Excellent for getting a new behaviour started
o Behaviour stops quickly when reinforcement stops
o Is the schedule of choice for punishment and time-out
• Thinning intermittent reinforcement
o One of two methods is commonly used:
o 1. Gradually increasing the response ratio or the duration of the time interval between
Response → reinforcer
o 2. Providing instructions such as rules, directions and signs to communicate the
schedule of reinforcement
o i.e. give a cue/signal that reinforcement is on its way (countdown at a crossing)
• Partial Schedules for resistance to extinction
o Ratio schedules: (responses/actions) e.g. after a pre-determined number of responses
has been made → outcome
o Interval schedules: (time lapse) e.g. the 1st response after the specified time has elapsed
→ outcome
o Fixed schedules: (set rate/time)
o Variable schedules: (rate/time varies randomly around an average)
o Combinations
▪ Fixed ratio
▪ Variable ratio
▪ Fixed interval
▪ Variable interval
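The four combinations can be sketched in a few lines of Python. This is an illustrative sketch, not a model from the lecture: the function names, the uniform random draws used for the variable schedules, and the choice to restart the interval timer at each reinforced response are all assumptions made here.

```python
import random


def ratio_schedule(n_responses, ratio, variable=False, seed=0):
    """Return the (1-based) response numbers that earn an outcome.

    Fixed ratio: every `ratio`-th response is reinforced.
    Variable ratio: the requirement is redrawn after each outcome and
    averages `ratio` (assumption: uniform over 1 .. 2*ratio - 1).
    """
    rng = random.Random(seed)

    def next_requirement():
        return rng.randint(1, 2 * ratio - 1) if variable else ratio

    reinforced, count, required = [], 0, next_requirement()
    for response in range(1, n_responses + 1):
        count += 1
        if count >= required:
            reinforced.append(response)
            count, required = 0, next_requirement()
    return reinforced


def interval_schedule(response_times, interval, variable=False, seed=0):
    """Return the response times that earn an outcome.

    Only the first response after the interval has elapsed is reinforced.
    Assumption: the timer restarts at each reinforced response.
    Variable interval: waits are uniform over 0 .. 2*interval (average `interval`).
    """
    rng = random.Random(seed)

    def next_wait():
        return rng.uniform(0, 2 * interval) if variable else interval

    reinforced, available_at = [], next_wait()
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next_wait()
    return reinforced


print(ratio_schedule(10, 5))                        # [5, 10]
print(interval_schedule([1, 4, 11, 12, 25], 10))    # [11, 25]
```

On the ratio schedule the number of outcomes depends only on how many responses are made; on the interval schedule extra responses before the time has elapsed (here at 1, 4 and 12) earn nothing.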
Fixed-ratio schedule
• Behaviour/reinforcement (100/1 or 15/1)
• Response rate: (higher ratio = faster responding)
• Behaviour: (work hard) receive reinforcement; then brief post-reinforcement pause
• Resistance to extinction = LOW
• Behaviour on a fixed ratio
o High rates of responding → pause after receiving reward → then onwards for the next
reward
o Making the number of responses too high = ratio strain → a disruption in responding due
to an overly demanding response requirement
• Ratio strain