Course: PSYC*1000 (DE)
Professor: Harvey Marmurek
Schedule: Summer, 2012
Textbook: Psychology – Tenth Edition in Modules authored by David G. Myers
Textbook ISBN: 9781464102615
Module 21: Operant Conditioning
What is operant conditioning, and how is operant behaviour reinforced and shaped?
• Teaching a dog to salivate at the sound of a tone (classical conditioning)
• Teaching an elephant to walk on its hind legs (operant conditioning)
o Both are forms of associative learning
• Classical conditioning forms associations between stimuli (a conditioned stimulus and the unconditioned
stimulus it signals). It involves respondent behaviour – actions that are automatic responses to a stimulus.
• Operant conditioning – organisms associate their own actions with consequences. Actions followed by
reinforcers increase; those followed by punishers decrease. Behaviour that operates on the environment to
produce rewarding or punishing stimuli is called operant behaviour.
With classical conditioning, we learn associations between events we do not control. With operant conditioning, we
learn associations between our behaviour and resulting events.
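The rule that reinforced actions increase and punished actions decrease can be sketched as a toy update in Python. This is only an illustration: the numeric "strength" values, the step size, and the function name apply_consequence are assumptions made for the example, not a model taken from the textbook.

```python
def apply_consequence(strengths, action, consequence, step=0.1):
    """Return a new dict of action strengths after one action-consequence pairing.

    strengths   : dict mapping an action name to a value between 0.0 and 1.0
                  (a hypothetical stand-in for how likely the action is).
    consequence : "reinforcer" strengthens the action, "punisher" weakens it.
    step        : assumed size of each change; purely illustrative.
    """
    updated = dict(strengths)  # copy, so the caller's dict is left unchanged
    if consequence == "reinforcer":
        updated[action] = min(1.0, updated[action] + step)
    elif consequence == "punisher":
        updated[action] = max(0.0, updated[action] - step)
    else:
        raise ValueError("consequence must be 'reinforcer' or 'punisher'")
    return updated


before = {"press bar": 0.5, "scratch": 0.5}
after_food = apply_consequence(before, "press bar", "reinforcer")
after_shock = apply_consequence(before, "scratch", "punisher")
```

Here after_food["press bar"] ends up higher than before["press bar"], and after_shock["scratch"] ends up lower than before["scratch"], which is all the sketch is meant to show.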
Thorndike used a fish reward to entice cats to find their way out of a puzzle box through a series of manoeuvres.
The cats’ performance tended to improve with successive trials, illustrating Thorndike’s law of effect.
• B. F. Skinner (1904-1990) was a college English major and aspiring writer who entered psychology graduate school.
He became modern behaviourism’s most influential and controversial figure. Skinner’s work elaborated on
what Edward L. Thorndike (1874-1949) called the law of effect: rewarded behaviour is likely to recur.
Skinner developed a behavioural technology that revealed principles of behaviour control. Skinner designed
an operant chamber, popularly known as a Skinner box, which has a lever that an animal presses. This design
creates a stage on which rats and other animals act out Skinner’s concept of reinforcement: any event that
strengthens a preceding response. Skinner’s experiments have done far more than teach us how to pull
habits out of a rat. They have explored the precise conditions that foster efficient and enduring learning.
Shaping – gradually guiding the rat’s actions toward the desired behaviour.
You might give the rat a bit of food each time it approaches the bar, then only when it moves closer still. Finally, it must touch the bar to get the food.
Shaping can also help us understand what nonverbal organisms perceive; some animals can form
When experimenters reinforced pigeons for pecking after seeing a human face, but not after seeing other
images, the pigeons’ behaviour showed that they could recognize human faces. In this experiment, the human face
was a discriminative stimulus. Like a green traffic light, discriminative stimuli signal that a response will be reinforced.
A Gambian giant pouched rat, having been shaped to sniff out land mines, receives a bite of banana after
successfully locating a mine during training in Mozambique.
Types of Reinforcers
How do positive and negative reinforcement differ, and what are the basic types of reinforcers?
Ways to increase behaviour
Operant Conditioning Term | Description | Examples
Positive Reinforcement | Add a desirable stimulus | Pet a dog that comes when you call it; pay the person who paints your house
Negative Reinforcement | Remove an aversive stimulus | Take painkillers to end pain; fasten seat belt to end loud beeping
Negative reinforcement strengthens a response by reducing or removing something negative. Billy’s whining was
positively reinforced because Billy got something desirable.
How is operant conditioning at work in this cartoon? (Remember, when life gives you lemons, whine and pout and
cry until life can’t take it anymore and gives you cookies just to shut you up.)
If the child follows her older friend’s instructions, she will negatively reinforce her caregivers by ceasing her cries
when they grant her wishes. The parents will positively reinforce her whines with a treat.
Primary and Conditioned Reinforcers
Getting food when hungry or having a painful headache go away is innately satisfying. These primary
reinforcers are unlearned. Conditioned reinforcers, also called secondary reinforcers, get their power through
learned association with primary reinforcers. Other conditioned reinforcers: money, good grades, pleasant tone of
Immediate and Delayed Reinforcers
Let’s return to the imaginary shaping experiment in which you were conditioning a rat to press a bar. Before
performing this “wanted” behaviour, the hungry rat will engage in a sequence of “unwanted” behaviours –
scratching, sniffing, and moving around. If you present food immediately after any one of these behaviours, the rat
will likely repeat that rewarded behaviour. But what if the rat presses the bar while you are distracted, and you delay
giving the reinforcer? If the delay lasts longer than about 30 seconds, the rat will not learn to press the bar. You will
have reinforced other incidental behaviours – more sniffing and moving – that intervened after the bar press.
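The timing problem described above can be sketched in Python. The sketch assumes, as a deliberate simplification, that a reinforcer strengthens whichever behaviour happened most recently before it arrived. The function name behaviour_reinforced and the session data are hypothetical.

```python
def behaviour_reinforced(behaviours, reward_time):
    """Return the most recent behaviour at or before reward_time, or None.

    behaviours  : list of (time_in_seconds, name) tuples, in any order.
    reward_time : time in seconds at which the food is delivered.
    Assumption  : the reinforcer attaches to the latest preceding behaviour.
    """
    earlier = [b for b in behaviours if b[0] <= reward_time]
    if not earlier:
        return None
    return max(earlier, key=lambda b: b[0])[1]


# Hypothetical session: the rat presses the bar at 10 s, then wanders off.
session = [(0, "sniff"), (10, "press bar"), (25, "scratch"), (45, "sniff")]

print(behaviour_reinforced(session, 11))  # prints "press bar"
print(behaviour_reinforced(session, 50))  # prints "sniff"
```

With food delivered one second after the bar press, the bar press is what gets reinforced. With food delayed to 50 seconds, the later sniffing is reinforced instead.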
Unlike rats, humans do respond to delayed reinforcers: the paycheque at the end of the week, the good
grade at the end of the semester, the trophy at the end of the season. To our detriment, small but immediate
consequences (the enjoyment of watching late-night TV) are sometimes more alluring than big but delayed
consequences (feeling alert tomorrow).
How do different reinforcement schedules affect behaviour?
Schedules of Reinforcement
 | Fixed | Variable
Ratio | Every so many: reinforcement after every nth behaviour, such as buy 10 coffees, get one free, or pay per product unit produced. | After an unpredictable number: reinforcement after a random number of behaviours, as when playing slot machines or fly casting.
Interval | Every so often: reinforcement for behaviour after a fixed time, such as Tuesday discount prices. | Unpredictably often: reinforcement for behaviour after a random amount of time, as in checking for a Facebook response.
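Continuous reinforcement = the response is reinforced every time it occurs; learning is rapid, but so is extinction once reinforcement stops.
Partial (intermittent) reinforcement = the response is reinforced only some of the time; learning is slower, but the behaviour persists because it is occasionally rewarded, so resistance to extinction is greater.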
*Lesson for parents: partial reinforcement also works with children. Occasionally giving in to children’s tantrums for
the sake of peace and quiet intermittently reinforces the tantrums.
Fixed-ratio schedules reinforce behaviour after a set number of responses.
Skinner’s lab pigeons produced these response patterns to each of four reinforcement schedules.
(Reinforcers are indicated by diagonal marks.) For people, as for pigeons, reinforcement linked to number of
responses (a ratio schedule) produces a higher response rate than reinforcement linked to amount of time elapsed
(an interval schedule). But the predictability of the reward also matters. An unpredictable (variable) schedule
produces more consistent responding than does a predictable (fixed) schedule.
Variable-ratio schedules provide reinforcers after a seemingly unpredictable number of responses. (slot machines)
Fixed-interval schedules reinforce the first response after a fixed time period. (checking mail, checking jello
to see if it has set)
Variable-interval schedules reinforce the first response after varying time intervals (Facebook, email)
In general, response rates are higher when reinforcement is linked to the number of responses (a ratio
schedule) rather than to time (an interval schedule). But responding is more consistent when reinforcement is
unpredictable (a variable schedule) than when it is predictable (a fixed schedule).
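The four schedules can be sketched as small Python classes that decide, response by response, whether a reinforcer is delivered. This is a minimal illustration, not a model from the textbook: the class names, the respond(t) method, the helper count_reinforcers, and the way the "unpredictable" requirement is drawn (uniformly around an assumed mean) are all assumptions made for the example.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses (mean_n on average)."""
    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self._reset()

    def _reset(self):
        self.count = 0
        # Assumed distribution: uniform over 1 .. 2*mean_n - 1.
        self.required = self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self._reset()
            return True
        return False


class FixedInterval:
    """Reinforce the first response once `interval` time units have passed."""
    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # time of the last reinforcer (session starts at 0)

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """Reinforce the first response after a varying, unpredictable wait."""
    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last = 0.0
        # Assumed distribution: uniform over 0 .. 2*mean_interval.
        self.wait = rng.uniform(0, 2 * mean_interval)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self.rng.uniform(0, 2 * self.mean_interval)
            return True
        return False


def count_reinforcers(schedule, response_times):
    """Feed one response per time stamp and count how many are reinforced."""
    return sum(schedule.respond(t) for t in response_times)


print(count_reinforcers(FixedRatio(10), range(1, 31)))    # prints 3
print(count_reinforcers(FixedInterval(5), range(1, 21)))  # prints 4
```

On the ratio schedules only the number of responses matters, so responding faster earns reinforcers sooner. On the interval schedules extra responses before the time has elapsed earn nothing, which fits the lower response rates described above.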
Telemarketers are reinforced by which schedule? Variable-ratio
People checking the oven to see if the cookies are done are on which schedule? Fixed-interval
Airline frequent-flyer programs that offer a free flight after every 25,000 miles of travel are using which