PSYC 2330 Chapter Notes - Chapter 5:

Ch. 5 Instrumental Conditioning: Foundations
- Habituation, sensitization, and classical conditioning do not require the participant to make a
particular response to obtain food or other unconditioned/conditioned stimuli
- Classical conditioning reflects how organisms adjust to events in their environment that they do
not directly control
Instrumental Learning - behaviour that occurs because it was previously effective in producing certain
consequences (the stimuli an organism encounters are a result or consequence of its behaviour)
Similar actions produced the same type of outcome in the past
Difficult to isolate without experimental manipulation
Early Investigations of Instrumental Conditioning
- Thorndike was originally interested in animal intelligence, especially when Darwin's research
indicated that human intellectual capacities were present in animals as well
- He devised a series of puzzle boxes to study this
He placed hungry animals inside 2 different puzzle boxes with food outside in plain view
The task for the animal was to learn how to escape the box and obtain the food (this task was
completed by pushing down on a lever inside the box, which released a latch, or by pulling
a ring to release a latch that blocked the door on the outside)
Initially the animals were slow in their responses, but then they got out of the box faster and faster
- Thorndike interpreted these results as reflecting learning of an S-R association
Successful escapes led to learning the association between the stimulus (the lever) and the
escape response (pressing the lever to open the door)
Thorndike did not believe the animals got faster at escaping because they gained insight into the task or
figured out how the release mechanism was designed
- Thorndike formulated the law of effect:
If a response in the presence of a stimulus is followed by:
1) A satisfying event, the association between the stimulus (S) and the response (R) is strengthened,
leading to increases in behaviour
2) An annoying event, the S-R association is weakened, leading to decreases in behaviour
What is learned is an association between the response and the stimulus (the consequence of the
response is not one of the elements in the association but serves to strengthen or weaken the
S-R association)
- Compels the organism to make response R whenever stimulus S occurs (Ex. compulsive habits
such as biting one's nails)
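The law of effect can be illustrated with a toy update rule. This is only a sketch, not Thorndike's own formulation: the 0 to 1 strength scale, the step size, and the function name update_sr_strength are assumptions made for illustration. Note that the outcome is never stored in the association; it only moves the S-R strength up or down.

```python
def update_sr_strength(strength, outcome, step=0.1):
    """Return the new S-R association strength after one trial.

    strength: current S-R strength on an assumed 0..1 scale.
    outcome:  "satisfying" strengthens the association,
              "annoying" weakens it.
    step:     assumed size of each adjustment (illustrative only).
    """
    if outcome == "satisfying":
        # Move part of the way toward the maximum strength of 1.
        strength += step * (1.0 - strength)
    elif outcome == "annoying":
        # Move part of the way toward the minimum strength of 0.
        strength -= step * strength
    else:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return strength


# Five successful escapes in a row: the S-R strength rises on every trial.
strength = 0.2
history = [strength]
for _ in range(5):
    strength = update_sr_strength(strength, "satisfying")
    history.append(strength)
print([round(s, 3) for s in history])
```

A stronger S-R association corresponds to the faster escapes Thorndike observed across trials.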
Modern Approaches to the Study of Instrumental Conditioning
Discrete-Trial Procedures
- Similar to Thorndike's method (each trial begins with putting the animal in the apparatus and ends with its removal after the response)
Discrete-trial procedures - a method of instrumental conditioning in which the participant can perform the
instrumental response only during specified periods, usually determined either by placement of the
participant in an experimental chamber, or by the presentation of a stimulus
- 2 mazes frequently used are:
The runway/straight-alley maze: this maze contains a start box at one end and a goal box at
the other (a barrier separating the start box from the main section of the runway is raised,
allowing the rat to reach the goal box containing a reinforcer)
The T maze: consists of a start box and alleys arranged in the shape of a T (a goal box is
located at the end of each arm of the T; since there are 2 choice arms, the arms can be
differentiated with colours and the animal can learn to use environmental cues to tell it
which way to turn. Ex. rat pups were conditioned to go in the direction of their mother rather
than toward the arm with a virgin female)
- Measures:
Running speed - measuring how fast the animal gets from the start box to the goal box (it
increases with repeated training trials)
Latency - measuring the time it takes the animal to leave the start box and begin moving
down the alley (the time decreases as training progresses)
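The two runway measures can be computed from three timestamps per trial. This is a minimal sketch: the function names, the timestamps, and the 2 m runway length are hypothetical values chosen for illustration, not data from the chapter.

```python
def latency(door_opened_s, left_start_box_s):
    """Time (s) between the barrier being raised and the animal leaving the start box."""
    return left_start_box_s - door_opened_s


def running_speed(runway_length_m, left_start_box_s, reached_goal_s):
    """Average speed (m/s) from leaving the start box to reaching the goal box."""
    travel_time = reached_goal_s - left_start_box_s
    if travel_time <= 0:
        raise ValueError("goal must be reached after leaving the start box")
    return runway_length_m / travel_time


# Hypothetical early and late training trials on an assumed 2 m runway.
early = {"door": 0.0, "left": 4.0, "goal": 8.0}
late = {"door": 0.0, "left": 1.0, "goal": 2.0}

for name, trial in (("early", early), ("late", late)):
    print(
        name,
        "latency:", latency(trial["door"], trial["left"]),
        "speed:", running_speed(2.0, trial["left"], trial["goal"]),
    )
```

With training, latency should fall and running speed should rise, as in the hypothetical late trial.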
Free-Operant Procedures
Free-operant procedure - allows the animal to repeat the instrumental response without constraint, over and
over again, without being taken out of the apparatus until the end of an experimental session (B. F. Skinner)
This is used to study behaviour in a more continuous manner (a measurable unit of behaviour must be defined
before behaviour can be experimentally analyzed)
Operant response - a response that is defined by the effect it produces in the environment
Behaviour is not defined in terms of particular muscle movements but in terms of how the behaviour
operates on the environment
Examples include pressing a lever or opening a door to be reinforced
Any sequence of movements that depresses the lever or opens the door constitutes an instance of
that particular operant and has the same effect on the environment (pressing the lever with the right paw,
the tail, etc.)
A) Magazine Training and Shaping
Magazine training - a preliminary stage of instrumental conditioning in which a stimulus is repeatedly
paired with the reinforcer to enable the participant to learn to go and get the reinforcer when it is
presented
The sound of the food-delivery device, for example, may be repeatedly paired with food so that
the animal will learn to go to the food cup when food is delivered (classical conditioning)
The animal learns when food is available (the sound elicits a classically conditioned approach
response)
Shaping - reinforcement of successive approximations to a desired instrumental response; successful
shaping involves:
At first, food is given if the rat does anything remotely close to pressing the lever
You have to clearly define the final response you want the subject to perform
You have to clearly assess the starting level of performance
You have to divide the progression from the starting point to the end point into training steps
- The execution of the training involves 2 tactics:
Reinforcement of successive approximations to the final behaviour: give food when they
do each step
Non-reinforcement of earlier response forms: after they achieve one step, don't reinforce it
again, but reinforce the next step you want
- Applying these steps can be tricky, and you can run into difficulties if:
Shaping steps are too far apart
You spend too much time on one particular step, which ends up affecting progress
B) Shaping and New Behaviour
- Instrumental conditioning often involves combining the familiar responses that the animal already
knows with a new activity
- It can, however, also be used to produce responses that the animal has never done before, through
shaping (Ex. expert performances in sports)
- The shaping process creates new responses by depending on the inherent variability of
behaviour
Ex. variability permits a coach to set a new successive approximation, & with the new
target the trainee will begin to make longer throws
It takes advantage of this variability of behaviour to gradually move the distribution of
responses away from the trainee's starting point and toward responses that are entirely new in
the trainee's repertoire (to generate new responses)
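The way shaping uses variability to shift the distribution of responses can be sketched as a small simulation. Everything numeric here is an assumption for illustration: throws are drawn from a normal distribution around the trainee's current average, the criterion sits half a standard deviation above that average, and a reinforced throw pulls the average toward itself. The function name shape_throw_distance is hypothetical.

```python
import random


def shape_throw_distance(start_mean, target, sd=1.0, step=0.5,
                         learning_rate=0.3, trials=200, seed=0):
    """Simulate shaping of throw distance; return (final mean, reinforced count)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    mean = start_mean
    reinforced = 0
    for _ in range(trials):
        if mean >= target:
            break  # final response reached; stop training
        # Successive approximation: the criterion is set just beyond the
        # current typical response, and it moves up as the mean moves up,
        # so earlier response forms stop being reinforced.
        criterion = min(mean + step * sd, target)
        # Inherent variability: each throw differs from the current mean.
        response = rng.gauss(mean, sd)
        if response >= criterion:
            reinforced += 1
            # Reinforcement shifts the response distribution toward the
            # reinforced throw.
            mean += learning_rate * (response - mean)
    return mean, reinforced


final_mean, n_reinforced = shape_throw_distance(start_mean=10.0, target=20.0)
print(round(final_mean, 2), n_reinforced)
```

Setting step much larger makes reinforced throws rare, which mirrors the note above that shaping stalls when the steps are too far apart.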
C) Response Rate as a Measure of Operant Behaviour
- With continuous opportunity to respond, the organism determines the frequency of its
instrumental response, not the experimenter
Free-operant conditioning allows this to happen and lets the experimenter observe the
changes in behaviour over time (permits continuous observation)
The experimenter then measures the rate of occurrence (frequency of the response per
minute) of operant behaviour
Highly likely responses occur often and have a high rate while unlikely responses occur
seldom and have a low rate
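Response rate can be computed directly from the times at which responses occur in a session. This is a minimal sketch: the function names and the lever-press timestamps are hypothetical.

```python
import math


def response_rate(response_times_s, session_length_s):
    """Overall rate of responding, in responses per minute."""
    if session_length_s <= 0:
        raise ValueError("session length must be positive")
    return len(response_times_s) / (session_length_s / 60.0)


def responses_per_minute_bins(response_times_s, session_length_s):
    """Count responses in each successive minute to show change over time."""
    n_bins = math.ceil(session_length_s / 60)
    counts = [0] * n_bins
    for t in response_times_s:
        if 0 <= t < session_length_s:
            counts[int(t // 60)] += 1
    return counts


# Hypothetical lever-press times (seconds) in a 2-minute session.
presses = [5, 20, 40, 61, 90, 118]
print(response_rate(presses, 120))              # 6 presses in 2 minutes
print(responses_per_minute_bins(presses, 120))  # presses in minute 1, minute 2
```

The per-minute counts are what let the experimenter follow changes in behaviour across a session rather than only its overall level.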
Instrumental Conditioning Procedures
- Participant makes a response and thereby produces an outcome or consequence
Appetitive stimulus - a pleasant outcome (Ex. getting paid to mow the lawn)
Aversive stimulus - an unpleasant outcome (Ex. yelling at a cat for getting on the kitchen counter)
Positive Reinforcement
Positive Reinforcement - procedure in which the instrumental response produces an appetitive stimulus
If the response occurs, the appetitive stimulus is presented; if the response does not occur, the appetitive stimulus
is not presented
+ve contingency b/w instrumental response and appetitive stimulus
Result: increase in rate of responding
Ex. father gives daughter a cookie for putting toys away
Positive Punishment
Positive Punishment - procedure in which the instrumental response produces an unpleasant, or aversive, stimulus
+ve contingency between the instrumental response and the stimulus outcome (response produces
an aversive outcome)
Result: decrease in rate of instrumental responding
Ex. boss criticizes you for being late to a meeting
Negative Reinforcement
Negative Reinforcement - instrumental response turns off an aversive (unpleasant) stimulus
-ve contingency between the instrumental response and the aversive stimulus
Result: increase in instrumental responding
If the subject performs the response, the aversive stimulus is terminated or cancelled, which increases the
behaviour; if the response is not performed, the aversive stimulus continues
Example: opening an umbrella to stop the rain from getting you wet
Omission Training or Negative Punishment
Negative Punishment - procedure in which the instrumental response prevents the delivery of an
appetitive (pleasant) stimulus
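The four procedures differ only in the sign of the response-outcome contingency and the type of stimulus, so they can be summarized as a lookup. This is a study-aid sketch; the function name classify_procedure and the string labels are assumptions, not terminology from the text.

```python
# (contingency, stimulus type) -> (procedure name, effect on responding)
PROCEDURES = {
    ("positive", "appetitive"): ("positive reinforcement", "increase"),
    ("positive", "aversive"): ("positive punishment", "decrease"),
    ("negative", "aversive"): ("negative reinforcement", "increase"),
    ("negative", "appetitive"): ("negative punishment (omission training)", "decrease"),
}


def classify_procedure(contingency, stimulus):
    """Return (procedure, effect on response rate).

    contingency: "positive" if the response produces the stimulus,
                 "negative" if the response removes or prevents it.
    stimulus:    "appetitive" (pleasant) or "aversive" (unpleasant).
    """
    try:
        return PROCEDURES[(contingency, stimulus)]
    except KeyError:
        raise ValueError(f"unknown combination: {contingency!r}, {stimulus!r}")


# Ex. a cookie for putting toys away: the response produces a pleasant stimulus.
print(classify_procedure("positive", "appetitive"))
# Ex. opening an umbrella: the response stops an unpleasant stimulus.
print(classify_procedure("negative", "aversive"))
```

Both reinforcement procedures increase responding and both punishment procedures decrease it; "positive" and "negative" refer only to the contingency.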
