Learning involves the acquisition of new knowledge, skills, or responses from experience that result in a relatively
permanent change in the state of the learner. This definition emphasizes three key ideas: learning is based on
experience; learning produces changes in the organism; and these changes are relatively permanent.
Behaviorism, with its insistence on measuring only observable, quantifiable behavior and its dismissal of mental activity
as irrelevant and unknowable, was the major outlook of most psychologists working from the 1930s through the 1950s.
John B. Watson kick-started the behaviorist movement, arguing that psychologists should never use terms such as
consciousness, mental states, mind, or content.
Ivan Pavlov, whose work influenced Watson, won the Nobel Prize for his research on digestion. While studying the
digestive processes of laboratory animals, Pavlov noticed that their salivation revealed one form of learning. Classical
conditioning occurs when a neutral stimulus produces a response after being paired with a stimulus that naturally
produces a response.
Four basic elements of classical conditioning:
1) When food is presented, the dogs began to salivate. Pavlov called the presentation of food an unconditioned
stimulus (US), or something that reliably produces a naturally occurring reaction in an organism.
2) He called the dogs’ salivation an unconditioned response (UR), or a reflexive reaction that is reliably produced by
an unconditioned stimulus.
3) Before training, a buzzer is a conditioned stimulus (CS), or a stimulus that is initially neutral and produces no
reliable response in an organism.
4) When the dogs learned to associate the sound of the buzzer with food, the buzzer alone produced salivation.
Pavlov called this the conditioned response (CR), or a reaction that resembles an unconditioned
response but is produced by a conditioned stimulus.
In summary, the presentation of food (the US) has become associated with a complex CS—your getting up, moving into
the kitchen, opening the cabinet, working the can opener—such that the CS alone signals to your dog that food is on the
way and therefore initiates the CR of her getting ready to eat.
Basic Principles of classical conditioning
This was the kind of behaviorist psychology John B. Watson was proposing: An organism experiences events or stimuli
that are observable and measurable, and changes in that organism can be directly observed and measured.
1) Acquisition - the phase of classical conditioning when the CS and the US are presented together. For example,
when you first get a dog, it does not yet know that food is on the way when you walk into the kitchen. During the
initial phase of classical conditioning, there is typically a gradual increase in learning: It starts low, rises rapidly,
and then slowly tapers off.
2) Second Order Conditioning - conditioning where the stimulus that functions as the US is actually the CS from an
earlier procedure in which it acquired its ability to produce learning. For example, in an early study Pavlov
repeatedly paired a new CS, a black square, with the now reliable tone. After a number of training trials, his dogs
produced a salivary response to the black square even though the square itself had never been directly
associated with the food.
Second-order conditioning helps explain why some people desire money to the point that they hoard it and
value it even more than the objects it purchases.
3) Extinction - the gradual elimination of a learned response that occurs when the US is no longer presented. For
example, when you present the buzzer (CS) but not the food (US).
4) Spontaneous Recovery - the tendency of a learned behavior to recover from extinction after a rest period.
Extinction does not wipe out the learning that had been acquired.
5) Generalization & discrimination - the CR is observed even though the CS is slightly different from the original one
used during acquisition. This means that the conditioning “generalizes” to stimuli that are similar to the CS used
during the original training. When an organism generalizes to a new stimulus, two things happen: 1) by
responding to the new stimulus, the organism demonstrates that it recognizes the similarity between the
original CS and the new stimulus; 2) by displaying diminished responding to that new stimulus, it also tells us
that it notices a difference between the two stimuli. In the second case, the organism shows discrimination,
or the capacity to distinguish between similar but distinct stimuli.
The Story of Little Albert
Every time Albert reached out to touch a white rat, a steel bar was struck, which frightened him; eventually Albert
cried at the mere sight of a rat. Through this, Watson showed that emotional responses such as fear and anxiety
could be produced by classical conditioning, and not only by unconscious processes as Freud argued.
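The acquisition and extinction curves described above (starts low, rises rapidly, tapers off; then gradual elimination when the US is withheld) can be sketched as a toy simulation. The update rule, function name, and rate values here are illustrative assumptions, not anything from Pavlov's procedure:

```python
# Toy model of classical conditioning: associative strength rises toward an
# asymptote on CS-US pairings (acquisition) and decays when the CS is
# presented alone (extinction). Rates and asymptote are illustrative choices.

def run_trials(v, n, paired, rate=0.3, asymptote=1.0):
    """Return the trajectory of associative strength over n trials."""
    history = []
    for _ in range(n):
        target = asymptote if paired else 0.0
        v += rate * (target - v)   # move a fraction of the way toward the target
        history.append(round(v, 3))
    return history

acquisition = run_trials(0.0, 10, paired=True)            # rapid rise, then taper
extinction = run_trials(acquisition[-1], 10, paired=False)  # gradual decay

print(acquisition)
print(extinction)
```

Because each trial closes only a fraction of the remaining gap to the asymptote, early gains are large and later gains shrink, which is exactly the "rises rapidly, then slowly tapers off" shape the notes describe.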
Cognitive elements of classical conditioning
Why didn't the dogs react when they saw Pavlov the way they did when hearing the buzzer? Because Pavlov was not
a reliable indicator of the arrival of food: he was linked with the arrival of food, but he was also linked with other
activities that had nothing to do with food.
In delay conditioning, the CS is a tone that is followed immediately by the US, a puff of air, which elicits an eyeblink
response. Trace conditioning uses identical procedures, with one difference: there is a brief interval of time between
when the tone ends and when the air puff is delivered. Trace conditioning depends on awareness of the relationship
between the CS and the US, but delay conditioning does not.
Amnesic patients (those who lack explicit memory of recent experiences) show delay conditioning of eyeblink responses
but do not show trace conditioning, because delay conditioning does not require awareness of the relationship
between the tone and the air puff.
Conditioning procedures are being used to study the relationship between hallucinations and reality, as often occurs in
patients with schizophrenia.
Neural elements of classical conditioning
The cerebellum is critical for both delay and trace conditioning. The cerebellum is part of the hindbrain and plays an
important role in motor skills and learning. The hippocampus is important for trace conditioning but not delay
conditioning. The amygdala, particularly an area known as the central nucleus, is also critical for emotional conditioning.
The action of the amygdala is an essential element in fear conditioning, and its links with other areas of the brain are
responsible for producing specific features of conditioning.
Evolutionary elements of classical conditioning
Evolution and natural selection go hand in hand with adaptiveness: Behaviors that are adaptive allow an organism to
survive and thrive in its environment. Any species that forages or consumes a variety of foods needs to develop a
mechanism by which it can learn to avoid any food that once made it ill. To have adaptive value, this mechanism should
have several properties:
- There should be rapid learning that occurs in 1 or 2 trials. If learning takes more trials than this, one could die from
eating a toxic substance.
- Conditioning should be able to take place over very long intervals, perhaps up to several hours. Toxic substances
often don't cause illness immediately, so the organism would need to form an association between food and
illness over a longer term.
- The organism should develop an aversion to the smell or taste of the food rather than to its ingestion.
- Learned aversions should occur more often with new foods than familiar ones. It is not adaptive for an animal to
develop an aversion to everything it has eaten on the particular day it got sick. Our psychologist friend didn't develop
an aversion (dislike) to the Coke he drank with lunch or the scrambled eggs he had for breakfast that day; however,
the sight and smell of hummus do make him uneasy.
Biological preparedness - a propensity for learning particular kinds of associations over others, so that some behaviors
are relatively easy to condition in some species but not others. Eg: the taste and smell stimuli that produce food
aversions (dislike) in rats do not work with birds
The study of classical conditioning is the study of behaviors that are reactive: when an animal salivates during
conditioning, it is an involuntary action. Classical conditioning has little to say about voluntary
behaviors, so we turn now to a different form of learning: operant conditioning, a type of learning in which the
consequences of an organism's behavior determine whether it will be repeated in the future. The study of operant
conditioning is the exploration of behaviors that are active.
Thorndike developed the law of effect, which states that behaviors that are followed by a “satisfying state of affairs”
tend to be repeated and those that produce an “unpleasant state of affairs” are less likely to be repeated. He developed
this principle by putting a cat in a puzzle box in which only one action (pushing a lever) gave the cat access to food;
the cat had to try several other things before it discovered the lever.
In Pavlov’s work the US occurred every training trial no matter what the animal did – it was provided food whether it
salivated or not. But in Thorndike’s work the animal got food depending on its behavior and action.
Skinner coined the term operant behavior to refer to behavior that an organism produces that has some impact on the
environment. The operant chamber, aka the Skinner box, allows a researcher to study the behavior of small organisms
in a controlled environment.
Skinner’s approach to the study of learning focused on reinforcement and punishment. A reinforcer is any stimulus or
event that functions to increase the likelihood of the behavior that led to it, whereas a punisher is any stimulus or event
that functions to decrease the likelihood of the behavior that led to it. Eg: going on a roller coaster will be reinforcing for
the group that loves it but punishing for those who hate it.
Reinforcement is more effective than punishment in promoting learning, because punishment signals that an
unacceptable behavior has occurred but doesn't specify what should be done instead.
Positive reinforcement is where a rewarding stimulus is presented; negative reinforcement is where an unpleasant
stimulus is removed. Both increase the likelihood of the behavior.
Positive punishment is where an unpleasant stimulus is administered; negative punishment is where a rewarding
stimulus is removed. Both decrease the likelihood of the behavior.
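The four cases above reduce to two questions: is the stimulus pleasant or unpleasant, and is it presented or removed? A minimal sketch of that two-by-two layout (the function name and labels are my own, for illustration):

```python
# Classify an operant consequence by the two questions from the notes:
# is the stimulus pleasant, and is it presented or removed?
# "Positive"/"negative" refer to presenting/removing a stimulus;
# reinforcement increases the behavior, punishment decreases it.

def classify(pleasant: bool, presented: bool) -> str:
    if presented:
        return "positive reinforcement" if pleasant else "positive punishment"
    return "negative punishment" if pleasant else "negative reinforcement"

print(classify(pleasant=True, presented=True))    # a reward is given
print(classify(pleasant=False, presented=False))  # an unpleasant stimulus is removed
```

Note that "positive" and "negative" name whether a stimulus is added or taken away, not whether the outcome feels good, which is why removing an unpleasant stimulus counts as reinforcement.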
Food, comfort, shelter, or warmth are examples of primary reinforcers because they help satisfy biological needs.
Secondary reinforcers derive their effectiveness from their associations with primary reinforcers through classical
conditioning. Eg: handshakes, encouraging grin, etc.
The overjustification effect happens when external rewards undermine the intrinsic satisfaction of performing a
behavior. For example, when kids were given awards for drawing, they later became less interested in drawing for
its own sake.
Properties of Operant Conditioning
1) Discrimination, Generalization and importance of context - Learning takes place in contexts, not in every possible
situation. As Skinner later phrased it, most behavior is under stimulus control, which develops when a particular
response occurs only when an appropriate discriminative stimulus is present. Skinner discussed this as a "three-
term contingency": a discriminative stimulus (friends drinking coffee at Timmies), a response (joking comments about
a professor), and a reinforcer (laughter among friends).
2) Extinction - operant behavior undergoes extinction when the reinforcements stop. Eg: you wouldn’t put money into
a vending machine if it fails to give you candy. In classical conditioning, the US occurs on every trial no matter what
the organism does. In operant conditioning, the reinforcements only occur when the proper response has been
made, and they don’t always occur even then.
3) Schedules of reinforcement - Unlike classical conditioning, where the sheer number of learning trials was important,
in operant conditioning the pattern with which reinforcements appeared was crucial. The two most important are
interval schedules & ratio schedules
- Interval Schedules: under a fixed interval schedule (FI), reinforcers are presented at fixed time intervals, provided
that the appropriate response is made. Rats and pigeons in Skinner boxes produce predictable patterns of behavior
under these schedules. Students cramming for an exam is another example. Under a variable interval schedule (VI),
a behavior is reinforced based on an average time that has elapsed since the last reinforcement. Eg: radio
stations and giveaways.
- Ratio Schedules: under a fixed ratio schedule (FR), reinforcement is delivered after a specific number of responses
have been made, e.g., after every 4th response; workers getting paid after a certain number of shirts have been
washed. Continuous reinforcement is presenting reinforcement after each response. Under a variable ratio
schedule (VR), the delivery of reinforcement is based on a particular average number of responses. Eg: slot
machines in a casino - you might win after 3 pulls or even after 80 pulls.
- Variable ratio schedules produce slightly higher rates of responding than fixed ratio schedules, primarily
because the organism never knows when the next reinforcement is going to appear.
- Intermittent reinforcement - when only some of the responses made are followed by reinforcement - they
produce behavior that is much more resistant to extinction than a continuous reinforcement schedule. Eg:
when you put a dollar into a slot machine and it is broken but you don’t know that it is broken. You won’t stop
after one try; you will continue trying without winning anything. Under intermittent reinforcement, organisms
will show resistance to extinction.
- This relationship between intermittent reinforcement schedules and the robustness of the behavior they
produce is called the intermittent-reinforcement effect: operant behaviors that are maintained
under intermittent reinforcement schedules resist extinction better than those maintained under continuous
reinforcement.
4) Shaping through successive approximations - Our actions cause the world around us to react in response.
Most of our behaviors, then, are the result of shaping, or learning that results from the
reinforcement of successive steps to a final desired behavior.
5) One of the keys to establishing reliable operant behavior is the correlation between an organism's response and the
occurrence of reinforcement. Under continuous reinforcement, when every response is followed by the presentation of
a reinforcer, there is a perfect correlation. In the case of intermittent reinforcement, the correlation is weaker.
However, some behavior can arise from a merely accidental correlation between response and reinforcement.
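The four schedules above can be sketched as simple decision rules. This is a minimal illustration, and the specific numbers (a ratio of 4, a 60-second interval) are arbitrary assumptions, not values from the notes:

```python
import random

# Minimal sketches of when reinforcement is delivered under the four
# schedules of reinforcement. All numbers are arbitrary illustrations.

def fixed_ratio(n_responses, ratio=4):
    # FR: reinforce after every `ratio`-th response
    return n_responses % ratio == 0

def variable_ratio(avg_ratio=4, rng=random.random):
    # VR: each response is reinforced with probability 1/avg_ratio,
    # so reinforcement arrives after `avg_ratio` responses on average
    return rng() < 1.0 / avg_ratio

def fixed_interval(seconds_since_last, interval=60):
    # FI: a response is reinforced only once `interval` seconds have passed
    return seconds_since_last >= interval

def variable_interval(seconds_since_last, avg_interval=60, rng=random.random):
    # VI: the required wait varies around `avg_interval` (uniform here)
    required = avg_interval * (0.5 + rng())   # between 0.5x and 1.5x the average
    return seconds_since_last >= required

# A worker paid per 4 shirts (FR-4): responses 4, 8, 12 are the reinforced ones
print([n for n in range(1, 13) if fixed_ratio(n)])  # [4, 8, 12]
```

The variable schedules pass in a random source, which is why (as with slot machines) the organism can never predict which response or moment brings the next reinforcement.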
Cognitive elements of Operant Conditioning
Edward Chace Tolman advocated a cognitive approach to operant learning and provided evidence that in maze learning
experiments, rats develop a mental picture of the maze, which he called a cognitive map. He said there’s more to
learning than just knowing the circumstances in the environment. He proposed that an animal established a means-ends
relationship. Tolman conducted studies on latent learning and cognitive maps, two phenomena that suggest that simple
stimulus-response interpretations of operant learning are inadequate.
In latent learning, something is learned but it is not manifested as a behavioral change until sometime in the future.
Cognitive map - a mental representation of the physical features of the environment. For example, instead of
remembering only "start here, end here," the animal represents the route as something like "two lefts, then two rights."
Neural elements of operant conditioning
Pleasure centers – contributes to the process of reinforcement. The nucleus accumbens, medial forebrain bundle, and
hypothalamus are all major pleasure centers in the brain.
The neurons in the medial forebrain bundle, a pathway that meanders its way from the midbrain through
the hypothalamus into the nucleus accumbens, are the most susceptible to stimulation that produces pleasure. The
neurons all along this pathway and especially those in the nucleus accumbens itself are all dopaminergic; that is, they
secrete the neurotransmitter dopamine.
Evolutionary elements of operant conditioning
The associative mechanisms that underlie operant conditioning have their roots in evolutionary biology. Some things are
relatively easily learned and others are difficult; the history of the species is usually the best clue as to which will be
which.
Observational learning - learning that takes place by watching the actions of others. It produces changes in behavior
even though the learner receives no direct reinforcement, which challenges behaviorism's reinforcement-based
explanations of classical and operant conditioning. This type of learning plays an important role in surgical training.
Observational learning is based on cognitive mechanisms such as attention, perception, memory, or reasoning. But
observational learning also has roots in evolutionary biology and for the most basic of reasons: It has survival value.
Diffusion chain- individuals initially learn a behavior by observing another individual perform that behavior, and then
serve as a model from which other individuals learn the behavior
Sports are an example of observational learning - you learn by watching your coaches. Fears may emerge not from
specific conditioning experiences but from observing and learning from the reactions of others. Other species also
learn through observation (e.g., pigeons). "Enculturation hypothesis": being raised in a human culture has a
profound effect on the cognitive abilities of chimpanzees.
Mirror neurons are a type of cell found in the brains of primates (including humans). Mirror neurons fire both when a
monkey performs an action and when it watches another animal perform the same action.