Pavlovian Conditioning 4/8/2013 7:56:00 PM
PAVLOV
• Concluded that there are two types of reflexes: conditional & unconditional
  o Unconditional reflex: an unconditional reflex is just a reflex (discussed before) – consists of an unconditional stimulus and an unconditional response
    ▪ Inborn & usually permanent
    ▪ Unconditional stimulus (US): the stimulus that elicits an unconditional response
    ▪ Usually things important to survival
    ▪ Unconditional response (UR): the response elicited by an unconditional stimulus
  o Conditional reflex: a reflex acquired through Pavlovian conditioning and consisting of a conditional stimulus and a conditional response
    ▪ Depends on experience & learning
    ▪ Conditional stimulus (CS): the stimulus part of a conditional reflex; the stimulus that elicits a conditional response
    ▪ Conditional response (CR): the response part of a conditional reflex; the response elicited by a conditional stimulus
  o Found that anything could become a conditional stimulus if it was paired with an unconditional stimulus
    ▪ For example, clapping your hands could become a conditional stimulus if, right after clapping, you put breadcrumbs in your dog’s mouth (which elicits salivation)
    ▪ Eventually, the dog would start salivating as soon as you clapped
• The PROCEDURE is known as Pavlovian or Classical Conditioning
• Two important notes for this type of conditioning:
• The CS and US are presented regardless of what the individual does
• The behaviour involved is a reflex response, such as salivating, blinking an eye, sweating, or jumping in response to a loud noise
HIGHER-ORDER CONDITIONING
• Higher-order conditioning: a variation of Pavlovian conditioning in which a stimulus is paired, not with a US, but with a well-established CS
MEASURING PAVLOVIAN LEARNING
• Test trials: presenting a conditional stimulus alone (without the unconditional stimulus) to see whether it elicits a conditional response
  o These are just randomly placed at intervals & plotted on a
curve – learning is represented as an increase in the frequency of the conditional response
• Pseudoconditioning: the tendency of a neutral stimulus to elicit a CR after a US has elicited a reflex response
  o For example, say a nurse coughs and then gives you a shot; if she coughs again after the shot, you are more likely to wince – you cannot say this is because you are conditioned; the shot has just sensitized you to other stimuli
VARIABLES AFFECTING PAVLOVIAN CONDITIONING
PAIRING CS & US
• There are 4 ways of pairing stimuli: trace conditioning, delayed conditioning, simultaneous conditioning, and backward conditioning
• Trace conditioning: the CS begins and ends before the US is presented
  o A buzzer sounds, and only after the buzzer stops is air blown into a person’s eye to make them blink
• Delayed conditioning: the CS and US overlap
  o The US appears before the CS has disappeared
• Simultaneous conditioning: the CS and US coincide exactly
  o This is a very weak way to establish a conditional response
• Backward conditioning: the CS follows the US
CS-US CONTINGENCY
• Contingency: if-then statement; Y (an event) occurs only if X occurs – Y is contingent on X
• The rate of Pavlovian conditioning varies with the degree of CS-US contingency
CS-US CONTIGUITY
• Contiguity: the closeness in time or space between 2 events
• Interstimulus interval (ISI): the interval between the CS and US
• The shorter the interval, the more effective the conditioning; HOWEVER, presenting the CS and US simultaneously is not effective
STIMULUS FEATURES
• If you use a compound stimulus, overshadowing often occurs – one stimulus completely overpowers the others and is the only one that becomes an effective CS
• Stronger stimuli are more likely to get conditioned
• Intensity of the US is also important
PRIOR EXPERIENCE WITH CS AND US
• Latent inhibition: the failure of a CR to appear as a result of prior presentation of the CS in the absence of the US
• Blocking: if you pair an already established CS with a novel stimulus, the novel one will not work on its own, because the prior experience with the established stimulus blocks conditioning to the new one
• Sensory preconditioning: if two neutral stimuli are paired together and then just one of them is paired with a US so that it becomes a CS, the other neutral stimulus will become a CS more rapidly
NUMBER OF CS-US PAIRINGS
• The first few pairings are the most important; after that the response is largely established
• Graph on page 78
INTERTRIAL INTERVAL
• Intertrial interval: the gap between successive trials
• OBVIOUSLY THIS AFFECTS THE RATE OF CONDITIONING
EXTINCTION OF CONDITIONAL RESPONSES
• Extinction: repeatedly presenting the CS without the US, making the CR weaker and weaker
• Spontaneous recovery: the sudden reappearance of a CR after extinction
• A CR will also be more readily reestablished after extinction than it was initially – shows that the effects of conditioning are never entirely lost
THEORIES OF CONDITIONING
STIMULUS SUBSTITUTION THEORY
• This theory basically says that the US evokes a reflex response from birth & all you’re doing with the CS is substituting it for the US to get the same reflex response
• One problem with this theory is that the UR occurs faster, more reliably, and more often than the CR
• Cannot account for blocking, latent inhibition, and extinction
RESCORLA-WAGNER MODEL
• This model argues that there is a limit to the amount of conditioning that can occur in the pairing of two stimuli
  o Ex. the first time you get stung by a bee, you’ll be scared; the second time your fear will increase; the third time maybe it will increase a little more; but at some point you’ll be as afraid of bees as you can get
  o Rescorla & Wagner came up with a formula based on this idea of a decelerating learning curve (page 78 figure again)
Pavlovian Applications
FEAR
• John B.
Watson was the first person to say that fear, love, hate, and disgust are largely learned through Pavlovian conditioning
  o He called these emotional responses to a stimulus conditioned emotional responses
• Watson and Rosalie Rayner used the Little Albert experiment to show that a phobia-like fear could be produced by conditioning
  o They used a loud noise as the US and paired it with a rat
  o Soon, through Pavlovian conditioning, Albert would cry as soon as he saw the rat, because a conditional fear response had been established
• Counterconditioning: the use of Pavlovian procedures to reverse the unwanted effects of conditioning
  o Ex. Peter was very scared of rabbits, so a researcher used Pavlovian conditioning to remove this fear
    ▪ She would first bring the rabbit into view while Peter ate crackers
    ▪ The crackers acted as a positive US, while the rabbit was the CS that elicited fear
    ▪ Every day she would bring the rabbit closer and closer while Peter ate his crackers
    ▪ Soon, the rabbit was in his lap and Peter was able to eat his crackers and play with the rabbit without any fear
• Counterconditioning is often called exposure therapy
• There are many different forms of counterconditioning; the best known is systematic desensitization
• Systematic desensitization: a procedure for treating phobias in which a person imagines progressively stronger forms of the frightening CS while relaxed
PREJUDICE
• Prejudice is also seen as a conditioned emotional response
• Studies have shown that when certain ethnic groups are paired with negative words, people become more likely to dislike those groups
• Higher-order conditioning can account for the acquisition of likes and dislikes toward ethnic groups, including groups with which we have not had direct contact
ADVERTISING
• When manufacturers pair products with pleasant or positively arousing scenes, people are more likely to want that product
• Marketing experts also do this by pairing products they want to sell with
items that already arouse positive emotions
• Experiments support the idea that brand names, positive scenes, and celebrities can affect what people will or will not buy
THE PARAPHILIAS
• Include: voyeurism, exhibitionism, fetishism, transvestism, sadism, masochism, and rape
• One form of treatment for paraphilias is aversion therapy
• Aversion therapy: a form of counterconditioning in which a CS is paired with an aversive US, often nausea-inducing drugs
  o This type of therapy has been reported to have high success rates across many people and various paraphilias
TASTE AVERSIONS
• Conditioned taste aversion: an aversion, acquired through Pavlovian conditioning, to foods with a particular flavor
• Eating a food that we have had many times before is not likely to result in a taste aversion – novel foods are usually the ones that come to be avoided
IMMUNE FUNCTION
• Researchers have shown that allergic reactions can become conditioned to things (odors, items) that were associated with the allergen at some point
DRUG OVERDOSE DEATHS
• Sometimes, conditional stimuli become signals that prepare the body for the US
  o For example, if you always take a drug in the same environment, certain aspects of that environment become CSs and your body prepares for the drug – your body can suppress the drug’s effects (drug tolerance)
  o When you then take the drug in a new location, your body has no preparatory signal and so there is no drug tolerance → the same amount of drug can lead to a fatal overdose
Reinforcement
Thorndike
• Was trying to prove to people that animals did not learn through reasoning and were not able to think logically about things
• Tested animal intelligence by presenting animals with problems and then seeing whether their performance improved or not
  o One experiment he did was with cats: he shut a cat in a box and placed food in plain view but out of reach
    ▪ Cat started off with ineffective ways to try
to get the food – once it figured out how to open the door, the ineffective attempts became fewer with each trial
• He concluded that a given behaviour typically has one of two kinds of consequences or effects
  o 1) Satisfying state of affairs → when the animal does get what it wants – cat gets food
  o 2) Annoying state of affairs → when the animal can’t get what it wants – cat remains annoyed because it can’t get to the food
• Law of effect: the statement that behaviour is a function of its consequences. So called because the strength of a behaviour depends on its past effects on the environment. Implicit in the law is the notion that operant learning is an active process because it is usually the behaviour of the organism that, directly or indirectly, produces the effect
• Was the first person to show that behaviour is systematically strengthened or weakened by its consequences
Skinner
• Operant learning: experiences whereby behaviour is strengthened or weakened by its consequences
  o Also called instrumental learning because the behaviour is instrumental in producing the consequences
  o AKA operant conditioning
  o Main difference between this & Pavlovian conditioning is that in operant learning we act on the environment and change it, and the change thus produced strengthens or weakens the behaviour that produced it
TYPES OF OPERANT LEARNING
• 4 types of operant learning
  o positive reinforcement
  o negative reinforcement
  o positive punishment
  o negative punishment
• Reinforcement: an increase in the strength of behaviour due to its consequence
  o An experience must have 3 characteristics to qualify as reinforcement: the behaviour must have a consequence, the behaviour must increase in strength, and the increase in strength must be the result of the consequence
*Both positive and negative reinforcement generally increase the strength of behaviour: positive – the reinforcing consequence is the appearance of a stimulus; negative – the reinforcing consequence is the removal of a stimulus
• Positive reinforcement: a behaviour is followed by the appearance of, or an increase in the intensity of, a stimulus
  o The stimulus is called a positive reinforcer and it is something the individual seeks out
  o Because the reinforcers involved in positive reinforcement are usually things most people consider rewarding, positive reinforcement is sometimes called reward learning
  o The only defining characteristic of a positive reinforcer is that when it is presented following a behaviour, it strengthens that behaviour
• Negative reinforcement: a behaviour is strengthened by the removal of, or a decrease in the intensity of, a stimulus
  o This stimulus is called a negative reinforcer and it is something the individual tries to escape or avoid
  o What reinforces behaviour in negative reinforcement is escaping from an aversive stimulus
  o Once you learn to escape it you often avoid it entirely
    ▪ Sometimes referred to as escape-avoidance learning because of this
DISCRETE TRIAL & FREE OPERANT PROCEDURES
• Discrete trial procedure: an operant training procedure in which performance of a behaviour defines the end of a trial
  o The dependent variable is often the time taken to perform some behaviour under study
• Skinner used a free operant procedure
  o Free operant procedure: an operant training procedure in which a behaviour may be repeated any number of times
  o Usually the dependent variable in free operant experiments is the number of times a particular behaviour occurs per minute
OPERANT & PAVLOVIAN LEARNING COMPARED
• Differences between Pavlovian & operant learning:
  o In operant learning, an important environmental event is contingent on behaviour, whereas in Pavlovian conditioning one stimulus (US) is contingent on another (CS)
  o Pavlovian conditioning generally involves “involuntary” behaviour and operant learning usually involves “voluntary” behaviour
  o Can’t always say that one is happening and the other is not; they can be very intertwined
KINDS OF REINFORCERS
• Primary
reinforcers: any reinforcer that is not dependent on another reinforcer for its reinforcing properties
  o Food, water, sexual stimulation, relief from heat and cold, and certain drugs
  o Why are there only a few primary reinforcers & why are they limited in their role in human learning?
    ▪ a) because primary reinforcers are generally readily available in advanced societies
    ▪ b) because they lose their effectiveness quickly (satiation)
• Secondary reinforcers: those that are dependent on their association with other reinforcers
  o Praise, recognition, smiles, positive feedback
  o Even subtle changes in the environment will act as reinforcers if they are paired with other reinforcers, including other secondary reinforcers
  o Secondary reinforcers owe their effectiveness directly or indirectly to primary reinforcers
  o Also called conditioned reinforcers because they seem to acquire their reinforcing power by being paired with other reinforcers
CONTRIVED & NATURAL REINFORCERS
• Contrived reinforcers: events that have been arranged by someone, usually for the purpose of modifying behaviour
  o Food, water, sexual stimulation, comfortable temperature and humidity, sleep, and other primary reinforcers
• Natural reinforcers: typically events that follow automatically (naturally) from the behaviour
• What distinguishes the two is whether the consequence follows spontaneously from the behaviour or has been arranged by someone to change behaviour
  o Ex. teeth brushing
SHAPING & CHAINING
• Shaping: a training procedure; the reinforcement of successive approximations of a desired behaviour
  o Ex. how children’s tantrums become louder and louder until the parent gives in
  o Shaping makes it possible to train, in a few minutes, behaviour that never occurs spontaneously
• Behaviour chain: a connected sequence of behaviour
  o Ex. the routine a gymnast has to perform on the balance beam
• Chaining: training an animal or person to perform a behaviour chain
  o Ex.
making a phone call, brushing your teeth, eating out
  o First step in chaining is to break down the task into its links – task analysis
  o Two ways to link the parts of the chain:
    ▪ Forward chaining: trainer begins by reinforcing performance of the first link in the chain
    ▪ Repeated until the link is performed without hesitation
    ▪ Then trainer requires performance of the first two links
    ▪ Keeps going until the whole chain is done smoothly
    ▪ Trainer can use shaping to build a link that does not readily occur
    ▪ Backward chaining: begin with the last link in the chain and work backward toward the first link
VARIABLES AFFECTING REINFORCEMENT
• Contingency
  o In operant learning, refers to the degree of correlation between a behaviour and its consequence
  o Rate at which learning occurs varies with the degree to which a behaviour is followed by a reinforcer
  o Even small reinforcers can be very effective if there is a strong correlation between the behaviour and the reinforcer
• Contiguity
  o The gap between a behaviour and its reinforcing consequence has a powerful effect on the rate of operant learning
  o The shorter the interval, the faster learning occurs
REINFORCER CHARACTERISTICS
• Small reinforcers given frequently usually produce faster learning than large reinforcers given infrequently, BUT if everything else is equal then a larger reinforcer is more effective
• The relation between reinforcer size (magnitude) and learning is not linear: the more you increase the reinforcer magnitude, the less benefit you get from each increase
  o Students who got bonuses while working
TASK CHARACTERISTICS
• Certain qualities of the behaviour being reinforced affect the ease with which it can be strengthened
MOTIVATING OPERATIONS
• Motivating operation (establishing operation): anything that establishes conditions that improve the effectiveness of a reinforcer
  o Ex.
depriving an animal of food for a few hours makes food a more effective reinforcer
  o The greater the level of deprivation, the more effective the reinforcer
OTHER VARIABLES
• Having to work hard for a reinforcer or having to wait a long time for its delivery seems to make it more effective in the future
• Competing contingencies – the effects of reinforcing a behaviour will be very different if the behaviour also produces punishing consequences or if reinforcers are simultaneously available for other kinds of behaviour
  o Procrastinating → there are more appealing (reinforcing) things to do than what we’re supposed to do
NEUROMECHANICS OF REINFORCEMENT
• Reward pathway: the neural pathway believed to be associated with positive reinforcement. It is thought to be an area in the septal region, the area separating the two cerebral hemispheres and running from the middle of the brain to the frontal cortex
• Positive reinforcement is associated with the release of dopamine in the brain
EXTINCTION OF REINFORCED BEHAVIOUR
• Extinction: withholding the consequences that reinforce a behaviour
  o Ex. pressing a lever has always released food – then suddenly the food stops
• Although the overall effect of extinction is to reduce the frequency of the behaviour, the immediate effect is often an abrupt increase in the behaviour on extinction → extinction burst
  o Extinction can also increase the frequency of emotional behaviour, particularly aggression
  o One extinction session is often not enough to extinguish behaviour.
What happens is: the rate of the previously reinforced behaviour declines and finally stabilizes at or near its pretraining level, but if the subject is later put back in the training situation, the behaviour occurs again AS IF IT NEVER DISAPPEARED
    ▪ This reappearance of behaviour is spontaneous recovery
• Resurgence: the reappearance, during extinction, of a previously reinforced behaviour
  o This is when you teach a subject a behaviour, extinguish that behaviour, and then teach a new behaviour; when that second behaviour is put on extinction, the first behaviour starts coming back as the second declines
• If a behaviour is kept continuously on extinction, it will continue to decline in frequency. When it no longer occurs, or occurs no more often than it did before reinforcement, we can say that it has been extinguished.
• Rate at which extinction occurs depends on: the number of times the behaviour was reinforced before extinction, the effort the behaviour requires, and the size of the reinforcer used during training
THEORIES OF REINFORCEMENT
HULL’S DRIVE-REDUCTION THEORY
• Hull believed that animals and people behave because of motivational states called drives
  o Ex. an animal deprived of food is driven to get some
• A reinforcer is a stimulus that reduces one or more drives
• His theory works well with primary reinforcers such as food and water because these reinforcers alter a physiological state
• To account for other reinforcers (praise, money, etc.), Hull suggested that secondary reinforcers derive their reinforcing powers from, and are dependent on, their association with drive-reducing primary reinforcers
RELATIVE VALUE THEORY AND THE PREMACK PRINCIPLE
• Premack said that reinforcers could be thought of as behaviour instead of stimuli
  o Ex.
instead of thinking of food as a reinforcer, you can think of the act of eating as the reinforcer
  o Relative value theory: theory of reinforcement that considers reinforcers to be behaviours rather than stimuli and that attributes a reinforcer’s effectiveness to its probability relative to other behaviours
  o Premack principle: high-probability behaviour reinforces low-probability behaviour
    ▪ Strong behaviour strengthens weak behaviour
    ▪ Ex. if a rat shows a stronger inclination to drink than to run, you can use drinking to reinforce running → the rat gets water only if it runs → the time spent running increased *DRINKING WAS ABLE TO REINFORCE RUNNING
  o The relative value of activities determines their reinforcement value
  o An event is reinforcing simply because it provides the opportunity to engage in preferred behaviour
  o Problems – can’t explain some secondary reinforcers; also, low-probability behaviour can sometimes reinforce high-probability behaviour if the participant has been prevented from performing the low-probability behaviour for some time
RESPONSE DEPRIVATION THEORY
• Timberlake & Allison proposed the response deprivation theory
• The main idea is that behaviour becomes reinforcing when the individual is prevented from engaging in it at its normal frequency
• Every behaviour has a baseline level; if we restrict access to a behaviour so that its rate falls below the baseline level, the subject will engage in behaviour that provides it with access to the restricted behaviour
• Similar to Premack’s theory, but in this theory the relative value of one reinforcer to another is not vital
• So it basically predicts that the opportunity to engage in any behaviour that has fallen below its baseline level will be reinforcing
• HOWEVER, like all the theories discussed so far, it has trouble explaining the reinforcing power of words such as yes, right, correct, etc.
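The predictions of the Premack principle and response deprivation theory can be contrasted in a small sketch. This is only an illustration under assumptions: the function names, the baseline figures (minutes per hour under free access), and the restricted-access figures are all hypothetical and are not taken from Premack's or Timberlake & Allison's data.

```python
# Hypothetical illustration; names and numbers are assumptions, not data from the notes.

def premack_predicts(baseline, reinforcer, target):
    """Premack principle: a behaviour reinforces another only if it is
    the more probable of the two under free access (baseline)."""
    return baseline[reinforcer] > baseline[target]


def response_deprivation_predicts(baseline, allowed, reinforcer):
    """Response deprivation: access to a behaviour is reinforcing whenever
    that behaviour has been restricted below its own baseline level."""
    return allowed[reinforcer] < baseline[reinforcer]


# Assumed baseline: minutes per hour a rat spends on each activity when free.
baseline = {"drinking": 20, "running": 5}

# Assumed restriction: running is held to 1 minute per hour, drinking is free.
allowed = {"drinking": 20, "running": 1}

for reinforcer, target in [("drinking", "running"), ("running", "drinking")]:
    print(
        f"{reinforcer} reinforces {target}? "
        f"Premack: {premack_predicts(baseline, reinforcer, target)}, "
        f"response deprivation: "
        f"{response_deprivation_predicts(baseline, allowed, reinforcer)}"
    )
```

With these assumed numbers the two rules disagree about running: the Premack rule says the less probable behaviour cannot reinforce drinking, while the response deprivation rule says it can, because running has been held below its baseline. That is the exception listed under Problems for relative value theory.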
THEORIES OF AVOIDANCE
• In negative reinforcement, a behaviour is strengthened when it is followed by the removal of a stimulus
  o The stimulus is an aversive event
  o Escape from an aversive stimulus is reinforcing
    ▪ The dog, light, and shock
TWO PROCESS THEORY
• Two-process theory: states that two kinds of learning experiences are involved in avoidance learning, Pavlovian & operant
• Sidman avoidance procedure: an escape-avoidance training procedure in which no stimulus regularly precedes the aversive stimulus. Also called unsignaled avoidance
  o Ex. shock is not signaled by anything
ONE PROCESS THEORY
• One-process theory: proposes that avoidance involves only one process: operant learning
  o Both escape and avoidance behaviours are reinforced by a reduction in aversive stimulation
  o An animal that has learned to avoid shocks by jumping a barrier continues to do so because by doing so it continues to avoid shock
    ▪ What if you disconnected the shock and prevented the animal from jumping? It would try...but it wouldn’t be able to, and no shock would be delivered, so after a while it would stop trying to jump
  o The best way to get an animal to stop performing an unnecessary avoidance behaviour is to prevent the behaviour from occurring
Schedules of Reinforcement
• Learning can refer to the acquisition of a new behaviour, an increase in the rate of a behaviour, a reduction in the rate of a behaviour, or a change in the pattern of performance
• Schedule of reinforcement: the rule describing the delivery of reinforcement
  o A particular kind of reinforcement schedule tends to produce a particular pattern and rate of performance → schedule effect
  o When a given schedule is in force for some time, the pattern of behaviour is very predictable
  o If a behaviour is occurring at a steady rate and the reinforcement schedule changes, the behaviour will usually change in predictable ways
SIMPLE SCHEDULES
CONTINUOUS REINFORCEMENT
• Continuous reinforcement (CRF): the
simplest of simple schedules; a behaviour is reinforced every time it occurs
  o Ex. a child is getting CRF if she gets praise whenever she hangs up her coat; so are you if you get something out every time you put money into a vending machine
  o Each reinforcement strengthens behaviour, so CRF leads to very rapid increases in the rate of behaviour
  o Rare in the natural environment
    ▪ Ex. a parent can’t praise their child EVERY time she hangs up a coat
  o Intermittent schedule: when reinforcement occurs on some occasions but not others
FIXED RATIO SCHEDULES
• Fixed ratio (FR) schedule: a behaviour is reinforced when it has occurred a fixed number of times
  o Ex. a rat is trained to press a lever for food – after a while every third lever press is reinforced
• Animals on fixed ratio schedules perform at a high rate, often punctuated by short pauses after reinforcement
• Postreinforcement pauses: the pauses that follow reinforcement
  o This pause is not because of fatigue
  o The more work required for each reinforcement, the longer the postreinforcement pause
VARIABLE RATIO SCHEDULES
• Variable ratio (VR) schedule: a reinforcement schedule in which, on average, every nth performance of a behaviour is reinforced
  o If postreinforcement pauses occur, they appear less often and are shorter than in an FR schedule
  o A VR schedule produces more behaviour in an hour than an FR schedule
  o Common in natural environments
FIXED INTERVAL SCHEDULES
• Fixed interval (FI) schedule: the behaviour under study is reinforced the first time it occurs after a constant interval
  o Ex.
a pigeon pecks at something and food appears, but then no food appears for 5 seconds – during those 5 seconds pecking produces no reinforcement
  o Produce postreinforcement pauses
  o Not really seen in nature
  o Pecking, or checking, for the reinforcer increases near the end of the interval
VARIABLE INTERVAL SCHEDULES
• Variable interval (VI) schedule: the length of the interval during which performing is not reinforced varies around some average
• Produce high, steady run rates
  o Higher than FI but not higher than FR and VR
  o Ex. a leopard waiting for its prey – sometimes the wait may be long, sometimes it may be short, but it is reinforced by the appearance of prey
OTHER SIMPLE SCHEDULES
• Fixed duration (FD) schedule: reinforcement is contingent on the continuous performance of a behaviour for some period of time
  o Ex. a child is required to practice piano for a set time and, once finished, receives a prize as a reinforcer
• Variable duration (VD) schedule: the required period of performance varies around some average
  o Child practicing piano may practice 20 mins one day and 30 another
  o On average the child will practice for 30 minutes before getting something, but there is no telling when the reinforcer may appear
• Parents using duration schedules usually don’t provide reinforcement – they think that getting better at the piano is enough, but this intrinsic reinforcer is too weak
• Differential reinforcement of low rate (DRL): a form of differential reinforcement in which a behaviour is reinforced only if it occurs no more than a specified number of times in a given period
  o Produces low rates of behaviour
• Differential reinforcement of high rate (DRH): a behaviour is reinforced only if it occurs at least a specified number of times in a given period
  o Produces higher rates of behaviour than any other schedule
  o Most useful when you want to increase the rate of a behaviour
• Noncontingent reinforcement (NCR): a schedule in which reinforcers are delivered independently of behaviour
  o 2 main kinds:
    ▪ Fixed time (FT) schedule: a reinforcer is delivered after a given period of time without regard to behaviour
    ▪ Ex.
a pigeon will receive food every 10 seconds whether it pecks the disk or not
    ▪ Variable time (VT) schedule: reinforcement is delivered periodically at irregular intervals regardless of what behaviour occurs
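Since a schedule of reinforcement is defined as a rule, the four simple ratio and interval schedules can be written as small rule objects. The sketch below is a hedged illustration, not a model from the notes: the class names, the `respond` method, the uniform distributions used for the variable schedules, and the subject that responds once per second are all assumptions chosen to keep the example short.

```python
import random


class FixedRatio:
    """FR n: every n-th response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR n: the required number of responses varies around a mean of n."""

    def __init__(self, n, rng):
        self.n = n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1..2n-1, which has mean n.
        return self.rng.randint(1, 2 * self.n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI s: the first response at least s seconds after the last reinforcer is reinforced."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.last = 0.0  # assumption: session start counts as the last reinforcer

    def respond(self, t):
        if t - self.last >= self.seconds:
            self.last = t
            return True
        return False


class VariableInterval:
    """VI s: like FI, but the interval varies around a mean of s seconds."""

    def __init__(self, seconds, rng):
        self.seconds = seconds
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform over 0..2s, which has mean s.
        return self.rng.uniform(0, 2 * self.seconds)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


def count_reinforcers(schedule, n_responses=60, seconds_per_response=1.0):
    """Feed evenly spaced responses to a schedule and count the reinforcers."""
    return sum(
        schedule.respond(i * seconds_per_response)
        for i in range(1, n_responses + 1)
    )


rng = random.Random(0)  # seeded so the variable schedules are repeatable
for name, schedule in [
    ("FR 3", FixedRatio(3)),
    ("VR 3", VariableRatio(3, rng)),
    ("FI 5 s", FixedInterval(5)),
    ("VI 5 s", VariableInterval(5, rng)),
]:
    print(name, count_reinforcers(schedule))
```

With 60 responses at one per second, FR 3 delivers 20 reinforcers and FI 5 s delivers 12; the VR and VI counts depend on the random draws. The simulated subject responds at a constant rate, so the sketch shows only how each rule meters out reinforcers, not the postreinforcement pauses or run rates that real animals show on these schedules.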
