
learning mid 2-notesolution.docx


Department: Psychology
Course Code: PSYC 2330
Professor: Francesco Leri

1/21/2013 4:10:00 PM

WEEK 6

Operant Conditioning (S-R learning)
• 3 primary elements:
  o 1) Stimulus (S)
    - The stimulus is also a signal; it allows the response to occur.
    - You respond because that stimulus has become important through classical conditioning.
    - The operant stimulus is the stimulus you are responding to.
    - There are other conditioned stimuli as well.
  o 2) Response (R)
  o 3) Significant event (S*)
    - Strengthens the S-R bond.
  o Here you are not responding to the burger; you are responding to the McDonald's sign, which became important through classical conditioning. The sign and the burger have been associated together over and over.
    - Therefore, what makes it operant is the RESPONSE.
• Example:
  o Operant stimulus: the cigarette (you pick it up and smoke it; this is the most direct element, and the operant response is smoking).
  o The other stimuli are conditioned stimuli that cause a conditioned response (contextual cues, smell, smoke, etc.).
  o Stimuli:
    - Marlboro: a visual stimulus, also smell
    - Contextual cues: bars or a smoking chamber
    - Only one is the operant stimulus: the cigarette (that is what you operate). Look for the most direct element that you respond to.
    - Biologically significant stimulus (S*): tobacco, i.e. nicotine (the drug)
  o Response:
    - Operant response: the act of smoking (related to the cigarette)

Reinforcing Stimuli
• 1) Primary Reinforcers
  o Stimuli needed for survival = food, water, sex
  o Stimuli that mimic the effects of food, water and sex in the brain = drugs
  o Sensory stimulation & novelty
    - They require no learning; they are biologically important.
• 2) Secondary Reinforcers
  o A previously neutral stimulus that has acquired the capacity to strengthen responses because it has been repeatedly paired with food or with some other primary reinforcer
    - The S* isn't what strengthens the bond;
      it is the CS (which was learned through conditioning).
    - The S associated with the S* (S-S*): the conditioned stimulus
    - Meaningful because it has been paired with a US
    - Use the CS to reinforce the link with another stimulus, not to alter behaviour.
  o (Wolfe 1936)
    - Chimpanzees pressing a lever for tokens
    - Chimps were trained to press a lever for tokens, which they could trade in for bananas. They treated the CS as food and would take tokens from each other. The token became a CS that promoted the bond.
  o Humans' biggest secondary reinforcer is money!
    - Can be used to alter behaviour
• 3) Social Reinforcers
  o Stimuli whose reinforcing properties derive uniquely from the behaviour of other members of the same species: praise, affection, attention. They are usually a blend of primary and secondary reinforcers (a smile & "good").
    - Stimuli whose ability to strengthen the bond comes from the behaviour of others.
    - Works well within a species, but not among different species.
    - The word 'good' becomes important (social reinforcer/CS) because it is associated with positive reactions.
    - Can be negative, but it would be odd to call that a 'reinforcer' (rather, punishment).
    - The ability to strengthen the link between a stimulus and a response comes from a reaction within the genes. Works within a species very well.
    - Facial reactions: inborn tendency to recognize smiles (good)
• Conditioned Reinforcers & Shaping
  o There is a difference between what a stimulus does to behaviour and what it does to how you 'feel' about it. We are discussing strictly behavioural modification (a series of actions that become more or less frequent depending on whether they are reinforced).
  o Rat in Box example:
    - Elements:
      - Stimulus (S) = lever, the instrument the animal is operating (it is also the CS after it is learned)
      - Response (R) = the action of pressing the lever
      - Reinforcer / Significant Stimulus = food pellet
    - Shaping: the active process of teaching the response
      - Make the rat hungry (restrict food, make it excited).
      - It may press the lever, but how does it make the link between the lever and food?
    - Stage 1:
      - Put the animal in the chamber & turn on the food dispenser (the animal doesn't do anything; it just hears the clicking of food falling in and gets the food).
      - You need to establish a reliable CS predictive of the food (the clicking the food makes when falling into the dispenser).
    - Stage 2:
      - When the animal turns to face the lever, you make the noise by dropping food. Every time it looks at the lever, you make the noise again.
      - You then play a trick: when it looks at the lever, you DON'T make the noise. The animal will then approach the lever.
    - Stage 3:
      - You then make the noise every time it approaches the lever.
      - Then you don't make the noise when it approaches.
    - Stage 4:
      - The rat will then press the lever; you make the noise immediately after, and every time it touches the lever.
      - The reinforcer (clicking noise) is still a CS (still used as a CS).
      - You shape the response (shaping), and then the response is maintained by the reinforcer (operant conditioning).
      - Shaping is what you need to get the response.

Law of Effect (THORNDIKE)
• 'Law of effect': term first used to describe the process of operant conditioning.
  o Discovered that experience can modify non-reflexive behaviour.
    - Gradual modification of non-reflexive behaviour by experience
  o "Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will be more firmly connected with the situation, so that when it reoccurs, they will be more likely to reoccur."
    - The greater the satisfaction, the greater the strengthening of the bond.
    - This is talking about how the animal 'feels'.
  o The stimulus (box) is followed by the response (pulling the lever), and the bond is strengthened by the reinforcer/satisfaction (freedom).
    - Raw component: satisfaction
    - Satisfaction "stamps in" the connection between S & R.
  o "By a satisfying state of affairs is meant one which the animal does nothing to avoid, often doing such things as attain and preserve it."
    - The essence is the 'stamping in' of the S-R bond (caused by satisfaction).
• Puzzle Box example:
  o Used a puzzle box for a cat.
  o Put the cat inside; the cat tries to get out until it finds the lever that opens the box door. After it has done this many times, it is able to escape in seconds.
  o An example of non-reflexive behaviour that is modified by experience!
BOUTON
• Instrumental learning generally works so that organisms develop responses that maximize benefit (obtain stimuli with positive survival value) and minimize cost (prevent stimuli with negative survival value).
  o Instrumental behaviour increases or decreases depending on its effect on the environment.
    - Good S* (good for survival)
    - Bad S* (will kill you)
  o Reward learning: perform a behaviour that obtains a GOOD S*.
  o Omission learning: perform a behaviour that prevents you from obtaining a GOOD S*.
  o Punishment learning: perform a behaviour that obtains a BAD S*.
  o Avoidance learning: perform a behaviour that prevents you from obtaining a BAD S*.

"I can't get no Satisfaction"
• A response will increase if followed by a satisfying outcome, but the only way we know the outcome is satisfying is if the response increases.
  o If there isn't any satisfaction, would there be no learning?
• If rewards are stimuli that produce satisfaction by reducing drives, then behaviour should not increase in the absence of satisfaction or drive reduction.
• Goal Box Experiment (Sheffield, Wulf, Backer):
  o Rats mount females in the box, and just before ejaculation the rat is removed (stopping the satisfaction: ejaculation, which spreads seed).
    - Thus they should go slower, because satisfaction was removed?
    - BUT they go faster! Behaviour increases in the absence of satisfaction.
• Paradoxical reward effect (AMSEL)
  o No satisfaction, yet behaviour still increases (so satisfaction is not always necessary).
• Latent learning (TOLMAN)
  o The animal is always learning, even in the absence of satisfaction, and only uses that learning when satisfaction is presented.
• Human experiment
  o An individual receives a dose of morphine through a line inserted into their muscle (they don't know what drug will be injected).
  o Their response is pressing a button in front of them a number of times.
    - Number of responses per second (probing their behaviour)
  o Immediately after the injection, you ask them "How much do you like your injection, from 0-50?" (probing subjective responses, i.e. their feelings).
  o Results:
    - Dose is 0: response is low for both behaviour & liking.
    - Dose is 15: the behavioural response increases; the liking response doesn't.
    - Dose is 30: response is high for both behaviour & liking.
  o KEY: the affective reaction (feels good or bad) to an S* is NOT the ONLY element that is key to its effect on behaviour.
    - The drug is clearly reinforcing behaviour in the absence of subjective liking (the two can dissociate; they don't need to go together!).
• Meaning:
  o Reinforcement: the effect on behaviour (what we focus on); it increases behaviour.
  o Reward: the subjective feeling.
    - Example: a son associates riding the horse with the barber's.
    - If the horse were a reinforcer, every time he sees the horse he would want to get a haircut (a reinforcer increases the behaviour).
    - Instead, the horse is the REWARD (it doesn't strengthen behaviour).
  o Shaping: the experimenter shapes the response.
  o Auto-shaping: programmed into a computer & the rat is let go.

QUESTION: primary reinforcing stimuli:
• A) are only stimuli needed for survival
• B) possess UC motivation value
• C) are rewarding
• D) reward good behaviours
• E) none of the above

Reinforcement Theories

Contiguity Theory (GUTHRIE)
• Operant conditioning occurs when S, R and S* occur together in time.
  o Each individual will typically press the paddle in the same way. The response is very stereotypic, repetitive, and identical from time to time. The animal is just associating these things together and responding like a machine.
• Stop Action Principle:
  o Whatever position you are in when reinforced (S*), you will be likely to repeat.
    - Any specific bodily position and the muscle movements occurring when the S* is delivered will have a higher probability of occurring in the future.
    - The argument is that operant conditioning leads to these responses rather than to cognitive reactions. But it is also true that operant conditioning doesn't always lead to these automatic behaviours.
  o Superstitious behaviours
    - A form of automatic responding: pulling up your socks every time you take a shot, or wearing lucky pants every time after you've done well (no actual cognitive link).

Cognitive Theory (TOLMAN)
• During operant conditioning, animals make S-S* associations. Rs are highly flexible, and the primary role of an S* is to motivate behaviour.
  o "We agree with the other school that the rat in running a maze is exposed to stimuli and is finally led as a result of these stimuli to the responses which actually occur. However, we feel that the intervening brain processes are more complicated and autonomous than the stimulus-response psychologists do."
    - Both views are correct, in different situations.
• Monkey Experiment:
  o Tell the animal it is a memory test: a banana on one side, none on the other.
  o Cover it, then remove the screen and ask the animal to make a response.
    - If it responds to where the food was: matching.
  o When they switch the banana with lettuce, the animal gets mad (it expected to find a banana).
    - This indicates that animals can form cognitions about the consequences of their actions (expectation).

Reinforcement Theory (SKINNER)
• The law of effect, by Skinner, defines a reinforcing stimulus as "stamping-in".
  o A reinforcer is really allowing you to create a bond between S and R.
  o Reinforcers act on the storage of information (memory).
    - This is evident because if you remove the sugar from the dispenser, the animal still responds (it is the memory that drives the behaviour).
• Reinforcing stimulus
  o An event that enhances the storage of information about situations in which it is encountered: "stamping-in".
  o This enhanced storage increases the probability that the behaviour leading to the reinforcer will be repeated in the future, even in the absence of the reinforcer.

Why is a Reinforcer Reinforcing?
• A reinforcer is an event that follows a response and changes the probability that the response will be emitted in the future.
  o How can the event change behaviour when the new behaviour occurs in the absence of the event?
• 1) Enhancement of memory consolidation
  o Reinforcing events enhance the acquisition and the storage of information in the brain.
    - Reinforcers enhance/promote memory.
• 2) Attribution of conditioned motivation
  o Learning is the formation of representations of the relationships among objects and events. A representation of a reinforcer will motivate behaviour.
    - Reinforcers give a motivational flavour to behaviour & situations; they provide a motivational context for behaviour.
    - They attribute motivational (learned/conditioned) value.
  o After you are exposed to a stimulus and make a response, a memory trace is produced. A reinforcer enhances the memory and gives motivation to the stimuli & responses.

Effect of Reinforcers on Memory Consolidation
• As you form a new memory, it is fragile (not permanent). While your brain is processing it, the memory can be changed. The transition from a fragile to a permanent state is called consolidation.
  o This is an active process.

A) Inhibition of memory consolidation
• 1) Learning other information
  o If you learn a series of letters and 2 minutes later you learn another series, what you learned last interferes with what you learned first. This is because you are learning the 2nd set while you are still processing the 1st; you are interfering with the consolidation process.
• 2) ECT (electroconvulsive therapy)
  o ECT will erase memory that hasn't been consolidated. Patients experience memory loss for the hours before the treatment (temporary amnesia).
• 3) Trauma
  o Can produce a stoppage of activity and a loss of memories that are still in the consolidation process.

B) Facilitation of memory consolidation
• 1) Emotional Events
  o The memory is encoded and flashed by something in the brain that is activated by a very strong emotion (you remember it better).
• 2) Reinforcing Stimuli
  o Explained in depth below.

Passive Avoidance Task (HUSTON)
• Used to determine if a stimulus can enhance memory consolidation.
  o The animal completes a task to create a memory. Then, while the memory is being processed, you give a reinforcing stimulus.
  o If the animal does the task better the next day, you improved the memory (the food reinforcer enhanced the memory but isn't associated with the task).
• The animal is placed in a cage; if the animal gets a shock, it stands on the platform (to avoid being shocked).
  o Group 1: fed in the cage immediately after the training session
  o Group 2: fed in the cage hours after the training session
    - When fed hours later, they still get food, but outside of the consolidation period.
• Results:
  o Group 1 (fed immediately) remained on the platform longer than Group 2 (delayed reinforcer).
• Shows:
  o 1) The food reinforcer influenced the animal's behaviour by strengthening the representation of the contingent relationship between stepping down and shock.
    - The food reinforcer is strengthening the memory of the task (which has nothing to do with the food).
  o 2) The animals learned nothing about the rewarding, motivating properties of food.
    - The animal isn't making any link between the food and the task (otherwise you would expect it to step down faster to get food).
• "To observe the enhancing function of reinforcers we need to study situations where the reinforcer is non-contingent upon the response."
  o We need to study cases where the reinforcer is not contingent on the response.

Reinforcers & Consolidation
• A) Electrical stimulation of the brain (BLOCK)
  o Bypassed sensory forms of stimulation (food) and used direct stimulation of areas of the brain after learning of a task to reinforce it.
    - Reticular formation: involved in general arousal of the rest of the brain (cutting this part off from the rest of the brain causes coma).
    - Trained an animal on a task, stimulated the reticular formation afterwards, and found that this strengthened the memory.
  o Post-training electrical stimulation of the RETICULAR FORMATION enhances retention of both appetitive and aversive tasks.
  o Self-Stimulation Experiments (HUSTON)
    - Put electrodes in different brain regions to find sites that produce stimulation. By mistake, discovered a site where the animal will go back to the location where the stimulation was delivered.
    - Medial Forebrain Bundle (MFB): a group of axons that project out of the midbrain neurons.
    - When electrodes stimulate the MFB, animals will go back to places where the stimulation was delivered.
    - Intracranial self-stimulation: the animal will press the lever.
    - Stimulating the MFB reinforces from a behavioural point of view (produces behaviour to return) and also from a memory point of view (promotes memory consolidation).
    - Experiment:
      - Group 1: received 30 min of MFB stimulation after making a pathway choice.
      - Group 2: received no stimulation.
    - Results:
      - Group 1, which received MFB stimulation after making the choice, learned faster than the other group.
      - Stimulation of the MFB has strong reinforcing properties.
• B) Drugs of Abuse
  o Ex. amphetamine, cocaine, morphine/heroin, nicotine, caffeine, alcohol, benzodiazepines, sucrose
  o If you learn something and, during memory consolidation, you use a drug of abuse OR any substance or activity that can release dopamine (sugar, physical activity, etc.), you will find the memory has been enhanced.
    - The drug has to be given in the critical period of consolidation.
    - You need a control that gives the drug outside this period.
    - You must use appropriate dosages, just enough to turn on dopamine activity.
  o Experiment:
    - Train on a task, then give a drug post-training.
    - Later you test their memory, and you see that memory has been enhanced.
    - Incidental recall: you didn't even tell them you would be retesting them.

Dopamine:
• Dopamine cells come from the VTA, which has axons projecting to many regions.
• The MFB is the bundle of these axons, which projects dopamine to those areas.
• The substantia nigra also produces dopamine and projects to other areas.

Effect of Reinforcers on Conditioned Motivation
• A reinforcer does not change behaviour only through memory consolidation.
  o It also does so because the situation around the reinforcer becomes motivating to the individual (emotional component).
• A reinforcing stimulus produces a motivational state which is usually liked. This state will be associated with any other stimuli that are present (contextual stimuli), and these stimuli will become motivationally important (motivationally salient, incentive value).
  o Introducing a reinforcer into a learning situation confers its motivating power (i.e., motivational salience) on previously non-motivating stimuli.
  o The stimulus acquires secondary reinforcing properties and thus becomes a conditioned motivator.
• Smoking Experiment:
  o Testing cigarettes with & without nicotine.
    - People still keep smoking the non-nicotine cigarettes (not zero).
    - The strong reinforcer is NOT the nicotine; it is the smoke.
    - Smoke can keep people smoking even without nicotine.
    - The smoke had been paired enough with nicotine that it became a conditioned/motivational reinforcer itself (it acquires something that you like).
  o The nicotine enhances memory & gives motivational value to the stimulus (this is why nicotine is a reinforcer).

QUESTION: stimulation of the MFB (medial forebrain bundle):
• a) maintains self-stimulation behaviour
• b) feels great
• c) is rewarding
• d) can also be used to punish behaviour
• e) has no effect on memory consolidation

Conditioned Motivation:
• Facial Reactions (LIKING):
  o There is a commonality of facial reactions to liked & disliked tastes (not learned).
    - We can infer a measure of 'wanting' from the behaviour.
    - Liking: measured from facial reactions
    - Wanting: measured from approach behaviour
    - A lot of stimuli we consider rewarding are also reinforcing.
    - Study them by dissociating liking & wanting (normal liking, no wanting).
  o We know we can dissociate liking and wanting!
    - Addiction: addicts don't like drugs, but they take them because they want them (wanting is a measure of behaviour).
    - They show a lot of wanting, no liking.

Dopamine: Wanting but not Liking
• BERRIDGE & ROBINSON
  o Study to distinguish between wanting and liking
    - Measuring dopamine
    - There are dopaminergic neurons in the midbrain which send dopamine to the striatum. If you inject a drug into the striatum (nucleus accumbens), the dopaminergic neurons pick it up, transport it, and then die.
    - You can be sure you are making lesions specifically to dopamine cells, because you can see that only they die.
  o Experiment:
    - Made 6-hydroxydopamine lesions to the VTA (injected the drug into the nucleus accumbens or neostriatum).
    - Only certain neurons will pick it up and transport it to their cell bodies, and it then kills them (a lesion specific to dopamine).
    - Wait until the neurons die and then measure the amount of dopamine in the area.
    - 90% depletion and 74.1% depletion
  o Results:
    - Caused an animal to not drink, eat or have sex.
    - It will move, but approach/goal-oriented behaviour goes away almost completely.
    - Produced severe aphagia (they don't eat).
    - They don't do these things because they don't approach, not because they can't do them.
  o Dopamine is involved in wanting, but NOT liking.
    - Dopamine does NOT mean pleasure.
  o Extension of the experiment:
    - Measure the liking component:
      - Group 1: give the animal a sweet (sucrose) solution.
      - Group 2: give the animal a bitter (quinine) solution.
    - Results:
      - Normal liking reactions for both solutions, but no wanting (approach) behaviour.
      - Liking is not affected.
      - Aversive reactions were measured in both cases as a control (no aversive reactions to sucrose).
      - They react like normal animals in terms of liking behaviour.
    - Different regions of the brain affect your liking of something (emotional value) and your wanting (approach behaviour).
      - Wanting: striatum & nucleus accumbens.

WEEK 7

Effectiveness of Reinforcement
• Dependent on:
  o 1) Drive
    - Can affect whether something is a reinforcer.
  o 2) Incentive value of S*
    - Relative (ex. a cheeseburger is reinforcing to a person with a normal eating style but has no incentive value to a vegetarian).
  o 3) Delay of reinforcement
    - Delay between C & US will not affect learning.
  o 4) Stimulus Control
  o 5) Schedule of Reinforcement

Delay of Reinforcement:
• HULL
  o Experiment to look at the role of delayed reinforcement.
    - Study the effect of a short delay between a response & reinforcer versus a long delay.
  o Features:
    - Stimulus: choice point
    - Response: going right
    - Reinforcer (S*): food
  o Procedure:
    - Animals were put in T-mazes and reinforced (with food) for turning right.
    - Then they were confined in a delay box for a set time.
    - An animal with a longer delay takes longer to learn, and with a long enough delay should never learn (to take the right-hand turn)...
  o Results:
    - You could delay for 20 min and they would still learn normally.
    - You must consider Pavlovian conditioning coming in and filling the gap of time. The animal is using conditioned cues as reinforcers, which fill the time gap and allow it to still make associations (rG-sG mechanisms).
  o rG-sG mechanisms
    - SG/S*: stimulus in the goal box
    - RG: reactions in the goal box
    - Stimuli in the start box and delay box come to elicit rG.
    - rG: fractional anticipatory goal responses (salivation)
      - 1) energizes behaviour
      - 2) causes sG
      - 3) sG guides behaviour
    - When in the start or delay box, the animal can feel salivation and know it is in the right spot ("cheating").
      - It sees the delay box, starts to salivate, gets in, salivates, and knows it made the right choice.
    - sG can also serve as conditioned reinforcers because of their association with SG/S* (food).
      - The animal is reinforced to turn right not only by the food but by a variety of associated CSs, some of which are INTEROCEPTIVE (inside the animal).
      - The delay boxes are different to the rat (when the animal makes the proper turn, it salivates).
  o SUMMARY:
    - Animals are reinforced to turn right, so they go to the goal box (food).
    - They are reinforced by receiving the food.
    - The animal is confined in the delay box right after it makes the response, before getting to the goal box.
    - See how long it takes the animal to learn. With a longer delay, will it eventually never learn to turn right and get the food?
    - BUT after 20 min the animal would still go to the food!
    - You must start to consider Pavlovian conditioning seeping into operant conditioning. The animal is using Pavlovian cues as reinforcers that fill the gap of time and still allow the animal to learn.
    - Before you get to the stimulus, other stimuli are presented.
    - Before it makes the right turn, it experiences some responses (fractional goal responses); then, as it enters the goal box, the responses get stronger and stronger.
    - Consequence of having these responses: it is energized. The animal salivates at the delay box and knows that it is in the right spot and made the right choice.
    - Some of these responses are interoceptive (inside the animal).
    - The response guides the stimulus, and the stimulus guides behaviour (ex.
      stomach noise).
    - Essence: the reason why the animal can bridge the gap
      - During the delay, the animal experiences a variety of conditioned stimuli which reinforce the behaviour of turning right. It turns right and feels itself turning right. The behaviour is then reinforced by these anticipatory reinforcers in the delay box, and eventually even more by the reinforcer itself.
• SPENCE:
  o Experiment:
    - The task is to follow the dark compartment.
    - Even small delays prevent learning, because all CSs that lead the animal to food are removed (dark).
    - Eliminates all sources of possible interoceptive cues.
    - Different delays (0 sec, 0.5 sec, 1.2 sec, 10 sec):
      - 0.5 sec: performance slows down.
      - 1.2 sec: slows down even more.
      - 10 sec: they will NEVER learn it.
    - As humans, we use conditioned reinforcers to bridge the delays.
  o We removed every stimulus that tells them they made the right turn (black space), so there are no predictors to allow Pavlovian conditioning to occur. Therefore, we see that a delay before getting a reinforcer slows down learning, and a big enough delay means they will never learn it.
• When proprioceptive, as well as exteroceptive, conditioned reinforcers are eliminated, even a brief delay in the presentation of the reinforcer prevents learning.

Stimulus Control:
• Behaviour that is reinforced is usually under the control of many stimuli. The first is the one that produces the response; the others are usually contextual.
  o You want to control that stimulus. If you can't, you want to control the contextual stimuli (this is what we do).
• Behaviour that has been reinforced in the presence of one stimulus is controlled by the presence/absence of that stimulus.
• However, responding often generalizes to other stimuli on the basis of their similarity to the training stimulus.
  o Example: auto-shaping
    - Pigeons peck at a 580 nm wavelength light: give food.
    - Change the intensity: don't give food.
    - Go back to the 580 nm wavelength and give food.
    - They learn that the 580 nm wavelength means food.
    - It is auto-shaping because pecking is natural.
    - If you keep reinforcing pigeons for pecking at a particular light colour, they will peck at the highest intensity at that colour. Deviate from this colour and they will peck less.
  o Stimulus Generalization Gradient
    - A wide gradient covers a variety of stimuli; a narrow gradient covers just 2 particular stimuli.
    - You can train discrimination: go from the natural wide gradient of responding to responding to only one colour (narrow).
    - The shape of the gradient is affected by learning!
    - If you want one particular colour, encourage discrimination.
      - Reinforce for responding to particular colours; they become good at discriminating particular wavelengths.
      - You get a NARROW peak.
    - If you want generalization between all colours, encourage generalization.
      - Reinforce them for responding at every colour.
      - You get a WIDER distribution.
  o Generalization & Discrimination Training:
    - yo