Things to Remember - Learning

University of Guelph
PSYC 2330
Francesco Leri
Learning Final Exam - Summary

Chapter 1 - Introduction

Behaviour is triggered by stimuli that have motivational value, both through our central motivational state (motivational value is relative) and through past experience (we have memory for stimuli we have already encountered and have learned from the outcome).

Rene Descartes was a dualist who said that both the mind and the physical world could produce behaviour. He distinguished two kinds of behaviour. Involuntary behaviour is the only kind shared with animals, because on his view animals have no mind: the senses take in the external world and relay it to the brain, which initiates muscle movements. Voluntary behaviour was exclusively human because, again, animals had no free will: the sensory organs take in the external environment, the information goes to the brain and then to the pineal gland, which links the physical brain to the non-physical mind (chosen because it sits at the centre of the brain). The mind initiates behaviour, which is then routed through the pineal gland to the brain and muscles. This theory was important because it emphasized that stimuli must exist for behaviour to occur and that the brain is involved.

Stimuli can be neutral (no motivational value), appetitive (positive motivational value; we move towards them) or aversive (negative motivational value; we move away from them). Responses can be learned or innate (instincts), and can involve the somatic nervous system (movement of muscles, for example; observable) or the autonomic nervous system (heart rate, blood pressure; harder to control, not directly observable but measurable).

Nativism was also proposed by Descartes: the idea that everything we need to know we have at birth. Conversely, Locke suggested empiricism, which says that we are a blank slate at birth and our knowledge comes from experience. Both are right: some behaviour we are genetically predisposed to, other behaviour is learned. Hobbes was a dualist but also suggested that the mind behaves in a predictable way, seeking rewards and avoiding punishment. Watson is the father of behaviourism and stated that if something cannot be observed, then it is not worth mentioning. He implied that we are victims of our experiences: if we know the rules and we know the stimuli, we can predict the response. He conducted the famous Little Albert experiment. Darwin suggested that humans evolved from animals; therefore, if animals don't have minds, we do not have minds either. This forced us to either accept that animals have minds or that we do not.

Nervism was the theory that there are biological mechanisms behind our behaviour. Pavlov looked at how reflexes could be shaped by experience (classical conditioning). Classical conditioning is when we pair a neutral stimulus with a biologically significant stimulus that elicits a reflexive response. Over time, with training, the neutral stimulus becomes a predictor of the biologically significant stimulus and will itself elicit that response. Extinction trials are when we no longer present the biologically significant stimulus: the CS is shown but no US follows. If this is done enough, the CR extinguishes, as the CS is no longer a predictor of the US. Sometimes behaviour can relapse and come back. This can happen because of stress (chemical, psychological, physical), drug-associated cues (e.g. the place where you normally smoked, ashtrays) or the drug itself (the chocolate bowl scenario, the potato chip effect).
Operant conditioning is when a stimulus motivates the animal to operate something (an instrument) as a response, because that stimulus has been paired with a biologically significant stimulus through classical conditioning in the past. As a result, the animal either gets or does not get the biologically significant stimulus, and this outcome either strengthens or weakens the association between the stimulus and the response, making it more or less likely to occur in the future.

Chapter 2 - Motivation and Learning

Goal-directed behaviours are behaviours directed towards something the animal is trying to achieve. They can be instinctual (unlearned, resistant to change, stereotypical, occurring only when the right circumstances are present) or learned (easy to alter, adapted to the environment).

There are two main instinct theories that try to explain why behaviour occurs. The first, from William James, says that instincts motivate behaviour: they are the fuel that "activates" you to do something. The problem with this theory is that we cannot say or measure how many instincts we have, but what we take away from it is that behaviour requires energy. The second instinct theory comes from ethology and states that instincts are behaviours; what differs among them is their sensitivity to changes in the environment. One type of behaviour is appetitive: learned, flexible, and often the beginning of a behaviour sequence. The second type is consummatory: these behaviours are also known as fixed action patterns, as they are rigid, independent of learning, insensitive to the environment and highly stereotyped. Key (sign) stimuli are stimuli that attract behaviour and elicit a fixed action pattern. A classic social releaser is yawning (a sign stimulus released by one individual that elicits a fixed action pattern in another).

Homeostasis means that the body tries to maintain equilibrium; if equilibrium is disturbed, the body mounts a response in the opposite direction to try to regain it (an opponent response). This is where the saying "buy happiness (drugs), get sadness for free" comes from. In the hydraulic model of motivation, for behaviour to occur there either needs to be a lot of drive (the tank is very full and pushes the valve open) or the perfect stimulus has to come along (adding weight to the end of the valve and causing it to open). When homeostasis is disturbed you create a need, which in turn elicits drive (energy) to respond in a way that reduces the need and regains equilibrium. We test this by manipulating a need in animals and seeing how many crossings of a shock bridge they will make (a conflict situation). For exploration, only one crossing is made (once the animal knows what is on the other side, it doesn't matter anymore). For sex, the more deprived the animals are, the more crossings they make, but then it flatlines (demonstrating the difficulty of depriving a rat of sex). For food and water it is a peak function (after a certain amount of deprivation the animals no longer have the energy to cross the bridge, or die).

Hull came up with a drive reduction theory and the equation SEr = SHr x D, where SHr is the strength of the habit, D is the drive and SEr is the strength of the response. Hull said that habits are learned and become stronger as a function of how often they are followed by satisfying events (reducing a need is satisfying).
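The multiplicative form of Hull's equation is the key point: if any factor is zero, no behaviour occurs, as the next paragraph describes. A minimal sketch (an illustration with made-up numbers, not course material; the incentive value K, introduced below, defaults to 1):

```python
def response_strength(habit: float, drive: float, incentive: float = 1.0) -> float:
    """Hull's rule: SEr = SHr x D (x K, once incentive value was added).

    The factors multiply, so response strength is zero whenever the
    habit, the drive, or the incentive value is zero.
    """
    return habit * drive * incentive

print(response_strength(habit=0.9, drive=0.0))  # 0.0  - knows what to do, but no drive
print(response_strength(habit=0.0, drive=0.9))  # 0.0  - driven, but no learned habit
print(response_strength(habit=0.9, drive=0.9))  # ~0.81 - both present: behaviour occurs
```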
What Hull found was that when there is a high need for something, and therefore high drive, but the animal does not know what to do (SHr is zero), no behaviour will occur. Conversely, if the animal knows what to do but there is no drive (D is zero), behaviour will also not occur. Drive is necessary to learn something: when drive is increased again in the future, you activate the things you have learned and tend to do the same thing. Hull saw humans as very mechanistic and held that there was only one drive, a single energy pool for all needs. Later it was proposed that a stimulus drive is also activated in parallel with the general drive. This stimulus drive is drive-dependent and is only activated if we have had past experience reducing that drive. Hull then introduced the notion of stimulus incentive value when he acknowledged that not all stimuli have the same motivational value for everyone. The equation then becomes SEr = K x D x SHr, where K is learned and is relative to other stimuli and to the motivational state.

There is an optimal level of motivation for performance, and this peak function differs for different tasks (the Yerkes-Dodson law: there is an inverse relationship between task difficulty and optimal motivation). Too much motivation or drive can actually impair performance. You also do not need drive reduction in order to learn. Latent learning is learning that occurs but is only shown when there is a reason to show it. Rats put in a maze without cheese just wander around. When those same rats are put in the maze with cheese in it, they solve the maze faster than rats that have never been in the maze before; learning had occurred, but the rats had no motivation to show it.

Habituation is an incremental decrease in responding due to repeated stimulation (in the drug world this is called tolerance). Sensitization is an incremental increase in responding due to repeated stimulation. Both were studied in the Aplysia sea slug through the gill withdrawal reflex (monosynaptic): touch the gill and it retracts, an adaptation for survival. You put one electrode in the sensory neuron and one in the motor neuron and measure the responses. After poking the gill multiple times there is the same response in the sensory neuron but less of a response in the motor neuron, showing habituation at the level of the neuron. This happens because calcium is responsible for docking the neurotransmitter vesicles in the presynaptic neuron; if less calcium enters the presynaptic terminal, less neurotransmitter is released. Conversely, if we shock the tail of the slug, the gill withdrawal reflex becomes faster. The shock sensitized the animal to everything (sensitization is not stimulus specific): an interneuron impinges on the presynaptic sensory neuron, causing it to release more neurotransmitter and engage the motor neuron to a greater extent. One shock can last minutes; five shocks cause the growth of new synapses, and the effect can last days.

Chapter 3 and 4

We can test classical conditioning by presenting the CS alone to see if it elicits a response. The CS can be a conditioned excitor, meaning it predicts the presence of the US, or a conditioned inhibitor, meaning it predicts the absence of the US. There are two main ways to make a CS a conditioned inhibitor. The first is differential inhibition: one CS is paired with food and the other is paired with nothing, so the second CS becomes a conditioned inhibitor.
The second way is through conditioned inhibition, where one CS is paired with a US and then both CSs together are paired with nothing. In this case, the animal learns that when the second CS is present there will be no US. To test a conditioned inhibitor we can do two things. First, the summation test: we pair the conditioned inhibitor with another, excitatory CS and see if there is a response. If there is no response, it is a conditioned inhibitor. Second, the retardation-of-acquisition test: we try to turn the conditioned inhibitor into a conditioned excitor. If the CS really is a conditioned inhibitor, it should take longer to turn into a conditioned excitor than a neutral stimulus would.

There are some arguments suggesting that what is learned in classical conditioning is not S-S* learning. One argument, pseudo-conditioning, says that we only respond because we have been exposed to the US so many times that we salivate (respond) to everything. To rule this out, we need a control group that is only given the US; we test whether they salivate to the CS (the bell), and it turns out they do not. The second argument is sensitization to the CS: the claim that there is only a response because presenting the CS with the food so many times has increased sensitivity to the CS. To rule this out, we have a control group that is only given the CS, and they do not salivate (respond). Lastly, people argue that it is not CS-US learning but CS-R learning. To rule this out, we look at a variety of evidence. First is post-conditioning US devaluation: if we devalue the US, by habituation, by pairing it with illness, etc., the CS no longer elicits a response. Next we look at CS-CS associations, such as second-order conditioning, where one CS is paired with a US, then that CS is paired with another CS, and the second CS comes to elicit the CR as well. We can also do sensory preconditioning, where two CSs are paired together and then one of them is paired with a US; the other CS then also elicits the CR. This shows that we can make stimulus-stimulus associations.

There are a variety of classic experiments in classical conditioning. One is eye-blink conditioning, where the US is a puff of air to the eye, the UR is a blink, the CS is a tone or light, and the CR is also a blink. The second is fear conditioning, where the US is a shock, the UR is jumping, the CS is a tone or a light, and the CR is freezing; we are conditioning fear, an emotional state. Conditioned place preference is when you inject a rat with a drug in one context and put it in another context where it is not injected with the drug. After doing this multiple times, you test to see where the animal spends more time: it will spend more time in the context where it was injected, showing that a place preference was learned (conditioned).

The conditioned response can be at multiple levels - autonomic, somatic, motivational - and you can measure the strength of the CS by how much it changes ongoing behaviour. The suppression ratio is a measure of how fearful you are of the CS: responding during the CS divided by (responding during the CS plus responding before the CS). A suppression ratio of 0 means very fearful (responding stops completely) and a suppression ratio of 0.5 means not fearful (be careful when reading graphs!). For example, a rat that pressed a lever 20 times in the minute before the CS and 0 times during the CS has a ratio of 0/(0+20) = 0, complete suppression, while 20 presses in both periods gives 20/(20+20) = 0.5, no suppression.

We can increase the strength of conditioning in many ways. One way is to increase the number of CS-US pairings.
Another is to use salient and novel stimuli (noticeable, and without prior associations that could make conditioning more difficult). And lastly, contiguity: the timing between CS and US. Conditioning is best with delay conditioning, where the CS is still on when the US is presented, but it also works with trace conditioning, where the CS is presented and a short period of time passes before the US is presented.

Rescorla stated that the CS must not only be present when the US is, but must also be contingent on the US. Phi is the probability of the US occurring given that the CS is there, minus the probability of the US occurring given that the CS is not there: phi = P(US | CS) - P(US | no CS). The closer phi is to zero, the worse the CS is as a predictor. A phi of 1 means the CS is a perfect predictor of the presence of the US; a phi of -1 means the CS is a perfect predictor of the absence of the US.

The Rescorla-Wagner model says that we learn in two situations: when the US is surprising, we look around for predictors of the US in the future; and when we expect the US but do not get it, we are again surprised and look for predictors of the absence of the US (extinction). What was found was that prior training with one CS blocked learning about a second CS added later. This means conditioning is more than just a stimulus-stimulus pairing: for it to occur, the CS needs to be informative and the US surprising. On the compound trials, the first CS was already predictive of the US, so there was no need to look at or learn from the second CS. When the second CS is then presented alone, nothing has been learned about it and it does not elicit a CR.

Rescorla and Wagner described a learning curve: we learn the most (the largest ΔV) when we are most surprised (the first trial); after that we acquire predictors of the US and become less and less surprised, so successive ΔVs are smaller in magnitude. V is the associative strength of the bond between the CS and the US. The ΔV of any given trial is the maximum that can be learned, Vmax (there is a biological ceiling, which is why the curve is an asymptote), minus what you already know, ΣV. ΔV gets smaller with each trial because the US becomes less surprising as more predictors (CSs) of it are learned. Vmax is determined by the magnitude of the US and is how much you can learn. The slope of the learning curve (how fast you learn) depends on the salience of the US (β) and of the CS (α). Overall, the Rescorla-Wagner formula is:

ΔVn = αβ(Vmax - ΣVn)

where ΔVn is the change in associative strength on trial n and ΣVn is the total associative strength of all stimuli present on that trial. For extinction, Vmax is zero, so ΔV is negative: strength is subtracted from the association.

The Rescorla-Wagner model explains blocking. With two CSs, A and B, the expectation at the beginning of a trial is the sum of the strengths of the stimuli present, so the amount of conditioning on a compound trial is

ΔVa = ΔVb = αβ(Vmax - ΣVab)

In the blocking group the animal is trained extensively with one CS, so Va = 1 (taking Vmax = 1), while nothing happens with the other, so Vb = 0. This means ΣVab = 1 + 0 = 1. Then, on conditioning trials with the compound stimulus, ΔVb (what is learned about B) = αβ(Vmax - ΣVab) = αβ(1 - 1) = 0!
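These equations can be checked with a short simulation (a minimal sketch; the values αβ = 0.3 and Vmax = 1 are arbitrary choices for illustration):

```python
def rw_update(v_total: float, v_max: float = 1.0, alpha_beta: float = 0.3) -> float:
    """One Rescorla-Wagner trial: dV = alpha * beta * (Vmax - sum of V present)."""
    return alpha_beta * (v_max - v_total)

# Acquisition: a single CS repeatedly paired with the US. dV shrinks each
# trial as the US becomes less surprising, tracing the asymptotic curve.
v = 0.0
for trial in range(1, 7):
    dv = rw_update(v)
    v += dv
    print(f"trial {trial}: dV = {dv:.3f}, V = {v:.3f}")

# Blocking: CS A is pre-trained to asymptote (Va = Vmax = 1), then A and B
# appear in compound. The summed prediction already equals Vmax, so
# nothing is learned about B.
va, vb = 1.0, 0.0
print("dV for B on a compound trial:", rw_update(va + vb))  # 0.0

# Extinction: the same rule with Vmax = 0 gives a negative dV, stripping
# strength from the association trial by trial.
print("first extinction trial from V = 1:", rw_update(1.0, v_max=0.0))  # -0.3
```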
You will not get blocking if Vmax is not 1, or if you do not train the first CS enough to get Va = 1.

The overexpectation effect is another application of the Rescorla-Wagner model. This is when additional compound conditioning with the light and tone together subtracts associative strength from each one separately. Going in, since the tone and the light are shown together, we expect the shock to summate and become bigger; when it does not, we become less afraid of the tone and of the light individually.

When the US is not contingent on the CS, it does not mean no learning is happening. One thing is constant throughout conditioning, and that is the context, so the context is used as a CS. In these experiments you put a rat in a box, there are two stimuli (a CS and the context), and a shock occurs. The rat learns that both the box and the tone are equal predictors of the shock. On the next trial only the shock occurs, so the rat learns that the box is a predictor of the shock: the box increases in associative strength while the tone's associative strength stays the same. These trials alternate over time until we hit overexpectation - the sum of the strengths of the two stimuli is larger than Vmax, and each CS loses associative strength. On the next trial only the box is present as a CS, so it increases in associative strength. On the trial after, the tone and the box occur together, which produces overexpectation again, and both lose associative strength. Over time the box's associative strength varies slightly, going up and down across trials, but the strength of the tone is more of a peak function. Eventually fear of the tone goes away while fear of the box stays.

There are two main problems with the Rescorla-Wagner model. First is the issue of surprisingness: the model puts exclusive emphasis on the surprisingness of the US, but the importance of the CS also changes over conditioning (alpha is not constant). The second problem is that the model implies extinction is unlearning, and we can show this is not the case. Spontaneous recovery is when an extinguished behaviour spontaneously reappears. Renewal is when extinction is context dependent: the behaviour is extinguished in one context but reappears in another. Reinstatement is when presentation of the US, even a little bit, causes the behaviour to return (the potato chip effect: once an addict, always an addict). Rapid reacquisition is when it takes less time to learn the behaviour again. All of these suggest that extinction is a parallel form of learning, not unlearning.

Prediction error has also been used to explain the Rescorla-Wagner model. When we are surprised we make a prediction error, but as we learn about the predictors (CSs) of the US we make smaller prediction errors and, as a result, are less surprised. The brain has a mechanism to detect prediction error: dopamine. The ventral tegmental area (VTA) is the area of the brain that makes dopamine, and it projects to a variety of terminal areas. Agonists are drugs that enhance the action of dopamine, such as cocaine, amphetamines and methylphenidate; antagonists are drugs that inhibit the action of dopamine, such as chlorpromazine and other antipsychotic drugs. To determine dopamine's role in prediction error, we place an electrode into the VTA and have animals perform a classical conditioning task.
What we find is that prior to conditioning, the VTA fires right after the animal receives the US, and according to the Rescorla-Wagner model this is where the most learning is supposed to occur. After conditioning, the US is predicted by the CS, and the VTA becomes active when the CS is presented but not when the US is presented. Therefore these neurons fire when there is a prediction error, when you are surprised. During extinction, the neurons fire when the CS is presented, but when the expected US fails to arrive the neurons go silent (encoding a different type of prediction error).

Auto-shaping is when we shape the behaviour of an animal automatically: the animal continues a response even when US delivery is independent of its response. When we test animals they show one of two responses: they are either sign trackers or goal trackers, and this difference can be selected for genetically. If we put a probe in one of the terminal areas of the VTA (the nucleus accumbens) and measure dopamine levels, we find that in sign trackers dopamine responds to both the CS and the US at first and then, with conditioning, only to the CS - as expected. In goal trackers, dopamine responds to neither the CS nor the US at first and then starts responding to both. This suggests that goal trackers never actually learn the association between the CS and the US.

Two inborn predispositions to learn were long assumed: (a) the separation in time between CS and US should be short, and (b) any CS can be used. Both have been proven wrong in taste aversion experiments. Rats drink either sweet water or water paired with a bright light and noise. We then expose the rats to x-rays, which make them nauseous, and see how their drinking behaviour changes: they drink less of the sweet water and the same amount of the bright-noisy water. In another group we expose them to foot shocks instead: they drink less of the bright-noisy water and the same amount of the sweet water. This means the CS you use does matter; learning depends on the relevance of the CS to the US. Also, the rats do not get sick right away, showing that animals can make the link even across a delay.

Conditioned taste avoidance is measured by a consumption test. Conditioned disgust is measured by a taste reactivity test (tongue protrusions or gapes). In drug conditioning in rats and shrews, emetic drugs produce no conditioned place preference, but they do produce conditioned disgust and conditioned avoidance. For drugs of abuse, rats show conditioned place preference and conditioned avoidance but no conditioned disgust, while shrews show conditioned place preference and no conditioned taste avoidance. Why? Because rats cannot vomit and therefore tend to avoid novel stimuli. The subjects you choose are therefore very important for your study.

Preparedness states that we are more readily able to associate some things than others, and it is measured by the number of trials needed to make the association. The conditioned eye blink is a prepared association; a foot shock with a light is not. We also have the problem of instinctive drift, where the animal's instincts intrude on learning of the task. This was seen when researchers tried to get pigs to carry coins to a piggy bank for food: the pigs associated the coins with food, started treating the coins as if they were food, and never really learned the task.
This is an example of a contra-prepared association.

Responses mediated by the autonomic nervous system can be either identical to the UR or opposite to the UR. According to the sometimes-opponent-process (SOP) theory, the CR will mimic the UR if the UR is monophasic (e.g. the eye blink; no b process) and will be opposite to the UR if the UR is biphasic (e.g. heart rate; there is a b process).

Conditioning in humans is difficult because we have a large cortex and are able to analyze what is happening to us. We have two systems: the automatic associative system (which we share with the rat) and the cognitive associative system, which allows us to think about what is happening. We can make participants behave like a rat in conditioning by having them do another cognitive task at the same time, thereby occupying the cognitive associative system.

In drug addiction there are two types of conditioning occurring at the same time. There is direct conditioning, which is conditioning of the effect the drug has on you, and there is homeostatic conditioning, which is conditioning of the body's homeostatic response. In direct conditioning the drug is present and you can pair a CS with either the a process (drug-like responses) or the b process (drug-opposite responses). For drug-like responses you present a cue while the person is experiencing the primary effect of the drug; an example is conditioned place preference. "Needle freaks" got pleasure just from the injection itself (shown by replacing the drug with water: the needle freaks still reported a positive mood). Conditioned withdrawal is when you present cues while the person is experiencing the b process; an example is conditioned place avoidance. If a methadone-maintained heroin addict is given naloxone (an opioid antagonist, which precipitates withdrawal) paired with a smell, the addict will later stay away from that smell.

Homeostatic conditioning can occur because the drug produces the a process and your body produces the b process (the opponent process). With repeated exposures to the drug, your body starts to look for predictors of the drug and mounts opponent responses in preparation for the incoming drug. These are called conditioned compensatory responses. A higher dose of the drug is then required to get the same desired effect from the a process; this is called conditioned tolerance. Overdoses occur when you take that high dose (high because the opponent process has been cancelling out the drug's effect) in an unfamiliar environment, one that lacks the cues for the homeostatic conditioning. The body does not mount the opponent response because the cues are absent, and the high dose leads to complications and possibly death. Tolerance is conditioned to particular stimuli and contexts.

Operant Conditioning - Introduction

It is called operant conditioning because the animal operates an instrument to get the biologically significant stimulus, which then increases the strength of the response and makes it more likely to occur in the future. Primary reinforcers are stimuli that are biologically significant (food, water, drugs) or that mimic the effects of biologically significant stimuli. Secondary or conditioned reinforcers are reinforcers that have been learned: motivational value has been attributed to them because they have been paired with a primary reinforcer.
A social reinforcer is one that is administered verbally, such as "good." There are three components to operant conditioning: the instrument, the response and the reinforcer. We teach a rat to make the operant response by shaping: reinforcing behaviour that is progressively closer to the behaviour we want and making the reinforcer progressively harder to get. It is impossible to take classical conditioning out of operant conditioning, because it is the conditioned reinforcer (created through classical conditioning) that truly maintains the response.

Thorndike's law of effect came from putting cats into a puzzle box. While in the puzzle box the cat makes a variety of responses, but only one is successful. This event is satisfying, which in turn strengthens the association between the stimulus and the response, so the next time the cat is in the puzzle box it takes less time to get out. A response will increase if it is followed by a satisfying outcome, and something is satisfying if responding to it increases. This implies that without satisfaction the response should weaken. Yet when copulating rats are taken away and placed at the end of a straight maze right before ejaculation (i.e. before satisfaction), they run faster. Therefore you do not need satisfaction for learning. This is a paradoxical reward effect: you have taken away something good, and the behaviour persists or even increases.

Bouton's version of the law of effect says that we develop responses that maximize benefits and minimize costs. When we get something we like, it is reward learning. When we get something we don't like, it is punishment learning. When we don't get something we like, it is omission learning. And when we don't get something we don't like, it is avoidance learning.

Reward and reinforcement are not the same thing. When people self-administer different doses of a drug, at a low dose they report low liking and show low responding, but at higher doses they still report low liking yet show high responding. This shows that what we like and what we do are two different things. A reward is something that makes you feel good, with a level of subjectivity to it; a reinforcer is something that increases behaviour in most if not all individuals. By rewarding behaviour you can actually make that behaviour go away by undermining it.

Contiguity theory was originally suggested by Guthrie, who said that the S, R and S* all occur at the same point in time, and whatever action is taking place when the S* is presented is the one that will be reinforced and increase in frequency in the future. This is called the stop-action principle and is the principle behind superstitious behaviour: you bounced the ball 10 times and then made the perfect serve; you attribute the perfect serve to bouncing the ball 10 times, and as a result you are more likely to bounce the ball 10 times before a serve in the future.

The cognitive theory was introduced by Tolman, who suggested that responses are flexible, that S* must motivate behaviour, and that during operant conditioning S-S* associations are made. This means we can anticipate consequences. This can be seen in delayed-matching-to-sample studies in chimps: you show the chimp a banana, cover it up, and ask which box it wants; if you switch the banana with lettuce, the chimp gets very angry because it expected the banana.
Skinner's law of effect states that events that enhance the storage of information are more likely to occur in the future. When a stimulus is given and behaviour increases, this is positive reinforcement. When a stimulus is given and behaviour decreases, this is positive punishment (or simply punishment). When a stimulus is withheld and behaviour decreases, this is extinction or omission. When a stimulus is withheld and behaviour increases, this is avoidance or escape learning. A stimulus is reinforcing if it enhances memory consolidation and gives emotional tone to the memory. This can happen because memories are fragile at first and become more stable over time, and most memories have an emotional flavour to them. You can interfere with memory consolidation in three main ways: trauma, new information, or electroconvulsive therapy. You can also make a memory trace stronger through reinforcing stimuli.
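As a study aid, the four contingencies above can be laid out as a small lookup table (a sketch, not part of the original notes; the labels follow the paragraph above):

```python
# (stimulus given or withheld, effect on behaviour) -> term used in the notes
contingencies = {
    ("given", "increases"): "positive reinforcement",
    ("given", "decreases"): "positive punishment",
    ("withheld", "decreases"): "extinction / omission",
    ("withheld", "increases"): "avoidance / escape learning",
}

for (stimulus, behaviour), term in contingencies.items():
    print(f"stimulus {stimulus}, behaviour {behaviour}: {term}")
```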