Learning Chapter 7

PSYC 2210
Anthony Nield

Chapter 7 - Instrumental Conditioning: Motivational Mechanisms → Informal reflection suggests that individuals perform instrumental responses because they are motivated to obtain the goal or reinforcer that results from the behaviour. But what does it mean to be motivated to obtain the reinforcer? The motivation of instrumental behaviour has been considered from two radically different perspectives.

The first originated with Thorndike and involves analysis of the associative structure of instrumental conditioning. As this label implies, this approach relies heavily on the concept of associations and hence is compatible with the theoretical tradition of Pavlovian conditioning. The associative approach takes a molecular perspective. It focuses on individual responses and their specific stimulus antecedents and outcomes. To achieve this level of detail, the associative approach examines instrumental learning in isolated behavioural preparations. Because associations can be substantiated in the nervous system, the associative approach also provides a convenient framework for studying the neural mechanisms of instrumental conditioning.

The second strategy for analyzing motivational processes in instrumental learning is behavioural regulation. This approach was developed within the Skinnerian tradition and involves considering instrumental conditioning within the broader context of the numerous activities that organisms are constantly doing. In particular, the behavioural regulation approach is concerned with how an instrumental conditioning procedure limits an organism’s free flow of activities and the behavioural consequences of such constraints. Unlike the associative approach, behavioural regulation considers the motivation of instrumental behaviour from a more molar perspective. It considers long-term goals and how organisms manage to achieve those goals within the context of all of their behavioural options.
The Associative Structure of Instrumental Conditioning → Edward Thorndike was the first to recognize that instrumental conditioning involves more than just a response and a reinforcer. The instrumental response also occurs in the context of specific environmental stimuli. Hence, there are three events to consider in an analysis of instrumental learning: the stimulus context (S), the instrumental response (R), and the response outcome (O), or reinforcer. Skinner also subscribed to the idea that there are three events to consider in an analysis of instrumental or operant conditioning. He described instrumental conditioning in terms of a three-term contingency involving S, R, and O.

The S-R Association and the Law of Effect → The basic structure of an instrumental conditioning procedure permits the development of several different types of associations. The first of these was postulated by Thorndike and is an association between the contextual stimulus (S) and the instrumental response (R): the S-R association. Thorndike considered the S-R association to be the key to instrumental learning and central to his Law of Effect. According to the Law of Effect, instrumental conditioning involves the establishment of an S-R association between the instrumental response (R) and the contextual stimuli (S) that are present when the response is reinforced. The role of the reinforcer is to “stamp in” the S-R association. Thorndike thought that once established, this S-R association was solely responsible for the occurrence of the instrumental behaviour. Thus, the basic motivation for instrumental behaviour was the activation of the S-R association by exposing the subject to the contextual stimuli (S) in the presence of which the response was previously reinforced. An important implication of the Law of Effect is that instrumental conditioning does not involve learning about the reinforcer (O) or the relation between the response and the reinforcing outcome (the R-O association). The Law of Effect assumes that the only role of the reinforcer is to strengthen the S-R association.
The reinforcer itself is not a party or participant in this association. Wood and Neal proposed a model of human habits. Central to the model is the idea that habits “arise when people repeatedly use a particular behavioural means in particular contexts to pursue their goals. However, once acquired, habits are performed without mediation of a goal”. Rather, the habitual response is an automatic reaction to the stimulus context in which the goal was previously obtained, similar to Thorndike’s S-R association. Thorndike’s S-R association is also being seriously entertained as one of the mechanisms that may explain the habitual nature of drug addiction. In this model, procuring and taking a drug of abuse is viewed as instrumental behaviour that is initially reinforced by the positive aspects of the drug experience. However, with repetitive use, taking the drug becomes habitual in the sense that it becomes an automatic reaction to contextual cues that elicit drug-seeking behaviour, without regard to its consequences. According to the S-R account, even negative consequences are not relevant.

Expectancy of Reward and the S-O Association → You come to expect that something important will happen when you encounter a stimulus that signals the significant event or allows you to predict that the event will occur. Pavlovian conditioning is the basic process of signal learning. Specification of an instrumental response ensures that the participant will always experience certain distinctive stimuli (S) in connection with making the response. Whatever the stimuli may be, reinforcement of the instrumental response will inevitably result in pairing these stimuli (S) with the reinforcer or response outcome (O). Such pairings provide the potential for classical conditioning and the establishment of an association between S and O.
One of the earliest and most influential accounts of the role of classical conditioning in instrumental behaviour was offered by Clark Hull and later elaborated by Kenneth Spence. Their proposal was that the instrumental response increases during the course of instrumental conditioning for two reasons. First, the presence of S comes to evoke the instrumental response directly through Thorndike’s S-R association. Second, the instrumental response also comes to be made in response to an S-O association that creates the expectation of reward. A particularly influential formulation was the two-process theory of Rescorla and Solomon.

Two-Process Theory → The two-process theory assumes that there are two distinct types of learning: Pavlovian and instrumental conditioning. The theory further assumes that these two learning processes are related in a special way. In particular, during the course of instrumental conditioning, the stimuli (S) in the presence of which the instrumental response is reinforced become associated with the response outcome (O) through Pavlovian conditioning, and this results in an S-O association. Rescorla and Solomon assumed that the S-O association activates an emotional state which motivates the instrumental behaviour. The emotional state is assumed to be either positive or negative, depending on whether the reinforcer is an appetitive or an aversive stimulus (e.g., food or shock). How can we test whether an S-O association (and the expectancies or emotions that such an association activates) can motivate instrumental behaviour? The basic experimental design for evaluating this idea is what has come to be called the Pavlovian-Instrumental Transfer Test in the behavioural neuroscience literature. In one phase, subjects receive standard instrumental conditioning (e.g., lever pressing is reinforced with food).
In the next phase, they receive a pure Pavlovian conditioning procedure (the response lever is removed from the chamber and a tone is paired with the food). The critical transfer phase occurs in Phase 3, where the subjects are again permitted to perform the instrumental lever-press response, but now the Pavlovian CS is presented periodically. If a Pavlovian S-O association motivates instrumental behaviour, then the rate of lever pressing should increase when the tone CS is presented. The experiment is called the Pavlovian-Instrumental Transfer Test because it determines how an independently established Pavlovian CS transfers to influence or motivate instrumental responding. Phase 1 can precede or follow Phase 2; the order does not matter. As predicted, the presentation of a Pavlovian CS for food increases the rate of instrumental responding for food. This presumably occurs because the positive emotion elicited by the CS+ for food summates with the appetitive motivation that is involved in lever pressing for food. The opposite outcome (a suppression of responding) is predicted if the Pavlovian CS elicits a negative emotion. According to the two-process theory, conditioned suppression occurs because the CS+ for shock elicits an emotional state (fear) that is contrary to the positive emotion or expectancy (hope) that is established in instrumental conditioning with food.

Response Interactions in Pavlovian-Instrumental Transfer → Classically conditioned stimuli elicit not only emotional states, but also overt responses. Consequently, a classically conditioned stimulus may influence instrumental behaviour through the overt responses it elicits. Consider a hypothetical situation where the classically conditioned stimulus elicits sign tracking that moves the animal to the left side of the experimental chamber but the instrumental response is pressing a lever on the right side.
In this case, presentation of the CS will decrease the instrumental response simply because the sign tracking behaviour (going to the left) will interfere with being on the right to press the lever.

Conditioned Emotional States or Reward-Specific Expectancies? → The two-process theory assumes that classical conditioning mediates instrumental behaviour through the conditioning of positive or negative emotions, depending on the emotional valence of the reinforcer. However, animals also acquire specific reward expectancies instead of just categorical positive or negative emotions during instrumental and classical conditioning. In one study, for example, solid food pellets and a sugar solution were used as USs in a Pavlovian-instrumental transfer test with rats. During the transfer phase, the CS+ for food pellets facilitated instrumental responding reinforced with pellets much more than instrumental behaviour reinforced with the sugar solution. Correspondingly, a CS+ for sugar increased instrumental behaviour reinforced with sugar more than instrumental behaviour reinforced with food pellets. Thus, expectancies for specific rewards rather than a general positive emotional state determined the results in the transfer test.

R-O and S(R-O) Relations in Instrumental Conditioning → So far we have considered two different associations that can motivate instrumental behaviour: Thorndike's S-R association, and the S-O association, which activates a reward-specific expectancy or emotional state. However, for a couple of reasons, it would be odd to explain all of the motivation of instrumental behaviour in terms of these two associations alone.
First, notice that neither the S-R nor the S-O association involves a direct link between the response (R) and the reinforcer or outcome (O). Another peculiarity of the associative structure of instrumental conditioning assumed by two-process theories is that S is assumed to become associated directly with O on the assumption that the pairing of S with O is sufficient for the occurrence of classical conditioning. However, CS-US pairings are not sufficient for the development of Pavlovian associations. The CS must also provide information about the US, or in some way be related to the US. In an instrumental conditioning situation, the reinforcer (O) cannot be predicted from S alone. Thus, instrumental conditioning involves a conditional relation in which S is followed by O only if R occurs.

Evidence of R-O Associations → A number of investigators have suggested that instrumental conditioning leads to the learning of response-outcome associations. A common technique involves devaluing the reinforcer after conditioning to see if this decreases the instrumental response (analogous to US devaluation in Pavlovian conditioning). If US devaluation after conditioning disrupts the CR, one may conclude that the CR was mediated by the CS-US association. In a corresponding fashion, reinforcer devaluation has been used to determine if an instrumental response is mediated by an association between the response and its reinforcer outcome. In a definitive demonstration, Colwill and Rescorla first reinforced rats for pushing a vertical rod either to the right or to the left. Responses in one direction were reinforced with food pellets and responses in the opposite direction were always reinforced with a bit of sugar solution (sucrose). After both responses had become established, the rod was removed and the reinforcer devaluation procedure was conducted.
One of the reinforcers was periodically presented in the chamber, followed by an injection of lithium chloride to condition an aversion to that reinforcer. After an aversion to the selected reinforcer had been conditioned, the vertical rod was returned, and the rats received a test during which they were free to push the rod either to the left or to the right, but neither food nor sucrose was provided. The important finding was that the rats were less likely to make the response whose reinforcer had been made aversive by pairings with lithium chloride. For example, if sucrose was used to reinforce responses to the left and an aversion was then conditioned to sucrose, the rats were less likely to push the rod to the left than to the right.

Studies of reinforcer devaluation are conducted in a manner similar to the procedures used by Colwill and Rescorla. An initial phase of instrumental conditioning is followed by a phase in which the reinforcer is devalued by pairing it with illness or by making the subject full so that it no longer feels like eating. The rate of instrumental behaviour is then measured in the absence of the reinforcer. However, there is another important step in the process. The subject has to experience the new value of the reinforcer. That is, the subject has to taste how bad the food became after it was paired with illness or how unpalatable the food is once the subject is no longer hungry. This is called incentive learning. Only if the subject has had a chance to learn what the new incentive value of the reinforcer is will its instrumental behaviour be reduced.

The results of the Colwill and Rescorla experiment constitute particularly good evidence of R-O associations because alternative accounts are not tenable. For example, the selective response suppression cannot be explained in terms of an S-O association. Pushing the vertical rod left or right occurred in the same chamber, with the same manipulandum, and therefore in the presence of the same externa