York University
PSYC 2210
Tony Neild

Chapter 7

reinforcement schedule: a rule that states under what conditions a reinforcer will be delivered

continuous reinforcement (CRF): every occurrence of the operant response is followed by a reinforcer (every response is reinforced)

cumulative recorder: records responses in a way that allows an observer to see the moment-to-moment patterns of a subject's behaviour

THE FOUR SIMPLE REINFORCEMENT SCHEDULES

fixed ratio schedule (FR): a reinforcer is delivered after every n responses, where n is the size of the ratio
Example: on an FR 20 schedule, every 20 responses will be followed by a reinforcer
- an FR 1 schedule is the same as continuous reinforcement
- produces a stop-and-go pattern
- after each reinforcer there is a pause in responding, called the post-reinforcement pause
- once responding begins, the subject typically responds at a constant, rapid rate until the next reinforcer is delivered
- the average size of the post-reinforcement pause increases as the size of the ratio increases
- the subject's rate of responding after the post-reinforcement pause decreases fairly gradually as the size of the ratio increases

ratio strain: the general weakening of responding that is found when large response:reinforcer ratios are used

variable ratio (VR): on average, a subject will receive one reinforcer for every n responses, but the exact number of responses required at any moment may vary widely
- the pattern of responding is rapid and steady
- the major difference between FR and VR performance is the absence of long post-reinforcement pauses on VR schedules
- the pauses on VR schedules are several times smaller than those found on FR schedules with equal response:reinforcer ratios, because on VR there is at least the possibility, after each reinforcer, that another reinforcer will be delivered soon

random ratio (RR): a type of VR in which each response has an equal probability of reinforcement
Example: on an RR 20 schedule, every response has one chance in 20 of being reinforced, regardless of how many responses have
occurred since the last reinforcer

fixed interval (FI): the first response after a fixed amount of time has elapsed will be reinforced
- the subject typically makes many more responses per reinforcer than the one that is required
- there is a post-reinforcement pause, but after the pause the subject starts by responding quite slowly; as the interval progresses, the subject responds more and more rapidly
- just before reinforcement the response rate is very rapid (accelerating response pattern)
- this pattern is called the fixed-interval scallop

variable interval (VI): the amount of time that must pass before a reinforcer is stored (becomes available) varies unpredictably from reinforcer to reinforcer
- typically produces a steady, moderate response rate
- because a reinforcer might be stored at any moment, a long pause after reinforcement would not be advantageous
- by maintaining a steady response rate, the subject will collect each reinforcer soon after it is stored, thus keeping the VI clock moving most of the time

Extinction and the four simple schedules

rules about resistance to extinction:
- extinction is more rapid after CRF than after a schedule of intermittent reinforcement
- this is called the partial reinforcement effect
- explanations of the partial reinforcement effect:
  - discrimination hypothesis: in order for a subject's behaviour to change once extinction begins, the subject must be able to discriminate the change in reinforcement contingencies
    - with CRF, where every response is reinforced, the change to extinction is easy to discriminate, so it does not take long for responding to disappear
    - it takes longer to discriminate the change from a VR schedule to extinction
  - generalization decrement hypothesis: generalization decrement is the decreased responding one observes in a generalization test when the test stimuli become less and less similar to the training stimulus
    - less rapid responding
    - responding during extinction will be weak if the stimuli present during extinction are different from those that prevailed during reinforcement, but
strong if these stimuli are similar to those encountered during reinforcement
    - there is a large generalization decrement when the schedule is switched from CRF to extinction, because the subject has never experienced a situation in which its responses were not reinforced

Other Reinforcement Schedules

differential reinforcement of low rates schedule (DRL): a response is reinforced if and only if a certain amount of time has elapsed since the previous response
- if the subject responds before that time has elapsed, the clock resets
- produces very low rates of responding
Example: if the required wait is 10 seconds and the subject responds after 8 seconds, the clock resets to zero and the subject must wait 10 more seconds before a response can be reinforced

differential reinforcement of high rates (DRH): a reinforcer is delivered only if a certain number of responses occur within a fixed amount of time
- can be used to produce higher rates of responding
Example: a reinforcer might occur each time the subject makes 10 responses in 3 seconds or less

concurrent schedule: the subject is presented with two or more response alternatives (e.g., several different levers), each associated with its own reinforcement schedule

chained schedules: the subject must complete the requirements for two or more simple schedules in a fixed sequence, and each schedule is signalled by a different stimulus

Factors Affecting Performance on Reinforcement Schedules
- the effectiveness of a reinforcement schedule depends on the nature of the reinforcer that is delivered; three important features of a reinforcer are its quality, its rate of presentation, and its delay
- other factors: response effort, amount of reinforcement, level of motivation

Behavioural Momentum
- when a heavy object starts moving, it acquires momentum and becomes difficult to stop
- by analogy: the behavioural momentum of an ongoing operant behaviour
- a behaviour's resistance to change (which is a measure of behavioural momentum) depends on the
association between the discriminative stimulus and the reinforcer
- implications outside the laboratory: behavioural therapy
Example: behaviour therapists want to make sure that a newly trained behaviour will persist in the presence of potential disruptors, newly trai
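
The schedule definitions in these notes can be read as small decision rules: given a response at time t, is a reinforcer delivered? The Python sketch below is one way to express four of them (FR, RR, FI, DRL). The class names, the `respond(t)` method, timing the FI interval from the last reinforcer, and starting every clock at t = 0 are assumptions made for illustration; they are not part of the course material. VR and VI are left out because they need an extra choice about how the varying requirement is drawn.

```python
import random


class FixedRatio:
    """FR n: every n-th response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0  # responses since the last reinforcer

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class RandomRatio:
    """RR n: each response has probability 1/n of reinforcement."""

    def __init__(self, n, rng=None):
        self.n = n
        self.rng = rng if rng is not None else random.Random()

    def respond(self, t):
        # Independent of how many responses occurred since the last reinforcer.
        return self.rng.random() < 1.0 / self.n


class FixedInterval:
    """FI: the first response after `interval` seconds is reinforced.

    Assumption: the interval is timed from the last reinforcer (or t = 0).
    """

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforcer = 0.0

    def respond(self, t):
        if t - self.last_reinforcer >= self.interval:
            self.last_reinforcer = t
            return True
        return False


class DRL:
    """DRL: a response is reinforced only if at least `wait` seconds
    have passed since the previous response."""

    def __init__(self, wait):
        self.wait = wait
        self.last_response = 0.0

    def respond(self, t):
        reinforced = (t - self.last_response) >= self.wait
        self.last_response = t  # every response restarts the clock
        return reinforced


# FR 20: of 40 responses, only the 20th and 40th are reinforced.
fr = FixedRatio(20)
fr_outcomes = [fr.respond(t) for t in range(1, 41)]

# DRL 10 s, as in the example: the response at 8 s is too early and
# resets the clock, so the response at 15 s is also too early.
drl = DRL(10)
drl_outcomes = [drl.respond(t) for t in (8, 15, 26)]
print(drl_outcomes)  # [False, False, True]
```

Note that these classes model only the schedule (the experimenter's rule), not the subject: response patterns such as the post-reinforcement pause or the fixed-interval scallop come from the animal's behaviour and would not appear unless the response times fed in already show them.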
