PS260 Psychology – Midterm Review
Todd Ferretti

Intro to Cognitive Psych – Lecture 1

Scientists have a tendency to begin courses with the history of their discipline. This is understandable - we want to show you where it all came from. Still, in some ways, history makes more sense when you know something about a field already, so maybe it should be discussed at the end. Or maybe both. I've decided to go the standard way, starting with a brief history to set the context for what follows. This decision is based on the old maxim - "what better place to start than the beginning?" Also, this brief history will provide an introduction to many of the topics we will be discussing in this course. I promise not to go on at great length, despite the many fascinating individuals and ideas in the history of thinking about the operation of mind.

Representation and Process

Before heading back in time, there are a few terms and ideas we need to consider. It seems to me that there is one central theme that runs through the history of memory and cognition from earliest philosophy to current psychology. This concerns what I will call the two critical strands of cognition:
1. representation
2. process

When you ask what someone knows, you are asking about the representational component of cognition. Representation: the knowledge we possess, or the information that is in our memory. Questions like this also lead to questions about the structure or architecture of the cognitive system. Over the years, people have argued for two types of representational structure:
1. static: virtually never changing
2. dynamic: always changing

The earlier view was the static one, which seems quite intuitive. As one illustration, this provided the basis for the idea of superior hypnotic recall (i.e., recalling information under hypnosis that could not be recalled in one's normal state), because the original information had to be in memory.
After all, you rarely see other structures spontaneously change (e.g., buildings), so why should the contents of memory do so? Or, better yet, how?

Today, we prefer the dynamic view of cognition - that the system is constantly changing. Although a representation can be dynamic in its own right, the primary way this dynamic quality is imparted to the system is through process. Process: an operation on an external stimulus or on an internal representation. A process can create a new memory representation or make use of an existing one. Of course, a process can also update existing representations or reinterpret them, leading to the ever-changing quality of cognition. Otherwise, how would you learn your new phone number, or forget your old one? We must be able to reorganize information and ideas, based on the different tasks we are faced with. This active view of cognition is the very heart of today's cognitive psychology.

Of course, the constant interaction of representation and process means that it is virtually impossible to untangle these two threads of cognition, yet it may help us to try to think of them as separate from time to time. Indeed, Paul Kolers argued that what is kept in memory following an experience is not a record of the details of that experience but rather a record of the processes used during that experience - in his view, the processes are the structure! I'll have more to say about that later in the course.

Definition of Cognition

Well, perhaps it is time to try to define cognition. Webster's says it is derived from the Latin cognoscere, meaning "to get to know" - the "co" part comes from "con", meaning "with", and "gnoscere" means "to know" - and defines it as: knowledge from personal view or experience; perception; a thing known. Clearly, this emphasizes representation. However, the word is also related to the Latin cogito, meaning "I think", which adds process to representation. Let us try a more psychological definition.
William James (1890) defined psychology as "the science of mental life, both of its phenomena and their conditions." James' definition turned out to be far too narrow a definition of psychology, but it does describe cognitive psychology. Ulric Neisser (1967; Cognitive Psychology) wrote in the first text devoted to the study of cognitive psychology: "Cognition refers to all processes by which the sensory input is transformed, reduced, elaborated, stored, recovered, and used". Note that Neisser's definition very much reflects the process view of cognition.

We can adopt a definition of cognitive psychology that tries to capture the ideas of James, Neisser, and current researchers in the field. Cognitive psychology is the study of skills and knowledge - how they are acquired, stored, transformed, and used. The term "cognition" emphasizes the symbolic, mental, and inferred processes of mind. Cognition refers to the set of processes and representations involved in such activities as those listed on your syllabus - learning, remembering, thinking, reasoning, communicating, deciding, and so on.

One thing that you should be aware of right away is that this is a very difficult field, because the very thing we are trying to study is the one thing we cannot observe directly. You can't see these activities; you can only see their consequences. I can't see you remembering, for example; I can only tell from your behaviour (sometimes!) whether you remember or not. This is the classic learning/performance distinction, perhaps better called the cognition/performance distinction. In studying cognition, you can only study it indirectly, and then base your inferences on those indirect data. Consider the following two examples:

1. After a lovely romantic dinner, your partner asks you if you remember your first kiss (with him or her!). If you do, you're safe and you say YES and can even describe it. If not, what do you say?
Of course, you say, with a dreamy look on your face, YES, reach for his or her hand, and then try to shift the subject. Performance suggests the presence of knowledge, but the knowledge isn't there.

2. On the other side of the coin, your partner asks you if you remember agreeing to make dinner tonight. If you do, and still feel like it, you can say YES. If you do, but no longer feel like doing it, you may try to get out of it by saying NO. [I recommend suggesting dinner out.] Here, performance suggests the absence of knowledge, but the knowledge really is there.

The point of these two examples is to demonstrate that performance and cognition do not always match, and this is true not just in such contrived examples. For example, anyone who plays a sport (e.g., tennis or squash) can recall excruciating instances where a shot they know how to make simply doesn't work out. The knowledge is there but the performance isn't. This is a problem we will be dealing with throughout the course. In a very real sense, more than we care to accept, we do not know what is in our own minds, only what we think is there.

But the German philosopher Immanuel Kant in the 1700s had an idea that is very relevant to all science, and certainly to cognitive psychology. He called it the "transcendental method", and essentially it involved working backward from observable effects to infer their most probable causes. Now let's get back to the problem at hand, and briefly consider the history of the study of cognitive psychology.

Early History

The study of cognition begins with the Greek philosophers. Prior to Socrates, the emphasis was on perception and perceptual processes, and memory and cognition were seen as outgrowths of perception. Memories were simply stored literal perceptions or traces, and often there was a kind of primitive physiological basis in trying to locate these traces.
To understand mind, it was thought that physics was the critical science, because the mind was just a copy of the physical world. The first to actually deal with such problems as how information from the different senses was integrated was Diogenes of Apollonia, who conceived of the cognitive system as the integrator, and gave us the original meaning of the term "common sense." However, he was still a bit off track. Of the primitive elements the Greeks recognized, he chose air as the most likely candidate for the basic element of cognition, based on his observation that it was the only substance that regularly went in and out of the body.

But then along came Plato, who sharply upgraded thinking about cognition. While still emphasizing the individual senses, his avowed goal was to discover the object of mind - in our terms, the representational structure of cognition. His key idea was that of universals underlying perception of particulars (e.g., "dog" for "Lassie"). This is still a critical idea in modern cognitive theory, as we will see near the end of the course when we discuss concepts and reasoning. However, Plato gave us a static view of cognition in his "wax tablet" analogy of memory. It is a mark of his impact how persistent this view has been. Consider the following quotes:

"Imagine, then, for the sake of argument, that our minds contain a block of wax, which in this or that individual may be larger or smaller, and composed of wax that is comparatively pure or muddy, and harder in some, softer in others, and sometimes of just the right consistency." (Plato, approx. 380 BC)

"Some minds are like wax under a seal - no impression, however disconnected with others, is wiped out. Others, like a jelly, vibrate to every touch but under usual conditions retain no permanent mark." (James, 1890, p. 293)

Although Plato soon backed off from this view, because it created problems for his developing theory of universals, it is still with us today in popular ideas.
But it was Aristotle (384-322 BC) who had the major influence and whose ideas are most relevant today. First, he did not see universals as separate from particulars the way Plato did; to Aristotle, they formed part of the particular.

Plato: Dogness; Lassie, Fido, Benji
Aristotle: Lassie dogness; Fido dogness; Benji dogness

This suggested a certain kind of representation, and the only way to know a universal would be via active processing, because universals were not directly available to the senses. He didn't suggest how this might be done, but then again we still have not figured it out! Aristotle's principal contribution was his doctrine of association, which held that mental life could be explained in terms of two basic components: ideas (the elements) and associations (the links) between them. He posited three laws of association:
1. contiguity: same time or space
2. similarity: alike conceptually
3. contrast: opposites

These bases for connection still appear in theories today (cf. the importance of contiguity in classical conditioning, or Murdock's (1972) distinction between item and associative information in memory).

We have to jump through the dark ages all the way to the 1700s, when the British Associationists, led by Hobbes and Locke, reformulated the concept(s) of association. Although others had promoted the study of mental life in the interim, it was only then that things cognitive began to get moving again. This is interesting because so much of the resulting British Empiricism is just a complete rediscovery of Aristotle's ideas.

Through the late 1700s and early 1800s, all sciences blossomed, and in the early 1800s research on the physics of sensory systems became particularly focal. A history of psychology could detail the development of psychophysics under such giants as Helmholtz, Fechner, Weber, and others. These men were building a way to study the unobservable world of the mind, a necessary precursor to the emergence of cognitive psychology.
Psychophysics: the systematic study of the relation between the physical characteristics of stimuli and the sensations that they produce.

Franciscus Cornelius Donders (1868) was a Dutch physiologist who was the first to find a way to measure "thinking time". He measured the time between a stimulus and different types of responses, and used this reaction or response time to infer the duration of mental processing. For example, subjects were asked to press a key as quickly as they could when they felt a touch on either foot (simple reaction time), or to press the key only when they felt a touch on their right foot (choice reaction time). Donders compared simple reaction time with choice reaction time, and the difference was a measure of the decision time for the choice. We will see that measuring response time is still an important tool in the study of cognitive psychology.

Wundt (1879) usually receives credit for the creation of psychology, but the new science was emerging all around among the scientists of the day. Interestingly, Wundt was initially anti-cognition. He divided the subject matter of psychology into the simple psychical processes (reflex, sensation), which could be studied scientifically, and the higher psychical processes, "about which nothing can be discovered in such experiments." Yet Wundt, and later his students, led by Edward Titchener in the US, emphasized a method of studying psychical processes that was, in its way, quite cognitive. He wanted to study conscious mental processes, and since only the individual could know his or her own thoughts, the best way to study them was to look in on them. This technique came to be called introspection, and people were trained to make reports on their own minds. This way of doing research came to be called Structuralism, because the goal was to find the structural elements of mind.
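Donders' subtraction logic amounts to a simple difference score. Here is a minimal sketch; the reaction times are hypothetical numbers chosen only to show the arithmetic, not Donders' actual data:

```python
def decision_time(choice_rt_ms: float, simple_rt_ms: float) -> float:
    """Donders' subtraction method: the extra time taken on choice trials
    over simple trials is attributed to the added decision stage."""
    return choice_rt_ms - simple_rt_ms

# Hypothetical mean reaction times
simple = 190.0   # press key to a touch on either foot
choice = 250.0   # press key only to a touch on the right foot
print(decision_time(choice, simple))  # 60.0 msec attributed to the decision
```

The same subtractive logic (compare a task with and without the process of interest) underlies many modern response-time experiments.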
But introspection soon showed apparent disadvantages: you couldn't study unconscious events, and there is no way to avoid the "filter" of an individual's experience, beliefs, biases, etc. Introspection was subjective, not objective, and we demand that science be objective.

Wundt's impact was great enough that little empirical work on "higher" cognition was carried out in the 1800s, with the notable exception of Ebbinghaus's (1885) treatise called Memory, which really revolutionized experimental psychology. Ebbinghaus estimated the forgetting curve, and developed an ingenious procedure to measure memory, the method of savings. Ebbinghaus memorized lists of nonsense syllables and tested his own memory for these items at different intervals. Even though he might not be able to recall any of the nonsense syllables after a time, he found that it took less time (fewer trials) to relearn a previous list than it had taken to learn the list originally. The difference, or advantage of relearning over original learning, is the savings, and it provides a sensitive measure of memory. We'll talk about his work more during the course.

In the late 19th century, Functionalism, as set out by William James at Harvard, was very cognitive. The goal here, influenced by Ebbinghaus's success in experimenting on memory, was to develop experiments to test theoretical ideas about how cognition worked. James also made the distinction between primary and secondary memory - we know this today as the difference between short-term and long-term memory.

And then came the major movement of Psychology's history. In the 20th century, Behaviourism was incredibly anti-cognitive, going so far as to suggest there was nothing at all in the black box of mind (cf. behaviourism under Watson, and radical behaviourism under Skinner). Although many good empirical methods were developed in the first half of the century, the climate was totally wrong for a science of thought.
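Ebbinghaus's method of savings, described above, reduces to a simple percentage. A small sketch with hypothetical trial counts (not Ebbinghaus's own data):

```python
def savings_score(original_trials: int, relearning_trials: int) -> float:
    """Ebbinghaus's method of savings: the percentage of the original
    learning effort saved when relearning the same list later.
    A positive score indicates that some memory of the list remains,
    even if nothing can be recalled outright."""
    return 100 * (original_trials - relearning_trials) / original_trials

# Hypothetical: 20 trials to learn a list, only 12 trials to relearn it later.
print(savings_score(20, 12))  # 40.0 percent savings
```

The sensitivity of the measure comes from the fact that savings can be well above zero even when direct recall is at zero.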
Only the Gestaltists in Germany fought this tide, but their impact at the time was minimal, and largely constrained to the study of sensation and perception. (In contrast to Structuralism, the Gestaltists believed that sensations and perceptions could not be reduced to their basic elements. Rather, they believed that the "whole" was greater than the "sum of its parts" and had to be studied as such.) How did cognitive psychology finally emerge from the shadow of behaviourism?

Emergence of Cognitive Psychology

It is hard to imagine today believing that thought did not exist, and that all psychology could be explained in terms of stimulus-response links, as the behaviourists did. Still, their view had the advantage of dealing entirely in observables, and it was well defined. It was also a simple theory, worth a try to see how far they could run with it. Yet they ignored so much of what went on before and during their reign.

For instance, in his classic book Principles of Psychology, William James (1890) discussed in quite modern terms attention, memory, imagery, and reasoning. Kohler (1913) carried out studies of problem solving by apes on the island of Tenerife, observing how apes would use sticks and boxes to obtain bananas that were hung out of their reach. (As an historical aside, it has been suggested that during the course of his research, Kohler also served as a spy for his native Germany during World War I.) It was Kohler who introduced the concept of insight.

Sir Frederic Bartlett developed a very cognitive view of memory in Remembering (1932), where he described memory as a reconstructive process. His book was ignored at the height of behaviourism. Similarly, the importance of Duncker's (1945) work on thinking (in particular functional fixedness - the limitation in our ability to see a novel use for an item that already has a purpose, which interferes with creativity) was not fully appreciated until the fall of behaviourism. What finally brought about the change?
It was a confluence of factors in the mid-1950s and on. First, and most obvious, Behaviourism was failing. Some began to realize that mediation (i.e., internal thought) was a necessary construct, totally at odds with behaviourism. Piaget's (1954) developmental studies were crucial here. Second, information theory and communication theory grew up in the late 1940s (cf. Shannon & Weaver, 1949), leading to a new perspective on information, how it is processed, and how it could be measured. Third, modern linguistic theory developed new ways of looking at language, and in particular grammatical structure (cf. Chomsky, 1957), often at odds with behaviourism. [Indeed, Chomsky wrote a famous review that savaged Skinner's huge work on language, "Verbal Behavior", and it is often considered to be the death knell for behaviourism.] Fourth, computers became widely available, and with them came a new perspective on the processing and storage of information. All of these led to a new desire to understand the more complex mental processes, and to the birth of contemporary cognitive psychology.

The critical work of the period includes several papers and books that we will be dealing with in weeks to come. But two rather general books on cognition stand out - Miller, Galanter, and Pribram's (1960) Plans and the Structure of Behavior and, the first totally cognitive book, Neisser's (1967) Cognitive Psychology. By the mid-1960s, cognitive psychology was clearly establishing itself as the dominant approach in the field of psychology - important enough, as Reisberg notes, to be described as the cognitive revolution.

Lecture 2: Sensory Memory

Associated with each of our sensory systems is a memory that briefly holds the incoming information. Neisser (1967) named the sensory memory associated with the visual system iconic memory ("icon" meaning visual image), and the sensory memory associated with the auditory system echoic memory (the root word being "echo").
Today we are going to focus on iconic memory, in part because more research has been done on this sensory memory system. As an historical footnote, iconic memory was first documented by Segner (1740), a German scientist. He attached a glowing coal to the freely spinning wheel of a cart and gradually increased the rate of revolution until people reported seeing a continuous circle. After calculating the time needed for a single revolution at this speed, he determined that the duration of iconic memory must be about 100 msec. We will see that this turned out to be a fairly accurate estimate.

Let us begin our examination of iconic memory with how we process a visual scene. We do not take in all of the information available to us at once. Rather, we create an interpretation of the visual scene through a series of fixations. Each fixation lasts approximately 200 milliseconds (msec), or 1/5 of a second. The movements of our eyes from one fixation to another are called saccades. Saccadic (voluntary) movements take about 50-100 msec. Thus, we make 3-4 fixations per second.

The lecture slides show a typical scene, and a series of fixations numbered in order from 1 to 30. Note how the fixations center on the information-rich portions of the scene as they trace out the parts of the picture. Part of the information in the picture is taken in during each fixation, and these different parts are assembled to form an interpretation of the entire picture. During saccades, little or no information is processed, as the scene is a blur during these rapid movements. (You can determine for yourself that little information is taken in during these movements. Try to see your own eyes move in a mirror. You can't!)

Baxt (1895), who had an interest in the process of reading, wanted to know how much information we can glean from a single glance, or fixation. To answer this question he asked subjects to read a set of random letters.
The letters were covered with a solid wheel that had a segment cut out of it. When Baxt spun the wheel, the letters could be briefly seen through the empty segment. Baxt found that subjects could report 4-5 letters, on average. This limit became known as the perceptual span, or span of apprehension. For many years, the perceptual span was interpreted as the limit on the stimuli available for further processing.

Then, in 1960, George Sperling did his PhD thesis on the subject and changed our ideas. In his first experiment, he redid what Baxt had done. Sperling, however, had equipment (a tachistoscope) that allowed him to display visual images very precisely. When Sperling displayed a random set of letters for exactly 50 msec (less than the duration of one fixation), he observed, as Baxt had, that subjects could correctly report about 4 or 5 items. Was this the limit on the amount perceived in a single fixation? Subjects taking part in Sperling's experiment noted two things:
1. They claimed that they had actually seen the whole array, but that they "forgot" it while reporting. Was this a perceptual problem or a memory problem?
2. They claimed that the array seemed to fade before their mind's eye, but that it was definitely available to examine even after the display went off the screen. Were they right, and if so, what did this mean?

Sperling proposed a model of visual processing that included a visual sensory store that briefly held information. The information in this store is analyzed ("read") by a pattern recognition process that identifies (or "names") the stimuli, and the analyzed information is then held in immediate or short-term memory. Sperling set out to answer the question of what limits whole report to 4-5 items. Is the pattern recognition process too slow? Or is it due to limitations of short-term memory? Sperling presented the letter display to subjects for 500 msec instead of 50 msec.
If the pattern recognition process is slow, then giving this process more time should increase the number of letters subjects can read. If it is a limitation of short-term memory, then the extra time will have no effect. Sperling found that even when the display was presented for 500 msec, subjects still could only report 4-5 letters. He concluded that this limit was not because the pattern recognition process was slow. Rather, the limitation had to be due to short-term memory.

To overcome the limitation of short-term memory, Sperling came up with the partial report procedure. For some displays, he would ask subjects to report the whole display as had been done before (the whole report procedure). However, on other displays, he would signal subjects to report only one of the rows in the display. For instance, he signaled subjects to recall the top row with a high tone, the middle row with a medium tone, or the bottom row with a low tone. Subjects would not know until the instant the display went off the screen whether to report all or part, and if it was part, they would not know which part until the partial report cue was presented. In this task, the letter array was on the screen for 50 msec, and the tone occurred just after the array disappeared.

Sperling could estimate how many letters were available, or held in iconic memory, by multiplying the average number correct on the partial report trials by the number of rows. (This estimate is based on the logic that if subjects could correctly recall, say, 3 of 4 letters from one of three rows cued at random, then subjects must have had 3 of 4 letters available from every row. So, 3 letters recalled times 3 rows would equal 9 letters available in iconic memory.) Sperling found that the estimate of the number of letters available in iconic memory was greater than 4-5, the limit of whole report. So, iconic memory can hold more information than can be held in immediate memory.
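The partial-report estimate described above is a one-line multiplication. A minimal sketch (the 3-of-4, 3-row numbers are the illustrative values from the text, not Sperling's full data):

```python
def letters_available(mean_correct_per_cued_row: float, n_rows: int) -> float:
    """Sperling's partial-report estimate: if subjects report this many
    letters from a randomly cued row, that many letters must have been
    available from every row of the iconic image."""
    return mean_correct_per_cued_row * n_rows

# 3 of 4 letters reported from one of 3 rows cued at random:
estimate = letters_available(3, 3)
print(estimate)  # 9 letters available, well above the 4-5 whole-report limit
```

The key design choice is that the row is cued at random after the display ends, so subjects cannot selectively attend in advance; the per-row score therefore generalizes to all rows.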
Therefore, the perceptual span, or "span of apprehension", is really a limitation of memory and not of perception.

Sperling went on to answer three questions about the characteristics of iconic memory. We will consider each of these questions in turn.

1. What information is represented in iconic memory?

To answer this question, Sperling varied the type of cue that signaled which part of the stimulus display to report in the partial report procedure. Remember that Sperling first used an auditory tone to cue subjects as to which row to report. The fact that performance (the estimate of the number of letters available) was better in the partial report procedure than in the whole report procedure means that the partial report cue was effective. So we already know that spatial information, or location, is preserved in iconic memory, because subjects could report by row.

Sperling then presented letters of different sizes and cued subjects to report only the large, or only the small, letters. This was also an effective partial report cue, so information about size is also preserved in iconic memory. Similarly, Sperling presented letters in different colours and cued subjects to report only the red or only the green letters. Again, this was an effective partial report cue, so colour is also represented in iconic memory.

Finally, Sperling presented letters and digits, and cued subjects to report only the letters or only the digits. This cue was not an effective partial report cue. Why wasn't it? Note that all of the effective partial report cues are based on physical properties (location, size, colour). The difference between letters and digits is not a physical difference; it is a semantic difference. In other words, to know whether a symbol is a letter or a digit, one must interpret (or "read") the symbol. So, because letters versus digits was not an effective partial report cue, Sperling knew that the information held in iconic memory is precategorical.
That is, the information has not yet been processed for meaning. Thus, iconic memory preserves only the physical features of the stimuli.

2. What is the duration of iconic memory?

How long are the physical features of the stimuli held in iconic memory? To answer this question, Sperling varied the delay between the termination of the letter display and the partial report cue. He predicted that at very long delays after the array, the tone signal should not help, because the iconic image will have decayed away. Because subjects will not know which row to report until it is too late, they should get about 4 to 5 right, the perceptual span (or whole report limit). With no delay, subjects should do very well. The key question is what would happen at intermediate delays. Two outcomes seemed plausible. If information is only available while the display is actually visible - if there is no iconic image - then any delay should cause performance to drop to the span value. If there is an icon, then fading should result in a gradual decline in performance as the image fades over time. As you may have guessed, the fading view was supported, and Sperling showed that the second introspection of his subjects was correct, too. The image lasted somewhere in the vicinity of 250 msec to a second, depending on perceptual features such as brightness, etc.

Eriksen and Collins (1967) estimated the duration of iconic memory using a different type of procedure. They constructed pairs of dot patterns such that each one of the pair appeared as a random set of dots, but when the two dot patterns were presented together, letters could be seen. Eriksen and Collins presented the two patterns either simultaneously, or with a variable delay between the two. When the patterns were presented together, or when the delay between the presentation of the first and second patterns was less than a second, subjects could report the letters.
This shows that the first dot pattern was still in iconic memory when the second pattern was presented, as subjects could see the letters. But when the delay between the first and second patterns was too long (greater than a second), the first dot pattern had faded by the time the second pattern was presented, and subjects could not see the letters. Eriksen and Collins' experiment provides converging evidence that supports Sperling's estimate of the duration of iconic memory as being less than a second. (In cognitive psychology, as in other sciences, the more different ways we can demonstrate a finding, the more confidence we can have in that finding. This is the logic of converging operations.) Note as well that Eriksen and Collins' experiment also demonstrates that the capacity of iconic memory must be quite large, as iconic memory must be able to hold all of the dots from both of the patterns that they showed their subjects.

3. How is information lost from iconic memory?

We have already seen that information is lost from iconic memory through a decay process that occurs over time. Later researchers demonstrated one other way that information can be lost from iconic memory - pattern masking. When researchers presented a set of letters for subjects to report, and then a pattern mask (visual noise or random patterns), the pattern mask interfered with subjects' ability to report the letters. Just as the dot patterns used by Eriksen and Collins merged together in iconic memory when the delay between the two patterns was not too long, the letters and the pattern mask also merge together at short delays. Because the pattern mask is noise, it makes the letters difficult or impossible to read when they are combined.
When the pattern mask is presented just before the letters, it is called forward masking (because it interferes with a display that is forward in time), and when the pattern mask is presented just after the letters, it is called backward masking (because it interferes with a display that is backward in time). The greater the delay between the pattern mask and the letters, the less effect the mask will have. So, information in iconic memory is lost due to a rapid decay process, and information in iconic memory can also be lost due to the interference of other stimuli (masking).

Summary

Sperling's research has shown (1) that a lot is perceived in a single fixation, (2) that an iconic image persists after the display disappears, and (3) that this image decays and is lost very rapidly. Of course, he also showed that whole report, or the perceptual span, is not a good measure of what is perceived. Further research has shown that there are sensory memories associated with our other sensory systems as well.

QUESTION: How might you do an auditory experiment analogous to Sperling's experiments on iconic memory?
ANSWER: Use stereo, with two to four apparent directional sources, and cue them.

The major difference between the sensory stores for visual and auditory information is that echoic memory lasts longer - on the order of 1 to 4 seconds. This has survival value. Vision is simultaneous, with a 250 msec "window"; iconic memory helps us maintain a stable interpretation of the visual world (remember that we are essentially blind during the saccadic movements between fixations, which last about 50-100 msec). Auditory information such as speech, on the other hand, is sequential and spread out over time. Thus a longer-lasting echoic memory enables us to process signals such as speech over time. So, we know that there are sensory stores, that their capacity is quite large, and that forgetting occurs rapidly via decay and due to masking.
We also know that the information held in sensory memories is precategorical; that is, it represents the physical features of the stimulus and has not yet been analyzed for meaning. In the next two lectures we will examine how this information is processed for meaning. First we will consider pattern recognition in vision, and then we will examine speech recognition in audition. Lecture 3 – Pattern Recognition To begin, we can define pattern recognition as how people identify objects in their environment. It is a general set of processes whereby the continuous stream of stimulation around you is segmented into discrete, labelable units based on experience. This occurs at many different levels, from recognizing that a light is on to playing successful chess by recognizing vast numbers of board patterns. Although the process is very general and applies throughout the information processing system, we will focus on it where it first begins to matter - in transferring information from the sensory store to short term memory. Essentially, pattern recognition takes perceptual elements and transforms them into symbols and then into concepts that have meaning. This is done by relating the information in the sensory store to information already known and stored in long term memory. As is so often the case in studying cognition, pattern recognition is a very fluent and seemingly effortless skill, which makes it hard to study. Yet there are ways, and we will consider some of these as we try to understand how people recognize patterns. Template Theories Template theories maintain that patterns are treated as unanalyzed wholes, and that comparison of what is perceived to what is already known is accomplished by some kind of measure of overlap or similarity. That is how the number system at the bottom of bank cheques works - each of the digits always looks exactly the same and there are only 10 of them, so computer matching is a very straightforward process. 
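That rigid cheque-digit scheme can be sketched as a toy template matcher. Everything below (the 3x3 binary grids, the two hand-made templates, the overlap measure) is an illustrative assumption for this sketch, not an actual cheque-reading system:

```python
# A minimal template-matching sketch: a pattern is recognized by
# finding the stored template with the greatest cell-by-cell overlap.

def overlap(pattern, template):
    """Fraction of grid cells where pattern and template agree."""
    matches = sum(p == t
                  for row_p, row_t in zip(pattern, template)
                  for p, t in zip(row_p, row_t))
    cells = len(pattern) * len(pattern[0])
    return matches / cells

# Templates for a rigid font, like the digits on bank cheques.
TEMPLATES = {
    "1": [(0, 1, 0),
          (0, 1, 0),
          (0, 1, 0)],
    "7": [(1, 1, 1),
          (0, 0, 1),
          (0, 0, 1)],
}

def recognize(pattern):
    # Pick whichever template overlaps the input the most.
    return max(TEMPLATES, key=lambda name: overlap(pattern, TEMPLATES[name]))

# An exact "1" matches its template perfectly.
print(recognize([(0, 1, 0), (0, 1, 0), (0, 1, 0)]))  # 1

# Shift the same "1" one column to the right and it now overlaps
# the "7" template more than its own: the model has no way to
# normalize position, size, or orientation before matching.
print(recognize([(0, 0, 1), (0, 0, 1), (0, 0, 1)]))  # 7
```

The shifted "1" being misread as a "7" previews exactly the criticism that follows: without a separate normalization stage, a pure template model breaks as soon as the input varies.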
The striped Universal Product Codes that are scanned on products in stores work the same way. But what about more complex situations where pattern recognition is required? Imagine how hard it is to recognize handwriting using such a system. Everyone writes differently, using different angles, different sizes, and even different shapes. Thus, we would need an almost infinite number of templates to be able to recognize all the variations of a letter. Or, we would need a way to adjust the to-be-recognized pattern (in terms of orientation, size, etc.) before comparing the new pattern to our templates in memory. But how would the pattern recognition system know how to adjust the new pattern before it has been recognized? And what is the template for a potato? We can easily recognize potatoes, yet no two potatoes are identical. Also, a template model would not describe how two patterns differ, only that they do differ. Nor would it explain how the same pattern can have two different interpretations, as the symbol "O" (a zero and a letter) does. Thus, template theories seem unlikely to handle the complex problem of human pattern recognition at a general level, although they may actually be used in some circumstances. Such an instance might be the early stages of learning a new alphabet or set of symbols. Feature Theories Another way to explain pattern recognition is via a feature theory. A feature theory can be defined as a system that allows us to describe a pattern by listing the elements of that pattern. Patterns consist of elementary attributes which, when put together and interpreted, can be seen as a meaningful concept. One of the advantages of a feature theory is that it ties in well with what we know about how people identify concepts (such as "furniture"), a problem we will be considering later in the course. There are three lines of evidence that provide compelling support for the view that pattern recognition is based on the analysis of component features. 
Let's consider each type of evidence in turn. 1. Visual Confusions The more similar two items are in terms of their features, the greater the chance they will be confused. Let's consider the letters of the English language. The different features of each letter can be distinguished, as in the table in the lectures. When these letters are presented rapidly (i.e., under conditions that make it difficult to always correctly identify the letters), confusions among similar letters increase. The more features two letters have in common, the more likely they are to be confused. Visual confusions based on the number of shared components between items are one line of evidence indicating the importance of feature analysis in pattern recognition. 2. Visual Search Studies In visual search studies, subjects are asked to search a display for a particular target. Neisser (1967) used this task to demonstrate the importance of features. He found that subjects were faster to find a letter like "Z" when it was in a list of letters with rounded features (e.g., O, R, B, etc.) than when it was in a list of letters composed mostly of horizontal and vertical lines (e.g., T, M, V, X). In other words, searching for a target is easier when the non-target letters have dissimilar features than when they share many of the same features as the target. Neisser also asked subjects to find target words in lists of different words (e.g., is the word "sand" in the list?), or to search for words with a particular meaning (e.g., is there an animal in the list?). He found that subjects were much faster at finding specific words (like "sand") than at searching for words based on meaning. This is because it is easier to base one's search on physical features rather than having to read each word, interpret its meaning, and base the search decision on the meaning of the word. 
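Both the confusion data and the search data turn on the same quantity: how many features two items share. A minimal sketch, using a made-up feature inventory (the real letter-feature table is in the lecture slides; these sets are illustrative assumptions only):

```python
# Toy binary feature sets for a few letters. More shared features
# should mean more confusions and slower visual search.
FEATURES = {
    "E": {"horizontal", "vertical"},
    "F": {"horizontal", "vertical"},
    "O": {"closed_curve"},
    "Z": {"horizontal", "oblique"},
    "T": {"horizontal", "vertical"},
}

def shared_features(a, b):
    """Count of features two letters have in common."""
    return len(FEATURES[a] & FEATURES[b])

# Confusions: E and F share every listed feature, so a feature
# theory predicts they are confused far more often than E and O.
print(shared_features("E", "F"))  # 2
print(shared_features("E", "O"))  # 0

# Search: finding "Z" among O's (no shared features) should be
# faster than finding "Z" among T's (a shared feature).
print(shared_features("Z", "O"))  # 0
print(shared_features("Z", "T"))  # 1
```

The single number does double duty, which is one reason feature theories are attractive: one representational assumption accounts for two separate behavioural findings.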
Anne Treisman carried out a series of visual search studies that also demonstrated the important role of features in pattern recognition. In one study, she found that subjects were faster to search for a target that was an incomplete circle amidst non-targets that were complete circles, compared to searching for a complete circle among incomplete circles. This result indicates that the visual system treats "gap" as a feature, but does not treat "no gap" as a feature. Treisman also showed that subjects were very fast at searching for target letters such as a "green T" when none of the non-target letters were green. In contrast, when subjects looked for a "white T" amidst white and black letters, they were much slower to find the target. Searches that can be based on a single feature (such as "green") are much faster than searches based on a combination or conjunction of features (such as "white" and "T"). Indeed, in a single feature search the target seems to "pop out" at you, but there is no pop out for conjunction feature searches. 3. Physiological Evidence of Feature Detectors Visual confusions and visual search experiments provide behavioural evidence for feature detection in pattern recognition. There is also physiological evidence for feature detectors. Information collected from the photoreceptors in the retina is sent via the optic nerve to the lateral geniculate nucleus or LGN. (It is interesting to note that information from all of our senses except one is relayed through the thalamus, of which the LGN is the visual portion. The exception is our sense of smell, which is one of our most primitive senses. Information about odour goes from the receptors in the nasal cavity directly to the part of the brain called the olfactory bulb.) From the LGN, the visual signal is projected to the visual cortex. There are different types of cells, arranged in columns, in the visual cortex. 
Studies that have measured the responses of these cells to different types of visual stimuli have shown that the different types of cells are tuned to different types of features. For example, simple cortical cells are most sensitive to lines of a particular orientation, whereas hypercomplex cells are most responsive to specific combinations of features such as corners or angles. You needn't worry about which cells do what. The important point to note is that different cells in the visual cortex are specialized for different visual features. These cells are the feature detectors that provide the basis for pattern recognition. Structural Theories of Pattern Recognition Structural theories take feature theories as a starting point, then try to define the relations among features once the set of features is specified. Thus, structural theories extend theories of feature analysis, and emphasize the relations among features, or how features fit together in pattern recognition. This additional complication is necessary for a successful pattern recognition theory because of the precision it provides. Unfortunately, these theories have not been taken very far as yet. For our present purposes, I just want to provide one experimental example of the importance of structural theories of pattern recognition. Biederman (1985) took line drawings of common objects and removed 65% of the lines. He then asked subjects to try to identify the objects. The slides in the lectures provide examples of these incomplete objects - is it easier to identify the pictures on the left or the right? Subjects were far better at identifying the objects when the missing lines were at midsegments rather than at vertices. In other words, when corners or angles were preserved it was easier to identify the pictures, because more information about how the features fit together was still available. 
So, both features (feature theories) and how features go together (structural theories) are important in pattern recognition. Bottom-Up Processing So far we have been talking about a model of pattern recognition that begins with the sensory input and ends with an abstract, meaningful interpretation of the input. This is a "bottom-up" view of the pattern recognition process. In information processing terms, the sensory information is the bottom and the representation is the top. Consider the part of the system that is trying to recognize the letters in the word "birthday". A bottom-up process to accomplish this would go through a series of successive steps, with the output of each step serving as the input to the next step. Gradually, we would move from light vs. dark splotches on a page to the separation of figure and ground, to the identification of the features, identification of the letters, and finally to the recognition of the word. A bottom-up process begins with the sensory input and ends with its representation, with a series of orderly steps from bottom to top in between. The defining property of a strictly bottom-up process is that the outcome of a lower step is never affected by a higher step in the process. Read the following sentence out loud as quickly as you can: Paris in the the spring. Did you notice there are two "the"s in the sentence? If reading (and all pattern recognition, for that matter) were strictly a bottom-up process, you would have seen the repetition. You would also be reading much more slowly than you do. Bottom-up processes are complemented by our knowledge and past experience. You read the above sentence very quickly, in part, because it is familiar to you, and you can anticipate the end of the sentence from the beginning. This is the contribution of top-down processes to pattern recognition. Top-Down Processing We use knowledge based on past experience, and the context, to guide pattern recognition. This is top-down processing. 
Both bottom-up and top-down processing operate together in pattern recognition. This is called the interactive model of processing. The lecture slides provide some examples of how context influences pattern recognition. For example, in the two handwritten sentences, the identical pattern is recognized as "went" in one sentence and "event" in the other sentence. The use of context to make the most appropriate interpretation of an ambiguous pattern demonstrates the role of top-down processing. Look at the slide in the lectures that shows black shapes on a white background. If you have never seen this picture before, you might have trouble seeing what it is. It is, initially, difficult to discriminate the figure from the background and identify the relevant features of the figure. If you have never seen this picture before, you must rely only on bottom-up processes to identify the pattern. (The picture is a Dalmatian dog sniffing leaves on the ground.) But the next time you see this picture, you will have no trouble seeing the dog. You can use past experience, or top-down processing, to help interpret a familiar picture. Similarly, you may initially have trouble seeing the deer's mate (outlined in the trees) or seeing the hidden tiger the first time you look at these pictures. But the next time you see these pictures, the hidden mate and the hidden tiger will jump out at you, because of top-down processing based on prior knowledge. (The hidden tiger is in the letters defined by the tiger's stripes.) Biederman, Glass and Stacy (1973) illustrated the contribution of top-down processing in their visual search study. They gave subjects pictures and asked them to search for a particular target (like a bicycle or a hydrant). Some of the pictures were normal, and others were normal pictures that were "jumbled up". Note that both types of pictures contain exactly the same visual information. 
Not surprisingly, the subjects were much faster in finding the target in the normal picture than in the jumbled picture. This is because subjects can use context and prior knowledge (hydrants are on the ground), or top-down processing, to help guide their visual search in normal pictures. When the picture is jumbled, the context is lost and top-down processing cannot play much of a role. Interactive Model of Reading In a study involving letter and word identification, Reicher (1969) presented subjects with a briefly presented stimulus display that was either a four letter word, a single letter, or a nonword (a four letter word with the letters scrambled). The study display was followed by a test display that contained a pattern mask (##'s) and two letters. (The pattern mask was to eliminate the iconic image of the stimuli to make reading the stimuli more difficult.) The subjects were asked to indicate which of the two letters in the test display had been shown in the study display. In which of the three conditions (word, single letter, nonsense word) do you think it would be easiest to identify the letters? It seems obvious that the nonsense word would be the most difficult condition because it involves an unfamiliar combination of letters. And that would be right. But what is not so obvious is that subjects were more accurate in identifying the letter at test when it was presented in the context of a word than when the letter was presented by itself. This finding has been called the: Word Superiority Effect: A letter is identified more accurately in the context of a word than when it is presented by itself. When a random letter is presented by itself, without any context, we can only use bottom-up processes to identify the letter. When a letter is presented in the context of a word, we can supplement bottom-up processes with top-down processes (information about what letters go together in words), which helps to speed up the letter identification process. 
(When a letter is presented in the context of a nonword, the top-down processes can hinder letter identification because a nonword involves irregular or unfamiliar letter groupings.) The word superiority effect provides us with a basis for McClelland and Rumelhart's (1981) interactive model of reading. (The diagram of this model is a simplified version of the actual model, but it serves our purpose here.) In the model, bottom-up processes operate on the visual input to identify first features, then letters, and finally words. Note that top-down processes at the word level can provide information that helps speed up identification at the letter level. It is this interaction between top-down and bottom-up processes that makes pattern recognition such a fluent and efficient process. Lecture 4 – Speech Perception Many Intro Cognitive texts discuss speech perception as a precursor to language. I, however, like to consider speech perception as part of our discussion of pattern recognition because speech perception is, after all, just another form of analyzing and identifying patterns. In speech, the patterns to be recognized are auditory signals spread out over time. The smallest element of speech is the phonetic segment or phone. The phonetic alphabet is a culture-free system for describing all sounds used in any language. Any given language uses only a subset of all the possible phones. The phoneme is the smallest element of speech that makes a meaningful or semantic difference in a specific language. For example, consider the two phones /k/ and /q/ (as in the words keep and cool). (To convince yourself that these two "k" sounds are different, say "keep" and "cool" and notice that your mouth makes different initial movements in saying these words.) In Arabic these two phones are also phonemes, as they distinguish between words. But in English, we treat these two phones as equivalent, and they do not make a meaningful difference in distinguishing between words. 
There are two ways to describe the characteristics of speech sounds. First, speech sounds can be thought of in terms of how they are produced, or articulated. As an example, say the phonemes /b/, /d/, and /g/ out loud (pronounce these as speech sounds, not as letter names - the //'s indicate that it is a phoneme rather than a letter of the alphabet). Notice how the positions of both your lips and tongue are different when you say these three phonemes. Place of articulation is one type of articulatory feature of speech sounds. For /b/, the force of the sound is at the front of the mouth with the lips initially closed. This place of articulation is called bilabial. In saying /d/, the force of the sound is from the middle of the mouth; your tongue touches the roof of your mouth. This articulatory feature is called apical. Finally, for /g/ the force of the sound is further back still, and this feature is called velar. A second type of articulatory feature is voicing. Some phonemes, like /b/, are more forcefully expressed (voiced), whereas other phonemes, such as /p/, are less strongly emitted (unvoiced). There are several other types of articulatory features that we need not worry about. The main point here is that speech sounds can be described in terms of how our speech apparatus (the lips, tongue, and larynx) actually produces or articulates phonemes. The second way to distinguish between phonemes is in terms of their physical or acoustic features. A speech spectrogram shows what sound frequencies are present and how the frequencies vary over time in speech. Spectrograms for phonemes have three dominant frequency bands that rise or fall to a steady state over time. These are called formants (numbered one to three from low to high frequencies). The formants (especially the first and second) provide the necessary auditory information to distinguish one phoneme from another. 
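The two articulatory dimensions just described (place of articulation and voicing) can be laid out as a small feature table. The entries below follow the lecture's own examples; the dictionary layout and helper function are illustrative assumptions, not a standard phonetics tool:

```python
# A toy articulatory feature table for a few English phonemes,
# using the two dimensions discussed in the lecture.
PHONEMES = {
    "b": {"place": "bilabial", "voicing": "voiced"},
    "p": {"place": "bilabial", "voicing": "unvoiced"},
    "d": {"place": "apical",   "voicing": "voiced"},
    "g": {"place": "velar",    "voicing": "voiced"},
}

def differing_features(a, b):
    """List the articulatory features that distinguish two phonemes."""
    return [f for f in ("place", "voicing")
            if PHONEMES[a][f] != PHONEMES[b][f]]

# /b/ and /p/ differ only in voicing; /b/ and /g/ only in place.
print(differing_features("b", "p"))  # ['voicing']
print(differing_features("b", "g"))  # ['place']
```

Describing phonemes as feature bundles this way is what lets a feature theory of speech perception parallel the feature theory of visual letters: similarity, and hence confusability, falls out of how many features two sounds share.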
Just as we saw for visual pattern recognition, speech perception involves the operation of bottom-up and top-down processes to identify features, phonemes, and words. I am going to discuss three phenomena of speech perception - phoneme restoration, segmentation, and co-articulation (or parallel transmission) - that show the importance of top-down processes in speech perception. We will end with the phenomenon of categorical perception, which provides evidence for feature detection. The analysis of features provides the basis for bottom-up, or data-driven, processes in speech perception. Top-Down Processing 1. Phoneme Restoration Warren (1970) asked subjects to listen to tape-recorded sentences. Warren physically removed a phoneme from a word in a sentence and replaced the missing phoneme with the sound of a cough. After listening to each sentence, he asked his subjects if they heard anything unusual and whether anything was missing. Subjects heard the cough, but most subjects did not notice that a phoneme was missing. Warren called this phenomenon "phoneme restoration". Based on the context of the other words in the sentence, top-down processes filled in, or restored, the missing information. This is done in such a fluent manner that we usually do not even perceive that something was missing. In a second, and more dramatic, study, Warren and Warren (1970) also omitted a phoneme from a word in a sentence. An example of one of their sentences is: It was found that the *eel was on the orange. Note that the word with the missing phoneme could be one of several different words (meal, wheel, peel, steal, deal, heel, etc., which all sound alike except for the first phoneme). Also note that there is only a single word in the sentence that provides the context to interpret or understand the incomplete word (orange), and this word occurs at the end of the sentence. Even so, Warren and Warren still found that subjects rarely noticed the missing phoneme. 
Thus, top-down processes do not need very much context to operate, and top-down processes can influence the processing of preceding as well as subsequent information in a sentence. 2. Segmentation Think back to a time when you heard a foreign language that you do not speak or comprehend. What did you hear? A language we do not understand sounds like a fairly continuous stream of sounds. This is, actually, how it should sound. A spectrogram of ongoing speech shows that there is a continuous stream of sound. Now, what do you hear when you listen to a language that you do understand? Our perception is that we hear a series of distinct words with pauses between the words. These pauses, though, are an illusion created by our analysis and comprehension of the words in the sentence. The gaps between words in spoken speech are not physically present in the speech stream. But if the speech stream is continuous, how does the pattern recognition system know how to partition the phonemes into the right words? This is the segmentation problem. And the solution to this problem is the contribution of top-down processes that use context to help guide the analysis of speech. Let me try to illustrate this idea with an example. Suppose you simply said to a friend out of the blue (i.e., with no context) "more on". Would your friend hear "more on" or "moron"? How would they know whether you said one word or two? They wouldn't. They would need context in order to interpret what you said (unless, of course, they jumped to a hasty conclusion!). One can think of many such examples of expressions that are acoustically ambiguous in the absence of context. Here are just a few: "intense" or "in tents", "reel of" or "real love", and one of my favourites, "fun guy" or "fungi". I hope the point here is clear. Top-down processing based on context plays an important role in solving the segmentation problem in continuous speech. Without context, it can be difficult to know how to interpret speech.
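The "more on" / "moron" ambiguity can be made concrete with a tiny sketch of the segmentation problem. Words are stored here as phoneme sequences (the ARPAbet-like symbols and the three-word lexicon are illustrative assumptions, not a model from the lecture), because the real speech stream contains no spaces:

```python
# A toy lexicon mapping phoneme sequences to words.
LEXICON = {
    ("m", "ao", "r"): "more",
    ("aa", "n"): "on",
    ("m", "ao", "r", "aa", "n"): "moron",
}

def segmentations(stream, words=()):
    """Return every way to carve the phoneme stream into known words."""
    if not stream:
        return [words]
    parses = []
    for i in range(1, len(stream) + 1):
        chunk = tuple(stream[:i])
        if chunk in LEXICON:
            # This prefix is a word; recursively segment the rest.
            parses.extend(segmentations(stream[i:], words + (LEXICON[chunk],)))
    return parses

# Bottom-up analysis alone yields two equally valid parses of the
# same continuous stream; only context can choose between them.
stream = ("m", "ao", "r", "aa", "n")
print(segmentations(stream))  # [('more', 'on'), ('moron',)]
```

The point of the sketch is that nothing in the acoustic signal itself favours one parse: the choice between "more on" and "moron" has to come from top-down context, exactly as the lecture argues.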