Chapter 7 - Attention and Scene Perception
The retinal array contains far more information than we can process.
Attention is not a single thing, and it does not have a single locus in the nervous system. Rather, attention is the name we give to a family of mechanisms that restrict processing in various ways.
Attention can be internal or external.
External attention refers to attention to stimuli in the world.
Internal attention refers to our ability to attend to one line of thought as opposed to another, or to select one response over another.
Attention can be overt or covert.
Overt attention usually refers to directing a sense organ at a stimulus: fixating the eyes on a single word, for example.
Covert attention is, for example, when you're pointing your eyes at this page while directing attention to a person to the left.
Reading this text while continuing to be aware of the music playing in the room is an example of divided attention.
Watching the pot to note the moment when the water starts to boil is sustained attention.
In this chapter we will be concerned with selective attention, the ability to pick one (or a few) out of many stimuli.
Selection in Space
Average reaction time (RT) is the amount of time that elapses between the point when the probe appears and the point when the subject hits the response key.
Cue: a stimulus that provides a hint about where the target might appear.
Given a valid (peripheral) cue, RT decreases because the cue allows the subject to pay attention to the correct location.
If a cue is invalid, RT increases (slower response) because the subject has been fooled into attending to
the wrong location.
We could have valid or invalid instances of either symbolic or peripheral cues.
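The effect of valid versus invalid cues on RT can be summarized as a difference of mean RTs. The following is a minimal sketch of that calculation; the function name and the RT values are made up for illustration and are not data from the chapter.

```python
from statistics import mean


def cueing_effect(valid_rts, invalid_rts):
    """Return mean invalid-cue RT minus mean valid-cue RT, in ms.

    A positive value means responses were faster after valid cues
    than after invalid cues.
    """
    return mean(invalid_rts) - mean(valid_rts)


# Hypothetical RTs in milliseconds (illustrative only, not real data).
valid_rts = [310, 295, 320, 305]
invalid_rts = [360, 345, 370, 355]

effect = cueing_effect(valid_rts, invalid_rts)
print(f"Cueing effect: {effect} ms")
```

Repeating this calculation at several SOAs would give one point per SOA on a cueing-effect curve.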
How long does it take for a cue to redirect our attention?
We can measure the timing of the attentional shift by varying the interval between the onset of the cue (time 2) and the onset of the probe (time 3). This interval is the stimulus onset asynchrony (SOA), a psychophysical variable.
If the SOA is 0 ms, the cue and probe appear simultaneously. There is no time for the cue to be used to
direct attention, and there is no difference between the effects of valid or invalid cues.
As the SOA increases to about 150 ms, the magnitude of the cueing effect from a valid peripheral
cue increases, as shown by the red line in Figure 7.4. After that, the effect of the cue levels off or
declines a bit.
Symbolic cues take longer to work, presumably because we need to do some work to interpret them.
We are built to get information from the gaze of others.
The Spotlight of Attention
Attention could be deployed from spot to spot in a number of ways. It might move in a manner analogous to the movements of our eyes. When we shift our gaze, our point of fixation sweeps across the intervening space.
The spotlight metaphor makes good sense, but there are other possibilities:
Attention might expand from fixation, growing to fill the whole region from the fixation spot to the cued location, and then shrink to include just the cued location. This is a version of a zoom lens model of attention.
When attention is withdrawn from the fixation spot, it might not move at all. It might simply melt
away at that location and then reappear at the cued location.
The best evidence suggests that attention does not move from point to point in the brain in the way a physical spotlight would move across the world.
Visual search experiments provide a closer approximation of some of the actions of attention in the real world. In a typical visual search experiment, the observer looks for a target item among distractor items.
Set size: the number of items in the display. As a general rule, it is harder to find a target as the number
of items increases.
As the task of finding something becomes harder, the slope relating RT to set size grows steeper.
Saying "yes, the target is present" is faster than saying "no" because, even in the hardest task, a lucky subject might stumble on the presence of the target with her first deployment of attention, but it is not possible to stumble on the absence of the target in the same way.
Efficiency is used to describe the ease with which we can work our way through a display.
For example, if we can direct attention to the target as soon as the display appears, the search is efficient.
If we must examine each item in turn until we find the target, the search is inefficient.
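Efficiency is usually read off the slope of the function relating RT to set size. The sketch below fits that slope by ordinary least squares; the function name and all RT values are hypothetical and chosen only to contrast a flat function with a steep one.

```python
def search_slope(set_sizes, mean_rts):
    """Least-squares slope of mean RT (ms) against set size, in ms per item."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts)
    )
    variance = sum((x - mean_x) ** 2 for x in set_sizes)
    return covariance / variance


# Hypothetical mean RTs (ms) at three set sizes; illustrative only.
set_sizes = [4, 8, 16]
efficient_rts = [450, 450, 450]    # RT does not change with set size
inefficient_rts = [500, 600, 800]  # RT grows with every added item

print("efficient slope (ms/item):", search_slope(set_sizes, efficient_rts))
print("inefficient slope (ms/item):", search_slope(set_sizes, inefficient_rts))
```

A slope near 0 ms per item indicates an efficient search; a steeper slope means each added item costs more time.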
Feature Searches are Efficient
Feature search: target is defined by the presence of a single attribute.
Salient: an item that stands out visually.
We can process the color or orientation of all the items at once = parallel search.
The slope of the function relating RT to set size in such searches is about 0 ms per item.
Between one dozen and two dozen basic attributes seem able to support parallel visual search. These include obvious stimulus properties like color, size, orientation, and motion, and some less obvious attributes like lighting direction.
In Figure 7.7, the major change is that the objects no longer look like 3D bricks and no longer differ in their apparent orientation in depth, so they are harder to find.
Many Searches are Inefficient
When the target and distractors in a visual search task contain the same basic features, the search is inefficient: each additional item imposes a significant cost on the searcher.
Serial self-terminating search: a search in which items are examined one after another (serially) either until the target is found or until all the items have been checked.
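A serial self-terminating search can be sketched as a small simulation. It assumes, for illustration only, that items are checked in random order and that the target is equally likely to be at any position in that order; the function names are hypothetical. Under that assumption, a target-present trial checks (N + 1) / 2 items on average, while a target-absent trial always checks all N, which is one way to see why "yes" responses tend to be faster than "no" responses.

```python
import random


def items_examined(set_size, target_present, rng):
    """Simulate one serial self-terminating search; return items checked."""
    if not target_present:
        # Every item must be checked before the searcher can answer "no".
        return set_size
    # Assumption: the target is equally likely to be at any position
    # in the order in which items are checked.
    target_position = rng.randrange(set_size)
    return target_position + 1


def mean_items_examined(set_size, target_present, trials=10_000, seed=0):
    """Average number of items checked over many simulated trials."""
    rng = random.Random(seed)
    total = sum(
        items_examined(set_size, target_present, rng) for _ in range(trials)
    )
    return total / trials


for n in (4, 8, 16):
    print(
        n,
        "present:", mean_items_examined(n, True),
        "absent:", mean_items_examined(n, False),
    )
```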
A target can hide in plain sight: the items are easy to see and very familiar, but the search is still inefficient.
In Real World Searches, Basic Features Guide Visual Search
Guided search: a search in which attention can be restricted to a subset of possible items on the basis of information about the target item's basic features.
An object is distinguished from most of the distractors by a conjunction of several features.
Conjunction search: the target is defined by the conjunction (the co-occurrence) of two or more features.
In terms of efficiency, two-feature conjunction searches tend to lie between the very efficient feature searches and the inefficient serial searches.
In example 7.10, each distractor shares two features with the target, but the target is still easy to find because we can guide our attention to the right conjunction of features.
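The idea of guidance can be illustrated with a toy model: use one basic feature to restrict attention to a subset of items, then examine only that subset. This is a deliberate simplification of a two-feature conjunction search, not the mechanism proposed in the chapter; the `Item` class, the function names, and the display are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    color: str
    orientation: str


def guide_by_feature(items, feature, value):
    """Return only the items whose given feature matches the target's value."""
    return [item for item in items if getattr(item, feature) == value]


def guided_conjunction_search(items, target):
    """Restrict attention by color, then check the candidates one by one.

    Returns (found, number_of_items_examined).
    """
    candidates = guide_by_feature(items, "color", target.color)
    examined = 0
    for item in candidates:
        examined += 1
        if item == target:
            return True, examined
    return False, examined


# Hypothetical display: one red vertical target among red horizontal
# and blue vertical distractors.
target = Item("red", "vertical")
display = [
    Item("blue", "vertical"),
    Item("red", "horizontal"),
    Item("blue", "vertical"),
    Item("red", "horizontal"),
    target,
    Item("blue", "vertical"),
    Item("red", "horizontal"),
    Item("blue", "vertical"),
]

found, examined = guided_conjunction_search(display, target)
print(f"found={found}, examined {examined} of {len(display)} items")
```

Because the blue items are never examined, the cost of the search depends on the size of the guided subset rather than on the full set size.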
In Real World Searches, the Real World Guides Visual Search
Searching for arbitrary objects is not efficient, so the ease with which we search in the real world must involve more than feature guidance.
Scene-based guidance: information in our understanding of scenes that helps us find specific objects in
scenes (ex: faucets are near sinks). This type of guidance