Form Perception I (March 4 2013)
Gestalt Principle: “The whole is greater than the sum of its parts.”
In the 1920s and 1930s, German psychologists studied how people perceive the world around them.
They believed that people perceive the whole stimulus rather than its individual parts.
The Gestalt movement was in part a reaction to the structuralist approach in vogue at
the time, which suggested that everything could be reduced to basic elements.
Ex: the perception of movement you experience when watching a movie, which is made by
flashing slightly different static pictures every second. There isn’t continuous movement
in or across any of the frames, but we still perceive continuous movement as we
watch the rapid sequence of still pictures.
- Motion is an emergent property of the sequence of pictures.
- The perception of the movie in its entirety, including all of the complex movement, is
something more than the collection of thousands of still photographs. We can sit down
and analyze every photograph, but that would never provide the same rich experience
that we would get from watching the movie itself.
Six Gestalt Principles:
1) Figure-ground: the ability to determine what aspect of a visual scene is part of the
object itself and what is part of the background.
o ex: viewing a vase of flowers against a flowery wallpaper is an example of
o the snowman: a small, enclosed region that is completely surrounded by a larger
region, which would be the background. The figure tends to have distinct
borders or edges that give it a perceptible form, and it is perceived as being in front of
the background, which is typically formless or made up of multiple forms.
o We constantly determine what is figure and what is background. We do it so often
and so automatically that the entire process seems too straightforward to even
contemplate. But this seemingly simple process can be more difficult if the cues
that are used to make these figure/ground decisions aren’t clear, as is the case
with reversible figures.
2) Proximity: elements that are close together in space tend to be perceived as belonging together.
Ex: a field of daisies. The daisies aren’t uniformly spaced but tend to be
clustered close together in some areas and fewer in
number in other areas. You will naturally see each region of high daisy density as
one group of daisies because of their proximity to each other, rather than grouping
together some daisies from one cluster with some from another cluster.
Ex: a row of X’s that vary in the spacing between X’s. According to proximity,
you’re more likely to group together the X’s that are close together than the X’s
that are far apart.
3) Closure: the Gestalt principle that if there are gaps in the
contours of a shape, we tend to fill in those gaps and perceive a whole object.
Ex: although the telephone pole is in front of the truck, and blocking part of the
contour of the truck, we don’t perceive the object as being two separate pieces
of a truck. Instead, we automatically fill in the missing part that we can’t see and
perceive the truck as a whole object.
Our eye will automatically perceive it as a rectangle, even if there are obvious
gaps in it. In fact, our tendency to fill in the gaps may be so strong that we might
even think we see faint lines across the gaps.
4) Similarity: the tendency of humans to group together elements that are physically similar.
Ex: driving by a farmer’s field that had alternating rows of sunflowers and corn.
Even though the distance between rows might be the same as, or even less than,
the distance between plants within a row, we tend to group together the
vegetation of the same type.
Suppose we have a grid made up of alternating columns of X’s and O’s. We will
tend to see the columns of the same elements, either all X’s or all O’s, as
belonging together, rather than grouping together a row of “xoxoxo”.
5) Continuity: the Gestalt principle that lets us perceive a simple, continuous form rather
than a combination of awkward forms.
Ex: the letter “X” tends to be perceived as two continuous lines, “/” and “\”, that
cross in the middle, rather than as a combination of “v” and “^”.
Seeing a vase of flowers, we’re likely to perceive each stem as a continuous line,
even though the stems are likely to be criss-crossed with other stems. It’s not the case
that we see the tops of two stems as one form.
6) Common fate: the idea that things that change in the same way should be grouped together.
From the fish pic, we tend to group elements together if they are moving
together in the same direction at the same time. Looking at a school of fish and
seeing them move together in the same direction, we tend to group them together.
This tendency is strong enough to lead us to a perception of the group of
elements as a kind of object on its own.
It can also explain why we can suddenly see a camouflaged animal once it moves,
like a moth against the bark of a tree. It’s difficult to see when it stays still, but as
soon as it moves, there are elements within the moth’s pattern that are moving
together in the same direction at the same time. These moving elements with
a common fate allow the contour of the moth’s shape to be perceived, and it
seems to pop out against the tree.
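Two of the grouping principles above, proximity and common fate, can be sketched as simple grouping rules. This is only an illustration under loud assumptions: the function names, the gap threshold, and the toy positions and velocities are all hypothetical and do not come from the lecture, and real perceptual grouping is far richer than this.

```python
from collections import defaultdict


def group_by_proximity(positions, max_gap):
    """Split 1-D positions into groups wherever the gap to the
    previous element is larger than max_gap (assumed threshold)."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous element: same group
        else:
            groups.append([x])     # large gap: start a new group
    return groups


def group_by_common_fate(elements):
    """Group element names that share the same (dx, dy) velocity."""
    groups = defaultdict(list)
    for name, velocity in elements:
        groups[velocity].append(name)
    return dict(groups)


# A row of X's with uneven spacing (hypothetical positions).
row_of_xs = [0, 1, 2, 6, 7, 12, 13, 14]
print(group_by_proximity(row_of_xs, max_gap=2))

# A moth on bark: the bark patches are still, the moth parts move together.
scene = [
    ("bark_patch_1", (0, 0)),
    ("moth_wing", (1, 0)),
    ("bark_patch_2", (0, 0)),
    ("moth_body", (1, 0)),
]
print(group_by_common_fate(scene))
```

In the second call, the two moth parts end up in one group only because they share a velocity, which mirrors how the moth pops out once it moves.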
Pattern/Object recognition
The costume ball/Marineland example shows you that what a person expects to see can
influence what they do see.
The preliminary steps in object recognition involve identifying what aspect of the scene
is the figure and what is the background. Once that is established, the parts of the figure
are identified and grouped together into a single object.
Recognizing an object is really a combination of two processes.
1) Bottom-up Processing: object recognition is guided by the features that are present in the stimulus.
- Ex: recognize a cow as being a cow; seeing it has 4 legs, goes “moo”, has an udder, a
big nose, 2 long ears on the side of its head, and 2 big eyes.
- So bottom-up processing says you recognize what you see by analyzing the individual
features and comparing those features to things with similar features that you have
in memory.
2) Top-down Processing: object recognition is guided by your own beliefs or expectations.
- The 2 letters in both words are physically identical. Yet, you still read it as “THE
CAT” because you’re influenced by the context.
- When reading this series, the 2 and 10 symbols are physically identical, yet you
will read them as being “B” and “13” respectively.
- In a priming experiment, the experimenter measures how fast a participant can read
a word that is flashed on a screen. If you tell the participant that the next word is an
animal, you’ll find a priming effect because words like dog or duck will be recognized
a lot faster here than words like log or puck. This shows that processing of a word is
more efficient if the participant is primed to expect a word from a certain category.
- Top-down processing can’t work alone because you need some input from the stimulus
itself before your expectations about that stimulus can influence your recognition of
it. Bottom-up processing can’t explain everything alone either because expectations
certainly DO influence our perceptions.
- Both of these processes must be involved, and we’re dealing with bidirectional
activation, where processing occurs in both directions at once. In this way, the
features of the object in combination with our expectations guide object recognition.
Geon Theory: Biederman suggests that we have 36 different geons, or simple
geometrical forms, stored in memory. These would be forms like a cone, a sphere, and a
cylinder. According to this theory, using just these 36 geons, it’s possible to recognize
over 150 million different objects.
However, not all stimuli can be explained by the 36 geons. There are certain
stimuli, like faces or crumpled pieces of paper, for which it is difficult to
determine what geons would be used, yet we have no difficulty recognizing them.
Criticism from brain damage studies: some forms of brain damage lead to very
specific deficits. Ex: people suffering from these brain injuries may not be able to
recognize different types of fruit, but they can name different types of tools. If
geons were involved, you might expect deficits in recognizing all types of objects
based on their shapes and not a specific category of objects. (But it might be
possible that geons could be processed at a different level of processing separate
from the area of brain damage.)
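One way to see how 36 geons could yield a figure above 150 million is to count combinations. The sketch below assumes that an object is a chain of geons and that each pair of adjacent geons can be joined in a fixed number of distinguishable ways. The function name and, in particular, the value 57.6 for the number of relations per pair are assumptions brought in for illustration; neither is stated in these notes.

```python
def geon_object_count(geons_per_object, geon_types=36, relations_per_pair=57.6):
    """Rough count of distinct objects built from a chain of geons.

    geon_types comes from the notes (36 geons).
    relations_per_pair is an ASSUMED value, used only for illustration.
    """
    choices_of_geons = geon_types ** geons_per_object
    choices_of_relations = relations_per_pair ** (geons_per_object - 1)
    return choices_of_geons * choices_of_relations


for n in (1, 2, 3):
    print(n, "geon(s):", f"{geon_object_count(n):,.0f}")
```

Under these assumptions, three-geon objects alone already exceed 150 million, which is consistent with the order of magnitude quoted above. The count says nothing about stimuli such as faces, where it is unclear which geons would apply.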
Template Theory: we store different templates in memory, and when we come across
an object, we compare that object to all the templates in memory. If a match is found,
then it’s a familiar object and the person could name it by activating connections to
other language areas in the brain. If no match is found, then it’s an unfamiliar object
and a new template is stored in memory.
It’s not compelling because we would have to store an incredible number of
different templates to recognize all of the different objects that we encounter.
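The matching procedure described for Template Theory can be sketched as an exact-match lookup. This is a minimal illustration, assuming an object can be written as a tuple of features; the function name, the feature tuples, and the "unnamed object" label are hypothetical.

```python
def recognize_with_templates(obj, templates):
    """Return the stored name if obj exactly matches a template.
    Otherwise store obj as a new template and return None."""
    if obj in templates:
        return templates[obj]            # familiar object: name it
    templates[obj] = "unnamed object"    # unfamiliar: store a new template
    return None


templates = {("4 legs", "udder", "moo"): "cow"}

plain_cow = ("4 legs", "udder", "moo")
spotted_cow = ("4 legs", "udder", "moo", "spots")

print(recognize_with_templates(plain_cow, templates))
print(recognize_with_templates(spotted_cow, templates))
print(len(templates))
```

The spotted cow fails to match and forces a second template, which is the storage problem noted above: every variation needs its own entry.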
A theory that overcomes the storage problems of Template Theory is Prototype Theory. It says that we
store the most typical or ideal example of an object. This system is much more flexible
because you don’t need an exact match between the observed object and what is stored in
memory. (That’s how we can recognize common objects that we’ve never seen before.)
It’s likely that we have more than one type of representation for each object.
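The flexibility of Prototype Theory can be sketched as a best-match search rather than an exact lookup. The sketch assumes features can be written as sets and that similarity is the share of features two sets have in common; the function name, the feature sets, and the 0.5 cutoff are hypothetical choices, not part of the theory as stated here.

```python
def recognize_with_prototypes(features, prototypes, min_overlap=0.5):
    """Return the name of the most similar prototype, or None if
    nothing is similar enough (min_overlap is an assumed cutoff)."""
    best_name, best_score = None, 0.0
    for name, proto in prototypes.items():
        # Similarity = shared features / all features seen in either set.
        score = len(features & proto) / len(features | proto)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_overlap else None


prototypes = {
    "cow": {"4 legs", "udder", "moo", "big nose"},
    "duck": {"2 legs", "bill", "quack", "feathers"},
}

# A cow never seen before: spotted, and its nose was not noticed.
new_cow = {"4 legs", "udder", "moo", "spots"}
print(recognize_with_prototypes(new_cow, prototypes))
```

One stored prototype per category is enough to recognize the new cow, because only a close match is required.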
We are able to recognize objects as quickly and efficiently as we do, in part, because much of
the neural processing of object information is done in parallel. That is, different brain
systems process different components of the visual signal simultaneously. No single
theory can fully explain the ability to recognize objects.
It is probably the rule rather than the exception that a specific object will look
somewhat different every time we look at it.
Perceptual constancy: our ability to perceive an object as unchanging even though the visual image
produced by the object is constantly changing.
Things that can influence object recognition.
1) Shape constancy: refers to the fact that we perceive objects to have a constant shape,
even though the actual retinal image of the shape would change as your point of view
changes or as the object changes position.
- Ex: you perceive the shape of a door to be rectangular, but it really only produces a
rectangular retinal image if you’re looking at it straight on and the door is closed.
When you move, or the door is opened, the shape of the retinal image is no longer
rectangular, but you still perceive the door as having a constant rectangular shape.
- Ex: a visual illusion can result from our tendency to adjust our perception of
the shape of an object to account for our own viewing angle. (different views of
the same object from different angles)
2) Location constancy: objects are constantly moving around on our retina as we move our
eyes, head and bodies. Despite this constant movement, we perceive the objects
around us as stationary.
- Ex: driving in a car, the entire scene is moving very fast on our retinas, but we don’t
perceive the objects in the scene to be moving.
3) Size constancy: we tend to see the size of objects around us as unchanging, even though
as these objects vary in distance from us, the size of the retinal image that they produce
can vary quite a bit.
- Ex: as your friend walks away from you, you don’t suddenly gasp in horror thinking that
he is shrinking before your eyes as your retinal image of him gets smaller and smaller.
You perceive that he is still the same size but that he’s getting farther away from you.
4) Brightness constancy: refers to our ability to know that the brightness of objects around
us DOES NOT change even though the object may reflect more or less light depending
on the ambient lighting conditions.
- Ex: we perceive that our favourite coffee mug is the same brightness whether we
see it outside on a sunny day, or inside in a dimly lit room. Black is still black, white is
still white; regardless of whether we are inside under relatively low illumination or
outside on a bright sunny day. Although this is our perception, the black
object outside may in fact be reflecting more light than the white object inside.
5) Color Constancy: has to do with the way that we perceive objects around us to have a
constant color even though the light stimulus that reaches the retina may change with
different illumination conditions.
- Ex: you could still recognize your white dog even if she was under a red fluorescent light
and looked reddish.
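The size and brightness constancy items above both rest on simple arithmetic, sketched here. For size, the retinal image is treated as a visual angle that shrinks with distance, and the perceived size is recovered by scaling that angle by the distance. For brightness, the light reaching the eye is treated as reflectance times illumination. The function names and all numbers (a 1.8 m friend, the reflectances, the illumination levels) are assumptions chosen for illustration.

```python
import math


def retinal_angle_deg(true_height_m, distance_m):
    """Visual angle (degrees) subtended by an object at a distance."""
    return math.degrees(2 * math.atan(true_height_m / (2 * distance_m)))


def perceived_height_m(angle_deg, assumed_distance_m):
    """Height inferred from a visual angle and an assumed distance."""
    return 2 * assumed_distance_m * math.tan(math.radians(angle_deg) / 2)


def reflected_light(reflectance, illumination):
    """Light reaching the eye = fraction reflected x incoming light."""
    return reflectance * illumination


# Size constancy: the angle shrinks, the inferred height does not.
for distance in (2, 10, 50):
    angle = retinal_angle_deg(1.8, distance)
    height = perceived_height_m(angle, distance)
    print(f"{distance:>2} m away: angle {angle:5.2f} deg, inferred height {height:.2f} m")

# Brightness constancy: a black object in sunlight vs a white one indoors.
black_outside = reflected_light(reflectance=0.05, illumination=10_000)
white_inside = reflected_light(reflectance=0.90, illumination=100)
print(black_outside > white_inside)
```

With these assumed values, the black object outside sends more light to the eye than the white object inside, yet it is still seen as black, which is the point of the coffee mug example.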
Most objects don’t change: 1) our friend is a constant size, the coffee mug has a particular level
of brightness, our dog is a certain color, 2) buildings don’t move as we drive by and doors don’t
morph into a different shape when we open them.
Cues in scene
- Our visual system has a way of picking up cues in the rest of the scene and using
those as clues to perceiving constancy in an object.
- 1) we might use depth cues to both determine that our friend is far away and shape
how we perceive our friend in that context. “We use the depth cues to keep us from
seeing our friend as shrinking in size as he moves farther away. So even though he’s
producing a retinal image that is considerably smaller than the lamp post in the
foreground, you know from other cues in the scene that he’s just far away and of
normal height.”
- 2) when you’re driving a car and approaching a bus that has stopped in front of you, you
don’t see the bus as moving toward you. “Our brain is integrating the motion of all
the elements in the scene.” If only the bus were moving toward you, everything else in the scene
would remain stationary. When everything in the scene is moving toward you, the
brain can use that information to determine that the movement is actually yours and
adjust how you perceive the scene accordingly.
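The bus example can be sketched as separating motion shared by the whole scene from motion specific to one object. This is a deliberately crude illustration: it assumes each element has a single 1-D retinal velocity and that the shared component can be estimated with a median, ignoring the fact that real image motion depends on depth. The function name and the velocities are hypothetical.

```python
from statistics import median


def split_self_and_object_motion(retinal_motion):
    """retinal_motion maps element name -> 1-D image velocity.

    Motion common to the scene (estimated by the median) is attributed
    to the observer; whatever is left over is attributed to the object.
    """
    self_motion = median(retinal_motion.values())
    object_motion = {name: v - self_motion for name, v in retinal_motion.items()}
    return self_motion, object_motion


# You are driving: everything approaches at the same rate.
driving = {"bus": 5.0, "tree": 5.0, "sign": 5.0, "building": 5.0}
print(split_self_and_object_motion(driving))

# You are parked and only the bus approaches.
parked = {"bus": 5.0, "tree": 0.0, "sign": 0.0, "building": 0.0}
print(split_self_and_object_motion(parked))
```

In the first case all of the motion is assigned to the observer and the bus is treated as stationary; in the second, the rest of the scene is still, so the motion is assigned to the bus.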
Perceptual constancy occurs because we know certain properties of objects DO NOT change and our
perceptual system automatically factors in other cues in the environment that give us
information about the object of interest.
The reason these illusions occur is that our perceptual strategies, which work most of the time,
are used in these particular situations where, in fact, they don’t belong. We think we see one
thing when, in reality, what we’re looking at is something quite different. Many of our
perceptual constancies can be overcome simply by removing the relevant contextual cues.
Muller-Lyer Illusion: it is an example of misapplying size constancy and inaccurately interpreting
The angled lines on top of the vertical lines each look like a corner, but the one on the left looks
like a corner that is pointed toward you, whereas the one on the right looks like a corner that is
receding away from you. Since the two lines give the exact same retinal image but the one on
the left is assumed to be closer to you than the one on the right, the closer one is perceived as shorter.
People from cultures who live in round huts and aren’t surrounded by right angles are much
less susceptible to the Muller-Lyer illusion, and they’re more likely to say that the two lines are
of the same length. This provides some support that Muller-Lyer is at least partly due to
cultural and experience-dependent processes.
Ames Room: a specially constructed room that looks like a normal rectangular room except it’s
actually trapezoidal in shape; one corner is much farther away from your point of view. With
one person standing at each corner, the one standing at the farther corner looks a lot smaller.
Since you believe and perceive the room to be normal, you interpret the scene as though
each person is the same distance from you. Your brain then applies the compensatory
computations that normally lead to size constancy, but using misleading distance cues. If you
perceive the distance to be the same between you and the two ends of the room, then the 2 people
will be perceived as different sizes. The one who is in reality closer to you will be seen as larger.
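The Ames room reasoning can be worked through with the same visual-angle arithmetic that underlies size constancy. The sketch assumes two people of equal height standing at different true distances, and an observer who wrongly assumes one shared distance for both; the function names, the 1.8 m height, and all distances are hypothetical.

```python
import math


def visual_angle(height_m, distance_m):
    """Visual angle (radians) subtended by an object at a distance."""
    return 2 * math.atan(height_m / (2 * distance_m))


def perceived_height(angle_rad, assumed_distance_m):
    """Height inferred from a visual angle and an assumed distance."""
    return 2 * assumed_distance_m * math.tan(angle_rad / 2)


true_height = 1.8        # both people are the same height (assumed)
near_distance = 3.0      # true distance of the closer corner (assumed)
far_distance = 6.0       # true distance of the farther corner (assumed)
assumed_distance = 4.0   # the room looks rectangular, so one distance is assumed

near_person = perceived_height(visual_angle(true_height, near_distance), assumed_distance)
far_person = perceived_height(visual_angle(true_height, far_distance), assumed_distance)

print(f"near person looks {near_person:.2f} m tall")
print(f"far person looks {far_person:.2f} m tall")
```

Because the same assumed distance is applied to two different retinal images, the person who is really closer comes out larger, as described above.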
Ponzo Illusion: is the result of conflicting size constancy