Form Perception I
(Perception of Form)
The Gestalt Philosophy:
- 1920s and 1930s, German psychologists began to study how ppl perceive the world
- called "Gestalt psychologists"
- Believed: The whole is greater than the sum of its parts.
ie. Ppl perceive the whole stimulus rather than each individual part
- the movement was in part a reaction to the structuralist approach (the idea that everything
could be reduced to basic elements), which was in vogue at the time.
Motion is an emergent property (of the sequence of pictures).
Gestalt Principles = Laws that describe how we organize visual input in certain ways.
Thought to be innate, or acquired rapidly.
Six Gestalt Principles:
1. Figure-ground = ability to determine what aspect of a visual scene is part of the
object itself and what is part of the background.
Implications: Reversible figure-ground picture.
Ex: Vase or face?
2. Proximity = helps with grouping, says that elements that are close together in space
tend to belong together.
You also tend to group objects closer together compared to those further apart.
Ex: Daisies/ XXXs
3. Closure = refers to the fact that if there are gaps in the contours of a shape, we tend
to fill in those gaps and perceive a whole object.
Ex: Truck and pole
4. Similarity = the tendency for us to group together elements that are physically similar
Ex: Grouping XOXO
5. Continuity = lets us perceive a simple, continuous form rather than a combination of
Ex: X's and stem of a flower
6. Common Fate = helps us with grouping, the idea that things that change in the same
way should be grouped together. Change together, should be grouped together.
Ex: We tend to group elements together if they are moving together in the same
direction at the same time.
Can explain why we can suddenly see a camouflaged animal once it moves (moth on a
Expectations: What a person expects to see can influence what they actually see
Steps in Object recognition
- The preliminary steps in object recognition involve identifying what aspect of the scene
is the figure and what is the background.
- The parts of the figure are identified and grouped together into a single object
- Then bottom-up/top-down processing
Object Recognition is a combination of two processes.
1) Bottom-up processing: Object recognition is guided by the features that are present in
the stimulus. You recognize what you see by analyzing the individual features and
comparing those features to things with similar features that you have in memory.
Ex: Parts of a cow --> cow
2) Top-Down processing: Object recognition is guided by your own beliefs or expectations.
Ex: ABCD,11 12 13 14...
Experimenter measures how fast a participant can read a word that is flashed on the
screen. If you tell the participant that the next word is an animal, you'll find a priming
effect because words like dog or duck will be recognized a lot faster here than words like
log or puck.
This shows that processing of a word is more efficient if the participant is primed to
expect a word from a certain category.
*Top-down can't work alone bc you need some input from the stimulus itself before your
expectations about the stimulus can influence your recognition of it.
*Bottom-up can't explain everything alone either because expectations certainly do
influence our perceptions.
Must be a combination: bi-directional activation (where processing occurs in both
directions at once): STIMULI <-> COGNITION
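A minimal sketch of how bottom-up and top-down information could combine, using the letter/number example above. Everything here is an assumption made for illustration: the function name recognize, the made-up scores, and the choice to multiply the two sources together are not from the lecture and are not a model of the brain.

```python
def recognize(bottom_up, top_down, default_expectation=0.1):
    """Pick the label with the best combined stimulus and expectation score."""
    # Multiply each candidate's feature-match score (bottom-up) by how
    # strongly that candidate is expected in this context (top-down).
    # Unexpected candidates keep a small default weight, so a clear
    # stimulus can still win against the expectation.
    combined = {
        label: score * top_down.get(label, default_expectation)
        for label, score in bottom_up.items()
    }
    return max(combined, key=combined.get)


# Hypothetical scores: the same ambiguous shape matches "B" and "13" equally.
ambiguous_shape = {"B": 0.5, "13": 0.5}
letter_context = {"B": 0.9, "13": 0.1}  # shape appears among letters
number_context = {"B": 0.1, "13": 0.9}  # shape appears among numbers

print(recognize(ambiguous_shape, letter_context))  # B
print(recognize(ambiguous_shape, number_context))  # 13

# A clear stimulus is still recognized even when it is unexpected.
clear_b = {"B": 0.95, "13": 0.05}
print(recognize(clear_b, number_context))  # B
```

With an ambiguous stimulus the expectation decides the outcome, but with a clear stimulus the features win, which is the sense in which neither process works alone.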
Theories of Object Recognition (OR)
1. Biederman's Geon Theory = suggests that we have 36 different geons (simple
geometrical forms) stored in memory, e.g. cone, sphere, cylinder. Using just 36
geons, it's possible to recognize over 150 million different objects.
Limitation: Some stimuli can't be described by geons (Ex: faces, crumpled pieces of paper;
it is difficult to pick out geons in them, but we can still recognize these stimuli).
Some forms of brain damage lead to very specific deficits. Ex: a person may not be able to
recognize different types of fruit but can still name different types of tools.
geons could be processed at a different level of processing separate from the area of
2. Template Theory/Exemplar Theory = we store many templates in memory, and
when we come across an object we compare that object to all the templates in memory.
If a match is found, then it's a familiar object and the person could name it by activating
connections to other language areas in the brain. If no match, then unfamiliar and a new
template stored in memory.
Limitations: Weak theory. We would have to store an incredible number of different
templates to recognize all of the different objects that we encounter.
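The matching procedure described above can be sketched as follows. This is only an illustration under stated assumptions: objects are tiny grids of characters, and the names match_score and recognize, the 0.9 threshold, and the "unfamiliar_N" labels are all hypothetical choices, not part of the theory.

```python
def match_score(image, template):
    """Fraction of cells on which two equal-sized grids agree."""
    pairs = [
        (a, b)
        for image_row, template_row in zip(image, template)
        for a, b in zip(image_row, template_row)
    ]
    return sum(a == b for a, b in pairs) / len(pairs)


def recognize(image, templates, threshold=0.9):
    """Name the best-matching stored template, or store the image as new."""
    best_name, best_score = None, -1.0
    # Compare the object against every template held in memory.
    for name, template in templates.items():
        score = match_score(image, template)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name  # familiar object
    # No match: unfamiliar object, so a new template is stored.
    new_name = f"unfamiliar_{len(templates)}"
    templates[new_name] = image
    return new_name


templates = {
    "T": ["###", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
}
print(recognize(["###", ".#.", ".#."], templates))  # T
print(recognize(["#.#", "#.#", "###"], templates))  # unfamiliar_2
```

In this sketch a shifted, rotated or resized "T" would fall below the threshold and need its own stored template, which is the storage problem raised in the limitation above.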
3. Prototype Theory = we store the most typical or ideal example of an object. Our
Limitations: Prototype is always changing as we grow and experience new things. Doesn't
explain encounters with new objects that have no prototype in memory.
~SUMMARY~ The importance of parallel processing
We are able to recognize objects as quickly and efficiently as we do, in part bc the
neural processing of object information is done in parallel.
Different brain systems process different components of the visual signal simultaneously.
No single theory completely explains our ability for OR. Much better than computers at
Perceptual Constancy = Our ability to perceive an object as unchanging even though
the visual image produced by the object is constantly changing.
5 Perceptual Constancies (Things that can influence OR):
1. Shape constancy - We perceive objects to have a constant shape, even though the
actual retinal image of the shape would change as your point of view changes or as the
object changes position.
Ex: door rectangle
2. Location constancy - Objects are constantly moving around on our retinas as we
move our eyes, heads and bodies. But despite this, we perceive the objects around us as
Ex: Moving traffic when you're in your car
3. Size constancy - We tend to see the size of objects around us as unchanging, even
though these objects may vary in distance from us and the size of the retinal image that
they produce can vary quite a bit.
Ex: Friend walking further away
4. Brightness constancy - refers to our ability to know that the brightness of objects
around us doesn't change even though the object may reflect more or less light depending
on the ambient lighting conditions. (Light amplitude)
Ex: Black still looks black no matter where you are (coffee)
5. Colour constancy - The way we perceive objects around us to have a constant
colour even though the light stimulus that reaches the retina may change with different
illumination conditions. (Light wavelength)
Ex: Background exists to give context to the object being viewed. 'Red' dog at night.
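A worked sketch for size constancy (item 3 above), using the standard visual-angle geometry. The function names and the 1.8 m friend are made up for illustration, and treating perceived size as "visual angle combined with distance" is a simplifying assumption, not a claim about how the brain actually computes it.

```python
import math


def visual_angle_deg(object_height_m, distance_m):
    """Visual angle (degrees) subtended by an object; retinal image size grows with it."""
    return math.degrees(2 * math.atan(object_height_m / (2 * distance_m)))


def inferred_height_m(angle_deg, distance_m):
    """Object height recovered from the visual angle plus a distance estimate."""
    return 2 * distance_m * math.tan(math.radians(angle_deg) / 2)


friend_height_m = 1.8  # hypothetical friend
for distance_m in (2, 4, 8):
    angle = visual_angle_deg(friend_height_m, distance_m)
    height = inferred_height_m(angle, distance_m)
    # The angle shrinks with distance, but the recovered height stays the same.
    print(f"distance {distance_m} m: angle {angle:.1f} deg, inferred height {height:.2f} m")
```

The retinal image shrinks as the friend walks away, yet combining it with a distance cue gives the same height each time. If the distance cue is wrong or missing, the inferred size is wrong too.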
How do we explain perceptual constancies?
- Existing knowledge.
- Cues in the scene (depth, shape, context, background, environment)
Q: Why do these illusions occur?
A: Bc our perceptual strategies, which work most of the time, are used in these
particular situations where they don't belong.
Many of our perceptual constancies can be overcome simply by removing the relevant
3 Types of Visual Illusions:
1. Muller-Lyer Illusion: Misapplying Size Constancy. Both lines are actually the same
length.
Implications: Culture - some environments are mostly free of right angles. People who live in
these types of surroundings are much less susceptible to the Muller-Lyer illusion and say the
two lines are the same length. Thus, this illusion is at least partly due to cultural and
experience-dependent processes.
2. Ames room: Actual room is trapezoid. Misapplying Size constancy (tricked by
3. Ponzo illusion: Misapplying Size constancy (tricked by depth cues)
Form Perception II
Goals: To discuss how our brains process visual forms of OBJECT RECOGNITION and
Magno and Parvo Cells: Visual Pathway
First, ganglion cells (ie magno and parvo cells) in the retina pass the light stimulus on
as a neural impulse (the light itself is first transduced by the photoreceptors).
Magno cells: found mainly in the periphery of the retina.
Used for detecting changes in the brightness, motion and depth.
Parvo cells: found throughout retina.
Used for detecting colour, pattern, and form.
*These ganglion cells, with their small receptive fields, are the crucial first step to object
From the retina, the axons of these cells exit the eye via the optic nerve, travel to the
LGN, and end up in the primary visual cortex in the occipital lobe.
Feature Detectors = Cells very particular about what will make them fire
Hodgkin and Huxley (1952): Squid
Scientists recorded the electrical activity in an individual neuron of a squid. This helped
build a foundation for other researchers to use this technology to see how individual
neurons respond to specific stimuli.
Lettvin et al (1959): frogs
Based on H&H, Lettvin et al found a neuron in the optic nerve of a frog that responded
only to moving black dots, and they called these cells "bug detectors".
Hubel and Wiesel (1962): cats
Extended this work in their studies of cells in the visual cortex of cats and monkeys, and
earned the Nobel Prize in 1981.
Aim: Explore the visual cortex by trying to learn what type of stimuli the individual cortical
cells responded to.
Procedure 1: Put microelectrodes in the cortex of a cat to record the electrical activity of
individual neurons as the cat was shown different types of visual stimuli, such as flashes
of light.
Problem: They were not getting much response from the neurons until...
Solution: One day they presented the cat with a slide that had a crack in it.
Results 1: When the line that was projected from the crack moved across the cat's visual
field, the neuron started to fire like crazy! They realized that neurons must respond to
stimuli that are more complex than diffuse flashes of light.
Procedure 2: Began using lines of different orientations and thickness that moved in
different directions, and they found that each neuron is very specific about what will
make it fire the most.
Results 2: These cells fire maximally to stimuli of a certain shape, size, position, and
movement, and this defines the receptive field for that cell.
Types of Feature Detector Cells:
Simple Cell = Responds maximally to a bar of a certain length and orientation in a
particular region of the retina.
Ex: This simple cell responds the most to a horizontal bar. But if that same bar is moved
outside that particular region and/or changes orientation, then the cell will be inhibited
and fire less than baseline. Thus the receptive field for a simple cell is organized in an
opponent fashion, making it sensitive to the location of the bar within the receptive field.
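The opponent organization of a simple cell can be sketched with a toy receptive field. All of the numbers below (a 5x5 patch of retina, weights of +1 and -1, a baseline of 5) and the function names are assumptions chosen for illustration; they are not measurements from Hubel and Wiesel.

```python
# Toy receptive field tuned to a horizontal bar in the middle row:
# +1 = excitatory ("on") region, -1 = inhibitory ("off") region.
RECEPTIVE_FIELD = [
    [0, 0, 0, 0, 0],
    [-1, -1, -1, -1, -1],
    [1, 1, 1, 1, 1],
    [-1, -1, -1, -1, -1],
    [0, 0, 0, 0, 0],
]
BASELINE = 5  # hypothetical resting firing rate


def simple_cell_response(stimulus):
    """Baseline rate plus excitation minus inhibition, never below zero."""
    drive = sum(
        weight * light
        for weight_row, light_row in zip(RECEPTIVE_FIELD, stimulus)
        for weight, light in zip(weight_row, light_row)
    )
    return max(0, BASELINE + drive)


def horizontal_bar(row, size=5):
    """Grid with light (1) along one row and darkness (0) elsewhere."""
    return [[1 if r == row else 0 for _ in range(size)] for r in range(size)]


def vertical_bar(col, size=5):
    """Grid with light (1) along one column and darkness (0) elsewhere."""
    return [[1 if c == col else 0 for c in range(size)] for _ in range(size)]


print(simple_cell_response(horizontal_bar(2)))  # 10: preferred bar, maximal firing
print(simple_cell_response(horizontal_bar(1)))  # 0: same bar moved onto the "off" region
print(simple_cell_response(vertical_bar(2)))    # 4: wrong orientation, below baseline
```

Because the response depends on where the bar falls relative to the "on" and "off" regions, the same bar can excite or inhibit the cell, which is what makes a simple cell sensitive to location.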
Complex Cell = Responds maximally to a bar of a certain length and orientation,
regardless of where the bar is located within the receptive field.
Doesn't care where in its receptive field the bar is located and will even continue to fire if
the bar is moving within the receptive field.
Hypercomplex Cell = Responds maximally to a bar of a particular orientation that ends
at specific points within the receptive field.
Ex: This hypercomplex cell fires the most to a horizontal bar of light that appears in the
"on" region of the receptive field, but gives only a weak response if the bar touches the
"off" region. So these cells have an inhibitory region at the end of the bar, making them
sensitive to the length of the bar.
~SUMMARY~
Processing of the visual scene is done in the visual cortex (duh).
Neighbouring objects in your visual field are processed by neighbouring areas of your
The largest amount of cortex is devoted to processing information from the central part
of the visual field, which projects onto the fovea.
However (even though much of the cortex is reserved for activity in the central visual field):
- Each region of the cortex receives some input from a small piece of the visual field
- Within each region there are cells that analyze specific features of the scene
- For a particular part of the visual field, there are neurons that:
  - Fire maximally if there is something in the scene that has a line of a certain
    orientation, length, and movement
  - Other neurons respond maximally if there is something in that tiny portion of
    the visual scene that is a specific colour
  - Other neurons respond most when there is a line that moves in a certain
- Parallel processing = Cluster of cells in the region of the cortex right beside this
  region are doing the same analysis for neighbouring part of the visual scene. Benefit:
  speed.
Ventral Stream
Combining information in the Extrastriate
Processing of visual input in the primary visual cortex involves specific cells responding
to relatively specific features from a small portion of the visual field. But for the visual
scene to make any sense, this information has to be combined to form a meaningful
Subregions in extrastriate
Combination begins in the extrastr