The Word Superiority Effect OR How humans use context to see without really seeing, and how it can help the field of computational vision
Humans are not simply detectors of patterns of light. • We infer interpretations of the physical stimulus from the context. • The effect of context is made clear in a fascinating phenomenon called the word superiority effect. • The effect was first discovered by James Cattell (1886).
The first major breakthrough – Reicher’s experiment in 1969 • Reicher presented strings of letters – half the time real words, half the time not – for brief periods. • The subjects were asked whether one of two letters was contained in the string, for example D or K. • Reicher found that subjects were more accurate at recognizing D when it was in the context of WORD than when it was in the context of ORWD.
How did Reicher exclude factors such as memory and guessing? • By asking the subject about only one letter (and not a ‘whole report’). • By doing so immediately after the display. • By using a forced-choice task (rather than identification) in which both choices would make sensible words.
Reicher’s findings • ‘Word Superiority Effect’ (WSE): comparing four-letter words against non-pronounceable nonwords (e.g. WORD vs. ORWD), single letters were reported more accurately from words than from non-words. • ‘Word-Letter Effect’ (WLE): the report of one letter from a four-letter word was more accurate than the report of a single letter presented alone.
Since there are four times as many letters to 'perceive' with the four-letter words, this result is counter-intuitive, and therefore quite striking.
The project’s goals • To conduct an experiment that will verify the “word superiority effect”. • To examine whether the effect, and the hypothesis the researchers presented to explain it, have any ramifications that can help the field of computational vision.
First experiment • The original experiment, similar to the way it was conducted by Reicher.
Second experiment • An experiment which examines how the size of the words and letters influences the effect. This is a different version of the first experiment.
Third experiment • An experiment based on one conducted by Adams (1979). It checks whether the subjects can recall details about the letters they have just seen.
The two major obstacles • Finding a way to present the words for a very brief time (about 30 milliseconds) while keeping them sufficiently visible on a computer screen. • “Translating” the experiment into Hebrew. Since the effect works mainly in one’s mother tongue, it was necessary to conduct it in Hebrew.
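To make the first obstacle concrete, here is a minimal sketch of one forced-choice trial, written in Python. It is my own illustration, not the code used in the project: the WORDS/NONWORDS lists, the show_stimulus stand-in and the 30 ms figure are assumptions, and a real experiment would draw the string on screen for a refresh-locked interval and then replace it with a mask.

```python
import random
import time

WORDS = ["WORD", "GIRL", "SEAT"]       # real-word stimuli (illustrative only)
NONWORDS = ["ORWD", "IGLR", "ESAT"]    # scrambled non-word counterparts

def show_stimulus(text: str, duration_s: float = 0.030) -> None:
    """Console stand-in for the real display routine: show `text` briefly, then mask it."""
    print(f"[stimulus] {text}")
    time.sleep(duration_s)             # ~30 ms exposure in the real experiment
    print("[mask]     ####")           # backward mask replaces the stimulus

def run_trial(use_word: bool) -> bool:
    stimulus = random.choice(WORDS if use_word else NONWORDS)
    position = random.randrange(len(stimulus))
    target = stimulus[position]
    # In Reicher's design both choices would complete a sensible word in that
    # position; here the foil is simply a random different letter, for brevity.
    foil = random.choice([c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" if c != target])

    show_stimulus(stimulus)
    choices = sorted([target, foil])
    answer = input(f"Which letter appeared in position {position + 1}? {choices}: ")
    return answer.strip().upper() == target

if __name__ == "__main__":
    print("correct" if run_trial(use_word=True) else "incorrect")
```

Comparing accuracy over many word trials and many non-word trials is what the first two experiments do; in this sketch, the Hebrew version would only change the stimulus lists.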
Third experiment In only 40% of the words were the fonts reported with 60% accuracy or more.
Conclusions I found evidence for the word superiority effect, but not for the word-letter effect.
Bottom-up and top-down processing • Bottom-up or data-driven processing: processing which is driven by the stimulus pattern, the incoming data. • Top-down processing: processing which is influenced by the context and higher-level knowledge. • To obtain such a result, one must postulate interacting bottom-up and top-down mechanisms which process information in parallel.
The Interactive Activation Model • McClelland and Rumelhart (1981) asked: exactly how does the knowledge that we have interact with the input? • The Interactive Activation Model is a system that includes both bottom-up and top-down processing.
Interactive Activation Model (diagram): word level, letter level, feature level.
The Interactive Activation Model: • There is a node for each word and each letter (in each letter position). • The nodes are organized into levels. • The nodes are connected to all other nodes within levels or between adjacent levels. • Connections may be excitatory or inhibitory.
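As a rough sketch of the connectivity just described, the signed connections between letter-position nodes and word nodes can be written down directly. This is a simplification I am assuming for illustration (a toy lexicon, unit weights of +1/-1), not the model's published parameter set:

```python
WORDS = ["WORD", "WORK", "WEAK"]   # toy lexicon (illustrative)

def connection(sender, receiver):
    """Sign of the connection between two nodes, or 0 if there is none.

    Nodes are ('word', w) for a word node, or ('letter', pos, ch) for a
    letter node in a given position."""
    if sender[0] == "letter" and receiver[0] == "word":
        _, pos, ch = sender
        # consistent letters excite a word, inconsistent ones inhibit it
        return +1 if receiver[1][pos] == ch else -1
    if sender[0] == "word" and receiver[0] == "letter":
        _, pos, ch = receiver
        # top-down feedback from a word to its own letters
        return +1 if sender[1][pos] == ch else 0
    if sender[0] == "word" and receiver[0] == "word" and sender != receiver:
        return -1                  # within-level competition between words
    if (sender[0] == "letter" and receiver[0] == "letter"
            and sender != receiver and sender[1] == receiver[1]):
        return -1                  # letters competing for the same position inhibit each other
    return 0

# Example: the letter D in the fourth position excites WORD and inhibits WORK.
print(connection(("letter", 3, "D"), ("word", "WORD")))   # +1
print(connection(("letter", 3, "D"), ("word", "WORK")))   # -1
```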
6. The same process occurs simultaneously for "O", "R", and "D".
7. "D" in context is easier to recognize because it receives activation from…
Back to computational vision What have we seen so far? • Top-down view: Appearance-based recognition. • Network structure: Relaxation labeling.
What can we add? Combining the two kinds of processes together: not only comparing the stimulus to our database, and not only building up the stimulus from the bottom, but a combination of both.
What can we add? The top-down process doesn’t work exactly like in appearance-based recognition: • As soon as it starts receiving information, the process activates all the connected nodes. In fact, it activates many nodes that are only barely connected; you could even say it wastes resources, because for every stimulus many nodes are activated. • It is a circular process: if E activates EAT, then EAT will also activate E in reaction. • The connections are not only excitatory, but also inhibitory.
So what is the problem? • In order to get a useful tool, the database should include a lot of data. • In humans, this database is built from our experience over the years. If we want the computer to have a large database, we will have to teach it how to build this database itself.
So what is the problem? • It also requires a lot of memory to hold the data for each node. If there isn’t enough memory, it will only extend the amount of time needed to process the data. • If there is inconsistent data (like the word ‘red’ written in green), the Interactive Activation Model will only delay the decision.