Information presented visually is recognised and recalled significantly better than the same information presented as text. Dual-coding theory explains this: images create both visual and verbal memory traces, giving two retrieval paths instead of one.
In 1967, Roger Shepard presented participants with 612 photographs and asked them to study each for a few seconds. Recognition tests conducted later showed that participants correctly identified 97% of the images they had seen. Comparable experiments with words produced recognition rates around 90%: already impressive, but consistently lower than the picture condition. Subsequent research by Standing (1973) pushed the boundary further: participants shown 10,000 pictures over several days recognised approximately 83% of them a week later, far above the rates reported in any comparable text-based study.
This body of findings became known as the picture superiority effect. The theoretical explanation emerged from Allan Paivio's dual-coding theory (1971): pictures are encoded in two separate memory systems simultaneously, the visual/imagistic system and the verbal system (because people naturally label what they see). Words are encoded only in the verbal system. Dual encoding means more retrieval pathways, more robust memory, and higher recognition rates. A word can be forgotten; a picture paired with a word is harder to lose, because both traces must fail simultaneously.
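The "both traces must fail" claim can be made concrete with a toy probability model. This is an illustration only, not Paivio's formalism, and it assumes the two traces decay independently:

```python
from typing import Optional

# Toy model of dual-coding retention. Illustrative assumption:
# the verbal and visual traces decay independently, so a
# dual-coded item is forgotten only if BOTH traces fail.

def recall_probability(p_verbal_fail: float,
                       p_visual_fail: Optional[float] = None) -> float:
    """Probability that at least one memory trace survives."""
    if p_visual_fail is None:                 # text-only: one pathway
        return 1 - p_verbal_fail
    return 1 - p_verbal_fail * p_visual_fail  # dual-coded: two pathways

# With a 40% chance of either trace decaying over the retention interval:
text_only = recall_probability(0.4)        # 0.6
dual_coded = recall_probability(0.4, 0.4)  # ~0.84
```

Under these assumptions, recall drops to 60% for text alone but only to roughly 84% for the picture-word pair: the qualitative shape, if not the exact numbers, of the Shepard and Standing results.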
For product designers, the picture superiority effect determines memory, recognition speed, and navigation efficiency across a wide range of interface decisions. A feature explained with an illustration encodes more deeply than the same feature explained with a sentence. A navigation item with an icon is found faster and remembered longer than a navigation item with text alone. A status indicator that uses colour and shape communicates its state in a fraction of the time required to read the equivalent label. The images do not merely decorate the interface; they change how the interface is encoded in memory, which changes whether users can find what they are looking for when they return.
"Presented with a picture, the mind encodes twice: once in the visual system, once in the verbal system. Words provide only one path back."
– Allan Paivio, Imagery and Verbal Processes, 1971
Landing page feature sections are where the picture superiority effect has the most direct impact on product understanding. A bullet list of features encodes in the verbal system only; after a week, most visitors will not remember what they read. A feature section that pairs each concept with a congruent visual icon encodes the same information in both systems, creating a second retrieval pathway. The distinction is not about whether the section looks better; it is about whether the visitor can recall what the product does three days after visiting.
Nielsen Norman Group eye-tracking research found that users spend 87% more time looking at image-based content than text-based content on marketing pages, and recall image-paired features at 3× the rate of text-only features in post-session tests. Both sections below describe the same product with the same four features. One encodes in a single memory system; the other in two.
The key distinction is congruence. Each mini illustration above depicts the actual output of the feature it describes: a session replay line, a funnel chart, a retention curve, an AI spark. These are not decorative; they are semantic. The user who sees a funnel chart next to the words "Funnel analysis" encodes both representations simultaneously. The user who reads "Funnel analysis" alone encodes only the verbal trace. Three days later, the first user can retrieve the concept from either pathway; the second must rely on the verbal trace alone, which is more fragile and more likely to have decayed.
Empty states are high-stakes interface moments: the user has arrived at a section that has nothing to show yet, and the empty state must both explain what belongs here and motivate the creation action. A text-only empty state encodes the explanation in the verbal system. An illustrated empty state encodes the concept of the space β what it will look like when populated β in the visual system as well, giving the user a mental picture of the goal state they are working toward.
UserTesting's 2022 research on empty state design found that illustrated empty states produced 2.3× higher action rates on the primary CTA compared to text-only equivalents, and users who saw illustrated empty states could describe what the section was for more accurately in post-session recall tests. Both states below are for the same "Projects" section of a project management tool.
The illustrated version deploys a specific technique: ghost cards that show what the populated state looks like, rendered in a muted, dashed style that signals "not yet created." This gives users a visual encoding of the goal state (what this section will look like when their work is in it), which the picture superiority effect then anchors in memory. The text version communicates the same facts. Only one of them leaves the user with a picture of where they are going.
Text-only status labels require serial processing: the user reads each label in sequence, decoding each word before moving to the next. Visual status indicators (colour, shape, icon) are processed in parallel: the user perceives the overall status of the dashboard in a single glance before reading a single word. Research by Ware (2004) on visual perception in data visualisation established that colour differences are detected in under 200 ms (pre-attentive processing), while reading a status word requires 300–400 ms per word. A dashboard with ten status items takes 4+ seconds to read; the same dashboard with visual indicators communicates its overall state in under a second.
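The timing claim reduces to simple arithmetic. A rough sketch using the figures cited above (the per-word and glance timings are the article's averages, not measurements):

```python
# Back-of-envelope scan-time comparison for a status dashboard,
# using the reading and pre-attentive timings cited above.

def serial_read_time_ms(n_items: int, words_per_label: int = 1,
                        ms_per_word: int = 400) -> int:
    """Text-only dashboard: every label is read in sequence."""
    return n_items * words_per_label * ms_per_word

def preattentive_scan_time_ms(n_anomalies: int, glance_ms: int = 200,
                              ms_per_word: int = 400) -> int:
    """Visual dashboard: one pre-attentive glance locates the
    anomalies, then only those labels are read."""
    return glance_ms + n_anomalies * ms_per_word

ten_text = serial_read_time_ms(10)         # 4000 ms: the "4+ seconds"
ten_visual = preattentive_scan_time_ms(2)  # 1000 ms: glance + 2 labels
```

The visual dashboard's cost scales with the number of anomalies, not the number of items, which is why the gap widens as dashboards grow.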
The two monitoring dashboards below display exactly the same service status information. The left requires reading; the right enables recognition.
The "2 issues" badge at the top of the right version is itself a picture superiority effect deployment: the user does not need to scan all seven rows to know whether something is wrong. The visual system processes the dashboard gestalt (the overall pattern of neutral and alert dots) in a single pre-attentive fixation. Only after locating the anomalies visually does the user need to read the text labels to understand what they are. Vision does the detection; reading provides the detail. On the text-only dashboard, reading must do both.
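The badge's division of labour (vision detects, reading explains) maps onto a simple filter-then-read structure. A minimal sketch with hypothetical service names and statuses:

```python
# Sketch of the "2 issues" badge logic: derive the at-a-glance
# summary from per-service statuses, then surface only the
# anomalous rows for detailed reading. All data is illustrative.

OK = "operational"

services = {
    "api": OK,
    "web": OK,
    "auth": "degraded",
    "search": OK,
    "billing": "down",
    "cdn": OK,
    "jobs": OK,
}

# Detection: the anomalous subset, analogous to the alert dots.
issues = {name: status for name, status in services.items() if status != OK}

# Summary: what the badge renders before any label is read.
badge = f"{len(issues)} issues" if issues else "All systems operational"
```

Everything the badge needs is computed before any label is rendered; the `issues` subset is what the user then reads for detail.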
The picture superiority effect applies wherever interface information must be remembered, located, or understood under time pressure. Features to recall, empty states to interpret, status to monitor, navigation to return to: in each case, adding a congruent visual creates a second encoding pathway. The rule is not "use more images"; it is that the image must depict the concept. Decorative imagery without semantic congruence competes for attention and produces worse outcomes than text alone.
Shepard, R. N. (1967). Recognition memory for words, sentences, and pictures. Journal of Verbal Learning and Verbal Behavior, 6(1), 156–163. · Standing, L. (1973). Learning 10,000 pictures. Quarterly Journal of Experimental Psychology, 25(2), 207–222. · Paivio, A. (1971). Imagery and Verbal Processes. Holt, Rinehart & Winston. · Ware, C. (2004). Information Visualization: Perception for Design. Morgan Kaufmann.