University of Warwick
Acknowledgement: This work was funded by Economic and Social Research Council (United Kingdom) Grant RES-062-23-0545. I thank Gordon Brown, Nick Chater, Zach Estes, and Adam Sanborn for helpful comments.
Reading relies on identifying words. A word's stored representation must be accessed by the matching visual perceptual representation. The response to mismatching visual stimuli—in masked form priming and tachistoscopic identification experiments—has been extensively studied to inform theories of this representation and matching. Contemporary theories all assume that the matching is graded: Stored representations of mismatching words are accessed in spite of information that indicates the mismatch, but such access is less efficient the more severe the mismatch. The calculation of such graded matches is explicit in the spatial coding model (SCM;
The alternative theory posits—in common with a class of models of categorization (e.g.,
The claims of this article are that an appropriate characterization of letter identification processes is stochastic and piecemeal, and that phenomena attributed by other accounts to imperfect or graded matching in masked form priming and tachistoscopic identification are a consequence of these letter identification processes, rather than the details of lexical matching processes.
The Letters in Time and Retinotopic Space model (LTRS) implements these claims by specifying detailed assumptions about this piecemeal letter processing but specifying only the bare minimum of assumptions about lexical processing: that the lexical system can distinguish a match to a particular known word from a non-match for that word (which is necessary for stimulus identification to be possible at all), and that the lexical system is susceptible to a head start from additional stimulus exposure (which is necessary for priming to be possible at all). As such, the model cannot, of course, address phenomena that are clearly lexical, such as frequency effects or neighborhood effects, or even lexical decision itself, only the (relative amount of) priming that occurs in that task.
Thus, all of LTRS's explanations are that relevant information has yet to be perceived during a brief presentation; the timing of such information is a concrete concept, measurable in identification tasks. Models that rely on some form of similarity (match score) calculation must further specify several intervening hypothetical mechanisms: how such similarity is calculated, its influence on lexical access, and how the underlying representation gives rise to confusions when all information is present. LTRS is simpler because it does not require these explanatory mechanisms. In the light of other accounts, LTRS may appear to do little explanation, but this is because these other accounts interpose explanations where none are required.
This article presents (a) a description of LTRS; (b) discussion of some core aspects of the model; (c) LTRS fits to new word tachistoscopic identification data with manipulation of target duration for transposed-letter (TL) and 1-, 2-, and 4-letter-different (1LD; 2LD; 4LD) foils; (d) LTRS fits to nonword tachistoscopic identification data for a wide range of target–foil relationships; (e) LTRS fits for form priming data with a manipulation of prime duration; and (f) LTRS fits for priming with a range of prime–target relationships.
LTRS
On a given trial, processing of all letters begins at the same random point in time after onset. As an approximation, this time is normally distributed with mean α and standard deviation σ. The assumption that this time is equal for all letters is supported by tachistoscopic identification data specifically seeking—but not finding—time points at which identification of left-hand letters in words is above chance and right-hand letters is at chance (
Although processing begins at the same time for each letter, positional effects (such as left-to-right trends in accuracy at intermediate durations; e.g.,
The identity of the letter in position i becomes available at an exponentially distributed time after the start of processing, with processing rate proportional to β_i; that is, once an amount of time t has elapsed since the start of processing, the probability that letter i has been identified is 1 − exp(−kβ_i t). Without loss of generality, k = 1. This assumption is the same as that for feature extraction in some models of categorization (e.g.,
Once a letter is identified, some information about its retinotopic location is available. This information is assumed to accurately specify a single point or small region within the letter (because a single letter spans several retinotopic letter detectors). As such, if (and only if) two letters have been identified, it is possible to (immediately) determine their order correctly, but other information is not reliably diagnostic as to stimulus identity (see below). That is, the identity/location of W in SWAN is enough to tell the word is not SCAN, but because the W in SWAN could appear anywhere on the retina, further positional information, such as the (approximate) position of the A, is needed to tell SWAN from SAWN.
After identification of each letter, a more precise positioning process occurs for that letter (even if other letters remain to be identified); this process has similar temporal properties to the identification process but with a different constant of proportionality. That is, for a letter in position i, once a duration t has elapsed since identification of that letter, the probability that the precise positional information has become available is 1 − exp(−λβ_i t). The precise positional information is such that if this information is available for two letters, then it is known whether the two letters are adjacent, because this information reveals the retinotopic location of the left and right edges of the letters.
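The two timing assumptions above can be illustrated with a minimal Python sketch. This is not the implemented fitting code: the function names and example values are hypothetical, k is set to 1, and the third function additionally assumes that the positioning delay is independent of the identification time, so that their sum follows a two-stage (hypoexponential) distribution.

```python
import math

def p_identified(beta_i: float, t: float) -> float:
    """P(letter i identified within time t of the start of processing), k = 1."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-beta_i * t)

def p_positioned_given_identified(beta_i: float, lam: float, t: float) -> float:
    """P(precise position available within time t of identifying letter i)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-lam * beta_i * t)

def p_positioned_since_start(beta_i: float, lam: float, t: float) -> float:
    """P(letter i identified AND precisely positioned within t of processing start).

    Assumption (not stated in the text): the two stages are independent, so the
    total time is the sum of exponentials with rates a = beta_i and b = lam * beta_i.
    """
    if t <= 0:
        return 0.0
    a, b = beta_i, lam * beta_i
    if math.isclose(a, b):
        # Equal rates: Erlang distribution with two stages.
        return 1.0 - math.exp(-a * t) * (1.0 + a * t)
    return 1.0 - (b * math.exp(-a * t) - a * math.exp(-b * t)) / (b - a)

if __name__ == "__main__":
    # Hypothetical values: rate 0.1 per ms, lambda = 0.5, 10 ms of processing.
    print(p_identified(0.1, 10.0))
    print(p_positioned_since_start(0.1, 0.5, 10.0))
```

Under these assumptions, precise positional information can never be more likely than identity information at the same moment, which is why adjacency tends to be resolved later than identity and relative order.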
Although the retinotopic information that is posited to be available might allow estimation of the retinotopic distance between two letters, it is assumed that this information is usually discarded as unreliable (at least for short words). Absolute retinotopic distance information is unreliable because of size constancy: The same stimuli may become larger or smaller due to font size, viewing distance or viewing position. Relative retinotopic distance information is treated as unreliable because it usually is; proportional fonts and handwriting are often read: For instance—as illustrated in
Forced-choice tachistoscopic identification involves a brief, masked presentation of a word or nonword stimulus, followed by a two-alternative forced choice for its identity, as illustrated in
Loss of information
After stimulus offset, all information about each letter has a small probability φ of being lost due to interference from the mask or from the identification of the response alternatives; no loss can occur while the stimulus is present on the display. When information is lost, all information about the relevant letter is lost; that is, identity information about a letter cannot be spared when its precise positional information is lost. Only stored letter information is used to determine a response; any other form of information is ignored because it has been rendered unusable or unreliable during the identification of the response alternatives.
Response selection
Information distinguishes between the available options if (and only if) it could have been obtained from one alternative but not the other. If the available information does not distinguish between the available options, guessing occurs. In experiments where the correct response is available (as modeled here), this can only occur because of missing information, because the possibility of misperception is not used in LTRS's predictions. When the available information does distinguish between the options, there is a small probability ε that participants make a premature guess response (or motor error).
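As a rough illustration of how loss and response selection combine, the following Python sketch computes accuracy for the simplest case, a 1LD foil, where only the single differing letter can distinguish target from foil. It is a sketch under stated assumptions rather than the fitted model: σ is taken to be zero, processing is assumed to stop at stimulus offset, guesses (including premature ones) are assumed to be correct half the time, and the function and parameter names are hypothetical.

```python
import math

def p_correct_1ld(duration_ms: float, alpha: float, beta_crit: float,
                  phi: float, eps: float) -> float:
    """Predicted 2AFC accuracy when one letter distinguishes target from foil.

    Assumptions (illustrative only): processing starts exactly alpha ms after
    onset (sigma = 0), stops at offset, and any guess is correct with p = 0.5.
    """
    window = max(0.0, duration_ms - alpha)             # time available for processing
    p_identified = 1.0 - math.exp(-beta_crit * window)  # critical letter identified
    p_usable = p_identified * (1.0 - phi)               # ...and not lost after offset
    # Distinguishing information: correct unless a premature guess (eps) misses.
    p_if_informed = (1.0 - eps) + eps * 0.5
    # No distinguishing information: pure guess.
    return p_usable * p_if_informed + (1.0 - p_usable) * 0.5

if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape of the curve.
    for d in (0, 6, 12, 18, 24, 30, 36, 42):
        print(d, round(p_correct_1ld(d, alpha=10.0, beta_crit=0.08, phi=0.05, eps=0.02), 3))
```

With these assumptions, accuracy stays at chance until processing begins and then rises toward an asymptote below 1 that is set jointly by φ and ε.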
Masked form priming involves a brief (and usually forward-masked) presentation of a (typically lowercase) prime stimulus (typically a nonword) before presentation of a (typically uppercase) target word (sometimes with intervening mask or blank) on which another task, usually speeded lexical decision, is performed. To apply LTRS to form priming, specification must be given of (a) how priming occurs and (b) the cause of differences in magnitude of priming between different prime–target relationships.
Priming as savings
The effectiveness of the prime (compared to an unrelated prime that evokes no lexical processing of the target) is equal to the period of time for which it evokes initial lexical processing of the target (
That is, priming is defined as a head start in processing that occurs during the period where the target is a candidate for lexical identification of the prime. The precise nature of the lexical processing involved could moderate the effects, but this article explores the extent to which phenomena in word identification can be explained by essentially perceptual aspects of letter processing; including a detailed lexical processing mechanism would detract from this goal. A bare bones mechanism could be used to produce lexical decision times in combination with LTRS without modifying the priming predictions, such as a set of non-competing non-noisy accumulators with drift related to log-frequency. Although such a mechanism would produce a reasonable correlation with the word lexical decision data,
Termination of priming
In LTRS, the cessation of priming occurs stochastically (unlike
I would hope that the assumptions described so far seem neither implausible nor counterintuitive. These assumptions do, however, lead to what may be the most counterintuitive aspect of LTRS, particularly in the light of the emphasis in other approaches on representation and similarity: LTRS's assumptions suffice to make (good) predictions about data such as the effects of prime type on priming without specifying very many details of the representation. The representation needs (a) to be able to specify which letters have been perceived without specifying anything about letters that have yet to be perceived, (b) to be able to specify that two letters are adjacent in a particular order, (c) to be able to specify that two letters are non-adjacent without specifying the number or identity of the intervening letters, and (d) to be able to specify the order of two letters without specifying anything about adjacency.
There are several representations that meet these criteria. I do not specify which representation is part of the model because it has no consequences for the predictions at a behavioral level: Short of single-cell recording, differences between these representations (if the rest of LTRS is true) cannot be detected because of the all-or-nothing nature of the computations. Such representations include the following:
Globbing
This representation consists of an ordered letter string with unknown information indicated by a special character, as in wildcard matching for computer filenames, such as * for any number of unknown letters, and + for at least one unknown letter. The representation is initially unspecified, that is, *. Thereafter, the pattern would become more specific as more information is identified. For instance, *c*, *a*, *t*, c*, *t, *c*a*, *c*t*, *a*t*, c+t, c*a*, c*t*, *c*t, *a*t*, ca+, +at, *c*a*t*, c*a*t*, *c*a*t, ca*t*, c*a*t, *c*at, and cat are the possible states in this representation that could be produced by the stimulus CAT with the LTRS assumptions (though the representation could express other states consistent with CAT). Matching could occur via a serial scan, but parallel constant-time solutions exist for these kinds of problems (
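The all-or-none matching implied by this representation can be sketched in Python by translating each pattern into a regular expression. This is an illustrative serial-scan implementation, not the parallel constant-time approach mentioned above; the helper names are hypothetical, and letters are assumed to be restricted to a to z.

```python
import re

def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a percept pattern into a regex.

    '*' stands for any number of unknown letters (including none),
    '+' stands for at least one unknown letter.
    """
    parts = []
    for ch in pattern.lower():
        if ch == "*":
            parts.append("[a-z]*")
        elif ch == "+":
            parts.append("[a-z]+")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def consistent(pattern: str, word: str) -> bool:
    """True if the word could have produced the percept; matching is all-or-none."""
    return glob_to_regex(pattern).fullmatch(word.lower()) is not None

if __name__ == "__main__":
    print(consistent("*w*", "SWAN"), consistent("*w*", "SCAN"))      # identity of W
    print(consistent("*a*w*", "SAWN"), consistent("*a*w*", "SWAN"))  # order of A and W
```

The two printed comparisons correspond to the SWAN, SCAN, and SAWN example given earlier: a single identified letter separates SWAN from SCAN, whereas the relative order of two letters is needed to separate SAWN from SWAN.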
Bigrams with multiple states
In this representation, there would be nodes (or other representational units) for different open bigrams (i.e., ordered pairs of letters) that could carry more information than just the presence of an open bigram by virtue of having multiple states that do not correspond to strengths. One state would indicate that no relevant letter has been detected, the second would indicate partial satisfaction (i.e., that one letter of the pair is present), the third would indicate the presence of the open bigram with no further information, the fourth would indicate that the open bigram is present and is also a closed bigram (i.e., adjacent), and the last would indicate that the open bigram is present but is not a closed bigram (i.e., non-adjacent). The activation values of these states would be arbitrary, as they would not be treated as strengths but rather would be treated in an all-or-none fashion.
Letters, open bigrams, adjacent bigrams, and non-adjacent bigrams
In this representation, multiple types of representational units (e.g., nodes, or a list) are taken to code the identity of the stimulus. These are abstract, position-free units for individual letters, open bigrams (i.e., ordered pairs of letters of any distance), adjacent bigrams (i.e., ordered pairs of letters that are adjacent), and non-adjacent bigrams (i.e., ordered pairs of letters that are non-adjacent). This representation—as a textual list of each type of unit that is implied by the information in the percept—is used in the implemented code for calculating most of the predictions made here, but other representations would yield the same results. This representation could be used in a network; this would yield a proliferation of units and connections to the (unspecified) word level, but it does not proliferate connection strengths that can make the model flexible: Information is combined by the logical-AND rule; such high-threshold logic is not sensitive to strengths. In terms of input to word units, any amount of feed-forward inhibition stops facilitation of the word unit regardless of the amount of active facilitatory connections, and in the absence of inhibition, any non-zero amount of active facilitatory feed-forward connections produces the maximum effective net input to the word unit.
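A minimal Python sketch of this unit-list representation and the logical-AND rule follows. It is not the implemented code referred to above: the tuple encoding and function names are hypothetical, and for simplicity the sketch assumes words without repeated letters (repeated letters would need units that track individual letter tokens).

```python
from itertools import combinations

def units(word: str) -> set:
    """All units implied by a fully perceived word (assumes no repeated letters)."""
    w = word.lower()
    out = {("letter", ch) for ch in w}
    for i, j in combinations(range(len(w)), 2):
        pair = w[i] + w[j]
        out.add(("open", pair))  # ordered pair at any distance
        out.add(("adjacent" if j == i + 1 else "nonadjacent", pair))
    return out

def matches(percept: set, word: str) -> bool:
    """Logical-AND rule: the word matches only if it implies every perceived unit."""
    return percept <= units(word)

if __name__ == "__main__":
    # S and W identified, with their order known but not their adjacency.
    percept = {("letter", "s"), ("letter", "w"), ("open", "sw")}
    print([w for w in ("swan", "sawn", "scan") if matches(percept, w)])
    # Adding A and the order of W before A rules out SAWN.
    percept |= {("letter", "a"), ("open", "wa")}
    print([w for w in ("swan", "sawn", "scan") if matches(percept, w)])
```

Because `matches` is a subset test, a single perceived unit that the word does not imply blocks the match no matter how many other units agree, which mirrors the high-threshold logic described above.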
It bears emphasis that although these representations and mechanisms are complex, such complexity is not what gives LTRS the scope to capture the data, and it does not lend the model extra flexibility: Unlike in models based on match scores, the details of such complex machinery do not form the explanation of the effects in LTRS. Instead, the core explanatory mechanism in LTRS is that primes prime more if they diverge in processing from the target on average later in processing and foils are harder to reject if they are less likely to have diverged in processing from the target at the time of the post-mask. I now illustrate how effectively such an explanation can account for the data.
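The divergence idea can be made concrete with a small Python sketch for the simplest contrast, an identity prime versus a 1LD prime. It computes the expected time for which the perceived information remains consistent with the target, capped at the prime duration. The sketch rests on several simplifying assumptions that go beyond the text: σ is zero, only the single differing letter can cause divergence, and no further lexical parameters are included, so the output should be read as a qualitative illustration and not as predicted priming in milliseconds. All names are hypothetical.

```python
import math
from typing import Optional

def expected_time_before_divergence(prime_ms: float, alpha: float,
                                    beta_diff: Optional[float] = None) -> float:
    """Expected time the prime stays consistent with the target, capped at prime_ms.

    beta_diff is the processing rate of the one letter that differs from the
    target; None denotes an identity prime, which never diverges.
    Assumes processing starts exactly alpha ms after onset (sigma = 0).
    """
    if beta_diff is None or prime_ms <= alpha:
        return float(prime_ms)
    window = prime_ms - alpha
    # E[min(X, window)] for X ~ Exponential(beta_diff).
    return alpha + (1.0 - math.exp(-beta_diff * window)) / beta_diff

if __name__ == "__main__":
    # Hypothetical values: alpha = 10 ms, rate 0.05 per ms for the differing letter.
    for d in (10, 20, 30, 40, 50, 60):
        ident = expected_time_before_divergence(d, alpha=10.0)
        one_ld = expected_time_before_divergence(d, alpha=10.0, beta_diff=0.05)
        print(d, round(ident, 1), round(one_ld, 1))
```

Under these assumptions the identity value grows linearly with prime duration, whereas the 1LD value levels off smoothly toward alpha + 1/beta_diff as the differing letter becomes increasingly likely to have been identified.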
LTRS was first applied to new
The full LTRS for four-letter strings with 10 parameters was fitted to each participant separately by maximum likelihood (exact predictions are achieved by numerical integration); the parameter values are presented in
Findings involving stimuli where letters are transposed with an adjacent letter, such as the above 2LD versus TL comparison and the corresponding priming comparison (e.g.,
LTRS was then applied to a wider range of target–foil relationships from five experiments by
This analysis was restricted to conditions where the lexical status of both options was the same (i.e., Experiment 2 was excluded) and the stimuli were five letters long (i.e., part of Experiment 5 was excluded) because (a) the goal of LTRS is to examine factors that can be explained perceptually without detailed lexical processes,
Although SCM does not have mechanisms to identify nonwords, its graded match scores, which are controlled by two parameters,
Next, LTRS was applied to form priming data with variation in prime duration, given by
A highly simplified version of LTRS is suitable for these data because they include only identity and 1LD priming, and they are averaged over stimulus lengths and all positions of difference. Only three parameters were estimated: α, ω, and B, using βi|l = B/l for any given length l; σ was set to zero, and λ has no role in 1LD or identity priming predictions, because letter order is irrelevant. (The remaining parameters are for identification only.) The entry-opening account of priming—in which identity priming is linear (with slope 1) in the prime duration, and 1LD priming is linear (with slope 1) in the prime duration up to the completion of the first (approximate) phase, after which it is constant—can also be characterized by a three-parameter model; the parameters are the intercept for identity priming, the intercept for 1LD priming, and the maximum value of 1LD priming. Against a total sum of squares of 3,812 ms
The most extensive data regarding apparent letter string similarity effects come from manipulation of prime–target relationships in form priming, so finally, LTRS was applied to these data.
Again, to avoid excessive proliferation of β parameters for the various lengths of stimuli, it was necessary to add an ad hoc equation constraining the β values with only a few parameters. Given the greater number of conditions to be fit, some of which were defined by position, two were used—B and ηi—the former being the sum of the rates, the latter reflecting increased processing strength (efficiency) for the initial position (cf.
The observed and LTRS-predicted priming (evaluated by quadrature), with the parameters given in
One of the most important comparisons comes from
Superset primes are another form of prime—ones in which an extra letter is inserted relative to the target—that can distinguish between models. In particular, LTRS predicts that when the extra letter repeats one already in the target, priming should be greater than if the extra letter is unique, because both the inserted letter and the letter it repeats must be detected to stop priming if there is an adjacent repeat, but only the new letter needs to be detected to stop priming if it is unique. Although SERIOL and other open-bigram schemes sensitive to the distance between pairs of letters also make this prediction, SCM makes the contrasting prediction that these conditions should be equivalent. One experiment that has examined this (
Against a total sum-of-squares of 12,390 ms
It is perhaps more telling to examine the predictions of a model for experiments of the type that it is designed to explain, but which were not examined in its development, as their idiosyncrasies will not have been built into the model.
LTRS accounted for effects of manipulations of duration and relationships between letter strings in both tachistoscopic identification and form priming. It did so by recourse to the idea that different information about letter strings becomes available at different times. The account asserts neither the notion that lexical processing partially tolerates imperfect matches in the presence of negative evidence, nor the notion that the cause of ambiguity in a percept is contamination by noise. Other models have yet to be applied to this range of types of data, and there may be difficulties in doing so. For instance, the matching process in SCM is based on word nodes, which causes difficulty in accounting for nonword identification data, even if a decision rule to fit the effect of duration in the word identification data was found; the overlap model has no account of the influence of exposure duration; and the functional form of the exposure duration effect is not the half-normal cumulative distribution that would be expected under simple signal detection accounts.
The predictions of LTRS are derived with specification of only the most basic properties of lexical processing relating to targets; effects of word primes, and of word neighbors of primes, would require further specification of lexical processing (and lexical decision). The extent to which LTRS is successful—without reference to detailed lexical processing—as an account of core phenomena relating to the identification, confusion, and priming of letter strings over time is suggestive that priming and tachistoscopic identification phenomena need not be understood in terms of tolerance for partial matches but may—in whole or in part—be explained by reference to the stochastic timing of information extraction from the stimulus.
Adelman, J. S., & Brown, G. D. A. (2006). The time course of letter position and identity information availability in visual word identification. Manuscript submitted for publication.
Adelman, J. S., & Brown, G. D. A. (2008). Methods of testing and diagnosing models: Single and dual route cascaded models of word naming. Journal of Memory and Language, 59, 524–544. doi:10.1016/j.jml.2007.11.008
Adelman, J. S., Marquis, S. J., & Sabatos-DeVito, M. G. (2010). Letters in words are read simultaneously, not in left-to-right sequence. Psychological Science, 21, 1799–1801. doi:10.1177/0956797610387442
Chung, K.-L. (1996). O(1)-time parallel string-matching algorithm with VLDCs. Pattern Recognition Letters, 17, 475–479. doi:10.1016/0167-8655(96)00007-4
Cohen, A. L., & Nosofsky, R. M. (2003). An extension of the exemplar-based random-walk model to separable-dimension stimuli. Journal of Mathematical Psychology, 47, 150–165. doi:10.1016/S0022-2496(02)00031-7
Davis, C. J. (2010). The spatial coding model of visual word identification. Psychological Review, 117, 713–758. doi:10.1037/a0019738
Davis, C. J., & Bowers, J. S. (2006). Contrasting five different theories of letter position coding: Evidence from orthographic similarity effects. Journal of Experimental Psychology: Human Perception and Performance, 32, 535–557. doi:10.1037/0096-1523.32.3.535
Forster, K. I., Mohan, K., & Hector, J. (2003). The mechanics of masked priming. In S. Kinoshita & S. J. Lupker (Eds.), Masked priming: The state of the art (pp. 3–37). Hove, England: Psychology Press.
Gomez, P., Ratcliff, R., & Perea, M. (2008). The overlap model: A model of letter position coding. Psychological Review, 115, 577–600. doi:10.1037/a0012667
Grainger, J., Granier, J. P., Farioli, F., Van Assche, E., & van Heuven, W. J. B. (2006). Letter position information and printed word perception: The relative-position priming constraint. Journal of Experimental Psychology: Human Perception and Performance, 32, 865–884. doi:10.1037/0096-1523.32.4.865
Johnston, J. C. (1978). A test of the sophisticated guessing theory of word perception. Cognitive Psychology, 10, 123–153. doi:10.1016/0010-0285(78)90011-7
Lamberts, K. (1998). The time course of categorization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 695–711. doi:10.1037/0278-7393.24.3.695
Lupker, S. J., & Davis, C. J. (2009). Sandwich priming: A method for overcoming the limitations of masked priming by reducing lexical competitor effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 618–639. doi:10.1037/a0015278
Norris, D., Kinoshita, S., & van Casteren, M. (2010). A stimulus sampling theory of letter identity and order. Journal of Memory and Language, 62, 254–271. doi:10.1016/j.jml.2009.11.002
Perea, M., & Lupker, S. J. (2003). Transposed-letter confusability effects in masked form priming. In S. Kinoshita & S. J. Lupker (Eds.), Masked priming: The state of the art (pp. 97–120). Hove, England: Psychology Press.
Rumelhart, D. E. (1970). A multicomponent theory of the perception of briefly exposed visual displays. Journal of Mathematical Psychology, 7, 191–218. doi:10.1016/0022-2496(70)90044-1
Rumelhart, D. E., & Siple, P. (1974). The process of recognizing tachistoscopically presented words. Psychological Review, 81, 99–118. doi:10.1037/h0036117
Schoonbaert, S., & Grainger, J. (2004). Letter position coding in printed word perception: Effects of repeated and transposed letters. Language and Cognitive Processes, 19, 333–367. doi:10.1080/01690960344000198
Stevens, M., & Grainger, J. (2003). Letter visibility and the viewing position effect in visual word recognition. Perception & Psychophysics, 65, 133–151. doi:10.3758/BF03194790
Stewart, N. (2006a). Millisecond accuracy video display using OpenGL under Linux. Behavior Research Methods, 38, 142–145. doi:10.3758/BF03192759
Stewart, N. (2006b). A PC parallel port button box provides millisecond response time accuracy under Linux. Behavior Research Methods, 38, 170–173. doi:10.3758/BF03192764
Van Assche, E., & Grainger, J. (2006). A study of relative-position priming with superset primes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 399–415. doi:10.1037/0278-7393.32.2.399
Whitney, C. (2001). How the brain encodes the order of letters in a printed word: The SERIOL model and selective literature review. Psychonomic Bulletin & Review, 8, 221–243. doi:10.3758/BF03196158
Experimental Method
Four postgraduates of the Department of Psychology at the University of Warwick acted as observers in this experiment; three were paid £100 (ca. $170), and the fourth was the author. All reported normal or corrected-to-normal vision and had English as their first language.
One hundred and eight four-letter English words were used to construct 432 target–foil pairings for this experiment. Each word acted as a foil for four targets, and these were also the foils when this word acted as target. Each pairing (symmetrically) represented one of four transformation types, their subtypes (conditions) denoted by four-letter codes representing their relation to the string 1234: (a) replacement of four letters (e.g., CODE vs. WASP), denoted dddd, of which there were 216 pairs (counting both versions of a pairing as distinct); (b) replacement of two letters, either adjacent to one another (e.g., WHIP vs. WRAP) or at the two ends (e.g., SEAL vs. TEAM), denoted dd34, 1dd4, 12dd, or d23d, 20 of each subtype; (c) replacement of one letter (e.g., ABLE vs. AXLE), denoted d234, 1d34, 12d4, or 123d, 24 of each subtype; and (d) reversal of two letters either in adjacent positions (e.g., SAWN vs. SWAN) or at the two ends (e.g., MEAT vs. TEAM), denoted 2134, 1324, 1243, or 4231, 10 of each subtype. The severe selection criteria were not perfectly met, so a few of the pairings realized other relationships than those described above, due to the inclusion of the words EVER, VEER, and PASS among the stimuli, and the use of ACID–DUET as a pairing. All modeling takes into account the true relationship between the targets and foils. However, in the graphs, these pairings are averaged with the intended condition.
Stimuli were presented on a Sony CPD-G200 computer monitor driven at 166.67 Hz by an NVidia GeForce 2 MX based graphics card. The resolution of this display was 640 × 480 on a 17-in. (43.18-cm) monitor (see
Accuracy was measured on two-alternative forced-choice identification of the targets for each target–foil pairing in each of the 13 pairing subtypes, for each of eight DTs spaced evenly from 0 to 42 ms. Six replications of each target–foil–DT combination were tested, half associated with each response button, giving a total of 20,736 trials per observer. Over all trials, critical information was equally likely to appear in each of the four letter positions; and given two alternatives, each was equally likely to be correct. Every 3,456 trials observers had seen each target–foil–DT combination an equal number of times. Every 6,912 trials, observers had seen each target–foil–DT–button combination an equal number of times.
Observers completed 12 sessions of approximately 1 hr, each containing 16 blocks of 108 trials. Before each trial, a mask of 10 hash (#) symbols in a 32-point Courier font was displayed centrally horizontally approximately 12% from the bottom of the screen for an inter-trial interval varying randomly between 900 and 1,100 ms, notwithstanding delays due to the parallel port failing to reset. This was then replaced by the target word in a 24-point Courier font in lower case in the same position for a DT from 0 to 42 ms (no target stimulus was shown in the 0-ms condition). The screen was then blank for 6 ms. Then the mask was displayed again with two response options, both target and foil, in lateral positions in a line below that of the mask. Observers pressed the matching lateral button of the button box to indicate their response. After this, the alternatives were removed from the display, and the mask remained for the inter-trial interval. At the end of each block, accuracy for that block was displayed, and breaks were taken as needed between blocks.
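The design counts reported in this section can be cross-checked with a short Python sketch. The numbers are taken from the text above; the variable names and the rounding of display times to whole milliseconds are illustrative assumptions.

```python
# Pairings per subtype, as listed in the stimulus description.
pairings = {"dddd": 216}
pairings.update({code: 20 for code in ("dd34", "1dd4", "12dd", "d23d")})  # 2LD
pairings.update({code: 24 for code in ("d234", "1d34", "12d4", "123d")})  # 1LD
pairings.update({code: 10 for code in ("2134", "1324", "1243", "4231")})  # transpositions

n_subtypes = len(pairings)               # 13 pairing subtypes
n_pairings = sum(pairings.values())      # 432 target-foil pairings
n_dts, n_replications = 8, 6

trials_per_observer = n_pairings * n_dts * n_replications
trials_from_sessions = 12 * 16 * 108     # sessions x blocks x trials per block

# Display times implied by the refresh rate: one frame is about 6 ms at 166.67 Hz.
frame_ms = 1000.0 / 166.67
display_times = [round(k * frame_ms) for k in range(n_dts)]

if __name__ == "__main__":
    print(n_subtypes, n_pairings, trials_per_observer, trials_from_sessions)
    print(display_times)
```

The two trial totals agree at 20,736, and the 3,456-trial and 6,912-trial balancing cycles correspond to one and two passes through the 432 × 8 target–foil–DT combinations, respectively.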
Submitted: January 17, 2011 Revised: June 2, 2011 Accepted: June 2, 2011