Psychophysics quantitatively investigates the relationship between physical stimuli and the sensations and perceptions they produce. Psychophysics has been described as "the scientific study of the relation between stimulus and sensation" or, more completely, as "the analysis of perceptual processes by studying the effect on a subject's experience or behaviour of systematically varying the properties of a stimulus along one or more physical dimensions".
Psychophysics also refers to a general class of methods that can be applied to study a perceptual system. Modern applications rely heavily on threshold measurement, ideal observer analysis, and signal detection theory.
Psychophysics has widespread and important practical applications. For example, in digital signal processing, psychophysics has informed the development of models and methods of lossy compression. These models explain why humans perceive very little loss of signal quality when audio and video signals are encoded with lossy compression.
Many of the classical techniques and theories of psychophysics were formulated in 1860 when Gustav Theodor Fechner in Leipzig published Elemente der Psychophysik (Elements of Psychophysics). He coined the term "psychophysics", describing research intended to relate physical stimuli to the contents of consciousness such as sensations (Empfindungen). As a physicist and philosopher, Fechner aimed at developing a method that relates matter to the mind, connecting the publicly observable world and a person's privately experienced impression of it. His ideas were inspired by experimental results on the sense of touch and light obtained in the early 1830s by the German physiologist Ernst Heinrich Weber in Leipzig, most notably those on the minimum discernible difference in intensity of stimuli of moderate strength (just noticeable difference; jnd), which Weber had shown to be a constant fraction of the reference intensity, and which Fechner referred to as Weber's law. From this, Fechner derived his well-known logarithmic scale, now known as the Fechner scale. Weber's and Fechner's work formed one of the bases of psychology as a science, with Wilhelm Wundt founding the first laboratory for psychological research in Leipzig (Institut für experimentelle Psychologie). Fechner's work systematised the introspectionist approach (psychology as the science of consciousness), which had to contend with the behaviorist approach, in which even verbal responses are as physical as the stimuli.
During the 1930s, when psychological research in Nazi Germany essentially came to a halt, both approaches eventually began to be replaced by the use of stimulus-response relationships as evidence for conscious or unconscious processing in the mind. Fechner's work was studied and extended by Charles S. Peirce, who was aided by his student Joseph Jastrow, who soon became a distinguished experimental psychologist in his own right. Peirce and Jastrow confirmed most, but not all, of Fechner's empirical findings. In particular, a classic experiment of Peirce and Jastrow rejected Fechner's estimate of a threshold for the perception of weights as being far too high. In this experiment, Peirce and Jastrow in fact invented randomized experiments: they randomly assigned volunteers to a blinded, repeated-measures design to evaluate their ability to discriminate weights. Peirce's experiment inspired other researchers in psychology and education, who developed a research tradition of randomized experiments in laboratories and specialized textbooks in the 1900s. The Peirce-Jastrow experiments were conducted as part of Peirce's application of his pragmaticism program to human perception; other studies considered the perception of light, etc. Jastrow wrote the following summary: "Mr. Peirce's courses in logic gave me my first real experience of intellectual muscle. Though I promptly took to the laboratory of psychology when that was established by Stanley Hall, it was Peirce who gave me my first training in the handling of a psychological problem, and at the same time stimulated my self-esteem by entrusting me, then fairly innocent of any laboratory habits, with a real bit of research. He borrowed the apparatus for me, which I took to my room, installed at my window, and with which, when conditions of illumination were right, I took the observations. The results were published over our joint names in the Proceedings of the National Academy of Sciences.
The demonstration that traces of sensory effect too slight to make any registry in consciousness could none the less influence judgment, may itself have been a persistent motive that induced me years later to undertake a book on The Subconscious." This work clearly distinguishes observable cognitive performance from the expression of consciousness.
Modern approaches to sensory perception, such as research on vision, hearing, or touch, measure what the perceiver's judgment extracts from the stimulus, often putting aside the question of what sensations are being experienced. One leading method is based on signal detection theory, developed for cases of very weak stimuli. However, the subjectivist approach persists among those in the tradition of Stanley Smith Stevens (1906–1973). Stevens revived the idea of a power law suggested by 19th-century researchers, in contrast with Fechner's log-linear function (cf. Stevens' power law). He also advocated the assignment of numbers in ratio to the strengths of stimuli, called magnitude estimation. Stevens added techniques such as magnitude production and cross-modality matching. He opposed the assignment of stimulus strengths to points on a line that are labeled in order of strength. Nevertheless, that sort of response has remained popular in applied psychophysics. Such multiple-category layouts are often misnamed Likert scaling after the question items used by Likert to create multi-item psychometric scales, e.g., seven phrases from "strongly agree" through "strongly disagree".
Omar Khaleefa has argued that the medieval scientist Alhazen (Ibn al-Haytham) should be considered the founder of psychophysics. Although Alhazen made many subjective reports regarding vision, there is no evidence that he used quantitative psychophysical techniques, and such claims have been rebuffed.
Psychophysicists usually employ experimental stimuli that can be objectively measured, such as pure tones varying in intensity, or lights varying in luminance. All the senses have been studied: vision, hearing, touch (including skin and enteric perception), taste, smell and the sense of time. Regardless of the sensory domain, there are three main areas of investigation: absolute thresholds, discrimination thresholds and scaling.
A threshold (or limen) is the point of intensity at which the participant can just detect the presence of a stimulus (absolute threshold) or the presence of a difference between two stimuli (difference threshold). Stimuli with intensities below the threshold are considered not detectable (hence: subliminal). Stimuli at values close enough to a threshold will often be detectable on some proportion of occasions; therefore, a threshold is considered to be the point at which a stimulus, or change in a stimulus, is detected on some proportion p of occasions.
An absolute threshold is the level of intensity of a stimulus at which the subject is able to detect the presence of the stimulus some proportion of the time (a p level of 50% is often used). An example of an absolute threshold is the number of hairs on the back of one's hand that must be touched before the touch can be felt: a participant may be unable to feel a single hair being touched, but may be able to feel two or three, as this exceeds the threshold. The absolute threshold is also often referred to as the detection threshold. Several different methods are used for measuring absolute thresholds (as with discrimination thresholds; see below).
A difference threshold (or just-noticeable difference, JND) is the magnitude of the smallest difference between two stimuli of differing intensities that the participant is able to detect some proportion of the time (the percentage depending on the kind of task). To test this threshold, several different methods are used. The subject may be asked to adjust one stimulus until it is perceived as the same as the other (method of adjustment), may be asked to describe the direction and magnitude of the difference between two stimuli, or may be asked to decide whether intensities in a pair of stimuli are the same or not (forced choice). The JND is not a fixed quantity; rather, it depends on how intense the stimuli being measured are and on the particular sense being measured. Weber's law states that the JND is a constant proportion of the reference stimulus intensity, so the absolute size of the JND grows as intensity increases.
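Weber's law, and the logarithmic scale that Fechner derived from it, can be written compactly. The notation below is chosen here for illustration and is not taken from the text: I is the reference intensity, ΔI the just-noticeable difference, k the Weber fraction, S the sensation magnitude, c a scaling constant, and I_0 the intensity at the absolute threshold. The second step assumes, as Fechner did, that each JND corresponds to an equal increment of sensation.

```latex
\begin{align}
  \Delta I &= k\,I
    && \text{(Weber's law)} \\
  dS &= c\,\frac{dI}{I}
    && \text{(equal sensation increment per JND)} \\
  S &= c\,\ln\frac{I}{I_0}
    && \text{(Fechner's logarithmic scale)}
\end{align}
```

As a purely illustrative case, if k were 0.02, a 100 g reference weight would need a difference of about 2 g to be noticed, and a 200 g reference would need about 4 g.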
In discrimination experiments, the experimenter seeks to determine at what point the difference between two stimuli, such as two weights or two sounds, is detectable. The subject is presented with one stimulus, for example a weight, and is asked to say whether another weight is heavier or lighter (in some experiments, the subject may also say the two weights are the same). At the point of subjective equality (PSE), the subject perceives the two weights to be the same. The just-noticeable difference, or difference limen (DL), is the magnitude of the difference in stimuli that the subject notices some proportion p of the time (50% is usually used for p in the comparison task). In addition, a two-alternative forced choice (2-afc) paradigm can be used to assess the point at which performance reduces to chance on a discrimination between two alternatives (p will then typically be 75% since p=50% corresponds to chance in the 2-afc task).
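The following is a minimal sketch of how a PSE and a JND might be read off a set of response proportions by linear interpolation. The data are hypothetical, the function name `interpolate_level` is invented for this example, and the sketch assumes one common convention in which the JND is half the distance between the 25% and 75% points of the "heavier" responses.

```python
def interpolate_level(levels, proportions, target):
    """Linearly interpolate the stimulus level at which the
    response proportion equals `target`.

    Assumes `levels` is sorted ascending and `proportions` is
    non-decreasing (a monotonic psychometric function).
    """
    pairs = list(zip(levels, proportions))
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if p0 <= target <= p1 and p1 > p0:
            return x0 + (target - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("target proportion is not bracketed by the data")


# Hypothetical data: comparison weights (grams) judged against a
# 100 g standard, with the proportion of "heavier" responses.
levels = [92, 96, 100, 104, 108]
p_heavier = [0.05, 0.25, 0.50, 0.75, 0.95]

pse = interpolate_level(levels, p_heavier, 0.50)    # point of subjective equality
upper = interpolate_level(levels, p_heavier, 0.75)
lower = interpolate_level(levels, p_heavier, 0.25)
jnd = (upper - lower) / 2                           # assumed JND convention

print(f"PSE = {pse:.1f} g, JND = {jnd:.1f} g")
```

In practice a smooth function (for example a cumulative Gaussian or logistic) is usually fitted instead of interpolating between neighbouring points; interpolation is used here only to keep the sketch short.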
In psychophysics, experiments seek to determine whether the subject can detect a stimulus, identify it, differentiate between it and another stimulus, or describe the magnitude or nature of this difference. Software for psychophysical experimentation is overviewed by Strasburger.
Psychophysical experiments have traditionally used three methods for testing subjects' perception in stimulus detection and difference detection experiments: the method of limits, the method of constant stimuli and the method of adjustment.
In the ascending method of limits, some property of the stimulus starts out at a level so low that the stimulus could not be detected, then this level is gradually increased until the participant reports that they are aware of it. For example, if the experiment is testing the minimum amplitude of sound that can be detected, the sound begins too quietly to be perceived, and is made gradually louder. In the descending method of limits, this is reversed. In each case, the threshold is considered to be the level of the stimulus property at which the stimuli are just detected.
In experiments, the ascending and descending methods are used alternately and the thresholds are averaged. A possible disadvantage of these methods is that the subject may become accustomed to reporting that they perceive a stimulus and may continue reporting the same way even beyond the threshold (the error of habituation). Conversely, the subject may also anticipate that the stimulus is about to become detectable or undetectable and may make a premature judgment (the error of anticipation).
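The averaging of ascending and descending series can be sketched as follows. This is an illustration under stated assumptions, not a standard implementation: the function names and data are hypothetical, and each series threshold is taken as the midpoint between the two adjacent levels at which the response first changes.

```python
def series_threshold(levels, detected):
    """Midpoint between the two adjacent levels at which the
    participant's response first changes within one series."""
    for i in range(1, len(levels)):
        if detected[i] != detected[i - 1]:
            return (levels[i] + levels[i - 1]) / 2
    raise ValueError("no response change in this series")


def method_of_limits(series):
    """Average the per-series thresholds over all ascending and
    descending series."""
    estimates = [series_threshold(lv, det) for lv, det in series]
    return sum(estimates) / len(estimates)


# Hypothetical data. The descending series keeps reporting "yes"
# for longer, as an error of habituation would produce.
ascending = ([1, 2, 3, 4, 5], [False, False, False, True, True])
descending = ([7, 6, 5, 4, 3, 2], [True, True, True, True, True, False])

threshold = method_of_limits([ascending, descending])
print(threshold)
```

Averaging the two directions is intended to let opposite biases in ascending and descending series partly cancel.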
To avoid these potential pitfalls, Georg von Békésy introduced the staircase procedure in 1960 in his study of auditory perception. In this method, the sound starts out audible and gets quieter after each of the subject's responses, until the subject does not report hearing it. At that point, the sound is made louder at each step, until the subject reports hearing it, at which point it is made quieter in steps again. This way the experimenter is able to "zero in" on the threshold.
Instead of being presented in ascending or descending order, in the method of constant stimuli the levels of a certain property of the stimulus are not related from one trial to the next, but presented randomly. This prevents the subject from being able to predict the level of the next stimulus, and therefore reduces errors of habituation and anticipation. For absolute thresholds, the subject again reports whether they are able to detect the stimulus. For difference thresholds, a constant comparison stimulus has to be presented with each of the varied levels. Friedrich Hegelmaier described the method of constant stimuli in an 1852 paper. This method allows for full sampling of the psychometric function, but can require a large number of trials when several conditions are interleaved.
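A minimal sketch of the bookkeeping for the method of constant stimuli is given below: a fixed set of levels is repeated and shuffled, and the proportion of detections is tabulated per level to sample the psychometric function. The function names, the seed parameter and the example data are assumptions made for this illustration.

```python
import random
from collections import defaultdict


def make_trial_order(levels, repeats, seed=None):
    """Return a shuffled trial list in which each level appears
    exactly `repeats` times."""
    trials = [level for level in levels for _ in range(repeats)]
    random.Random(seed).shuffle(trials)
    return trials


def proportion_detected(trials, responses):
    """Tabulate the proportion of 'detected' responses per level."""
    counts = defaultdict(lambda: [0, 0])  # level -> [detections, presentations]
    for level, detected in zip(trials, responses):
        counts[level][0] += int(detected)
        counts[level][1] += 1
    return {level: hits / n for level, (hits, n) in sorted(counts.items())}


# Hypothetical run: an idealised observer who detects every
# stimulus at level 2 or above and nothing below it.
trials = make_trial_order([1, 2, 3], repeats=4, seed=0)
responses = [level >= 2 for level in trials]
print(proportion_detected(trials, responses))
```

The number of trials grows with the product of levels, repeats and interleaved conditions, which is the cost noted in the text.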
In the method of adjustment, the subject is asked to control the level of the stimulus and to alter it until it is just barely detectable against the background noise, or is the same as the level of another stimulus. The adjustment is repeated many times. This is also called the method of average error. In this method, the observers themselves control the magnitude of the variable stimulus, beginning with a level that is distinctly greater or lesser than a standard one and varying it until they are satisfied that the two are subjectively equal. The difference between the variable stimulus and the standard one is recorded after each adjustment, and the error is tabulated for a considerable series. At the end, the mean is calculated, giving the average error, which can be taken as a measure of sensitivity.
The classic methods of experimentation are often argued to be inefficient. This is because, in advance of testing, the psychometric threshold is usually unknown and most of the data are collected at points on the psychometric function that provide little information about the parameter of interest, usually the threshold. Adaptive staircase procedures (or the classical method of adjustment) can be used such that the points sampled are clustered around the psychometric threshold. Data points can also be spread over a slightly wider range if the psychometric function's slope is also of interest. Adaptive methods can thus be optimized for estimating the threshold only, or both threshold and slope. Adaptive methods are classified into staircase procedures (see below) and Bayesian, or maximum-likelihood, methods. Staircase methods rely on the previous response only, and are easier to implement. Bayesian methods take the whole set of previous stimulus-response pairs into account and are generally more robust against lapses in attention.
Staircases usually begin with a high-intensity stimulus, which is easy to detect. The intensity is then reduced until the observer makes a mistake, at which point the staircase 'reverses' and intensity is increased until the observer responds correctly, triggering another reversal. The stimulus values at the last several of these 'reversals' are then averaged. There are many different types of staircase procedures, using different decision and termination rules. Step size, up/down rules and the spread of the underlying psychometric function dictate where on the psychometric function they converge. Threshold values obtained from staircases can fluctuate wildly, so care must be taken in their design. Many different staircase algorithms have been modeled, and some practical recommendations have been suggested by Garcia-Perez.
One of the more common staircase designs (with fixed step sizes) is the 1-up-N-down staircase. If the participant makes the correct response N times in a row, the stimulus intensity is reduced by one step size. If the participant makes an incorrect response, the stimulus intensity is increased by one step size. A threshold is estimated from the mean midpoint of all runs. This estimate approaches, asymptotically, the correct threshold.
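The 1-up-N-down rule described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the `respond` callback, and the idealised observer are assumptions, and the threshold is taken here as the mean of the reversal intensities. The helper `convergence_level` uses a standard result that is not stated in the text: such a staircase settles where N consecutive correct responses are as likely as not, so the targeted proportion correct p satisfies p^N = 0.5.

```python
def convergence_level(n_down):
    """Proportion correct targeted by a 1-up-N-down staircase,
    from p ** n_down == 0.5 (about 0.707 for N=2, 0.794 for N=3)."""
    return 0.5 ** (1.0 / n_down)


def staircase(respond, start, step, n_down=2, max_reversals=6, max_trials=1000):
    """Run a fixed-step 1-up-N-down staircase.

    `respond(intensity)` returns True for a correct response.
    Returns the intensities at which the direction of change reversed.
    """
    intensity = start
    streak = 0            # consecutive correct responses at this intensity
    last_direction = 0    # -1 = last move was down, +1 = up, 0 = no move yet
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= max_reversals:
            break
        if respond(intensity):
            streak += 1
            if streak < n_down:
                continue  # need more correct responses before stepping down
            direction = -1
        else:
            direction = +1
        streak = 0
        if last_direction and direction != last_direction:
            reversals.append(intensity)
        last_direction = direction
        intensity += direction * step
    return reversals


# Idealised, noise-free observer (hypothetical): correct at 5 or above.
reversals = staircase(lambda x: x >= 5, start=8, step=1, n_down=2)
estimate = sum(reversals) / len(reversals)
print(reversals, estimate)
```

With a real, noisy observer the reversal points scatter around the targeted performance level, which is why several reversals are averaged and why the step size matters.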
Bayesian and maximum-likelihood (ML) adaptive procedures behave, from the observer's perspective, similarly to the staircase procedures. The choice of the next intensity level works differently, however: after each observer response, the likelihood of where the threshold lies is calculated from the set of this and all previous stimulus/response pairs. The point of maximum likelihood is then chosen as the best estimate for the threshold, and the next stimulus is presented at that level (since a decision at that level will add the most information). In a Bayesian procedure, a prior probability distribution over the threshold is further included in the calculation. Compared to staircase procedures, Bayesian and ML procedures are more time-consuming to implement but are considered to be more robust. Well-known procedures of this kind are Quest, ML-PEST, and Kontsevich & Tyler's method.
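The core step of such a procedure can be sketched as a grid search over candidate thresholds. This is a simplified illustration and not an implementation of Quest, ML-PEST, or Kontsevich & Tyler's method: the logistic psychometric function, its fixed slope, the candidate grid, and all names are assumptions made here. Passing a `log_prior` turns the maximum-likelihood estimate into a maximum a posteriori (Bayesian) one.

```python
import math


def p_detect(intensity, threshold, slope=1.0):
    """Assumed logistic psychometric function: probability of a
    'detected' response at the given intensity."""
    return 1.0 / (1.0 + math.exp(-slope * (intensity - threshold)))


def ml_threshold(trials, candidates, log_prior=None, slope=1.0):
    """Return the candidate threshold that maximises the log-likelihood
    of all (intensity, detected) pairs, plus the log prior if given."""
    best, best_score = None, -math.inf
    for i, threshold in enumerate(candidates):
        score = log_prior[i] if log_prior is not None else 0.0
        for intensity, detected in trials:
            p = p_detect(intensity, threshold, slope)
            p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
            score += math.log(p if detected else 1.0 - p)
        if score > best_score:
            best, best_score = threshold, score
    return best


# Hypothetical session so far: not detected at 4, detected at 6.
candidates = list(range(0, 11))
trials = [(4, False), (6, True)]
next_level = ml_threshold(trials, candidates)  # next stimulus is placed here
print(next_level)
```

After each new response the pair is appended to `trials` and the estimate is recomputed, so every past trial contributes to the placement of the next stimulus.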
In the prototypical case, people are asked to assign numbers in proportion to the magnitude of the stimulus. The psychophysical function relating the geometric means of their numbers to stimulus magnitude is often a power law with a stable, replicable exponent. Although contexts can change the law and the exponent, that change too is stable and replicable. Instead of numbers, other sensory or cognitive dimensions can be used to match a stimulus, and the method then becomes "magnitude production" or "cross-modality matching". The exponents of those dimensions found in numerical magnitude estimation predict the exponents found in magnitude production. Magnitude estimation generally finds lower exponents for the psychophysical function than multiple-category responses, because of the restricted range of the categorical anchors, such as those used by Likert as items in attitude scales.
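The power-law fit to geometric means can be sketched as a straight-line regression in log-log coordinates. The data below are fabricated purely so that the answer is known in advance, and the function names and the form ψ = k·I^a (k a scale constant, a the exponent) are notation assumed for this example.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fit_power_law(intensities, estimates_per_level):
    """Least-squares fit of log(geometric mean) against log(intensity).

    Returns (k, a) for the assumed form psi = k * intensity ** a.
    """
    xs = [math.log(i) for i in intensities]
    ys = [math.log(geometric_mean(e)) for e in estimates_per_level]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    k = math.exp(mean_y - a * mean_x)
    return k, a


# Illustrative data built so that the geometric means follow
# psi = 2 * I ** 0.5 exactly (two observers per intensity level).
intensities = [1, 4, 16, 64]
estimates = [[1, 4], [2, 8], [4, 16], [8, 32]]

k, exponent = fit_power_law(intensities, estimates)
print(f"k = {k:.3f}, exponent = {exponent:.3f}")
```

The geometric mean is used because magnitude estimates tend to spread multiplicatively; averaging in the log domain keeps one observer who uses very large numbers from dominating the result.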