Psychoacoustic behavior of human listeners in lateralization judgments of binaural stimuli

Introduction: The present work aims to develop statistical models of the psychoacoustic behavior of human listeners in lateralization judgments of binaural acoustic stimuli, as a function of Interaural Time Delay (ITD) and Interaural Amplitude Difference (IAD), for several Sensation Levels (SL). Such models are intended to contribute to a deeper understanding of the perception or recognition mechanism that allows listeners to decide whether a sound source is located on the right or on the left side of their medial plane. Methods: Numerous lateralization judgments were collected with a computer-controlled experimental set-up in order to investigate the transduction mechanism underlying them. The psychoacoustic data obtained were treated statistically by Two- and Three-Factor Probit (Probability Unit) Analysis. Results: Probit Analysis makes it possible to obtain the model coefficients and to fit 'Probit Planes and Surfaces' to the experimental data in order to study and predict the simultaneous effects of ITD and IAD on the listeners' psychoacoustic perceptions at several Sensation Levels (SL). Conclusion: The approach used here is appropriate for the analysis of this kind of binary response, and it also offers a simple way to obtain psychophysical responses that can be related to neurophysiological phenomena. It is argued that this may provide another way to access neural information through psychoacoustic experiments, without the need for invasive methods.


Introduction
The ability to localize sound sources is a very important feature of the human auditory system: by shaping the listener's spatial perception, this ability supplies the coordinates of objects in the environment, information that may be critical for survival. Psychoacoustic studies carried out over the last six decades show that human binaural hearing can precisely localize sound sources, particularly in the azimuthal plane, even in the presence of harmonic interference (Clopton and Spelman, 1995; Devore et al., 2009; Fastl and Zwiger, 2007; Laback et al., 2004; Long et al., 2003; Stern et al., 2005; Strutt, 1907).
Unlike the visual system, whose periphery provides a spatial representation, the auditory system must extract a few cues and send them to the central nervous system for analysis. The chief cues are the interaural time delays (ITD) and the interaural amplitude differences (IAD). The interaural amplitude differences result from the "sound shadow" cast by the listener's head on the side opposite to the sound source. The interaural time delays occur because the sound wave takes longer to reach the ear on the side opposite to the sound source.
The most important relays of the human central nervous system involved in sound localization are the superior olivary complex and the mesencephalon (superior colliculus). Some authors state that the neural computation of ITD starts in the medial superior olive (MSO) (Kuwada et al., 1997), with the MSO neurons transmitting the information about azimuth through a nonlinear mapping. ITD is considered the cue for the localization of low-frequency sounds in the azimuthal direction (Stern and Trahiotis, 1997). In turn, IAD information, together with the frequency filtering provided by the pinna, seems to attenuate front-back ambiguities.
The aim of this work is to apply statistical methods to analyze the psychoacoustic behavior of human listeners in lateralization judgments of binaural stimuli as a function of interaural time delays (ITD) and interaural amplitude differences (IAD), for several values of sensation level (SL), that is, the number of dB above the hearing threshold of a particular subject. Lateralization judgments obtained through virtual simulation of stimuli using earphones are in general consistent with the corresponding judgments made on actual signals in the open field. Hence, the investigation carried out here uses a computer-controlled experimental set-up (Nogueira et al., 2013a; 2013b). Each listener under test is expected to decide whether a sound source is located on the right or on the left side of his/her medial plane for each random combination of ITD and IAD presented. The large amount of experimental data obtained from these judgments has been treated statistically by Two-Factor Probit Analysis. With this analysis it was possible to fit 'Probit Planes' to the experimental data in order to study and predict the simultaneous effects of ITD and IAD on the listeners' psychoacoustic perceptions at several SL. The work also includes Three-Factor Probit Analysis, with which it was possible to fit 'Probit Surfaces' in order to investigate simultaneously the influences of ITD, IAD and the product ITD·IAD on the listeners' perceptions or judgments.

Experimental technique
In the computer-controlled experimental set-up described in Nogueira et al. (2013a; 2013b) and used here, binaural pulse trains with randomly varying ITD and IAD simulate a randomly varying location of the sound source in the azimuthal plane, as illustrated in Figure 1. The computer has been programmed to generate the two phase-shifted pulse trains and also two signals that control their attenuation, which is applied by external circuitry. In turn, the computer receives the judgment code as well as synchronization signals from a digital interface specifically designed for this purpose.
The experimental technique used throughout this work consists of binaural earphone presentations of two trains of low-pass filtered pulses (cut-off frequency at 1560 Hz) at a rate of 20 pulses per second and with a pulse width of 100 microseconds. Since the pulse width corresponds to 0.2% of the pulse repetition period, the spectrum of the filtered signal consists of 78 tones evenly distributed inside the low-pass filter bandwidth, from 20 to 1560 Hz, with less than 4% magnitude variation. Therefore, many hair cells along the human cochlea are homogeneously excited, unlike in the several procedures that apply a sinusoidal signal (thus, a single frequency). Moreover, natural stimuli with frequencies above 1560 Hz generate indistinguishable interaural time delays (ITD) (Stern et al., 2005).
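The figures quoted above (0.2% duty cycle, 78 tones, less than 4% magnitude variation) can be checked with a short calculation. The sketch below assumes ideal rectangular pulses and an ideal low-pass filter, neither of which is stated in the text; under that assumption the harmonic magnitudes follow a sin(x)/x envelope. All names are illustrative.

```python
import math

PULSE_RATE_HZ = 20.0     # pulses per second (from the text)
PULSE_WIDTH_S = 100e-6   # 100 microsecond pulse width (from the text)
CUTOFF_HZ = 1560.0       # low-pass cut-off frequency (from the text)


def harmonic_magnitude(k):
    """Magnitude of the k-th harmonic relative to the low-frequency limit.

    Assumption: ideal rectangular pulses, whose Fourier-series
    envelope is sin(x)/x with x = pi * k * rate * width.
    """
    x = math.pi * k * PULSE_RATE_HZ * PULSE_WIDTH_S
    return math.sin(x) / x


# Fraction of the repetition period occupied by one pulse
duty_cycle = PULSE_WIDTH_S * PULSE_RATE_HZ

# Harmonics of the 20 Hz repetition rate that fall at or below the cut-off
n_tones = int(round(CUTOFF_HZ / PULSE_RATE_HZ))

# Relative drop in magnitude between the lowest and the highest passed tone
droop = 1.0 - harmonic_magnitude(n_tones)

print(f"duty cycle: {duty_cycle:.1%}")
print(f"tones up to cut-off: {n_tones}")
print(f"magnitude variation below 4%: {droop < 0.04}")
```

Under these idealizations the harmonics are spaced 20 Hz apart, which is why the tone count equals the cut-off frequency divided by the pulse rate.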
In the context of this work, the ITD and IAD are artificially generated and not derived from a natural sound. A specific combination of time delay and amplitude difference between the pulse trains applied to the two ears is configured at random each time a stimulus is presented to the listener in an experimental session. This procedure makes it possible to relate each ITD-IAD pair precisely to the listener's decision. High-quality earphones have been used in the experiments to apply the binaural stimuli. As described in Nogueira et al. (2013a; 2013b), the performance characteristics of the earphones (KOSS ESP-9 electrostatic stereophones) are: frequency response range from 15 Hz to 15 kHz, ±2 dB; sensitivity of ±1 dB for 90 dB SPL (sound pressure level) at 1 kHz; 40 dB isolation from external noise; THD (total harmonic distortion) less than 2% at 110 dB SPL.
All the subjects to whom these audio signals are presented are volunteers who have been clinically evaluated before performing as listeners in this experiment. They have undergone rigorous audiometric and otorhinolaryngologic exams in a specialized hospital. Since the sensation levels have been set with respect to each listener's own threshold level, his/her absolute level of hearing is irrelevant to this experiment. The threshold level of each listener has been determined as part of these exams through empirical detection. By definition, the threshold of hearing is the sound pressure level (SPL) of 20 μPa (micropascal) and occurs between 2 and 5 kHz.
Figure 1. Experiment set-up for lateralization judgments of binaural acoustic stimuli with example of binaural pulses pattern and the respective virtual perception in subject's brain.
The time delay and the amplitude difference between the pulses are both randomized by the computer program, ranging from -350 to +350 μs (in steps of 100 μs) and from -4 to +4 dB (in steps of 1 dB), respectively. Each particular configuration, that is, each pair of random values of Interaural Time Delay (ITD) and Interaural Amplitude Difference (IAD), is presented until the subject decides whether the virtual acoustic image is located on the right or on the left side of his/her medial plane. The sensation level (SL) is set at 10, 20, 30 and 40 dBA in different sessions, at random.
In the physical set-up designed and implemented in this work to apply this technique, the subject is placed in an acoustic cabin that contains the earphones and the interface equipment. His/her judgments are expressed by pressing a switch on either the right or the left side with respect to his/her medial plane. Each judgment is encoded by an electronic interface as a pulse with two possible voltage levels. The time elapsed between the beginning of the pulse presentation and the subject's decision is measured and recorded. After a 5-second rest period, new pulse trains are presented to the subject with another configuration, that is, another pair of random values of ITD and IAD. This procedure is repeated 128 times in each session of the experiment, which takes approximately 30 minutes.
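The session structure described above can be sketched as a randomized schedule of stimulus configurations. The text does not say how the random values are drawn, so the sketch assumes independent, uniform draws with replacement from the stated ITD and IAD grids; the function and constant names are hypothetical and do not come from the original software.

```python
import random

ITD_VALUES_US = list(range(-350, 351, 100))  # -350, -250, ..., +350 microseconds
IAD_VALUES_DB = list(range(-4, 5))           # -4, -3, ..., +4 dB
TRIALS_PER_SESSION = 128                     # presentations per session (from the text)
REST_PERIOD_S = 5                            # rest between presentations (from the text)


def make_session(seed=None):
    """Return a list of (itd_us, iad_db) pairs for one session.

    Assumption: each presentation draws ITD and IAD independently and
    uniformly from their grids; the real sampling scheme is not stated.
    """
    rng = random.Random(seed)
    return [
        (rng.choice(ITD_VALUES_US), rng.choice(IAD_VALUES_DB))
        for _ in range(TRIALS_PER_SESSION)
    ]


session = make_session(seed=1)
print(len(ITD_VALUES_US), "ITD values x", len(IAD_VALUES_DB), "IAD values")
print("first three configurations:", session[:3])
```

With these grids there are 8 × 9 = 72 possible ITD-IAD combinations, so a 128-trial session necessarily repeats some of them.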
The present technique has been developed from the classical set-up widely used by psychoacoustic researchers, and its application is referred to as a "judgment of sideness experiment" (Békésy, 1960). The experimental set-up was built and the measurements were carried out in the Department of Bioengineering of Imperial College, London.

The statistical treatment of psychoacoustical data relating to ITD and IAD factors
The data acquisition system described in the previous Section has been extensively applied to obtain the psychoacoustic information processed in this Section. These data have been statistically analyzed and prediction models have been generated in order to estimate the lateralization judgment responses to acoustic binaural stimuli.
Since the listeners decide either for the right or for the left side at each pulse presentation, the psychoacoustic data obtained here are essentially binary. A very appropriate technique for analyzing this kind of response is Probit Analysis, with which it is possible to fit Probit Regression equations to the experimental data and to evaluate the discrepancy between the observations and the predictions obtained from the estimated parameters (Finney, 1977). The mathematical approach used in Probit Analysis is thoroughly described in Finney (1977). In order to control each step of the Probit technique and to implement adaptations specific to this application, a dedicated computational tool has been developed instead of directly applying generic, automatic tools such as SPSS or SAS.
As mentioned in the previous section, in the data acquisition system designed and implemented in this work the computer records the elapsed time and the decision (right or left) for each judgment made by the listener under test, for a total of 128 judgments per session at each sensation level.
The frequencies of the responses corresponding to "right side" or "left side" are computed for each combination of ITD and IAD, for each listener and for each value of sensation level. These frequencies are the inputs of the computational tool thus developed to perform the statistical treatment of psychoacoustic data according to Probit Analysis and through which the parameters of the Probit Regression models have been calculated.
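The tallying step described above can be sketched as follows. The record layout (one tuple per judgment, with the side coded as "R" or "L") is an assumption made for illustration; the text only states that the decision and the elapsed time are recorded.

```python
from collections import defaultdict


def tally_right_responses(judgments):
    """Count "right side" responses per (ITD, IAD) combination.

    judgments: iterable of (itd_us, iad_db, side), side being "R" or "L"
    (hypothetical coding). Returns {(itd_us, iad_db): (n_right, n_total)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for itd_us, iad_db, side in judgments:
        cell = counts[(itd_us, iad_db)]
        cell[1] += 1          # one more judgment for this combination
        if side == "R":
            cell[0] += 1      # one more "right side" response
    return {key: tuple(value) for key, value in counts.items()}


# Made-up judgments for one listener at one sensation level
example = [(150, -2, "R"), (150, -2, "R"), (150, -2, "L"), (-250, 3, "L")]
table = tally_right_responses(example)
print(table)
```

One such table would be built for each listener and each sensation level before the regression is fitted.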
The models referred to here are the Probit Plane (two-factor model), expressed by
$$Y = a + b_1\,\mathrm{ITD} + b_2\,\mathrm{IAD}$$
and the Probit Surface (three-factor model), expressed by
$$Y = a + b_1\,\mathrm{ITD} + b_2\,\mathrm{IAD} + b_3\,\mathrm{ITD}\cdot\mathrm{IAD}$$
where $Y$ is the expected Probit and $a$, $b_1$, $b_2$ and $b_3$ are the regression coefficients. It should be noted that $a$, $b_1$ and $b_2$ differ from one model to the other. The two-factor model allows studying and predicting the simultaneous effects that ITD and IAD produce on the psychoacoustic perception of listeners for several values of sensation level. The three-factor model, in turn, allows investigating the influence of ITD, of IAD and of the interaction between these two factors.
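A minimal sketch of how such coefficients can be estimated is given below. It is not the authors' tool. It assumes the models are linear in ITD and IAD, with an added ITD·IAD product term in the three-factor case, as the terms named in the Results suggest. It also assumes the classical convention probit = normal deviate + 5, and it uses a standard iteratively reweighted least-squares fit in place of whatever adaptations the authors implemented. The data are synthetic and noise-free, and every name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
PROBIT_OFFSET = 5.0  # assumption: classical convention, probit = normal deviate + 5


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_probit(cells, three_factor=False, iterations=50, tol=1e-10):
    """Fit Y = a + b1*ITD + b2*IAD (+ b3*ITD*IAD) by weighted least squares.

    cells: list of (itd, iad, n_right, n_total).
    Returns [a, b1, b2] or [a, b1, b2, b3], intercept on the probit scale.
    """
    X = [[1.0, itd, iad] + ([itd * iad] if three_factor else [])
         for itd, iad, _, _ in cells]
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        XtWX = [[0.0] * k for _ in range(k)]
        XtWz = [0.0] * k
        for x, (_, _, r, n) in zip(X, cells):
            eta = sum(bj * xj for bj, xj in zip(beta, x))
            p = min(max(STD_NORMAL.cdf(eta), 1e-9), 1 - 1e-9)
            phi = max(STD_NORMAL.pdf(eta), 1e-12)
            w = n * phi * phi / (p * (1 - p))   # regression weight of the cell
            z = eta + (r / n - p) / phi         # working response
            for i in range(k):
                XtWz[i] += w * x[i] * z
                for j in range(k):
                    XtWX[i][j] += w * x[i] * x[j]
        new_beta = solve(XtWX, XtWz)
        change = max(abs(u - v) for u, v in zip(new_beta, beta))
        beta = new_beta
        if change < tol:
            break
    beta[0] += PROBIT_OFFSET
    return beta


# Synthetic, noise-free demonstration data (NOT experimental results)
TRUE_A, TRUE_B1, TRUE_B2 = 5.1, 0.004, -0.15
cells = []
for itd in range(-350, 351, 100):        # microseconds
    for iad in range(-4, 5):             # dB
        p = STD_NORMAL.cdf(TRUE_A - PROBIT_OFFSET + TRUE_B1 * itd + TRUE_B2 * iad)
        cells.append((itd, iad, 10 * p, 10))   # expected (fractional) counts out of 10

plane = fit_probit(cells)                        # [a, b1, b2]
surface = fit_probit(cells, three_factor=True)   # [a, b1, b2, b3]
print("two-factor:", [round(c, 4) for c in plane])
print("three-factor:", [round(c, 4) for c in surface])
```

Because the synthetic data contain no interaction, the fitted product coefficient should come out close to zero. With real judgments, the counts per cell would be the tallied "right side" frequencies for one listener at one sensation level.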
The outputs of the computational tool are: the combinations of ITD and IAD applied; the total number of judgments made; the number of observed "right side" responses; the expected probits; the number of expected "right side" judgments; the partial chi-square results; the kind of model used (two- or three-factor); the regression coefficients; the total chi-square value $\chi^2$ and its number of degrees of freedom (NFD); the variances of $b_i$ ($C_{ii}$) and the covariances of $b_i$ and $b_j$ ($C_{ij}$).
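The goodness-of-fit quantities in this list can be illustrated with a short sketch. It assumes the usual chi-square for binomial counts, (observed − expected)² / (n·P·(1 − P)) per cell, with degrees of freedom equal to the number of cells minus the number of fitted coefficients, and the probit = normal deviate + 5 convention. Whether the authors' tool follows exactly these conventions is not stated. Names and numbers are illustrative.

```python
from statistics import NormalDist


def chi_square(cells, coeffs, probit_offset=5.0):
    """Return (total chi-square, degrees of freedom, partial chi-squares).

    cells: list of (itd, iad, n_right, n_total).
    coeffs: [a, b1, b2] (two-factor) or [a, b1, b2, b3] (three-factor).
    probit_offset: assumed convention, probit = normal deviate + offset.
    """
    partials = []
    for itd, iad, n_right, n_total in cells:
        y = coeffs[0] + coeffs[1] * itd + coeffs[2] * iad
        if len(coeffs) == 4:
            y += coeffs[3] * itd * iad           # interaction term
        p = NormalDist().cdf(y - probit_offset)  # expected proportion "right"
        expected_right = n_total * p
        partials.append((n_right - expected_right) ** 2
                        / (n_total * p * (1 - p)))
    nfd = len(cells) - len(coeffs)
    return sum(partials), nfd, partials


# Made-up counts and a flat model (Y = 5 everywhere, so P = 0.5)
demo_cells = [(50, 0, 3, 4), (150, 1, 2, 4), (-150, 2, 1, 4), (-50, -1, 2, 4)]
total, nfd, partials = chi_square(demo_cells, [5.0, 0.0, 0.0])
print(total, nfd, partials)
```

In the demonstration every cell expects 2 "right" responses out of 4, so only the cells with 3 or 1 observed responses contribute to the total.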

Results
The technique previously described has been applied to experimental data concerning the six subjects under test, for sensation levels of 20, 30 and 40 dB. Table 1 exhibits the model coefficients and other statistical quantities calculated through the Probit Analysis. Figure 2 shows the planes and surfaces obtained by applying the coefficients of Table 1, for SL = 20 dBA and SL = 40 dBA, for subject AN. Similar planes and surfaces have been obtained for all other subjects and all values of SL. These plots present a positive gradient with increasing ITD and decreasing IAD, as expected, since more positive delays and more negative amplitude differences lead to a greater frequency of decisions for the right side, according to the definitions of ITD and IAD illustrated in Figure 1.
To better understand the psychoacoustic behavior described by these models, the individual contributions of the terms $b_1\cdot\mathrm{ITD}$ and $b_2\cdot\mathrm{IAD}$ from the two-factor model of listener AN are depicted in Figure 3 for SL = 20 dBA and SL = 40 dBA. Figure 4 shows the individual contributions of the terms $b_1\cdot\mathrm{ITD}$, $b_2\cdot\mathrm{IAD}$ and $b_3\cdot\mathrm{ITD}\cdot\mathrm{IAD}$ from the three-factor model of the same listener. The corresponding contributions are shown in Figures 5 and 6 for the case of listener ZB. Figures 3 to 6 show that the influence of ITD on the response is greater than the influence of IAD, in both the two-factor and the three-factor models, for both ranges of random variation. Although not shown, this observation may be generalized to the other listeners. Figures 4 and 6 also show that the contribution of the term proportional to the product of ITD and IAD in the three-factor model may be either negligible, as in the case of listener AN, or more important than the contribution of $b_2\cdot\mathrm{IAD}$, as in the case of listener ZB. Therefore, it is not possible to predict in general terms which model, two-factor or three-factor, represents the decision pattern of a particular listener with more precision or reliability.
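One simple way to compare the weight of each term is to evaluate its largest possible magnitude over the stimulus ranges used in the experiment (±350 μs and ±4 dB). The sketch below does this for hypothetical coefficients. The values are not taken from Table 1 and serve only to show the arithmetic.

```python
ITD_MAX_US = 350   # largest |ITD| presented, in microseconds (from the text)
IAD_MAX_DB = 4     # largest |IAD| presented, in dB (from the text)


def max_contributions(b1, b2, b3=0.0):
    """Largest absolute contribution of each term to the probit Y."""
    return {
        "b1*ITD": abs(b1) * ITD_MAX_US,
        "b2*IAD": abs(b2) * IAD_MAX_DB,
        "b3*ITD*IAD": abs(b3) * ITD_MAX_US * IAD_MAX_DB,
    }


# Hypothetical coefficients, NOT the values reported in Table 1
demo = max_contributions(b1=0.004, b2=-0.15, b3=0.0002)
for term, size in demo.items():
    print(f"{term}: up to {size:.2f} probit units")
```

For these illustrative numbers the ITD term dominates, the IAD term is smaller, and the interaction term is the smallest, which mirrors the pattern described for listener AN. A listener with a larger $b_3$ could reverse the last two.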
Figure 7 presents, in graphical form, the difference between the models obtained for SL = 30 dBA and SL = 20 dBA and the difference between the models obtained for SL = 40 dBA and SL = 20 dBA, for both the two-factor and the three-factor models, for the case of listener AN. Figure 7 indicates that the Probit gradient with respect to ITD increases as the sensation level increases. Such behavior is expected for all listeners, since Table 1 shows that the coefficient $b_1$ increases with SL in both models. On the other hand, with a few exceptions, the coefficient $b_2$, which is strongly related to the gradient of the Probit with respect to IAD, varies very little and inconsistently with SL. Therefore, we may infer that the sensation level affects significantly and systematically only the ITD contribution to the perception of the acoustic source in lateralization judgments.
The differences between the three-factor and the two-factor models, for the three sensation levels considered, are shown in graphical form in Figure 8 for listeners AN and ZB. Since the coefficients $b_1$ and $b_2$ of the two-factor model are very similar to their counterparts in the three-factor model, as can be seen in Table 1, the differences between the two-factor and three-factor models depicted in Figure 8 approximate the contribution of the term $b_3\cdot\mathrm{ITD}\cdot\mathrm{IAD}$. Therefore, these surfaces approximate hyperbolic paraboloids that carry a small weight in the whole Probit and whose curvatures differ considerably from one listener to another.
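The reasoning in this paragraph can be written out explicitly. The lines below assume that both models are linear in ITD and IAD and that the three-factor model adds only the product term. The primes marking the three-factor coefficients are notation introduced here for illustration, and the closeness of the two intercepts is an additional assumption that the text does not state.

```latex
\begin{aligned}
Y_{3} - Y_{2}
  &= (a' - a) + (b_1' - b_1)\,\mathrm{ITD} + (b_2' - b_2)\,\mathrm{IAD}
     + b_3\,\mathrm{ITD}\cdot\mathrm{IAD} \\
  &\approx b_3\,\mathrm{ITD}\cdot\mathrm{IAD}
  \qquad \text{when } a' \approx a,\quad b_1' \approx b_1,\quad b_2' \approx b_2 .
\end{aligned}
```

A surface of the form $z = c\,x\,y$ is a hyperbolic paraboloid (a saddle) that vanishes along the axes ITD = 0 and IAD = 0, and its curvature is set by $b_3$ alone. This is consistent with the listener-dependent curvatures noted above.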

Discussion
The models determined here through the Probit technique, for each listener and for a few values of sensation level, provide insight into the psychoacoustic behavior of human listeners in lateralization judgments of binaural stimuli. Some behavioral aspects can be observed from the regression surfaces and planes.
The approach used in this work is appropriate for the analysis of the auditory system binary response in the case of lateralization judgments. Moreover, this approach offers a simple way of obtaining psychophysical responses that can be related to neurophysiologic phenomena, thus being an alternative to access neuronal information by means of psychoacoustic experiments instead of applying invasive methods. The neuronal process exhibited by a subject is a matter of interest for clinical medicine and neurology.
Although limited to the lateralization problem, this work may be an initial step towards the investigation of other problems concerning the localization of sound sources: in other planes and more directions, with multiple sound sources, and in reverberant environments. Such research may benefit applications such as: noninvasive diagnostics of the regions of the central auditory system involved in binaural processing; development and improvement of electronic devices for locating vehicles moving in air or water; implementation of binaural processing in hearing implants; replication in robots of the human ability to localize sound sources.
Some of these issues and applications have been addressed in the last ten years, such as: modeling the localization of sounds in the azimuthal half-plane (Raspaud et al., 2010; Willert et al., 2006); sound localization in the presence of noise and reverberation (Devore et al., 2009; May et al., 2012; Woodruff and Wang, 2012); and robotic sound source localization (Keyrouz, 2014; Liu and Meng, 2008). However, most approaches to date are based on experimental cochleagrams or head-related transfer functions (HRTF), whose determination requires an anechoic chamber and the placement of microphones in the listener's ears, besides other procedural constraints. Moreover, the psychological aspects involved in the listener's judgments are not taken into account in these measurements.
The method described in Nogueira et al. (2013a; 2013b) and in the present work for acquiring and statistically processing experimental data on lateralization judgments of binaural stimuli is simple, reliable and noninvasive, and it takes psychological influences into account. This method makes it possible to obtain models that relate the chief cues of the human ability to localize sound, ITD and IAD, to the decision probabilities of a particular listener.