Development and calibration of cross-sensory fusion
Much recent evidence suggests that humans integrate information between the senses in a statistically optimal manner, maximizing the precision of performance. However, this capacity is not present at birth but develops only after about 8 years of age. In younger children one sense dominates over the other: for size discrimination touch dominates over vision, whereas for orientation discrimination vision dominates. We suggest that the dominance of one or the other sense reflects cross-modal calibration of developing systems, whereby one sense calibrates the other rather than fusing with it. Unlike sensory fusion, it is the more robust and accurate sense that does the calibration, even if it is the less precise one. Two lines of evidence support this idea: congenitally blind children show a selective impairment in haptic orientation discrimination, and dyskinetic children (with highly impaired movement control) show a selective deficit in visual size judgments. Both impairments could result from a lack of cross-sensory calibration in early development.
In many everyday situations, our senses are bombarded by numerous different unisensory signals at any given time. In order to gain the most veridical, and least variable, estimate of environmental stimuli/properties, we need to combine the individual noisy unisensory perceptual estimates that refer to the same object, while keeping those estimates that belong to different objects or events separate. How, though, does the brain 'know' which stimuli to combine? Traditionally, researchers interested in the crossmodal binding problem have focused on the role that spatial and temporal factors play in modulating multisensory integration. However, crossmodal correspondences between various unisensory features (such as between auditory pitch and visual size) may provide yet another important means of constraining the crossmodal binding problem. A large body of research now shows that people exhibit consistent crossmodal correspondences between many stimulus features in different sensory modalities. For example, people consistently match high-pitched sounds with small, bright objects that are located high up in space. The literature reviewed here supports the view that crossmodal correspondences need to be considered, alongside semantic and spatiotemporal congruency, as among the key constraints that help our brains solve the crossmodal binding problem.
Massimiliano Di Luca
Recalibration of simultaneity
Prolonged exposure to asynchronous audiovisual stimuli changes perceived simultaneity as the brain minimizes the temporal discrepancy. It has been proposed that recalibration is obtained by tuning the mechanisms dedicated to the detection of asynchrony. On the other hand, there is increasing experimental evidence indicating that other mechanisms are also in place. For example, data indicate that asynchronous exposure changes perceived simultaneity in stimulus pairs containing only one of the exposed stimuli (the recalibration effect transfers). Reaction times to single stimuli are also affected, which suggests an influence on unisensory processing. Moreover, changes in reaction time are associated with a modification in the detectability of the stimuli, indicating specific influences on the neural response to the stimuli. We also find that reactions to stimuli presented in pairs change after exposure in a way that implies a mutual influence of uni- and bi-modal neural circuits. Such behavioral results are complemented by neural recordings and are interpreted as the expression of an adjustment in the processing of individual stimuli.
Integrated processing of auditory and visual data for scene analysis in complex environments
Detection, identification and tracking of objects operating in complex environments require the integration of auditory and visual percepts. Integrating percepts makes it possible to exploit the information provided by two complementary modalities. I will demonstrate how directional information about auditory events greatly enhances the ability of a technical system to localize and separate semantic objects, a goal that is hard to achieve by analyzing visual or auditory data alone, or by exploiting only the temporal correlation of percepts. This capability opens new opportunities for designing cognitive systems that understand where they are and what is in their environment.
Sub-optimal visuo-haptic integration in softness perception
Softness perception intrinsically relies on haptic information. However, everyday-life experience teaches us correspondence rules between felt softness and the visual effects of the exploratory movements that are executed to feel softness. In two experiments, we studied whether and how the brain integrates visual and haptic information while estimating the softness of deformable objects (rubber specimens varying in compliance). In Exp. 1, participants estimated the magnitude of perceived softness for a broad set of stimuli. Participants explored the stimuli without and with vision of their movements (haptic and visuo-haptic conditions), or they watched how another person explored the stimuli (visual condition). Exp. 2 used the method of constant stimuli combined with a 2IFC task in order to precisely assess the judgments and their reliabilities for a harder and a softer standard stimulus. Here, visual information was mimicked by computer graphics (finger movement, deformation) on a visual display; input in visual conditions was taken from previous visuo-haptic trials. Natural (Exp. 1) or artificial (Exp. 2) discrepancies between the information in the two senses allowed us to measure each sense's contribution to visuo-haptic judgments. Results from both experiments suggest that participants can partially infer softness from indirect visual information alone, and that such visual information contributes to visuo-haptic softness judgments to a considerable extent (30-50%). The visual contribution was higher than predicted from models of optimal integration (MLE model: senses are weighted according to their relative reliabilities) and, also unlike predicted, did not differ between hard and soft stimuli. Consistently, visuo-haptic judgments were less reliable than predicted from optimal integration. We conclude that the integration of visual and haptic softness judgments is biased towards vision rather than optimal, and may even be guided by a fixed weighting scheme.
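As a numerical illustration of the comparison made above, the following sketch derives the MLE-predicted visual weight and combined variance from unimodal thresholds and contrasts them with an observed weight; all numbers are invented for illustration and are not the study's data.

```python
import numpy as np

# Hypothetical unimodal discrimination thresholds (illustrative, not the study's data).
sigma_visual = 0.9   # visual softness estimate, arbitrary units
sigma_haptic = 0.5   # haptic softness estimate, arbitrary units

# MLE prediction: weights are proportional to reliabilities (inverse variances).
r_v, r_h = 1 / sigma_visual**2, 1 / sigma_haptic**2
w_visual_mle = r_v / (r_v + r_h)
sigma_combined_mle = np.sqrt(1 / (r_v + r_h))

# Hypothetical empirically observed visual weight (e.g., a 40% visual contribution).
w_visual_observed = 0.4

print(f"MLE-predicted visual weight:  {w_visual_mle:.2f}")
print(f"Observed visual weight:       {w_visual_observed:.2f}")
print(f"MLE-predicted combined sigma: {sigma_combined_mle:.2f} "
      f"(vs. best unimodal {min(sigma_visual, sigma_haptic):.2f})")
```

An observed weight well above the MLE prediction, together with a combined variance larger than the predicted one, is the kind of pattern the abstract describes as a vision-biased, possibly fixed, weighting scheme.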
Markus Rank & Sandra Hirche
Performance-optimized haptic telepresence over time-delayed and lossy channels
Haptic telepresence systems enable a human operator to interact with remote environments. Typically, the haptic signal exchange between the operator side and the remote side is performed over a shared communication network, which induces time delay and packet loss. It is well known that human teleoperation task performance decreases as communication uncertainty increases. With today's communication protocols, communication quality can be adjusted within certain limits. Here we explore a novel concept of performance-optimized teleoperation through Quality-of-Service adaptation. To this end, we investigate human perception and behavior in teleoperation tasks over time-delayed and lossy communication channels. From human studies we derive an operator task performance model depending on the communication parameters, revealing that task completion time increases significantly with increasing time delay and packet loss rate, and that accuracy is further influenced by time delay. The results can be partially explained by a decrease in haptic transparency caused by time delay and by the operator's dynamic manipulation abilities. To overcome the observed performance decrease, we propose an optimal Quality-of-Service (QoS) control algorithm that allows us to actively adjust the channel time delay. We find that QoS control is capable of improving task accuracy. However, variance in time delay is found to be negatively correlated with completion time, limiting the applicable bandwidth of QoS control, which must be chosen in agreement with human adaptation characteristics.
Cooperative control with application to gaze tracking
We study the problem of natural human eye and head movement from the point of view of classical mechanics. It turns out that there are underlying principles behind how humans rotate their eyes and turn their heads, introduced in the mid-nineteenth century by Listing, Donders and Helmholtz. Although rotations can be parameterized as points in SO(3), humans do not use the full group SO(3) but only a 2-dimensional submanifold of SO(3) when actuating both the eyes and the head. We parameterize this submanifold and describe a Riemannian metric on this space derived from the natural metric on SO(3). Subsequently we pose an optimal control problem and compare optimal trajectories with recorded eye and head movement data. In conclusion, a connection between 'human head movement' and a 'generalized gimbal' is outlined.
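To make the restriction concrete: under Listing's law the rotation axis stays within a fixed head-centred plane, so eye orientations form a two-parameter family inside SO(3). The sketch below constructs such rotations; the quaternion convention and the choice of plane are illustrative assumptions, not the authors' parameterization.

```python
import numpy as np

def listing_rotation(theta, phi):
    """Unit quaternion (w, x, y, z) for an eye orientation obeying Listing's law.

    The rotation axis lies in Listing's plane (here assumed to be the y-z plane,
    i.e. no torsional x-component), so the orientation is specified by only two
    parameters: the rotation angle theta and the axis direction phi in the plane.
    """
    axis = np.array([0.0, np.cos(phi), np.sin(phi)])  # axis confined to the plane
    return np.concatenate(([np.cos(theta / 2)], np.sin(theta / 2) * axis))

def quat_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix in SO(3)."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

# A 20 deg gaze rotation about an axis tilted 30 deg within Listing's plane.
R = quat_to_matrix(listing_rotation(np.deg2rad(20), np.deg2rad(30)))
print(np.round(R @ R.T, 6))  # identity: R is a valid element of SO(3)
```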
Stefan Glasauer & Frederike Petztschner
Beyond Sensory Experience: Learning from the Past
Sensory input often is unreliable, for example, due to noise or changing environmental conditions. In such situations, it is advantageous to not solely rely on current sensory data but include past experience in the perceptual estimation process. Here we show experimentally how humans incorporate prior knowledge in estimating displacement during blindfolded locomotion and while moving through virtual environments. A two-stage iterative Bayesian estimation process incorporating knowledge about previous behaviour is able to explain the systematic errors in both tasks. The model, which provides a direct link between Weber-Fechner and Stevens’ power law, may provide a unified explanation for commonly seen effects in psychophysical magnitude estimation studies, such as the range effect, hysteresis, and the regression effect.
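A one-dimensional caricature of such an iterative Bayesian estimator (deliberately much simpler than the two-stage model of the abstract; all parameter values are invented): measurements live on a logarithmic scale, are fused with a prior that is updated from trial to trial, and are then mapped back to linear units, which produces the regression effect.

```python
import numpy as np

rng = np.random.default_rng(0)

sigma_meas = 0.15          # measurement noise on the log scale (Weber-like)
sigma_prior = 0.25         # width of the prior over log displacement
prior_mean = np.log(5.0)   # initial prior guess; updated iteratively below

estimates = []
stimuli = rng.uniform(2.0, 10.0, size=200)   # displacements to be reproduced
for d in stimuli:
    m = np.log(d) + rng.normal(0.0, sigma_meas)        # noisy log measurement
    w = sigma_prior**2 / (sigma_prior**2 + sigma_meas**2)
    post_mean = w * m + (1 - w) * prior_mean           # Bayesian fusion in log space
    estimates.append(np.exp(post_mean))                # back to linear units
    prior_mean = post_mean                             # iterative prior update

# Regression effect: small displacements are overestimated, large ones underestimated.
estimates = np.array(estimates)
small, large = stimuli < 4.0, stimuli > 8.0
print("mean estimate/stimulus ratio, small:", np.mean(estimates[small] / stimuli[small]))
print("mean estimate/stimulus ratio, large:", np.mean(estimates[large] / stimuli[large]))
```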
Mark W. Greenlee, Tina Plank, Anton Beer
Multisensory integration: Evidence from structural and functional MRI
Multisensory integration assists us in identifying objects by providing multiple cues regarding object category and spatial location. A semantic audiovisual object-matching task was used to determine the effect of spatial congruency on response behavior and fMRI brain activation. Participants (n = 15) indicated in a 4-AFC paradigm the spatial location of the object that best matched the sound presented (Plank et al., Human Brain Mapping, 2011). Realistic sounds based on head-related transfer functions were presented binaurally, with the simulated sound source corresponding to one of the four quadrants. The sound was randomly located in the quadrant that contained the matching object or in another location. Spatial congruency had a significant benefit for reaction times and hit rate. Several areas in visual, auditory and frontal cortex were significantly activated by this task. Only a small region in the right middle and superior temporal gyrus (BA 21/22) was more activated when the auditory sound sources were spatially congruent with the semantically matching visual stimulus (differential contrast 'spatially congruent > incongruent'). In an independent sample (n = 10; Beer et al., Exp. Brain Res., submitted), probabilistic tractography of diffusion-tensor imaging suggests the existence of neuroanatomical connections between Heschl's gyrus and regions in the STS, the supramarginal gyrus, the intraparietal sulcus and the occipital cortex. These anatomical connections could provide the basis for spatial congruency effects in audiovisual integration.
Attention and movement plans
It is now well established that the planning of eye movements leads to a shift of covert spatial attention to the eye movement goal (Deubel & Schneider, 1996). This has been observed so consistently that it is frequently assumed that the saccadic system is directly involved in covert attention shifts, even when we do not move the eyes (Rizzolatti et al., 1987). In the presented studies we investigated the relationship between covert spatial attention and eye and hand movement planning. Using a probe discrimination task, we measured attention while participants attended to spatial locations, made eye movements, made hand movements, or both attended to a spatial location and made an eye or a hand movement. We observed that attentional resources were withdrawn from the attended location if a saccade or reach was made to a different location. This suggests that covert spatial attention relies on action planning systems and supports the view that covert spatial attention might be a function of action systems.
Multisensory interactions in human primary cortices: synthesis and controversies
This talk focuses on whether or not multisensory interactions in humans involve primary cortices and, if so, when such interactions occur and for what functional purpose(s). To examine this issue, the major focus is placed on simple, rudimentary stimuli (e.g. flashes/checkerboards and tones/noises), principally because such stimuli have been used in studies employing different methods and in different species. This allows for comparative, translational research. Likewise, rudimentary stimuli are a reasonable starting point for addressing controversies in how to identify and qualitatively describe multisensory phenomena. Properties of rudimentary stimuli can also be parametrically varied to render them physically (and therefore perceptually and/or behaviorally) more complex and ethologically more valid. Throughout the talk, investigations of auditory-visual interactions employing psychophysics, fMRI, ERPs, and TMS will be reviewed. One emphasis will be on describing some of the controversies surrounding their use in multisensory research as well as on novel methodological and analytical tactics for using these various techniques in multisensory research. The core findings and the detailed evidence supporting them are then described. An effort is made to describe the putative neurophysiological bases for the phenomena observed with various brain imaging methods. Finally, findings across these diverse methodologies are synthesized to draw some general conclusions regarding multisensory interactions in humans and to lay out some directions for future research.
Leo van Hemmen
Neuronal Object Formation, Representation, and Multimodal Integration
In general, vertebrates are multimodal beings. That is, they have different modalities, such as hearing, seeing, and haptics, a triad that dominates our own lives; but the lateral line of fish and of aquatic frogs such as Xenopus, and the infrared vision of snakes, also belong to this sensory spectrum. The great challenge is to explain how different modalities, which exploit completely different kinds of physics and therefore need to handle the input sensations completely differently, generate neuronal representations, which we usually call 'maps'. The key notion dangling in the background is that of 'object'. A specific sensation creates a specific kind of object. Whereas in, say, vision we can characterize objects rather well, 'auditory objects' still provoke debate at best. In spite of that, each of the modalities generates a map stemming from a different kind of physics, and in the optic tectum or superior colliculus (in mammals) all these maps are united before further action is taken. In this talk, we will focus on two questions. First, what is the leading principle, if any, of this grand unification? Second, what do the underlying neuronal mechanisms look like?
Intersensory interplay in the human brain
Traditional theories of multisensory integration highlighted the functions of multisensory convergence zones in the cortex and the superior colliculi. More recently, several studies have provided converging evidence in animals and humans that 'sensory-specific' and even primary sensory cortices can be modulated by multisensory inputs as well. In my talk, modulations within subcortical structures, in particular the human thalamus, will be presented and discussed. First, I will present evidence from human fMRI studies that modulations within sensory-specific thalamic structures are related to behavioral performance in audiovisual and audiotactile situations.
Moreover, the spatial and temporal alignment of multisensory stimuli may determine their perceptual fate of integration or segregation. Nonetheless, this automatic integration may interact with top-down task sets, for instance attending to spatial or temporal stimulus properties. Results from human fMRI studies contrasting spatial vs. temporal attention to identical audiovisual stimuli suggest that non-specific nuclei in the central and posterior thalamus may play a crucial role here. Together, these observations suggest that nuclei within the thalamus represent central processing hubs within a broader neural network that is engaged in multisensory integration processes and in the interplay of integration processes with endogenous control.
Plasticity of multisensory functions
The sensory deprivation approach has been used to investigate the plasticity of sensory functions. Studies in congenitally blind humans have provided evidence, on the one hand, for crossmodal plasticity and, on the other hand, for reduced crossmodal interactions of the intact senses, suggesting a genuine role of vision in the development of some multisensory processes. Whether the development of multisensory functions is linked to sensitive or even critical periods can only be investigated in individuals with congenital sensory loss whose sensory capabilities were later restored. Patients born with congenital cataracts that were removed after the age of about half a year show reduced or totally absent audio-visual interactions in several tasks, suggesting a critical role of early visual and/or multisensory input for multisensory development. We speculate that crossmodal plasticity and the lack of multisensory recovery might be linked.
Causal Inference, decision-making, and learning in multisensory perception
What are the principles that govern crossmodal interactions? Comparing human observers' multisensory perception with that of a Bayesian observer, we have found that humans' multisensory perception is consistent with Bayesian inference, both in determining when to combine crossmodal information and in how to combine it. The former problem is a type of causal inference. Causal inference, which has largely been studied in the context of cognitive reasoning, is in fact a critical problem in perception. Our Bayesian causal inference model accounts for a wide range of phenomena, including two important auditory-visual illusions, as well as counter-intuitive phenomena such as partial integration and negative gain. This model allows us to computationally characterize the different components of the perceptual process, such as sensory representations, prior expectations, and decision strategies, for individual observers. I will discuss the relationship between sensory representations and prior expectations, as well as our surprising findings regarding the decision-making strategy used by the perceptual system. Altogether, these results suggest that the human brain tries to find sensory signals that are caused by the same source and integrates them in a fashion that minimizes perceptual error in some situations and likely maximizes learning in others.
Crossmodal interactions not only affect perception when signals from multiple modalities are present, but also influence subsequent unisensory processing. We have discovered that training with congruent auditory-visual stimuli can accelerate the rate and increase the magnitude of visual learning, improving performance even in the absence of sound. These results suggest that there are mechanisms of learning that are tuned to processing multisensory information, and that training in unisensory environments may engage mechanisms of learning that are suboptimal. The influence of multisensory processing on subsequent unisensory processing can occur extremely rapidly. I will present recent data showing that the auditory representation of space can be altered after a single exposure to discrepant auditory-visual stimuli lasting only a few milliseconds. These findings suggest an impressive degree of plasticity in a basic perceptual map induced by a crossmodal error signal. Therefore, it appears that modification of sensory maps does not necessarily require the accumulation of a substantial amount of evidence of error to be triggered, and is continuously operational. I will discuss functional advantages of this scheme of sensory recalibration. Altogether, these findings suggest that crossmodal interactions influence both multisensory and unisensory perception in a robust and rapid fashion.
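A generic sketch of the Bayesian causal inference computation described above, in the spirit of Körding et al. (2007): the observer infers the posterior probability that two signals share a common cause and model-averages the integrated and segregated estimates. The spatial-localization framing and all numerical values are illustrative assumptions, not the parameters of the work presented in the talk.

```python
import numpy as np

def causal_inference_estimate(x_a, x_v, sigma_a=4.0, sigma_v=1.0,
                              sigma_prior=15.0, p_common=0.5):
    """Model-averaged auditory location estimate under Bayesian causal inference.

    x_a, x_v : noisy auditory and visual measurements (deg).
    Returns (posterior probability of a common cause, auditory estimate).
    """
    # Likelihood of the measurements under one common cause (C=1) vs. two causes (C=2),
    # assuming a zero-mean Gaussian prior over source locations.
    var_sum = (sigma_a**2 * sigma_v**2 + sigma_a**2 * sigma_prior**2
               + sigma_v**2 * sigma_prior**2)
    like_c1 = np.exp(-((x_a - x_v)**2 * sigma_prior**2
                       + x_a**2 * sigma_v**2 + x_v**2 * sigma_a**2)
                     / (2 * var_sum)) / (2 * np.pi * np.sqrt(var_sum))
    like_c2 = np.exp(-x_a**2 / (2 * (sigma_a**2 + sigma_prior**2))
                     - x_v**2 / (2 * (sigma_v**2 + sigma_prior**2))) \
        / (2 * np.pi * np.sqrt((sigma_a**2 + sigma_prior**2) * (sigma_v**2 + sigma_prior**2)))
    p_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))

    # Optimal estimates under each causal structure: reliability-weighted fusion vs. segregation.
    s_hat_c1 = (x_a / sigma_a**2 + x_v / sigma_v**2) / (1/sigma_a**2 + 1/sigma_v**2 + 1/sigma_prior**2)
    s_hat_c2 = (x_a / sigma_a**2) / (1/sigma_a**2 + 1/sigma_prior**2)

    # Model averaging: weight the two estimates by the posterior over causal structures.
    return p_c1, p_c1 * s_hat_c1 + (1 - p_c1) * s_hat_c2

print(causal_inference_estimate(x_a=8.0, x_v=5.0))    # small conflict: mostly integrated
print(causal_inference_estimate(x_a=20.0, x_v=-5.0))  # large conflict: mostly segregated
```

Partial integration falls out of the model averaging step: whenever the posterior probability of a common cause is neither 0 nor 1, the final estimate lies between the fused and the segregated solutions.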
Zhuanghua Shi & Hermann J. Müller
The influence of crossmodal emotional processes on duration estimation
Previous studies have shown that emotional stimuli can influence time perception within the same modality. However, whether and how perceiving emotional stimuli in one modality affects the perceived timing of a stimulus in another modality is still controversial. The talk will focus on our recent work on this issue. Using the time bisection method, we compared the influence of different types of emotional pictures (such as attack, mutilation, and neutral) on subsequent tactile duration estimation. We found an overestimation of short tactile durations (400-800 ms) following the presentation of threatening (attack) pictures, but not of high-arousal mutilation pictures. Follow-up experiments revealed that perceiving short tactile durations was more strongly influenced by preceding threatening pictures than perceiving long tactile durations. These findings show that visual emotional stimuli can modulate subsequent tactile time perception; however, the influence depends on the type of emotional stimulus and the range of durations. We explain our results within the context of the internal clock model as well as crossmodal attention-sharing mechanisms.
Multiple Pathways for Multisensory Integration
The beneficial consequences of multisensory integration are by now well known. However, the nature of such interactions remains under intense debate. Behavioural manifestations of multisensory interactions can originate at different levels within the information processing hierarchy, from very early sensory to late response stages. Often, the more specific the multisensory effect, the more likely it is to be supported at sensory levels, whereas multisensory interactions with little specificity to spatial location, spectral content, or semantic congruence reflect the contribution of higher-order processes. Here I present data from studies addressing the question of specificity during multisensory enhancement. In a first study, we showed that the contrast threshold for vision can be improved by a concurrent sound, but only under presentation parameters that favour visual processing within the magnocellular pathway. A second study extends this finding with a psychophysical analysis of reaction time distributions to visual targets presented with and without accessory sounds. We distinguish two sources of sound-induced enhancement: a response enhancement across all spatial frequencies, and a sensory-specific enhancement selective for low visual spatial frequencies. Finally, in a third study we address the contributions of different visual pathways to the perception of biologically relevant audiovisual stimuli (point-light talking faces). In this study, we test the visually based enhancement of speech perception in noise under conditions that favour configural visual processing (supported by the ventral pathway) versus global motion (supported by the dorsal pathway). The results reveal an important contribution of both systems. In general, the present results support the idea that multisensory integration is a multifaceted phenomenon enabled by a number of different mechanisms that may be isolated experimentally, but that probably operate in a coordinated fashion during everyday-life information processing.
Perceptual online and offline coding of haptic information
We study the compression and communication of haptic information in telepresence and teleaction systems, with a special focus on perceptual coding. Both real-time communication (online coding) and the recording and replay of haptic interaction sessions (offline coding) are considered. We show that these two cases have very different requirements with respect to the tolerable algorithmic delay introduced during encoding, but can nevertheless be addressed in a very similar manner from a compression point of view. We also present recent work on error-resilient haptic communication in the presence of network packet loss. The presented research is joint work with Sandra Hirche, Julius Kammerl, Iason Vitorias, Fernanda Brandi and Rahul Chaudhari.
Perception of synchrony between heard and lipread speech: not that special.
Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. We tested whether this occurs because audiovisual speech is strongly paired (the 'unity assumption'). Participants made temporal order judgments (TOJ) and simultaneity judgments (SJ) about sine-wave speech (SWS) replicas of pseudowords and the corresponding video of the face. Listeners in speech and non-speech mode were equally sensitive when judging audiovisual temporal order. Yet, using the McGurk effect and electrophysiology (the Mismatch Negativity, MMN), we could demonstrate that the sound was more likely to be integrated with lipread speech if it was heard as speech rather than as non-speech. Judging temporal order in audiovisual speech is thus unaffected by whether the auditory and visual streams are paired. Conceivably, previously found differences between speech and non-speech stimuli are not due to the putative 'special' nature of speech, but rather reflect low-level stimulus differences.
Agnieszka Wykowska, Anna Schubö
Perceiving while acting: action-related bias of perceptual processing
When interacting with the environment, humans select information relevant for efficient behaviour. Hence, the human perceptual system has developed various mechanisms of selection with respect to task relevance. Task relevance might be specified either by asking participants to, for example, "look for red", or by asking them to perform a certain action, e.g., "grasp a cup" or "point to a cup". As different action types require the selection of different types of information, an efficient system weights the respective characteristics in accordance with the intended action type. Such a weighting mechanism might be based on direct action-perception links stemming from a common code for action and perception, in line with the Theory of Event Coding (Hommel et al., 2001). We conducted a series of experiments to test the idea of a common code for action and perception with a paradigm that combined a visual search task (size or luminance targets) with a movement task (grasping or pointing). The results showed that the processing of perceptual dimensions was influenced by the intention to execute a particular type of movement, i.e., action-congruent visual dimensions were detected faster than incongruent ones, supporting the idea of action-related weighting of perceptual dimensions. Subsequent studies confirmed the action-related bias of perception with ERP methodology. Taken together, we conclude that even early stages of perceptual processing can be influenced by action intentions. We suggest an intentional weighting mechanism allowing for efficient interaction with the environment.
Action Sets Influence Human Temporal Resolution
Ebru Baykara & Agnieszka Wykowska
The present study investigated whether human temporal resolution is flexible and can be modulated by different types of action sets. Participants performed a simultaneity judgment (SJ) task with two visual stimuli presented either simultaneously or with a variable stimulus-onset asynchrony (SOA). The precision with which participants judged two asynchronously presented stimuli as being separate was used as a measure of temporal resolution. The action sets were induced by a computer game (played before the SJ task) requiring either speeded or unspeeded reactions. Importantly, the computer games were entirely unrelated to the subsequently performed SJ task. The results revealed that a given action set had an impact on the temporal resolution of the visual system: depending on the speed of responses that the game required, participants' performance in the subsequent SJ task was selectively influenced. These results speak in favor of the idea that human temporal resolution is flexible and highly adaptive to current goals and demands.
Behavioral Effect of an Illusory Flash
Anja Fiedler, Julie L. O’Sullivan, Hannes Schröter & Rolf Ulrich
Perception of multimodal stimuli is based on the integration of different sensory inputs. If two modalities deliver conflicting information, perception in one modality can be altered in order to ensure a coherent representation of the assumed underlying physical source. An impressive example of such a perceptual modulation is the sound-induced flash illusion (SIFI; Shams, Kamitani, & Shimojo, 2000, Nature, 408, 788). The SIFI can occur when a single brief flash is combined with two auditory beeps; in this case, participants often report seeing two consecutive flashes. We examined whether this audio-visual illusion is restricted to perceptual processes or can also affect behavior. Specifically, we tested whether or not illusory redundant flashes can speed up response times compared to a single flash. Such a redundancy gain has previously been shown for physically redundant signals. To this end, we conducted a simple reaction time task and presented two auditory beeps together with either one or two flashes. Following a speeded key press in response to the onset of any flash, participants were asked to indicate the number of perceived flashes. As a main result, we observed a high positive correlation between the stimulus-based and the illusion-based redundancy gain. Thus, an illusory perceived double flash can affect behavior and speed up response times just like physically presented double flashes.
Blind Kiss: Lateralised Behaviour in Blind Individuals
Elena Nava, Onur Güntürkün & Brigitte Röder
One of the earliest reported functional asymmetries in humans is right-sided head-turning, which likely develops prenatally and persists into adulthood. This lateralised behaviour has been associated with the development of a right-side preference as a consequence of biasing vision towards the right side of the body.
To what extent vision underlies, promotes and maintains lateralised patterns of behaviour is still an under-investigated issue. To address this, we observed head-turning preference in a group of congenitally blind individuals by asking them to kiss an adult-sized doll. To investigate other lateral preferences, we additionally asked them to step onto a platform (foot preference), put an ear against a door (ear preference), and report their hand preference in a questionnaire. A group of sighted individuals, matched for age, served as the control group. We hypothesised that, if vision drives lateral preferences, blind individuals would show no significant right- vs. left-side preference, or a deviating one.
Interestingly, blind individuals showed a significant left-side preference for kissing (88% left-kissers vs. 12% right-kissers), which differed from sighted controls (38% left-kissers vs. 62% right-kissers).
The left-kissing preference in the blind did not correlate with any other lateral preference. In addition, foot preference, handedness and ear preference did not differ between blind individuals and sighted controls, and both groups showed strong right-handedness and right-footedness. Finally, no correlation was found between the left-side preference for head-turning and the finger used for Braille reading, suggesting that the development of reading does not influence motor asymmetries.
Overall, our results show, for the first time, that blind individuals are strongly lateralised in a motor behaviour such as head-turning. In addition, our findings suggest that, while lateralised functions such as handedness seem to be predominantly independent of (visual) experience, some aspects of motor behaviour seem to be shaped by sensory input.
Contextual Cueing of Tactile Search
Michael Schneider, Thomas Geyer & Zhuanghua Shi
Visual context information can guide attention in complex visual displays. When participants are repeatedly presented with identically arranged ('repeated') visual displays, search reaction times (RTs) are faster relative to newly composed ('non-repeated') displays, an effect referred to as contextual cueing (Chun & Jiang, 1998). While almost all prior studies investigated the context-based guidance of visual attention, the present study tested whether cueing also operates in tactile search. Participants had to detect a 'strong' among 'weak' vibrating stimuli (vibration generators were attached to the fingers) and subsequently discriminate its color (via foot pedals). We constructed target and distractor positions such that some configurations were repeated multiple times ('repeated displays') while other configurations were presented only once ('non-repeated displays'). We found that RTs were faster for repeated than for non-repeated tactile displays. More interestingly, participants could not distinguish the repeated displays from the new displays, as assessed by a subsequent recognition test. We therefore conclude that contextual cueing manifests in multiple modalities, including touch as well as vision.
Effects of Delayed Visual Feedback on Manual Reaching
Heng Zou, Zhuanghua Shi & Hermann J. Müller
Delayed visual feedback has commonly been found to affect human action performance. For example, in manual reaching tasks, a delay in visual feedback generally increases completion time and degrades accuracy. In addition, the delay in visuomotor feedback has been found to interact with task difficulty, such that the delay causes greater deterioration of movement performance when the task is more difficult (MacKenzie & Ware, 1993). However, it is still unknown when and in which processes the feedback delay influences performance. To examine the visuomotor delay effect in detail, we adopted a modified Fitts' Law paradigm, which requires both target acquisition and decision-making responses. We injected various delays into the visual feedback of the moving cursor and compared different input devices (mouse, touch pad, touch screen). We found that, instead of affecting the whole movement in general, the visual feedback delay mainly influenced the final movement correction (fine-tuning) phase, whereas the initial reaching movement was unaffected. More interestingly, although delayed visual feedback generally decreased task performance, the larger delay (more than 100 ms) led to better performance than the smaller delay (about 30 ms), including shorter fine-tuning time and faster key-press responses. These results suggest that participants dynamically adjust their movement strategies and decision-making criteria based on the visual feedback delay, an adjustment that might be based on sensorimotor prediction. They also suggest that, when feedback delay is unavoidable, it is beneficial to determine a suitable amount of delay that optimizes performance rather than simply minimizing the delay, which might actually induce worse performance.
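For readers unfamiliar with the paradigm, Fitts' Law relates movement time to an index of difficulty. The sketch below uses illustrative coefficients only (not fitted values from this study) and shows one simple, assumed way a feedback delay could enter the model via the slope, in the spirit of the delay-by-difficulty interaction reported by MacKenzie & Ware (1993).

```python
import numpy as np

def fitts_movement_time(distance, width, a=0.2, b=0.15):
    """Predicted movement time (s) under Fitts' Law, MT = a + b * ID.

    ID = log2(2D / W) is the classic index of difficulty; a and b are an
    empirically fitted intercept and slope (placeholder values here).
    """
    index_of_difficulty = np.log2(2 * distance / width)
    return a + b * index_of_difficulty

# One hypothetical way to capture a delay-by-difficulty interaction is to let
# a larger feedback delay inflate the slope b, so harder targets suffer more.
for delay_ms, slope in [(30, 0.15), (100, 0.22)]:
    mt = fitts_movement_time(distance=200, width=20, b=slope)
    print(f"feedback delay {delay_ms} ms -> predicted MT {mt:.2f} s")
```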
Experimental Toolbox: a Modular Approach to Computer Based Experiments
In today's experiments, especially those with multimodal approaches, exact timing of events is a necessity. Modern personal computers offer temporal resolution below one millisecond and can be programmed by experimenters without much prior knowledge. However, coding can be repetitive and time-consuming, and consequently programmers tend to reuse code from previous experiments, mostly via copy-and-paste. This copy-and-paste approach to code reuse is a major source of bugs, which are then kept and groomed over several iterations of new setups.
Computer science offers principles and techniques to avoid these kinds of errors, such as modularity, reusability, compatibility and unit testing. Applying these principles, the novel Experiment-Toolbox presented here offers a modular framework of reusable code that can be run on various platforms. Its layered levels of abstraction allow novices to quickly modify existing experiments without exposing themselves to fatal errors in measurements or timing. Even with moderate programming skills, new experiments can be generated using numerous helper functions. Additional unit-testing facilities can be used to check new experiment setups. The underlying modularity enables the programmer to create or modify paradigms without changing the functionality of existing experiments.
Finally, the present toolbox has interfaces to existing low-level toolboxes for measurement and presentation, such as PsychToolbox and the CRS ViSaGe system. It is at present implemented in Matlab, but its principles can also be implemented in any other programming language to interface with other existing frameworks such as VisionEgg, PsychoPy or PyEPL.
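The toolbox itself is written in Matlab and its API is not reproduced here; the Python sketch below merely illustrates the general idea of modularity and hardware-free unit testing that the abstract advocates. All names and the dependency-injection style are hypothetical, not part of the actual toolbox.

```python
# Illustrative sketch: a trial is a small, reusable unit that can be tested in
# isolation because the hardware-facing functions are passed in from outside.

def run_trial(stimulus_duration_ms, response_window_ms, present, collect_response):
    """One trial: present a stimulus, then collect a response.

    `present` and `collect_response` are injected by a platform layer (e.g.
    wrappers around PsychToolbox or PsychoPy), so the trial logic never touches
    hardware directly and can be unit-tested with fakes.
    """
    present(stimulus_duration_ms)
    return collect_response(response_window_ms)

# Unit test with fake hardware: no display or response box required.
def test_run_trial_returns_collected_response():
    calls = []
    fake_present = lambda duration: calls.append(("present", duration))
    fake_collect = lambda window: "left"
    assert run_trial(200, 1500, fake_present, fake_collect) == "left"
    assert calls == [("present", 200)]

test_run_trial_returns_collected_response()
print("trial module test passed")
```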
Influences of Multisensory Feedback Delays on Duration Reproduction
Stephanie Ganzenmüller, Zhuanghua Shi & Hermann J. Müller
During movements, our brain receives sensory feedback from more than one modality. The integration of multisensory feedback is vitally important for precise motor control. In the present study we focused on the effect of multisensory feedback delays on time reproduction. In the experiment, an audio-visual standard duration was presented and participants had to replicate that duration by pressing and holding down a button. While participants were pressing the button, a multisensory feedback stimulus (a tone and an LED light) was presented. We manipulated the feedback delay by injecting time delays (0 or 200 ms) prior to the onset of the visual or auditory signal. We found that participants almost completely compensated for the delay in their reproduction in the auditory and audiovisual delay conditions (i.e., reproduced durations were lengthened by almost 200 ms). In contrast, reproduction times in the visual delayed feedback condition increased by only about twenty-five percent of the delay (50 ms). The results indicate auditory feedback dominance in the duration reproduction task. In the second experiment, we manipulated the reliability of the auditory feedback by reducing its probability of occurrence (50%) and applying a Gaussian envelope to the auditory feedback stimuli (150 ms, 300 ms). In addition, we used speakers instead of headphones. The results still showed a strong overestimation of the reproduction times in the auditory delayed feedback conditions, whereas the length of the Gaussian envelope played little role in time reproduction. The results indicate that participants' motor control of duration reproduction is strongly biased by the delay of the auditory feedback, even when the timing of the visual feedback is more reliable.
Maximum-likelihood Analysis of Visual-vestibular Heading Discrimination in the Earth-horizontal Plane
Paul MacNeilage, Christopher Fetsch & Dora Angelaki
Humans most often move in the earth-horizontal plane, so it is important to know how heading reliability changes for different directions of movement in this plane and how this impacts visual-vestibular cue combination. Previous work has shown that heading estimates are less reliable for more eccentric heading directions. This change in reliability should lead to changes in cue weighting during multimodal heading estimation. Using a motion platform and attached visual display, we measured relative reliabilities of visual and vestibular estimates of azimuth and elevation for four heading eccentricities in the horizontal plane: 0, 30, 60, and 90 deg. Both visual and vestibular estimates of heading azimuth were less reliable at greater eccentricities. Vestibular estimates of heading elevation were also less reliable at greater eccentricities, but reliabilities of visual elevation estimates were less affected. Single-cue reliability measures were used to predict 1) the maximum-likelihood (ML) increase in reliability during multimodal heading estimation and 2) the visual and vestibular ML weights. On average, multimodal estimates were more reliable than single-cue estimates, and close to the ML predictions. Observed weights were also close to the ML predictions, with one notable exception: visual weights were much lower than predicted for azimuth discrimination around straight ahead.
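Written out explicitly, the standard maximum-likelihood predictions tested here (with sigma denoting the single-cue discrimination thresholds and w the cue weights) are:

```latex
\sigma_{\mathrm{comb}}^{2} \;=\; \frac{\sigma_{\mathrm{vis}}^{2}\,\sigma_{\mathrm{vest}}^{2}}
                                      {\sigma_{\mathrm{vis}}^{2} + \sigma_{\mathrm{vest}}^{2}},
\qquad
w_{\mathrm{vis}} \;=\; \frac{1/\sigma_{\mathrm{vis}}^{2}}
                            {1/\sigma_{\mathrm{vis}}^{2} + 1/\sigma_{\mathrm{vest}}^{2}},
\qquad
w_{\mathrm{vest}} \;=\; 1 - w_{\mathrm{vis}}.
```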
Modality Invariant Organization of Language: Evidence from Oscillatory Brain Activity
Davide Bottari, Till R. Schneider, Andreas K. Engel, Barbara Hänel-Faulhaber & Brigitte Roeder
Event-related potential (ERP) studies and brain imaging studies on sign language have provided evidence for both modality-specific and modality-invariant aspects of the cerebral organization of language (e.g. Neville et al., 1997). For example, semantic violations are followed by an ERP N400 effect irrespective of whether written, auditory or signed language stimuli are used, but the effect has slightly different scalp topographies as a function of language modality. To date, only few studies have analyzed oscillatory neuronal activity during language processing by means of EEG time-frequency analysis. Oscillatory brain activity has been shown to be related to perceptual and cognitive functions. The aim of the present study was to evaluate modulations of frequency-specific activity in order to investigate modality-specific and modality-invariant aspects of language processing. We compared brain activity of 12 hearing controls and 10 congenitally deaf individuals while they read written German sentences that were either correct or comprised a semantic or syntactic violation (Exp. 1). Words were presented sequentially at the same location for 600 ms each, and participants had to judge after the presentation whether or not the sentence was correct. Time-frequency analysis revealed comparable effects of semantic and syntactic violations in both groups, replicating previous studies in controls: in both groups the semantic violation resulted in a broadly distributed increase of theta activity (3-7 Hz, 300-800 ms) and decreases of alpha (8-12 Hz, 500-1000 ms), beta (15-20 Hz, 600-700 ms), and gamma activity (30-40 Hz, 700-800 ms; confined to central-posterior electrodes) with respect to the correct sentences. Similarly, the syntactic violation elicited a broadly distributed increase of theta activity (3-7 Hz, 300-800 ms) and decreases of alpha (8-12 Hz, 500-1000 ms) and beta activity (15-20 Hz, 600-700 ms) with respect to the correct sentences, but no difference in the gamma band. In Exp. 2 we compared brain activity of 14 hearing native signers (individuals born to deaf native signers) and 11 congenitally deaf individuals while they watched videos of German Sign Language. As with the written German sentences, the signed sentences were either correct or comprised a semantic or a syntactic violation. Time-frequency analysis revealed significant alpha (8-12 Hz, 800-1800 ms) and beta activity (15-20 Hz, 900-1200 ms) decreases with respect to the correct sentences, with an indistinguishable scalp topography across groups. These data suggest modality-invariant processing systems for language.
Perceived Duration of Audio and Visual Signals
J. Hartcher-O’Brien, M. Di Luca & M.O. Ernst
The aim of this study is to understand how temporal properties of auditory and visual signals are perceptually distorted and how such distortions can be resolved when these signals provide redundant information about duration.
First we characterize perceptual distortions of signals in different modalities. Observers compared perceived onset, peak, and offset of two Gaussian signals with different durations (sigma 150 ms and 5 ms). Estimates of peak were independent of modality, whereas estimates of overall duration confirmed a marked perceptual contraction for visual signals. Moreover, we find that this distortion happens primarily at the onset of the signals, rather than at the offset. These findings illuminate the modality-dependent distortions in perceived duration.
To determine whether temporal information is combined in a statistically optimal manner for redundant information (as described by Maximum Likelihood Estimation, MLE), observers reported the longer of two intervals defined by audio, visual, or audiovisual stimuli. Consistent with MLE, we find that perception of temporally discrepant audiovisual stimuli followed the more reliable signal, and that the reliability of the combined estimate increased optimally.
Physiological Correlates of Subjective Time: Evidence for the Temporal Accumulator Hypothesis
Domenica Bueti & Emiliano Macaluso
Clock-counter models, the most influential cognitive models of temporal computation, have been successful in explaining a large set of behavioral data. However, it remains unclear whether the component operations postulated in these models correspond to any specific biological mechanisms. Using stimuli in different sensory modalities and manipulating physical properties known to bias the ‘subjective’ perception of time (speed for vision and pitch for audition), the present study aimed to highlight brain areas where activity correlates with the ‘subjective’ perception of time: a time accumulator according to clock-counter models. Using functional MRI we found that during the encoding of a temporal interval in the millisecond range (600 and 1000 ms), the hemodynamic response of a few brain regions correlated with the interval reproduction performance. For the visual modality, the activity of the putamen, the mid-insula and the mid-temporal cortex reflected the subjective interval duration, which was biased according to the different speeds of the visual stimuli. This effect was found only when subjects encoded the stimulus duration and was specific for the visual modality, where a significant overestimation of time with increasing speed was observed. These results demonstrate a definite relation between ‘subjective time’ and brain activity, supporting the hypothesis of a physiological correlate of time ‘accumulation’.
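One way to picture the accumulator account tested here: in a clock-counter framework, perceived duration corresponds to the number of pacemaker pulses accumulated, so any manipulation that raises the pacemaker rate (here, hypothetically, visual speed) lengthens subjective duration over the same physical interval. The toy simulation below is a generic illustration of that idea, not the model fitted in the study; all parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def accumulated_pulses(duration_ms, base_rate_hz=50.0, speed_gain=1.0):
    """Number of pacemaker pulses counted during `duration_ms`.

    The pacemaker is modeled as a Poisson process whose rate is scaled by an
    assumed effect of visual speed, so subjective duration grows with speed.
    """
    rate = base_rate_hz * speed_gain
    return rng.poisson(rate * duration_ms / 1000.0)

physical_duration = 800  # ms, within the range used in the study
for speed_gain, label in [(1.0, "slow stimulus"), (1.2, "fast stimulus")]:
    mean_count = np.mean([accumulated_pulses(physical_duration, speed_gain=speed_gain)
                          for _ in range(2000)])
    print(f"{label}: mean accumulator count = {mean_count:.1f}")
```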
Task-evoked Pupillary Response in Dual-task Situations
Yiquan Shi, Elisabeth Müller, Martin Buss, Erich Schneider & Torsten Schubert
Many studies have used the task-evoked pupillary response as a measure of mental workload in humans. However, pupil size has rarely been measured in dual-task situations (but see Karatekin, 2004). Mental workload should be higher when someone has to perform two tasks simultaneously than when performing only one task. Additionally, most pupil studies took place in highly controlled laboratory settings, which might restrict the validity of the data. The present study tested the usability of the task-evoked pupillary response as a measure of mental workload in less controlled dual-task situations. For this purpose, a new head-mounted eye-tracker, the EyeSeeCam (Schneider et al., 2008), was used to give participants freedom of head movement while performing a sensorimotor task. Specifically, we asked whether an additional task produces higher mental workload, reflected in larger pupil size, even in a less controlled situation. Three different conditions (single task "squares", single task "calculation", and dual task) were presented block-wise. In all conditions, three green squares were presented, with one of them changing its color to red. In addition to the color change, a tone of high or low frequency could occur in half of the trials. In the single task "squares", subjects had to respond to the location of the square changing its colour by pressing a corresponding button. In the single task "calculation", subjects had to remember the number presented at the start of the block, add three when a high tone was presented, and subtract three when a low tone was presented. In the dual-task condition, subjects had to do both tasks. The block procedure for the three conditions differed in only one aspect: the cue at the beginning of the block, which told subjects which task to perform in that block.
First, it was found that overall mean pupil size was significantly larger in the dual-task blocks than in the single-task blocks. Second, in the dual-task blocks, pupil size was larger in trials with a tone ("real" dual-task trials) than in trials without a tone, in which the subjects only had to remember the current calculation result.
These findings show that the task-evoked pupillary response can be an indicator of the mental workload produced by an additional task, even in less controlled situations.
The Effect of Delayed Visual Feedback on Synchrony Perception in a Tapping Task
Mirjam Keetels & Jean Vroomen
Sensory events following a motor action are, within limits, interpreted as a causal consequence of that action. For example, the clapping of the hands is initiated by the motor system, but subsequently visual, auditory, and tactile information is provided and processed. In the present study we examine the effect of temporal disturbances in this chain of motor-sensory events. Participants are instructed to tap a surface with their finger in synchrony with a train of 20 sound clicks (ISI 750 ms). We examine the effect of additional visual information on this 'tap-sound' synchronization task. During tapping, subjects see a video of their own tapping hand on a screen in front of them. The video can either be in synchrony with the tap (real-time recording) or slightly delayed (~40-160 ms). In a control condition, no video is provided. We explore whether 'tap-sound' synchrony is shifted as a function of the delayed visual feedback. The results will provide fundamental insights into how the brain preserves a causal interpretation of motor actions and their sensory consequences.
The Effect of Threatening Pictures on Attentional Shifts in Audiotactile Temporal Order Judgment
Lina Jia, Zhuanghua Shi & Hermann J. Müller
In everyday life, we have to deal with a complex multisensory environment. How our brain dynamically guides attention to select information across different modalities is therefore an important topic. Studies have revealed that attention in one modality can facilitate the processing of information at the same location in another modality. For example, viewing visual threatening stimuli can enhance the unpleasant sensation of tactile stimuli. However, it is still unknown whether visual threatening stimuli induce modality-specific attentional shifts in multisensory events. To investigate this, we examined the modulation of audiotactile temporal order judgments by visual threatening stimuli. We compared points of subjective equality (PSEs) of audiotactile events between neutral and threatening conditions. In addition, we manipulated the onset of the audiotactile events (simultaneous with the onset of the pictures, or 300-500 ms after their offset). Results showed that, when the audiotactile events were presented simultaneously with the onset of the pictures, the PSE was shifted towards the auditory modality by the visual threatening pictures. In contrast, the PSE was shifted in the opposite direction (biased towards the tactile modality) when the audiotactile events were presented after the offset of the pictures. These findings demonstrate modality-specific and dynamic attentional shifts induced by visual threatening stimuli, which may relate to emotional defense mechanisms.
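For reference, a PSE in such a task is typically obtained by fitting a psychometric function to the temporal order judgments. The sketch below fits a cumulative Gaussian to made-up "tactile first" proportions purely to illustrate the procedure; the SOA convention and data are assumptions, not the study's.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# SOA convention (illustrative): negative = auditory first, positive = tactile first.
soa_ms = np.array([-120, -80, -40, 0, 40, 80, 120])
p_tactile_first = np.array([0.08, 0.15, 0.35, 0.55, 0.78, 0.90, 0.97])  # made-up data

def cumulative_gaussian(soa, pse, jnd):
    # Proportion of "tactile first" responses; pse = point of subjective equality.
    return norm.cdf(soa, loc=pse, scale=jnd)

(pse, jnd), _ = curve_fit(cumulative_gaussian, soa_ms, p_tactile_first, p0=[0.0, 50.0])
print(f"PSE = {pse:.1f} ms, JND (sigma) = {jnd:.1f} ms")
# A negative PSE here means the auditory event must lead for perceived simultaneity,
# which is often interpreted as a processing/attentional advantage for the tactile modality.
```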
The Impact of Single-Trial, Audio-visual Learning on Unisensory Object Discrimination
Antonia Thelen & Micah M. Murray
Previous studies showed that single-trial multisensory learning can influence the ability to accurately discriminate image repetitions during a continuous recognition task. These investigations focused on the impact of single-trial audio-visual pairings on subsequent visual discrimination performance. An open issue is the directionality of the observed impact, namely the incidental effect of multisensory pairings on subsequent auditory discrimination. Subjects discriminated initial from repeated presentations of either images or sounds in two separate sessions one week apart. In both recognition tasks, half of the initial presentations were multisensory pairings, which included meaningful congruent, incongruent or meaningless auditory-visual pairings. Upon repetition, half of the stimuli were identical to the initial presentation. Of the remaining stimuli, half of the previously multisensory stimuli were presented in a unisensory manner, whereas the remaining, initially unisensory, stimuli were paired with either a meaningful congruent, incongruent or meaningless sound or image.
Accuracy in recognizing repeated presentations was enhanced for stimuli initially encountered with their semantically congruent counterpart, in both the visual and the auditory discrimination task, whereas incongruent pairings did not lead to recognition facilitation. Furthermore, meaningless pairings led to discrimination impairment. Performance was always compared to the accuracy for stimuli that were presented twice in a unisensory manner.
The major finding of this study is that single-trial multisensory learning impacts both visual and auditory discrimination performance in a continuous recognition task. We hypothesize a common object representation, which is similarly accessed by the two sensory modalities.
Thus, the observed recognition facilitation relies on a redundant-stimulus mechanism, whereby the task-redundant sensory stimulus reinforces the activation of the object representation, leading to better recognition performance even when only one part of the information is available.
Ventriloquizing Perceived Duration: Audiovisual Integration in Time Perception
Karin M. Bausenhart & Rolf Ulrich
Klink, Montijn, and van Wezel (2011) observed a strongly reduced accuracy in duration discrimination for filled visual intervals that were accompanied by auditory intervals of different duration. The authors suggested that this may be due to a temporal ventriloquism effect. Specifically, they assumed that on- and offset of the visual intervals were “pulled” towards on- and offset of the auditory intervals, thus biasing perceived duration. We investigated this effect further by employing empty visual intervals, that is, intervals that are defined by short pulses marking their onset and offset. In each trial, a constant standard interval had to be compared to a comparison interval of varying duration. The visual pulses marking this comparison interval were accompanied by auditory pulses which differed with respect to their timing: In the “no bias” condition, auditory and visual pulses were presented simultaneously. In the “shorter” condition, the auditory pulses were presented within the visual interval. Finally, in the “longer” condition, the pulses were presented before and after the visual onset and offset pulses, respectively. Participants were instructed to ignore the auditory stimuli and base their judgments solely on the visual information. Despite these instructions, duration judgments varied according to the biasing auditory information: Compared to the “no bias” condition, perceived duration was prolonged in the “longer” condition and shortened in the “shorter” condition. In contrast, the threshold for duration discrimination was similar in all three conditions. These results suggest that audiovisual integration of temporally discrepant signals from different modalities can alter perceived duration, but does not impair discrimination accuracy.
White Matter Tracts between Human Auditory and Visual Cortex as Revealed by Diffusion Tensor Imaging
Anton L. Beer, Tina Plank & Mark W. Greenlee
Although it is known that sounds can affect visual perception, the neural correlates for crossmodal interactions are still disputed. Previous tracer studies in non-human primates revealed direct anatomical connections between auditory and visual brain areas. We examined the structural connectivity of the auditory cortex in normal humans by diffusion-tensor magnetic resonance imaging and probabilistic tractography. Tracts were seeded in Heschl's region or the planum temporale. Fibres crossed hemispheres at the posterior corpus callosum. Ipsilateral fibres seeded in Heschl's region projected to the superior temporal sulcus, the supramarginal gyrus and intraparietal sulcus, and the occipital cortex. Occipital fibre tracts primarily projected to the inferior occipital gyrus, the lateral occipital gyrus, and the calcarine sulcus. Fibres seeded in the planum temporale terminated primarily in the superior temporal sulcus, the supramarginal gyrus, the lateral central sulcus and adjacent regions. Our findings suggest the existence of direct white matter connections between auditory and visual cortex, in addition to subcortical, temporal, and parietal connections.