The University of Edinburgh SENSOPAC Site

Bayesian Multisensory Perception

Role within SENSOPAC

The SENSOPAC robot will collect data from different sensor modalities, e.g. haptic data (touch sensors), proprioceptive data, and movement data (efferent control commands). Processing these inputs is not trivial, as the following examples show.

When grasping and shaking an object, fingertip touch sensors can yield very different outputs, depending on whether a finger is in contact with the object and how much pressure it applies. The sensed pressure also varies as the arm accelerates.
When pushing multiple objects across a table, the sensed pressure depends on the weight and friction parameters of each object, and different sensors may be in contact with different objects.

More generally, input data from multiple modalities may stem from the same source or from separate sources, or may represent only noise. We aim to infer the number of sources and to assign the correlated data to their corresponding sources for further processing by the haptic data decoding module.

A probabilistic approach to multisensory perception

If a living creature or machine can collect information about the outside world through multiple sensor modalities, a key issue is the integration of the different sensory inputs. As an example, consider observing a conversation, where visual and auditory input signals have to be combined to determine who said what.

In more technical terms, we are concerned with the association between observations and their latent (not directly observable) sources. To this end, we consider not only the fusion (or integration) of different observations but also explicitly model fission (or segregation), the case in which two or more observations do not stem from the same source.
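The choice between fusion and fission can be illustrated with a minimal sketch of Bayesian model comparison. It is not the model used in [1]; it assumes a toy setting with one audio and one visual position measurement, zero-mean Gaussian priors on source positions, and Gaussian sensor noise. The function names and all parameter values (sigma_a, sigma_v, sigma_prior, prior_common) are hypothetical and chosen only for illustration.

```python
import math


def log_gauss2(x, y, var_x, var_y, cov_xy):
    """Log density of a zero-mean bivariate Gaussian evaluated at (x, y)."""
    det = var_x * var_y - cov_xy ** 2
    quad = (var_y * x * x - 2.0 * cov_xy * x * y + var_x * y * y) / det
    return -0.5 * quad - 0.5 * math.log(det) - math.log(2.0 * math.pi)


def posterior_common_source(x_a, x_v, sigma_a=1.0, sigma_v=0.5,
                            sigma_prior=10.0, prior_common=0.5):
    """Posterior probability that both observations share one latent source.

    Assumed toy model (not taken from the source text):
      fusion:  s ~ N(0, sigma_prior^2), x_a = s + noise_a, x_v = s + noise_v
      fission: x_a and x_v are generated by two independent sources
    """
    var_a = sigma_prior ** 2 + sigma_a ** 2
    var_v = sigma_prior ** 2 + sigma_v ** 2
    # A shared source makes the two observations correlated via the prior.
    log_fusion = log_gauss2(x_a, x_v, var_a, var_v, sigma_prior ** 2)
    # Independent sources give zero covariance between the observations.
    log_fission = log_gauss2(x_a, x_v, var_a, var_v, 0.0)
    log_odds = (log_fusion + math.log(prior_common)
                - log_fission - math.log(1.0 - prior_common))
    # Numerically stable logistic function.
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    print(posterior_common_source(1.0, 1.2))    # nearby observations
    print(posterior_common_source(-8.0, 8.0))   # widely separated observations
```

Under these assumptions, nearby observations favour the common-source (fusion) hypothesis, while widely separated observations favour separate sources (fission).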

We treat this problem of data association as a structure inference task, for which we apply Bayesian model selection techniques. Furthermore, we use factorial hidden Markov models to account for temporal dependencies, e.g. to model the transitions when a perceived audio signal (speech) stems from two people speaking in alternation [1].
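As a rough illustration of how temporal dependencies help decide who is speaking, the following sketch runs forward filtering in a plain two-state hidden Markov model. This is a deliberate simplification: it is not a factorial HMM and not the model of [1]. The speaker positions, the noise level sigma, and the "stay" probability are hypothetical values assumed for the example.

```python
import math


def gauss_pdf(x, mean, sigma):
    """Density of a one-dimensional Gaussian."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def filter_speaker(observations, positions=(-1.0, 1.0), sigma=0.5, stay=0.9):
    """Forward filtering: P(active speaker at time t | observations up to t).

    Assumptions (hypothetical): each speaker sits at a known position, the
    audio localisation is that position plus Gaussian noise, and the active
    speaker stays the same between time steps with probability `stay`.
    """
    n = len(positions)
    switch = (1.0 - stay) / (n - 1)
    belief = [1.0 / n] * n  # uniform prior over who is speaking
    history = []
    for x in observations:
        # Prediction step: propagate the belief through the transition model.
        predicted = [
            sum(belief[i] * (stay if i == j else switch) for i in range(n))
            for j in range(n)
        ]
        # Update step: weight by the observation likelihood and normalise.
        unnorm = [predicted[j] * gauss_pdf(x, positions[j], sigma)
                  for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


if __name__ == "__main__":
    for step in filter_speaker([-1.1, -0.9, 1.0, 1.2]):
        print([round(p, 3) for p in step])
```

With the audio localisations in the example, the filtered belief first concentrates on the speaker at -1.0 and then shifts to the speaker at 1.0 once the observations move, which is the kind of transition the temporal model is meant to capture.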

While we handle audio-visual data in current studies, the underlying theory and algorithms will be applicable in the sensorimotor context. Research carried out in this subproject is closely related to our work on haptic cognition, which is one of the main focus points of SENSOPAC (WP1, Active Sensing). Please also see our subproject Haptic Data Decoding and Processing.

Related Publications

[1]: Hospedales, T., Cartwright, J., & Vijayakumar, S. Structure Inference for Bayesian Multisensory Perception and Tracking. International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India (2007).
