Robot perception and cognition often rely on the integration of information from multiple sensory modalities, such as vision, ...
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Authors: Guan, Y., Trinh, V.A., Voleti, V., and Whitehill, J.