Sense-aware BERT and Multi-task Fine-tuning for Multimodal Sentiment Analysis

Abstract

Humans convey emotions through verbal and non-verbal signals when communicating face-to-face. Pre-trained language models such as BERT can be fine-tuned to improve performance on various downstream tasks, including sentiment analysis. However, most prior work on BERT fine-tuning uses only textual unimodal data and lacks sensory information, such as audio and visual signals, which is crucial for sentiment analysis. In this paper, we propose Sense-aware BERT (SenBERT), which integrates sensory information into BERT during fine-tuning. In particular, we exploit multimodal multi-head attention to capture the interaction between unaligned multimodal data. Additionally, because different modalities vary in information richness, a multimodal network may be dominated by certain modalities during training; we therefore propose unimodal sentiment analysis auxiliary tasks for multi-task learning, which force the model to attend to all modalities. We conduct experiments on the CMU-MOSI and CMU-MOSEI datasets for multimodal sentiment analysis. The results show the superior performance of SenBERT over previous baselines on all metrics.
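The abstract's central mechanism is cross-modal multi-head attention, where one modality queries another even when their sequence lengths are unaligned. Below is a minimal single-head NumPy sketch of this idea; all names, dimensions, and projection matrices are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head cross-modal attention: one modality (e.g. text)
    queries another (e.g. audio). Sequence lengths need not align,
    since attention pools over the key/value sequence."""
    q = queries @ w_q            # (len_q, d_model)
    k = keys_values @ w_k        # (len_kv, d_model)
    v = keys_values @ w_v        # (len_kv, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (len_q, len_kv)
    return softmax(scores, axis=-1) @ v       # (len_q, d_model)

# Hypothetical shapes: 5 text tokens attend over 9 audio frames.
rng = np.random.default_rng(0)
d_text, d_audio, d_model = 8, 6, 4
text = rng.normal(size=(5, d_text))
audio = rng.normal(size=(9, d_audio))
out = cross_modal_attention(
    text, audio,
    rng.normal(size=(d_text, d_model)),
    rng.normal(size=(d_audio, d_model)),
    rng.normal(size=(d_audio, d_model)),
)
print(out.shape)  # text length x model dim: (5, 4)
```

The multi-task objective described in the abstract would then add unimodal auxiliary losses (e.g. a weighted sum of text-only, audio-only, and visual-only sentiment losses) to the main multimodal loss, so no single modality dominates training.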

Publication
In International Joint Conference on Neural Networks 2022
方岭永
Master's Student
刘功申
Professor