A key challenge in sentiment analysis and emotion recognition is how to fuse multimodal inputs effectively. Transformer-based models have recently achieved great success in multimodal sentiment analysis and emotion recognition. However, because of their parallel structure, transformer-based models often neglect the coherence of human emotion. In addition, the low-rank bottleneck created by multi-head attention limits the fitting ability of these models. To tackle these issues, this study proposes a Deep Spatiotemporal Interaction Network (DSIN). It consists of two main components: a cross-modal transformer with a cross-talking attention module, and a hierarchical temporal fusion module. The cross-modal transformer models the spatial interactions between different modalities, and the hierarchical temporal fusion module models the temporal coherence of emotion. DSIN can therefore model the spatiotemporal interactions of multimodal inputs by incorporating time dependency into the parallel structure of the transformer, and it decreases the redundancy of the embedded features by implanting their spatiotemporal interactions into a hybrid memory network in a hierarchical manner. Experimental results on two benchmark datasets indicate that DSIN achieves superior performance compared with state-of-the-art models, and some useful insights are derived from the results.
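As a rough illustration of the cross-modal interaction the abstract above describes, the following is a minimal sketch of generic single-head cross-modal attention, in which one modality supplies the queries and another supplies the keys and values. It is not the DSIN implementation: the function names, the modality names (`text`, `audio`), the dimensions, and the random projection matrices are all assumptions made for the example, and neither the cross-talking attention module nor the hierarchical temporal fusion module is reproduced.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(target, source, w_q, w_k, w_v):
    """Generic cross-modal attention (hypothetical helper, not from the paper).

    target: (T_t, d_t) sequence of the modality being enriched
    source: (T_s, d_s) sequence of the modality being attended to
    w_q, w_k, w_v: projection matrices into a shared dimension d
    """
    q = target @ w_q                      # (T_t, d) queries from the target
    k = source @ w_k                      # (T_s, d) keys from the source
    v = source @ w_v                      # (T_s, d) values from the source
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)    # each target step attends over source steps
    return weights @ v, weights


# Toy inputs: 5 text steps with 8 features, 7 audio steps with 6 features (assumed sizes).
rng = np.random.default_rng(0)
text = rng.normal(size=(5, 8))
audio = rng.normal(size=(7, 6))
d = 4
w_q = rng.normal(size=(8, d))
w_k = rng.normal(size=(6, d))
w_v = rng.normal(size=(6, d))

fused, weights = cross_modal_attention(text, audio, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` is a distribution over the audio time steps, so every text step receives a weighted summary of the audio sequence. Because all target steps are processed in parallel, nothing in this sketch enforces temporal ordering, which is the limitation the abstract attributes to the parallel structure of transformers.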
Planning has proven to be an effective strategy for handling complex tasks. However, because of a limited computational budget and accumulated model bias, planning for pixel-based long-horizon tasks with limited samples remains a great challenge. To address this issue, this study proposes Regularized Model Predictive Control (RMPC). RMPC performs trajectory optimization using short-term reward estimates and long-term return estimates, which avoids the heavy burden of long-horizon planning. In addition, an implicit regularization mechanism is employed to improve the robustness of the generated environment model and the reliability of the value function estimation, which helps reduce the risk of accumulated model bias. Extensive comparison experiments and ablation studies are performed on benchmark datasets to evaluate RMPC. Empirical results show that RMPC outperforms previous state-of-the-art (SOTA) algorithms in sample efficiency (20.88% performance improvement) and model stability (56.39% reduction in standard deviation) on pixel-based continuous control tasks from the DMControl-100k benchmark. Our code is available at: https://github.com/Arya87/RMPC.
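The idea of combining short-term reward estimates with a long-term return estimate can be sketched as follows: a candidate action sequence is scored by the discounted rewards over a short horizon plus a discounted value estimate at the final state, and the planner keeps the best-scoring candidate. This is a minimal sketch of that general pattern under stated assumptions, not the RMPC code from the repository above: the random-shooting planner, the toy `dynamics`, `reward_fn`, and `value_fn`, and every parameter value are hypothetical, and the regularization mechanism is not shown.

```python
import numpy as np


def trajectory_score(state, actions, dynamics, reward_fn, value_fn, gamma=0.99):
    """Discounted short-horizon rewards plus a discounted terminal value estimate."""
    total, discount = 0.0, 1.0
    for action in actions:
        total += discount * reward_fn(state, action)  # short-term reward estimate
        state = dynamics(state, action)               # roll the model forward
        discount *= gamma
    return total + discount * value_fn(state)         # long-term return estimate


def plan(state, dynamics, reward_fn, value_fn,
         horizon=5, n_candidates=256, action_dim=1, seed=0):
    """Random-shooting planner (an assumed stand-in for the paper's optimizer)."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    scores = np.array([
        trajectory_score(state, seq, dynamics, reward_fn, value_fn)
        for seq in candidates
    ])
    best = int(np.argmax(scores))
    # Model predictive control executes only the first action, then replans.
    return candidates[best][0], float(scores[best])


# Toy 1-D problem (hypothetical): drive the state toward zero.
def toy_dynamics(s, a):
    return s + 0.1 * a


def toy_reward(s, a):
    return -float(np.sum(s ** 2))


def toy_value(s):
    return -10.0 * float(np.sum(s ** 2))


first_action, best_score = plan(np.array([1.0]), toy_dynamics, toy_reward, toy_value)
print(first_action, best_score)
```

Because the terminal value estimate stands in for everything beyond the horizon, the rollout can stay short, which is the saving the abstract refers to when it says long-horizon planning is avoided.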
Papers by ZHANG FENG