Cross-Media Learning for Image Sentiment Analysis in the Wild
2017 IEEE International Conference on Computer Vision Workshops (ICCVW)
https://doi.org/10.1109/ICCVW.2017.45

Abstract
Much progress has been made in the field of sentiment analysis in the past years. For a long time, researchers relied on textual data for this task, and only recently have they started investigating approaches to predict sentiment from multimedia content. With the increasing amount of data shared on social media, there is also a rapidly growing interest in approaches that work "in the wild", i.e. that are able to deal with uncontrolled conditions. In this work, we faced the challenge of training a visual sentiment classifier starting from a large set of user-generated and unlabeled content. In particular, we collected more than 3 million tweets containing both text and images, and we leveraged the sentiment polarity of the textual content to train a visual sentiment classifier. To the best of our knowledge, this is the first time that a cross-media learning approach has been proposed and tested in this context. We assessed the validity of our model by conducting comparative studies and evaluations on a benchmark for visual sentiment analysis. Our empirical study shows that although the text associated with each image is often noisy and only weakly correlated with the image content, it can be profitably exploited to train a deep Convolutional Neural Network that effectively predicts the sentiment polarity of previously unseen images.
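To make the cross-media idea concrete, below is a minimal sketch, assuming a Keras/TensorFlow setup with an ImageNet-pretrained VGG16 backbone and a hypothetical text polarity classifier (`text_sentiment`); the image-loading side is likewise a placeholder. The text model assigns weak negative/neutral/positive labels to the tweet texts, and those labels supervise the training of the image CNN. This is only an illustration of distant supervision from text to images, not the authors' exact pipeline.

```python
# Sketch of cross-media (distant-supervision) training: a text sentiment model
# provides weak polarity labels for tweet images, which are then used to train
# a visual sentiment classifier. `text_sentiment` is a hypothetical placeholder
# for any text polarity classifier returning 0 (NEG), 1 (NEU), or 2 (POS).

import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models, optimizers

NUM_CLASSES = 3  # negative / neutral / positive


def build_visual_classifier():
    # VGG16 backbone pre-trained on ImageNet; only the new classification
    # head is trained in this first pass (the backbone is frozen).
    base = VGG16(weights="imagenet", include_top=False, pooling="avg",
                 input_shape=(224, 224, 3))
    base.trainable = False
    head = layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
    model = models.Model(base.input, head)
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def weak_labels(texts, text_sentiment):
    # The text classifier's predictions become (noisy) training labels
    # for the images attached to the same tweets.
    return np.array([text_sentiment(t) for t in texts])


# Usage, assuming `images` is an array of preprocessed 224x224x3 tweet images
# aligned one-to-one with their tweet `texts`:
#   labels = weak_labels(texts, text_sentiment)
#   model = build_visual_classifier()
#   model.fit(images, labels, batch_size=64, epochs=5)
```

The underlying design choice is that label noise from the text side is tolerable when the CNN sees a very large number of weakly labeled images; in practice one would typically follow this with a second pass that unfreezes the upper convolutional blocks for fine-tuning.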