Steganalysis of transcoding steganography
2013, annals of telecommunications - annales des télécommunications
https://doi.org/10.1007/S12243-013-0385-4
Abstract
Transcoding steganography (TranSteg) is a fairly new IP telephony steganographic method that compresses overt (voice) data by transcoding to make space for the steganogram. It offers high steganographic bandwidth, retains good voice quality, and is generally harder to detect than other existing VoIP steganographic methods. In TranSteg, after the steganogram reaches the receiver, the hidden information is extracted, and the speech data is practically restored to what was originally sent. This is a significant advantage over other existing VoIP steganographic methods, where the hidden data can be extracted and removed, but the original data cannot be restored because it was overwritten when the hidden data was inserted. In this paper, we address the steganalysis of TranSteg. Various TranSteg scenarios and possible warden locations are analyzed with regard to TranSteg detection. A novel steganalysis method based on Gaussian mixture models and mel-frequency cepstral coefficients was developed and tested for various overt/covert codec pairs in a single-warden scenario with double transcoding. The proposed method detected some codec pairs efficiently (e.g., G.711/G.729), while others remained more resistant to detection (e.g., iLBC/AMR).
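The idea of "making space" by transcoding can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function name `transteg_capacity` is hypothetical, the 20 ms packetization interval is an assumption, and the bit rates used in the example are the nominal rates of G.711 (64 kbit/s) and G.729 (8 kbit/s). It assumes the payload keeps the size of the overt codec frame and that every byte not needed by the covert codec is available to the steganogram, ignoring any signalling or framing overhead.

```python
def transteg_capacity(overt_kbps, covert_kbps, packet_ms=20):
    """Estimate the space freed per packet by transcoding.

    overt_kbps  -- nominal bit rate of the overt codec (what the payload appears to carry)
    covert_kbps -- nominal bit rate of the covert codec (what the voice is actually coded with)
    packet_ms   -- packetization interval in milliseconds (assumed, not from the paper)

    Returns (free_bytes_per_packet, stego_kbps).
    """
    if covert_kbps >= overt_kbps:
        raise ValueError("the covert codec must have a lower bit rate than the overt codec")

    # kbit/s multiplied by ms gives bits; divide by 8 for bytes.
    overt_bytes = overt_kbps * packet_ms / 8
    covert_bytes = covert_kbps * packet_ms / 8

    free_bytes = overt_bytes - covert_bytes   # room left for the steganogram
    stego_kbps = overt_kbps - covert_kbps     # same quantity expressed as a bit rate
    return free_bytes, stego_kbps


if __name__ == "__main__":
    free, rate = transteg_capacity(64, 8)     # G.711 overt, G.729 covert
    print(f"{free:.0f} bytes per packet, {rate} kbit/s")
```

Under these assumptions, a G.711/G.729 pair leaves 140 of the 160 payload bytes in each 20 ms packet for hidden data. This is a rough upper bound and says nothing about detectability, which is the subject of the paper.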