An Auto Encoder For Audio Dolphin Communication

2020, arXiv (Cornell University)

https://doi.org/10.48550/ARXIV.2005.07623

Abstract

Research in dolphin communication and cognition requires detailed inspection of audible dolphin signals. The manual analysis of these signals is cumbersome and time-consuming. We seek to automate parts of the analysis using modern deep learning methods. We propose to learn an autoencoder constructed from convolutional and recurrent layers trained in an unsupervised fashion. The resulting model embeds patterns in audible dolphin communication. In several experiments, we show that the embeddings can be used for clustering as well as signal detection and signal type classification.
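As a rough illustration of the clustering step mentioned in the abstract, the sketch below groups a handful of embedding vectors with a plain k-means loop. Everything in it is an assumption made for illustration: the abstract does not name the clustering algorithm, the toy 2-D vectors merely stand in for embeddings that a trained autoencoder would produce, and the function names and the farthest-point seeding are hypothetical choices, not the authors' method.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def farthest_point_init(embeddings, k):
    """Deterministic seeding (an assumption): start from the first vector,
    then repeatedly add the vector farthest from all chosen centroids."""
    centroids = [list(embeddings[0])]
    while len(centroids) < k:
        farthest = max(
            embeddings,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(embeddings, k, iterations=20):
    """Plain k-means: alternate nearest-centroid assignment and mean update."""
    centroids = farthest_point_init(embeddings, k)
    labels = [0] * len(embeddings)
    for _ in range(iterations):
        # Assignment step: each embedding joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in embeddings
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(embeddings, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, centroids


# Hypothetical 2-D stand-ins for embeddings of two signal types.
toy_embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 4.9], [4.9, 5.0],
]
labels, centroids = kmeans(toy_embeddings, k=2)
print(labels)
```

In practice the inputs would be the higher-dimensional embeddings produced by the encoder, and the number of clusters would have to be chosen by the analyst.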
