Papers by Maxim Vashkevich
Short‐critical‐path algorithm for allpass transform
Electronics Letters
The allpass transform is a key element of discrete systems that perform signal processing in the warped frequency domain. The paper presents a new short‐critical‐path algorithm for the allpass transform, which essentially consists of calculating the outputs of a chain of allpass filters. The algorithm is based on the proposed dual allpass filter structure, which computes the outputs of a cascade of two allpass filters in one computational pass. The dual allpass filter structure is obtained using the state variable representation of the allpass filter. The efficiency of the proposed algorithm is analyzed in the context of an allpass transform implementation on a general-purpose CPU. The experimental results show that the proposed algorithm speeds up the calculation of the allpass transform by a factor of 1.81.
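For orientation, the conventional stage-by-stage computation of an allpass chain can be sketched as follows. This is not the paper's dual allpass structure; it is a minimal baseline, assuming first-order sections of the form A(z) = (z^-1 - a) / (1 - a z^-1). The function name `allpass_chain` and its parameters are hypothetical. The inner loop shows the serial dependency between stages, which is the kind of long critical path the abstract refers to.

```python
def allpass_chain(x, a, num_stages):
    """Outputs of every stage in a cascade of first-order allpass filters.

    Assumed section: y[n] = -a*x[n] + x[n-1] + a*y[n-1]
    (a hypothetical baseline, not the dual structure from the paper).
    """
    x_prev = [0.0] * num_stages  # previous input sample of each stage
    y_prev = [0.0] * num_stages  # previous output sample of each stage
    outputs = [[] for _ in range(num_stages)]
    for sample in x:
        stage_in = sample
        for k in range(num_stages):
            # Stage k cannot start until stage k-1 has produced its output.
            stage_out = -a * stage_in + x_prev[k] + a * y_prev[k]
            x_prev[k] = stage_in
            y_prev[k] = stage_out
            outputs[k].append(stage_out)
            stage_in = stage_out
    return outputs


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    for k, y in enumerate(allpass_chain(impulse, 0.5, 2)):
        print(f"stage {k}: {y}")
```

With `a = 0` every section reduces to a unit delay, which gives a quick sanity check of the chain.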
arXiv (Cornell University), Oct 31, 2011
This paper presents an efficient approach to the multiplierless implementation of an eight-point DCT approximation, based on the coordinate rotation digital computer (CORDIC) algorithm. The main design objective is to shorten the critical path of the corresponding circuits and reduce the combinational delay of the proposed scheme.
arXiv (Cornell University), Nov 6, 2012
The paper presents an algebraic technique for the derivation of fast discrete cosine transform (DCT) algorithms. The technique is based on the algebraic signal processing theory (ASP). In ASP, a DCT is associated with a polynomial algebra $A_{\mathbb{C}} = \mathbb{C}[x]/p(x)$. A fast algorithm is obtained as a stepwise decomposition of $A_{\mathbb{C}}$. In order to reveal the connection between the derivation of fast DCT algorithms and Galois theory, we define $A$ over the field of rational numbers $\mathbb{Q}$ instead of the complex numbers $\mathbb{C}$. The decomposition of $A_{\mathbb{Q}}$ requires extending the base field $\mathbb{Q}$ to the splitting field $E$ of the polynomial $p(x)$. Galois theory is used to find intermediate subfields $L_i$ in which the polynomial $p(x)$ is factored. A fast DCT algorithm is then derived from this factorization.
arXiv (Cornell University), Mar 15, 2012
A fast Discrete Cosine Transform (DCT) algorithm is introduced that can be of particular interest in image processing. The main features of the algorithm are the regularity of its graph and very low arithmetic complexity. The 16-point version of the algorithm requires only 32 multiplications and 81 additions. The computational core of the algorithm consists of only 17 nontrivial multiplications; the remaining 15 are scaling factors that can be compensated for in post-processing. The derivation of the algorithm is based on the algebraic signal processing theory (ASP).
Publication in the conference proceedings of EUSIPCO, Lisbon, Portugal, 2014
Publication in the conference proceedings of EUSIPCO, Bucharest, Romania, 2012
2017 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), 2017
2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), 2012
The paper presents a hearing aid (HA) system based on a low-delay cochlear filter bank derived from the discrete cochlear model. The spectral gain shaping method (SGSM), which includes dynamic range compression, hearing loss compensation and noise reduction, is applied in perceptually matched frequency bands. The acoustic feedback cancellation is implemented as an off-the-forward path scheme and does not introduce additional delay into the forward path. The cancellation itself is performed by adaptive filtering of non-uniform subband signals obtained from an oversampled warped cosine-modulated filter bank.

Two speech processing systems have been developed for real-time and non-real-time voice conversion. With real-time processing, the user can apply conversion during voice over IP (VoIP) calls, imitating the identity of a specified target speaker. The non-real-time system converts prerecorded audiobooks read by a professional reader, imitating the voice of the user. Both systems require some speech samples of the user for training. The training procedures are similar for both systems; however, the user is treated as the source speaker in the first case and as the target speaker in the second. For the parametric representation of speech we use a speech model based on instantaneous harmonic parameters with multicomponent sinusoidal excitation. The voice conversion itself is performed using artificial neural networks (ANN) with rectified linear units. Here we demonstrate implementations of the voice conversion systems with dedicated web interfaces and an iPhone application.
Speech Analysis Based on Sinusoidal Model with Time-Varying Parameters
Journal of The Audio Engineering Society, 2015
A real-time pitch modification system has been developed. The implemented processing scheme is based on hybrid deterministic/stochastic decomposition of the signal and includes extraction of instantaneous pitch, pitch-synchronous time-frequency analysis, parametric morphing and synthesis. The scheme provides high-quality output with high naturalness. The aim of the presentation is to show the capabilities of the designed real-time signal processing framework. The system implements speech-specific intonation change routines such as lowering, raising, tremolo, etc. To make the presentation more expressive, we designed a special singing mode in which the system automatically corrects wrong notes. The target melody and voice effects are specified using the Musical Instrument Digital Interface (MIDI).
Journal of The Audio Engineering Society, 2015
General-Purpose Listening Enhancement Based on Subband Non-Linear Amplification with Psychoacoustic Criterion
Journal of The Audio Engineering Society, 2015
Two speech processing systems have been developed for real-time and non-real-time voice conversion. With the real-time version, the user can apply conversion during voice over IP (VoIP) calls, imitating the identity of a specified target speaker. The non-real-time system converts prerecorded audiobooks read by a professional reader, imitating the voice of the user. Both systems require some speech samples of the user for training. The training procedures are similar for both systems; however, the user is treated as the source speaker in the first case and as the target speaker in the second. For the parametric representation of speech we use a speech model based on instantaneous harmonic parameters with multicomponent sinusoidal excitation. The voice conversion itself is performed using artificial neural networks (ANN) with rectified linear units.
Features extraction for the automatic detection of ALS disease from acoustic speech signals
2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), 2018
The paper presents features for detecting pathological changes in the acoustic speech signal for the diagnosis of the bulbar form of Amyotrophic Lateral Sclerosis (ALS). We collected recordings of the running speech test from 48 people, 26 of them with ALS. The proposed features are based on joint analysis of different vowels. The harmonic structure of the vowels is also taken into consideration. We also present the rationale for selecting the vowels used to calculate the proposed features. Applying these features to a classification task using linear discriminant analysis (LDA) leads to an overall correct classification rate of 88.0%.

2019 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), 2019
On average, the lack of biological markers causes a one-year diagnostic delay in detecting amyotrophic lateral sclerosis (ALS). To improve the diagnostic process, an automatic voice assessment based on acoustic analysis can be used. The purpose of this work was to verify the suitability of the sustained vowel phonation test for automatic detection of patients with ALS. We propose an enhanced procedure for separating the voice signal into fundamental periods, which is required for calculating perturbation measures (such as jitter and shimmer). We also propose a method for quantitative assessment of pathological vibrato manifestations in sustained vowel phonation. The experiments show that, using the proposed acoustic analysis methods, a classifier based on linear discriminant analysis attains 90.7% accuracy with 86.7% sensitivity and 92.2% specificity.
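To illustrate why the separation into fundamental periods matters, the sketch below computes the commonly used "local" jitter and shimmer from per-period measurements. It assumes the usual definition (mean absolute difference between consecutive values, divided by the mean value); the paper's exact perturbation measures and period-separation procedure are not reproduced here, and the function names and sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(period_durations):
    """Local jitter from the durations of consecutive fundamental periods."""
    return local_perturbation(period_durations)


def local_shimmer(period_amplitudes):
    """Local shimmer from the peak amplitudes of consecutive periods."""
    return local_perturbation(period_amplitudes)


if __name__ == "__main__":
    # Hypothetical per-period measurements (seconds, linear amplitude).
    periods = [0.0100, 0.0102, 0.0099, 0.0101]
    amplitudes = [0.80, 0.78, 0.82, 0.79]
    print(f"jitter:  {100 * local_jitter(periods):.2f} %")
    print(f"shimmer: {100 * local_shimmer(amplitudes):.2f} %")
```

Both measures depend entirely on where the period boundaries are placed, so errors in the period separation feed directly into the features.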
A Mobile Application for Detection of Amyotrophic Lateral Sclerosis via Voice Analysis
Speech and Computer, 2021
Interspeech 2013, 2013
This paper presents an approach to parametric voice conversion that can be used in real-time entertainment applications. The approach is based on spectral mapping using an artificial neural network (ANN) with rectified linear units (ReLU). To overcome the oversmoothing problem, a special network configuration is proposed that utilizes temporal states of the speaker. The speech is represented using the harmonic plus noise model. The parameters of the model are estimated using instantaneous harmonic parameters. Using objective and subjective measures, the proposed voice conversion technique is compared to the main alternative approaches.

Biomedical Signal Processing and Control, 2021
Amyotrophic lateral sclerosis (ALS) is an incurable neurological disorder with a rapidly progressive course. Common early symptoms of ALS are difficulties with swallowing and speech. However, the early acoustic manifestation of speech and voice symptoms is highly variable, which makes their detection very challenging for both human specialists and automatic systems. This study presents an approach to voice assessment for an automatic system that separates healthy people from patients with ALS. In particular, this work focuses on analysing sustained phonation of the vowels /a/ and /i/ to perform automatic classification of ALS patients. A wide range of acoustic features such as MFCC, formants, jitter, shimmer, vibrato, PPE, GNE, HNR, etc. were analysed. We also propose a new set of acoustic features for characterizing the harmonic structure of the vowels. The calculation of these features is based on pitch-synchronized voice analysis. Linear discriminant analysis (LDA) was used to classify the phonation produced by patients with ALS and by healthy individuals. Several feature selection algorithms were tested to find the optimal feature subset for the LDA model. The experiments show that the most successful LDA model, based on 32 features selected by the LASSO feature selection algorithm, attains 99.7% accuracy with 99.3% sensitivity and 99.9% specificity. Among the classifiers with a small number of features, we can highlight an LDA model with 5 features, which has 89.0% accuracy (87.5% sensitivity and 90.4% specificity).
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016
The paper presents an algorithm for accurate instantaneous pitch extraction. Main points: 1) sinusoidal modeling of signals with rapid pitch change; 2) extraction of the instantaneous pitch contour; 3) high analysis accuracy over the whole pitch range; 4) good performance in the presence of additive noise.