AUTOMATIC TEXT INDEPENDENT LANGUAGE IDENTIFICATION
A. NAGESH, India
Automatic language identification (LID) is the task of identifying the
language of a given utterance of speech using a machine. It is gaining increased
importance in the context of economic globalization.
A conventional language identification system requires a difficult and time-consuming process of labeling the phoneme boundaries of the utterances in the speech corpus. In the present work, an attempt is made to develop an automatic language identification system that requires neither a labeled speech corpus nor linguistic information about the target languages.
Until now, conventional LID systems have extracted features using Mel-frequency Cepstral Coefficients (MFCC). Recognizing the importance of the acoustic-phonetic information in speech, this work attempts to extract new features from the speech signal. The new method of feature extraction is based on the principle that the frequency of occurrence of phonemes differs across languages. In this work, the probability of each feature vector in the acoustic class is computed. A new type of feature vector is proposed that effectively captures the variations in the frequency of occurrence of phonemes across languages. Based on these new feature vectors, two LID systems are built: a new-features-based LID system using Gaussian Mixture Models (GMM) and a new-features-based LID system using hidden Markov models (HMM).
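The idea of computing the probability of a feature vector in an acoustic class can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method of this work: the acoustic classes are modeled as diagonal-covariance Gaussians, the per-frame class probabilities are posteriors obtained with Bayes' rule, and the utterance-level profile is a simple average of those posteriors. The function names and the toy data are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def class_posteriors(x, weights, means, variances):
    """Posterior probability of each (assumed) acoustic class given frame x."""
    log_joint = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(log_joint)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lj - peak) for lj in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def utterance_profile(frames, weights, means, variances):
    """Average the per-frame posteriors over all frames of an utterance."""
    profile = [0.0] * len(weights)
    for x in frames:
        for k, p in enumerate(class_posteriors(x, weights, means, variances)):
            profile[k] += p
    return [p / len(frames) for p in profile]


# Toy data (hypothetical): two acoustic classes in a 2-dimensional feature space.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.2], [4.8, 5.1], [5.2, 4.9]]

profile = utterance_profile(frames, weights, means, variances)
print(profile)
```

In this toy setup the profile reflects how often each acoustic class occurs in the utterance, which is the kind of occurrence-frequency variation the proposed features are meant to capture.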
The performance evaluation of three LID systems is presented: an MFCC-features-based LID system using Vector Quantization (VQ), and the new-features-based GMM and HMM LID systems. It is established that the LID system using HMM outperforms the LID systems using GMM and VQ in identification performance.
It is also established that, with this new type of features, the HMM-based LID system achieves a significant improvement in identification performance over LID systems based on conventional (MFCC) features. The experiments were carried out on the Oregon Graduate Institute Multi-language Telephone (OGI_MLT) speech corpus.