Dialect recognition is useful in many industrial sectors, particularly for improving the interaction between customers and providers. The core idea is to improve or customize marketing and customer service strategies depending on geographic location, birthplace, and culture. This study proposes different models to automatically discriminate between two Colombian dialects: "Antioqueño" and "Bogotano". To the best of our knowledge, this is the first work on Colombian dialect recognition based on real conversations from customer service centers. The proposed strategy consists of independent analyses of speech recordings and their corresponding transliterations. On the one hand, classical approaches are used: speech is modeled with prosody features, Mel-frequency cepstral coefficients, and mean Hilbert envelope coefficients, while text is modeled with Word2Vec and Bidirectional Encoder Representations from Transformers (BERT) embeddings. On the other hand, a deep learning approach is applied using convolutional neural networks, which are trained on spectrograms for speech and on embedding matrices for text. The implemented deep learning models seem more promising than the classical ones for the addressed problem. Further experiments will be considered to validate this claim across a wider spectrum of methods.
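As a rough illustration of the spectrogram input mentioned above, the following minimal sketch computes a log-magnitude spectrogram from a waveform using only NumPy. It is not the authors' pipeline: the function name `log_spectrogram`, the 8 kHz sample rate, the 25 ms frame, the 10 ms hop, and the Hamming window are all assumptions chosen for the example.

```python
import numpy as np


def log_spectrogram(signal, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Return a (num_frames, num_bins) log-magnitude spectrogram.

    Frame and hop durations are illustrative assumptions, not values
    taken from the paper.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one analysis frame")

    # Number of full frames that fit in the signal (no padding).
    num_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hamming(frame_len)

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_len: i * hop_len + frame_len] * window
        for i in range(num_frames)
    ])

    # Magnitude of the one-sided FFT of each frame, then log compression.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-10)


# Usage on a synthetic 440 Hz tone (a stand-in for a speech recording).
sample_rate = 8000  # assumed telephone-band rate
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
spec = log_spectrogram(tone, sample_rate)
peak_bin = int(np.argmax(spec.mean(axis=0)))
```

The resulting two-dimensional array (time frames by frequency bins) is the kind of image-like representation a convolutional network could take as input; an analogous matrix for text would stack one word-embedding vector per token.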
Topic:
Spanish Linguistics and Language Studies
Citations:
4
Altmetrics:
0
Source Information:
Source: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)