PERFORMANCE EVALUATION OF SPHINX AND HTK SPEECH RECOGNIZERS FOR SPOKEN ARABIC LANGUAGE

Al-Anzi, Fawaz; AbuZeina, Dia

dc.contributor.author	Al-Anzi, Fawaz
dc.contributor.author	AbuZeina, Dia
dc.date.accessioned	2021-05-09T08:08:08Z
dc.date.accessioned	2022-05-22T08:54:11Z
dc.date.available	2021-05-09T08:08:08Z
dc.date.available	2022-05-22T08:54:11Z
dc.date.issued	2019-06-03
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/8226
dc.description.abstract	Automatic speech recognition (ASR) has recently become a focus of research aimed at more convenient human-computer interaction. Although ASR technology has been implemented successfully for many languages, its use in Arabic natural language processing (NLP) applications remains limited and is largely constrained to small vocabularies, such as digits, control commands, or a limited set of words. Particular attention has therefore been paid to promoting research in this field to automate man-machine communication. We examine the performance of two popular ASR engines on an identical Arabic speech collection: Carnegie Mellon University (CMU) Sphinx and the Hidden Markov Model Toolkit (HTK). Running the same ASR task on different recognizers helps researchers judge which engine best fits a particular target application and supports further research in this field. In this paper, we present an experimental evaluation of the Sphinx and HTK recognizers using a new “in-house” Arabic continuous speech corpus that contains a total of 15.93 hours (12.74 training hours and 3.19 testing hours). The vocabulary contains 30,986 words. In these experiments, we used two text formats: Arabic characters for CMU Sphinx (PocketSphinx decoder) and Roman characters for HTK (HVite decoder), because HTK expects Roman characters. The experimental comparison shows that Sphinx outperforms HTK and does so in less time. In addition, this study describes the intermediate steps followed for model training, including acoustic and language models.	en_US
dc.language.iso	en_US	en_US
dc.publisher	ICIC International	en_US
dc.subject	Arabic speech recognition, Sphinx, HTK, HVite, Buckwalter, Language model, Pronunciation dictionary	en_US
dc.title	PERFORMANCE EVALUATION OF SPHINX AND HTK SPEECH RECOGNIZERS FOR SPOKEN ARABIC LANGUAGE	en_US
dc.type	Article	en_US
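
The abstract notes that the HTK transcripts were written in Roman characters, and the subject keywords list Buckwalter. A minimal sketch of such a conversion follows, assuming the standard Buckwalter transliteration table; the record does not state which variant the authors used or how they preprocessed the text, and the names `BUCKWALTER`, `to_buckwalter`, and `from_buckwalter` are hypothetical.

```python
# Sketch of Arabic -> Roman conversion using the standard Buckwalter table.
# Assumption: the record only says "Roman characters" and lists "Buckwalter"
# as a keyword; the authors' exact scheme and tooling are not given here.


def _span(first_codepoint, symbols):
    """Map consecutive Unicode code points to consecutive ASCII symbols."""
    return {chr(first_codepoint + i): s for i, s in enumerate(symbols)}


BUCKWALTER = {
    # U+0621..U+063A: hamza forms, alef, and the letters beh through ghain
    **_span(0x0621, "'|>&<}AbptvjHxd*rzs$SDTZEg"),
    # U+0640..U+0652: tatweel, feh through yeh, then the diacritics
    **_span(0x0640, "_fqklmnhwYyFNKaui~o"),
    # U+0670: superscript (dagger) alef
    "\u0670": "`",
}

# Inverse table for converting recognizer output back to Arabic script.
REVERSE_BUCKWALTER = {roman: arabic for arabic, roman in BUCKWALTER.items()}


def to_buckwalter(text):
    """Transliterate Arabic text; unmapped characters pass through unchanged."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in text)


def from_buckwalter(text):
    """Map Buckwalter symbols back to Arabic characters."""
    return "".join(REVERSE_BUCKWALTER.get(ch, ch) for ch in text)


if __name__ == "__main__":
    word = "\u0643\u062A\u0627\u0628"  # the Arabic word for "book"
    print(to_buckwalter(word))
```

The mapping is one-to-one, so transcripts converted for one recognizer can be mapped back to Arabic script for comparison against the other recognizer's output. Characters outside the table, such as spaces and digits, are left untouched.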