Abstract:
Pronunciation variation is a well-known
phenomenon that has been widely investigated in automatic
speech recognition (ASR). Knowledge-based
phonological rules are generally used to capture the accurate
phonetic realization in order to minimize the mismatch
between the ASR dictionary and the actual phonetic representation
of the speech signal. For Arabic ASR, a number
of studies have applied these rules; however, little research
has been devoted to measuring the precise effect of each rule.
In this paper, we aim to determine the exact effect of each rule
and to identify the rules that have no influence. We used the Carnegie Mellon
University PocketSphinx speech recognizer with a new
“in-house” Modern Standard Arabic speech corpus that
contains 19 h for training and 3.7 h for testing. We evaluated
the effect of three well-known rules (Shadda, Tanween,
and the solar letters). The experimental results do not show
clear evidence that using phonological rules for ASR dictionary
adaptation can enhance performance for within-word
pronunciation variation. These results might
indicate a need to reconsider this approach or to address other
aspects of ASR performance, such as cross-word pronunciation
variation and the optimal phoneme set for the Arabic language.