Morpheme Segmentation and Concatenation Approaches for Uyghur LVCSR

Abstract

In this paper, various kinds of sub-word lexica are thoroughly investigated within the framework of a Uyghur LVCSR system. Experimental results show that it is inefficient to model directly on word units or on small units such as morphemes or syllables. An optimal sub-word unit set lying between word and morpheme units is observed to fit the ASR system better. To select the best unit set, we investigate several unit segmentation and concatenation approaches and their ASR performance. For segmentation, we investigate a supervised approach, which splits words into the smallest functional units (the linguistic morphemes), and an unsupervised approach, which extracts pseudo-morphemes (or statistical morphemes). In the supervised model, a learning algorithm is trained on a manually prepared training corpus, and morpho-phonetic changes are analyzed. In the unsupervised model, the Morfessor tool is used to extract pseudo-morphemes from a raw text corpus. For concatenation, several approaches based on linguistic morphemes are investigated. The first is a data-driven approach, which concatenates morpheme sequences based on measures such as co-occurrence frequency or mutual probability. The second is a model-based approach, which merges units using global statistical criteria; in this study, the Morfessor program is revised and turned into a concatenation program by controlling segmentation points. The third is a two-layer-lexica-based concatenation approach, which extracts an optimal sub-word unit set by aligning and comparing the ASR results of the word and morpheme lexical layers. This method utilizes both speech and text, produced the best results in terms of WER and lexicon size, and proved to be very stable. The optimal lexicon, obtained entirely on the basis of an HMM-based acoustic model, outperformed all the baseline lexica.
When all these lexica are directly incorporated with a deep neural network (DNN) based acoustic model, without changing the speech and text training corpora or the language models, the optimal lexicon not only drastically improved ASR accuracy but also outperformed the other units, which supports the generality of the two-layer-lexica-based approach.
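
The data-driven concatenation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each word is already segmented into a list of morphemes and that adjacent morphemes are merged whenever their pair count in the corpus reaches a threshold. The function names `count_pairs` and `concatenate`, the `min_count` parameter, and the toy symbols are hypothetical and are not taken from the paper, which may use a different measure or merging order.

```python
from collections import Counter


def count_pairs(corpus):
    """Count adjacent morpheme pairs inside each segmented word."""
    pairs = Counter()
    for word in corpus:
        for left, right in zip(word, word[1:]):
            pairs[(left, right)] += 1
    return pairs


def concatenate(word, pairs, min_count):
    """Merge adjacent morphemes, left to right, when the pair is frequent.

    The decision for each boundary uses the count of the two original
    morphemes on either side of it (an assumption of this sketch).
    """
    if not word:
        return []
    units = [word[0]]
    prev = word[0]
    for morpheme in word[1:]:
        if pairs[(prev, morpheme)] >= min_count:
            units[-1] += morpheme   # frequent pair: glue onto the last unit
        else:
            units.append(morpheme)  # rare pair: keep the boundary
        prev = morpheme
    return units


# Toy corpus of segmented words; the symbols are placeholders, not Uyghur data.
corpus = [["s1", "x", "y"], ["s2", "x", "y"], ["s3", "x"], ["s1", "z"]]
pairs = count_pairs(corpus)
print(concatenate(["s1", "x", "y"], pairs, min_count=2))  # ['s1', 'xy']
```

Raising `min_count` keeps more boundaries and yields a lexicon closer to the morpheme layer, while lowering it merges more pairs and moves the units toward whole words.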

Table of Contents

Abstract
 1. Introduction
 2. Morpheme Segmentation Approaches
  2.1. Supervised Morpheme Segmentation
  2.2. Unsupervised Morpheme Segmentation
 3. Morpheme Concatenation Approaches
  3.1. Data-driven morpheme concatenation approaches
  3.2. A statistical model based morpheme concatenation approach
  3.3. Two-layer-lexica based morpheme concatenation approaches
 4. ASR results for segmented and concatenated lexica
  4.1. Acoustic model construction
  4.2. Lexical model construction
  4.3. ASR results on segmented lexica
  4.4. ASR results on concatenated lexica
  4.5. DNN based ASR results
 5. Conclusions
 Acknowledgements
 References

Author Information

  • Mijit Ablimit, Postdoctoral Research Station of Computer Science and Technology, Xinjiang University, Urumqi, China 830046
  • Tatsuya Kawahara, School of Informatics, Kyoto University, Kyoto, Japan
  • Askar Hamdulla, School of Software, Xinjiang University, Urumqi, China 830046
