Filled Pause Distribution and Modeling in Quasi-Spontaneous Speech

Publication Type:

Conference Paper


Disfluency in Spontaneous Speech, Berkeley, CA, USA, pp. 31-34 (1999)




Filled pauses (FPs) are characteristic of spontaneous speech and pose considerable problems for speech recognition because they are often misrecognized as short words. Recognition of quasi-spontaneous speech (medical dictation) is subject to this problem as well: an "um" can be recognized as "thumb" or "arm" if the recognizer's language model does not adequately represent FPs. Representing FPs in the training corpus improves recognition. Several techniques for seeding a training corpus with FPs were evaluated. The evaluation showed that a stochastic method, as well as random insertion uniformly distributed around the average sentence length, yields better results than random insertion at other ranges. The optimal method of seeding a training corpus with FPs may still be linked to clause boundaries, even though the imperfect method of inserting FPs at clause boundaries used in this study failed.
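
The random-insertion idea can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes one possible reading of "uniformly distributed around the average sentence length": the number of words between consecutive inserted pauses is drawn uniformly from a window centered on the average sentence length. The function name `seed_filled_pauses`, the `spread` parameter, and the token `um` are hypothetical choices made for illustration.

```python
import random


def seed_filled_pauses(tokens, avg_sentence_len, spread=0.5,
                       fp_token="um", rng=None):
    """Insert fp_token into a token stream at random intervals.

    Assumption (not taken from the paper): the gap, in words, before each
    inserted pause is drawn uniformly from the integer range
    [avg_sentence_len * (1 - spread), avg_sentence_len * (1 + spread)].
    """
    rng = rng or random.Random()
    low = max(1, round(avg_sentence_len * (1 - spread)))
    high = max(low, round(avg_sentence_len * (1 + spread)))

    seeded = []
    gap = rng.randint(low, high)  # words remaining until the next pause
    for token in tokens:
        seeded.append(token)
        gap -= 1
        if gap == 0:
            seeded.append(fp_token)
            gap = rng.randint(low, high)
    return seeded


if __name__ == "__main__":
    corpus = "the patient was seen today for follow up of chest pain".split()
    print(" ".join(seed_filled_pauses(corpus, avg_sentence_len=4,
                                      rng=random.Random(1))))
```

A language model trained on the seeded token stream would then see the pause token in context. Changing `avg_sentence_len` or `spread` shifts the insertion range, which is the kind of comparison the abstract describes.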


UC Berkeley; July 30, 1999