Hierarchical Non-Emitting Markov Models
We describe a simple variant of the interpolated Markov model with
non-emitting state transitions and prove that it is strictly more
powerful than any Markov model. More importantly, the non-emitting
model outperforms the classic interpolated model on natural
language texts under a wide range of experimental conditions, with
only a modest increase in computational requirements. The non-emitting
model is also much less prone to overfitting.
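
For orientation, the classic interpolated model used as the baseline can be sketched as a recursion that mixes the maximum-likelihood estimate for a context with the estimate for the next shorter context. The sketch below is illustrative only and covers the baseline, not the non-emitting variant. The function names, the single constant mixing weight `lam`, and the uniform distribution below the empty context are assumptions made for this example; in practice the weights typically depend on the context.

```python
from collections import Counter


def train_counts(text, max_order):
    """Count (context, symbol) pairs and context totals for orders 0..max_order."""
    pair_counts = Counter()
    context_counts = Counter()
    for i, symbol in enumerate(text):
        for k in range(max_order + 1):
            if i - k < 0:
                break  # not enough history for a context of length k
            context = text[i - k:i]
            pair_counts[(context, symbol)] += 1
            context_counts[context] += 1
    return pair_counts, context_counts


def interpolated_prob(symbol, context, pair_counts, context_counts, alphabet, lam=0.5):
    """Interpolated estimate p(symbol | context).

    Assumption: one constant weight `lam` for every context, and a uniform
    distribution over `alphabet` below the empty context.
    """
    if context == "":
        lower = 1.0 / len(alphabet)
    else:
        # Drop the oldest symbol to obtain the next shorter context.
        lower = interpolated_prob(
            symbol, context[1:], pair_counts, context_counts, alphabet, lam
        )
    total = context_counts[context]
    if total == 0:
        # Unseen context: rely entirely on the shorter context.
        return lower
    ml = pair_counts[(context, symbol)] / total
    return lam * ml + (1.0 - lam) * lower


if __name__ == "__main__":
    text = "abracadabra"
    pair_counts, context_counts = train_counts(text, max_order=2)
    alphabet = sorted(set(text))
    for s in alphabet:
        p = interpolated_prob(s, "ab", pair_counts, context_counts, alphabet)
        print(s, round(p, 4))
```

Because each level is a convex combination of two distributions, the estimates for a fixed context sum to one over the alphabet.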