Dept. of Electrical Engineering
University of Washington, Seattle
This talk presents novel techniques for two well-known NLP problems: statistical language modeling and part-of-speech tagging. It introduces factored language models, which combine a feature-vector-style representation of words with a generalized backoff scheme for probability estimation, and which are shown to significantly outperform standard language models on sparse-data tasks. The part-of-speech tagging problem is addressed within a transductive learning framework that exploits a combination of labeled and unlabeled data, achieving competitive performance while considerably reducing the need for labeled data.
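The core idea behind factored language models can be sketched in a few lines of code. The following toy example is an illustrative assumption, not the speaker's actual implementation: each word is represented as a bundle of factors (here surface form, stem, and POS tag), and when the full conditioning context is too sparse, the model backs off along a path that drops factors one at a time rather than whole words. All class and factor names are hypothetical.

```python
from collections import defaultdict

class FactoredBigramLM:
    """Toy factored bigram model with a single fixed backoff path
    (full bundle -> stem+POS -> POS -> empty context). A real
    generalized backoff scheme would consider multiple such paths."""

    def __init__(self, min_count=1):
        self.min_count = min_count
        # context tuple -> successor word -> count
        self.counts = defaultdict(lambda: defaultdict(int))

    @staticmethod
    def _backoff_contexts(bundle):
        word, stem, pos = bundle
        # progressively coarser views of the conditioning word
        return [(word, stem, pos), (stem, pos), (pos,), ()]

    def train(self, sentences):
        # sentences: lists of (word, stem, pos) factor bundles
        for sent in sentences:
            prev = ("<s>", "<s>", "<s>")
            for bundle in sent:
                word = bundle[0]
                # count the successor under every backoff context
                for ctx in self._backoff_contexts(prev):
                    self.counts[ctx][word] += 1
                prev = bundle

    def prob(self, word, prev_bundle):
        # use the richest context with enough mass; otherwise back off
        for ctx in self._backoff_contexts(prev_bundle):
            dist = self.counts.get(ctx)
            if dist and sum(dist.values()) >= self.min_count:
                return dist.get(word, 0) / sum(dist.values())
        return 0.0
```

For example, after training on two sentences containing "the dog" and "a cat", the context ("the", "the", "DET") alone is too sparse to predict "cat", but backing off to the POS factor ("DET",) yields a nonzero probability. The benefit of factoring is exactly this: contexts unseen at the word level can still be estimated from their coarser factors.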
Created by: Anke Weinberger (2005-11-08).
Maintained by: Anke Weinberger (2005-11-08).