NLP Libraries and Toolkits

STAR Laboratory: SRI Language Modeling Toolkit. SRILM is a toolkit for building and applying statistical language models (LMs), primarily for use in speech recognition, statistical tagging and segmentation, and machine translation. It has been under development in the SRI Speech Technology and Research Laboratory since 1995. The toolkit has also benefited greatly from its use and enhancement during the Johns Hopkins University/CLSP summer workshops in 1995, 1996, 1997, and 2002 (see history). These pages and the software itself assume that you know what statistical language modeling is. To learn about language modeling, we recommend the textbooks listed on the SRILM home page. Either book gives an excellent introduction to N-gram language modeling, which is the main type of LM supported by SRILM.
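For readers who want a concrete picture of what an N-gram language model estimates, here is a minimal sketch in plain Python. It is not SRILM code and does not use SRILM's API: the function names, the toy corpus, and the choice of add-one (Laplace) smoothing are all assumptions made for illustration. It counts bigrams and turns the counts into conditional probabilities P(word | previous word).

```python
from collections import Counter


def train_bigram_counts(sentences):
    """Count bigram contexts and bigrams over whitespace-tokenized sentences.

    Hypothetical helper for illustration; not part of SRILM.
    """
    context_counts = Counter()  # how often each token appears as a left context
    bigram_counts = Counter()   # how often each (previous, word) pair appears
    vocab = set()               # every token the model can predict
    for sentence in sentences:
        # Pad with sentence-start and sentence-end markers.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
        vocab.update(tokens[1:])  # "<s>" is never predicted, so it is excluded
    return context_counts, bigram_counts, vocab


def bigram_prob(prev, word, context_counts, bigram_counts, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))


# Toy corpus, assumed purely for the example.
corpus = ["the cat sat", "the dog sat"]
context_counts, bigram_counts, vocab = train_bigram_counts(corpus)

p_cat_given_the = bigram_prob("the", "cat", context_counts, bigram_counts, vocab)
print(p_cat_given_the)
```

Add-one smoothing is used here only because it is short enough to read at a glance; it keeps unseen bigrams from receiving zero probability while the estimates for each context still sum to one over the vocabulary.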

SRILM's components include a set of C++ class libraries implementing language models, supporting data structures, and miscellaneous utility functions. SRILM runs on UNIX and Windows platforms and is still under development.

Natural Language Toolkit - NLTK 2.0 documentation.

Book - Natural Language Toolkit.