Title | Language Modeling for Information Retrieval
Editors | W. Bruce Croft (Distinguished Professor), John Laff
Series | The Information Retrieval Series
Description | A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text. Such a definition is general enough to include an endless variety of schemes. However, a distinction should be made between generative models, which can in principle be used to synthesize artificial text, and discriminative techniques for classifying text into predefined categories. The first statistical language modeler was Claude Shannon. In exploring the application of his newly founded theory of information to human language, Shannon considered language as a statistical source, and measured how well simple n-gram models predicted or, equivalently, compressed natural text. To do this, he estimated the entropy of English through experiments with human subjects, and also estimated the cross-entropy of the n-gram models on natural text. The ability of language models to be quantitatively evaluated in this way is one of their important virtues. Of course, estimating the true entropy of language is an elusive goal, aiming at many moving targets, since language is so varied and evolves so quickly. Yet fifty years after Shannon's study, language models remain, by all measures,
Publication date | Book 2003
Keywords | DOM; Performance; Text; cognition; database; filtering; machine translation; speech recognition
Edition | 1
DOI | https://doi.org/10.1007/978-94-017-0171-6
ISBN (softcover) | 978-90-481-6263-5
ISBN (ebook) | 978-94-017-0171-6
Series ISSN | 1871-7500
Series E-ISSN | 2730-6836
Copyright | Springer Science+Business Media Dordrecht 2003