Title | Explorations in Automatic Thesaurus Discovery
Editor | Gregory Grefenstette
Series | The Springer International Series in Engineering and Computer Science
Description | Explorations in Automatic Thesaurus Discovery presents an automated method for creating a first-draft thesaurus from raw text. It describes the natural language processing steps of tokenization, surface syntactic analysis, and syntactic attribute extraction. From these attributes, word and term similarity is calculated, and a thesaurus is created showing important common terms and their relation to each other, common verb-noun pairings, common expressions, and word family members. The techniques are tested on twenty different corpora, ranging from baseball newsgroups, assassination archives, medical X-ray reports, and abstracts on AIDS to encyclopedia articles on animals, and even the text of the book itself. The corpora range from 40,000 to 6 million characters of text, and results are presented for each in the Appendix. The methods described in the book have undergone extensive evaluation. Their time and space complexity are shown to be modest. The results are shown to converge to a stable state as the corpus grows. The similarities calculated are compared to those produced by psychological testing. A method of evaluation using Artificial Synonyms is tested. Gold Standards evaluation shows that techniques signif
Publication date | Book 1994
Keywords | Thesaurus; artificial intelligence; complexity; corpus; semantic analysis
Edition | 1
doi | https://doi.org/10.1007/978-1-4615-2710-7
isbn_softcover | 978-1-4613-6167-1
isbn_ebook | 978-1-4615-2710-7
issn_series | 0893-3405
copyright | Springer Science+Business Media New York 1994