We test the technique on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We consider a new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences. In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences. Our system, based on LexRank, ranked in first place in more than one task in the recent DUC evaluation.
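The core idea above can be sketched in a few lines: build a graph whose nodes are sentences, connect sentences whose cosine similarity exceeds a threshold, and take the stationary distribution of a random walk on that graph as the salience scores. The sketch below uses plain term-frequency vectors rather than the paper's idf-weighted vectors, and the threshold and damping values are illustrative assumptions, not the paper's tuned settings:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def lexrank(sentences, threshold=0.1, damping=0.85, iters=50):
    """Score sentences by eigenvector centrality on a similarity graph.

    Simplified sketch: raw term frequencies stand in for tf-idf.
    """
    vecs = [Counter(s.lower().split()) for s in sentences]
    n = len(sentences)
    # Adjacency matrix: an edge where cosine similarity exceeds the threshold.
    adj = [[1.0 if i != j and cosine(vecs[i], vecs[j]) > threshold else 0.0
            for j in range(n)] for i in range(n)]
    # Row-normalize into a stochastic matrix (uniform row for isolated nodes).
    for row in adj:
        s = sum(row)
        for j in range(n):
            row[j] = row[j] / s if s else 1.0 / n
    # Power iteration for the stationary distribution.
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - damping) / n
                  + damping * sum(scores[i] * adj[i][j] for i in range(n))
                  for j in range(n)]
    return scores
```

An extractive summary then simply takes the top-scoring sentences. Because each row of the adjacency matrix is normalized to sum to one, the scores form a probability distribution over sentences.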
Paper translation: LexRank: Graph-based Lexical Centrality as Salience in Text Summarization