text summarization
Recently Published Documents


TOTAL DOCUMENTS

1490
(FIVE YEARS 697)

H-INDEX

38
(FIVE YEARS 8)

Author(s):  
Jovi D’Silva ◽  
Uzzal Sharma

Automatic text summarization has gained immense popularity in research, and several methods have been explored for obtaining effective summaries. However, most of this work pertains to the most widely spoken languages. In this paper, we explore extractive automatic text summarization using a deep learning approach and apply it to Konkani, a low-resource language for which data, tools, speakers, and experts are limited. In the proposed technique, Facebook’s fastText pre-trained word embeddings are used to obtain a vector representation for sentences. A deep multi-layer perceptron is then employed, treating summary generation as a supervised binary classification task over the feature vectors. Using pre-trained fastText word embeddings eliminated the need for a large training set and reduced training time. The system-generated summaries were evaluated against the ‘gold-standard’ human-generated summaries with the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit. The results showed that the performance of the proposed system closely matched that of the human annotators in generating summaries.
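One common way to turn word embeddings into a sentence vector is to average the vectors of the words in the sentence. The abstract does not state which pooling method the authors used, so the sketch below is only an illustration under that assumption. The tiny `TOY_EMBEDDINGS` table and the `sentence_vector` helper are hypothetical stand-ins for real pre-trained fastText vectors, which typically have several hundred dimensions.

```python
from typing import Dict, List

# Hypothetical 3-dimensional embedding table, standing in for
# pre-trained fastText vectors (assumption, not the authors' data).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "text": [1.0, 0.0, 0.0],
    "summary": [0.0, 1.0, 0.0],
    "short": [0.0, 0.0, 1.0],
}


def sentence_vector(
    sentence: str, embeddings: Dict[str, List[float]], dim: int = 3
) -> List[float]:
    """Average the vectors of known words; zero vector if none are known."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) groups the values of each dimension together.
    return [sum(component) / len(vectors) for component in zip(*vectors)]


print(sentence_vector("Text summary", TOY_EMBEDDINGS))
```

A vector of this kind, one per sentence, could then be passed to a binary classifier that labels each sentence as "include in summary" or "exclude".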


2022 ◽  
Vol 2022 ◽  
pp. 1-14
Author(s):  
Y.M. Wazery ◽  
Marwa E. Saleh ◽  
Abdullah Alharbi ◽  
Abdelmgeid A. Ali

Text summarization (TS) is considered one of the most difficult tasks in natural language processing (NLP) and remains a major challenge for modern computer systems despite their recent improvements. Many papers and research studies in the literature address this task, but most deal with extractive summarization; few deal with abstractive summarization, especially for the Arabic language, owing to its complexity. In this paper, an abstractive Arabic text summarization system is proposed, based on a sequence-to-sequence model. This model works through two components, an encoder and a decoder. Our aim is to develop the sequence-to-sequence model using several deep artificial neural networks to investigate which of them achieves the best performance. Different layers of Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM) have been used to develop the encoder and the decoder. In addition, the global attention mechanism has been used because it provides better results than the local attention mechanism. Furthermore, AraBERT preprocessing has been applied in the data preprocessing stage, which helps the model understand Arabic words and achieve state-of-the-art results. Moreover, a comparison between the skip-gram and the continuous bag of words (CBOW) word2Vec word embedding models has been made. We built these models using the Keras library and ran them in a Google Colab Jupyter notebook. Finally, the proposed system is evaluated through the ROUGE-1, ROUGE-2, ROUGE-L, and BLEU evaluation metrics. The experimental results show that three layers of BiLSTM hidden states at the encoder achieve the best performance. In addition, our proposed system outperforms other recent research studies. The results also show that abstractive summarization models that use the skip-gram word2Vec model outperform the models that use the CBOW word2Vec model.
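The difference between the two word2Vec variants compared above lies in how training examples are formed: skip-gram predicts each context word from the center word, while CBOW predicts the center word from its surrounding context. The sketch below only illustrates that data-preparation step on a toy token list. The function names and the `window` parameter are illustrative assumptions, not taken from the paper or from any library.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 1) -> List[Tuple[str, str]]:
    """Skip-gram: one (center, context) example per neighbouring word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(
    tokens: List[str], window: int = 1
) -> List[Tuple[List[str], str]]:
    """CBOW: one (context words, center) example per position."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:
            examples.append((context, center))
    return examples


tokens = ["a", "b", "c"]
print(skipgram_pairs(tokens))
print(cbow_examples(tokens))
```

Skip-gram yields more, finer-grained examples per sentence than CBOW, since every neighbouring word becomes its own example rather than part of one pooled context.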


2022 ◽  
Vol 15 (1) ◽  
pp. 1-18
Author(s):  
Krishnaveni P. ◽  
Balasundaram S. R.

The day-to-day growth of online information necessitates intensive research in automatic text summarization (ATS). ATS software produces a summary by extracting important information from the original text. With the help of summaries, users can easily read and understand the documents of interest. Most ATS approaches use only local properties of the text. Moreover, the large number of properties makes sentence selection difficult and complicated. This article therefore uses graph-based summarization to exploit structural and global properties of the text. It introduces the maximal clique based sentence selection (MCBSS) algorithm to select important and non-redundant sentences that cover all concepts of the input text for the summary. The MCBSS algorithm finds novel information using maximal cliques (MCs). Experimental results with Recall-Oriented Understudy for Gisting Evaluation (ROUGE) on the Timeline dataset show that the proposed work outperforms the existing graph algorithms Bushy Path (BP), Aggregate Similarity (AS), and TextRank (TR).
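A maximal clique is a set of mutually connected nodes that cannot be extended by adding another node. In a sentence-similarity graph, such a set can be read as a group of sentences that all resemble one another. The sketch below enumerates maximal cliques with the basic Bron-Kerbosch recursion. It is not the MCBSS algorithm itself, whose details the abstract does not give. The example graph is hypothetical, with nodes standing for sentence indices and edges for "similar enough" pairs.

```python
from typing import Dict, List, Set


def maximal_cliques(graph: Dict[int, Set[int]]) -> List[Set[int]]:
    """Enumerate maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    cliques: List[Set[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates to add, x: already-explored nodes.
        if not p and not x:
            cliques.append(r)
            return
        for v in sorted(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Hypothetical similarity graph: sentences 0, 1, 2 are mutually similar,
# sentence 3 is similar only to 2, and sentence 4 is similar to none.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
print(maximal_cliques(graph))
```

A selection step could then pick a limited number of sentences per clique to reduce redundancy, although the exact rule used by MCBSS is not specified in this abstract.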


2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

The traditional frequency-based approach to creating a multi-document extractive summary ranks sentences by scores computed by summing the TF*IDF weights of the words they contain. In this approach, TF, or term frequency, is calculated from how frequently a term (word) occurs in the input, and TF calculated in this way does not take into account the semantic relations among terms. In this paper, we propose methods that exploit semantic term relations to improve the sentence ranking and redundancy removal steps of a summarization system. Our proposed summarization system has been tested on the DUC 2003 and DUC 2004 benchmark multi-document summarization datasets. The experimental results reveal that the performance of our multi-document text summarizer improves significantly when the distributional term similarity measure is used for finding semantic term relations. Our multi-document text summarizer also outperforms the well-known summarization baselines to which it is compared.
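The traditional baseline described above can be sketched as follows: each sentence is scored by summing TF*IDF over its words, with TF counted over the whole input. The abstract does not say how IDF is obtained, so this sketch assumes IDF = log(N / df) computed over the N input documents. That choice, the whitespace tokenization, and the function name `tfidf_sentence_scores` are all illustrative assumptions.

```python
import math
from collections import Counter
from typing import List


def tfidf_sentence_scores(documents: List[List[str]]) -> List[List[float]]:
    """Score each sentence as the sum of TF*IDF over its tokens.

    documents: list of documents, each a list of sentence strings.
    TF is the term count over the whole input; IDF is log(N / df),
    where df is the number of input documents containing the term.
    """
    tokenized = [[s.lower().split() for s in doc] for doc in documents]
    tf = Counter(tok for doc in tokenized for sent in doc for tok in sent)
    df = Counter()
    for doc in tokenized:
        df.update({tok for sent in doc for tok in sent})
    n_docs = len(documents)
    idf = {term: math.log(n_docs / df[term]) for term in df}
    return [
        [sum(tf[tok] * idf[tok] for tok in sent) for sent in doc]
        for doc in tokenized
    ]


docs = [["cats purr", "dogs bark"], ["cats sleep"]]
print(tfidf_sentence_scores(docs))
```

Here "cats" appears in every document and so receives an IDF of zero, which makes "dogs bark" the top-ranked sentence. Such scores rely on surface counts only, which is the limitation the semantic term relations in the paper are meant to address.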

