metadata extraction
Recently Published Documents


Total documents: 183 (five years: 42)
H-index: 12 (five years: 1)

2021
Author(s): Muntabir Hasan Choudhury, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, Edward A. Fox

2021, Vol 38, pp. 301266
Author(s): Kyle Porter, Rune Nordvik, Fergus Toolan, Stefan Axelsson

2021
Author(s): Zeyd Boukhers, Nada Beili, Timo Hartmann, Prantik Goswami, Muhammad Arslan Zafar

2021, Vol 13 (16), pp. 9391
Author(s): Despina Christou, Grigorios Tsoumakas

In the era of Big Data, the digitization of texts and advances in Artificial Intelligence (AI) and Natural Language Processing (NLP) are enabling the automatic analysis of literary works, allowing us to delve into the structure of artifacts and to compare, explore, manage and preserve the richness of our written heritage. This paper proposes a deep-learning-based approach to discovering semantic relationships in literary texts (19th-century Greek literature), facilitating the analysis, organization and management of collections through the automation of metadata extraction. Moreover, we provide a new annotated dataset used to train our model. Our proposed model, REDSandT_Lit, recognizes six distinct relationships, extracting the richest set of relations from literary texts to date. It efficiently captures the semantic characteristics of the time period under investigation by fine-tuning the state-of-the-art transformer-based Language Model (LM) for Modern Greek on our corpora. Extensive experiments and comparisons with existing models on our dataset reveal that REDSandT_Lit has superior performance (90% accuracy), manages to capture infrequent relations (100% F-score in long-tail relations) and can also correct mislabelled sentences. Our results suggest that our approach efficiently handles the peculiarities of literary texts and that it is a promising tool for managing and preserving cultural information in various settings.


2021, Vol 8
Author(s): Murtaza Saifee, Jian Wu, Yingna Liu, Ping Ma, Jutima Patlidanon, ...

Purpose: To introduce and validate hvf_extraction_script, an open-source software script for the automated extraction and structuring of metadata, value plot data, and percentile plot data from Humphrey visual field (HVF) report images.
Methods: Validation was performed on 90 HVF reports over three different report layouts, including a total of 1,530 metadata fields, 15,536 value plot data points, and 10,210 percentile data points, between the computer script and four human extractors, compared against DICOM reference data. Computer extraction and human extraction were compared on extraction time as well as accuracy of extraction for metadata, value plot data, and percentile plot data.
Results: Computer extraction required 4.9-8.9 s per report, compared to the 6.5-19 min required by human extractors, representing a more than 40-fold difference in extraction speed. The aggregate error rate for computer metadata extraction varied from 1.2% to 3.5%, compared to 0.2-9.2% for human metadata extraction across all layouts. Computer value data point extraction had an aggregate error rate of 0.9% for version 1, <0.01% for version 2, and 0.15% for version 3, compared to an aggregate error rate of 0.8-9.2% for human extraction. Computer percentile data point extraction similarly had very low error rates, with no errors occurring in versions 1 and 2 and a 0.06% error rate in version 3, compared to a 0.06-12.2% error rate for human extraction.
Conclusions: This study introduces and validates hvf_extraction_script, an open-source tool for fast, accurate, automated data extraction of HVF reports to facilitate analysis of large-volume HVF datasets, and demonstrates the value of image processing tools in facilitating faster and cheaper large-volume data extraction in research settings.


IEEE Access, 2021, Vol 9, pp. 10621-10633
Author(s): Pradeeban Kathiravelu, Ashish Sharma, Puneet Sharma
