document indexing Latest Research Papers

Fuzzy Ontology-Based Possibilistic Approach for Document Indexing Using Semantic Concept Relations

10.1007/978-3-030-86475-0_26 ◽

2021 ◽

pp. 264-269

Author(s):

Kabil Boukhari ◽

Mohamed Nazih Omri

Keyword(s):

Semantic Concept ◽

Document Indexing ◽

Fuzzy Ontology

Testing the validity of Wikipedia categories for subject matter labelling of open-domain corpus data

Journal of Information Science ◽

10.1177/0165551520977438 ◽

2020 ◽

pp. 016555152097743

Author(s):

Ahmad Aghaebrahimian ◽

Andy Stauder ◽

Michael Ustaszewski

Keyword(s):

Subject Matter ◽

Qualitative Data ◽

Classification Systems ◽

Open Domain ◽

Document Indexing ◽

Category System ◽

Text Corpora ◽

Knowledge Organisation ◽

Valid Instrument ◽

Manual Indexing

The Wikipedia category system was designed to enable browsing and navigation of Wikipedia. It is also a useful resource for knowledge organisation and document indexing, especially using automatic approaches. However, it has received little attention as a resource for manual indexing. In this article, a hierarchical taxonomy of three-level depth is extracted from the Wikipedia category system. The resulting taxonomy is explored as a lightweight alternative to expert-created knowledge organisation systems (e.g. library classification systems) for the manual labelling of open-domain text corpora. Combining quantitative and qualitative data from a crowd-based text labelling study, the validity of the taxonomy is tested and the results quantified in terms of interrater agreement. While the usefulness of the Wikipedia category system for automatic document indexing is documented in the pertinent literature, our results suggest that at least the taxonomy we derived from it is not a valid instrument for manual subject matter labelling of open-domain text corpora.
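The abstract reports interrater agreement but does not name the statistic used. As a minimal sketch, assuming Fleiss' kappa (one common choice when several raters label each text), agreement could be computed as follows; the function name and the toy labels are illustrative, not taken from the article.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per rater).

    Assumes every item was labelled by the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_chance = sum((n / (n_items * n_raters)) ** 2 for n in totals.values())

    if p_chance == 1.0:  # all raters used a single category everywhere
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical labels: three raters, four texts.
example = [
    ["Science", "Science", "Science"],
    ["Arts", "Arts", "Science"],
    ["Arts", "Arts", "Arts"],
    ["History", "Arts", "Science"],
]
print(round(fleiss_kappa(example), 3))
```

Values near 1 indicate strong agreement, values near 0 indicate agreement no better than chance, and negative values indicate systematic disagreement.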

NACA/NASA Document Indexing: 1915-95

10.18260/1-2--5051 ◽

2020 ◽

Author(s):

Larry Thompson

Keyword(s):

Document Indexing

End-to-End Contextualized Document Indexing and Retrieval with Neural Networks

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval ◽

10.1145/3397271.3401453 ◽

2020 ◽

Author(s):

Sebastian Hofstätter

Keyword(s):

Neural Networks ◽

Document Indexing ◽

Indexing And Retrieval ◽

End To End

Algorithm of the automated events classification process in the information space

Artificial Intelligence ◽

10.15407/jai2020.02.042 ◽

2020 ◽

Vol 25 (2) ◽

pp. 42-52

Author(s):

Hrytsiuk V.V. ◽

Keyword(s):

Decision Making ◽

Russian Federation ◽

Data Retrieval ◽

Negative Information ◽

Information Space ◽

Global Information ◽

Automated Classification ◽

Document Indexing ◽

The Russian Federation

The article defines an algorithm, and details its sequential tasks, for building an effective model for the automated classification of events in the information space. On the eve of and during the armed aggression of the Russian Federation against Ukraine, the consequences of external negative information influence were noticeable, so organizing and implementing countermeasures to such influence is urgent. An important component of this activity is the classification (clustering) of information events in order to analyze them further and form proposals for decision-making to counteract the negative information impact. Counteracting such influence in the global information space, and in the state's information space in particular, requires constantly processing a significant amount of information, so the efficiency of this process is improved by automating its components. The algorithm of the automated classification process consists of a number of consecutive tasks: data retrieval; preselection of messages ("rough" classification); saving the preselected messages in the database; determining a set of indicators for the automated classification of information events; preprocessing each document (indexing); distributing messages into categories according to criteria ("accurate" classification); presenting the information in a convenient form (visualization); and saving the classification results in the database. The article describes the content of these tasks. The proposed algorithm will automatically divide information events (messages) of different kinds into categories (classes) in order to make the assessment of the level of negative information impact on target audiences more efficient and to enable a timely (proactive) response to its manifestations.
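The core of the task sequence in the abstract above (preselection, indexing, "accurate" classification) can be sketched roughly as below. This is an illustration under stated assumptions, not the article's method: the indicator terms, category names, and scoring rule are all hypothetical, and retrieval, database storage, and visualization are omitted.

```python
import re
from collections import Counter

# Hypothetical indicator terms per category; the abstract does not list any.
INDICATORS = {
    "economy": {"sanctions", "prices", "currency"},
    "energy": {"gas", "pipeline", "electricity"},
}


def tokenize(text):
    """Lowercase a message and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rough_filter(messages, vocabulary):
    """Preselection ("rough" classification): keep messages with any indicator term."""
    return [m for m in messages if vocabulary & set(tokenize(m))]


def index_message(text):
    """Indexing: reduce a single message to its term counts."""
    return Counter(tokenize(text))


def classify(term_counts, indicators):
    """"Accurate" classification: pick the category with the most indicator hits."""
    scores = {
        category: sum(term_counts[t] for t in terms)
        for category, terms in indicators.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def run_pipeline(messages, indicators=INDICATORS):
    """Chain preselection, indexing, and classification; return message -> category."""
    vocabulary = set().union(*indicators.values())
    selected = rough_filter(messages, vocabulary)
    return {m: classify(index_message(m), indicators) for m in selected}


if __name__ == "__main__":
    sample = [
        "Gas pipeline repairs delayed",
        "Weather is sunny",
        "Prices rose after sanctions",
    ]
    for message, category in run_pipeline(sample).items():
        print(category, "<-", message)
```

Splitting the work into a cheap filter followed by a more careful scorer mirrors the abstract's distinction between "rough" and "accurate" classification: most irrelevant messages never reach the indexing step.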

Approximate matching-based unsupervised document indexing approach: application to biomedical domain

Scientometrics ◽

10.1007/s11192-020-03474-w ◽

2020 ◽

Vol 124 (2) ◽

pp. 903-924 ◽

Cited By ~ 1

Author(s):

Kabil Boukhari ◽

Mohamed Nazih Omri

Keyword(s):

Biomedical Domain ◽

Approximate Matching ◽

Document Indexing

DL-VSM based document indexing approach for information retrieval

Journal of Ambient Intelligence and Humanized Computing ◽

10.1007/s12652-020-01684-x ◽

2020 ◽

Cited By ~ 1

Author(s):

Kabil Boukhari ◽

Mohamed Nazih Omri

Keyword(s):

Information Retrieval ◽

Document Indexing

An Approach of Documents Indexing Using Summarization

Advances in Library and Information Science - Critical Approaches to Information Retrieval Research ◽

10.4018/978-1-7998-1021-6.ch005 ◽

2020 ◽

pp. 78-86 ◽

Cited By ~ 1

Author(s):

Rida Khalloufi ◽

Rachid El Ayachi ◽

Mohamed Biniz ◽

Mohamed Fakir ◽

Muhammad Sarfraz

Keyword(s):

Information Retrieval ◽

Research Process ◽

Storage Space ◽

Document Indexing ◽

Retrieval Systems ◽

Information Retrieval Systems

Document indexing is an active research domain that interests many researchers. It is generally used in information retrieval systems. Document indexing encompasses a set of approaches that can be applied to index a document using a corpus. This processing has several advantages, such as accelerating the search process, finding the pertinent content related to a query, and reducing storage space. Using the entire document in the indexing process affects several parameters, such as indexing time, search time, and the storage space used in processing. The focus of this chapter is to improve all of these parameters by proposing a new indexing approach. The goal of the proposed approach is to use summarization to minimize the size of documents without affecting their meaning.
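The idea of indexing a summary instead of the full document can be illustrated with a small sketch. The chapter's actual summarizer is not described in the abstract, so this example assumes a simple frequency-based extractive summarizer and a plain inverted index; the stopword list and all function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Small illustrative stopword list (an assumption, not from the chapter).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "for", "on", "that"}


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Keep the sentences whose terms are most frequent in the whole document."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    kept = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in kept)


def build_index(docs, max_sentences=2):
    """Build an inverted index (term -> document ids) from summaries only."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(summarize(text, max_sentences)):
            index[term].add(doc_id)
    return dict(index)


if __name__ == "__main__":
    docs = {"d1": "Indexing speeds up retrieval. Indexing needs storage. Cats sleep a lot."}
    print(sorted(build_index(docs)))
```

Because only the summary's terms enter the index, the index shrinks along with indexing time, at the cost of missing queries that match only the discarded sentences.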

Cross-Language Plagiarism Detection Based on CLAD Method

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b7404.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 4903-4909

Keyword(s):

Machine Translation ◽

Detection Method ◽

Translation Method ◽

Plagiarism Detection ◽

Working Process ◽

Detection Process ◽

Document Indexing ◽

Cross Language ◽

Multiple Languages ◽

Analog Detector

This paper describes CLAD (Cross-Language Analog Detector), a cross-language plagiarism detection method that compares a test document against indexed documents. The main difference between this method and existing ones is that it detects plagiarism across multiple languages, not only two. To translate terms, it uses a dictionary-based machine translation method. CLAD's working process consists of two phases: document indexing and detection. The paper describes both phases.
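A two-phase design of this kind (index first, then detect) might look like the sketch below. It is not CLAD itself: the abstract gives no scoring details, so the example assumes word-by-word dictionary translation into an English pivot, an inverted index over pivot terms, and a simple term-overlap threshold. The toy dictionaries and all names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy dictionaries mapping source-language words to an English pivot.
DICTIONARIES = {
    "de": {"katze": "cat", "trinkt": "drinks", "milch": "milk"},
    "fr": {"chat": "cat", "boit": "drinks", "lait": "milk"},
}


def to_pivot(text, lang):
    """Translate word by word into the pivot language; unknown words pass through."""
    words = text.lower().split()
    if lang == "en":
        return words
    table = DICTIONARIES[lang]
    return [table.get(word, word) for word in words]


def build_index(docs):
    """Indexing phase: map each pivot term to the ids of documents containing it.

    docs is a dict of doc_id -> (language code, text).
    """
    index = defaultdict(set)
    for doc_id, (lang, text) in docs.items():
        for term in set(to_pivot(text, lang)):
            index[term].add(doc_id)
    return index


def detect(index, text, lang, threshold=0.5):
    """Detection phase: return ids of documents sharing enough pivot terms with the test text."""
    terms = set(to_pivot(text, lang))
    if not terms:
        return []
    hits = defaultdict(int)
    for term in terms:
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    return sorted(d for d, n in hits.items() if n / len(terms) >= threshold)


if __name__ == "__main__":
    corpus = {
        "a": ("de", "katze trinkt milch"),
        "b": ("en", "the weather is nice"),
    }
    print(detect(build_index(corpus), "chat boit lait", "fr"))
```

Translating every document into one pivot language is what lets a single index serve more than two languages: adding a language means adding a dictionary, not a new pairwise comparison.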

Arabic Document Indexing for Improved Text Retrieval

2019 2nd International Conference on new Trends in Computing Sciences (ICTCS) ◽

10.1109/ictcs.2019.8923096 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yaser A. M. Al-Lahham

Keyword(s):

Text Retrieval ◽

Document Indexing
