multimedia information retrieval
Recently Published Documents


Total documents: 205 (five years: 13)
H-index: 15 (five years: 1)

2020, Vol 17 (4), pp. 507-514
Author(s): Sidra Sajid, Ali Javed, Aun Irtaza

Speech and music segregation from a single channel is a challenging task due to background interference and the intermingled signals of the voice and music channels. It is of immense importance due to its utility in a wide range of applications such as music information retrieval, singer identification, and lyrics recognition and alignment. This paper presents an effective method for speech and music segregation. Considering the repeating nature of music, we first detect the local repeating structures in the signal using a locally defined window for each segment. After detecting the repeating structures, we extract them and perform separation using a soft time-frequency mask. We then apply an ideal binary mask to enhance speech and music intelligibility. We evaluated the proposed method on mixtures set at -5 dB, 0 dB, and 5 dB from the Multimedia Information Retrieval-1000 clips (MIR-1K) dataset. Experimental results demonstrate that the proposed method for speech and music segregation outperforms existing state-of-the-art methods in terms of Global Normalized Signal-to-Distortion Ratio (GNSDR) values.
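The masking step described in the abstract above can be illustrated with a minimal sketch. It assumes the magnitude spectrogram of the mixture and an estimate of the repeating (music) part are already available as nested lists; the function names, the 0.5 threshold, and the thresholding itself are illustrative assumptions, not the authors' implementation. A strictly "ideal" binary mask is computed from reference sources, which this sketch does not use.

```python
def soft_mask(mixture_mag, repeating_mag, eps=1e-12):
    """Per-bin soft mask in [0, 1]: the share of the mixture magnitude
    attributed to the repeating (music) estimate."""
    mask = []
    for mix_row, rep_row in zip(mixture_mag, repeating_mag):
        row = []
        for v, w in zip(mix_row, rep_row):
            w = min(w, v)  # the repeating part cannot exceed the mixture
            row.append(w / (v + eps))
        mask.append(row)
    return mask


def binarize(mask, threshold=0.5):
    """Turn a soft mask into a 0/1 mask (threshold is an assumed value)."""
    return [[1.0 if m > threshold else 0.0 for m in row] for row in mask]


def apply_mask(mixture_mag, mask):
    """Split the mixture into a music estimate (mask) and a voice estimate (1 - mask)."""
    music = [[v * m for v, m in zip(mix_row, mask_row)]
             for mix_row, mask_row in zip(mixture_mag, mask)]
    voice = [[v * (1.0 - m) for v, m in zip(mix_row, mask_row)]
             for mix_row, mask_row in zip(mixture_mag, mask)]
    return music, voice
```

In a full pipeline the masked magnitudes would be combined with the mixture phase and inverted back to the time domain; that part is omitted here.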


2020, Vol 47 (1), pp. 45-55
Author(s): Andrew MacFarlane, Sondess Missaoui, Sylwia Frankowska-Takhari

Recent technological developments have increased the use of machine learning to solve many problems, including many in information retrieval. Multimedia information retrieval as a problem represents a significant challenge to machine learning as a technological solution, but some problems can still be addressed by using appropriate AI techniques. We review the technological developments and provide a perspective on the use of machine learning in conjunction with knowledge organization to address multimedia IR needs. The semantic gap in multimedia IR remains a significant problem in the field, and solutions to it are many years off. However, new technological developments allow the use of knowledge organization and machine learning in multimedia search systems and services. Specifically, we argue that improved detection of some classes of low-level features in images, music, and video can be used in conjunction with knowledge organization to tag or label multimedia content for better retrieval performance. We provide an overview of the use of knowledge organization schemes in machine learning and make recommendations to information professionals on the use of this technology with knowledge organization techniques to solve multimedia IR problems. We introduce a five-step process model that extracts features from multimedia objects (Step 1) using both knowledge organization (Step 1a) and machine learning (Step 1b), merges them (Step 2), and creates an index of those multimedia objects (Step 3). We also outline the further steps of creating an application that uses the multimedia objects (Step 4) and maintaining and updating the database of features on those objects (Step 5).
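The five-step process model in the abstract above might be sketched as follows. Everything concrete here is a hypothetical stand-in: the object fields (`id`, `subject_terms`, `detected`), the confidence cutoff, and the inverted index are assumptions chosen for illustration, and the "machine learning" step simply reads precomputed detector labels rather than running a model.

```python
from collections import defaultdict


def ko_features(obj):
    """Step 1a: terms assigned from a knowledge organization scheme (assumed field)."""
    return {term.lower() for term in obj.get("subject_terms", [])}


def ml_features(obj, min_confidence=0.5):
    """Step 1b: labels from a detector, kept if confident enough (assumed field and cutoff)."""
    return {label.lower() for label, conf in obj.get("detected", {}).items()
            if conf >= min_confidence}


def merge_features(obj):
    """Step 2: merge both feature sources into one set of index terms."""
    return ko_features(obj) | ml_features(obj)


def build_index(objects):
    """Step 3: inverted index mapping each term to the ids of matching objects."""
    index = defaultdict(set)
    for obj in objects:
        for term in merge_features(obj):
            index[term].add(obj["id"])
    return index


def search(index, query):
    """Step 4: a toy application; returns objects matching all single-word query terms."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


def update_index(index, obj):
    """Step 5: maintenance; drop the object's old postings, then re-index it."""
    for ids in index.values():
        ids.discard(obj["id"])
    for term in merge_features(obj):
        index[term].add(obj["id"])
```

The sketch treats every term as a single word; multi-word subject headings would need a tokenization policy that the abstract does not specify.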


Author(s): Maha Mahmood, Wijdan Jaber AL-kubaisy, Belal Al-Khateeb

Multimedia Information Retrieval (MIR) is an important field due to the great amount of information going through the Internet. Multimedia data can be considered as raw data or as the features that compose it. Raw multimedia data consists of data structures with diverse characteristics such as image, audio, video, and text. The major challenge of MIR is the semantic gap, which is the difference between the human perception of a concept and how it can be represented using a machine-level language. The aim of this paper is to use different algorithms through two stages, one for training and the other for testing. The first algorithm depends on the nature of the query language to retrieve text documents using two models: the Vector Space Model (VSM) and Latent Semantic Indexing (LSI). The second algorithm is based on features extracted using curvelet decomposition and statistical parameters such as the mean, standard deviation, and energy of signals. The third algorithm is based on the discrete wavelet transform (DWT) and features of signals to retrieve audio signals. A neural network is then applied to describe the information retrieval model, which retrieves information from the multimedia data. The neural network model, based on multilayer perceptron and spreading activation network types, accepts the structure of a conceptually and linguistically oriented model.
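As a rough illustration of the DWT-based feature extraction mentioned in the abstract above, the sketch below computes the mean, standard deviation, and energy of each wavelet subband. The Haar wavelet, the number of levels, the edge padding, and all function names are assumptions made for the example; the abstract does not say which wavelet or decomposition depth the authors use.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).
    Odd-length input is padded by repeating the last sample (an assumed policy)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def band_stats(coeffs):
    """Mean, standard deviation, and energy of one non-empty subband."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    variance = sum((c - mean) ** 2 for c in coeffs) / n
    energy = sum(c * c for c in coeffs)
    return [mean, math.sqrt(variance), energy]


def feature_vector(signal, levels=2):
    """Concatenate stats of each detail band plus the final approximation band.
    Expects a non-empty signal."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.extend(band_stats(detail))
    features.extend(band_stats(approx))
    return features
```

A vector like this could serve as the input to a classifier such as a multilayer perceptron; with `levels=2` it has nine entries (three statistics for each of two detail bands and one approximation band).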

