Efficient Multi-modal Hashing with Online Query Adaption for Multimedia Retrieval

Lei Zhu; Chaoqun Zheng; Xu Lu; Zhiyong Cheng; Liqiang Nie; Huaxiang Zhang

doi:10.1145/3477180

ACM Transactions on Information Systems ◽

10.1145/3477180 ◽

2022 ◽

Vol 40 (2) ◽

pp. 1-36

Author(s):

Lei Zhu ◽

Chaoqun Zheng ◽

Xu Lu ◽

Zhiyong Cheng ◽

Liqiang Nie ◽

...

Keyword(s):

Matrix Factorization ◽

Multimedia Retrieval ◽

Similarity Matrix ◽

Fusion Strategy ◽

Multimedia Contents ◽

Weighted Fusion ◽

Specific Objective ◽

Online Streaming ◽

And Storage ◽

Hash Codes

Multi-modal hashing supports efficient multimedia retrieval. However, existing methods still suffer from two problems: (1) Fixed multi-modal fusion. They combine the multi-modal features with fixed weights for hash learning, and so cannot adapt to variations in online streaming multimedia content. (2) Binary optimization challenge. To generate binary hash codes, existing methods adopt either two-step relaxed optimization, which causes significant quantization errors, or direct discrete optimization, which incurs considerable computation and storage costs. To address these problems, we first propose a Supervised Multi-modal Hashing with Online Query-adaption method. A self-weighted fusion strategy is designed to adaptively preserve the multi-modal features in hash codes by exploiting their complementarity. In addition, the hash codes are learned efficiently under the supervision of pair-wise semantic labels, which enhances their discriminative capability while avoiding the challenging symmetric similarity matrix factorization. Further, we propose an efficient Unsupervised Multi-modal Hashing with Online Query-adaption method with an adaptive multi-modal quantization strategy. The hash codes are learned directly, without relying on specific objective formulations. Finally, in both methods, we design a parameter-free online hashing module to adaptively capture query variations at the online retrieval stage. Experiments validate the superiority of our proposed methods.
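As a rough illustration of weighted multi-modal fusion followed by binarization and Hamming ranking, the sketch below projects each modality, sums the projections with per-modality weights, and takes the sign to obtain bits. It is not the method of the paper: the function names (`fuse_and_hash`, `hamming`), the projection matrices, the toy features, and the weights are all hypothetical, and the weights are supplied by hand rather than learned or adapted per query.

```python
def fuse_and_hash(features, projections, weights):
    """Project each modality, sum with weights, and threshold at zero to get bits."""
    n_bits = len(projections[0])
    fused = [0.0] * n_bits
    for x, W, w in zip(features, projections, weights):
        for b in range(n_bits):
            fused[b] += w * sum(W[b][d] * x[d] for d in range(len(x)))
    return [1 if v >= 0 else 0 for v in fused]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(p != q for p, q in zip(a, b))


# Hypothetical setup: two modalities (say image and text), 2-D features, 3-bit codes.
P_img = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
P_txt = [[1.0, 1.0], [-1.0, 1.0], [0.0, 1.0]]

database = {
    "a": ([0.9, 0.1], [0.8, 0.2]),
    "b": ([-0.7, 0.3], [-0.5, 0.1]),
}
weights = [0.5, 0.5]  # fixed by hand here; an adaptive scheme would set these per query

codes = {k: fuse_and_hash(v, [P_img, P_txt], weights) for k, v in database.items()}
query = fuse_and_hash(([1.0, 0.0], [0.7, 0.3]), [P_img, P_txt], weights)

# Rank database items by Hamming distance to the query code.
ranking = sorted(codes, key=lambda k: hamming(codes[k], query))
```

Changing `weights` changes which modality dominates each bit, which is the degree of freedom that a query-adaptive scheme would tune at retrieval time.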

Label Consistent Flexible Matrix Factorization Hashing for Efficient Cross-modal Retrieval

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3446774 ◽

2021 ◽

Vol 17 (3) ◽

pp. 1-18

Author(s):

Donglin Zhang ◽

Xiao-Jun Wu ◽

Jun Yu

Keyword(s):

Matrix Factorization ◽

Large Scale ◽

Semantic Representation ◽

Heterogeneous Data ◽

Binary Codes ◽

Similarity Matrix ◽

Pairwise Similarity ◽

Multimodal Data ◽

Cross Media ◽

Hash Codes

Hashing methods have sparked a great revolution in large-scale cross-media search due to their effectiveness and efficiency. Most existing approaches learn a unified hash representation in a common Hamming space to represent all multimodal data. However, unified hash codes may not characterize cross-modal data discriminatively, because the data may vary greatly in dimensionality, physical properties, and statistical information. In addition, most existing supervised cross-modal algorithms preserve the similarity relationship by constructing an n × n pairwise similarity matrix, which requires a large amount of calculation and loses the category information. To mitigate these issues, a novel cross-media hashing approach is proposed in this article, dubbed label flexible matrix factorization hashing (LFMH). Specifically, LFMH jointly learns modality-specific latent subspaces with similar semantics through flexible matrix factorization. In addition, LFMH guides hash learning by using the semantic labels directly instead of the large n × n pairwise similarity matrix. LFMH transforms the heterogeneous data into modality-specific latent semantic representations. We can therefore obtain the hash codes by quantizing these representations, and the learned hash codes are consistent with the supervised labels of the multimodal data. We can then obtain similar binary codes for the corresponding modality, and the binary codes can characterize such samples flexibly. Accordingly, the derived hash codes have more discriminative power for single-modal and cross-modal retrieval tasks. Extensive experiments on eight different databases demonstrate that our model outperforms some competitive approaches.
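The trade-off between an n × n pairwise similarity matrix and direct use of labels can be made concrete with a small sketch. It assumes multi-hot labels and the common convention that two samples are similar if they share at least one category; the helper names and toy labels are hypothetical and are not taken from the LFMH paper.

```python
def pairwise_similarity(labels):
    """Build the dense n x n matrix: 1 if two samples share a category, else 0."""
    n = len(labels)
    return [
        [1 if any(a and b for a, b in zip(labels[i], labels[j])) else 0
         for j in range(n)]
        for i in range(n)
    ]


def similarity_on_demand(labels, i, j):
    """Recover one entry from the n x c label matrix without storing n x n values."""
    return 1 if any(a and b for a, b in zip(labels[i], labels[j])) else 0


# Hypothetical data: 4 samples, 3 categories (multi-hot labels).
labels = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
S = pairwise_similarity(labels)

n, c = len(labels), len(labels[0])
dense_entries = n * n  # grows quadratically with the number of samples
label_entries = n * c  # grows linearly with the number of samples
```

In this toy case `S[1][0]` and `S[1][3]` are both 1 even though sample 1 matches samples 0 and 3 through different categories, which is one way of seeing how a binary pairwise matrix discards category information that the labels retain.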

A Novel Blind Wavelet Base Watermarking of ECG Signals on Medical Images Using EZW Algorithm

Encyclopedia of Healthcare Information Systems ◽

10.4018/978-1-59904-889-5.ch125 ◽

2008 ◽

pp. 1004-1015

Author(s):

Mohammad Saleh Nambakhsh ◽

M. Shiva

Keyword(s):

Medical Images ◽

Wavelet Coefficients ◽

ECG Signals ◽

Multimedia Contents ◽

Cost Of Health Care ◽

EZW Algorithm ◽

Diagnostics System ◽

The Cost ◽

Efficient Memory ◽

And Storage

Exchanging databases between hospitals requires efficient and reliable transmission and storage techniques to cut down the cost of health care. This exchange involves a large amount of vital patient information such as biosignals and medical images. Interleaving one form of data, such as a 1-D signal, over digital images can combine the advantages of data security with efficient memory utilization (Norris, Englehart & Lovely, 2001), but nothing prevents the user from manipulating or copying the decrypted data for illegal uses. Embedding vital patient information inside scan images will help physicians make a better diagnosis of a disease. To solve these issues, watermark algorithms have been proposed as a way to complement the encryption processes and provide tools to track the retransmission and manipulation of multimedia contents (Barni, Podilchuk, Bartolini & Delp, 2001; Vallabha, 2003). A watermarking system is based on the imperceptible insertion of a watermark (a signal) in an image. This technique is adapted here for interleaving graphical ECG signals within medical images, both to reduce storage and transmission overheads and to support computer-aided diagnostic systems. In this chapter, we present a new wavelet-based watermarking method combined with the EZW coder. The principle is to replace significant wavelet coefficients of ECG signals by the corresponding significant wavelet coefficients belonging to the host image, which is much bigger in size than the mark signal. This chapter presents a brief introduction to watermarking and the EZW coder that acts as a platform for our watermarking algorithm.

Unsupervised Hashing with Gradient Attention

Symmetry ◽

10.3390/sym12071193 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1193

Author(s):

Shaochen Jiang ◽

Liejun Wang ◽

Shuli Cheng ◽

Anyu Du ◽

Yongming Li

Keyword(s):

Gradient Descent ◽

Network Models ◽

Image Features ◽

Similarity Matrix ◽

Hash Code ◽

Unsupervised Training ◽

Public Datasets ◽

Cosine Distance ◽

Hash Codes ◽

Trained Network

Existing learning-based unsupervised hashing methods usually use a pre-trained network to extract features, and then use the extracted feature vectors to construct a similarity matrix that guides the generation of hash codes through gradient descent. Existing research shows that algorithms based on gradient descent cause the hash codes of paired images to be updated toward each other's position during training. For unsupervised training, this causes large fluctuations in the hash codes during training and limits the efficiency of hash code learning. In this paper, we propose a method named Deep Unsupervised Hashing with Gradient Attention (UHGA) to solve this problem. UHGA consists of the following steps: (1) use pre-trained network models to extract image features; (2) calculate the cosine distance between the corresponding features of each pair of images, and construct a similarity matrix from the cosine distances to guide the generation of hash codes; (3) add a gradient attention mechanism during hash code training to pay attention to the gradient. Experiments on two existing public datasets show that our proposed method can obtain more discriminative hash codes.
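Step (2) above can be sketched as follows. This is a minimal illustration that assumes the features have already been extracted and are non-zero vectors; the function name `cosine_similarity_matrix` and the toy features are hypothetical. Cosine distance is one minus the similarity computed here, and how UHGA turns these values into training targets is not specified in the abstract.

```python
import math


def cosine_similarity_matrix(features):
    """Return the n x n matrix of cosine similarities between feature vectors."""
    norms = [math.sqrt(sum(v * v for v in f)) for f in features]
    n = len(features)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            S[i][j] = dot / (norms[i] * norms[j])  # assumes non-zero norms
    return S


# Hypothetical pre-extracted features for three images.
features = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
S = cosine_similarity_matrix(features)

# Cosine distance is the complement of cosine similarity.
D = [[1.0 - s for s in row] for row in S]
```

Here the first two vectors point in the same direction, so their similarity is 1 and their distance is 0, while the third is orthogonal to both.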

MDIPA: a microRNA–drug interaction prediction approach based on non-negative matrix factorization

Bioinformatics ◽

10.1093/bioinformatics/btaa577 ◽

2020 ◽

Vol 36 (20) ◽

pp. 5061-5067

Author(s):

Ali Akbar Jamali ◽

Anthony Kusalik ◽

Fang-Xiang Wu

Keyword(s):

Drug Interaction ◽

Drug Interactions ◽

Matrix Factorization ◽

Structural Information ◽

Superior Performance ◽

Supplementary Information ◽

Similarity Matrix ◽

Interaction Prediction ◽

Prediction Approach ◽

Drug Similarity

Motivation: Evidence has shown that microRNAs, one type of small biomolecule, regulate the expression level of genes and play an important role in the development or treatment of diseases. Drugs, as important chemical compounds, can interact with microRNAs and change their functions. The experimental identification of microRNA–drug interactions is time-consuming and expensive. Therefore, it is appealing to develop effective computational approaches for predicting microRNA–drug interactions. Results: In this study, a matrix factorization-based method, called the microRNA–drug interaction prediction approach (MDIPA), is proposed for predicting unknown interactions among microRNAs and drugs. Specifically, MDIPA utilizes experimentally validated interactions between drugs and microRNAs, drug similarity and microRNA similarity to predict undiscovered interactions. A path-based microRNA similarity matrix is constructed, while the structural information of drugs is used to establish a drug similarity matrix. To evaluate its performance, our MDIPA is compared with four state-of-the-art prediction methods with an independent dataset and cross-validation. The results of both evaluation methods confirm the superior performance of MDIPA over other methods. Finally, the results of molecular docking in a case study with breast cancer confirm the efficacy of our approach. In conclusion, MDIPA can be effective in predicting potential microRNA–drug interactions. Availability and implementation: All code and data are freely available from https://github.com/AliJam82/MDIPA. Supplementary information: Supplementary data are available at Bioinformatics online.
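For readers unfamiliar with non-negative matrix factorization, the sketch below shows plain NMF with the standard multiplicative update rules applied to a small binary interaction matrix. It is not MDIPA: the abstract indicates that MDIPA also uses drug and microRNA similarity matrices, which this sketch omits. The helper names, the toy matrix `Y`, the rank, and the iteration count are assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(Y, W, H):
    """Squared Frobenius norm of Y - W H."""
    R = matmul(W, H)
    return sum((y - r) ** 2 for yr, rr in zip(Y, R) for y, r in zip(yr, rr))


def nmf(Y, rank, iters=200, seed=0, eps=1e-9):
    """Factor Y (n x m, non-negative) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T Y) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, Y), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (Y H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(Y, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


# Hypothetical interaction matrix: rows are microRNAs, columns are drugs.
Y = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
W0, H0 = nmf(Y, rank=2, iters=0)  # random initialization only
W, H = nmf(Y, rank=2)             # after the multiplicative updates
scores = matmul(W, H)             # reconstructed scores for every pair
```

In a prediction setting, entries of `scores` at positions where `Y` is 0 would be read as scores for unobserved pairs; how such scores are thresholded or evaluated is outside the scope of this sketch.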

Summarizing video using non-negative similarity matrix factorization

2002 IEEE Workshop on Multimedia Signal Processing. ◽

10.1109/mmsp.2002.1203239 ◽

2004 ◽

Cited By ~ 30

Author(s):

M. Cooper ◽

J. Foote

Keyword(s):

Matrix Factorization ◽

Similarity Matrix

Incremental Matrix Factorization: A Linear Feature Transformation Perspective

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/264 ◽

2017 ◽

Cited By ~ 7

Author(s):

Xunpeng Huang ◽

Le Wu ◽

Enhong Chen ◽

Hengshu Zhu ◽

Qi Liu ◽

...

Keyword(s):

Matrix Factorization ◽

Low Rank ◽

Feature Transformation ◽

Low Rank Approximation ◽

Linear Feature ◽

Special Cases ◽

Rank Approximation ◽

Training Error ◽

Real World Datasets ◽

And Storage

Matrix Factorization (MF) is among the most widely used techniques for collaborative filtering based recommendation. Along this line, a critical demand is to incrementally refine MF models when new ratings arrive in an online scenario. However, most existing incremental MF algorithms are limited by specific MF models or strict use restrictions. In this paper, we propose a general incremental MF framework by designing a linear transformation of user and item latent vectors over time. This framework shows relatively high accuracy with a computation- and space-efficient training process in an online scenario. We also explain the framework from a low-rank approximation perspective, and give an upper bound on the training error when this framework is used for incremental learning in some special cases. Finally, extensive experimental results on two real-world datasets clearly validate the effectiveness, efficiency and storage performance of the proposed framework.

MCCMF: collaborative matrix factorization based on matrix completion for predicting miRNA-disease associations

BMC Bioinformatics ◽

10.1186/s12859-020-03799-6 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Tian-Ru Wu ◽

Meng-Meng Yin ◽

Cui-Na Jiao ◽

Ying-Lian Gao ◽

Xiang-Zhen Kong ◽

...

Keyword(s):

Matrix Factorization ◽

Cross Validation ◽

Matrix Completion ◽

Factorization Method ◽

Similarity Matrix ◽

Validation Experiment ◽

Disease Associations ◽

AUC Value ◽

Regulatory Functions ◽

Association Matrix

Background: MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Traditional methods for exploring the relationship between miRNAs and diseases are time-consuming, and their accuracy needs to be improved. To address the shortcomings of previous models, a method called collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict unknown miRNA-disease associations. Results: The complete miRNA-disease matrix is obtained by matrix completion. Moreover, the Gaussian Interaction Profile kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix. Then the Weight K Nearest Known Neighbors method is used to preprocess the association matrix, so that the model is closer to reality. Finally, the collaborative matrix factorization method is applied to obtain the prediction results. MCCMF obtains a satisfactory result in fivefold cross-validation, with an AUC of 0.9569 (0.0005). Conclusions: The AUC value of MCCMF is higher than that of other advanced methods in the fivefold cross-validation experiment. To evaluate the performance of MCCMF comprehensively, accuracy, precision, recall and f-measure are also reported. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. Finally, the effectiveness and practicability of MCCMF are further verified by studying three specific diseases.
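The Gaussian Interaction Profile kernel mentioned above is commonly defined as K(i, j) = exp(-γ ‖IP(i) − IP(j)‖²), where IP(i) is row i of the association matrix and γ is a bandwidth γ' divided by the mean squared norm of the profiles. The sketch below follows that common definition; whether MCCMF uses exactly this normalization and γ' = 1 is an assumption, and the function name and toy matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of an association matrix."""
    n = len(profiles)
    # Normalize the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm  # assumes at least one non-zero profile
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            K[i][j] = math.exp(-gamma * d2)
    return K


# Hypothetical association matrix: rows are miRNAs, columns are diseases.
A = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(A)
```

Rows 0 and 1 have identical profiles, so their kernel value is 1; row 2 shares no associations with them and receives a smaller value. Applying the same function to the transposed matrix would give the kernel on the other entity type.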

A Knowledge-Driven Multimedia Retrieval System Based on Semantics and Deep Features

Future Internet ◽

10.3390/fi12110183 ◽

2020 ◽

Vol 12 (11) ◽

pp. 183

Author(s):

Antonio Maria Rinaldi ◽

Cristiano Russo ◽

Cristian Tommasino

Keyword(s):

Retrieval System ◽

Research Field ◽

Multimedia Retrieval ◽

User Needs ◽

Semantic Retrieval ◽

Multimedia Document ◽

Multimedia Contents ◽

Retrieval Systems ◽

Web Contents ◽

Information User

In recent years, users' information needs have changed due to the heterogeneity of web content, which increasingly involves multimedia content. Although modern search engines provide visual queries, it is not easy to find systems that allow searching within a particular domain of interest and that perform such a search by combining text and visual queries. Different approaches have been proposed over the years, and in the semantic research field many authors have proposed techniques based on ontologies. On the other hand, in the context of image retrieval systems, techniques based on deep learning have obtained excellent results. In this paper we present novel approaches for image semantic retrieval and a possible combination for multimedia document analysis. Several results are presented to show the performance of our approach compared with literature baselines.

TDCMR: Triplet-Based Deep Cross-Modal Retrieval for Geo-Multimedia Data

Applied Sciences ◽

10.3390/app112210803 ◽

2021 ◽

Vol 11 (22) ◽

pp. 10803

Author(s):

Jiagang Song ◽

Yunwu Lin ◽

Jiayu Song ◽

Weiren Yu ◽

Leyuan Zhang

Keyword(s):

High Performance ◽

Large Scale ◽

Location Based Services ◽

Multimedia Data ◽

Multimedia Retrieval ◽

Geographical Information ◽

Superior Performance ◽

Hybrid Index ◽

High Level ◽

Hash Codes

Massive amounts of multimedia data with geographical information (geo-multimedia) are collected and stored on the Internet due to the wide application of location-based services (LBS). Finding the high-level semantic relationships among geo-multimedia data and constructing an efficient index are crucial for large-scale geo-multimedia retrieval. To address this challenge, the paper proposes a deep cross-modal hashing framework for geo-multimedia retrieval, termed Triplet-based Deep Cross-Modal Retrieval (TDCMR), which utilizes a deep neural network and an enhanced triplet constraint to capture high-level semantics. In addition, a novel hybrid index, called TH-Quadtree, is developed by combining cross-modal binary hash codes and a quadtree to support high-performance search. Extensive experiments are conducted on three commonly used benchmarks, and the results show the superior performance of the proposed method.

MLS3RDUH: Deep Unsupervised Hashing via Manifold based Local Semantic Similarity Structure Reconstructing

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/479 ◽

2020 ◽

Author(s):

Rong-Cheng Tu ◽

Xian-Ling Mao ◽

Wei Wei

Keyword(s):

Semantic Similarity ◽

State Of The Art ◽

Feature Space ◽

Similarity Matrix ◽

Retrieval Performance ◽

Nearest Neighbours ◽

Similarity Structure ◽

Similarity Preserving ◽

Public Datasets ◽

Hash Codes

Most unsupervised hashing methods map images into semantic similarity-preserving hash codes by constructing a local semantic similarity structure as guiding information, i.e., treating each point as similar to its k nearest neighbours. However, for an image, some of its k nearest neighbours may be dissimilar to it, i.e., they are noisy datapoints that will damage the retrieval performance. To tackle this problem, in this paper we propose a novel deep unsupervised hashing method, called MLS3RDUH, which can reduce the noisy datapoints to further enhance retrieval performance. Specifically, the proposed method first defines a novel similarity matrix by utilising the intrinsic manifold structure in feature space and the cosine similarity of datapoints to reconstruct the local semantic similarity structure. Then a novel log-cosh hashing loss function is used to optimize the hashing network to generate compact hash codes by incorporating the defined similarity as guiding information. Extensive experiments on three public datasets show that the proposed method outperforms the state-of-the-art baselines.
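The abstract does not give the exact form of the log-cosh hashing loss, so the sketch below only illustrates the general idea under stated assumptions: codes take values in {-1, +1}, the similarity of two codes is their inner product divided by the code length, and the loss is the mean log-cosh of the gap between that value and a target similarity in [-1, 1]. The function names and toy data are hypothetical. The helper uses the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2) to avoid overflow for large |x|.

```python
import math


def log_cosh(x):
    """Numerically stable log(cosh(x))."""
    a = abs(x)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_cosh_hash_loss(codes, S):
    """Mean log-cosh gap between scaled code inner products and target similarities."""
    n, k = len(codes), len(codes[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            inner = sum(a * b for a, b in zip(codes[i], codes[j])) / k
            total += log_cosh(inner - S[i][j])
    return total / (n * n)


# Hypothetical 2-bit codes and a target similarity matrix they match exactly.
codes = [[1, -1], [1, -1], [-1, 1]]
S = [
    [1, 1, -1],
    [1, 1, -1],
    [-1, -1, 1],
]
loss_matched = log_cosh_hash_loss(codes, S)

# Flipping one code makes it disagree with its targets and raises the loss.
loss_mismatched = log_cosh_hash_loss([[1, -1], [-1, 1], [-1, 1]], S)
```

Log-cosh behaves like a squared error near zero and like an absolute error for large gaps, which is one common reason for choosing it over a plain squared loss.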
