Cross-Modality Transfer Learning for Image-Text Information Management

2022 ◽  
Vol 13 (1) ◽  
pp. 1-14
Author(s):  
Shuteng Niu ◽  
Yushan Jiang ◽  
Bowen Chen ◽  
Jian Wang ◽  
Yongxin Liu ◽  
...  

Over the past decades, the volume of information carried by all kinds of data has increased rapidly. With state-of-the-art performance, machine learning algorithms have been beneficial for information management. However, a shortage of supervised training data remains an obstacle in many real-world applications. Transfer learning (TL) was proposed to address this issue. This article studies an important but under-investigated TL problem termed cross-modality transfer learning (CMTL), a topic closely related to distant domain transfer learning (DDTL) and negative transfer. Conventional TL methods generally assume that the source domain and the target domain are in the same modality. DDTL aims to make efficient transfers even when the domains or the tasks are entirely different. As an extension of DDTL, CMTL aims to make efficient transfers between two different data modalities, such as from image to text. As the main focus of this study, we aim to improve the performance of image classification by transferring knowledge from text data. A few CMTL algorithms have previously been proposed for image classification problems; however, most are highly task-specific and converge unstably. This study makes four main contributions. First, we propose a novel heterogeneous CMTL algorithm that requires only a tiny set of unlabeled target data and labeled source data with associated text tags. Second, we introduce a latent semantic information extraction method to connect the information learned from the image data and the text data. Third, the proposed method can effectively handle information transfer across different modalities (text-image). Fourth, we evaluate our algorithm on a public dataset, Office-31, where it achieves up to 5% higher classification accuracy than “non-transfer” algorithms and up to 9% higher than existing CMTL algorithms.
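The second contribution, connecting what is learned from images and text through a shared latent space, can be illustrated with a minimal sketch. The projection heads, feature dimensions, and cosine alignment loss below are illustrative assumptions, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

# Hypothetical projection heads mapping each modality into a shared latent space.
class LatentProjector(nn.Module):
    def __init__(self, in_dim, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, latent_dim))
    def forward(self, x):
        return nn.functional.normalize(self.net(x), dim=-1)

img_proj = LatentProjector(in_dim=2048)   # e.g. CNN image features (assumed size)
txt_proj = LatentProjector(in_dim=768)    # e.g. text-tag embeddings (assumed size)

# Toy batch: source images paired with their associated text tags.
img_feats = torch.randn(32, 2048)
txt_feats = torch.randn(32, 768)

z_img, z_txt = img_proj(img_feats), txt_proj(txt_feats)
# Pull paired image/text representations together in the latent space.
align_loss = (1 - (z_img * z_txt).sum(dim=-1)).mean()
align_loss.backward()
```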

2020 ◽  
Vol 10 (21) ◽  
pp. 7831
Author(s):  
Han Kyul Kim ◽  
Sae Won Choi ◽  
Ye Seul Bae ◽  
Jiin Choi ◽  
Hyein Kwon ◽  
...  

With growing interest in machine learning, text standardization is becoming an increasingly important aspect of data pre-processing within biomedical communities. Since the performance of machine learning algorithms is affected by both the amount and the quality of their training data, effective data standardization is needed to guarantee consistent data integrity. Furthermore, biomedical organizations, depending on their geographical locations or affiliations, rely on different text standardization practices. To facilitate easier machine learning-related collaborations between these organizations, an effective yet practical text data standardization method is needed. In this paper, we introduce MARIE (a context-aware term mapping method with string matching and embedding vectors), an unsupervised learning-based tool that finds standardized clinical terminologies for queries such as a hospital’s own codes. By incorporating both string matching methods and term embedding vectors generated by BioBERT (bidirectional encoder representations from transformers for biomedical text mining), it utilizes both structural and contextual information to calculate similarity measures between source and target terms. Compared to previous term mapping methods, MARIE shows improved mapping accuracy. Furthermore, it can easily be expanded to incorporate any string matching or term embedding method. Because it requires no additional model training, MARIE is not only effective but also practical for text data standardization and pre-processing.
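The core scoring idea, blending a structural (string) similarity with a contextual (embedding) similarity, can be sketched as follows. The blending weight, the SequenceMatcher choice, and the random placeholder vectors standing in for BioBERT term embeddings are all assumptions for illustration:

```python
from difflib import SequenceMatcher
import numpy as np

def combined_similarity(query, candidate, query_vec, cand_vec, alpha=0.5):
    """Blend structural (string) and contextual (embedding) similarity.

    alpha weights the two scores; the vectors stand in for BioBERT
    term embeddings, assumed to be precomputed elsewhere.
    """
    string_sim = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
    cos_sim = float(np.dot(query_vec, cand_vec) /
                    (np.linalg.norm(query_vec) * np.linalg.norm(cand_vec)))
    return alpha * string_sim + (1 - alpha) * cos_sim

# Rank standard terminologies for a hospital-specific query term.
query = "myocard infarct acute"
candidates = ["Acute myocardial infarction", "Old myocardial infarction"]
vecs = {t: np.random.rand(768) for t in [query] + candidates}  # placeholder embeddings
ranked = sorted(candidates,
                key=lambda c: combined_similarity(query, c, vecs[query], vecs[c]),
                reverse=True)
print(ranked[0])  # best-matching standardized term
```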


Entropy ◽  
2019 ◽  
Vol 21 (5) ◽  
pp. 456 ◽  
Author(s):  
Hao Cheng ◽  
Dongze Lian ◽  
Shenghua Gao ◽  
Yanlin Geng

Inspired by the pioneering work on the information bottleneck (IB) principle for the analysis of Deep Neural Networks (DNNs), we thoroughly study the relationship among the model accuracy, I(X;T), and I(T;Y), where I(X;T) and I(T;Y) are the mutual information of the DNN's output T with the input X and the label Y, respectively. We then design an information plane-based framework to evaluate the capability of DNNs (including CNNs) for image classification. Instead of each hidden layer's output, our framework focuses on the model output T. We successfully apply our framework to many application scenarios arising in deep learning and image classification, such as image classification with unbalanced data distributions, model selection, and transfer learning. The experimental results verify the effectiveness of the information plane-based framework: it may facilitate quick model selection and determine the number of samples needed for each class in the unbalanced classification problem. Furthermore, the framework explains the efficiency of transfer learning in deep learning.
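In practice, the two information-plane coordinates are commonly estimated by discretizing the model output. A rough sketch under that binning assumption follows; the bin count, the 1-D output score, and the deterministic-network simplification I(X;T) ≈ H(T) are assumptions here, not details taken from the paper:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def binned_information(T, Y, n_bins=30):
    """Binning estimator for the information-plane coordinates.

    For a deterministic network, I(X;T) reduces to H(T) after
    discretizing T; I(T;Y) is estimated directly from the joint
    distribution of binned outputs and labels.
    """
    t_binned = np.digitize(T, np.linspace(T.min(), T.max(), n_bins))
    # I(X;T) ~ H(T) for a deterministic map from X to T, in bits.
    _, counts = np.unique(t_binned, return_counts=True)
    p = counts / counts.sum()
    i_xt = -(p * np.log2(p)).sum()
    # I(T;Y) from binned outputs vs. labels (nats -> bits).
    i_ty = mutual_info_score(t_binned, Y) / np.log(2)
    return i_xt, i_ty

T = np.random.randn(1000)            # stand-in 1-D model output scores
Y = np.random.randint(0, 2, 1000)    # stand-in class labels
print(binned_information(T, Y))
```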


Author(s):  
Lambodar Jena ◽  
Ramakrushna Swain ◽  
N.K. Kamila

Image mining is more than just an extension of data mining to the image domain. Web image mining is a technique commonly used to extract knowledge directly from images on the WWW. Since the main targets of conventional Web mining are numerical and textual data, Web mining for image data is in demand. There are huge amounts of image data as well as text data on the Web. However, mining image data from the Web receives less attention than mining text data, since treating the semantics of images is much more difficult. This paper proposes a novel image recognition and image classification technique using a large number of images automatically gathered from the Web as learning images. For classification, the system uses image-feature-based search as exploited in content-based image retrieval (CBIR), which, unlike conventional image recognition methods, does not restrict the target images, together with a support vector machine (SVM), one of the most efficient and widely used statistical methods for generic image classification that fits the learning tasks. The experiments show that the proposed system outperforms some existing search systems.
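A minimal sketch of the pipeline described here, pairing a simple CBIR-style feature (a per-channel color histogram, assumed for illustration) with an SVM classifier; real Web-gathered learning images would replace the random toy data:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def color_histogram(image, bins=8):
    """Concatenated per-channel histogram, a simple CBIR-style feature."""
    return np.concatenate([np.histogram(image[..., c], bins=bins,
                                        range=(0, 255))[0] for c in range(3)])

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(200, 32, 32, 3))  # toy "Web-gathered" images
labels = rng.integers(0, 2, size=200)                 # toy class labels

X = np.array([color_histogram(img) for img in images])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)

clf = SVC(kernel="rbf").fit(X_tr, y_tr)   # SVM for generic image classification
print(f"accuracy: {clf.score(X_te, y_te):.2f}")
```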


Author(s):  
Pratiksha Bongale

Today’s world is mostly data-driven. To deal with the enormous amount of data, Machine Learning and Data Mining strategies are put to use. Traditional ML approaches assume that a model is tested on data drawn from the same domain as its training data. Nevertheless, some real-world situations require machines to produce good results with very little domain-specific training data. This creates room for machines that can predict accurately after being trained on easily found data. Transfer Learning is the key to this. It is the practice of applying the knowledge gained while learning one task to another task that is related to it in some way. This article focuses on building a model capable of separating text data into two classes, spam and non-spam, using BERT’s pre-trained model (bert-base-uncased). This pre-trained model has been trained on Wikipedia and Book Corpus data, and the goal of this paper is to highlight the pre-trained model’s ability to transfer the knowledge learned during its training (Wiki and Book Corpus) to classifying spam texts from the rest.
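A minimal fine-tuning sketch using the named bert-base-uncased checkpoint via the Hugging Face transformers library; the two toy messages, their labels, and the single optimization step are illustrative assumptions only:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the pre-trained bert-base-uncased weights with a fresh binary
# classification head; only the head is randomly initialized.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

texts = ["WIN a FREE prize now!!!", "Are we still meeting at 3pm?"]
labels = torch.tensor([1, 0])  # 1 = spam, 0 = not spam (toy examples)

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)

# One fine-tuning step: the loss back-propagates through both the new
# head and the pre-trained encoder, transferring its language knowledge.
outputs.loss.backward()
torch.optim.AdamW(model.parameters(), lr=2e-5).step()
print(outputs.logits.argmax(dim=-1))
```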


2021 ◽  
Author(s):  
Shufeng Kong ◽  
Dan Guevarra ◽  
Carla P. Gomes ◽  
John Gregoire

The adoption of machine learning in materials science has rapidly transformed materials property prediction. Hurdles limiting full capitalization on recent advancements in machine learning include the limited development of methods to learn the underlying interactions of multiple elements, as well as the relationships among multiple properties, to facilitate property prediction in new composition spaces. To address these issues, we introduce the Hierarchical Correlation Learning for Multi-property Prediction (H-CLMP) framework, which seamlessly integrates (i) prediction using only a material's composition, (ii) learning and exploitation of correlations among target properties in multi-target regression, and (iii) leveraging of training data from tangential domains via generative transfer learning. The model is demonstrated for prediction of the spectral optical absorption of complex metal oxides spanning 69 three-cation metal oxide composition spaces. H-CLMP accurately predicts non-linear composition-property relationships in composition spaces for which no training data is available, which broadens the purview of machine learning to the discovery of materials with exceptional properties. This achievement results from the principled integration of latent embedding learning, property correlation learning, generative transfer learning, and attention models. The best performance is obtained using H-CLMP with transfer learning (H-CLMP(T)), wherein a generative adversarial network is trained on computational density of states data and deployed in the target domain to augment prediction of optical absorption from composition. H-CLMP(T) aggregates multiple knowledge sources within a framework that is well suited for multi-target regression across the physical sciences.
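The first two ingredients, composition-only input and joint learning across correlated target properties, can be caricatured with a shared encoder feeding a single multi-target head. This toy network is an assumption for illustration, not the published H-CLMP architecture (which adds hierarchical correlation learning, attention, and the GAN-based transfer component):

```python
import torch
import torch.nn as nn

# Composition in, multiple correlated properties out: a shared latent
# embedding feeds every property output, so the outputs can exploit
# cross-property structure during joint training.
class MultiPropertyNet(nn.Module):
    def __init__(self, n_elements=3, n_properties=10, latent=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_elements, 128), nn.ReLU(),
                                     nn.Linear(128, latent), nn.ReLU())
        self.heads = nn.Linear(latent, n_properties)  # joint multi-target head
    def forward(self, composition):
        return self.heads(self.encoder(composition))

model = MultiPropertyNet()
comp = torch.rand(16, 3)                      # toy 3-cation composition fractions
comp = comp / comp.sum(dim=1, keepdim=True)   # normalize to sum to 1
target = torch.rand(16, 10)                   # toy absorption values per sample
loss = nn.functional.mse_loss(model(comp), target)
loss.backward()
```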


2021 ◽  
Author(s):  
Justin Larocque-Villiers ◽  
Patrick Dumond

Through the intelligent classification of bearing faults, predictive maintenance opens the possibility of optimizing service schedules, inventory, maintenance, and safety. However, real-world rotating machinery undergoes a variety of operating conditions, fault conditions, and noise. Due to these factors, a fault detection algorithm is often required to perform accurately even on data outside its trained domain. Although open-source datasets offer an incredible opportunity to advance the performance of predictive maintenance technology and methods, more research is required to develop algorithms capable of generalized intelligent fault detection across domains and discrepancies. In this study, current benchmarks on source–target domain discrepancy challenges are reviewed using the Case Western Reserve University (CWRU) and the Paderborn University (PbU) datasets. A convolutional neural network (CNN) architecture and a data augmentation technique more suitable for generalization tasks are proposed and tested against existing benchmarks on the PbU dataset by training on artificial faults and testing on real faults. The proposed method improves fault classification by 13.35%, with less than half the standard deviation of the compared benchmark. Transfer learning is then used to leverage the larger PbU dataset to make predictions on the CWRU dataset under a challenging source–target domain discrepancy in which there is minimal training data to adequately represent unseen bearing faults. The transfer learning-based CNN is found to be capable of generalizing across the two open-source datasets, improving accuracy from 53.1% to 68.3%.
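The cross-dataset transfer step can be sketched as pretraining a CNN on the larger source dataset, then swapping the classification head and fine-tuning on the target. The tiny 1-D CNN, layer sizes, class counts, and frozen-feature choice below are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

# A small 1-D CNN for vibration signal windows (a rough stand-in).
class FaultCNN(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.classifier = nn.Linear(32, n_classes)
    def forward(self, x):
        return self.classifier(self.features(x))

# Pretend pretraining on the larger PbU dataset has already happened;
# reuse the feature extractor, swap the head for the CWRU label set,
# and fine-tune on the small target data.
model = FaultCNN(n_classes=3)            # "PbU-pretrained" stand-in
model.classifier = nn.Linear(32, 10)     # new head for CWRU classes (assumed count)
for p in model.features.parameters():
    p.requires_grad = False              # freeze the transferred features

x = torch.randn(8, 1, 2048)              # toy vibration windows
loss = nn.functional.cross_entropy(model(x), torch.randint(0, 10, (8,)))
loss.backward()                          # only the new head receives gradients
```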


2019 ◽  
Vol 16 (2) ◽  
pp. 172988141984086 ◽  
Author(s):  
Chuanqi Tan ◽  
Fuchun Sun ◽  
Bin Fang ◽  
Tao Kong ◽  
Wenchang Zhang

The brain–computer interface-based rehabilitation robot has quickly become a very important research area due to its natural interaction. One of the most important problems in brain–computer interfaces is that the large-scale annotated electroencephalography datasets required by advanced classifiers are almost impossible to acquire, because biological data acquisition is challenging and quality annotation is costly. Transfer learning relaxes the hypothesis that the training data must be independent and identically distributed with the test data, and can be considered a powerful tool for solving the problem of insufficient training data. There are two basic issues in transfer learning: under-transfer and negative transfer. We propose a novel brain–computer interface framework based on autoencoder transfer learning, which includes three main components: an autoencoder framework, a joint adversarial network, and a regularized manifold constraint. The autoencoder framework automatically encodes and reconstructs data from the source and target domains and forces the neural network to learn to represent these domains reliably. The joint adversarial network forces the network to encode the source and target domains appropriately at the same time, thereby overcoming under-transfer. The regularized manifold constraint avoids negative transfer by preventing the geometric manifold structure of the target domain from being destroyed by the source domain. Experiments show that our proposed brain–computer interface framework achieves better results than state-of-the-art approaches on electroencephalography signal classification tasks. This helps our rehabilitation robot understand the intentions of patients and can help patients carry out rehabilitation exercises effectively.
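The interplay of the first two components, a shared autoencoder plus an adversarial domain critic, can be sketched as below. The layer sizes, the 310-dimensional EEG feature assumption, and the single combined loss (in practice the discriminator and encoder are updated in alternating GAN-style steps) are all simplifying assumptions, not the published network:

```python
import torch
import torch.nn as nn

# Shared autoencoder reconstructs both domains; a domain discriminator
# pushes source and target EEG codes into a common latent space.
enc = nn.Sequential(nn.Linear(310, 128), nn.ReLU(), nn.Linear(128, 64))
dec = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 310))
disc = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))

src = torch.randn(32, 310)   # toy source-domain EEG features
tgt = torch.randn(32, 310)   # toy target-domain EEG features

z_src, z_tgt = enc(src), enc(tgt)
recon = (nn.functional.mse_loss(dec(z_src), src) +
         nn.functional.mse_loss(dec(z_tgt), tgt))

# Adversarial term: the encoder tries to make the domains indistinguishable.
logits = disc(torch.cat([z_src, z_tgt])).squeeze(-1)
domain = torch.cat([torch.ones(32), torch.zeros(32)])
adv = nn.functional.binary_cross_entropy_with_logits(logits, domain)

# Encoder step only: minimize reconstruction, maximize discriminator loss.
# The discriminator would be updated separately with +adv.
(recon - adv).backward()
```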


2020 ◽  
Vol 10 (13) ◽  
pp. 4523 ◽  
Author(s):  
Laith Alzubaidi ◽  
Mohammed A. Fadhel ◽  
Omran Al-Shamma ◽  
Jinglan Zhang ◽  
J. Santamaría ◽  
...  

One of the main challenges of employing deep learning models in the field of medicine is a lack of training data, because collecting and labeling data is difficult and must be performed by experts. To overcome this drawback, transfer learning (TL) has been utilized to solve several medical imaging tasks using pre-trained state-of-the-art models from the ImageNet dataset. However, there are substantial divergences in data features, sizes, and task characteristics between natural image classification and the targeted medical imaging tasks. Consequently, TL yields only a slight performance improvement when the source domain is completely different from the target domain. In this paper, we explore the benefit of TL from the same domain as, and from a different domain than, the target tasks. To do so, we designed a deep convolutional neural network (DCNN) model that integrates traditional and parallel convolutional layers, residual connections, and global average pooling. We trained the proposed model under several scenarios, utilizing same-domain and different-domain TL with the diabetic foot ulcer (DFU) classification task and with an animal classification task. We show empirically that TL from the same domain can significantly improve performance, even with a reduced number of images in the same domain as the target dataset. On the DFU dataset, the proposed model achieved an F1-score of 86.6% when trained from scratch, 89.4% with TL from a different domain than the target dataset, and 97.6% with TL from the same domain.
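The same-domain TL recipe amounts to a two-stage schedule: pretrain on a related same-domain image set, then replace the head and fine-tune on the small target task. The toy CNN below stands in for the paper's DCNN (which adds parallel convolutions and residual connections); the class counts and data shapes are assumptions:

```python
import torch
import torch.nn as nn

def make_cnn(n_classes):
    """Toy stand-in for the paper's DCNN."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_classes))

# Stage 1: pretrain on a larger SAME-domain image set (e.g. other skin
# images) instead of ImageNet natural images.
model = make_cnn(n_classes=5)
same_domain_x = torch.randn(16, 3, 64, 64)    # toy same-domain batch
nn.functional.cross_entropy(model(same_domain_x),
                            torch.randint(0, 5, (16,))).backward()

# Stage 2: replace the head and fine-tune on the small DFU target set,
# which is where same-domain pretraining pays off.
model[-1] = nn.Linear(16, 2)                  # DFU: normal vs. abnormal
dfu_x = torch.randn(8, 3, 64, 64)             # toy DFU batch
nn.functional.cross_entropy(model(dfu_x), torch.randint(0, 2, (8,))).backward()
```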


Author(s):  
Fouzia Altaf ◽  
Syed M. S. Islam ◽  
Naeem Khalid Janjua

Deep learning has provided numerous breakthroughs in natural imaging tasks. However, its successful application to medical images is severely handicapped by the limited amount of annotated training data. Transfer learning is commonly adopted for medical imaging tasks; however, a large covariate shift between the source domain of natural images and the target domain of medical images results in poor transfer learning, and the scarcity of annotated data for medical imaging tasks causes further problems. To address these problems, we develop an augmented ensemble transfer learning technique that leads to significant performance gains over conventional transfer learning. Our technique uses an ensemble of deep learning models in which the architecture of each network is modified with extra layers to account for the dimensionality change between the images of the source and target data domains. Moreover, each model is hierarchically tuned to the target domain with augmented training data. Along with the network ensemble, we also utilize an ensemble of dictionaries based on features extracted from the augmented models; the dictionary ensemble provides an additional performance boost. We first establish the effectiveness of our technique on the challenging ChestX-ray14 radiography dataset, where our experimental results show more than a 50% reduction in error rate compared to the baseline transfer learning technique. We then apply our technique to a recent COVID-19 dataset for binary and multi-class classification tasks, achieving 99.49% accuracy for binary classification and 99.24% for multi-class classification.
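The network-ensemble part of the method can be sketched as independently tuned models voting by averaged softmax probabilities; the dictionary ensemble built on their features is omitted here. The tiny networks, ensemble size, and input shape are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Stand-ins for independently fine-tuned networks; in the paper each
# would be a deep model with extra layers tuned to the target domain.
def tiny_net():
    return nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(8, 2))

ensemble = [tiny_net() for _ in range(3)]
x = torch.randn(4, 1, 64, 64)               # toy chest X-ray batch

# Ensemble prediction: average the per-model class probabilities.
with torch.no_grad():
    probs = torch.stack([m(x).softmax(dim=-1) for m in ensemble]).mean(dim=0)
print(probs.argmax(dim=-1))
```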


BMC Genomics ◽  
2020 ◽  
Vol 21 (S6) ◽  
Author(s):  
Sunitha Basodi ◽  
Pelin Icer Baykal ◽  
Alex Zelikovsky ◽  
Pavel Skums ◽  
Yi Pan

Background: Analysis of heterogeneous populations such as viral quasispecies is one of the most challenging bioinformatics problems. Although machine learning models are becoming widely employed for the analysis of sequence data from such populations, their straightforward application is impeded by multiple challenges: technological limitations and biases, the difficulty of selecting relevant features, and the need to compare genomic datasets of different sizes and structures.

Results: We propose a novel preprocessing approach that transforms irregular genomic data into normalized image data. This representation allows the problems of classification and comparison of heterogeneous populations to be restated as image classification problems, which can be solved with a variety of available machine learning tools. We then apply the proposed approach to two important problems in molecular epidemiology: inference of viral infection stage and detection of viral transmission clusters using next-generation sequencing data. The infection staging method was applied to HCV HVR1 samples collected from 108 recently and 257 chronically infected individuals. The SVM-based image classification approach achieved more than 95% accuracy for both recently and chronically HCV-infected individuals. Clustering was performed on data collected from 33 epidemiologically curated outbreaks, yielding more than 97% accuracy.

Conclusions: The sequence image normalization method allows for a robust conversion of genomic data into numerical data and overcomes several issues associated with applying machine learning methods to viral populations. The image data also help in the visualization of genomic data. Experimental results demonstrate that the proposed method can be successfully applied to different problems in molecular epidemiology and the surveillance of viral diseases. Simple binary classifiers and clustering techniques applied to the image data are equally or more accurate than other models.
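The core idea, rendering a population sample as a normalized numeric "image" and feeding it to a simple classifier, can be sketched as follows. The position-by-nucleotide frequency matrix, the fixed sequence length, and the random toy samples are illustrative assumptions, not the paper's exact normalization:

```python
import numpy as np
from sklearn.svm import SVC

BASES = "ACGT"

def sequences_to_image(seqs, length=64):
    """Collapse one population sample into a normalized 'image':
    a position-by-nucleotide frequency matrix with values in [0, 1]."""
    img = np.zeros((length, len(BASES)))
    for s in seqs:
        for i, ch in enumerate(s[:length]):
            if ch in BASES:
                img[i, BASES.index(ch)] += 1
    return img / max(len(seqs), 1)

rng = np.random.default_rng(1)
def random_sample():
    return ["".join(rng.choice(list(BASES), 64)) for _ in range(20)]

# Each training example is one population sample rendered as an image.
X = np.array([sequences_to_image(random_sample()).ravel() for _ in range(40)])
y = rng.integers(0, 2, 40)            # e.g. recent vs. chronic infection (toy)
clf = SVC().fit(X[:30], y[:30])       # simple SVM-based image classifier
print(f"accuracy: {clf.score(X[30:], y[30:]):.2f}")
```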

