triplet loss
Recently Published Documents


TOTAL DOCUMENTS: 197 (FIVE YEARS 168)

H-INDEX: 12 (FIVE YEARS 7)

2022 ◽  
Vol 40 (4) ◽  
pp. 1-27
Author(s):  
Zhongwei Xie ◽  
Ling Liu ◽  
Yanzhao Wu ◽  
Luo Zhong ◽  
Lin Li

This article introduces a two-phase deep feature engineering framework for efficient learning of semantics-enhanced joint embedding (SEJE), which clearly separates the deep feature engineering in data preprocessing from the training of the text-image joint embedding model. We use the Recipe1M dataset for the technical description and empirical validation. In preprocessing, we perform deep feature engineering by combining deep feature engineering with semantic context features derived from raw text-image input data. We leverage LSTM to identify key terms; deep NLP models from the BERT family, TextRank, or TF-IDF to produce ranking scores for the key terms; and Word2vec to generate the vector representation of each key term. We leverage Wide ResNet50 and Word2vec to extract and encode the image category semantics of food images, which helps the semantic alignment of the learned recipe and image embeddings in the joint latent space. In joint embedding learning, we perform deep feature engineering by optimizing the batch-hard triplet loss function with soft-margin and double negative sampling, while also taking into account the category-based alignment loss and the discriminator-based alignment loss. Extensive experiments demonstrate that our SEJE approach with deep feature engineering significantly outperforms the state-of-the-art approaches.
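The batch-hard triplet loss with soft margin mentioned in this abstract can be sketched roughly as follows. This is a minimal single-modality illustration, not the authors' SEJE implementation: it omits the cross-modal (recipe/image) pairing, double negative sampling, and the alignment losses, and the function names are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def softplus(x):
    """Numerically stable log(1 + exp(x)); this is the soft-margin term."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def batch_hard_soft_margin_triplet_loss(embeddings, labels):
    """For each anchor, use its farthest positive and closest negative
    in the batch, then average softplus(d_pos - d_neg) over anchors."""
    losses = []
    for i, anchor in enumerate(embeddings):
        pos = [euclidean(anchor, e) for j, e in enumerate(embeddings)
               if j != i and labels[j] == labels[i]]
        neg = [euclidean(anchor, e) for j, e in enumerate(embeddings)
               if labels[j] != labels[i]]
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        losses.append(softplus(max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0

# Two well-separated classes give a loss close to zero.
batch = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
print(batch_hard_soft_margin_triplet_loss(batch, [0, 0, 1, 1]))
```

The soft margin replaces the hinge `max(0, d_pos - d_neg + margin)` with a smooth function, so triplets that are already well separated still contribute a small gradient.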


Author(s):  
Xiaoyu He ◽  
Yong Wang ◽  
Shuang Zhao ◽  
Chunli Yao

Currently, convolutional neural networks (CNNs) have made remarkable achievements in skin lesion classification because of their end-to-end feature representation abilities. However, precise skin lesion classification is still challenging because of the following three issues: (1) insufficient training samples, (2) inter-class similarities and intra-class variations, and (3) lack of the ability to focus on discriminative skin lesion parts. To address these issues, we propose a deep metric attention learning CNN (DeMAL-CNN) for skin lesion classification. In DeMAL-CNN, a triplet-based network (TPN) is first designed based on deep metric learning, which consists of three weight-shared embedding extraction networks. TPN adopts a triplet of samples as input and uses the triplet loss to optimize the embeddings, which can not only increase the number of training samples, but also learn embeddings that are robust to inter-class similarities and intra-class variations. In addition, a mixed attention mechanism considering both spatial-wise and channel-wise attention information is designed and integrated into the construction of each embedding extraction network, which can further strengthen the skin lesion localization ability of DeMAL-CNN. After extracting the embeddings, three weight-shared classification layers are used to generate the final predictions. In the training procedure, we combine the triplet loss with the classification loss as a hybrid loss to train DeMAL-CNN. We compare DeMAL-CNN with the baseline method, attention methods, advanced challenge methods, and state-of-the-art skin lesion classification methods on the ISIC 2016 and ISIC 2017 datasets, and test its generalization ability on the PH2 dataset. The results demonstrate its effectiveness.
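The hybrid loss described in this abstract (triplet loss plus classification loss) might look like the sketch below. It is an illustration under assumptions: the abstract does not give the weighting between the two terms, the margin, or how the three classification outputs are combined, so `weight`, `margin`, and the single-logit-vector simplification are all hypothetical choices.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: zero once the negative is farther from the
    anchor than the positive by at least `margin`."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)

def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, via a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def hybrid_loss(anchor, positive, negative, logits, target,
                weight=1.0, margin=1.0):
    """Triplet term on the embeddings plus a weighted classification term.
    `weight` is an assumed hyperparameter, not taken from the paper."""
    return (triplet_loss(anchor, positive, negative, margin)
            + weight * cross_entropy(logits, target))

# Embeddings for one triplet and classifier logits for the anchor sample.
print(hybrid_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0], [2.0, 0.5], 0))
```

In this form the triplet term shapes the embedding space while the cross-entropy term keeps the embeddings useful for the final class prediction.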


2021 ◽  
Vol 33 (1) ◽  
Author(s):  
Mariana-Iuliana Georgescu ◽  
Georgian-Emilian Duţǎ ◽  
Radu Tudor Ionescu

2021 ◽  
pp. 108473
Author(s):  
Fadi Boutros ◽  
Naser Damer ◽  
Florian Kirchbuchner ◽  
Arjan Kuijper

2021 ◽  
Author(s):  
Lisle Faray de Paiva ◽  
Alan Carlos de Moura Lima ◽  
Geraldo Braz Júnior ◽  
Anselmo Cardoso de Paiva ◽  
Aristófanes Correa Silva

According to the World Health Organization, 8 million deaths from diseases of the gastrointestinal tract are recorded annually. Automatic detection of anatomical landmarks is a task that can assist health professionals, reducing the cost and time of exploratory examinations. Computer-aided detection and diagnosis systems have been widely explored in the scientific community. However, considerable processing power is required to achieve satisfactory results. To work around this problem, this work uses a simple Convolutional Neural Network together with the Triplet Loss cost function to extract features from images of 3 gastrointestinal anatomical landmarks (z-line, pylorus, and cecum) in order to classify these images. The Kvasir-v2 dataset is used for training, obtaining 96.60% Precision, 97.71% Accuracy, 96.91% Specificity, 98.61% Sensitivity, and an F1-score of 97.59%.


Author(s):  
Lutao Liu ◽  
Xinyu Li

Recently, due to the wide application of low probability of intercept (LPI) radar, many approaches to recognizing LPI radar signal modulations have been proposed. However, in an increasingly complex electromagnetic environment, most existing methods perform poorly at identifying different modulation types at low signal-to-noise ratio (SNR). This paper proposes an automatic recognition method for different LPI radar signal modulations. Firstly, time-domain signals are converted to time-frequency images (TFIs) by the smooth pseudo-Wigner–Ville distribution. Then, these TFIs are fed into a designed triplet convolutional neural network (TCNN) to obtain high-dimensional feature vectors. In essence, TCNN is a CNN whose parameters are optimized with the triplet loss during training. The triplet loss ensures that the distance between samples in different classes is greater than that between samples with the same label, improving the discriminability of TCNN. Finally, a fully connected neural network is employed as the classifier to recognize different modulation types. Simulation shows that the overall recognition success rate can reach 94% at −10 dB, which shows that the proposed method has a strong discriminating capability for the recognition of different LPI radar signal modulations, even under low SNR.


Agriculture ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 1126
Author(s):  
Xia Hao ◽  
Man Zhang ◽  
Tianru Zhou ◽  
Xuchao Guo ◽  
Federico Tomasetto ◽  
...  

The identification of light stress is crucial for light control in plant factories. Image-based lighting classification of leafy vegetables has exhibited remarkable performance with high convenience and economy. Convolutional Neural Networks (CNNs) have been widely used for crop image analysis because of their architecture, high accuracy, and efficiency. Large intra-class differences and small inter-class differences are important factors affecting crop identification and a critical challenge for fine-grained classification tasks based on CNNs. To address this problem, we took lettuce (Lactuca sativa L.), which is widely grown in plant factories, as the research object and constructed a leaf image set containing four stress levels. A light stress grading model, named Tr-CNN, is then constructed by combining a classic pre-trained CNN with the Triplet loss function. The model uses the Triplet loss function to constrain the distance between images in the feature space, which reduces the Euclidean distance between samples from the same class and increases the Euclidean distance between samples from different classes. Multiple sets of experimental results indicate that the proposed model (Tr-CNN) has obvious advantages on the light stress grading dataset and on a generalized dataset.
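How a triplet loss pulls same-class samples together and pushes different-class samples apart can be seen from a single gradient step. The sketch below is illustrative only and makes several assumptions that are not from the paper: it uses squared Euclidean distance (for a simple gradient), treats the embeddings themselves as the trainable parameters instead of a CNN, and the margin and learning rate are arbitrary.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_sgd_step(anchor, positive, negative, margin=1.0, lr=0.1):
    """One gradient-descent step on
    L = max(0, |a - p|^2 - |a - n|^2 + margin),
    applied directly to the three embedding vectors."""
    loss = sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin
    if loss <= 0.0:
        return anchor, positive, negative  # margin satisfied, no gradient
    grad_a = [2 * (n - p) for p, n in zip(positive, negative)]
    grad_p = [2 * (p - a) for a, p in zip(anchor, positive)]
    grad_n = [2 * (a - n) for a, n in zip(anchor, negative)]

    def step(x, g):
        return [xi - lr * gi for xi, gi in zip(x, g)]

    return step(anchor, grad_a), step(positive, grad_p), step(negative, grad_n)

a, p, n = [0.0, 0.0], [1.0, 0.0], [0.0, 1.2]
a2, p2, n2 = triplet_sgd_step(a, p, n)
print(sq_dist(a, p), "->", sq_dist(a2, p2))  # same-class distance shrinks
print(sq_dist(a, n), "->", sq_dist(a2, n2))  # different-class distance grows
```

In a real model the same gradients flow back through the network weights, so the effect is on the learned feature extractor and not on the individual vectors.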


2021 ◽  
Vol 2089 (1) ◽  
pp. 012078
Author(s):  
Syed Mansoora ◽  
Giribabu Sadineni ◽  
Shaik Heena Kauser

Attendance checking is a critical component of classroom management. Checking attendance by calling names or by passing around a sign-in sheet is time-consuming, particularly in open meetings, and a sign-in sheet makes it easier to commit fraud. This article describes in detail the implementation of a real-time attendance check based on a facial recognition system, together with its outcomes. For the system to identify a student's face, it must first take a photograph of the student and save it in a database as a reference for future use. During a class, a webcam captures images of the students, the system automatically detects faces and selects the student names that are most likely to match, and finally an Excel file is updated to reflect attendance according to the facial recognition results. To detect faces in the webcam footage, the system uses a pre-trained Haar Cascade model. A FaceNet model, trained to minimise the triplet loss, generates a 128-dimensional encoding of each facial picture. Whether two facial pictures have similar encodings indicates whether they come from the same student or from different students. The system has been used as part of a class, and the outcomes have been extremely positive. A survey was also conducted to learn more about the advantages and disadvantages of using the attendance system in a college.
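The matching step described in this abstract, comparing a face encoding from the webcam against stored reference encodings, could be sketched as below. This is a hypothetical illustration: the function names, the distance threshold of 0.7, and the short toy vectors (standing in for FaceNet encodings) are assumptions, not details from the paper.

```python
import math

def encoding_distance(enc_a, enc_b):
    """Euclidean distance between two face encodings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(enc_a, enc_b)))

def best_match(query, reference_db, threshold=0.7):
    """Return the name whose reference encoding is closest to `query`,
    or None if even the closest one is farther than `threshold`.
    The threshold value is an assumed tuning parameter."""
    best_name, best_dist = None, float("inf")
    for name, reference in reference_db.items():
        dist = encoding_distance(query, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

# Toy 2-dimensional encodings; a real system would store one encoding
# per enrolled student, produced by the face embedding model.
reference_db = {"alice": [0.0, 0.0], "bob": [1.0, 1.0]}
print(best_match([0.1, 0.0], reference_db))  # closest to alice
print(best_match([5.0, 5.0], reference_db))  # no reference within threshold
```

A model trained with triplet loss is what makes this kind of threshold test meaningful, because encodings of the same person end up closer together than encodings of different people.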


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7260
Author(s):  
Yuanqing Xian ◽  
Guangjun Liu ◽  
Jinfu Fan ◽  
Yang Yu ◽  
Zhongjie Wang

Copper elbows are an important product in industry. They are used to connect pipes for transferring gas, oil, and liquids. Defective copper elbows can lead to serious industrial accidents. In this paper, a novel model named YOT-Net (YOLOv3 combined triplet loss network) is proposed to automatically detect defective copper elbows. To increase the defect detection accuracy, a triplet loss function is introduced into the loss module of YOT-Net, where it uses image similarity to enhance the feature extraction ability. The proposed YOT-Net method shows outstanding performance in copper elbow surface defect detection.


2021 ◽  
Author(s):  
Ēvalds Urtāns

This work describes the importance of loss functions and related methods for deep reinforcement learning and deep metric learning. A novel MDQN loss function outperformed the DDQN loss function in PLE computer game environments, and a novel Exponential Triplet loss function outperformed the Triplet loss function in the face re-identification task on the VGGFace2 dataset, reaching 85.7% accuracy in a zero-shot setting. This work also presents a novel UNet-RNN-Skip model to improve the performance of the value function for path planning tasks.

