multimodal data
Recently Published Documents

Total documents: 681 (392 in the last five years)
H-index: 22 (11 in the last five years)

Author(s): Jinwoo Kim, Ehsanul Haque Nirjhar, Jaeyoon Kim, Theodora Chaspari, Youngjib Ham, ...

2022
Author(s): Britta Velten, Jana M. Braunger, Ricard Argelaguet, Damien Arnol, Jakob Wirbel, ...

Abstract: Factor analysis is a widely used method for dimensionality reduction in genome biology, with applications from personalized health to single-cell biology. Existing factor analysis models assume independence of the observed samples, an assumption that fails in spatio-temporal profiling studies. Here we present MEFISTO, a flexible and versatile toolbox for modeling high-dimensional data when spatial or temporal dependencies between the samples are known. MEFISTO maintains the established benefits of factor analysis for multimodal data, but enables spatio-temporally informed dimensionality reduction, interpolation, and separation of smooth from non-smooth patterns of variation. Moreover, MEFISTO can integrate multiple related datasets by simultaneously identifying and aligning the underlying patterns of variation in a data-driven manner. To illustrate MEFISTO, we apply the model to different datasets with spatial or temporal resolution, including an evolutionary atlas of organ development, a longitudinal microbiome study, a single-cell multi-omics atlas of mouse gastrulation, and spatially resolved transcriptomics.
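MEFISTO itself is distributed as part of the MOFA framework; as a rough illustration of the underlying idea only (a samples-by-features matrix decomposed into latent factors and loadings), the minimal Python sketch below uses scikit-learn's standard FactorAnalysis on simulated data. It is not MEFISTO, it ignores the spatial/temporal covariates that MEFISTO adds, and all dimensions and variable names are hypothetical.

```python
# Minimal sketch: plain factor analysis on a multi-omics-style matrix.
# NOT MEFISTO: the spatio-temporal covariates it models are omitted here,
# and the sizes below are made up for illustration.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_samples, n_features, n_factors = 200, 500, 10

# Simulate data with low-rank structure plus noise.
Z_true = rng.normal(size=(n_samples, n_factors))      # latent factors
W_true = rng.normal(size=(n_factors, n_features))     # loadings
Y = Z_true @ W_true + 0.5 * rng.normal(size=(n_samples, n_features))

# Fit a standard factor analysis model and recover per-sample factor values.
fa = FactorAnalysis(n_components=n_factors, random_state=0)
Z_hat = fa.fit_transform(Y)   # (n_samples, n_factors) latent representation
W_hat = fa.components_        # (n_factors, n_features) loadings

print(Z_hat.shape, W_hat.shape)
```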


2022
Author(s): Bastian Pfeifer, Afan Secic, Anna Saranti, Andreas Holzinger

Abstract: The tremendous success of graph neural networks (GNNs) has already had a major impact on systems biology research. For example, GNNs are currently used for drug-target recognition in protein-drug interaction networks as well as for cancer gene discovery, among other applications. Important aspects whose practical relevance is often underestimated are comprehensibility, interpretability, and explainability. In this work, we present a graph-based deep learning framework for disease subnetwork detection via explainable GNNs. In our framework, each patient is represented by the topology of a protein-protein interaction (PPI) network, and the nodes are enriched with molecular multimodal data, such as gene expression and DNA methylation. On this basis, our novel modification of GNNExplainer for model-wide explanations can detect potential disease subnetworks, which is of high practical relevance. The proposed methods are implemented in the GNN-SubNet Python program, which we have made freely available to the international research community on GitHub (https://github.com/pievos101/GNN-SubNet).
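As a hedged illustration of the kind of model such a framework builds on (not the authors' GNN-SubNet code), the sketch below defines a small graph convolutional classifier in PyTorch Geometric over a toy patient PPI graph whose nodes carry concatenated multimodal features. All class names, sizes, edges, and feature values are invented for the example.

```python
# Hedged sketch: a patient-level graph classifier over a PPI graph whose nodes
# carry multimodal features (e.g., gene expression and DNA methylation per gene).
# Not the GNN-SubNet implementation; requires torch and torch_geometric.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv, global_mean_pool

class PatientGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=64, n_classes=2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.head = torch.nn.Linear(hidden_dim, n_classes)

    def forward(self, x, edge_index, batch):
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        x = global_mean_pool(x, batch)      # one embedding per patient graph
        return self.head(x)

# Toy patient graph: 5 genes, 2 features per gene (expression, methylation).
x = torch.randn(5, 2)
edge_index = torch.tensor([[0, 1, 1, 2, 3], [1, 0, 2, 1, 4]], dtype=torch.long)
data = Data(x=x, edge_index=edge_index)

model = PatientGNN(in_dim=2)
batch = torch.zeros(data.num_nodes, dtype=torch.long)   # single-graph batch
logits = model(data.x, data.edge_index, batch)
print(logits.shape)   # (1, 2): e.g., disease vs. control logits for this patient
```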


Sensors
2022
Vol. 22 (2), pp. 568
Author(s): Bertrand Schneider, Javaria Hassan, Gahyun Sung

Abstract: While the majority of social scientists still rely on traditional research instruments (e.g., surveys, self-reports, qualitative observations), multimodal sensing is becoming an emerging methodology for capturing human behaviors. Sensing technology has the potential to complement and enrich traditional measures by providing high-frequency data on people’s behavior, cognition, and affect. However, there is currently no easy-to-use toolkit for recording multimodal data streams; existing methodologies rely on physical sensors and custom-written code for accessing sensor data. In this paper, we present the EZ-MMLA toolkit. The toolkit is implemented as a website and provides easy access to multimodal data-collection algorithms. One can collect a variety of data modalities: data on users’ attention (eye tracking), physiological states (heart rate), body posture (skeletal data), gestures (hand motion), emotions (facial expressions and speech), and lower-level computer vision algorithms (e.g., fiducial/color tracking). The toolkit runs from any browser and requires neither dedicated hardware nor programming experience. We compare this toolkit with traditional methods and describe a case study in which the EZ-MMLA toolkit was used by aspiring educational researchers in a classroom context. We conclude by discussing future work and other applications of this toolkit, potential limitations, and implications.
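EZ-MMLA itself runs in the browser; the Python sketch below only illustrates the kind of webcam-based skeletal tracking such tools expose, using MediaPipe's pose solution and OpenCV (both assumed to be installed). It is not EZ-MMLA code, and the output file name and frame limit are made up.

```python
# Hedged sketch: log body-pose landmarks from a webcam to CSV with MediaPipe.
# Illustrates the "skeletal data" modality only; not EZ-MMLA's own pipeline.
import csv
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

cap = cv2.VideoCapture(0)                      # default webcam
with mp_pose.Pose() as pose, open("pose_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["frame", "landmark", "x", "y", "z", "visibility"])
    frame_idx = 0
    while cap.isOpened() and frame_idx < 300:  # roughly 10 s at 30 fps
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = pose.process(rgb)
        if results.pose_landmarks:
            for i, lm in enumerate(results.pose_landmarks.landmark):
                writer.writerow([frame_idx, i, lm.x, lm.y, lm.z, lm.visibility])
        frame_idx += 1
cap.release()
```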


2022
Vol. 12 (1), pp. 527
Author(s): Fei Ma, Yang Li, Shiguang Ni, Shaolun Huang, Lin Zhang

Abstract: Audio–visual emotion recognition identifies human emotional states by combining the audio and visual modalities, and plays an important role in intelligent human–machine interaction. With the help of deep learning, previous works have made great progress on audio–visual emotion recognition. However, these deep learning methods often require a large amount of training data. In reality, data acquisition is difficult and expensive, especially for multimodal data spanning different modalities, so the available training data may fall into a low-data regime that cannot be used effectively for deep learning. In addition, class imbalance may occur in the emotional data, which can further degrade the performance of audio–visual emotion recognition. To address these problems, we propose an efficient data augmentation framework by designing a multimodal conditional generative adversarial network (GAN) for audio–visual emotion recognition. Specifically, we design generators and discriminators for the audio and visual modalities, and the category information is used as their shared input so that the GAN can generate synthetic data for each category. In addition, the high dependence between the audio and visual modalities in the generated multimodal data is modeled using Hirschfeld–Gebelein–Rényi (HGR) maximal correlation; in this way, the generated modalities are related to one another so that they approximate the real data. The generated data are then used to augment our data manifold, and we further apply the approach to the problem of class imbalance. To the best of our knowledge, this is the first work to propose a data augmentation strategy with a multimodal conditional GAN for audio–visual emotion recognition. We conduct a series of experiments on three public multimodal datasets, eNTERFACE’05, RAVDESS, and CMEW. The results indicate that our multimodal conditional GAN is highly effective for data augmentation in audio–visual emotion recognition.
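To make the conditioning idea concrete, the PyTorch sketch below shows a generic class-conditional generator/discriminator pair in which the emotion label is a shared input. It is only an illustration of conditional GANs in general, not the paper's model: the separate audio and visual branches and the HGR correlation term are omitted, and all layer sizes are invented.

```python
# Hedged sketch of a class-conditional GAN (generator + discriminator) where
# the emotion label conditions both networks. Not the paper's architecture.
import torch
import torch.nn as nn

N_CLASSES, LATENT_DIM, FEAT_DIM = 6, 100, 128   # hypothetical sizes

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(N_CLASSES, N_CLASSES)
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + N_CLASSES, 256), nn.ReLU(),
            nn.Linear(256, FEAT_DIM),
        )

    def forward(self, z, labels):
        # Concatenate noise with the label embedding (the shared condition).
        return self.net(torch.cat([z, self.embed(labels)], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(N_CLASSES, N_CLASSES)
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + N_CLASSES, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x, labels):
        return self.net(torch.cat([x, self.embed(labels)], dim=1))

# One forward pass: sample noise and random emotion labels, score the fakes.
G, D = Generator(), Discriminator()
z = torch.randn(32, LATENT_DIM)
labels = torch.randint(0, N_CLASSES, (32,))
fake = G(z, labels)
score = D(fake, labels)          # discriminator's real/fake logit
print(fake.shape, score.shape)
```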


2022
Author(s): Tomoyuki Hiroyasu, Kensuke Tanioka, Daigo Uraki, Satoru Hiwa, Hiroashi Furutani

Abstract: Human error is the leading cause of traffic accidents and often originates from distraction caused by various factors, such as the driver's physical condition and mental state. One significant factor causing driver distraction is stress. In a previous study, multiple stressors were applied to drivers to examine distraction while driving; the corresponding biometric data were recorded, and a multimodal dataset was subsequently published. In this study, we revisit the results of those existing studies and investigate the relationship between gaze variability while driving and stressor intervention, which has not yet been examined. We also examine whether biometric and vehicle information can be used to estimate the presence or absence of secondary tasks during driving.
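One straightforward way to test whether biometric and vehicle signals predict secondary-task presence is a windowed-feature classifier; the scikit-learn sketch below shows that pattern on synthetic data. The feature names and data are hypothetical and are not taken from the study's dataset.

```python
# Hedged sketch: predict secondary-task presence from per-window biometric and
# vehicle features with a random forest. Features and labels are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_windows = 400

# Hypothetical per-window features: mean heart rate, skin conductance,
# gaze dispersion, steering-reversal rate, speed variance.
X = rng.normal(size=(n_windows, 5))
y = rng.integers(0, 2, size=n_windows)    # 1 = secondary task present

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("AUC per fold:", np.round(scores, 2))   # ~0.5 here because data are random
```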


2021
Author(s): Arish Alreja, Michael James Ward, Qianli Ma, Mark Richardson, Brian Russ, ...

Abstract: Eye tracking and other behavioral measurements collected from patient-participants in their hospital rooms afford a unique opportunity to study immersive natural behavior for basic and clinical-translational research, but they also require addressing important logistical, technical, and ethical challenges. Hospital rooms make it possible to richly capture both clinically relevant and ordinary natural behavior. As clinical settings, they add the potential to study the relationship between behavior and physiology by collecting physiological data synchronized to behavioral measures from participants. Combining eye tracking, other behavioral measures, and physiological measurements enables clinical-translational research into the participants' disorders and clinician-patient interactions, as well as basic research into natural, real-world behavior as participants eat, read, converse with friends and family, and so on. Here we describe a paradigm in individuals undergoing surgical treatment for epilepsy, who spend 1-2 weeks in the hospital with electrodes implanted in their brain to determine the source of their seizures. This provides the unique opportunity to record behavior with eye-tracking glasses customized to address clinically related ergonomic concerns, synchronize those recordings with direct neural recordings, use computer vision to assist with video annotation, and apply multivariate machine learning analyses to multimodal data encompassing hours of natural behavior. We describe the acquisition, quality control, annotation, and analysis pipelines used to study the neural basis of real-world social and affective perception during natural conversations with friends and family in participants with epilepsy. We also discuss the clinical, logistical, ethical, and privacy considerations that must be addressed to acquire high-quality multimodal data in this setting.
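A core step in any such pipeline is putting the eye-tracking and neural streams on a common clock. The NumPy sketch below shows one simple alignment strategy (interpolating both streams onto a shared time base after a common sync event); it is a generic illustration, not the authors' synchronization code, and the sampling rates and signals are invented.

```python
# Hedged sketch: align an eye-tracking stream and a neural-recording stream by
# resampling both onto a common time base that starts at a shared sync pulse.
import numpy as np

# Hypothetical streams: gaze x-position at ~60 Hz, a neural feature at 1000 Hz,
# each timestamped on its own clock with t = 0 at the sync event.
t_gaze = np.arange(0, 10, 1 / 60)
gaze_x = np.sin(2 * np.pi * 0.3 * t_gaze)
t_neural = np.arange(0, 10, 1 / 1000)
neural = np.random.default_rng(0).normal(size=t_neural.size)

# Resample both onto a common 100 Hz time base for joint multimodal analysis.
t_common = np.arange(0, 10, 1 / 100)
gaze_on_common = np.interp(t_common, t_gaze, gaze_x)
neural_on_common = np.interp(t_common, t_neural, neural)

aligned = np.column_stack([t_common, gaze_on_common, neural_on_common])
print(aligned.shape)   # (1000, 3): time, gaze, neural feature per sample
```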


Author(s): Shweta Purawat, Subhasis Dasgupta, Jining Song, Shakti Davis, Kajal T. Claypool, ...
Keyword(s): Big Data
