Disentangling environmental effects in microbial association networks
Abstract Background: Ecological interactions among microorganisms are fundamental for ecosystem function, yet they are mostly unknown or poorly understood. High-throughput omics can indicate microbial interactions through associations across time and space, which can be represented as association networks. Links in these networks could result either from ecological interactions between microorganisms or from environmental selection, where the association is environmentally-driven. Therefore, before downstream analysis and interpretation, we need to distinguish the nature of each association, particularly whether it is due to environmental selection. Results: We present EnDED (Environmentally-Driven Edge Detection), an implementation of four approaches, as well as their combination, to predict which links between microorganisms in an association network are environmentally-driven. The four approaches are Sign Pattern, Overlap, Interaction Information, and Data Processing Inequality. We tested EnDED on networks from simulated data of 50 microorganisms. The networks contained on average 50 nodes and 1,087 edges, of which 60 were true interactions and 1,026 were false associations (i.e. environmentally-driven or due to chance). Applying each method individually, we detected a moderate to high number of environmentally-driven edges: 87% with Sign Pattern and Overlap, 67% with Interaction Information, and 44% with Data Processing Inequality. Combining these methods in an intersection approach retained more associations, both true and false (32% of environmentally-driven associations detected). The addition of noise to the simulated datasets did not qualitatively alter these results. After validation with the simulated datasets, we applied EnDED to a marine microbial network inferred from 10 years of monthly observations of microbial-plankton abundance.
The intersection combination predicted that 14.2% of the associations were environmentally-driven, while individual methods predicted 31.4% (Data Processing Inequality), 38.3% (Interaction Information), and up to 83.4% (Sign Pattern as well as Overlap). Conclusions: To reach accurate hypotheses about ecological interactions, it is important to determine, quantify, and remove environmentally-driven associations in marine microbial association networks. For that, EnDED offers up to four individual methods as well as their combination. However, especially for the intersection combination, we suggest using EnDED together with other strategies to reduce the number of false associations and, consequently, the number of potential interaction hypotheses.
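To make the idea of an information-theoretic check and an intersection combination more concrete, the following is a minimal sketch for a single triplet: two microorganisms X and Y and one environmental factor Z. It is not EnDED's implementation. All function names are hypothetical, the data are assumed to be already discretized, and the sign convention (interaction information computed as conditional mutual information minus mutual information, with negative values pointing to an environmentally-driven edge) is an assumption of this sketch. Sign Pattern and Overlap are not sketched because the abstract does not define them.

```python
import numpy as np

def entropy(*cols):
    """Shannon entropy (in nats) of the joint distribution of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def conditional_mi(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def interaction_information(x, y, z):
    """Assumed convention: I(X;Y|Z) - I(X;Y); negative if Z explains the X-Y link."""
    return conditional_mi(x, y, z) - mutual_information(x, y)

def dpi_flags_edge(x, y, z):
    """Data processing inequality check: X-Y is the weakest link of the triplet."""
    return mutual_information(x, y) <= min(mutual_information(x, z),
                                           mutual_information(y, z))

def environmentally_driven(x, y, z):
    """Intersection combination (hypothetical): flag only if both checks agree."""
    return interaction_information(x, y, z) < 0 and dpi_flags_edge(x, y, z)

# Toy data: both microorganisms simply follow the environmental factor.
env = np.array([0, 0, 1, 1, 0, 0, 1, 1])
driven = environmentally_driven(env.copy(), env.copy(), env)

# Toy data: X and Y co-vary, but independently of the environmental factor.
x = np.array([0, 1, 0, 1, 0, 1, 0, 1])
direct = environmentally_driven(x, x.copy(), env)
```

Requiring agreement between methods is the conservative choice: fewer edges are flagged and removed, which matches the lower detection rate reported above for the intersection combination.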