1. Most ecosystems are subject to co-occurring, anthropogenically driven changes, and understanding how these multiple stressors interact is a pressing concern. Stressor interactions are typically studied using null models, with the additive and multiplicative null models being the most widely applied. Such approaches classify interactions as synergistic, antagonistic, reversal, or indistinguishable from the null expectation. Despite their widespread use, there has been no thorough analysis of these null models, nor a systematic test of how robust their results are to sample size or to sampling error in the estimated responses to stressors.
2. We use data simulated from food web models in which the true stressor interactions are known, together with analytical results based on the null model equations, to uncover how (i) sample size, (ii) variation in biological responses to the stressors, and (iii) statistical significance affect the ability to detect non-null interactions.
3. Our analyses lead to three main results. Firstly, the additive and multiplicative null models are not directly comparable: over one third of all simulated interactions had classifications that were model dependent. Secondly, both null models have weak power to correctly classify interactions at commonly implemented sample sizes (i.e., ≤6 replicates), unless data uncertainty is unrealistically low. This means all but the most extreme interactions are indistinguishable from the null model expectation. Thirdly, increasing sample size increases the power to detect the true interactions, but only very slowly, with the biggest gains coming from increasing the number of replicates from 3 up to 25. We provide an R function for users to determine the sample size required to detect a critical effect size of biological interest under the additive model.
4. Our results will aid researchers in the design of their experiments and the subsequent interpretation of results. We find no clear statistical advantage of using one null model over the other, and argue that null model choice should be based on biological relevance rather than statistical properties. However, there is a pressing need to increase experimental sample sizes; otherwise, many biologically important synergistic and antagonistic stressor interactions will continue to be missed.
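
To illustrate why classifications can depend on the null model, the following is a minimal sketch. It assumes a full factorial design (control, stressor A, stressor B, both) and works on raw treatment means only. It is not the effect-size implementation used in the study: it ignores sampling variance and confidence intervals, and the function names and numerical values are hypothetical.

```python
import math


def additive_interaction(control, a, b, ab):
    """Deviation of the observed combined effect from the additive null.

    The additive null expects the combined effect to equal the sum of the
    two individual effects, each measured as a difference from the control.
    """
    expected = (a - control) + (b - control)
    observed = ab - control
    return observed - expected


def multiplicative_interaction(control, a, b, ab):
    """Deviation from the multiplicative null, on the log scale.

    The multiplicative null expects proportional effects to combine, so the
    individual effects are log ratios relative to the control. All means
    must be strictly positive.
    """
    expected = math.log(a / control) + math.log(b / control)
    observed = math.log(ab / control)
    return observed - expected


# Hypothetical treatment means (e.g. biomass), chosen for illustration only.
means = {"control": 100.0, "a": 60.0, "b": 50.0, "ab": 30.0}

add_dev = additive_interaction(**means)
mult_dev = multiplicative_interaction(**means)

print(f"additive deviation:       {add_dev:.3f}")
print(f"multiplicative deviation: {mult_dev:.3f}")
```

With these hypothetical means, the combined treatment deviates from the additive expectation by 20 units, yet matches the multiplicative expectation (60% × 50% = 30% of the control) up to floating-point error. The same data would therefore be read as a non-null interaction under one model and as null under the other. In practice, any such deviation would need to be judged against its sampling uncertainty, which is where sample size becomes limiting.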