Abstract. Spatiotemporally continuous estimates of the hydrologic cycle are often generated through hydrologic modeling, reanalysis, or remote sensing methods, and are commonly applied as a supplement to, or a substitute for, in-situ measurements when observational data are sparse or unavailable. Many of these datasets are shared within the public domain, helping to accelerate progress in the fields of hydrology, climatology, and meteorology by (a) reducing the need for technical programming skills and computational power, and (b) providing a wide range of forecast and hindcast estimates of terrestrial hydrology that can be applied within ensemble analyses. Past model inter-comparisons focused on the causes of model disagreement, emphasizing forcing data, model structure, and calibration methods. Despite the recent increase in the use of publicly available modeled estimates in the scientific community, there is limited discussion or understanding of how selecting one dataset over others can affect study results. This study compares estimates of precipitation (P), actual evapotranspiration (AET), runoff (R), snow water equivalent (SWE), and rootzone soil moisture (RZSM) from 87 unique datasets generated by 47 hydrologic models, reanalysis datasets, and remote sensing products at the monthly timescale across the conterminous United States (CONUS) from 1982 to 2014. To understand the effect of model selection on terrestrial hydrology analyses, 2,925 water budgets were calculated over 2001–2010 for each of eight Environmental Protection Agency ecoregions by iterating through all combinations of 43 hydrologic flux estimates. Variability between hydrologic component estimates was shown to be higher in the western CONUS, with median coefficient of variation (CV) ranging from 11–22 % for P, 14–27 % for AET, 28–153 % for R, 92–102 % for SWE, and 39–92 % for RZSM.
Variability between estimates was lower in the eastern CONUS, with median CV ranging from 5–15 % for P, 13–23 % for AET, 29–96 % for R, 64–70 % for SWE, and 44–81 % for RZSM. Inter-annual trends in estimates from 1982–2010 show broader agreement for trends in P and AET fluxes but frequent disagreement for trends in R, SWE, and RZSM. Correlating fluxes and stores against remote sensing-derived products shows poor overall correlation in the western CONUS for AET and RZSM estimates. Iterative budget relative imbalances were shown to range from −50 % to +50 % in major eastern ecoregions and from −150 % to +60 % in western ecoregions, depending on the models selected. These results demonstrate that disagreement between estimates can be substantial, sometimes exceeding the magnitude of the measurements themselves. The authors conclude that multi-model ensembles are not only useful but necessary to represent uncertainty accurately in research results. The spatial bias of model disagreement toward the western United States shows that targeted research efforts in arid and semi-arid water-limited regions are warranted, with the greatest emphasis on storage and runoff components, to better describe complexities of the terrestrial hydrologic system and reconcile model disagreement.
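
As an illustration of the two summary statistics used above, the following minimal Python sketch computes a coefficient of variation across estimates and a relative budget imbalance for every combination of P, AET, and R estimates. The abstract does not give the exact formulas, so two assumptions are made here: CV is taken as the population standard deviation divided by the mean, and the relative imbalance is taken as (P − AET − R) / P with storage change neglected over the averaging period. The function names and flux values are hypothetical and are not drawn from the study.

```python
from itertools import product
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return the CV in percent (assumed: population std dev / mean)."""
    return 100.0 * pstdev(values) / mean(values)


def relative_imbalances(p_estimates, aet_estimates, r_estimates):
    """Return the relative imbalance in percent for every P/AET/R combination.

    Assumed definition: 100 * (P - AET - R) / P, neglecting storage change.
    """
    return [
        100.0 * (p - aet - r) / p
        for p, aet, r in product(p_estimates, aet_estimates, r_estimates)
    ]


# Hypothetical long-term mean fluxes for one ecoregion, in mm/yr.
p_estimates = [900.0, 1000.0]
aet_estimates = [600.0, 700.0]
r_estimates = [250.0, 400.0]

imbalances = relative_imbalances(p_estimates, aet_estimates, r_estimates)
print(f"{len(imbalances)} budgets, "
      f"imbalance range {min(imbalances):.1f} % to {max(imbalances):.1f} %")
print(f"CV of P estimates: {coefficient_of_variation(p_estimates):.1f} %")
```

Even with only two candidate estimates per component, the closure of the budget depends on which combination is chosen, which is the effect the iterative budgets are meant to expose.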