Performance Evaluation of the Multiple Quantile Regression Model for Estimating Spatial Soil Moisture after Filtering Soil Moisture Outliers
The spatial distribution of soil moisture (SM) in South Korea was estimated with a multiple quantile regression (MQR) model using Terra Moderate Resolution Imaging Spectroradiometer (MODIS) data and filtered SM data from 2013 to 2015. As input data, observed precipitation and SM data were collected from the Korea Meteorological Administration and from various institutions that monitor SM. To improve on a previous study, outlier detection using the isolation forest (IF) algorithm was applied to the observed SM data before SM was estimated. The data retained after outlier detection are referred to as IF_SM. The average data removal rate was 20.1% across 58 stations, which indicates that approximately 20% of the original observed SM data were uncertain for reasons such as instrumentation, environmental, and random errors. After outlier detection, a regression analysis was performed by estimating land surface temperature quantiles. Soil characteristics were taken into account by reclassifying soils into four types (clay, loam, silt, and sand), and the five-day antecedent precipitation was used to estimate the regression coefficients of the MQR model. Across all soil types, the coefficient of determination (R²) ranged from 0.25 to 0.77 and the root mean square error (RMSE) ranged from 1.86% to 12.21%. The MQR results showed a much better performance than the multiple linear regression (MLR) results, which yielded R² values of 0.20 to 0.66 and RMSE values of 1.08% to 7.23%. As a further illustration of the improvement, the box plots of the MQR SM were closer to those of the observed SM than the box plots of the MLR SM were. This result indicates that the cumulative distribution function (CDF) of the MQR SM matched the CDF of the observed SM. Thus, the MQR algorithm with outlier detection can overcome the limitations of the MLR algorithm by reducing both bias and variance.
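The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: the five-day antecedent precipitation is taken as the sum over the five days before each day (current day excluded), the coefficient of determination is computed as 1 - SS_res/SS_tot, and the pinball loss is shown only as the standard objective that quantile regression minimizes. All function names are hypothetical.

```python
import math


def antecedent_precip(daily_precip, window=5):
    """Sum of precipitation over the `window` days before each day.

    Assumption: the current day is excluded, and early days use
    whatever shorter history is available.
    """
    totals = []
    for i in range(len(daily_precip)):
        start = max(0, i - window)
        totals.append(sum(daily_precip[start:i]))
    return totals


def rmse(observed, estimated):
    """Root mean square error, in the units of the inputs."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, estimated):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pinball_loss(observed, estimated, tau):
    """Mean pinball (quantile) loss for quantile level `tau` in (0, 1).

    Under-estimates are weighted by tau and over-estimates by (1 - tau),
    which is what pulls the fit toward the tau-th quantile.
    """
    total = 0.0
    for o, e in zip(observed, estimated):
        diff = o - e
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(observed)


if __name__ == "__main__":
    precip_mm = [0.0, 10.0, 0.0, 5.0, 0.0, 0.0, 2.0]  # made-up daily values
    print(antecedent_precip(precip_mm))
```

In a workflow like the one described, `rmse` and `r_squared` would be applied separately to the MQR and MLR estimates for each soil type, while `pinball_loss` with several values of `tau` corresponds to fitting one regression per quantile.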