Quantitative Geochemical Prediction from Spectral Measurements and Its Application to Spatially Dispersed Spectral Data
The efficacy of predicting geochemical parameters with a two-step workflow that takes spectral data as its initial input is evaluated. Spectral measurements spanning approximately 400–25,000 nm are used to train a workflow consisting of a non-negative matrix factorization (NMF) step for data reduction, followed by random forest regression (RFR) to predict eight geochemical parameters. Approximately 175,000 spectra with their corresponding chemical analyses were available for training, validation and testing. The samples and their spectral and chemical parameters represent 9399 drill cores. Of these, approximately 20,000 spectra and their accompanying analyses were used for training and 5000 for model validation. The remaining paired data (150,000 samples) were used to test the method. The data are distributed over two large spatial extents (980 km² and 3025 km², respectively), which allowed the proposed method to be tested against samples that are spatially distant from the initial training points. Global R² scores and wt.% RMSE on the 150,000 test samples, given as Parameter (R²/RMSE), are Fe (0.95/3.01), SiO2 (0.96/3.77), Al2O3 (0.92/1.27), TiO2 (0.68/0.13), CaO (0.89/0.41), MgO (0.87/0.35), K2O (0.65/0.21) and LOI (0.90/1.14). These results demonstrate that the proposed method can predict the eight parameters and is stable enough, in the environment tested, to extend beyond the training set's initial spatial location.
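The two-step workflow described above can be illustrated with a minimal sketch, assuming scikit-learn as the implementation library (the abstract does not name one). The spectra and chemistry below are synthetic stand-ins generated from random endmember mixtures, not the study's data, and the function names, component count, tree count and split sizes are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import Pipeline

# The eight predicted parameters named in the abstract.
PARAMS = ["Fe", "SiO2", "Al2O3", "TiO2", "CaO", "MgO", "K2O", "LOI"]


def build_workflow(n_components=5, n_trees=50, seed=0):
    """NMF for data reduction, then a multi-output random forest regression.

    n_components and n_trees are illustrative values, not the study's settings.
    """
    return Pipeline([
        ("nmf", NMF(n_components=n_components, init="nndsvda",
                    max_iter=500, random_state=seed)),
        ("rfr", RandomForestRegressor(n_estimators=n_trees, random_state=seed)),
    ])


def score_per_parameter(y_true, y_pred):
    """Return {parameter: (R2, RMSE)} computed column by column."""
    scores = {}
    for j, name in enumerate(PARAMS):
        r2 = r2_score(y_true[:, j], y_pred[:, j])
        rmse = float(np.sqrt(mean_squared_error(y_true[:, j], y_pred[:, j])))
        scores[name] = (r2, rmse)
    return scores


# Synthetic stand-in data: non-negative spectra built as mixtures of
# random "endmember" spectra, with chemistry tied to the same abundances.
rng = np.random.default_rng(0)
n_samples, n_bands, n_endmembers = 600, 200, 5
endmembers = rng.random((n_endmembers, n_bands))
abundances = rng.dirichlet(np.ones(n_endmembers), size=n_samples)
spectra = abundances @ endmembers + 0.01 * rng.random((n_samples, n_bands))
chemistry = 100.0 * abundances @ rng.random((n_endmembers, len(PARAMS)))

# Hypothetical split: the first 400 samples train the workflow,
# the remaining 200 are held out for testing.
X_train, X_test = spectra[:400], spectra[400:]
y_train, y_test = chemistry[:400], chemistry[400:]

workflow = build_workflow()
workflow.fit(X_train, y_train)
y_pred = workflow.predict(X_test)

scores = score_per_parameter(y_test, y_pred)
for name, (r2, rmse) in scores.items():
    print(f"{name}: R2={r2:.2f}, RMSE={rmse:.2f}")
```

NMF requires non-negative input, which reflectance-type spectra normally satisfy. A random forest predicts several targets at once, so a single regressor covers all eight parameters here. Whether the study fitted one model or eight is not stated in the abstract.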