Variable Screening for Near Infrared (NIR) Spectroscopy Data Based on Ridge Partial Least Squares Regression

2020 ◽  
Vol 23 (8) ◽  
pp. 740-756
Author(s):  
Naifei Zhao ◽  
Qingsong Xu ◽  
Man-lai Tang ◽  
Hong Wang

Aim and Objective: Near Infrared (NIR) spectroscopy data are characterized by a few dozen to many thousands of samples and highly correlated variables. Quantitative analysis of such data usually requires combining analytical methods with variable selection or screening methods. Commonly used variable screening methods fail to recover the true model when (i) some of the variables are highly correlated, and (ii) the sample size is less than the number of relevant variables. In these cases, Partial Least Squares (PLS) regression based approaches can be useful alternatives. Materials and Methods: In this research, a fast variable screening strategy, namely preconditioned screening for ridge partial least squares regression (PSRPLS), is proposed for modelling NIR spectroscopy data with high-dimensional and highly correlated covariates. Under rather mild assumptions, we prove that, by using the Puffer transformation, the proposed approach converts the problem of variable screening with highly correlated predictor variables into one with weakly correlated covariates, at little extra computational cost. Results: We show that our proposed method leads to theoretically consistent model selection results. Four simulation studies and two real examples are then analyzed to illustrate the effectiveness of the proposed approach. Conclusion: By introducing the Puffer transformation, the PSRPLS procedure we construct mitigates the high-correlation problem. Employing RPLS regression makes the approach simpler and more computationally efficient when the model size is larger than the sample size, while maintaining high prediction precision.
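The decorrelating step named in the abstract above can be illustrated with a small NumPy sketch. It assumes the usual definition of the Puffer transformation, F = U D^{-1} U^T with X = U D V^T the thin singular value decomposition, and it assumes X has full rank. The function name `puffer_transform` and the synthetic data are hypothetical, and this is not the PSRPLS procedure itself, which the abstract does not specify in enough detail to reproduce.

```python
import numpy as np

def puffer_transform(X, y):
    """Precondition (X, y) with F = U D^{-1} U^T, where X = U D V^T (thin SVD).

    Assumes X has full rank min(n, p), so every singular value is nonzero.
    """
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    F = (U / d) @ U.T          # same as U @ diag(1/d) @ U.T
    return F @ X, F @ y        # F @ X equals U @ Vt

rng = np.random.default_rng(0)

def correlated_design(n, p):
    # Synthetic, highly correlated columns: one shared factor plus small noise.
    shared = rng.normal(size=(n, 1))
    return shared + 0.3 * rng.normal(size=(n, p))

# Case n >= p: the transformed columns become exactly orthonormal.
X_tall = correlated_design(200, 20)
y_tall = X_tall[:, 0] - X_tall[:, 1] + 0.1 * rng.normal(size=200)
Xt_tall, yt_tall = puffer_transform(X_tall, y_tall)
gram_tall = Xt_tall.T @ Xt_tall          # close to the 20 x 20 identity

# Case n < p: the transformed rows become orthonormal instead.
X_wide = correlated_design(30, 100)
y_wide = X_wide[:, 0] - X_wide[:, 1] + 0.1 * rng.normal(size=30)
Xt_wide, yt_wide = puffer_transform(X_wide, y_wide)
gram_wide = Xt_wide @ Xt_wide.T          # close to the 30 x 30 identity
```

In the tall case the transformed design has an identity Gram matrix, which is the sense in which the correlation between predictors is removed. In the wide case only the rows are orthonormalized, so the columns remain correlated to some degree.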

2001 ◽  
Vol 9 (2) ◽  
pp. 133-139
Author(s):  
L.G. Thygesen ◽  
S.B. Engelsen ◽  
M.H. Madsen ◽  
O.B. Sørensen

A set of 97 potato starch samples with a phosphate content corresponding to a phosphorus content between 0.029 and 0.11 g per 100 g dry matter was analysed using a Rapid Visco Analyzer (RVA) and near infrared (NIR) spectroscopy (700–2498 nm). NIR-based prediction of phosphate content was possible with a root mean square error of cross-validation (RMSECV) of 0.006% using partial least squares regression (PLSR). However, the NIR/PLSR model relied on weak spectral signals and was highly sensitive to sample preparation. The best prediction of phosphate content from the RVA viscograms was a linear regression model based on the RVA variable Breakdown, which gave an RMSECV of 0.008%. NIR/PLSR prediction of the RVA variables Peak viscosity and Breakdown was successful, probably because they were highly related to phosphate content in the present data. Prediction of the other RVA variables from NIR/PLSR was mediocre (Trough, Final viscosity) or not possible (Setback, Peak time, Pasting temperature).
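As a rough illustration of how an RMSECV figure like the ones quoted above is obtained, the following sketch fits a single-response PLS model (PLS1, NIPALS-style deflation) and pools the squared prediction errors over cross-validation folds. The helper names `pls1_fit` and `rmsecv`, the 10-fold split, the number of components and the synthetic spectra are all assumptions made for the example; the abstract does not state which cross-validation scheme or software the authors used.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 and return (coefficients, column means of X, mean of y)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)           # weight vector
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        c = yk @ t / tt                  # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def rmsecv(X, y, n_components, n_folds=10, seed=0):
    """Root mean square error pooled over k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = 0.0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        coef, xm, ym = pls1_fit(X[train], y[train], n_components)
        pred = (X[test] - xm) @ coef + ym
        sq_err += np.sum((y[test] - pred) ** 2)
    return np.sqrt(sq_err / len(y))

# Synthetic "spectra": two smooth bands whose intensities drive the response.
rng = np.random.default_rng(1)
n, p = 60, 100
grid = np.linspace(0.0, 1.0, p)
bands = np.stack([np.exp(-((grid - 0.3) / 0.10) ** 2),
                  np.exp(-((grid - 0.7) / 0.15) ** 2)])
scores = rng.normal(size=(n, 2))
X = scores @ bands + 0.01 * rng.normal(size=(n, p))
y = scores[:, 0] + 0.5 * scores[:, 1] + 0.05 * rng.normal(size=n)

error = rmsecv(X, y, n_components=2)
```

Because every sample is predicted by a model that never saw it, the pooled error is a fairer estimate of predictive ability than the calibration error on the training data.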


2020 ◽  
Vol 38 (No. 2) ◽  
pp. 131-136
Author(s):  
Wojciech Poćwiardowski ◽  
Joanna Szulc ◽  
Grażyna Gozdecka

The aim of the study was to develop a universal calibration for the near infrared (NIR) spectrophotometer to determine the moisture content of various kinds of vegetable seeds. The research was conducted on the seeds of five types of vegetables: carrot, parsley, lettuce, radish and beetroot. Partial least squares (PLS) regression was used to correlate the spectra with moisture values. The resulting quality indicators of the calibration model (R = 0.9968, Q = 0.8904) confirmed an excellent fit of the obtained calibration to the experimental data. The study confirmed that a calibration model can be created for the NIR spectrophotometer for non-destructive moisture analysis of various kinds of vegetable seeds.


2019 ◽  
Vol 2 (1) ◽  
pp. 43-55
Author(s):  
Joan Espel Grekopoulos

There is increasing interest in cannabinoids, as they are proving effective in treating the symptoms of a variety of medical conditions. Commercialization of cannabinoid-based pharmaceutical products is expected to grow in the near future, favored by recent changes in medical regulations in many developed countries. Hence, robust and reliable analytical methods for determining the content of the active pharmaceutical ingredient will be needed, as this is one of the most relevant parameters in the decision to release the final pharmaceutical product onto the market. The aim of this work was to demonstrate that near-infrared (NIR) spectroscopy fulfills the requirements for this purpose, as well as to provide a methodology that can be applied to other cannabinoid-based products. We present two validated methods for the quantification of different liquid pharma-grade cannabidiol (CBD) formulations based on NIR spectroscopy and partial least squares regression modelling. The methods were constructed and validated with spectra from both production samples and laboratory samples made specifically for this purpose, and they fulfill the guideline requirements of the European Medicines Agency and the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. These methods allow the CBD content to be determined with results comparable to the usual method of choice while saving reagent- and time-related costs.


2019 ◽  
Vol 28 (3) ◽  
pp. e015
Author(s):  
José-Henrique Camargo Pace ◽  
João-Vicente De Figueiredo Latorraca ◽  
Paulo-Ricardo Gherardi Hein ◽  
Alexandre Monteiro de Carvalho ◽  
Jonnys Paz Castro ◽  
...  

Aim of study: Fast and reliable wood identification solutions are needed to combat the illegal trade in native woods. In this study, multivariate analysis was applied to near-infrared (NIR) spectra to identify wood of Atlantic Forest species. Area of study: Planted forests located in the Vale Natural Reserve in the county of Sooretama (19°01'09"S, 40°05'51"W), Espírito Santo, Brazil. Material and methods: Three trees of each of 12 native species from homogeneous plantations. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were performed on the woods' spectral signatures. Main results: The PCA scores made it possible to group some wood species from their spectra. The percentage of correct classifications generated by the PLS-DA model was 93.2%. In the independent validation, the PLS-DA model correctly classified 91.3% of the samples. Research highlights: The PLS-DA models were adequate to classify and identify the twelve native wood species based on their respective NIR spectra, showing good ability to classify independent native wood samples. Keywords: native woods; NIR spectra; principal components; partial least squares regression.


2018 ◽  
Vol 11 (7) ◽  
pp. e201700365
Author(s):  
Raphael Henn ◽  
Christian G. Kirchler ◽  
Zora L. Schirmeister ◽  
Andreas Roth ◽  
Werner Mäntele ◽  
...  
