Snow Depth Fusion Based on Machine Learning Methods for the Northern Hemisphere

2021 ◽  
Vol 13 (7) ◽  
pp. 1250
Author(s):  
Yanxing Hu ◽  
Tao Che ◽  
Liyun Dai ◽  
Lin Xiao

In this study, a machine learning algorithm was introduced to fuse gridded snow depth datasets. The input variables of the machine learning method included geolocation (latitude and longitude), topographic data (elevation), gridded snow depth datasets and in situ observations. A total of 29,565 in situ observations were used to train and optimize the machine learning algorithm. A total of five gridded snow depth datasets—Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) snow depth, Global Snow Monitoring for Climate Research (GlobSnow) snow depth, Long time series of daily snow depth over the Northern Hemisphere (NHSD) snow depth, ERA-Interim snow depth and Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2) snow depth—were used as input variables. The first three snow depth datasets are retrieved from passive microwave brightness temperatures or from assimilation with in situ observations, while the last two are snow depth datasets obtained from meteorological reanalysis data with a land surface model and data assimilation system. Then, three machine learning methods, i.e., Artificial Neural Networks (ANN), Support Vector Regression (SVR), and Random Forest Regression (RFR), were used to produce a fused snow depth dataset from 2002 to 2004. The RFR model performed best and was thus used to produce a new snow depth product from the fusion of the five snow depth datasets and auxiliary data over the Northern Hemisphere from 2002 to 2011. The fused snow depth product was verified at five well-known snow observation sites. The R² values at Sodankylä, Old Aspen, and Reynolds Mountains East were 0.88, 0.69, and 0.63, respectively. At the Swamp Angel Study Plot and Weissfluhjoch observation sites, which have an average snow depth exceeding 200 cm, the fused snow depth did not perform well.
The spatial patterns of the average snow depth were analyzed seasonally; the average snow depths of autumn, winter, and spring were 5.7, 25.8, and 21.5 cm, respectively. In the future, random forest regression will be used to produce a long time series of fused snow depth over the Northern Hemisphere or other specific regions.
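The fusion step described above can be sketched as a random forest regression trained against station observations, with geolocation, elevation, and the five gridded products as predictors. The data below are synthetic stand-ins and the column layout is an assumption, not the authors' actual pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000
# Columns (illustrative): latitude, longitude, elevation, then the five
# gridded snow depth products, all rescaled to [0, 1] for this toy example.
X = rng.uniform(0.0, 1.0, size=(n, 8))
# Synthetic "in situ" target: a nonlinear blend of the gridded products.
y = 0.5 * X[:, 3] + 0.3 * X[:, 4] ** 2 + 0.2 * X[:, 5] * X[:, 2]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
fused = rf.predict(X_te)  # fused snow depth estimate for held-out stations
```

The same fitted model, applied grid cell by grid cell, would yield the fused product; hyperparameters here are defaults, not tuned values from the study.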

Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 300
Author(s):  
Mark Lokanan ◽  
Susan Liu

Protecting financial consumers from investment fraud has been a recurring problem in Canada. The purpose of this paper is to predict the demographic characteristics of investors who are likely to be victims of investment fraud. Data for this paper came from the Investment Industry Regulatory Organization of Canada's (IIROC) database between January 2009 and December 2019. In total, 4575 investors were coded as victims of investment fraud. The study employed a machine-learning algorithm to predict the probability of fraud victimization. The model deployed in this paper identified the typical demographic profile of fraud victims as investors who are female, have poor financial knowledge, know the advisor from the past, and are retired. Investors characterized as having limited financial literacy but a long-standing relationship with their advisor have a reduced probability of being victimized. However, male investors with low or moderate investment knowledge were more likely to be preyed upon by their investment advisors. While not statistically significant, older adults in general are at greater risk of being victimized. The findings from this paper can be used by Canadian self-regulatory organizations and securities commissions to inform their investor-protection mandates.
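The prediction task above, scoring the probability of victimization from encoded demographic features, can be sketched as follows. The feature encodings and the label rule are invented for illustration and do not reflect IIROC's actual coding:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
n = 3000
# Illustrative encodings (not IIROC's actual scheme): sex, financial
# knowledge level, knows-advisor-from-past, retirement status.
X = rng.integers(0, 3, size=(n, 4))
# Synthetic label loosely mirroring the reported victim profile.
y = (X[:, 1] == 0) & (X[:, 3] == 2)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
p_victim = clf.predict_proba(X[:5])[:, 1]  # predicted victimization probability
```

The per-investor probabilities, rather than hard labels, are what a regulator would rank on when prioritizing outreach.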


Friction ◽  
2021 ◽  
Author(s):  
Vigneashwara Pandiyan ◽  
Josef Prost ◽  
Georg Vorlaufer ◽  
Markus Varga ◽  
Kilian Wasmer

Abstract. Functional surfaces in relative contact and motion are prone to wear and tear, resulting in loss of efficiency and performance of the workpieces/machines. Wear occurs in the form of adhesion, abrasion, scuffing, galling, and scoring between contacts. However, the rate of the wear phenomenon depends primarily on the physical properties and the surrounding environment. Monitoring the integrity of surfaces by offline inspection leads to significant wasted machine time. A potential alternative to the offline inspection currently practiced in industry is the analysis of sensor signatures capable of capturing the wear state and correlating it with the wear phenomenon, followed by in situ classification using a state-of-the-art machine learning (ML) algorithm. Though this technique is better than offline inspection, it has inherent disadvantages for training the ML models. Ideally, supervised training of ML models requires the classes in the dataset to be equally represented to avoid bias. Collecting such a dataset is very cumbersome and expensive in practice, as in real industrial applications the malfunction period is minimal compared to normal operation. Furthermore, classification models cannot separate new, unfamiliar wear phenomena from the normal regime. As a promising alternative, in this work we propose a methodology able to differentiate the abnormal regimes, i.e., wear phenomenon regimes, from the normal regime. This is carried out by familiarizing the ML algorithms only with the distribution of the acoustic emission (AE) signals, captured using a microphone, that correspond to the normal regime. As a result, the ML algorithms can detect whether overlaps exist with the learnt distribution when a new, unseen signal arrives. To achieve this goal, a generative convolutional neural network (CNN) architecture based on a variational autoencoder (VAE) is built and trained.
During validation of the proposed CNN architecture, acoustic signals corresponding to the normal and abnormal wear regimes were identified with accuracies of 97% and 80%, respectively. Hence, our approach shows very promising results for in situ and real-time condition monitoring, or even wear prediction, in tribological applications.
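The decision rule described above can be sketched with a plain autoencoder (an MLP standing in for the paper's convolutional VAE): train only on normal-regime signals, then flag a new signal as abnormal when its reconstruction error exceeds a threshold set from the training data. The signal windows below are synthetic:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 0.1, size=(500, 32))   # synthetic normal-regime AE windows
abnormal = rng.normal(0.8, 0.3, size=(50, 32))  # synthetic wear-regime windows

# Bottlenecked MLP trained to reconstruct only the normal-regime signals.
ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=1000, random_state=0)
ae.fit(normal, normal)

def recon_error(x):
    # Mean squared reconstruction error per window.
    return np.mean((ae.predict(x) - x) ** 2, axis=1)

# Threshold chosen as the 99th percentile of training reconstruction error.
thr = np.percentile(recon_error(normal), 99)
flags = recon_error(abnormal) > thr  # True = flagged as abnormal regime
```

A VAE would additionally give a probabilistic score (the evidence lower bound) instead of a raw squared error, but the thresholding logic is the same.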


The article aims to develop a model for forecasting the characteristics of traffic flows in real time, based on the classification of applications using machine learning methods, in order to ensure quality of service. It is shown that the model can forecast the mean packet rate and the packet arrival frequency for the entire flow of each class separately. The prediction is based on information about previous flows of the same class and on the first 15 packets of the active flow. The Random Forest Regression method reduces the prediction error by approximately a factor of 1.5 compared to the standard mean estimate for transmitted packets reported at the switch interface.
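The prediction step described above can be sketched as a regression from early-flow features to the whole-flow mean rate. The choice of features (packet sizes and inter-arrival gaps of the first 15 packets) follows the text; the data are synthetic:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n_flows = 1000
sizes = rng.uniform(64, 1500, size=(n_flows, 15))  # first-15-packet sizes (bytes)
gaps = rng.exponential(0.01, size=(n_flows, 15))   # inter-arrival times (s)
X = np.hstack([sizes, gaps])
# Synthetic target: the flow's mean packet rate (packets per second).
y = 1.0 / (gaps.mean(axis=1) + 1e-6)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
rate_forecast = model.predict(X[:1])  # forecast for one active flow
```

In deployment the model would be keyed by application class, with one set of historical flows per class, as the abstract indicates.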


2002 ◽  
Vol 2 (5) ◽  
pp. 1599-1633 ◽  
Author(s):  
M. Seifert ◽  
J. Ström ◽  
R. Krejci ◽  
A. Minikin ◽  
A. Petzold ◽  
...  

Abstract. In situ observations of aerosol particles contained in cirrus crystals are presented and compared to interstitial aerosol size distributions (non-activated particles in between the cirrus crystals). The observations were conducted in cirrus clouds in the Southern and Northern Hemisphere mid-latitudes during the INCA project. The first campaign, in March and April 2000, was performed from Punta Arenas, Chile (54° S) in pristine air. The second campaign, in September and October 2000, was performed from Prestwick, Scotland (53° N) in the vicinity of the North Atlantic flight corridor. Size distribution measurements of crystal residuals (particles remaining after evaporation of the crystals) show that small aerosol particles (Dp < 0.1 µm) dominate the number density of residuals. The crystal residual size distributions were significantly different in the two campaigns. On average, the residual size distributions were shifted towards larger sizes in the Southern Hemisphere. For a given integral residual number density, the calculated particle volume was on average three times larger in the Southern Hemisphere. This may be of significance to the vertical redistribution of aerosol mass by clouds in the tropopause region. In both campaigns the mean residual size increased with increasing crystal number density. The observations of ambient aerosol particles were consistent with the expected higher pollution level in the Northern Hemisphere. Residual particles contribute only approximately one percent or less of the total number of particles, i.e., the sum of the residual and interstitial particles.


2021 ◽  
Vol 295 (2) ◽  
pp. 97-100
Author(s):  
K. Seniva

This article discusses the main ways in which neural networks and machine learning methods of various types are used in computer games. Machine learning and neural networks are hot topics in many technology fields; one of them is the creation of computer games, where new tools are used to make games more interesting. Remastering and modifying games with neural networks has become a new trend. Neural networks are one of the most popular ways to implement artificial intelligence, used in everything from medicine to the entertainment industry, and games are one of the most promising areas for their development. The game world is an ideal platform for testing artificial intelligence without the danger of harming nature or people. Making bots more sophisticated is just a small part of what neural networks can do; they are also actively used in game development itself, and in some areas they already make people feel uneasy about their own roles. Research is ongoing on color and light correction, real-time character animation, and behavior control, and the main types of neural networks that can learn such functions are considered. Neural networks learn (and self-learn) very quickly: the more primitive the task, the sooner a human becomes unnecessary. This is already noticeable in the gaming industry and will soon spread to other areas of life, because games are merely a convenient platform for experimenting with artificial intelligence before deploying it in real life. The main problem faced by scientists is that it is difficult for neural networks to reproduce the mechanics of a game. There are some achievements in this direction, but research continues. Therefore, human specialists will still be required in game development for a long time to come, although AI already copes with some tasks.


2020 ◽  
Vol 24 (10) ◽  
pp. 4887-4902
Author(s):  
Fraser King ◽  
Andre R. Erler ◽  
Steven K. Frey ◽  
Christopher G. Fletcher

Abstract. Snow is a critical contributor to Ontario's water-energy budget, with impacts on water resource management and flood forecasting. Snow water equivalent (SWE) describes the amount of water stored in a snowpack and is important in deriving estimates of snowmelt. However, only a limited number of sparsely distributed snow survey sites (n=383) exist throughout Ontario. The SNOw Data Assimilation System (SNODAS) is a daily, 1 km gridded SWE product that provides uniform spatial coverage across this region; however, we show here that SWE estimates from SNODAS display a strong positive mean bias of 50 % (16 mm SWE) when compared to in situ observations from 2011 to 2018. This study evaluates multiple statistical techniques of varying complexity, including simple subtraction, linear regression and machine learning methods to bias-correct SNODAS SWE estimates using absolute mean bias and RMSE as evaluation criteria. Results show that the random forest (RF) algorithm is most effective at reducing bias in SNODAS SWE, with an absolute mean bias of 0.2 mm and RMSE of 3.64 mm when compared with in situ observations. Other methods, such as mean bias subtraction and linear regression, are somewhat effective at bias reduction; however, only the RF method captures the nonlinearity in the bias and its interannual variability. Applying the RF model to the full spatio-temporal domain shows that the SWE bias is largest before 2015, during the spring melt period, north of 44.5° N and east (downwind) of the Great Lakes. As an independent validation, we also compare estimated snowmelt volumes with observed hydrographs and demonstrate that uncorrected SNODAS SWE is associated with unrealistically large volumes at the time of the spring freshet, while bias-corrected SWE values are highly consistent with observed discharge volumes.
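The RF bias-correction scheme described above can be sketched as learning the SNODAS-minus-station bias from predictors, then subtracting the predicted bias from the raw product. The predictor choice (day of year, latitude, the SNODAS estimate itself) and the bias structure below are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 1500
doy = rng.integers(1, 366, n)                 # day of year
lat = rng.uniform(42.0, 52.0, n)              # latitude (Ontario-like range)
snodas = rng.uniform(0.0, 300.0, n)           # raw SNODAS SWE (mm)
# Synthetic nonlinear bias: larger north of 44.5° N, scaling with SWE.
bias = 0.15 * snodas * (lat > 44.5) + 5.0
obs = snodas - bias                           # in situ SWE implied by the bias

X = np.column_stack([doy, lat, snodas])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, snodas - obs)
corrected = snodas - rf.predict(X)            # bias-corrected SWE (mm)
```

Because the RF learns the bias rather than SWE directly, it can capture the nonlinear, regime-dependent structure that simple mean subtraction misses.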


2021 ◽  
Vol 4 (2) ◽  
pp. 34-69
Author(s):  
Dávid Burka ◽  
László Kovács ◽  
László Szepesváry

Pricing an insurance product covering motor third-party liability is a major challenge for actuaries. Comprehensive statistical modelling and modern computational power are necessary to solve this problem. Generalised linear and additive modelling approaches have long been widely used by insurance companies. Modelling with modern machine learning methods has recently started, but applying them properly, with relevant features, is a considerable challenge for pricing experts. This study analyses the claim-causing probability by fitting generalised linear models, generalised additive models, random forests, and neural networks. Several evaluation measures are used to compare these techniques. The best model is a mixture of the base methods. The authors' hypothesis about the existence of significant interactions between feature variables is confirmed by the models. A simplified classification and visualisation are performed on the final model, which can later support tariff applications.
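A "mixture of the base methods" can be sketched as blending the claim probabilities of a GLM (logistic regression) and a random forest. The equal weights, the feature names, and the interaction in the synthetic data are assumptions, not the study's fitted model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
n = 2000
X = rng.normal(size=(n, 4))                  # e.g. driver age, power, bonus-malus, region
# Synthetic claim process with an interaction term, as hypothesised in the text.
logit = X[:, 0] + 0.5 * X[:, 1] * X[:, 2]
y = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-logit))

glm = LogisticRegression().fit(X, y)                               # "GLM" base model
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Equal-weight mixture of the two base models' claim probabilities.
p_mix = 0.5 * glm.predict_proba(X)[:, 1] + 0.5 * rf.predict_proba(X)[:, 1]
```

In practice the mixture weights would themselves be chosen on held-out data against the same evaluation measures used to compare the base models.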


2021 ◽  
Vol 8 (3) ◽  
pp. 209-221
Author(s):  
Li-Li Wei ◽  
Yue-Shuai Pan ◽  
Yan Zhang ◽  
Kai Chen ◽  
Hao-Yu Wang ◽  
...  

Abstract Objective To study the application of a machine learning algorithm for predicting gestational diabetes mellitus (GDM) in early pregnancy. Methods This study identified indicators related to GDM through a literature review and expert discussion. Pregnant women who had attended medical institutions for an antenatal examination from November 2017 to August 2018 were selected for analysis, and the collected indicators were retrospectively analyzed. Based on Python, the indicators were classified and modeled using a random forest regression algorithm, and the performance of the prediction model was analyzed. Results We obtained 4806 analyzable records from 1625 pregnant women. Among these, 3265 samples with all 67 indicators were used to establish data set F1, and 4806 samples with 38 identical indicators were used to establish data set F2. Each of F1 and F2 was used to train the random forest algorithm. The overall predictive accuracy of the F1 model was 93.10%, the area under the receiver operating characteristic curve (AUC) was 0.66, and the predictive accuracy for GDM-positive cases was 37.10%. The corresponding values for the F2 model were 88.70%, 0.87, and 79.44%. The results thus showed that the F2 prediction model performed better than the F1 model. To explore the impact of discarding indicators on GDM prediction, the F3 data set was established using the 3265 samples of F1 restricted to the 38 indicators of F2. After training, the overall predictive accuracy of the F3 model was 91.60%, the AUC was 0.58, and the predictive accuracy for positive cases was 15.85%. Conclusions In this study, a model for predicting GDM from several input variables (e.g., physical examination, past history, personal history, family history, and laboratory indicators) was established using a random forest regression algorithm. The trained prediction model exhibited good performance and is valuable as a reference for predicting GDM in women at an early stage of pregnancy.
In addition, there are certain requirements for the proportions of negative and positive cases in the sample data sets when the random forest algorithm is applied to the early prediction of GDM.
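The class-proportion point above can be illustrated with a random forest trained on an imbalanced synthetic set, once with default weights and once with balanced class weighting; the indicators and the imbalance ratio are invented, and only the mechanism (not the study's results) is shown:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(9)
n = 3000
X = rng.normal(size=(n, 10))                   # synthetic antenatal indicators
y = (X[:, 0] + rng.normal(0.0, 1.0, n)) > 2.0  # rare positive class (~8%)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
plain = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
weighted = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                                  random_state=0).fit(X_tr, y_tr)

def positive_recall(model):
    # Fraction of true positive cases the model recovers on held-out data.
    pred = model.predict(X_te)
    return (pred & y_te).sum() / y_te.sum()
```

Comparing `positive_recall(plain)` with `positive_recall(weighted)` makes the abstract's point concrete: when positives are scarce, the unweighted forest tends to favour overall accuracy at the expense of positive-case recall.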


2018 ◽  
Vol 10 (4) ◽  
pp. 1829-1842 ◽  
Author(s):  
Athanasia Iona ◽  
Athanasios Theodorou ◽  
Sarantis Sofianos ◽  
Sylvain Watelet ◽  
Charles Troupin ◽  
...  

Abstract. We present a new product composed of a set of thermohaline climatic indices from 1950 to 2015 for the Mediterranean Sea such as decadal temperature and salinity anomalies, their mean values over selected depths, decadal ocean heat and salt content anomalies at selected depth layers as well as their long time series. It is produced from a new high-resolution climatology of temperature and salinity on a 1/8° regular grid based on historical high-quality in situ observations. Ocean heat and salt content differences between 1980–2015 and 1950–1979 are compared for evaluation of the climate shift in the Mediterranean Sea. The two successive periods are chosen according to the standard WMO climate normals. The spatial patterns of heat and salt content shifts demonstrate that the climate changes differently in the several regions of the basin. Long time series of heat and salt content for the period 1950 to 2015 are also provided which indicate that in the Mediterranean Sea there is a net mean volume warming and salinification since 1950 that has accelerated during the last two decades. The time series also show that the ocean heat content seems to fluctuate on a cycle of about 40 years and seems to follow the Atlantic Multidecadal Oscillation climate cycle, indicating that the natural large-scale atmospheric variability could be superimposed onto the warming trend. This product is an observation-based estimation of the Mediterranean climatic indices. It relies solely on spatially interpolated data produced from in situ observations averaged over decades in order to smooth the decadal variability and reveal the long-term trends. It can provide a valuable contribution to the modellers' community, next to the satellite-based products, and serve as a baseline for the evaluation of climate-change model simulations, thus contributing to a better understanding of the complex response of the Mediterranean Sea to the ongoing global climate change.
The product is available in netCDF at the following sources: annual and seasonal T/S anomalies (https://doi.org/10.5281/zenodo.1408832), annual and seasonal T/S vertical averaged anomalies (https://doi.org/10.5281/zenodo.1408929), annual and seasonal areal density of OHC/OSC anomalies (https://doi.org/10.5281/zenodo.1408877), annual and seasonal linear trends of T/S, OHC/OSC anomalies (https://doi.org/10.5281/zenodo.1408917), annual and seasonal time series of T/S, OHC/OSC anomalies (https://doi.org/10.5281/zenodo.1411398), and differences of two 30-year averages of annual and seasonal T/S, OHC/OSC anomalies (https://doi.org/10.5281/zenodo.1408903).
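The areal density of ocean heat content anomaly listed above follows the standard definition OHC' = ρ₀·c_p·∫ T'(z) dz, in J m⁻². A minimal numerical sketch, with an assumed temperature-anomaly profile (the constants are conventional seawater values, not taken from this product):

```python
import numpy as np

rho0 = 1029.0   # reference seawater density (kg m^-3), assumed
cp = 3985.0     # seawater specific heat (J kg^-1 K^-1), assumed
z = np.linspace(0.0, 700.0, 71)    # depth levels (m) for a 0-700 m layer
t_anom = 0.2 * np.exp(-z / 300.0)  # assumed decadal temperature anomaly (K)

# Trapezoidal integration of rho0 * cp * T'(z) over depth -> J m^-2.
ohc_anom = rho0 * cp * np.sum(0.5 * (t_anom[1:] + t_anom[:-1]) * np.diff(z))
```

Applied per grid cell of the 1/8° climatology and averaged per decade, this is the quantity whose anomalies and trends the netCDF files provide.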

