A note on leveraging synergy in multiple meteorological data sets with deep learning for rainfall–runoff modeling

2021 ◽  
Vol 25 (5) ◽  
pp. 2685-2703
Author(s):  
Frederik Kratzert ◽  
Daniel Klotz ◽  
Sepp Hochreiter ◽  
Grey S. Nearing

Abstract. A deep learning rainfall–runoff model can take multiple meteorological forcing products as input and learn to combine them in spatially and temporally dynamic ways. This is demonstrated with Long Short-Term Memory networks (LSTMs) trained over basins in the continental US, using the Catchment Attributes and Meteorological data set for Large Sample Studies (CAMELS). Using meteorological input from different data products (North American Land Data Assimilation System, NLDAS, Maurer, and Daymet) in a single LSTM significantly improved simulation accuracy relative to using only individual meteorological products. A sensitivity analysis showed that the LSTM combines precipitation products in different ways, depending on location, and also in different ways for the simulation of different parts of the hydrograph.
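At the data level, the multi-forcing setup described above amounts to concatenating, at each timestep, the values from all products into a single input vector for the LSTM. A minimal sketch in plain Python (the product names follow the abstract, but the precipitation values are invented toy numbers):

```python
# Toy daily precipitation series (mm/day) from three forcing products.
# Values are illustrative only, not real NLDAS/Maurer/Daymet data.
nldas  = [2.0, 0.0, 5.1, 1.3]
maurer = [1.8, 0.1, 4.7, 1.0]
daymet = [2.2, 0.0, 5.5, 1.6]

def stack_forcings(*products):
    """Concatenate per-timestep values from all products into one
    feature vector per timestep -- the (timesteps, n_products) shape
    a recurrent model would consume."""
    assert len({len(p) for p in products}) == 1, "series must align in time"
    return [list(step) for step in zip(*products)]

inputs = stack_forcings(nldas, maurer, daymet)
# inputs[0] is the feature vector for day 0: one value per product
```

Given such inputs, the network is free to weight each product differently per basin and per timestep, which is the behavior the sensitivity analysis probes.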



Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier-transform-inspired method to classify human activities from time-series sensor data. Methods: Our method begins by decomposing the 1D input signal into 2D patterns, motivated by the Fourier transform. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependencies in the signal and produces encoded sequences. The sequences, once arranged into a 2D array, represent fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as Convolutional Neural Networks (CNNs). Results: The proposed model is therefore a combination of an LSTM and a CNN. We evaluate the model on two data sets. On the first data set, which is more standardized than the other, our model matches or outperforms previous works. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding step, and the labeling scheme. Conclusion: The evaluation results show accuracy above 95% in some cases. We also analyze the effect of these parameters on performance.
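The 1D-to-2D arrangement described above can be sketched at the shape level. The snippet below is a hedged illustration: a trivial per-window summary stands in for the LSTM encoder, since the point here is only how per-window codes stack into a 2D "fingerprint" that a CNN could consume:

```python
def windows(signal, size, stride):
    """Slice a 1D signal into overlapping windows; size and stride
    correspond to the window/sliding parameters the abstract tunes."""
    return [signal[i:i + size]
            for i in range(0, len(signal) - size + 1, stride)]

def encode(window):
    """Stand-in for the LSTM encoder: any fixed-length code per window
    (here min, max, mean -- purely illustrative)."""
    return [min(window), max(window), sum(window) / len(window)]

def to_2d(signal, size, stride):
    """Arrange the per-window codes row by row into a 2D array,
    which an image classifier such as a CNN could then consume."""
    return [encode(w) for w in windows(signal, size, stride)]

image = to_2d([0, 1, 2, 3, 2, 1, 0, 1], size=4, stride=2)
# image is an (n_windows x code_length) 2D array
```

In the paper's actual model the encoder is an LSTM, so each row carries temporal context from the preceding samples rather than a memoryless summary.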


Author(s):  
Pavan Kumar Yeditha ◽  
Maheswaran Rathinasamy ◽  
Sai Sumanth Neelamsetty ◽  
Biswa Bhattacharya ◽  
Ankit Agarwal

Abstract Rainfall–runoff models are valuable tools for flood forecasting, management of water resources, and drought warning. With the advancement of space technology, a plethora of satellite precipitation products (SPPs) are publicly available. However, the application of satellite data to data-driven rainfall–runoff models is still emerging and requires careful investigation. In this work, two satellite rainfall data sets, namely the Global Precipitation Measurement-Integrated Multi-Satellite Retrieval Product V6 (GPM-IMERG) and the Climate Hazards Group Infrared Precipitation with Station data (CHIRPS), are evaluated for the development of rainfall–runoff models and the prediction of 1-day-ahead streamflow. The accuracy of the data from the SPPs is compared to the India Meteorological Department (IMD)-gridded precipitation data set. Detection metrics showed that for light rainfall (1–10 mm), the probability of detection (POD) ranges between 0.67 and 0.75, while for the higher rainfall ranges, i.e., medium and heavy rainfall (10–50 mm and >50 mm), the POD values ranged from 0.24 to 0.45. These results indicate that the satellite precipitation performs satisfactorily with reference to the IMD-gridded data set. Using nearly two decades (2000–2018) of daily precipitation data over two river basins in the eastern part of India, artificial neural network (ANN), extreme learning machine (ELM), and long short-term memory (LSTM) models are developed for rainfall–runoff modelling. One-day-ahead runoff prediction using the developed rainfall–runoff models confirmed that both SPPs are sufficient to drive the rainfall–runoff models with reasonable accuracy, estimated using the Nash–Sutcliffe efficiency coefficient (NSC), correlation coefficient, and root-mean-squared error.
In particular, the 1-day streamflow forecasts for the Vamsadhara river basin (VRB) using LSTM with GPM-IMERG inputs resulted in NSC values of 0.68 and 0.67, while ELM models for the Mahanadhi river basin (MRB) with the same inputs resulted in NSC values of 0.86 and 0.87, respectively, during the training and validation stages. Likewise, the LSTM model with CHIRPS inputs for the VRB resulted in NSC values of 0.68 and 0.65, and the ELM model with CHIRPS inputs for the MRB resulted in NSC values of 0.89 and 0.88, respectively, in the training and validation stages. These results indicate that both SPPs can reliably be used with LSTM and ELM models for rainfall–runoff modelling and streamflow prediction. This paper highlights that machine learning models, such as ELM and LSTM, driven by GPM-IMERG products can open a new horizon for flood forecasting in flood-prone catchments.
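The Nash–Sutcliffe efficiency used above to score the forecasts is straightforward to compute; a minimal implementation (the streamflow values below are invented purely for illustration):

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / (variance about the
    observed mean). 1.0 is a perfect match; 0.0 means the model is
    no better than predicting the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_mean = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / ss_mean

# Invented 1-day-ahead streamflow values (m^3/s), illustration only.
obs = [10.0, 12.0, 40.0, 22.0, 15.0]
sim = [11.0, 13.0, 35.0, 20.0, 16.0]
score = nse(obs, sim)
```

Because the denominator is the variance about the mean, NSE heavily weights the large flood peaks, which is why it is a common score for streamflow forecasts.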


2013 ◽  
Vol 10 (4) ◽  
pp. 3973-4013
Author(s):  
B. Samain ◽  
V. R. N. Pauwels

Abstract. To date, lumped rainfall-runoff models rely on rough estimates of catchment-averaged potential evapotranspiration (ETp) rates as meteorological forcing. A model parameter converts this ETp input into actual evapotranspiration (ETact) estimates. This paper examines the potential use of scintillometer-based ETact rates for rainfall-runoff modeling. It has been found that the reservoir structure of the rainfall-runoff model functions as a low-pass filter on the ETp input. If the long-term volume of the ETp used in the model simulations is consistent with the data set used for calibration, a good match of the seasonal pattern, using temporally constant ETp data, is sufficient to obtain adequate discharge simulations. However, these results are then obtained with strongly erroneous evapotranspiration estimates. A better match of the diurnal cycle does not lead to better model results. Replacing the ETp inputs with scintillometer-based ETact estimates does not lead to better model predictions. Small underestimations of ETact under stable conditions, which occur at night and during winter, and which accumulate to significant amounts, are the cause of this problem. Consistent with other studies, the scintillometer-based ETact estimates can be considered reliable and realistic under unstable conditions. These values can thus be used as forcing for rainfall-runoff models.
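The low-pass-filter behaviour described above can be seen in even the simplest conceptual model. The toy sketch below (a single linear reservoir with invented parameter values, not the paper's model) shows that an ET forcing oscillating by ±1 mm/day around the same mean barely changes the simulated discharge:

```python
def simulate(et_series, rain=3.0, k=20.0, s0=50.0):
    """Single linear reservoir: storage S gains rain, loses ET, and
    releases discharge Q = S / k each step. Returns the Q series.
    All parameter values are illustrative."""
    s, q = s0, []
    for et in et_series:
        s = max(s + rain - et, 0.0)
        q.append(s / k)
        s -= s / k
    return q

n = 100
flat = [2.0] * n                                           # constant ET
wavy = [2.0 + (1.0 if t % 2 else -1.0) for t in range(n)]  # same mean, oscillating

q_flat = simulate(flat)
q_wavy = simulate(wavy)
max_diff = max(abs(a - b) for a, b in zip(q_flat, q_wavy))
# max_diff is far smaller than the 1.0 mm/day swing in the ET forcing
```

The reservoir integrates the forcing, so high-frequency ET errors are smoothed away while long-term volume errors accumulate, which matches the paper's finding.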


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex data set known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT data set consists of complex patterns of handwritten Arabic text lines. This paper contributes mainly in three aspects: (1) pre-processing, (2) a deep-learning-based approach, and (3) data augmentation. The pre-processing step includes pruning of extra white space and de-skewing of skewed text lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflections. Data augmentation combined with the deep learning approach yields a promising improvement in results, achieving 80.02% character recognition (CR) over the 75.08% baseline.
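The white-space pruning mentioned in the pre-processing step reduces, for a binarized text-line image, to cropping to the bounding box of the ink pixels. A minimal sketch on a toy 0/1 image (a real pipeline would also de-skew, which is not shown here):

```python
def prune_whitespace(image):
    """Crop a binary image (list of rows, 1 = ink pixel) to the
    bounding box of its ink, removing the extra white margins."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, v in enumerate(row) if v]
    r0, r1, c0, c1 = min(rows), max(rows), min(cols), max(cols)
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]

# Toy 4x5 "text line" with a white border around two ink pixels rows.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
cropped = prune_whitespace(page)
```

Cropping before recognition keeps the MDLSTM's scan focused on ink rather than empty margins.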


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yahya Albalawi ◽  
Jim Buckley ◽  
Nikola S. Nikolov

Abstract. This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processing techniques applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another for testing the generality of our models only. Our results point to the conclusion that only four out of the 26 pre-processing techniques improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier, with an F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to a BLSTM led to the most accurate model, with an F1 score of 75.2% and accuracy of 90.7%, compared to an F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with a lower accuracy of 70.89%. Our results also show that the performance of the best of the traditional classifiers we trained is comparable to the deep learning methods on the first data set, but significantly worse on the second data set.
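Evaluating pre-processing techniques individually, as above, is easiest when each one is a small composable transform. The sketch below is a hedged illustration of that pattern: the three transforms are generic stand-ins, not the paper's actual 26 techniques:

```python
import re

# Illustrative transforms -- stand-ins, not the paper's actual steps.
def strip_urls(text):
    return re.sub(r"https?://\S+", "", text)

def strip_mentions(text):
    return re.sub(r"@\w+", "", text)

def squeeze_spaces(text):
    return re.sub(r"\s+", " ", text).strip()

def preprocess(text, steps):
    """Apply an ordered subset of transforms, so any combination of
    pre-processings can be toggled on or off and evaluated against
    the downstream classifier."""
    for step in steps:
        text = step(text)
    return text

tweet = "flu season again   @user https://example.com"
clean = preprocess(tweet, [strip_urls, strip_mentions, squeeze_spaces])
# clean == "flu season again"
```

Structuring the pipeline as a list of functions makes the paper's "which 4 of 26 steps actually help?" question a matter of swapping subsets in and out.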


2007 ◽  
Vol 11 (1) ◽  
pp. 516-531 ◽  
Author(s):  
S. M. Crooks ◽  
P. S. Naden

Abstract. This paper describes the development of a semi-distributed conceptual rainfall–runoff model, originally formulated to simulate impacts of climate and land-use change on flood frequency. The model has component modules for soil moisture balance, drainage response and channel routing and is grid-based to allow direct incorporation of GIS- and Digital Terrain Model (DTM)-derived data sets into the initialisation of parameter values. Catchment runoff is derived from the aggregation of components of flow from the drainage module within each grid square and from total routed flow from all grid squares. Calibration is performed sequentially for the three modules using different objective functions for each stage. A key principle of the modelling system is the concept of nested calibration, which ensures that all flows simulated for points within a large catchment are spatially consistent. The modelling system is robust and has been applied successfully at different spatial scales to three large catchments in the UK, including comparison of observed and modelled flood frequency and flow duration curves, simulation of flows for uncalibrated catchments and identification of components of flow within a modelled hydrograph. The role of such a model in integrated catchment studies is outlined.
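The aggregation step described above, where catchment runoff is built from flow components over grid squares, is simple bookkeeping; a minimal sketch (toy flow values and component names, not the paper's actual modules):

```python
# Toy per-grid-square flow components (mm/day). The component names
# ("fast" drainage, "base" flow) and values are illustrative only.
grid = {
    (0, 0): {"fast": 1.2, "base": 0.4},
    (0, 1): {"fast": 0.8, "base": 0.5},
    (1, 0): {"fast": 2.1, "base": 0.3},
}

def catchment_runoff(grid):
    """Sum every flow component over all grid squares, keeping the
    per-component totals that component identification would inspect."""
    totals = {}
    for components in grid.values():
        for name, value in components.items():
            totals[name] = totals.get(name, 0.0) + value
    totals["total"] = sum(v for k, v in totals.items() if k != "total")
    return totals

flows = catchment_runoff(grid)
```

Keeping the per-component totals alongside the grand total is what allows the modelled hydrograph to be decomposed into its flow components, as the abstract describes.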


Cancers ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 12
Author(s):  
Jose M. Castillo T. ◽  
Muhammad Arif ◽  
Martijn P. A. Starmans ◽  
Wiro J. Niessen ◽  
Chris H. Bangma ◽  
...  

The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve significant-prostate-cancer (PCa) detection. Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To assess the generalizability of the performance of these methods, using various external data sets is crucial. While deep-learning and radiomics approaches have been compared on the same data set from one center, a comparison of the performance of both approaches on data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with that of a radiomics model for significant-PCa diagnosis across various patient cohorts. We included data from two consecutive patient cohorts from our own center (n = 371 patients) and two external sets, of which one was a publicly available patient cohort (n = 195 patients) and the other contained data from patients from two hospitals (n = 79 patients). Using multiparametric MRI (mpMRI), the radiologist tumor delineations and pathology reports were collected for all patients. During training, one of our patient cohorts (n = 271 patients) was used for both deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. The performance of the models was assessed in terms of the area under the receiver-operating-characteristic curve (AUC). Whereas the internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets, compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model. 
Our radiomics model, based on delineated regions, thus proved the more accurate tool for significant-PCa classification in the three unseen test sets when compared to a fully automated deep-learning model.
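The AUC used above has a simple rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. A minimal implementation (the labels and scores below are toy values, not the study's data):

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked
    correctly, counting ties as half -- equivalent to the area
    under the ROC curve."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels (1 = significant PCa) and model scores, illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
value = auc(labels, scores)
```

Because AUC depends only on ranking, it is well suited to comparing models whose raw output scales differ, such as a radiomics classifier and a deep network.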


2021 ◽  
Author(s):  
Kazuki Yokoo ◽  
Kei Ishida ◽  
Takeyoshi Nagasato ◽  
Ali Ercan

<p>In recent years, deep learning has been applied to various problems in the natural sciences, including hydrology, and the results demonstrate its high applicability. Several studies have performed rainfall-runoff modeling by means of a deep learning method, LSTM (Long Short-Term Memory). LSTM is a kind of RNN (Recurrent Neural Network) suitable for modeling time series data with long-term dependencies, and these studies showed the capability of LSTM for rainfall-runoff modeling. However, few studies have investigated the effects of the input variables on the estimation accuracy. Therefore, this study investigated the effects of the selection of input variables on the accuracy of an LSTM-based rainfall-runoff model. As the study watershed, this study selected a snow-dominated watershed, the Ishikari River basin, in the Hokkaido region of Japan. The flow discharge obtained at a gauging station near the outlet of the river served as the target data. As input data to the model, meteorological variables were obtained from an atmospheric reanalysis data set, ERA5, in addition to a gridded precipitation data set. The selected meteorological variables were air temperature, evaporation, longwave radiation, shortwave radiation, and mean sea level pressure. The rainfall-runoff model was then trained with several combinations of the input variables, and after training, the model accuracy was compared among the combinations. Using meteorological variables in addition to precipitation and air temperature as input improved the model accuracy. In some cases, however, the model accuracy worsened when more variables were used as input. The results indicate the importance of selecting adequate input variables for rainfall-runoff modeling by LSTM.</p>
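Comparing input-variable combinations as described above is, structurally, a loop over subsets of the candidate forcings. A sketch of that bookkeeping (the variable names come from the abstract; the scoring function is a dummy stand-in for actually training an LSTM and computing a skill score):

```python
from itertools import combinations

# Candidate inputs named in the abstract; precipitation is always kept.
candidates = ["air_temperature", "evaporation", "longwave_radiation",
              "shortwave_radiation", "mean_sea_level_pressure"]

def score(variables):
    """Stand-in for training an LSTM on these inputs and returning
    a skill score such as NSE; here a dummy that just counts inputs."""
    return len(variables)

results = {}
for r in range(len(candidates) + 1):
    for combo in combinations(candidates, r):
        inputs = ("precipitation",) + combo
        results[inputs] = score(inputs)

best = max(results, key=results.get)
n_combos = len(results)  # 2**5 = 32 combinations tried
```

The study's finding that more inputs sometimes hurt accuracy is exactly why such an exhaustive (or pruned) comparison is needed: unlike the dummy score here, real skill is not monotone in the number of inputs.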


Author(s):  
Li Yang ◽  
Qi Wang ◽  
Yu Rao

Abstract Film cooling is an important and widely used technology to protect the hot sections of gas turbines. The last decades witnessed a fast growth of research and publications in the field of film cooling. However, except for correlations for single-row film cooling and the Sellers correlation for cooling superposition, there have rarely been generalized models for film cooling under superposition conditions. Meanwhile, the numerous data obtained for complex hole distributions were never merged or integrated from different sources, and recent new data had no avenue by which to contribute to a compatible model. The technical barriers that obstructed the generalization of film cooling models were: (a) the lack of a generalizable model form; (b) the large number of input variables needed to describe film cooling. The present study aimed at establishing a generalizable model to describe multiple-row film cooling over a large parameter space, including hole locations, hole size, hole angles, blowing ratios, etc. The method allowed data measured over different streamwise lengths and different surface areas to be integrated in a single model, in the form of 1-D sequences. A Long Short-Term Memory model was designed to model the local behavior of film cooling. Careful training, testing and validation were conducted to regress the model. The presented results showed that the method was accurate within the CFD data set generated in this study. The presented method could serve as a base model that allows past and future film cooling research to contribute to a common data base. Meanwhile, the model could also be transferred from simulation data sets to experimental data sets using advanced machine learning algorithms in the future.
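The "1-D sequence" framing above amounts to serializing each cooling hole's parameters in streamwise order, so that geometries with different row counts and measurement lengths fit one recurrent model. A hedged sketch (the parameter names and values are illustrative, not the paper's exact encoding):

```python
# Illustrative hole records: (streamwise position x, spanwise y,
# diameter d, injection angle in degrees, blowing ratio M). Toy values.
holes = [
    (12.0, 0.5, 1.0, 35.0, 1.5),
    (4.0,  0.0, 1.0, 30.0, 0.8),
    (8.0,  1.0, 1.2, 35.0, 1.0),
]

def to_sequence(holes):
    """Sort holes in streamwise order and flatten each record into one
    step of a 1-D sequence an LSTM can consume, regardless of how many
    rows the original geometry had."""
    ordered = sorted(holes)  # tuples sort by x first
    return [list(h) for h in ordered]

seq = to_sequence(holes)
# seq[0] is the most upstream hole's feature vector
```

Because the sequence length is just the number of holes, data measured over different streamwise extents can share one model, which is the integration the abstract argues for.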

