Exploring the Predictive Power of News and Neural Machine Learning Models for Economic Forecasting

Mining Data for Financial Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-030-66981-2_11 ◽

2021 ◽

pp. 135-149

Author(s):

Luca Barbaglia ◽

Sergio Consoli ◽

Sebastiano Manzan

Keyword(s):

Machine Learning ◽

Time Series ◽

Data Science ◽

Financial Time Series ◽

Signal To Noise Ratio ◽

Word Embedding ◽

Learning Models ◽

Financial Variables ◽

Out Of Sample ◽

Density Forecasts

Forecasting economic and financial variables is a challenging task for several reasons, including the low signal-to-noise ratio, regime changes, and the effect of volatility. A recent trend is to extract information from news as an additional source for forecasting economic activity and financial variables. The goal is to evaluate whether news can improve forecasts from standard methods, which are usually not well specified and have poor out-of-sample performance. In an ongoing project, we aim to combine a richer information set that includes news with a state-of-the-art machine learning model. In particular, we leverage two recent advances in Data Science, Word Embedding and Deep Learning models, which have recently attracted extensive attention in many scientific fields. We believe that by combining the two methodologies, effective solutions can be built to improve the prediction accuracy for economic and financial time series. In this preliminary contribution, we provide an overview of the methodology under development and some initial empirical findings. The forecasting model is based on DeepAR, an auto-regressive probabilistic Recurrent Neural Network model, which is combined with GloVe Word Embeddings extracted from economic news. The target variable is the spread between the US 10-Year Treasury Constant Maturity and the 3-Month Treasury Constant Maturity (T10Y3M). The DeepAR model is trained on a large number of related GloVe Word Embedding time series and is employed to produce point and density forecasts.
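
As a rough illustration of two ideas in this abstract, the sketch below computes a term spread as the long yield minus the short yield and collapses a set of sampled forecast paths into a point forecast and a quantile band. It is not the authors' pipeline: the function names, the yield values, and the sample paths are all hypothetical, and the paths merely stand in for draws that a probabilistic model such as DeepAR would produce.

```python
from statistics import median, quantiles


def term_spread(yield_long, yield_short):
    """Spread in percentage points: long-maturity yield minus short-maturity yield."""
    return [round(lo - sh, 4) for lo, sh in zip(yield_long, yield_short)]


def summarize_samples(sample_paths):
    """Turn sampled forecast paths into a point forecast and an 80% band per horizon."""
    summary = []
    for step_values in zip(*sample_paths):  # one tuple of samples per horizon step
        deciles = quantiles(step_values, n=10, method="inclusive")
        summary.append({
            "point": median(step_values),  # median used as the point forecast
            "p10": deciles[0],             # lower edge of the 80% band
            "p90": deciles[-1],            # upper edge of the 80% band
        })
    return summary


# Hypothetical yields in percent (not real market data).
y10 = [1.60, 1.55, 1.62]
y3m = [0.05, 0.04, 0.06]
spread = term_spread(y10, y3m)

# Hypothetical sample paths: 5 draws, each covering 2 forecast horizons.
paths = [[1.50, 1.40], [1.55, 1.50], [1.60, 1.60], [1.65, 1.70], [1.70, 1.80]]
forecast = summarize_samples(paths)
```

The median is only one possible choice of point forecast; the mean of the samples would serve equally well in this sketch.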

A Labeling Method for Financial Time Series Prediction Based on Trends

Entropy ◽

10.3390/e22101162 ◽

2020 ◽

Vol 22 (10) ◽

pp. 1162

Author(s):

Dingming Wu ◽

Xiaolong Wang ◽

Jingyong Su ◽

Buzhou Tang ◽

Shaocong Wu

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Financial Time Series ◽

Time Series Prediction ◽

Series Data ◽

Learning Models ◽

Finance Industry ◽

Financial Time ◽

Labeling Method

Time series prediction has been widely applied in the finance industry, in applications such as stock market price and commodity price forecasting. Machine learning methods have been widely used in financial time series prediction in recent years. How to label financial time series data is a hot topic, because the labels determine the prediction accuracy of machine learning models and, subsequently, the final investment returns. Existing labeling methods for financial time series mainly label data by comparing the current data with those of a short time period in the future. However, financial time series data are typically non-linear with obvious short-term randomness. Therefore, these labeling methods fail to capture the continuous trend features of financial time series data, leading to a difference between their labeling results and real market trends. In this paper, a new labeling method called “continuous trend labeling” is proposed to address this problem. In the feature preprocessing stage, the paper proposes a new method that avoids the look-ahead bias of traditional data standardization or normalization processes. Then, a detailed logical explanation is given, continuous trend labeling is defined, and an automatic labeling algorithm is presented to extract the continuous trend features of financial time series data. Experiments on the Shanghai Composite Index, the Shenzhen Component Index, and some Chinese stocks showed that the proposed labeling method outperforms state-of-the-art labeling methods in terms of classification accuracy and some other classification evaluation metrics. The results also showed that deep learning models such as LSTM and GRU are more suitable for the prediction of financial time series data.
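
To make two notions from this abstract concrete, the sketch below shows a generic fixed-horizon labeling rule of the kind the abstract describes as existing practice, and an expanding-window standardization that uses only past and current observations. Neither function is the paper's continuous trend labeling algorithm or its preprocessing method, which the abstract does not specify; the names, the `horizon` and `min_history` parameters, and the prices are assumptions made for illustration.

```python
from statistics import mean, pstdev


def fixed_horizon_labels(prices, horizon=1):
    """Label 1 if the price `horizon` steps ahead is higher than the current price, else 0."""
    return [1 if prices[t + horizon] > prices[t] else 0
            for t in range(len(prices) - horizon)]


def expanding_zscore(values, min_history=2):
    """Standardize each point with statistics from observations up to that point only.

    Because no future value enters the mean or standard deviation,
    the scaled series is free of look-ahead bias.
    """
    scaled = []
    for t in range(len(values)):
        history = values[: t + 1]  # past and current observations only
        if len(history) < min_history:
            scaled.append(None)    # not enough history to standardize yet
            continue
        mu, sigma = mean(history), pstdev(history)
        scaled.append(0.0 if sigma == 0 else (values[t] - mu) / sigma)
    return scaled


# Hypothetical closing prices (not real market data).
prices = [10.0, 11.0, 10.5, 12.0, 11.0]
labels = fixed_horizon_labels(prices, horizon=1)
scaled = expanding_zscore(prices)
```

Standardizing with the full-sample mean and standard deviation would instead let later prices influence earlier features, which is the look-ahead problem the abstract refers to.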

Data science in economics: comprehensive review of advanced machine learning and deep learning methods

10.31232/osf.io/4pxq2 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Ferdinand Filip ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Science ◽

State Of The Art ◽

Science Methods ◽

Learning Models ◽

Diverse Range ◽

Hybrid Machine ◽

Economics Research

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains span a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.

Machine Learning in Futures Markets

Journal of Risk and Financial Management ◽

10.3390/jrfm14030119 ◽

2021 ◽

Vol 14 (3) ◽

pp. 119

Author(s):

Fabian Waldow ◽

Matthias Schnaubelt ◽

Christopher Krauss ◽

Thomas Günter Fischer

Keyword(s):

Machine Learning ◽

Futures Markets ◽

Learning Models ◽

Cross Sectional ◽

Data Set ◽

Statistical Arbitrage ◽

Out Of Sample ◽

Sample Testing ◽

Arbitrage Strategy ◽

Machine Learning Models

In this paper, we demonstrate how a well-established machine learning-based statistical arbitrage strategy can be successfully transferred from equity to futures markets. First, we preprocess futures time series comprised of front months to render them suitable for our returns-based trading framework and compile a data set comprised of 60 futures covering nearly 10 trading years. Next, we train several machine learning models to predict whether the h-day-ahead return of each future out- or underperforms the corresponding cross-sectional median return. Finally, we enter long/short positions for the top/flop-k futures for a duration of h days and assess the financial performance of the resulting portfolio in an out-of-sample testing period. Thereby, we find the machine learning models to yield statistically significant out-of-sample break-even transaction costs of 6.3 bp—a clear challenge to the semi-strong form of market efficiency. Finally, we discuss sources of profitability and the robustness of our findings.
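
The labeling and selection steps described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the futures tickers, the return values, and the model scores are hypothetical, and the sketch covers a single date only.

```python
from statistics import median


def median_labels(forward_returns):
    """Label each future 1 if its h-day-ahead return beats the cross-sectional median, else 0."""
    cross_sectional_median = median(forward_returns.values())
    return {name: int(ret > cross_sectional_median)
            for name, ret in forward_returns.items()}


def top_flop_k(scores, k):
    """Return (long candidates, short candidates): the k highest and k lowest scored names."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], ranked[-k:]


# Hypothetical h-day-ahead returns for four futures on one date.
forward_returns = {"A": 0.02, "B": -0.01, "C": 0.005, "D": 0.03}
labels = median_labels(forward_returns)

# Hypothetical model scores: predicted probability of outperforming the median.
scores = {"A": 0.70, "B": 0.20, "C": 0.45, "D": 0.90}
longs, shorts = top_flop_k(scores, k=1)
```

In a full backtest the labels would be built from training-period data only, and the positions would be held for h days before being re-formed.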

Combining Public Machine Learning Models by Using Word Embedding for Human Activity Recognition

2021 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops) ◽

10.1109/percomworkshops51409.2021.9431141 ◽

2021 ◽

Author(s):

Koichi Shimoda ◽

Akihito Taya ◽

Yoshito Tobe

Keyword(s):

Machine Learning ◽

Activity Recognition ◽

Human Activity ◽

Human Activity Recognition ◽

Word Embedding ◽

Learning Models ◽

Machine Learning Models

A comparison of time series and machine learning models for inflation forecasting: empirical evidence from the USA

Neural Computing and Applications ◽

10.1007/s00521-016-2766-x ◽

2016 ◽

Vol 30 (5) ◽

pp. 1519-1527 ◽

Cited By ~ 8

Author(s):

Volkan Ülke ◽

Afsin Sahin ◽

Abdulhamit Subasi

Keyword(s):

Machine Learning ◽

Time Series ◽

Empirical Evidence ◽

Learning Models ◽

Inflation Forecasting ◽

The USA ◽

Machine Learning Models

Response to Comment on “Predicting reaction performance in C–N cross-coupling using machine learning”

Science ◽

10.1126/science.aat8763 ◽

2018 ◽

Vol 362 (6416) ◽

pp. eaat8763 ◽

Cited By ~ 13

Author(s):

Jesús G. Estrada ◽

Derek T. Ahneman ◽

Robert P. Sheridan ◽

Spencer D. Dreher ◽

Abigail G. Doyle

Keyword(s):

Machine Learning ◽

Cross Coupling ◽

Feature Model ◽

Learning Models ◽

Chemical Feature ◽

Out Of Sample ◽

Reaction Performance ◽

Machine Learning Models

We demonstrate that the chemical-feature model described in our original paper is distinguishable from the nongeneralizable models introduced by Chuang and Keiser. Furthermore, the chemical-feature model significantly outperforms these models in out-of-sample predictions, justifying the use of chemical featurization from which machine learning models can extract meaningful patterns in the dataset, as originally described.

Discriminating Postural Control Behaviors from Posturography with Statistical Tests and Machine Learning Models: Does Time Series Length Matter?

Lecture Notes in Computer Science - Computational Science – ICCS 2018 ◽

10.1007/978-3-319-93713-7_28 ◽

2018 ◽

pp. 350-357

Author(s):

Luiz H. F. Giovanini ◽

Elisangela F. Manffra ◽

Julio C. Nievola

Keyword(s):

Machine Learning ◽

Time Series ◽

Postural Control ◽

Statistical Tests ◽

Learning Models ◽

Series Length ◽

Machine Learning Models

Intra-domain and cross-domain transfer learning for time series

10.5194/egusphere-egu21-12142 ◽

2021 ◽

Author(s):

Erik Otović ◽

Marko Njirjak ◽

Dario Jozinović ◽

Goran Mauša ◽

Alberto Michelini ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Transfer Learning ◽

Time Series Data ◽

The Other ◽

Series Data ◽

Sound Recognition ◽

Transfer Of Knowledge ◽

Learning Models ◽

Machine Learning Models

In this study, we compared the performance of machine learning models trained using transfer learning with that of models trained from scratch on time series data. Four machine learning models were used for the experiment. Two models were taken from the field of seismology, and the other two are general-purpose models for working with time series data. The accuracy of the selected models was systematically observed and analyzed when switching within the same domain of application (seismology), as well as between different domains of application (seismology, speech, medicine, finance). In seismology, we used two databases of local earthquakes (one in counts, and the other with the instrument response removed) and a database of global earthquakes for predicting earthquake magnitude; other datasets targeted classifying spoken words (speech), predicting stock prices (finance), and classifying muscle movement from EMG signals (medicine). In practice, it is very demanding and sometimes impossible to collect datasets of tagged data large enough to successfully train a machine learning model. Therefore, in our experiment, we use reduced data sets of 1,500 and 9,000 data instances to mimic such conditions. Using the same scaled-down datasets, we trained two sets of machine learning models: those that used transfer learning for training and those that were trained from scratch. We compared the performance of pairs of models in order to draw conclusions about the utility of transfer learning. To confirm the validity of the obtained results, we repeated the experiments several times and applied statistical tests to confirm the significance of the results. The study shows when, within the set experimental framework, the transfer of knowledge brought improvements in terms of model accuracy and model convergence rate.
Our results show that it is possible to achieve better performance and faster convergence by transferring knowledge from the domain of global earthquakes to the domain of local earthquakes, and sometimes also vice versa. However, improvements in seismology can sometimes also be achieved by transferring knowledge from the medical and audio domains. The results show that the transfer of knowledge between other domains brought even more significant improvements than those within the field of seismology. For example, models in the field of sound recognition achieved much better performance than classical models, and the domain of sound recognition is very compatible with knowledge from other domains. We came to similar conclusions for the domains of medicine and finance. Ultimately, the paper offers suggestions on when transfer learning is useful, and the explanations offered can provide a good starting point for knowledge transfer using time series data.

Data science in economics: comprehensive review of advanced machine learning and deep learning methods

10.21203/rs.3.rs-91905/v1 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Filip Ferdinand ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Prediction Accuracy ◽

Data Science ◽

State Of The Art ◽

Hybrid Models ◽

The Other ◽

Learning Models ◽

Comprehensive Review

This paper provides the state of the art of data science in economics. Advances in data science are investigated through a novel taxonomy of applications and methods. The advances are examined in three individual classes: deep learning models, ensemble models, and hybrid models. Application domains include the stock market, marketing, e-commerce, corporate banking, and cryptocurrency. The PRISMA method, a systematic literature review methodology, is used to ensure the quality of the survey. The findings reveal that the trend is toward hybrid models, as more than 51% of the reviewed articles applied a hybrid model. Moreover, based on the RMSE accuracy metric, hybrid models had higher prediction accuracy than other algorithms. The trends are nevertheless expected to move toward the advancement of deep learning models.
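
For readers unfamiliar with the RMSE (root mean squared error) metric mentioned in this abstract, a minimal sketch of its computation follows. The observed values and the two sets of predictions are hypothetical and are not taken from the reviewed articles; a lower RMSE indicates higher prediction accuracy.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(actual))


# Hypothetical observations and predictions from two illustrative models.
observed = [1.0, 2.0, 3.0]
model_a = [1.0, 2.0, 5.0]  # one large miss
model_b = [1.5, 2.5, 3.5]  # consistently off by 0.5

rmse_a = rmse(observed, model_a)
rmse_b = rmse(observed, model_b)
```

Because the errors are squared before averaging, the single large miss of the first model is penalized more heavily than the three small misses of the second.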

Analysis and Forecasting of Financial Time Series Using CNN and LSTM-Based Deep Learning Models

Lecture Notes in Networks and Systems - Advances in Distributed Computing and Machine Learning ◽

10.1007/978-981-16-4807-6_39 ◽

2022 ◽

pp. 405-423

Author(s):

Sidra Mehtab ◽

Jaydip Sen

Keyword(s):

Time Series ◽

Deep Learning ◽

Financial Time Series ◽

Learning Models ◽

Financial Time
