Bayesian quantile nonhomogeneous hidden Markov models

2020 ◽  
pp. 096228022094280 ◽  
Author(s):  
Hefei Liu ◽  
Xinyuan Song ◽  
Yanlin Tang ◽  
Baoxue Zhang

Hidden Markov models are useful in simultaneously analyzing a longitudinal observation process and its dynamic transition. Existing hidden Markov models focus on mean regression for the longitudinal response. However, the tails of the response distribution are as important as the center in many substantive studies. We propose a quantile hidden Markov model to provide a systematic method to examine the entire conditional distribution of the response given the hidden state and potential covariates. Instead of considering homogeneous hidden Markov models, which assume that the probabilities of between-state transitions are independent of subject- and time-specific characteristics, we allow the transition probabilities to depend on exogenous covariates, thereby yielding nonhomogeneous Markov chains and making the proposed model more flexible than its homogeneous counterpart. We develop a Bayesian approach coupled with efficient Markov chain Monte Carlo methods for statistical inference. Simulations are conducted to assess the empirical performance of the proposed method. The proposed methodology is applied to a cocaine use study to provide new insights into the prevention of cocaine use.
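A minimal sketch of covariate-dependent transition probabilities, the feature that makes a hidden Markov chain nonhomogeneous. The abstract does not state which link function the authors use; the multinomial-logit form below, the function name `transition_matrix`, and all coefficient values are assumptions chosen for illustration only.

```python
import math


def transition_matrix(x, beta):
    """Return a row-stochastic transition matrix for one covariate vector x.

    beta[i][j] is a hypothetical coefficient vector for moving from state i
    to state j (multinomial-logit link, assumed here, not taken from the paper).
    """
    matrix = []
    for row_coefs in beta:
        # Linear score for each destination state.
        scores = [sum(b * v for b, v in zip(coef, x)) for coef in row_coefs]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


# Two hidden states; covariates are (intercept, one exogenous covariate).
# Destination state 0 is the reference category (all-zero coefficients).
beta = [
    [[0.0, 0.0], [-1.0, 0.8]],
    [[0.0, 0.0], [0.5, -0.4]],
]
P_low = transition_matrix([1.0, 0.0], beta)   # covariate value 0
P_high = transition_matrix([1.0, 2.0], beta)  # covariate value 2
```

With these illustrative coefficients, the probability of leaving state 0 for state 1 rises as the covariate increases, whereas a homogeneous model would use one fixed matrix for every subject and time point.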

2018 ◽  
Vol 16 (05) ◽  
pp. 1850019 ◽  
Author(s):  
Ioannis A. Tamposis ◽  
Margarita C. Theodoropoulou ◽  
Konstantinos D. Tsirigos ◽  
Pantelis G. Bagos

Hidden Markov Models (HMMs) are probabilistic models widely used in computational molecular biology. However, the Markovian assumption regarding transition probabilities which dictates that the observed symbol depends only on the current state may not be sufficient for some biological problems. In order to overcome the limitations of the first order HMM, a number of extensions have been proposed in the literature to incorporate past information in HMMs conditioning either on the hidden states, or on the observations, or both. Here, we implement a simple extension of the standard HMM in which the current observed symbol (amino acid residue) depends both on the current state and on a series of observed previous symbols. The major advantage of the method is the simplicity in the implementation, which is achieved by properly transforming the observation sequence, using an extended alphabet. Thus, it can utilize all the available algorithms for the training and decoding of HMMs. We investigated the use of several encoding schemes and performed tests in a number of important biological problems previously studied by our team (prediction of transmembrane proteins and prediction of signal peptides). The evaluation shows that, when enough data are available, the performance increased by 1.8%–8.2% and the existing prediction methods may improve using this approach. The methods, for which the improvement was significant (PRED-TMBB2, PRED-TAT and HMM-TM), are available as web-servers freely accessible to academic users at www.compgen.org/tools/ .
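The extended-alphabet idea can be sketched as follows: each observed symbol is replaced by a composite symbol made of itself and the preceding symbols, so that an unmodified first-order HMM implementation sees the history through its emissions. The abstract does not describe the specific encoding schemes tested; the functions `extend_alphabet` and `encode`, the padding character, and the integer-ID mapping are hypothetical choices for illustration.

```python
def extend_alphabet(sequence, order=1, pad="-"):
    """Replace each symbol by itself prefixed with the `order` symbols before it.

    Positions near the start are padded with `pad` (an assumed convention).
    """
    padded = pad * order + sequence
    return [padded[i:i + order + 1] for i in range(len(sequence))]


def encode(sequence, order=1, pad="-"):
    """Map the extended symbols to integer IDs usable by a standard HMM library."""
    symbols = extend_alphabet(sequence, order, pad)
    index = {}
    # setdefault assigns the next free ID the first time a symbol is seen.
    ids = [index.setdefault(s, len(index)) for s in symbols]
    return ids, index


extended = extend_alphabet("MKVL", order=1)
ids, index = encode("ABAB", order=1)
```

The extended alphabet grows as the original alphabet size raised to the power `order + 1` (up to 400 composite symbols for 20 amino acids with `order=1`, plus padded ones), which is consistent with the abstract's remark that the gains appear when enough data are available.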


2003 ◽  
Vol 7 (5) ◽  
pp. 652-667 ◽  
Author(s):  
M. F. Lambert ◽  
J. P. Whiting ◽  
A. V. Metcalfe

Abstract. Hidden Markov models (HMMs) can allow for the varying wet and dry cycles in the climate without the need to simulate supplementary climate variables. The fitting of a parametric HMM relies upon assumptions for the state conditional distributions. It is shown that inappropriate assumptions about state conditional distributions can lead to biased estimates of state transition probabilities. An alternative non-parametric model with a hidden state structure that overcomes this problem is described. It is shown that a two-state non-parametric model produces accurate estimates of both transition probabilities and the state conditional distributions. The non-parametric model can be used directly or as a technique for identifying appropriate state conditional distributions to apply when fitting a parametric HMM. The non-parametric model is fitted to data from ten rainfall stations and four streamflow gauging stations at varying distances inland from the Pacific coast of Australia. Evidence for hydrological persistence, though not mathematical persistence, was identified in both rainfall and streamflow records, with the latter showing hidden states with longer sojourn times. Persistence appears to increase with distance from the coast. Keywords: Hidden Markov models, non-parametric, two-state model, climate states, persistence, probability distributions


2020 ◽  
Author(s):  
Brett T. McClintock

Abstract. Hidden Markov models (HMMs) that include individual-level random effects have recently been promoted for inferring animal movement behaviour from biotelemetry data. These “mixed HMMs” come at significant cost in terms of implementation and computation, and discrete random effects have been advocated as a practical alternative to more computationally-intensive continuous random effects. However, the performance of mixed HMMs has not yet been sufficiently explored to justify their widespread adoption, and there is currently little guidance for practitioners weighing the costs and benefits of mixed HMMs for a particular research objective. I performed an extensive simulation study comparing the performance of a suite of fixed and random effect models for individual heterogeneity in the hidden state process of a 2-state HMM. I focused on sampling scenarios more typical of telemetry studies, which often consist of relatively long time series (30–250 observations per animal) for relatively few individuals (5–100 animals). I generally found mixed HMMs did not improve state assignment relative to standard HMMs. Reliable estimation of random effects required larger sample sizes than are often feasible in telemetry studies. Continuous random effect models performed reasonably well with data generated under discrete random effects, but not vice versa. Random effects accounting for unexplained individual variation can improve estimation of state transition probabilities and measurable covariate effects, but discrete random effects can be a relatively poor (and potentially misleading) approximation for continuous variation. When weighing the costs and benefits of mixed HMMs, three important considerations are study objectives, sample size, and model complexity. HMM applications often focus on state assignment with little emphasis on heterogeneity in state transition probabilities, in which case random effects in the hidden state process simply may not be worth the additional effort. However, if explaining variation in state transition probabilities is a primary objective and sufficient explanatory covariates are not available, then random effects are worth pursuing as a more parsimonious alternative to individual fixed effects. To help put my findings in context and illustrate some potential challenges that practitioners may encounter when applying mixed HMMs, I revisit a previous analysis of long-finned pilot whale biotelemetry data.


2012 ◽  
Vol 2 (2) ◽  
pp. 180-189 ◽  
Author(s):  
S. Schliehe-Diecks ◽  
P. M. Kappeler ◽  
R. Langrock

Analysing behavioural sequences and quantifying the likelihood of occurrences of different behaviours is a difficult task as motivational states are not observable. Furthermore, it is ecologically highly relevant and yet more complicated to scale an appropriate model for one individual up to the population level. In this manuscript (mixed) hidden Markov models (HMMs) are used to model the feeding behaviour of 54 subadult grey mouse lemurs ( Microcebus murinus ), small nocturnal primates endemic to Madagascar that forage solitarily. Our primary aim is to introduce ecologists and other users to various HMM methods, many of which have been developed only recently, and which in this form have not previously been synthesized in the ecological literature. Our specific application of mixed HMMs aims at gaining a better understanding of mouse lemur behaviour, in particular concerning sex-specific differences. The model we consider incorporates random effects for accommodating heterogeneity across animals, i.e. accounts for different personalities of the animals. Additional subject- and time-specific covariates in the model describe the influence of sex, body mass and time of night.


2020 ◽  
Author(s):  
Zeliha Kilic ◽  
Ioannis Sgouralis ◽  
Steve Pressé

Abstract. The time spent by a single RNA polymerase (RNAP) at specific locations along the DNA, termed “residence time”, reports on the initiation, elongation and termination stages of transcription. At the single molecule level, this information can be obtained from dual ultra-stable optical trapping experiments, revealing a transcriptional elongation of RNAP interspersed with residence times of variable duration. Successfully discriminating between long and short residence times was used by previous approaches to learn about RNAP’s transcription elongation dynamics. Here, we propose an approach based on Bayesian sticky hidden Markov models that treats all residence times, for an E. coli RNAP, on an equal footing without a priori discriminating between long and short residence times. Our method has two additional advantages: it provides full distributions around key point statistics, and it directly treats the sequence-dependence of RNAP’s elongation rate. By applying our approach to experimental data, we find: no emergent separation between long and short residence times warranted by the data; force-dependent average residence time transcription elongation dynamics; limited effects of GreB on average backtracking durations and counts; and a slight drop in the average residence time as a function of applied force in RNaseA’s presence.

Statement of significance. Much of what we know about RNA polymerase, and its associated transcription factors, relies on successfully discriminating between what are believed to be short and long residence times in the data. This is achieved by applying pause-detection algorithms to trace analysis. Here we propose a new method relying on Bayesian sticky hidden Markov models to interpret time traces provided by dual optical trapping experiments associated with transcription elongation of RNAP. Our method does not discriminate between short and long residence times from the outset of the analysis. It allows for DNA site-dependent transition probabilities of RNAP to neighboring sites (thereby accounting for chemical variability in site-to-site transitions) and does not demand any time trace pre-processing (such as denoising).


2021 ◽  
Vol 16 (4) ◽  
pp. 334-351
Author(s):  
Iulian Cornel Lolea ◽  
Simona Stamule

Abstract. Obtaining higher-than-market returns is a difficult goal to achieve, especially in times of turbulence such as the COVID-19 crisis, which tested the resilience of many models and algorithms. Following Talla (2013) and Nguyen and Nguyen (2015), we used a hidden Markov model (HMM) methodology based on monthly data (DAX returns, the VSTOXX index, Germany’s industrial production and Germany’s annual inflation rate) to calibrate a trading strategy intended to obtain higher returns than a buy-and-hold strategy for the DAX index. The stock selection was based on 26 stocks from the DAX’s composition that had enough data for this study, aiming to select the 15 best performing. The training period was January 2000 to December 2015, and the out-of-sample period was January 2016 to August 2021, including the period of high turbulence generated by COVID-19. Fitting the best model revealed that the following regimes are the most suitable: two regimes for DAX returns, two regimes for VSTOXX, and three regimes each for the inflation rate and the industrial production, while the posterior transition probabilities were event-dependent on the training sample. Furthermore, portfolios built using the HMM strategy outperformed the DAX index for the out-of-sample period, both in terms of annualized returns and risk-adjusted returns. The results were in line with expectations and with the findings of other researchers such as Talla (2013), Nguyen and Nguyen (2015) and Varenius (2020). We highlight that a strategy calibrated using the HMM methodology works well even in periods of extreme volatility such as the one generated in 2020 by the COVID-19 pandemic.


2015 ◽  
Vol 135 (12) ◽  
pp. 1517-1523 ◽  
Author(s):  
Yicheng Jin ◽  
Takuto Sakuma ◽  
Shohei Kato ◽  
Tsutomu Kunitachi

Author(s):  
M. Vidyasagar

This book explores important aspects of Markov and hidden Markov processes and the applications of these ideas to various problems in computational biology. It starts from first principles, so that no previous knowledge of probability is necessary. However, the work is rigorous and mathematical, making it useful to engineers and mathematicians, even those not interested in biological applications. A range of exercises is provided, including drills to familiarize the reader with concepts and more advanced problems that require deep thinking about the theory. Biological applications are taken from post-genomic biology, especially genomics and proteomics. The topics examined include standard material such as the Perron–Frobenius theorem, transient and recurrent states, hitting probabilities and hitting times, maximum likelihood estimation, the Viterbi algorithm, and the Baum–Welch algorithm. The book contains discussions of extremely useful topics not usually seen at the basic level, such as ergodicity of Markov processes, Markov Chain Monte Carlo (MCMC), information theory, and large deviation theory for both i.i.d. and Markov processes. It also presents state-of-the-art realization theory for hidden Markov models. Among biological applications, it offers an in-depth look at the BLAST (Basic Local Alignment Search Tool) algorithm, including a comprehensive explanation of the underlying theory. Other applications such as profile hidden Markov models are also explored.

