Data Mining Technique Applied To DNA Sequencing

Author(s):  
R. Jamuna

CpG islands (CGIs) play a vital role in genome analysis as genomic markers. Identification of the CpG pair has contributed not only to the prediction of promoters but also to the understanding of the epigenetic causes of cancer. In the human genome [1], wherever the dinucleotide CG occurs, the C nucleotide (cytosine) undergoes chemical modification. There is a relatively high probability that this modification mutates the C into a T. For biologically important reasons, the modification process is suppressed in short stretches of the genome, such as ‘start’ regions. In these regions [2], CpG dinucleotides are more frequent than elsewhere. Such regions are called CpG islands. DNA methylation is an effective means by which gene expression is silenced. In normal cells, DNA methylation functions to prevent the expression of imprinted and inactive X chromosome genes. In cancerous cells, DNA methylation inactivates tumor-suppressor genes as well as DNA repair genes, and can disrupt cell-cycle regulation. Most current methods for identifying CGIs suffer from various limitations and involve a lot of human intervention. This paper gives an easy search technique based on data mining with Markov chains in genes. A Markov chain model has been applied to study the probability of occurrence of the C-G pair in a given gene sequence. Maximum likelihood estimators for the transition probabilities for each model and analogously for the model have been developed, and the log-odds ratio that is calculated indicates the presence or absence of CpG islands in the given gene, which brings out many facts relevant to cancer detection in the human genome.
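The log-odds test described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes the common two-model setup (one first-order Markov chain trained on island sequences, one on background sequences), the training strings are made-up toy data, and the names `transition_probs` and `log_odds` as well as the additive pseudocount are assumptions introduced here.

```python
import math

ALPHABET = "ACGT"

def transition_probs(sequences, pseudocount=1.0):
    """Maximum-likelihood transition estimates from dinucleotide counts.

    The pseudocount (an assumption, not from the abstract) keeps every
    probability strictly positive so the log-odds below is always defined.
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in ALPHABET}
    return probs

def log_odds(seq, plus, minus):
    """Sum of log2 ratios of island vs. background transition probabilities."""
    return sum(math.log2(plus[a][b] / minus[a][b]) for a, b in zip(seq, seq[1:]))

# Toy training data (hypothetical, for illustration only).
island_train = ["CGCGCGGCGC", "GCGCGCGCCG"]
background_train = ["ATTATAATTA", "TTAATCATGA"]

plus = transition_probs(island_train)
minus = transition_probs(background_train)

score_cg = log_odds("CGCGCG", plus, minus)  # CG-rich query
score_at = log_odds("ATATAT", plus, minus)  # AT-rich query
```

A positive score suggests the query looks more like the island model, a negative score more like the background model. Real use would need far larger training sets and a length-normalized score.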

Risks ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 37
Author(s):  
Manuel L. Esquível ◽  
Gracinda R. Guerreiro ◽  
Matilde C. Oliveira ◽  
Pedro Corte Real

We consider a non-homogeneous continuous-time Markov chain model for Long-Term Care with five states: the autonomous state, three dependent states of light, moderate and severe dependence levels, and the death state. For a general approach, we allow for non-null intensities for all the returns from higher dependence levels to all lesser dependencies in the multi-state model. Using data from the 2015 Portuguese National Network of Continuous Care database, as the main research contribution of this paper, we propose a method to calibrate transition intensities with the one-step transition probabilities estimated from data. This allows us to use non-homogeneous continuous-time Markov chains for modeling Long-Term Care. We numerically solve the Kolmogorov forward differential equations in order to obtain continuous-time transition probabilities. We assess the quality of the calibration using the Portuguese life expectancies. Based on reasonable monthly costs for each dependence state we compute, by Monte Carlo simulation, trajectories of the Markov chain process and derive relevant information for model validation and premium calculation.
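To make the numerical step concrete, the sketch below integrates the Kolmogorov forward equations dP(s,t)/dt = P(s,t) Q(t) for a non-homogeneous chain. It is an illustration under stated assumptions: it uses three states instead of the paper's five, the intensity functions in `generator` are invented rather than calibrated, and the classical fourth-order Runge-Kutta scheme is a choice made here, not necessarily the authors' solver.

```python
def generator(t):
    """Hypothetical intensity matrix Q(t); states 0=autonomous, 1=dependent, 2=dead."""
    a = 0.05 + 0.01 * t     # autonomous -> dependent
    b = 0.02                # dependent -> autonomous (return to a lesser state)
    m0 = 0.01 + 0.002 * t   # autonomous -> dead
    m1 = 0.05 + 0.005 * t   # dependent -> dead
    return [
        [-(a + m0), a, m0],
        [b, -(b + m1), m1],
        [0.0, 0.0, 0.0],    # death is absorbing
    ]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def axpy(A, B, c):
    """Return A + c * B elementwise."""
    return [[x + c * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def forward_solve(t0, t1, steps=1000):
    """Integrate dP/dt = P Q(t) from P(t0, t0) = I with classical RK4."""
    n = len(generator(t0))
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = matmul(P, generator(t))
        k2 = matmul(axpy(P, k1, h / 2), generator(t + h / 2))
        k3 = matmul(axpy(P, k2, h / 2), generator(t + h / 2))
        k4 = matmul(axpy(P, k3, h), generator(t + h))
        for i in range(n):
            for j in range(n):
                P[i][j] += h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
        t += h
    return P

# Transition probabilities over the interval [0, 10] (time unit is arbitrary here).
P = forward_solve(0.0, 10.0)
```

Because each row of Q(t) sums to zero, every row of the resulting matrix sums to one, which is a quick sanity check on any such solver.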


2020 ◽  
Vol 27 (2) ◽  
pp. 237-250
Author(s):  
Misuk Lee

Purpose: Over the past two decades, online booking has become a predominant distribution channel of tourism products. As online sales have become more important, understanding booking conversion behavior remains a critical topic in the tourism industry. The purpose of this study is to model airline search and booking activities of anonymous visitors. Design/methodology/approach: This study proposes a stochastic approach to explicitly model dynamics of airline customers’ search, revisit and booking activities. A Markov chain model simultaneously captures transition probabilities and the timing of search, revisit and booking decisions. The suggested model is demonstrated on clickstream data from an airline booking website. Findings: Empirical results show that low prices (captured as discount rates) lead to not only booking propensities but also overall stickiness to a website, increasing search and revisit probabilities. From the decision timing of search and revisit activities, the author observes customers’ learning effect on browsing time and heterogeneous intentions of website visits. Originality/value: This study presents both theoretical and managerial implications of online search and booking behavior for airline and tourism marketing. The dynamic Markov chain model provides a systematic framework to predict online search, revisit and booking conversion and the time of the online activities.


2015 ◽  
Vol 2 (1) ◽  
pp. 399-424
Author(s):  
M. S. Cavers ◽  
K. Vasudevan

Abstract. Directed graph representation of a Markov chain model to study global earthquake sequencing leads to a time-series of state-to-state transition probabilities that includes the spatio-temporally linked recurrent events in the record-breaking sense. A state refers to a configuration comprised of zones with either the occurrence or non-occurrence of an earthquake in each zone in a pre-determined time interval. Since the time-series is derived from non-linear and non-stationary earthquake sequencing, we use known analysis methods to glean new information. We apply decomposition procedures such as ensemble empirical mode decomposition (EEMD) to study the state-to-state fluctuations in each of the intrinsic mode functions. We subject the intrinsic mode functions, the orthogonal basis set derived from the time-series using the EEMD, to a detailed analysis to extract the information content of the time-series. Also, we investigate the influence of random noise on the data-driven state-to-state transition probabilities. We consider a second aspect of earthquake sequencing that is closely tied to its time-correlative behavior. Here, we extend the Fano factor and Allan factor analysis to the time-series of state-to-state transition frequencies of a Markov chain. Our results support not only the usefulness of the intrinsic mode functions in understanding the time-series but also the presence of power-law behaviour exemplified by the Fano factor and the Allan factor.


2019 ◽  
Vol 2019 ◽  
pp. 1-8 ◽  
Author(s):  
Clement Twumasi ◽  
Louis Asiedu ◽  
Ezekiel N. N. Nortey

Several mathematical and standard epidemiological models have been proposed for studying infectious disease dynamics. These models help to understand the spread of disease infections. However, most of these models are not able to estimate other relevant disease metrics such as the probability of first infection and recovery as well as the expected time to infection and recovery for both susceptible and infected individuals. That is, most of the standard epidemiological models used in estimating transition probabilities (TPs) are not able to generalize the transition estimates of disease outcomes at discrete time steps for future predictions. This paper seeks to address the aforementioned problems through a discrete-time Markov chain model. Secondary datasets from cohort studies were collected on HIV, tuberculosis (TB), and hepatitis B (HB) cases from a regional hospital in Ghana. The Markov chain model revealed that hepatitis B was more infectious over time than tuberculosis and HIV even though the probability of first infection of these diseases was relatively low within the study population. However, individuals infected with HIV had comparatively lower life expectancies than those infected with tuberculosis and hepatitis B. The discrete-time Markov chain technique is recommended as viable for modeling disease dynamics in Ghana.
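The "expected time to infection" quantity mentioned in this abstract corresponds to an expected first-passage (hitting) time in a discrete-time Markov chain. The sketch below is a generic illustration, not the authors' procedure: the three-state matrix is hypothetical (it is not estimated from the Ghana data), and solving the hitting-time equations by fixed-point iteration is a simplification chosen here to avoid external libraries.

```python
def expected_hitting_times(P, target, iterations=10_000):
    """Expected number of steps to first reach `target` from each state.

    Iterates h_i = 1 + sum_j P[i][j] * h_j for i != target, with h_target = 0.
    Converges when the target is reachable from every state.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(iterations):
        h = [0.0 if i == target else 1.0 + sum(P[i][j] * h[j] for j in range(n))
             for i in range(n)]
    return h

# Hypothetical one-step transition matrix.
# States: 0 = susceptible, 1 = infected, 2 = recovered.
P = [
    [0.9, 0.1, 0.0],
    [0.0, 0.7, 0.3],
    [0.2, 0.0, 0.8],
]

times_to_infection = expected_hitting_times(P, target=1)
```

With these toy numbers a susceptible individual needs 10 steps on average to become infected (1 / 0.1), and a recovered individual 15 steps, since they must first return to the susceptible state.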


1995 ◽  
Vol 73 (7-8) ◽  
pp. 461-472 ◽  
Author(s):  
Yih Lee ◽  
Larry V. McIntire ◽  
Kyriacos Zygourakis ◽  
Pauline A. Markenscoff

A Markov chain model was developed to characterize the two-dimensional locomotion of bovine pulmonary artery endothelial (BPAE) cells cultured with or without basic fibroblast growth factor (bFGF). This model provides a detailed description of the migration process by computing the following locomotory parameters: (i) the speed of cell locomotion; (ii) the expected duration of cell movement in any given direction; (iii) the probability distribution of turn angles that will decide the next direction of cell movement; (iv) the frequency of cell stops; and (v) the duration of cell stops. Eight directional states and a stationary state were used in our Markov analysis. From cell trajectory data, the transition probabilities among the various states and the waiting times for the directional and the stationary states were computed. The steady-state probabilities were also calculated to obtain the ultimate direction of cell motion and, thus, determine whether cell motion was random. Our results showed how the addition of bFGF enhanced the locomotory capability of BPAE cells. Cells cultured with 30 ng/mL bFGF had a lower probability of moving to the stationary state than those cultured without bFGF. In addition, cells cultured with 30 ng/mL bFGF remained in the stationary state for shorter periods of time than cells cultured without bFGF. In both these cases, however, the transition probabilities from the stationary state to any directional state were uniformly distributed and were not affected by the presence of bFGF. Key words: Markov chain model, stochastic process, cell locomotion, endothelial cells.
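The steady-state probabilities mentioned in this abstract are the stationary distribution of the transition matrix. As a rough illustration only, the sketch below uses a hypothetical three-state chain (one stationary state and two directional states, rather than the paper's nine states) and obtains the distribution by repeated multiplication; the matrix values and the function name are assumptions.

```python
def stationary_distribution(P, iterations=5000):
    """Approximate the stationary row vector pi with pi = pi P by power iteration.

    Assumes the chain is irreducible and aperiodic so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical transition matrix.
# States: 0 = stationary (cell stopped), 1 = moving left, 2 = moving right.
P = [
    [0.50, 0.25, 0.25],
    [0.20, 0.60, 0.20],
    [0.20, 0.20, 0.60],
]

pi = stationary_distribution(P)
```

For this toy matrix the two directional states end up with equal long-run probability (5/14 each, with 2/7 for the stationary state), which is the kind of symmetry one would read as motion with no preferred direction.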


1987 ◽  
Vol 24 (4) ◽  
pp. 1006-1011 ◽  
Author(s):  
G. Abdallaoui

Our concern is with a particular problem which arises in connection with a discrete-time Markov chain model for a graded manpower system. In this model, the members of an organisation are classified into distinct classes. As time passes, they move from one class to another, or to the outside world, in a random way governed by fixed transition probabilities. In this paper, the emphasis is placed on evaluating exact values of the probabilities of attaining and maintaining a structure.


2021 ◽  
Author(s):  
Ling Wang ◽  
Heather S. Laird-Fick ◽  
Carol J. Parker ◽  
David Solomon

Abstract. Medical students must meet curricular expectations and pass national licensing examinations to become physicians. The Michigan State University College of Human Medicine implemented progress testing in place of discipline-specific examinations as its primary assessment of knowledge in 2016. Ideally this innovative assessment strategy will characterize students’ growth in basic science knowledge over time and predict licensing examination performance. A Markov chain method was employed to: 1) identify latent states of acquiring scientific knowledge based on progress tests, 2) estimate students’ transition probabilities between states, and 3) predict United States Medical Licensing Examination Step 1 results based on the students’ predicted probabilities in each state. A total of 358 students were included in the analysis. Four latent states were identified based on students’ progress test results: Novice, Advanced Beginner I, Advanced Beginner II and Competent. At the end of the first year, students predicted to remain in the Novice state had lower mean Step 1 scores compared to those in the Competent state (209, SD = 14.8 versus 255, SD = 10.8 respectively) and had more first-attempt failures (11.5% versus 0%). In regression analysis, at the end of the first year, a 10% higher chance of staying in the Novice state predicted Step 1 scores 2.0 points lower (P < .01), while a 10% higher chance of being in the Competent state predicted Step 1 scores 4.3 points higher (P < .01). Similar results were found at the end of the second year of medical school. Using the Markov chain model to analyze longitudinal progress test performance offers a flexible and effective estimation method to identify students’ transitions across latent stages for acquiring scientific knowledge. The results can help identify students who are at risk for licensing examination failure and may benefit from targeted academic support.


2018 ◽  
Vol 35 (6) ◽  
pp. 1268-1288 ◽  
Author(s):  
Kong Fah Tee ◽  
Ejiroghene Ekpiwhre ◽  
Zhang Yi

Purpose: Automated condition surveys have been recently introduced for condition assessment of highway infrastructures worldwide. Accurate predictions of the current state, median life (ML) and future state of highway infrastructures are crucial for developing appropriate inspection and maintenance strategies for newly created as well as existing aging highway infrastructures. The paper aims to discuss these issues. Design/methodology/approach: This paper proposes Markov Chain based deterioration modelling using a linear transition probability (LTP) matrix method and a median life expectancy (MLE) algorithm. The proposed method is applied and evaluated using condition improvement between the two successive inspections from the Surface Condition Assessment of National Network of Roads survey of the UK Pavement Management System. Findings: The proposed LTP matrix model utilises better insight than the generic or decoupling linear approach used in estimating transition probabilities formulated in the past. The simulated LTP predicted conditions are portrayed in a deterioration profile and a pairwise correlation. The MLs are computed statistically with a cumulative distribution function plot. Originality/value: The paper concludes that MLE is ideal for projecting half asset life, and the LTP matrix approach presents a feasible approach for new maintenance regime when more certain deterioration data become available.

