Inchoative discovery of plausible (un)explored synergistic combinatorial biological hypotheses for static/time series Wnt measurements via ranking search engine: BioSearch Engine Design

2018
Author(s):
Shriprakash Sinha

BACKGROUND Often, in biology, we are faced with the problem of exploring relevant unknown biological hypotheses in the form of myriad combinations of factors that might be affecting a pathway under certain conditions. Currently, a major problem in biology is that the combinations to investigate are cherry-picked based on expert advice, literature surveys or guesses. The search for and wet lab testing of these combinations cost a lot in terms of time, investment and energy. In a recent development of the PORCN-WNT inhibitor ETC-1922159 for colorectal cancer, a list of down-regulated genes was recorded in a time buffer after the administration of the drug. The regulation of the genes was recorded individually, but for the majority it is still not known which higher (≥ 2) order combinations might be playing a greater role in the pathway. RESULTS The pipeline provides a prioritised list of important 2nd order combinations across a range of gene families involved in the Wnt pathway. More specifically, it reveals various FZD-WNT combinations in the pathway that have remained untested until now. In relation to ETC-1922159 affected combinations, the down-regulation of the LGR-RNF family after the drug treatment is evident in these rankings, as the LGR5-RNF43 combination takes bottom priorities. The LGR6-RNF43 combination takes a higher ranking than LGR5-RNF43, indicating that LGR6 might not be playing as great a role as LGR5 during the Wnt enhancing signals. These rankings confirm the efficacy of the proposed search engine design. CONCLUSION A pipeline has been developed to prioritise nth order combinations of factors that affect a signaling pathway. It takes into account the sensitivity indices computed from variance based (SOBOL) and density-kernel based (HSIC) methods to estimate the influence of each factor or combination of factors. These are then fed as feature vectors into a powerful support vector ranking algorithm that produces a ranked list of the interactions/combinations.
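The last step of the pipeline (sensitivity indices used as feature vectors for a support vector ranking algorithm) can be illustrated with a minimal sketch. The abstract does not say which ranking implementation was used, so this assumes the common pairwise formulation, in which a linear scorer is trained on differences of feature vectors with a hinge loss. Plain subgradient descent stands in for a real SVM solver. The combination names, feature values and relevance labels are hypothetical placeholders, not data from the study.

```python
from itertools import combinations


def pairwise_differences(features, relevance):
    """Build pairwise training examples from per-item feature vectors.

    For every pair of items with different relevance, emit the difference
    vector oriented as (more relevant) - (less relevant), so every example
    carries the implicit label +1.
    """
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        if relevance[i] == relevance[j]:
            continue  # ties carry no ordering information
        hi, lo = (i, j) if relevance[i] > relevance[j] else (j, i)
        pairs.append([a - b for a, b in zip(features[hi], features[lo])])
    return pairs


def train_linear_ranker(pairs, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularised pairwise hinge loss."""
    w = [0.0] * len(pairs[0])
    for _ in range(epochs):
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            for k in range(len(w)):
                # The hinge term contributes only while the margin is violated.
                grad = reg * w[k] - (d[k] if margin < 1.0 else 0.0)
                w[k] -= lr * grad
    return w


def rank(features, names, w):
    """Sort items by descending linear score w . x."""
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in features]
    order = sorted(range(len(names)), key=lambda i: scores[i], reverse=True)
    return [names[i] for i in order]


# Hypothetical toy input: two sensitivity indices per combination.
names = ["combo_1", "combo_2", "combo_3"]
features = [[0.9, 0.8], [0.5, 0.4], [0.1, 0.2]]
relevance = [2, 1, 0]  # assumed training labels, higher = more important

w = train_linear_ranker(pairwise_differences(features, relevance))
ranking = rank(features, names, w)
```

In this formulation, only the relative order of the training labels matters, which is why the examples are built from differences rather than from the labels themselves.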



2017
Author(s):  
Shriprakash Sinha

In a recent development of the PORCN-WNT inhibitor ETC-1922159 for colorectal cancer, a list of down-regulated genes was recorded in a time buffer after the administration of the drug. The regulation of the genes was recorded individually, but it is still not known which higher (≥ 2) order interactions might be playing a greater role after the administration of the drug. To reveal the priority of these higher order interactions among the down-regulated genes, that is, the likely unknown biological hypotheses, a search engine was developed based on the sensitivity indices of the higher order interactions, which were ranked using a support vector ranking algorithm and sorted. For example, the LGR family (Wnt signal enhancer) is known to neutralize RNF43 (Wnt inhibitor). After the administration of ETC-1922159, it was found that, using HSIC with the rbf, linear and laplace kernel variants, the rankings of the LGR5-RNF43 interaction were 61, 114 and 85, respectively. Rankings for LGR6-RNF43 were 1652, 939 and 805, respectively. The down-regulation of the LGR family after the drug treatment is evident in these rankings, as the LGR5-RNF43 interaction takes bottom priorities. The LGR6-RNF43 interaction takes a higher ranking than LGR5-RNF43, indicating that LGR6 might not be playing as great a role as LGR5 during the Wnt enhancing signals. These rankings confirm the efficacy of the proposed search engine design. Conclusion: Prioritized unknown biological hypotheses form the basis of further wet lab tests, with the aim of reducing (1) the cost of wet lab experiments, (2) the cost of combinatorial search and (3) the testing time for biologists who search for influential interactions in a vast combinatorial search forest. From an in silico perspective, a framework for a search engine now exists which can generate rankings for nth order interactions in the Wnt signaling pathway, thus revealing unknown/untested/unexplored biological hypotheses and aiding in understanding the mechanism of the pathway. The generic nature of the design means it can be applied to any signaling pathway or phenomenon under investigation where a prioritized order of interactions among the involved factors needs to be investigated for deeper understanding. Future improvements of the design are bound to facilitate medical specialists/oncologists in their respective investigations. Significance: The recently developed PORCN-WNT inhibitor enantiomer ETC-1922159, a cancer drug, shows promise in suppressing some types of colorectal cancer. However, the search and wet lab testing of unknown/unexplored/untested biological hypotheses, in the form of combinations of various intra/extracellular factors/genes/proteins affected by ETC-1922159, has yet to be undertaken. Currently, a major problem in biology is that combinations are cherry-picked based on expert advice, literature surveys or guesses in order to investigate a particular combinatorial hypothesis. A search engine has been developed to reveal and prioritise these unknown/untested/unexplored combinations affected by the inhibitor. These ranked unknown biological hypotheses help narrow down the investigation in a vast combinatorial search forest of ETC-1922159 affected synergistic factors.
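The kernel variants named above (rbf, linear, laplace) can be made concrete with a small sketch of the Hilbert-Schmidt Independence Criterion. It uses one common biased empirical estimator, trace(KHLH)/(n-1)^2, where K and L are Gram matrices and H is the centring matrix. The bandwidth `sigma`, the normalisation and the toy measurements are assumptions for illustration. The study's actual sensitivity indices may be normalised differently.

```python
import math


def rbf(a, b, sigma=1.0):
    """Gaussian (rbf) kernel on scalars; sigma is an assumed bandwidth."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def laplace(a, b, sigma=1.0):
    """Laplacian kernel on scalars."""
    return math.exp(-abs(a - b) / sigma)


def linear(a, b):
    """Linear kernel on scalars."""
    return a * b


def gram(xs, kernel):
    """Gram matrix K[i][j] = kernel(xs[i], xs[j])."""
    return [[kernel(a, b) for b in xs] for a in xs]


def center(K):
    """Double-centre a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)]
            for i in range(n)]


def hsic(xs, ys, kernel):
    """Biased empirical HSIC between two equally long samples."""
    n = len(xs)
    Kc = center(gram(xs, kernel))
    L = gram(ys, kernel)
    # trace(K H L H) equals the elementwise sum of (H K H) * L for symmetric L.
    total = sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


# Hypothetical toy measurements of a factor (xs) and an output (ys).
xs = [0.1, 0.5, 0.9, 1.3, 1.7]
ys = [0.2, 0.9, 1.1, 1.8, 2.1]
scores = {k.__name__: hsic(xs, ys, k) for k in (rbf, linear, laplace)}
```

Each kernel yields a different dependence score for the same pair of samples, which is one way the same interaction can receive different ranks under different kernels.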


2018
Author(s):
Shriprakash Sinha

Often, in biology, we are faced with the problem of exploring relevant unknown biological hypotheses in the form of myriad combinations of factors that might be affecting a pathway under certain conditions. Currently, a major persisting problem is that the combinations to investigate are cherry-picked based on expert advice, literature surveys or guesses. This entails investment in time, energy and expenses at various levels of research. To address these issues, a search engine design was recently developed, which showed promise by revealing existing confirmatory published wet lab results. Additionally, and importantly, the engine mined a range of unexplored/untested/unknown combinations of genetic factors in the Wnt pathway that were affected by the ETC-1922159 enantiomer, a PORCN-WNT inhibitor, after colorectal cancer cells were treated with the inhibitor drug. As an example, MYC is known to upregulate the PRC2 complex. The PRC2 complex contains EZH2, which suppresses tumor suppressor genes via epigenetic modifications. MYC and HOXB8 are up-regulated in colorectal cancer; however, their joint working mechanism is not known. The in silico engine assigned the 3rd order combination MYC-HOXB8-EZH2 a ranking position that points to its in vitro/in vivo down-regulation by ETC-1922159. If the protein interaction of MYC-HOXB8 can be established and studied with respect to EZH2, the in silico ranking would also be established at the in vitro/in vivo level. The potential of this engine is immense given the problem faced in biology and other fields. Here we elucidate the R code so that systems biologists can readily understand the mechanics of the search engine. Though the search engine is in the developmental stage, we share its detailed working principles, as it can be generalized to problems in other fields.




2016
Author(s):
Shriprakash Sinha

Abstract: It is widely known that sensitivity analysis plays a major role in computing the strength of the influence of the factors involved in any phenomenon under investigation. When applied to expression profiles of various intra/extracellular factors that form an integral part of a signaling pathway, variance and density based analysis yields a range of sensitivity indices for individual factors as well as for various combinations of factors. These combinations denote the higher order interactions among the involved factors that might be of interest in the working mechanism of the pathway. For example, there are 19 types of WNTs and 10 FZDs, and the number of their 2nd order combinations is large enough that it is not possible to know which one to test first (except for those already confirmed by wet lab validation). Moreover, the effect of these combinations varies over time as measurements of fold changes and deviations in fold changes vary. In this work, after estimating the individual effects of the factors in a higher order combination, the individual indices are considered as discriminative features. A combination is then a multivariate feature set of higher order (≥ 2). With an excessively large number of factors involved in the pathway, it is difficult to search for important combinations in a wide search space over different orders. Exploiting the analogy of prioritizing webpages using ranking algorithms, for a particular order, the full set of combinations of interactions can then be prioritized based on these features using a powerful ranking algorithm via support vectors. Recording the changing rankings of the combinations over time points and durations reveals how higher order interactions behave within the pathway and when and where an intervention might be necessary to influence the pathway. This could lead to the development of time based therapeutic interventions. Based on a small time series dataset, we were able to generate the rankings of the 2nd order combinations between WNTs and FZDs at different time snapshots and for different durations or time periods. Code has been made available on Google Drive at https://drive.google.com/folderview?id=0B7Kkv8wlhPU-V1Fkd1dMSTd5ak0&usp=sharing Significance: The search and wet lab testing of unknown biological hypotheses, in the form of combinations of various intra/extracellular factors that are involved in a signaling pathway, costs a lot in terms of time, investment and energy. To reduce this cost of search in a vast combinatorial space, a pipeline has been developed that prioritises this list of combinations so that biologists can narrow down their investigation. The pipeline uses kernel based sensitivity indices to capture the influence of the factors in a pathway and employs a powerful support vector ranking algorithm. The generic workflow and future improvements are bound to cut down the cost of many wet lab experiments and reveal unknown/untested biological hypotheses.
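The size of the search space mentioned above can be worked out directly. The sketch below counts the 2nd order WNT-FZD combinations and shows how the count grows with the order when all 29 factors are pooled. The labels are generic placeholders rather than actual gene symbols, and pooling both families for higher orders is an assumption made for illustration.

```python
from itertools import product
from math import comb

# Placeholder labels: 19 WNT ligands and 10 FZD receptors, as stated above.
wnts = [f"WNT_{i}" for i in range(1, 20)]
fzds = [f"FZD_{i}" for i in range(1, 11)]

# 2nd order combinations that pair one WNT with one FZD.
cross_pairs = list(product(wnts, fzds))
n_cross_pairs = len(cross_pairs)  # 19 * 10


def count_combinations(n_factors, order):
    """Number of unordered combinations of the given order."""
    return comb(n_factors, order)


# Pooling all 29 factors, the count grows quickly with the order.
n_factors = len(wnts) + len(fzds)
growth = {order: count_combinations(n_factors, order) for order in (2, 3, 4)}
```

With 19 WNTs and 10 FZDs there are 190 cross-family pairs, and 406 pairs or 3654 triples if any of the 29 factors may be combined, which is why an exhaustive wet lab search is impractical.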


2013
pp. 1306-1316
Author(s):
Wei Xiong
Min Song
Lori deVersterre

Word sense disambiguation (WSD) is the problem of selecting a sense for a word from a set of predefined possibilities. This is a significant problem in the biomedical domain, where a single word may be used to describe a gene, protein, or abbreviation. In this paper, we evaluate SENSATIONAL, a novel unsupervised WSD technique, in comparison with two popular learning algorithms: support vector machines (SVM) and K-means. Based on the accuracy measure, our results show that SENSATIONAL outperforms SVM and K-means by 2% and 17%, respectively. In addition, we develop a polysemy-based search engine and an experimental visualization application that utilizes SENSATIONAL’s clustering technique.


India is an agricultural country where most people depend on agriculture. When plants are infected by viruses, fungi or bacteria, the symptoms are mostly seen on the leaves and stems. As a result, plant production decreases, and so does the country's economy. The farmer has to identify the disease and decide which pesticide to use to control it. To find out which disease is affecting the plants, the farmer contacts an expert for a solution. The expert gives advice based on their knowledge and information, but seeking expert advice is sometimes time consuming, expensive and possibly inaccurate. To solve this problem, image processing techniques and machine learning algorithms such as neural networks, fuzzy logic and support vector machines give a better, more accurate and more affordable solution for controlling plant disease than manual methods.


2020
Author(s):  
Yu Wan ◽  
Zhuo Wang ◽  
Tzong-Yi Lee

Abstract. Background: Cancer is a major cause of death worldwide. To treat cancer, the use of anticancer peptides (ACPs) has received increasing attention in recent years. ACPs are a unique group of small molecules that can target and kill cancer cells quickly and directly. However, identifying ACPs by wet-lab experiments is time-consuming and labor-intensive. Therefore, it is important to develop computational tools for ACP prediction. Results: This study chose amino acid composition (AAC), N5C5, k-space and position-specific scoring matrix (PSSM) as features, and analyzed them with machine learning methods, including support vector machine (SVM) and sequential minimal optimization (SMO), to build a model (model 2) distinguishing ACPs from non-ACPs. Since a growing number of studies have shown that some antimicrobial peptides (AMPs) exhibit anticancer function, a model (model 1) to distinguish ACPs from AMPs was also developed. Compared with previous models, the models developed in this research show better performance (accuracy: 82.5% for model 1 and 93.5% for model 2). Conclusions: This work utilizes a new feature, PSSM, which contributes to better performance than other features. In addition to SVM, SMO is used in this research for optimizing SVM, and the SMO models show better performance than unoptimized models. Last but not least, this work provides two different functions: distinguishing ACPs from AMPs and distinguishing ACPs from all peptides. The second SMO-optimized model, which utilizes PSSM as a feature, performs better than all other existing tools.
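Of the features listed above, amino acid composition (AAC) is the simplest to illustrate. It is usually defined as the fraction of each of the 20 standard amino acids in a peptide. The sketch below follows that usual definition. Ignoring non-standard residue codes is an assumption, and the example peptide is made up rather than taken from the study.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue fractions.

    Characters outside the 20 standard codes are ignored (an assumption),
    so the fractions always sum to 1 over the recognised residues.
    """
    counts = Counter(sequence.upper())
    length = sum(counts[a] for a in AMINO_ACIDS)
    if length == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / length for a in AMINO_ACIDS]


# Made-up peptide used only to show the shape of the feature vector.
features = amino_acid_composition("AAKK")
```

Such a vector could then be passed to a classifier such as an SVM. The other features named in the abstract (N5C5, k-space, PSSM) need additional inputs and are not sketched here.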

