A genetic programming-based approach to identify potential inhibitors of serine protease of Mycobacterium tuberculosis
Aim: We applied genetic programming approaches to understand the impact of descriptors on inhibitory effects of serine protease inhibitors of Mycobacterium tuberculosis ( Mtb) and the discovery of new inhibitors as drug candidates. Materials & methods: The experimental dataset of serine protease inhibitors of Mtb descriptors was optimized by genetic algorithm (GA) along with the correlation-based feature selection (CFS) in order to develop predictive models using machine-learning algorithms. The best model was deployed on a library of 918 phytochemical compounds to screen potential serine protease inhibitors of Mtb. Quality and performance of the predictive models were evaluated using various standard statistical parameters. Result: The best random forest model with CFS-GA screened 126 anti-tubercular agents out of 918 phytochemical compounds. Also, genetic programing symbolic classification method is optimized descriptors and developed an equation for mathematical models. Conclusion: The use of CFS-GA with random forest-enhanced classification accuracy and predicted new serine protease inhibitors of Mtb, which can be used for better drug development against tuberculosis.