Machine Learning Detects Anti-DENV Signatures in Antibody Repertoire Sequences
Dengue infection is a global threat. As of today, there is no universal dengue fever treatment or vaccines unreservedly recommended by the World Health Organization. The investigation of the specific immune response to dengue virus would support antibody discovery as therapeutics for passive immunization and vaccine design. High-throughput sequencing enables the identification of the multitude of antibodies elicited in response to dengue infection at the sequence level. Artificial intelligence can mine the complex data generated and has the potential to uncover patterns in entire antibody repertoires and detect signatures distinctive of single virus-binding antibodies. However, these machine learning have not been harnessed to determine the immune response to dengue virus. In order to enable the application of machine learning, we have benchmarked existing methods for encoding biological and chemical knowledge as inputs and have investigated novel encoding techniques. We have applied different machine learning methods such as neural networks, random forests, and support vector machines and have investigated the parameter space to determine best performing algorithms for the detection and prediction of antibody patterns at the repertoire and antibody sequence levels in dengue-infected individuals. Our results show that immune response signatures to dengue are detectable both at the antibody repertoire and at the antibody sequence levels. By combining machine learning with phylogenies and network analysis, we generated novel sequences that present dengue-binding specific signatures. These results might aid further antibody discovery and support vaccine design.