AA_stat enables extensive characterization of artefact and post-translational modifications in bottom-up proteomics
ABSTRACT
We report on AA_stat, a bioinformatic approach for panoramic profiling of artificial and post-translational modifications and their localization sites in large-scale proteomics data. The presented version of AA_stat validates the results of an ultra-tolerant (open) search, interprets the observed mass shifts, and recommends optimized sets of fixed and variable modifications for subsequent regular searches. Localization of modification sites is based on relative amino acid frequencies and on analysis of tandem mass spectra. AA_stat determines groups of peptide identifications with mass shifts from the validated open search results and then scores each possible location of a mass shift by matching the MS/MS spectrum against the theoretical peptide isoforms. Here we demonstrate the utility of AA_stat for blind scanning of abundant and rare amino acid modifications of both artificial and biological origin, and we analyze the advantages and limitations of open search strategies. AA_stat is implemented as an open-source command-line tool available at https://github.com/SimpleNumber/aa_stat.
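The spectrum-based localization step described above can be illustrated with a minimal sketch. It is not AA_stat's actual scoring function: the names `fragment_mzs`, `count_matches` and `localize`, the use of singly charged b and y ions only, the plain matched-peak count as a score, and the 0.01 Da tolerance are all assumptions made for illustration.

```python
from bisect import bisect_left

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_mzs(sequence, shift, position):
    """Singly charged b and y ion m/z values for one isoform,
    i.e. the peptide with the mass shift placed on `position` (0-based)."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    masses[position] += shift
    ions = []
    prefix = 0.0
    for i in range(len(masses) - 1):          # b1 .. b(n-1)
        prefix += masses[i]
        ions.append(prefix + PROTON)
    suffix = 0.0
    for i in range(len(masses) - 1, 0, -1):   # y1 .. y(n-1)
        suffix += masses[i]
        ions.append(suffix + WATER + PROTON)
    return ions


def count_matches(theoretical, observed_sorted, tol):
    """Number of theoretical ions with an observed peak within +/- tol."""
    matched = 0
    for mz in theoretical:
        j = bisect_left(observed_sorted, mz - tol)
        if j < len(observed_sorted) and observed_sorted[j] <= mz + tol:
            matched += 1
    return matched


def localize(sequence, shift, observed, tol=0.01):
    """Score every candidate site; returns {position: matched ion count}."""
    observed_sorted = sorted(observed)
    return {
        pos: count_matches(fragment_mzs(sequence, shift, pos), observed_sorted, tol)
        for pos in range(len(sequence))
    }


if __name__ == "__main__":
    # Synthetic spectrum: oxidation (+15.9949 Da) on M at index 4 of PEPTMIDE.
    spectrum = fragment_mzs("PEPTMIDE", 15.9949, 4)
    scores = localize("PEPTMIDE", 15.9949, spectrum)
    print(max(scores, key=scores.get), scores)
```

In this toy case the isoform carrying the shift on methionine explains every fragment, while neighboring sites lose the ions that span the true site. A real implementation would also need to handle peak intensities, higher charge states and ties between adjacent candidate sites.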