Identifying functionally or structurally important non-conserved residue sites in protein multiple sequence alignments (MSAs) is a key challenge for understanding the structural basis and molecular mechanisms of protein function. Although there is a rich literature on compensatory mutations and on sequence conservation analysis for detecting such residues, previous methods often rely on classical information-theoretic measures. These measures usually do not take into account the dis/similarities of amino acids, which are likely to be crucial for those residues. In this study, we present a new method, the Quantum Coupled Mutation Finder (QCMF), which incorporates significant dis/similar amino acid pair signals into the prediction of functionally or structurally important sites.

We tested the QCMF on the essential sites of two human proteins, epidermal growth factor receptor (EGFR) and glucokinase (GCK). The QCMF includes two metrics based on the quantum Jensen-Shannon divergence, which measure sequence conservation and compensatory mutations, respectively. For both proteins, the QCMF identified essential sites from the MSAs with a significantly higher Matthews correlation coefficient (MCC) than previous methods.
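
As an illustration of the divergence underlying both metrics, the quantum Jensen-Shannon divergence between two density matrices ρ and σ is S((ρ + σ)/2) − (S(ρ) + S(σ))/2, where S is the von Neumann entropy. The following is a minimal sketch only, not the QCMF implementation: it assumes toy 2×2 real symmetric density matrices so that the eigenvalues have a closed form, and the function names (`eigenvalues_2x2`, `von_neumann_entropy`, `qjsd`) are hypothetical. How the QCMF builds density matrices from MSA columns is not described in this section and is not reproduced here.

```python
import math


def eigenvalues_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return (mean - radius, mean + radius)


def von_neumann_entropy(rho):
    """S(rho) = -sum(lam * log2(lam)) over the nonzero eigenvalues of rho."""
    return -sum(
        lam * math.log2(lam) for lam in eigenvalues_2x2(rho) if lam > 1e-12
    )


def qjsd(rho, sigma):
    """Quantum Jensen-Shannon divergence between two 2x2 density matrices."""
    mix = [[(rho[i][j] + sigma[i][j]) / 2.0 for j in range(2)] for i in range(2)]
    return von_neumann_entropy(mix) - 0.5 * (
        von_neumann_entropy(rho) + von_neumann_entropy(sigma)
    )


# Toy states (assumptions for illustration, unrelated to any real MSA column):
zero = [[1.0, 0.0], [0.0, 0.0]]  # pure state |0><0|
one = [[0.0, 0.0], [0.0, 1.0]]   # pure state |1><1|, orthogonal to |0>
plus = [[0.5, 0.5], [0.5, 0.5]]  # pure state |+><+|, overlaps with |0>

print(qjsd(zero, one))   # orthogonal states: maximal divergence of 1 bit
print(qjsd(zero, plus))  # overlapping states: strictly between 0 and 1
```

In this toy setting, the off-diagonal entries of `plus` are what lower its divergence from `zero` below the orthogonal case; a classical divergence computed from the diagonals alone would not see them.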

The QCMF uses the notion of entanglement, a major resource in quantum information, to model significant dis/similar amino acid pair signals in the detection of functionally or structurally important sites. Our results suggest that the QCMF significantly outperforms previous methods that were developed using classical information theory. Because the QCMF is computationally intensive, we implemented its algorithm with the Compute Unified Device Architecture (CUDA) to keep the computation time feasible.
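
For reference, the MCC used in the comparison with previous methods can be computed from the counts of a binary confusion matrix (predicted versus known essential sites). The sketch below is a generic implementation of the standard formula, not code from the QCMF; the function name `mcc`, the zero-denominator convention, and the example counts are assumptions chosen purely for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    adopted here as an assumption).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, not results from the study:
# 8 essential sites recovered, 4 false alarms, 8 missed, 80 correctly rejected.
score = mcc(tp=8, tn=80, fp=4, fn=8)
print(f"MCC = {score:.3f}")
```

The MCC ranges from −1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes it suitable when essential sites are a small minority of all alignment columns.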