WebPSSM

WebPSSM Description:

WebPSSM is a bioinformatic tool for predicting HIV-1 coreceptor usage from amino acid or nucleotide sequences of the third variable loop (V3) of the envelope gene. A nucleotide sequence is first translated to an amino acid sequence. If it contains ambiguous bases, it is translated to all possible amino acid sequences. The original description of the method can be found here.
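
The translation of ambiguous nucleotide input described above can be illustrated with a short Python sketch. It is not WebPSSM's own code: the function name translate_all is hypothetical, and the sketch assumes that the input is in frame from its first base, that ambiguity follows the standard IUPAC nucleotide codes, and that any trailing partial codon is dropped.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_all(nt_seq):
    """Return every distinct amino acid sequence encoded by nt_seq.

    Assumption: the sequence is in frame from its first base; a trailing
    partial codon is ignored.
    """
    seq = nt_seq.upper().replace("-", "")
    per_codon = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        # Expand each ambiguous codon into all concrete codons, then
        # collapse synonymous ones to a set of amino acids.
        residues = {
            CODON_TABLE["".join(concrete)]
            for concrete in product(*(IUPAC[base] for base in codon))
        }
        per_codon.append(sorted(residues))
    # Combine the per-codon alternatives into full-length sequences.
    return sorted({"".join(combo) for combo in product(*per_codon)})


print(translate_all("TGTRCAAGA"))  # R = A or G, so the second codon is ACA or GCA
```

Collapsing each codon to its set of amino acids before combining keeps synonymous ambiguity (for example ACN, which always encodes T) from multiplying the number of output sequences.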

No user sequences are stored. If you would like to "donate" sequences that have known associated phenotypes, for the improvement of this method and the development of multiple subtype matrices, please contact James Mullins.

Use:

You may enter up to 10,000 V3 sequences in FASTA format. Sequence names must be unique. Any character in a sequence name other than an alphanumeric character or an underscore ('_') is changed to '_'. Scores and prediction data are returned in the same window after submission. Results can also be obtained in tab-delimited format for use in Excel or other programs.
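
To check input before submission, the naming rules above can be approximated locally. The following Python sketch is an assumption-laden illustration, not the server's code: sanitize_name and parse_fasta are hypothetical names, "alphanumeric" is taken to mean ASCII letters and digits, and uniqueness is checked after the names have been rewritten.

```python
import re


def sanitize_name(name):
    """Replace every character other than A-Z, a-z, 0-9 or '_' with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def parse_fasta(text, max_records=10000):
    """Parse FASTA text into {sanitized_name: sequence}.

    Raises ValueError on duplicate names (after sanitizing) or when the
    number of records exceeds max_records.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = sanitize_name(line[1:].strip())
            if name in records:
                raise ValueError(f"duplicate sequence name: {name}")
            if len(records) >= max_records:
                raise ValueError(f"more than {max_records} sequences")
            records[name] = ""
        elif name is not None:
            # Sequence lines may be wrapped; join them per record.
            records[name] += line
    return records


print(sanitize_name("pt 01|V3.a"))  # pt_01_V3_a
```

Note that two names that differ only in punctuation, such as "a.b" and "a-b", become identical after rewriting, so checking uniqueness on the rewritten names is the stricter choice.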

Alignment feature:

The typical V3 loop in HIV-1 subtype B is 35 amino acids long, but length differences are frequent. The matrix in the current implementation is designed to score a 35 AA fragment. To obtain a correct score for length variants, homologous residues must be in the correct positions. Before sequences are scored, they are aligned against an HIV-1 subtype B consensus sequence using the Needleman-Wunsch algorithm and an amino acid distance matrix. Gaps and insertions relative to the consensus are ignored in the scoring (in general, this does not substantially affect the predictions; see Jensen et al., 2003). If multiple best alignments are found, all of them are scored, and the actual scored sequences are displayed in the output.
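
The alignment step can be sketched in Python as a plain Needleman-Wunsch global alignment followed by projection onto the 35 consensus columns. Several things here are assumptions rather than WebPSSM's actual settings: the consensus string is a commonly cited subtype B V3 consensus, the match, mismatch and gap values are placeholders for the amino acid distance matrix, the function names are hypothetical, and only one optimal alignment is returned rather than all of them.

```python
# Assumption: a commonly cited subtype B V3 consensus (35 residues).
CONSENSUS_B = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"


def needleman_wunsch(query, ref, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty.

    Returns (aligned_query, aligned_ref, score) for one optimal alignment.
    The scoring values are placeholders, not the tool's distance matrix.
    """
    n, m = len(query), len(ref)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if query[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one alignment.
    out_q, out_r = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and query[i - 1] == ref[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_q.append(query[i - 1]); out_r.append(ref[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_q.append(query[i - 1]); out_r.append("-"); i -= 1
        else:
            out_q.append("-"); out_r.append(ref[j - 1]); j -= 1
    return "".join(reversed(out_q)), "".join(reversed(out_r)), score[n][m]


def project_to_reference(aligned_query, aligned_ref):
    """Keep only columns where the reference has a residue.

    Insertions relative to the reference are dropped; deletions stay as '-'.
    """
    return "".join(q for q, r in zip(aligned_query, aligned_ref) if r != "-")


query = CONSENSUS_B[:15] + "W" + CONSENSUS_B[15:]  # one inserted residue
aligned_q, aligned_r, _ = needleman_wunsch(query, CONSENSUS_B)
print(project_to_reference(aligned_q, aligned_r))
```

After projection every query is exactly 35 columns wide, which is what allows a fixed-length matrix to be applied to length variants.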

Sequences that align poorly to the V3 consensus are flagged in the output. These sequences may not be actual V3 loops, or may be from a highly divergent subtype.

Matrices:

Two matrices are available for determining scores in subtype B: X4R5, calculated using sequences of known coreceptor phenotype, as assayed on indicator cells expressing exogenous CD4 and either CCR5 or CXCR4; and SINSI, calculated using sequences of known syncytium-inducing phenotype on the MT2 cell line. We have found that these matrices can give different phenotype predictions depending on the sequence (see Jensen et al., 2003), and that correlations with disease progression (unpublished) and prognosis on HAART (Brumme et al., 2004) are better using SINSI scores. For subtype C, only a SINSI matrix is available.
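
How a position-specific matrix turns an aligned 35-residue sequence into a single score can be shown with a toy Python sketch. The matrix values below are invented for illustration and are not taken from the X4R5 or SINSI matrices; the function name pssm_score, the treatment of unlisted residues as scoring zero, and the example sequences are likewise assumptions.

```python
def pssm_score(aligned_seq, matrix, default=0.0):
    """Sum per-position scores for an aligned sequence.

    matrix is a list with one {residue: score} dict per position.
    Gap characters ('-') are skipped, and residues missing from a column
    contribute `default` (an assumption of this sketch).
    """
    if len(aligned_seq) != len(matrix):
        raise ValueError("sequence and matrix must have the same length")
    return sum(column.get(residue, default)
               for residue, column in zip(aligned_seq, matrix)
               if residue != "-")


# Toy 35-column matrix: only two columns carry (hypothetical) scores.
toy_matrix = [{} for _ in range(35)]
toy_matrix[10] = {"R": 1.5, "K": 1.5, "S": -0.5}
toy_matrix[24] = {"R": 1.0, "K": 1.0, "E": -0.5}

seq_a = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
seq_b = "CTRPNNNTRKRIHIGPGRAFYTTGRIIGDIRQAHC"  # R in both scored columns

print(pssm_score(seq_a, toy_matrix))  # -1.0
print(pssm_score(seq_b, toy_matrix))  # 2.5
```

Running the same aligned sequence through two different matrices in this way makes it easy to see how the X4R5 and SINSI predictions could disagree for a given sequence.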

Citations:

Jensen, M. A., F.-S. Li, A. B. van 't Wout, D. C. Nickle, D. Shriner, H.-X. He, S. McLaughlin, R. Shankarappa, J. B. Margolick, and J. I. Mullins. 2003. Improved coreceptor usage prediction and genotypic monitoring of R5-to-X4 transition by motif analysis of HIV-1 env V3 loop sequences. Journal of Virology 77: 13376-13388.
Brumme, Z. L., W. W. Dong, B. Yip, B. Wynhoven, N. G. Hoffman, R. Swanstrom, M. A. Jensen, J. I. Mullins, R. S. Hogg, J. S. Montaner, and P. R. Harrigan. 2004. Clinical and immunological impact of HIV envelope V3 sequence variation after starting initial triple antiretroviral therapy. AIDS 18: F1-F9.
Jensen, M. A., M. Coetzer, A. B. van 't Wout, L. Morris, and J. I. Mullins. 2006. A reliable phenotype predictor for human immunodeficiency virus type 1 subtype C based on Envelope V3 sequences. Journal of Virology 80: 4698-4704.

Caveats:

The current implementation uses matrices derived from either subtype B or subtype C sequences only, and these matrices have been tested only on phenotyped subtype B or C sequences, respectively. Predictions for other subtypes should be treated with skepticism.

Contact:

For questions, bug reports, and suggestions, please send email to mullspt+cfar@uw.edu. Include a few sentences briefly describing the nature of your question, along with your contact information.