SUPPLEMENTAL DATA:

Distributions of intra-subject mean amino acid diversity

Figure 2. Distributions of intra-subject mean amino acid diversity. All intra-subject pairwise amino acid distances were calculated using PAUP*. Mean amino acid diversity was calculated from the distance matrix. The line shows the cumulative percentage of subjects.
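
As an illustration of the averaging step, the following is a minimal sketch that assumes the pairwise distances for one subject have already been exported as a symmetric square matrix. The function name and the example values are hypothetical and are not taken from the study data; the sketch does not reproduce the PAUP* distance calculation itself.

```python
from itertools import combinations


def mean_pairwise_diversity(matrix):
    """Return the mean of the off-diagonal (i < j) entries of a symmetric distance matrix."""
    n = len(matrix)
    if n < 2:
        raise ValueError("at least two sequences are needed to form a pair")
    pairs = list(combinations(range(n), 2))
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)


# Hypothetical distances among three sequences from a single subject.
example = [
    [0.00, 0.02, 0.04],
    [0.02, 0.00, 0.06],
    [0.04, 0.06, 0.00],
]

print(round(mean_pairwise_diversity(example), 4))
```

Each unordered pair is counted once, so the diagonal and the mirrored lower triangle do not affect the mean.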

Unique ENV gp120 viral variants in 37 subjects

Figure 3. Unique ENV gp120 viral variants in 37 subjects. Variants were assessed by analyzing the number of phylogenetically informative sites ("private" mutations removed) for each subject. Each colored box represents an individual subject. Colors represent unique variants for each subject; putative recombinants are labeled (rec). Nucleotide positions at phylogenetically informative sites are based on the individual subject alignments.
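
The site selection can be sketched as follows, assuming the usual parsimony definition of an informative site: at least two character states are each present in at least two sequences, so that changes seen in only one sequence ("private" mutations) drop out. The function name, the decision to ignore gap and ambiguity characters, and the toy alignment are assumptions made for illustration and may differ from the procedure actually used.

```python
from collections import Counter

# Assumption: gaps and ambiguity codes are not counted as character states.
IGNORED = {"-", "N", "?"}


def informative_sites(alignment):
    """Return 1-based positions where two or more states each occur in two or more sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences in the alignment must have the same length")
    sites = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in alignment if seq[pos] not in IGNORED)
        shared_states = [state for state, count in counts.items() if count >= 2]
        if len(shared_states) >= 2:
            sites.append(pos + 1)
    return sites


# Hypothetical four-sequence alignment from one subject.
alignment = ["ACGTA", "ACGTA", "ATGCA", "ATGTG"]
print(informative_sites(alignment))
```

In this toy alignment only position 2 qualifies; the single-sequence changes at positions 4 and 5 are private and are excluded.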

Envelope V3 loop amino acid variation

Figure 4. Envelope V3 loop amino acid variation. Top panel: The amino acids found at each position of the V3 loop, from 522 sequences with open reading frames from 38 subjects, are shown in descending order of prevalence. The most common amino acid at each site is shown at the top in bold. Gaps in the V3 loop are designated by (-). Bottom panel: Sequence logos (generated at http://weblogo.berkeley.edu/logo.cgi) for the same dataset. The characters at each logo position and their sizes depict the relative proportions of the designated amino acids at each site.
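
The top-panel ordering can be illustrated with a short sketch that tallies the residues in each alignment column and sorts them by descending count. The function name, the alphabetical tie-breaking rule, and the four short sequences are hypothetical; they are not part of the 522-sequence dataset.

```python
from collections import Counter


def ranked_residues(sequences):
    """For each aligned position, list (residue, count) pairs from most to least common."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must be aligned to the same length")
    columns = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        # Sort by descending count; ties are broken alphabetically (an assumption).
        columns.append(sorted(counts.items(), key=lambda item: (-item[1], item[0])))
    return columns


# Hypothetical aligned fragments; "-" marks a gap, as in the figure.
fragments = ["CTRP", "CTRP", "CIRP", "CT-P"]
for position, ranking in enumerate(ranked_residues(fragments), start=1):
    print(position, ranking)
```

The first entry of each ranking corresponds to the residue that would be shown in bold at the top of the column.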

Figure 5. Histogram of V1V2 loop length in amino acids versus potential N-linked glycosylation sites (PNLGS). Individual subjects (n=37) are numbered; black bars represent V1V2 loop lengths and gray bars represent PNLGS from individual sequences. Seven subjects had V1V2 length variants and 33 subjects had PNLGS variants.
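
A minimal sketch of the two quantities plotted per sequence is given below. It assumes the commonly used sequon definition of a PNLGS, N-X-S/T where X is any residue except proline, and it counts overlapping sequons separately. The function name and the example fragment are hypothetical, and the counting rules used for the figure may differ in detail.

```python
import re

# N followed by any residue except P, then S or T. The lookahead lets overlapping sequons match.
SEQUON = re.compile(r"N(?=[^P][ST])")


def loop_length_and_pnlgs(aligned_loop):
    """Return (ungapped length, number of N-X-[S/T] sequons) for one aligned loop sequence."""
    residues = aligned_loop.replace("-", "").upper()
    return len(residues), len(SEQUON.findall(residues))


# Hypothetical aligned fragment, not a real V1V2 sequence.
print(loop_length_and_pnlgs("CNNTS-NPSC"))
```

Gap characters are removed before counting so that alignment padding neither adds to the loop length nor splits a sequon.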