SI training set in FASTA format

u[nn] in the name line indicates a sample from the [nn]th infected individual.
GenBank accession numbers are given where available.
>95ZW748	ZW	C	SI	u1	BATRA (2000)	
CTRPNNNVRKHIRIGIGKVFYA-NDIIGDIRQARC						
>95ZW2036	ZW	C	SI	u2	BATRA (2000)	
CIRPGNNTRKSIRIGPGQVFYAATNIIGNIRQAHC						
>95ZW2288	ZW	C	SI	u3	BATRA (2000)	
CTRPNNNTRKSVRIGPGQVFYA-NDIIGDIRQAHC						
>ET030	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158851
CTRPNNNTRKSIRIGPGQAFYATGTIIGDIRQAHC						
>ET074	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158871
CTRPNNNTRKSIGIGPGQAFYARGDIIGDIRQAFC						
>ET074MT	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158890
CTRPNYTKRRSIGIGPGQAFFARGGITGDIRQAFC						
>ET079MT	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158891
CTRPNNNIRKSVRIGRGHTFYATGAIRGDIRQTHC						
>ET030MT	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158892
CTRPYNTIRKRIKIGPGHAFHTTKTIRGDIRQAFC						
>ET030A6	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158893
CTRPYNTIRTRIKIGPGHAFHTTKTIRGDIRQAFC						
>ET030A11	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158894
CTRPYNTIRKRIKIGPGHAFHTTKTIRGDIRQAFC						
>ET030A12	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158895
CTRPYNTIRTRIKIGPGHAFHTTKTIRGDIRQAFC						
>ET030B6	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158896
CTRPNNNTRKSIRIGPGHAFHATGAIIGDIRQAYC						
>ET030C11	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158897
CTRPYNTIRTRIKIGPGHAFHTTKTIRGDIRQAFC						
>ET030F7	ET	C	SI/CXCR4	u4	ABEBE (1999)	AF158898
CTRPYNTIRKRIKIGPGHAFHTTKTIKGDIRQANC						
>ET074C1	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158899
CTRPNYTKRRSIGIGPGQAFFARGGITGDIRQAFC						
>ET074C4	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158900
CTRPNYTKRRSIGIGPGQAFFARGGIGRDIRQAFC						
>ET074D6	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158901
CTRPNYTKRRSIGIGPGQAFFARGGIGRDIRQAFC						
>ET074E7	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158902
CTRPNYTKRRSIGIGPGQAFFARGGIGRDLRQAFC						
>ET074F3	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158903
CTRPNYSKRRSIGIGPGQAFFARGGITGDIRQAFC						
>ET074G2	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158904
CTRPNYTKRRSIGIGPGQAFFARGGITGDIRQAFC						
>ET074H1	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158905
CKRPNINKRKSIGIGPGQGFLARGGITGDIRQAFC						
>ET074H3	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158906
CTRPNYIKRKSIGIGPGQAFFARGGIGRDLRQAFC						
>ET079B8	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158913
CTRPNNNIRKSVRIGRGHTFYATGAIKGNIRQAHC						
>ET079E7	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158914
CTRPNNNIRKSVRIGRGHTFYATGAIRGNIRQTHC						
>ET079H8	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158915
CTRPNNNIRKSVRIGRGHTFYATGNIIGDIRQAHC						
>TV005	ZA	C	SI/R5X4	u7	TREURNICHT (2002)	AF254770
CTRPNNNTRKSIRIGPGQTFYATGDIIGDIRQAHC						
>ET074E2	ET	C	SI/CXCR4	u5	ABEBE (1999)	AF158908
CTRPNNNTRKSIGIGPGQAFYARGDIIGDIRQAFC						
>ET079A1	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158909
CTRPNNNTRKSVRIGPGQTFYATGAIIGDIRQAHC						
>ET079A7	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158910
CTRPNNNTRKSVRIGPGQTFYATGAIIGDIRQAHC						
>ET079B12	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158911
CTRPNNNTRKSVRIGPGQTFYATGAIIGDIRQAHC						
>ET079F5	ET	C	SI/CXCR4	u6	ABEBE (1999)	AF158912
CIRPNNNTRKSVGIGPGQAFYATGDIIGDIRQAHC						
>TM1		ZA	C	SI/R5X4	COETZER (SUBM)	
CTRPNNNTRKNVRIGRGQTFYANGRIIGNIRQAHC						
>TM2		ZA	C	SI/CXCR4	COETZER (SUBM)	
CARPGNNTRKMMRIGRGQTFYANGQVIGDIRQAHC						
>TM9		ZA	C	SI/CXCR4	COETZER (SUBM)	
CTRPYYNKRRSMRIGRGQALYATKEITGDIRRAYC						
>TM18c	ZA	C	SI/R5X4	u10	COETZER (SUBM)	
CTRPNNNTRRSIRIGPGAAYYANNDIIGDIRQAYC						
>RP1		ZA	C	SI/R5X4	COETZER (SUBM)	
CIRPGNNTRKRVRLGPGQTFYATGRVIRDIRQAHC						
>99ZASW7	ZA	C	CXCR4	u12	PAPATHANASOPOULOS (2002)	AF411966
CTRPGSNKQRIRNIGPGRAFHTNG-VIGDIRKAYC						
>SW12	ZA	C	SI/CXCR4	u13	CILLIERS (2003)	
CMRPGNNTRKRVRIGPRQTFYAPGGINKDIRQAHC						
>SW20	ZA	C	SI/R5X4	u14	CILLIERS (2003)	
CTRPNNNTRKSIRTGRGQTFYVTGQIIGDVRQAHC						
>SW30c	ZA	C	SI/R5X4	u15	NICD (UNPUBL)	
CTRPNNNTRKSVRIGRGLSFYTTGKVLGNIRQAHC						
>SW30sr	ZA	C	SI/R5X4	u15	CILLIERS (2003)	
CTRPNNNTRKSVRIGRGHAFYTTGKVIGNIRQAHC						
>99ZACM9	ZA	C	R5X4	u16	PAPATHANASOPOULOS (2002)	AF411967
CARPGNNTIKRIRIGPRYAFYAKETIIGDIRQAHC						
>DU36	ZA	C	SI	u17	COETZER (SUBM)	
CTRPDNNISMKRIKGPGRAFVATKGIKGDIRQAHC						
>DU151MarchOO	ZA	C	SI/R5X4	u18	MP	NICD (UNPUBL)	
CTRPNSNKRRGVRIGPGLSFFATRKIIGDIRQAHC						
>DU151MayOO	ZA	C	SI/R5X4	u18	MP	NICD (UNPUBL)	
CTRPNSNKRRGVRIGPGLAFFATRKIIGDIRQAHC						
>DU151MT2	ZA	C	SI/CXCR4	u18	COETZER (SUBM)	
CTRPNSNKRRGVRIGPGLSFFATRRIIGDIRQAHC						
>DU179MAR99	ZA	C	SI	u19	COETZER (SUBM)	
CTRPGNNTRKSIRIGPGQAFY-TNHIIGDIRQAHC						
>DU179MAY99	ZA	C	SI/R5X4	u19	VAN HARMELEN (2001)	AY043174
CTRPGNNTRKSIRIGPGQAFY-TNHIIGDIRQAYC						
>DU179FEB00	ZA	C	SI/R5X4	u19	COETZER (IN PREP)	
CTRPGNKTRRSIRIGPGQAFYTTNTI-GDIRQASC						
>DU179MAY00	ZA	C	SI/R5X4	u19	COETZER (IN PREP)	
CTRPGNKTIRSIRLGPGQAFY-TNK--GDIRQASC						
>DU179D	ZA	C	SI/CXCR4	u19	COETZER (IN PREP)	
CTRPGNKTIRSIRIGPGRTFY-TNK--GDIRQAYC
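
The name-line convention described at the top of this file (sample name first, then a u[nn] field identifying the infected individual) can be read programmatically. The following is a minimal sketch in Python, not part of the original data set. The function names parse_records and group_by_individual are hypothetical, and the sketch assumes that the u[nn] field, when present, appears as a standalone whitespace-delimited token after the sample name. Records without a u[nn] field are reported with None.

```python
import re
from collections import defaultdict

# Three records copied from the set above, keeping the tab-separated layout.
EXAMPLE = (
    ">ET030\tET\tC\tSI/CXCR4\tu4\tABEBE (1999)\tAF158851\n"
    "CTRPNNNTRKSIRIGPGQAFYATGTIIGDIRQAHC\t\t\t\t\t\t\n"
    ">ET030MT\tET\tC\tSI/CXCR4\tu4\tABEBE (1999)\tAF158892\n"
    "CTRPYNTIRKRIKIGPGHAFHTTKTIRGDIRQAFC\t\t\t\t\t\t\n"
    ">TM1\t\tZA\tC\tSI/R5X4\tCOETZER (SUBM)\t\n"
    "CTRPNNNTRKNVRIGRGQTFYANGRIIGNIRQAHC\t\t\t\t\t\t\n"
)

# Assumption: the individual marker is a whole field of the form u<digits>.
INDIVIDUAL = re.compile(r"u(\d+)")


def parse_records(text):
    """Yield (sample_name, individual_number_or_None, sequence) tuples."""
    name, individual, parts = None, None, []
    for raw in text.splitlines():
        line = raw.strip()  # drops the trailing tabs on sequence lines
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, individual, "".join(parts)
            fields = line[1:].split()  # split on tabs or spaces
            name, individual, parts = fields[0], None, []
            for field in fields[1:]:
                match = INDIVIDUAL.fullmatch(field)
                if match:
                    individual = int(match.group(1))
                    break
        elif name is not None:
            # Lines before the first ">" (the file description) are skipped.
            parts.append(line)
    if name is not None:
        yield name, individual, "".join(parts)


def group_by_individual(records):
    """Map individual number (or None) to the list of sample names."""
    groups = defaultdict(list)
    for name, individual, _sequence in records:
        groups[individual].append(name)
    return dict(groups)


records = list(parse_records(EXAMPLE))
groups = group_by_individual(records)
```

Grouping by individual in this way is useful when several clones come from the same person, for example to keep all sequences from one individual on the same side of a training/test split. Alignment gaps ("-") are left in the sequence strings unchanged.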