News Article | November 2, 2016
No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. We tested 2,231 samples collected from the MSM cohort in San Francisco in 1978 (ref. 9) and detected 83 HIV-1-positives by Western Blot (3.7% prevalence). Samples were first screened by GS HIV-1/HIV-2 Plus O EIA (Bio-Rad Laboratories) and reactive samples were further tested by WB Genetic Systems HIV-1 Western Blot (Bio-Rad Laboratories). A total of 33 samples of frozen serum from New York City previously identified as positive for antibody to HIV-1 (refs 6, 7, 8) were assayed; and a total of 20 frozen serum samples from San Francisco (ref. 9), identified as part of the present study as positive for antibody to HIV-1, were assayed. The New York City samples were from 1978 and 1979, though no complete genomic sequences from 1978 were developed. The San Francisco samples were all from 1978. RNA recovered from samples from both NY and SF was generally undetectable when assaying 5-μl aliquots in a Qubit 2.0 fluorometer using the Qubit RNA HS reagents (detection limit, 250 pg μl−1). Additionally, a sample of peripheral blood mononuclear cells (PBMCs) and a sample of serum were both assayed; these had been collected from a single individual in 1983 (Patient 0), and the samples were stored at CDC Atlanta. Other than Patient 0, now deceased, the data recorded were unlinked to individual identifiers and the work was approved by the Human Subjects Protection Program at the University of Arizona. Four panels of degenerate primers (Supplementary Table 1 and Extended Data Fig. 1) were designed using a suite of North American subtype B sequences. We aimed to design primers able to amplify both conserved regions and predictably variable sites. Primers within each panel were designed to generate sequence from the 5′ end of gag to the 3′ end of nef and were designed to amplify overlapping fragments.
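As an illustration of what a degenerate primer represents, the sketch below expands a primer written with IUPAC ambiguity codes into the concrete sequences it covers. The function name and the example primer `ACRTY` are hypothetical and are not taken from Supplementary Table 1; only the IUPAC code table is standard.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate_primer(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical primer with two degenerate positions (R = A/G, Y = C/T).
variants = expand_degenerate_primer("ACRTY")
```

Each degenerate position multiplies the number of concrete oligonucleotides in the mix, which is why degeneracy is usually limited to sites that are predictably variable.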
Two panels ‘HIVL’ (n = 25) and ‘HIVLb’ (n = 22) were designed to amplify fragments of approximately 500–650 bases in length. Two other panels ‘HIVM’ (n = 50) and ‘HIVR’ (n = 46) were designed to amplify fragments of approximately 200–320 bases in length. Nucleic acids from 100-μl aliquots of serum (or PBMCs in the case of Patient 0) were isolated using the QIAamp viral RNA mini kit (Qiagen) with 5 mcg added carrier RNA. Serum samples were then treated with DNase I (Invitrogen, Life Technologies) before reverse transcription. PBMC nucleic acids were left untreated. Proviral DNA from Patient 0’s PBMCs was amplified with all four primer panels and from multiple separate isolations. Amplification was achieved using Invitrogen platinum Taq DNA polymerase high fidelity (Life Technologies) and run for 55 cycles at an annealing temperature of 52 °C. Additionally, attempts were made to amplify longer fragments using PCR supermix high fidelity (Life Technologies) and forward and reverse primers matched from the HIVLb primer panel for long fragment length followed by nesting with primers for slightly shorter fragment length. A single fragment of slightly more than 7,000 bases was generated after multiple attempts with multiple primer combinations and cloned using the Invitrogen TOPO XL PCR cloning kit (Life Technologies). Fragments of individual clones were then amplified using HIVLb forward and reverse primers matched to give approximately 1,000-base overlapping fragments and then sequenced. RNA jackhammering of the serum samples proceeded as follows: aliquots of RNA extract were reverse transcribed using the GoScript reverse transcription system (Promega) using a program of 4 cycles of 50 °C for 30 min followed by 55 °C for 30 min and a final incubation at 85 °C for 10 min. Primers used were pools of reverse primers from widely spaced amplicons (Supplementary Table 1, Extended Data Fig. 
1), typically nine or ten primers per pool in a single reaction tube, with the wide spacing abrogating the possibility of incorporation of an internal primer into any given amplicon. Reverse transcription products were then briefly amplified in multiplex reactions in the pool-specific tube (denaturation for 3 min at 94 °C followed by 30 cycles of 94 °C for 30 s, 52 °C for 30 s, 68 °C for 30 s, and a final extension of 68 °C for 5 min) with matching forward primer pools (a ‘preliminary amplification’ step). Sequences were then amplified from individual aliquots taken from the pool-specific tubes, via single primer pairs (denaturation for 3 min at 94 °C followed by 40 cycles of 94 °C for 30 s, 52 °C for 30 s, 68 °C for 30 s, and a final extension of 68 °C for 5 min). Two separate isolates were amplified from each sample in this manner, with a minimum of one amplification with each primer panel per isolate. Five out of the 33 (15%) of the NY sera assayed yielded complete HIV-1 genomic data as did 3 out of the 20 (15%) SF sera, suggesting that levels of viral RNA preservation were very similar in each collection. In Extended Data Fig. 1 we schematically illustrate the RNA jackhammering approach and its advantages over standard RT–PCR procedures for degraded, low input samples. For a conventional RT–PCR approach with a fairly long amplification product we would perform reverse transcription and obtain one potentially amplifiable cDNA product. We would then aliquot ~10% of the reverse transcription product for amplification in a PCR reaction with forward and reverse primers. Even if the single cDNA product made it into the PCR reaction, the desired amplification product would be too long and a PCR amplicon would therefore not be obtained. 
For RT–PCR with a shorter amplification product, more appropriately sized given the damaged RNA in the sample, there is still a 90% chance that it would be deemed a negative sample, since most aliquots will not contain the rare cDNA product. Using multiple primer sets would increase the chance of a PCR-positive result, but most PCR reactions would remain negative because most aliquots lack target cDNA. Even with a 10 primer-pair pool and 10 final PCR reactions, there may be no amplified product. The RNA jackhammering approach targets large panels of appropriately short amplicons, uses discrete pools of non-overlapping primer pairs for reverse transcription, and includes a crucial multiplex pre-amplification step to ensure that each aliquot contains ample template molecules for the final PCR amplification (a separate reaction for each primer pair in the entire panel). Sequencing was performed at the University of Arizona Genetics Core using an ABI 3730XL. The Patient 0 sample contained considerable heterogeneity (mixed bases) both in proviral assembly and in viral RNA amplification. Heterogeneity in the NY and SF samples (all sequences derived from viral RNA) was low. In all cases consensus sequences were used in the phylogenetic analyses. Primer sequences were computationally removed from all sequence data before assembling genomic consensus sequences, which yielded coding-complete genomic data with the exception of a few small gaps and the 3′ end of the nef gene (Supplementary Table 2). To validate this approach we obtained seed stock samples from the NIH AIDS Reagent program of subtype B viruses from the US (US657) and Haiti (HT599) and applied a jackhammering approach with independent runs of both the HIVM and HIVR primer panels (Extended Data Fig. 8). For US657 we recovered, in total, from both runs combined, 8,194 nt of high quality data.
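The aliquot arithmetic behind the jackhammering rationale can be sketched with a small calculation. This is a simplified model, assuming each template molecule ends up in a given aliquot independently with probability equal to the aliquot fraction; the function name and the post-amplification copy number of 1,000 are illustrative assumptions, and only the ~10% aliquot comes from the text.

```python
def p_aliquot_has_template(n_templates: int, aliquot_fraction: float) -> float:
    """Probability that an aliquot contains at least one template molecule,
    assuming molecules are distributed independently (simplified model)."""
    return 1.0 - (1.0 - aliquot_fraction) ** n_templates

# Conventional RT-PCR on a degraded sample: one amplifiable cDNA molecule,
# ~10% of the reverse transcription product taken forward.
p_positive_single = p_aliquot_has_template(1, 0.10)
p_negative_single = 1.0 - p_positive_single  # chance the sample is scored negative

# After a multiplex pre-amplification step (copy number is hypothetical),
# essentially every 10% aliquot carries template.
p_positive_preamp = p_aliquot_has_template(1000, 0.10)
```

Under this model the single-molecule case gives the 90% false-negative chance described above, while pre-amplification drives the per-aliquot success probability close to one.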
HIVM and HIVR are independent runs with completely different primer sets, yet where the data overlapped, they were >99.9% similar. Moreover, the few heterogeneities did not line up with heterogeneous primers but fell in regions between primers, demonstrating that differences could not be attributed to the incorporation of primers into the recovered sequences. This was expected both because the wide spacing of amplicons within a single pool of primer pairs prevents incorporation of primers within amplified products and because all primer sequences from final amplification products were computationally removed from the sequences before assembly of genomic sequences. Our data covered about 90% of the 3,354 bases of previously published US657 sequence (GenBank accession number U04908) and all of our individual amplicons in the region of overlap had US657 as the highest BLAST hit and were >99% similar to the published sequence. For HT599 the HIVM and HIVR primer panels developed 8,545 nt of data, 99.6% of the target. The HIVM-derived sequence was >99.9% similar to the HIVR-derived sequence. We recovered 100% of the overlap with the previously published HT599 sequence (2,881 nt, GenBank accession number U08447) with 99.5% similarity. To evaluate discrepancies between the jackhammering-recovered sequences and both US657 and HT599, we compared consensus sequences of combined HIVM and HIVR data with the respective published sequences by adding them to our complete genome alignment and reconstructing a maximum likelihood tree (Extended Data Fig. 8a). As expected, the independently generated sequences from each virus clustered very closely and only had short tips from their common ancestors, resulting from a very small number of substitutions in their overlapping regions. In a root-to-tip analysis (Extended Data Fig.
8b), our sequences (with a target symbol) were associated with somewhat smaller residuals than the published sequences (with a circle), indicating that our data are likely to be more accurate and, importantly, cannot contain primer remnants as this would result in much larger residuals. To construct the data sets for the analyses shown in Fig. 1 and Extended Data Figs 2, 3, 4 we searched the Los Alamos National Laboratories (LANL) HIV database (http://hiv.lanl.gov/) for all available genome-length HIV-1 sequences from Caribbean countries, which had previously been shown to exhibit diverse subtype B lineages that fall basal to a monophyletic ‘pandemic’ clade of subtype B that accounts for most US and other non-Caribbean subtype B infections (ref. 2). These included sequences sampled in Haiti, Dominican Republic, Jamaica and from Haitians who had recently immigrated to the US from Haiti (‘H3’ and ‘H5’ from 1982, ‘H6’ and ‘H7’ from 1983, ‘RF_HAT’ from 1983) (ref. 2). For sequences H3, H5, H6 and H7 pol sequences were not available, but partial gag and full-length env sequences were available. For the full-genome analyses the pol gene was treated as missing data. We then added a similar number of genomes from the US from a similar time period (1982–2005), plus one each from France and the UK, as well as outgroup sequences of subtype D from the Democratic Republic of the Congo (D.R.C.). We called this the ‘full genome 46’ data set because it contained 46 genomes. The gag, pol and env data sets depicted in Extended Data Fig. 3 were each derived from the respective sub-genomic region of this same set of taxa. The subset of ‘full genome 46’ that contained only those US sequences sampled from 1978–1984 we called ‘full genome 38’. For the env analyses in Fig. 3 and Extended Data Fig. 5 the alignment from ref.
2 was used, with the addition of the sequences generated for the present study, additional Caribbean subtype B sequences from 2000 to 2005, and four early subtype B partial env sequences from San Francisco (ref. 10). This alignment we called ‘env 105’. The subset that contained only those US sequences sampled from 1978–1984 we called ‘env 74’. For Extended Data Fig. 6 we added to ‘env 105’ a comparable number (relative to those sampled from 1978–1984 from known locations: New York, California, Georgia, Pennsylvania, New Jersey; Extended Data Fig. 4b) of randomly sampled sequences from 1997–2007 from NY, SF, and North Carolina (the closest available site with sufficient numbers to stand in for the Georgia ones from the 1978–1984 sample). We called this alignment ‘env 133’. In all cases sequences were manually aligned using Se-Al (http://tree.bio.ed.ac.uk/software/seal/). All sequence alignments, input files, tree files and primer sequences are available at the Dryad Digital Repository (doi:10.5061/dryad.7mv7v). Maximum likelihood phylogenies were reconstructed using RAxML under a general time-reversible model of substitution with gamma distributed rate variation among sites (ref. 20). Bootstrap support values were calculated using 1,000 pseudo-replicates. To detect the presence of recombination, we first performed the Phi test (ref. 21) on every data set (Extended Data Table 1). When the null hypothesis of absence of recombination was rejected (P < 0.05), we subsequently analysed the data set using RDP4 (ref. 22) and produced new alignments in which the minor recombinant regions were deleted from putative recombinants. Re-analyses of these ‘recombination-free’ data sets using the Phi test confirmed the absence of detectable recombination signal (P > 0.05, Extended Data Table 1). Time-measured phylogeographic histories were reconstructed using a Bayesian phylogenetic inference approach implemented in BEASTv1.8.2 (ref. 23).
Our full probabilistic model combined sequence substitution over an unknown phylogeny calibrated in time units using a molecular clock process with dated tips (ref. 24), a coalescent tree prior and a discrete diffusion process among discrete location states (ref. 25). For the sequence substitution process, we used the same model as for the maximum likelihood reconstructions. We accommodated rate variation among lineages using a lognormal distribution in an uncorrelated relaxed molecular clock model (ref. 26) and integrated out each sampling date over an uncertainty interval of one year. Visual inspections of root to tip divergence as a function of sampling time using TempEst (ref. 27) indicated a strong temporal signal with no clear outlier sequences (Extended Data Fig. 9). For most analyses, we flexibly modelled changes in effective population size through time by specifying a Bayesian skygrid non-parametric tree prior with a grid of 50 years and yearly effective population size parameters (ref. 28). (The notion of ‘effective population size’, or ‘effective infections’ in epidemiological applications, comes from population genetics, and is typically lower than the full (that is, census) population size, reflecting, for example, variance in reproductive success among individuals, which in this context means transmissions to new hosts.) To estimate viral population growth rates in both the Caribbean and US populations, we fitted a ‘nested’ coalescent model to the data set with the largest taxon sampling (env 133). This model fits a constant-logistic demographic function (ref. 29) to the genealogy excluding the US clade. The initial constant phase was included in the model to accommodate the deep branching between the subtype B sequences and the African subtype D outgroup sequences. Nested within this model, a separate logistic growth model was fitted to the US clade in the genealogy. The process of discrete diffusion among locations was modelled using a general non-reversible substitution model (ref. 30).
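The root-to-tip check described above (divergence from the root regressed on sampling time) can be sketched as an ordinary least-squares fit. This is a minimal illustration with made-up, perfectly clock-like data; it is not TempEst's implementation and uses none of the study's sequences, and the function name is hypothetical.

```python
def root_to_tip_regression(dates, divergences):
    """Least-squares fit of root-to-tip divergence against sampling date.

    Returns (rate, root_date, residuals): the slope in substitutions per
    site per year, the x-intercept (implied date of the root), and the
    per-tip residuals used to spot outlier sequences.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(divergences) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, divergences))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    residuals = [d - (intercept + rate * t) for t, d in zip(dates, divergences)]
    root_date = -intercept / rate
    return rate, root_date, residuals

# Hypothetical tips evolving at 0.002 substitutions/site/year from a 1970 root.
dates = [1978.0, 1979.0, 1983.0, 1990.0]
divergences = [0.002 * (t - 1970.0) for t in dates]
rate, root_date, residuals = root_to_tip_regression(dates, divergences)
```

A sequence carrying artefactual changes, such as primer remnants, would sit well above the fitted line and show up as a large positive residual.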
In our analyses including the African subtype D outgroup lineages, we set the root state frequency to one for the African state and zero for all other possible discrete states. We obtained estimates of the transitions among locations (Markov jumps) using a stochastic mapping implementation capable of inferring the complete Markov jump history (refs 31, 32). We approximated the posterior distribution for our full probabilistic model using Markov chain Monte Carlo (MCMC) sampling. We used BEAGLE in conjunction with BEAST to improve the computational performance of our analyses (ref. 33). MCMC chains were run for 50,000,000 generations, sampling every 5,000 generations. We diagnosed the runs by examining trace plots and effective sample sizes, and summarized continuous parameters (mean and 95% highest posterior density (HPD) intervals) using Tracer (http://tree.bio.ed.ac.uk/software/tracer/) after discarding a 10% burn-in. Trees were summarized as maximum clade credibility trees using TreeAnnotator and visualized in FigTree (http://tree.bio.ed.ac.uk/software/figtree/). In two specific phylogeographic analyses, we assessed (i) to what extent sequences sampled early in the US epidemic characterize the subtype B diversity in the US clade (Extended Data Fig. 6a) and (ii) to what extent the location state at the origin of the US clade can be estimated using sequences sampled later in the epidemic from three different US states (Extended Data Fig. 6b). For this purpose, we first reconstructed time-measured phylogenies for the env 133 data set using the substitution model, molecular clock model and coalescent model described above and subsequently reconstructed ancestral locations on the inferred posterior distribution of trees. For Extended Data Fig. 6a, we classified US sequences as ‘early’ if they were sampled before 1985 and ‘late’ if they were sampled in or after 1985. For Extended Data Fig.
6b, we first pruned the necessary US sequences from the posterior distributions in order to retain only ‘late’ sequences from New York, North Carolina and California (matching the sampling from New York, Georgia and California in Fig. 3 and Extended Data Fig. 5b). In this case, the support for a NYC ancestral state is likely upheld by the presence of two basal NYC representatives, but location estimates in a star-like tree structure with long tip branches will be critically dependent on how well the diversity of any location is represented in the contemporaneous sampling, as recently noted34. Comparison of phylogeographic estimates before and after deleting minor recombinant regions from putative recombinants (Extended Data Table 1) indicated highly consistent results. All sequence alignments, input files, tree files and primer sequences are available at the Dryad Digital Repository (doi:10.5061/dryad.7mv7v). The HIV-1 sequences reported here have been deposited in GenBank under accession numbers KJ704787, KJ704788, KJ704789, KJ704790, KJ704791, KJ704792, KJ704793, KJ704794, KJ704795, KJ704796 and KJ704797.
Bicego G.T.,Centers for Disease Control and Prevention |
Nkambule R.,Ministry of Health |
Peterson I.,Columbia International University |
Reed J.,CDC Atlanta |
And 9 more authors.
PLoS ONE | Year: 2013
Background: The 2011 Swaziland HIV Incidence Measurement Survey (SHIMS) was conducted as part of a national study to evaluate the scale up of key HIV prevention programs. Methods: From a randomly selected sample of all Swazi households, all women and men aged 18-49 were considered eligible, and all consenting adults were enrolled and received HIV testing and counseling. In this analysis, population-based measures of HIV prevalence were produced and compared against similarly measured HIV prevalence estimates from the 2006-7 Swaziland Demographic and Health Survey. Also, measures of HIV service utilization in both HIV infected and uninfected populations were documented and discussed. Results: HIV prevalence among adults aged 18-49 has remained unchanged between 2006 and 2011 at 31-32%, with substantial differences in current prevalence between women (39%) and men (24%). In both men and women, between 2006-7 and 2011, prevalence has fallen in the young age groups and risen in the older age groups. Over a third (38%) of the HIV-infected population was unaware of their infection status, and this differed markedly between men (50%) and women (31%). Of those aware of their HIV-positive status, a higher percentage of men (63%) than women (49%) reported ART use. Conclusions: While overall HIV prevalence remains roughly constant, age-specific changes strongly suggest both improved survival of the HIV-infected and a reduction in new HIV infections. Awareness of HIV status and entry into ART services has improved in recent years but remains too low. This study identifies opportunities to improve both HIV preventive and care services in Swaziland.