Genomic Variation in Human Herpesvirus-8 and Implications for Pathogenesis of Kaposi Sarcoma: an analysis of GenBank sequences
Background: Human herpesvirus 8 (HHV-8) is the etiologic agent of Kaposi sarcoma, primary effusion lymphoma and multicentric Castleman's disease, conditions with heterogeneous clinical manifestations. The HHV-8 genome codes for proteins that modulate cell cycle regulation, immune function and angiogenesis, yet little is known about the diversity of its genomic sequences or the impact of this diversity on HHV-8-associated disease.
Methods: All HHV-8 sequences from GenBank were aligned in MOSAIK with refinement in Geneious (Biomatters, Ltd). Perl scripts were used to calculate genome sequence coverage, regional variation by Hamming distance and nucleotide positional variation by Shannon entropy score. Hamming distance for each 50-nucleotide window is the pairwise distance between 2 sequences averaged over all sequences. The Shannon entropy score at position n is given by $H_n = -\sum_{i=1}^{4} p_i \log p_i$, where $p_i$ is the probability of each of the 4 possible nucleotides at position n, based on the distribution in all sequences.
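The two variation measures described in the Methods can be illustrated with a minimal Python sketch. This is not the Perl code used in the study. The function names are hypothetical, and several details are assumptions that the abstract does not state: windows are non-overlapping, the Hamming distance is a count of mismatched positions averaged over all sequence pairs, gaps and ambiguous bases are ignored, and the entropy uses the natural logarithm unless another base is passed.

```python
import math
from collections import Counter
from itertools import combinations

NUCLEOTIDES = "ACGT"


def shannon_entropy(column, base=math.e):
    """Shannon entropy of one alignment column.

    Only A, C, G and T are counted; gaps and ambiguity codes are skipped
    (an assumption, not stated in the abstract).
    """
    counts = Counter(b for b in column.upper() if b in NUCLEOTIDES)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log(1/p) is the same quantity as -p * log(p).
    return sum((c / total) * math.log(total / c, base) for c in counts.values())


def positional_entropy(seqs, base=math.e):
    """Entropy score for every column of an equal-length alignment."""
    return [shannon_entropy("".join(col), base) for col in zip(*seqs)]


def hamming(a, b):
    """Count positions where both sequences have an A/C/G/T call and differ."""
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in NUCLEOTIDES and y in NUCLEOTIDES and x != y
    )


def windowed_hamming(seqs, window=50):
    """Mean pairwise Hamming distance in each non-overlapping window."""
    if len(seqs) < 2:
        raise ValueError("need at least two aligned sequences")
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    means = []
    for start in range(0, length, window):
        end = start + window
        dists = [hamming(a[start:end], b[start:end]) for a, b in pairs]
        means.append(sum(dists) / len(dists))
    return means


if __name__ == "__main__":
    # Toy alignment for illustration only; not HHV-8 data.
    alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
    print(windowed_hamming(alignment, window=4))
    print(positional_entropy(alignment, base=2))
```

Invariant columns score 0 under this definition, and a column split evenly between two nucleotides scores 1 when `base=2`. Because the abstract does not give the logarithm base or the exact averaging scheme, values from this sketch are not directly comparable to the reported results.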
Results: Six whole genome sequences of HHV-8 (half derived from cultured virus) were available in GenBank, along with 2558 subgenomic fragments, 93% of which were <1 kb. Only 6 regions of the HHV-8 genome had >25 sequences available, with K1, ORF26/27 and ORF75/K15 accounting for 2140 (83%) of the total available sequences. Across the HHV-8 genome, mean Hamming distance was 0.3. In K1 and K15, mean (range) Hamming distance was 3.7 (0-15.8) and 2.0 (0-6.6), respectively. Hn = 0 for 97.0% of the genome, with a maximum of 1.27. For the 4206 positions with Hn > 0, mean (SD) Hn = 0.31 (0.18), and only 17.6% of these fell into non-coding regions. Of the 10% most variable positions (n=469, Hn > 0.562), 322 (68.7%) of these single nucleotide polymorphisms (SNPs) fell within 11 of 24 pre-identified viral oncogenes.
Conclusion: Few sequences are available for most of the HHV-8 genome. Hamming distance analysis revealed low regional diversity except in the known highly divergent genes K1 and K15/ORF75. The remaining positional genomic heterogeneity consisted of SNPs in coding regions, including many in genes with putative oncogenic activity. Newer sequencing technologies have great potential to explain the relationship between HHV-8 genomic diversity and disease phenotypes.
B. Hall, None
W. Phipps, None
C. Casper, None
J. Mullins, None