About the database

The HIV mutation browser is a database of mutagenesis and mutation data on HIV collected from the scientific literature. The data has been identified and catalogued using computational text-mining methods. A researcher can use the database to find literature describing the phenotype of a mutation, and/or experimental data describing the effect of a mutation.

The resource is a collaboration between the Briggs Group at the European Molecular Biology Laboratory (EMBL) in Heidelberg and the Schneider Group at the Luxembourg Centre for Systems Biomedicine (LCSB) in Luxembourg.



Source of mutation data

The content of the database is created by text-mining available HIV literature to find mutagenesis information. A list of articles is retrieved from PubMed using the search terms "HIV" and "Human Immunodeficiency Virus". All articles from this list that are available to us and that we are permitted to analyse computationally (see Publishers section) are downloaded and processed. We currently have permission to process approximately 40% of the literature, including the majority of basic-science publications on HIV.
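The retrieval step described above can be sketched as a PubMed search request. This is a minimal illustration only: it assumes the NCBI E-utilities ESearch endpoint and assumes the two search terms are combined with OR. The page does not say which interface or query syntax the resource actually uses, and the function name `build_pubmed_query` is hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint. Assumption: the resource's real
# retrieval pipeline is not documented on this page.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(terms, retmax=100):
    """Return an ESearch URL that matches any of the given phrases in PubMed."""
    # Quote each phrase and join with OR so that either term matches
    # (combining the terms with OR is an assumption).
    query = " OR ".join(f'"{term}"' for term in terms)
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_query(["HIV", "Human Immunodeficiency Virus"])
print(url)
```

The sketch only builds the request URL and sends nothing. Fetching the returned identifiers and downloading full text would still be subject to the publisher permissions described on this page.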

The current version of the database (version 1.0) identified approximately 275,000 papers of interest in PubMed:
  • Number of papers permitted to be processed: 129,969
  • Number of papers processed: 129,969
  • Number of papers containing mutational information: 6,094
  • Number of mutations: 119,157
  • Number of distinct mutations: 8,257
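A few ratios can be derived from the counts listed above. The short sketch below does the arithmetic. It assumes that "Number of mutations" counts every annotated mutation mention, including repeats of the same mutation across papers, which the page does not state explicitly. The variable names are illustrative.

```python
# Figures copied from the version 1.0 statistics listed above.
papers_processed = 129_969
papers_with_mutations = 6_094
mutation_mentions = 119_157   # assumed to include repeated mentions
distinct_mutations = 8_257

# Share of processed papers that contain mutational information.
share_with_mutations = papers_with_mutations / papers_processed
# Average number of annotated mutations per mutation-containing paper.
mentions_per_paper = mutation_mentions / papers_with_mutations
# Average number of times each distinct mutation is annotated.
mentions_per_distinct = mutation_mentions / distinct_mutations

print(f"Papers with mutation data: {share_with_mutations:.1%}")
print(f"Annotated mutations per such paper: {mentions_per_paper:.1f}")
print(f"Annotations per distinct mutation: {mentions_per_distinct:.1f}")
```

Under that assumption, roughly 4.7% of processed papers carry mutation data, and each distinct mutation is annotated about 14 times on average.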

The database is updated on a monthly basis to add the latest HIV literature.



Contributing Publishers

Unfortunately, the analysis of scientific literature using computational text-mining is prohibited by the majority of publishers' access licenses. Thankfully, most of the publishing companies and societies that we approached granted us permission to text-mine and index the HIV mutation information contained in their literature. We thank the following publishers, who permit us to analyse their content.

American Society for Biochemistry and Molecular Biology
American Society For Microbiology
BioMed Central
Elsevier
John Wiley & Sons, Inc.
National Academy of Sciences
Nature Publishing Group
Oxford University Press
Public Library of Science
Society for General Microbiology

Table 1. List of publishers that have given permission to the HIV mutation browser to access, data-mine and display articles.


List of journals text mined for the database
If an article is missing from the database, it is possible that we have not yet asked for or obtained permission from the publisher to process it for inclusion in the data. If you would like us to add a publisher or journal to the resource, feel free to contact us at feedback@hivmut.org. We have been denied permission to analyse literature published by the American Chemical Society. Articles from their journals, the most relevant of which is "Biochemistry", are not indexed in the database. The current list of permitted journals is presented in Table 2.
  • BMC Biochem.
  • BMC Bioinformatics
  • BMC Biophys
  • BMC Biotechnol.
  • BMC Blood Disord
  • BMC Cancer
  • BMC Cell Biol.
  • BMC Chem Biol
  • BMC Clin Pathol
  • BMC Clin Pharmacol
  • BMC Complement Altern Med
  • BMC Dermatol.
  • BMC Evol. Biol.
  • BMC Fam Pract
  • BMC Gastroenterol
  • BMC Genet.
  • BMC Genomics
  • BMC Health Serv Res
  • BMC Immunol.
  • BMC Infect. Dis.
  • BMC Int Health Hum Rights
  • BMC Med
  • BMC Med Educ
  • BMC Med Ethics
  • BMC Med Genomics
  • BMC Med Imaging
  • BMC Med Inform Decis Mak
  • BMC Med Res Methodol
  • BMC Med. Genet.
  • BMC Microbiol.
  • BMC Mol. Biol.
  • BMC Musculoskelet Disord
  • BMC Nephrol
  • BMC Neurol
  • BMC Neurosci
  • BMC Nurs
  • BMC Oral Health
  • BMC Palliat Care
  • BMC Pediatr
  • BMC Pharmacol.
  • BMC Pregnancy Childbirth
  • BMC Psychiatry
  • BMC Public Health
  • BMC Pulm Med
  • BMC Res Notes
  • BMC Struct. Biol.
  • BMC Surg
  • BMC Syst Biol
  • BMC Urol
  • BMC Womens Health
  • EMBO J.
  • EMBO Rep.
  • J. Biol. Chem.
  • J. Gen. Virol.
  • J. Virol.
  • Nat Protoc
  • Nat Rev Drug Discov
  • Nat. Biotechnol.
  • Nat. Cell Biol.
  • Nat. Chem. Biol.
  • Nat. Genet.
  • Nat. Immun.
  • Nat. Immunol.
  • Nat. Methods
  • Nat. Neurosci.
  • Nat. Rev. Cancer
  • Nat. Rev. Genet.
  • Nat. Rev. Microbiol.
  • Nat. Rev. Mol. Cell Biol.
  • Nat. Rev. Neurosci.
  • Nat. Struct. Biol.
  • Nat. Struct. Mol. Biol.
  • PLoS Biol.
  • PLoS Clin Trials
  • PLoS Comput. Biol.
  • PLoS Curr
  • PLoS Genet.
  • PLoS Med.
  • PLoS Negl Trop Dis
  • PLoS ONE
  • PLoS Pathog.
  • Proc. Natl. Acad. Sci. U.S.A.

Table 2. List of journals that have given permission to the HIV mutation browser to access, data-mine and display articles.



Database Statistics



Protein Statistics

The number of mutations per protein varies widely. Pol, which contains the three enzymatic chains (a protease, an integrase and a reverse transcriptase), is by far the best-studied protein.

Gene  Publications  Distinct mutations  Distinct positions
gag   1,203         1,531               467
pol   4,383         4,395               1,240
env   1,044         2,329               727
tat   255           236                 75
nef   277           385                 164
rev   62            175                 78
vif   122           282                 146
vpr   153           145                 59
vpu   73            127                 52

Table 3. Distribution of the mutation data across the proteins of the HIV proteome.
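The comparison across proteins in Table 3 can be made concrete with a small calculation. The sketch below copies the table values into a dictionary and reports, for each gene, the publication count and the number of distinct mutations per mutated position. The names `TABLE_3` and `mutations_per_position` are illustrative and are not part of the resource.

```python
# Rows copied from Table 3:
# gene -> (publications, distinct mutations, distinct positions)
TABLE_3 = {
    "gag": (1203, 1531, 467),
    "pol": (4383, 4395, 1240),
    "env": (1044, 2329, 727),
    "tat": (255, 236, 75),
    "nef": (277, 385, 164),
    "rev": (62, 175, 78),
    "vif": (122, 282, 146),
    "vpr": (153, 145, 59),
    "vpu": (73, 127, 52),
}


def mutations_per_position(row):
    """Distinct mutations divided by distinct mutated positions."""
    _, mutations, positions = row
    return mutations / positions


# Gene with the largest number of publications.
most_published = max(TABLE_3, key=lambda gene: TABLE_3[gene][0])

# Print genes from most to least published.
for gene, row in sorted(TABLE_3.items(), key=lambda kv: kv[1][0], reverse=True):
    print(f"{gene:4s} {row[0]:>6,d} publications, "
          f"{mutations_per_position(row):.2f} distinct mutations per position")

print("Most published:", most_published)
```

A paper may report mutations in more than one gene, so the publication counts should not be summed to obtain a total number of papers.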




Source of mutation

Data from 2,639 different journals are curated in the database. The top 20 journals by number of annotated mutations are listed below.

Journal  Mutations  Papers with mutations  Papers
Journal of virology  6,639  1,666  11,628
Antimicrobial agents and chemotherapy  2,415  325  1,915
The Journal of biological chemistry  2,145  458  1,946
PloS one  2,040  471  7,497
Virology  1,633  277  2,495
Antiviral research  1,541  121  940
Retrovirology  1,388  143  645
Proceedings of the National Academy of Sciences of the United States of America  1,185  237  3,671
Journal of molecular biology  1,083  160  1,372
Journal of clinical microbiology  964  83  1,941
PLoS pathogens  961  166  1,009
Journal of clinical virology : the official publication of the Pan American Society for Clinical Virology  880  92  1,085
Viruses  859  48  233
The Journal of antimicrobial chemotherapy  758  75  143
Nucleic acids research  595  99  1,508
Virus research  415  52  716
The Journal of general virology  410  65  602
Journal of virological methods  386  51  823
AIDS research and human retroviruses  373  19  94
AIDS (London, England)  373  21  199

Table 4. Top 20 journals in the HIV mutation browser by number of mutations annotated.




Text-mining

A paper describing the text-mining algorithm for the HIV Mutation Browser resource is currently in preparation. We will update this section once the article is accepted.



Source of ancillary data

The HIV Mutation Browser integrates information from several resources to increase the ease of interpretation of the available HIV mutation and mutagenesis data. These sources are:
  • Homologue information
    - HIV Subtype Reference Protein sequences were retrieved from the Los Alamos National Laboratory.
  • Motif information
    - Motifs were retrieved from the ELM database and from UniProt annotation.
  • Protein structure information
    - Protein structures were retrieved from the RCSB Protein Data Bank (PDB).
  • Protein feature information
    - Protein information was retrieved from UniProt.
  • Disorder information
    - Intrinsic disorder predictions for the proteins were calculated using the IUPred algorithm.


Further HIV resources

HIV Drug Resistance database
Viralzone HIV entry
NIH HIV sequence database
NIH HIV-Human protein interaction database


Useful HIV links

PDB guide to the structural biology of HIV (PDF poster, 14 MB)
Cell SnapShot: HIV-1 Proteins
NIH: Understanding the biology of HIV


Citation

Davey NE*, Satagopam VP*, Santiago-Mozos S*, Villacorta-Martin C, Bharat TA, Schneider R, Briggs JA.
The HIV Mutation Browser: A Resource for Human Immunodeficiency Virus Mutagenesis and Polymorphism Data.
PLoS Comput Biol. 2014 Dec 4;10(12):e1003951. doi: 10.1371/journal.pcbi.1003951. eCollection 2014. [PubMed]

License

Data in this resource can be accessed for non-commercial use according to the HIV Mutation Browser EULA.