PubMed is just one database from the National Library of Medicine (NLM). The National Center for Biotechnology Information (NCBI), a division of the NLM, maintains several molecular biology databases. These databases link to one another and to PubMed. This month, I’ll describe how to find information about a gene in PubMed and the Gene database.
Which NCBI resource(s) should I use to find information on a gene?
You can start in either PubMed or Gene, a database of known and predicted genes for several species. Each Gene record is devoted to a single gene and may provide information on nomenclature, chromosomal location, gene products, phenotypes, and interactions, as well as links to literature, sequences, and other NCBI and external databases. Consider a Gene record a gene’s homepage in NCBI.
I’ll begin in PubMed because it is the database with which you are likely most familiar. In the PubMed search box, you can enter either a gene’s name or symbol. To activate the Gene Sensor (see next question), use the official gene symbol, which can be found at genenames.org, the site for the HUGO Gene Nomenclature Committee (HGNC). The HGNC assigns standardized names to human genes.
What is the PubMed Gene Sensor?
Gene Sensor checks the gene symbol that you enter against symbols in the Gene database and, if a match is found, displays links to information about the gene in NCBI databases at the top of your PubMed search results. These links include: the record(s) for the gene in the Gene database; articles on the gene’s function (GeneRIF; see below); and tests in the Genetic Testing Registry.
Choose the link to the gene’s record in the Gene database. The first option will be for the human gene, followed by links for other species, if available.
What if my initial PubMed search does not activate the Gene Sensor?
If you do not see the Gene Sensor box at the top of your PubMed results, then you can search the Gene database directly by selecting ‘Gene’ from the drop-down menu next to the search box. Enter a gene name or symbol, species, or disease.
How do I find information once I am in a Gene record?
Use the Table of Contents in the right-hand column of the record to navigate to specific information about the gene. Scroll down to the ‘Related information’ section of the right-hand column for links to information about the gene in other NCBI databases.
So how does this help me find PubMed articles about a gene?
In the Related information section of a Gene record, you will notice several links to PubMed. Each of these links retrieves a specific set of articles in PubMed:
- PubMed: Articles that have been indexed with the Medical Subject Heading (MeSH) of the protein that the gene codes for, combined with the subheading ‘genetics’. For example: ‘Hemochromatosis Protein/genetics’[MeSH].
- PubMed (GeneRIF): Articles that focus on the function of a gene. GeneRIFs (Gene References into Function) are identified in three ways: by National Library of Medicine staff; by volunteer collaborators, who submit a statement of function along with the article(s) describing that function (if you know of, or have authored, an article about a gene’s function, then you can submit a GeneRIF); and through reports from HuGE Navigator, a human genome epidemiology knowledge base from the Centers for Disease Control and Prevention. PubMed (GeneRIF) also includes articles that describe a gene’s interactions.
- PubMed (OMIM): Articles cited in Online Mendelian Inheritance in Man (OMIM) records. OMIM is a compendium of human genes and phenotypes.
- PubMed (nucleotide/PMC): Articles identified from shared sequence and PubMed Central links.
Each set of articles is continuously updated. Use these links to retrieve the set of articles that best describes the type of literature you are seeking.
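If you would rather retrieve these article sets programmatically than click through a Gene record, NCBI's E-utilities offer an ELink service that follows links between databases. The sketch below only builds the request URLs; it does not contact NCBI. The link names in `LINK_NAMES` are my assumption about which ELink name corresponds to each label in the Related information section, so check them against NCBI's published list of Entrez link names before relying on them. The function name `gene_to_pubmed_url` is hypothetical, and 672 is the Gene ID for human BRCA1.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ELink service.
ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

# ASSUMPTION: mapping from the labels shown in a Gene record to ELink link names.
# Verify each name against NCBI's Entrez link documentation before use.
LINK_NAMES = {
    "PubMed": "gene_pubmed",
    "PubMed (GeneRIF)": "gene_pubmed_rif",
    "PubMed (OMIM)": "gene_pubmed_citedinomim",
    "PubMed (nucleotide/PMC)": "gene_pubmed_pmc_nucleotide",
}


def gene_to_pubmed_url(gene_id, label):
    """Build an ELink URL that asks for the PubMed articles linked to one Gene record."""
    params = {
        "dbfrom": "gene",               # start from the Gene database
        "db": "pubmed",                 # follow links into PubMed
        "id": gene_id,                  # numeric Gene ID
        "linkname": LINK_NAMES[label],  # which set of articles to retrieve
    }
    return ELINK + "?" + urlencode(params)


# Example: the GeneRIF article set for human BRCA1 (Gene ID 672).
generif_url = gene_to_pubmed_url(672, "PubMed (GeneRIF)")
print(generif_url)
```

Opening the printed URL in a browser should return an XML list of PubMed IDs, which you can then paste into PubMed or pass to other E-utilities.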
What if I want to find all the literature on a particular gene in PubMed?
If you want to do a comprehensive PubMed search for literature on a gene, then use the Gene record and HGNC (genenames.org) to identify the gene’s current and past names, symbols, and synonyms. Use ‘OR’ to combine these keywords with the MeSH term for the protein that the gene codes for, with the subheading ‘genetics’. Some, but not all, genes also have a MeSH term for the gene itself; if one exists, add it to the search with ‘OR’ as well.
For example:
“BRCA1” OR “BRCC1” OR “FANCS” OR “BROVCA1” OR “PPP1R53” OR “breast cancer 1” OR “Genes, BRCA1”[MeSH] OR “BRCA1 Protein/genetics”[MeSH]
You may get a lot of irrelevant results with a comprehensive search because many gene symbols are not unique. Therefore, this search would likely have to be combined with another concept, using ‘AND’.
For example:
(“BRCA1” OR “BRCC1” OR “FANCS” OR “BROVCA1” OR “PPP1R53” OR “breast cancer 1” OR “Genes, BRCA1”[MeSH] OR “BRCA1 Protein/genetics”[MeSH]) AND (“ovarian neoplasms”[MeSH] OR “ovarian neoplasms” OR “ovarian cancer”)
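Long search strings like these are easy to mistype, so it can help to assemble them from lists of terms. The sketch below rebuilds the BRCA1 and ovarian cancer search shown above and wraps it in a URL for ESearch, the NCBI E-utilities search service. The helper names (`quote`, `or_block`, `esearch_url`) are hypothetical and exist only for this illustration, the code uses straight quotation marks as is usual in code, and nothing is sent to NCBI.

```python
from urllib.parse import urlencode


def quote(term, field=None):
    """Wrap a term in double quotes and optionally append a field tag such as [MeSH]."""
    quoted = f'"{term}"'
    return f"{quoted}[{field}]" if field else quoted


def or_block(terms):
    """Join already-quoted terms with OR and enclose them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


# Names, symbols, and synonyms for the gene, plus the MeSH terms from the example above.
gene_terms = [
    quote("BRCA1"), quote("BRCC1"), quote("FANCS"), quote("BROVCA1"),
    quote("PPP1R53"), quote("breast cancer 1"),
    quote("Genes, BRCA1", "MeSH"),
    quote("BRCA1 Protein/genetics", "MeSH"),
]

# The second concept, combined with the gene block using AND.
topic_terms = [
    quote("ovarian neoplasms", "MeSH"),
    quote("ovarian neoplasms"),
    quote("ovarian cancer"),
]

query = or_block(gene_terms) + " AND " + or_block(topic_terms)


def esearch_url(term, retmax=20):
    """Build (but do not send) an ESearch request for PubMed."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": "pubmed", "term": term, "retmax": retmax})


print(query)
print(esearch_url(query))
```

The printed query can be pasted straight into the PubMed search box. To search for a different gene or topic, change only the two lists.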