Table 6.

Text phrases.

Text phrase | Explanation
“Annotation Information” | For genomes that NCBI annotates, Gene represents information about the annotation of each current GeneID. Text phrases will be attached to the gene data if the gene is not annotated well, or if annotation has changed in a complex way. Text phrases will also be attached if there is no defining cDNA or genomic sequence for the gene, or if the GeneID was created after the most recent genome annotation. The goal is to facilitate retrieval of Gene records where the annotation on the RefSeq genomic records, if it exists, should be interpreted with caution. Records that are not known to have annotation issues can be retrieved by including the following in the query:
NOT “Annotation Information” [Text]
Specific sub-categories of annotation information are described below.
Sub-categories of annotation information
“partial on reference assembly” | The annotated gene, as suggested by the defining cDNA, is not complete.
“spans an assembly gap” | There is a gap in the reference assembly where the defining cDNA should align.
“suggests misassembly” | There are order/orientation issues in the reference assembly suggested by the cDNA alignment.
“not annotated on reference assembly” | This gene is not annotated on the reference assembly.
“not in current annotation release” | This gene is not annotated on any assembly of the current annotation release.
“only annotated on alternate loci in reference assembly” | This gene is only annotated on one or more alternate locus assembly-units of the reference assembly.
“only annotated on patches unit in reference assembly” | This gene is only annotated on the PATCHES assembly-unit of the reference assembly.
“only annotated on alternate loci and patches unit in reference assembly” | This gene is only annotated on one or more alternate locus assembly-units and the PATCHES assembly-unit of the reference assembly.
Other text phrases
“Orthologs from Annotation Pipeline” | The Homology section for many genes features a link for “Orthologs from Annotation Pipeline.” This dataset is computed as part of NCBI's Eukaryotic Genome Annotation Pipeline using a combination of protein sequence similarity and local synteny information. The pipeline determines orthology between the genome assembly that is being annotated and a reference genome, typically human. The collection of pairwise orthology calls is then tracked as a group, which may be further supplemented by manual curation. This process provides ortholog information more quickly for newly annotated genomes, and supplements the content available in HomoloGene.
“involved in immune response or antiviral activity” | Related to COVID-19
“involved in cytokine storm inflammatory response” | Related to COVID-19
“involved in SARS-CoV-2 infection” | Related to COVID-19
“relevant for COVID-19 prognosis” | Related to COVID-19
“relevant for COVID-19 treatment” | Related to COVID-19
“involved in host gene regulation” | Related to COVID-19
“involved in host gene recombination” | Related to COVID-19
“relevant for disease process” | Related to COVID-19

Please note that the double quotes are included in the text phrases shown here because they are mandatory when performing a text phrase search.
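
As an illustration of how the quoted phrases in this table might be combined into a Gene query, the following Python sketch builds query strings without sending any request. The helper names (text_phrase, exclude_annotation_issues, esearch_url) and the example base query are hypothetical and not part of Gene Help. The [Text] qualifier and the NOT “Annotation Information” [Text] form are taken from the table. The E-utilities ESearch address and its db, term, and retmax parameters are assumed to be the appropriate programmatic entry point.

```python
from urllib.parse import urlencode

# Assumption: the public E-utilities ESearch endpoint is used for Gene searches.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def text_phrase(phrase: str) -> str:
    """Wrap a phrase in straight double quotes and append the [Text] qualifier.

    Typographic quotes copied from the table are replaced, because the
    search requires plain double quotes around the phrase.
    """
    cleaned = phrase.strip().strip('"“”')
    return f'"{cleaned}" [Text]'


def exclude_annotation_issues(base_query: str) -> str:
    """Restrict a query to records not flagged with "Annotation Information"."""
    return f'({base_query}) NOT {text_phrase("Annotation Information")}'


def esearch_url(query: str, retmax: int = 20) -> str:
    """Build (but do not send) an ESearch URL for the Gene database."""
    params = {"db": "gene", "term": query, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical base query; any valid Gene query could be used here.
    base = '"Homo sapiens"[Organism]'
    print(exclude_annotation_issues(base))
    print(esearch_url(text_phrase("“spans an assembly gap”")))
```

The same query strings can be pasted into the Gene search box directly. The URL helper only matters when the search is scripted.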

From: Gene Help: Integrated Access to Genes of Genomes in the Reference Sequence Collection

Gene Help [Internet].

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.