UniGene: An Organized View of the Transcriptome.
Each UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.


Species UniGene Entries
Chordata
Mammalia
Bos taurus (cow) 42,843
Canis lupus familiaris (dog) 27,853
Equus caballus (horse) 8,348
Homo sapiens (human) 123,396
Macaca fascicularis (crab-eating macaque) 12,404
Macaca mulatta (rhesus monkey) 15,359
Monodelphis domestica (gray short-tailed opossum) 359
Mus musculus (mouse) 78,289
Ornithorhynchus anatinus (platypus) 1,831
Oryctolagus cuniculus (rabbit) 6,576
Ovis aries (sheep) 18,814
Papio anubis (olive baboon) 11,904
Pongo abelii (Sumatran orangutan) 6,996
Rattus norvegicus (Norway rat) 63,434
Sus scrofa (pig) 51,576
Trichosurus vulpecula (silver-gray brushtail possum) 11,771
Actinopterygii
Danio rerio (zebrafish) 51,481
Fundulus heteroclitus (killifish) 4,618
Gadus morhua (Atlantic cod) 15,382
Gasterosteus aculeatus (three-spined stickleback) 18,938
Ictalurus punctatus (channel catfish) 24,932
Oncorhynchus mykiss (rainbow trout) 26,861
Oryzias latipes (Japanese medaka) 22,239
Pimephales promelas (fathead minnow) 21,592
Salmo salar (Atlantic salmon) 33,647
Takifugu rubripes (pufferfish) 3,809
Amniota
Anolis carolinensis (green anole) 26,540
Amphibia
Xenopus laevis (African clawed frog) 34,963
Xenopus tropicalis (western clawed frog) 42,282
Ascidiacea
Ciona intestinalis 30,774
Ciona savignyi 7,639
Molgula tectiformis 8,526
Aves
Gallus gallus (chicken) 33,383
Meleagris gallopavo (turkey) 1,285
Taeniopygia guttata (zebra finch) 14,432
Cephalochordata
Branchiostoma floridae (Florida lancelet) 14,487
Hyperoartia
Petromyzon marinus (sea lamprey) 11,083
Echinodermata
Echinoidea
Paracentrotus lividus (common urchin) 8,684
Strongylocentrotus purpuratus (purple sea urchin) 19,639
Arthropoda
Branchiopoda
Daphnia pulex (common water flea) 14,190
Insecta
Acyrthosiphon pisum (pea aphid) 13,918
Aedes aegypti (yellow fever mosquito) 19,345
Anopheles gambiae (African malaria mosquito) 21,387
Apis mellifera (honey bee) 9,758
Bombyx mori (domestic silkworm) 11,574
Culex pipiens (house mosquito) 4,957
Drosophila melanogaster (fruit fly) 17,331
Ixodes scapularis (black-legged tick) 18,275
Nasonia vitripennis (jewel wasp) 15,112
Tribolium castaneum (red flour beetle) 9,053
Malacostraca
Litopenaeus vannamei (Pacific white shrimp) 7,968
Nematoda
Chromadorea
Ancylostoma caninum (dog hookworm) 7,394
Caenorhabditis elegans (nematode) 22,102
Platyhelminthes
Trematoda
Schistosoma japonicum 9,357
Schistosoma mansoni 10,219
Turbellaria
Schmidtea mediterranea 10,183
Mollusca
Gastropoda
Aplysia californica (California sea hare) 24,994
Lottia gigantea 15,602
Cnidaria
Anthozoa
Nematostella vectensis (starlet sea anemone) 19,167
Hydrozoa
Hydra magnipapillata 10,040
Streptophyta
Bryopsida
Physcomitrella patens 18,870
Coniferopsida
Picea glauca (white spruce) 22,472
Picea sitchensis (Sitka spruce) 18,838
Pinus taeda (loblolly pine) 18,921
Eudicotyledons
Aquilegia formosa x Aquilegia pubescens 8,046
Arabidopsis thaliana (thale cress) 30,579
Artemisia annua (sweet wormwood) 9,462
Brassica napus (rape) 26,733
Brassica oleracea 5,617
Brassica rapa (field mustard) 14,497
Capsicum annuum 8,868
Citrus clementina 9,123
Citrus sinensis (Valencia orange) 15,808
Glycine max (soybean) 33,001
Gossypium hirsutum (upland cotton) 21,738
Gossypium raimondii 3,297
Helianthus annuus (sunflower) 12,216
Lactuca sativa (garden lettuce) 7,940
Lotus japonicus 14,493
Malus x domestica (apple) 23,731
Medicago truncatula (barrel medic) 18,098
Nicotiana tabacum (tobacco) 24,069
Populus tremula x Populus tremuloides (hybrid aspen) 9,652
Populus trichocarpa (western balsam poplar) 14,965
Prunus persica (peach) 7,620
Raphanus raphanistrum (wild radish) 18,788
Raphanus sativus (radish) 17,649
Solanum lycopersicum (tomato) 18,228
Solanum tuberosum (potato) 18,784
Theobroma cacao 24,958
Vigna unguiculata (cowpea) 15,740
Vitis vinifera (wine grape) 22,083
Isoetopsida
Selaginella moellendorffii 8,810
Liliopsida
Hordeum vulgare (barley) 23,595
Oryza sativa (rice) 40,978
Panicum virgatum (switchgrass) 20,973
Saccharum officinarum (sugarcane) 15,594
Sorghum bicolor (sorghum) 13,899
Triticum aestivum (wheat) 40,349
Zea mays (maize) 97,123
Chlorophyta
Chlorophyceae
Chlamydomonas reinhardtii 11,310
Volvox carteri 5,638
Dictyosteliida
Dictyostelium
Dictyostelium discoideum (slime mold) 5,957
Apicomplexa
Coccidia
Toxoplasma gondii 6,623
Ascomycota
Eurotiomycetes
Coccidioides posadasii 7,350
Sordariomycetes
Gibberella moniliformis 5,256
Magnaporthe grisea 13,032
Neurospora crassa 10,180
Basidiomycota
Heterobasidiomycetes
Filobasidiella neoformans 5,021
Oomycetes
Peronosporales
Phytophthora infestans (potato late blight agent) 7,161
Bacillariophyta
Bacillariophyceae
Phaeodactylum tricornutum 8,470
Ciliophora
Oligohymenophorea
Paramecium tetraurelia 14,074
Tetrahymena thermophila 5,971
Annelida
Polychaeta
Alvinella pompejana 14,155

In addition to sequences of well-characterized genes, hundreds of thousands of novel expressed sequence tag (EST) sequences have been included. Consequently, the collection may be of use to the community as a resource for gene discovery. UniGene has also been used by experimentalists to select reagents for gene mapping projects and large-scale expression analysis.

Note, however, that the procedures for automated sequence clustering are still under development, and the results may change from time to time as improvements are made. Feedback from users has been especially useful in identifying problems, and we encourage you to report any that you encounter.

It should also be noted that no attempt has been made to produce contigs or consensus sequences. There are several reasons why the sequences of a set may not actually form a single contig. For example, all of the splicing variants for a gene are put into the same set. Moreover, EST-containing sets often contain 5' and 3' reads from the same cDNA clone, but these sequences do not always overlap.
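
The distinction between a set and a contig can be illustrated with a small sketch. This is not the UniGene build procedure; it is a minimal example that assumes sequences are grouped by two kinds of pairwise links, sequence overlap and shared cDNA clone. The sequence names, the link labels and the function `group_sequences` are all hypothetical.

```python
from collections import defaultdict


def group_sequences(seq_ids, links):
    """Group sequence IDs into sets connected by any of the given links.

    `links` is a list of (id_a, id_b, kind) tuples. This is a plain
    union-find, used here only for illustration (hypothetical helper).
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _kind in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for s in seq_ids:
        groups[find(s)].add(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: 5' and 3' reads from one clone that do not overlap
# each other, plus a second EST that overlaps only the 5' read.
seqs = ["EST_A_5p", "EST_A_3p", "EST_B"]
links = [
    ("EST_A_5p", "EST_A_3p", "same_clone"),
    ("EST_A_5p", "EST_B", "overlap"),
]

# Using every link, all three sequences fall into one set.
full_sets = group_sequences(seqs, links)

# Using overlap links only, the 3' read is left on its own, so the set
# above could not be assembled into a single contig.
overlap_only = group_sequences(seqs, [l for l in links if l[2] == "overlap"])

print(full_sets)     # [['EST_A_3p', 'EST_A_5p', 'EST_B']]
print(overlap_only)  # [['EST_A_3p'], ['EST_A_5p', 'EST_B']]
```

In this toy case the clone link is the only evidence placing the 3' read in the set, which is one reason a set of sequences need not correspond to one contiguous assembly.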

Currently, sequences from the following animals have been processed: human, rat, mouse, cow, zebrafish, clawed frog, fruit fly, and mosquito. The plants processed are wheat, rice, barley, maize, and cress. These species were chosen because they have the greatest amounts of EST data available and represent a variety of organisms. Additional organisms may be added in the future.

A representation of the UniGene datasets is available by FTP.

Descriptions of the UniGene transcript-based and genome-based build procedures are available.
UniGene References

Pontius JU, Wagner L, Schuler GD. UniGene: a unified view of the transcriptome. In: The NCBI Handbook. Bethesda (MD): National Center for Biotechnology Information; 2003.


Wheeler DL, et al. Database Resources of the National Center for Biotechnology. Nucl Acids Res 31:28-33; 2003.


Schuler GD. Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J Mol Med 75:694-698; 1997.


Schuler GD, et al. A gene map of the human genome. Science 274:540-546; 1996.


Boguski MS, Schuler GD. ESTablishing a human transcript map. Nature Genetics 10:369-371; 1995.
