datasets download genome - Download a genome data package
datasets download genome [flags]
Download a genome data package. Genome data packages may include genome, transcript and protein sequences, annotation and one or more data reports. Data packages are downloaded as a zip archive.
The default genome data package includes the following files:
datasets download genome accession GCF_000001405.40 --chromosomes X,Y --include genome,gff3,rna
datasets download genome taxon "bos taurus" --dehydrated
datasets download genome taxon human --assembly-level chromosome,complete --dehydrated
datasets download genome taxon mouse --search C57BL/6J --search "Broad Institute" --dehydrated
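The dehydrated examples above fetch only the data report and the file locations; the sequence files are retrieved in a later step. A minimal sketch of that two-step workflow follows. The `run` helper and the `DRY_RUN` variable are illustrative and not part of the datasets CLI, the file and directory names are placeholders, and the `--directory` flag of the rehydrate command is an assumption not documented on this page (check `datasets rehydrate --help`). By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Illustrative dry-run wrapper (not part of the datasets CLI):
# prints each command unless DRY_RUN=0, in which case it executes it.
# Note: the printed form does not preserve shell quoting.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Download the dehydrated package (data report plus file locations only).
run datasets download genome taxon "bos taurus" --dehydrated --filename bos_taurus.zip

# 2. Unpack the archive into a working directory (placeholder name).
run unzip bos_taurus.zip -d bos_taurus

# 3. Retrieve the data files listed in the package.
#    Assumption: rehydrate accepts --directory pointing at the unpacked archive.
run datasets rehydrate --directory bos_taurus
```

Splitting the download this way keeps the first request small, which can help when a taxon query matches many assemblies.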
--annotated Limit to annotated genomes
--api-key string Specify an NCBI API key
--assembly-level string Limit to genomes at one or more assembly levels (comma-separated):
* chromosome
* complete
* contig
* scaffold
(default "[]")
--assembly-source string Limit to 'RefSeq' (GCF_) or 'GenBank' (GCA_) genomes (default "all")
--assembly-version string Limit to 'latest' assembly accession version or include 'all' (latest + previous versions)
--chromosomes strings Limit to a specified, comma-delimited list of chromosomes, or 'all' for all chromosomes
--debug Emit debugging info
--dehydrated Download a dehydrated zip archive including the data report and locations of data files (use the rehydrate command to retrieve data files).
--exclude-atypical Exclude atypical assemblies
--exclude-multi-isolate Exclude assemblies from multi-isolate projects
--filename string Specify a custom file name for the downloaded data package (default "ncbi_dataset.zip")
--from-type Only return records with type material
--help Print detailed help about a datasets command
--include string(,string) Specify the data files to include (comma-separated).
* genome: genomic sequence
* rna: transcript
* protein: amino acid sequences
* cds: nucleotide coding sequences
* gff3: general feature file
* gtf: gene transfer format
* gbff: GenBank flat file
* seq-report: sequence report file
* none: do not retrieve any sequence files
(default [genome])
--mag string Limit to metagenome assembled genomes (only) or remove them from the results (exclude) (default "all")
--no-progressbar Hide progress bar
--preview Show information about the requested data package
--reference Limit to reference genomes
--released-after string Limit to genomes released on or after a specified date (input format is flexible, YYYY/MM/DD is suggested)
--released-before string Limit to genomes released on or before a specified date (input format is flexible, YYYY/MM/DD is suggested)
--search strings Limit results to genomes with specified text in the searchable fields:
species and infraspecies, assembly name and submitter.
To search multiple strings, use the flag multiple times.
--version Print version of datasets
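Several of the flags listed above can be combined in one request, and --preview can be used to inspect the result before downloading anything. The sketch below reuses one set of filters for both calls. The `run` helper, the `DRY_RUN` variable and the output file name are illustrative assumptions; only flags documented on this page are used. By default the commands are printed rather than executed; set `DRY_RUN=0` to run them.

```shell
#!/bin/sh
# Illustrative dry-run wrapper (not part of the datasets CLI):
# prints each command unless DRY_RUN=0, in which case it executes it.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Shared filters: RefSeq reference genomes for human, with genome,
# protein and GFF3 files included in the package.
set -- taxon human --reference --assembly-source RefSeq --include genome,protein,gff3

# First show information about the package that would be downloaded.
run datasets download genome "$@" --preview

# Then download it under a custom file name (placeholder).
run datasets download genome "$@" --filename human_reference.zip
```

Keeping the filters in one place means the preview and the download are guaranteed to describe the same request.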
datasets download genome accession - Download a genome data package by Assembly or BioProject accession
datasets download genome taxon - Download a genome data package by taxon (NCBI Taxonomy ID, scientific or common name at any tax rank)