BankIt Submission Help: Feature Table File

BankIt accepts features as a five-column, tab-delimited table file. The feature table specifies the location and type of each feature, and BankIt processes the feature intervals and translates any CDS features into proteins.

The feature table format allows different kinds of features (e.g., gene, mRNA, coding region, tRNA) and qualifiers (e.g., /product, /note) to be annotated. The valid features and qualifiers are restricted to those approved by the International Nucleotide Sequence Database Collaboration.

Preparing the Feature Table File

The first line of the feature table contains the following basic information:

>Feature Sequence_ID

The sequence identifier (Sequence_ID) must match the label used to identify each table's corresponding sequence in the nucleotide FASTA file.
Subsequent lines of the table list the features.
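
The matching rule above can be checked before submission. The following is a minimal sketch, not part of BankIt itself: the function names are hypothetical, and it assumes the FASTA label is the first whitespace-delimited token after ">" on each definition line.

```python
def feature_table_ids(table_text):
    """Return the Sequence_IDs named on '>Feature' header lines."""
    ids = []
    for line in table_text.splitlines():
        if line.startswith(">Feature"):
            parts = line.split()
            if len(parts) >= 2:
                ids.append(parts[1])
    return ids


def fasta_ids(fasta_text):
    """Return the first token after '>' on each FASTA definition line.

    Assumption: that token is the label used as the Sequence_ID.
    """
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids


def unmatched_ids(table_text, fasta_text):
    """List table Sequence_IDs that have no matching FASTA record."""
    known = set(fasta_ids(fasta_text))
    return [seq_id for seq_id in feature_table_ids(table_text)
            if seq_id not in known]
```

An empty list from unmatched_ids suggests every table header has a corresponding sequence; any IDs it returns should be corrected in one file or the other.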

Prepare the feature table file in a text editor and save it as plain ASCII text (not .rtf or .doc).

Format for a feature table, as shown in the examples below:

Line 1:
Column 1: Start location (first nucleotide) of a feature
Column 2: Stop location (last nucleotide) of a feature
Column 3: Feature name (for example, 'CDS' or 'mRNA' or 'rRNA' or 'gene' or 'exon')

Line 2:
Column 4: Qualifier name (for example, 'product' or 'number' or 'gene' or 'note')
Column 5: Qualifier value

Note in the examples below that 'gene' is both a Feature and a Qualifier and must be entered in two separate columns.
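
The column layout described above can be sketched as a small formatter. This is an illustration under stated assumptions, not an official tool: the function name feature_lines is hypothetical, and it assumes qualifier lines leave columns 1 to 3 empty (three leading tabs) so that the qualifier name lands in column 4.

```python
def feature_lines(intervals, feature, qualifiers):
    """Format one feature as tab-delimited table lines.

    intervals  -- list of (start, stop) pairs; strings such as '<1' are allowed
    feature    -- feature name for column 3, for example 'CDS'
    qualifiers -- list of (name, value) pairs for columns 4 and 5
    """
    lines = []
    for i, (start, stop) in enumerate(intervals):
        if i == 0:
            # First interval carries the feature name in column 3.
            lines.append(f"{start}\t{stop}\t{feature}")
        else:
            # Additional intervals use only columns 1 and 2.
            lines.append(f"{start}\t{stop}")
    for name, value in qualifiers:
        # Columns 1-3 stay empty; name and value go in columns 4 and 5.
        lines.append(f"\t\t\t{name}\t{value}")
    return lines


# 'gene' appears once as a feature (column 3) and once as a qualifier (column 4).
print("\n".join(feature_lines([("<1", ">1050")], "gene", [("gene", "ATH1")])))
```

Writing the tabs from code rather than typing them avoids the common problem of a text editor silently replacing tabs with spaces.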

The examples below show sample tables and illustrate a number of points about the table format.

>Feature Seq1
<1    >1050    gene
                        gene          ATH1
<1    1009    CDS
                        product       acid trehalase
                        product       Athlp
                        codon_start   2
<1    >1050    mRNA
                        product       acid trehalase

>Feature Seq2
2626  2590    tRNA
2570  2535
                        product       tRNA-Phe

>Feature Seq3
1080  1210  CDS
1275  1315
                        product       actin
                        note          alternatively spliced
1055  1210  mRNA
1275  1340
                        product       actin
1055  1340  gene
                        gene          ACT
1055  1079  5'UTR
1316  1340  3'UTR
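
To make the structure of the examples above concrete, here is a minimal sketch of a reader for tables like these. It is not BankIt's own parser: the function names are hypothetical, the input is assumed to be genuinely tab-delimited, and the strand helper assumes that a start greater than the stop (as in Seq2) marks a feature on the complementary strand and that a leading "<" or ">" marks a partial end.

```python
def parse_feature_table(text):
    """Parse tab-delimited feature table text into {Sequence_ID: [feature, ...]}."""
    tables = {}
    features = None   # feature list for the current '>Feature' block
    current = None    # feature that interval/qualifier lines attach to
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if raw.startswith(">Feature"):
            seq_id = raw.split()[1]
            features = tables.setdefault(seq_id, [])
            current = None
            continue
        if features is None:
            raise ValueError("feature line found before any '>Feature' header")
        cols = raw.split("\t")
        cols += [""] * (5 - len(cols))  # pad short lines to five columns
        if cols[0] and cols[1]:
            interval = (cols[0], cols[1])
            if cols[2]:
                # Columns 1-3 filled: a new feature starts here.
                current = {"type": cols[2], "intervals": [interval],
                           "qualifiers": []}
                features.append(current)
            elif current is not None:
                # Columns 1-2 only: another interval of the same feature.
                current["intervals"].append(interval)
            else:
                raise ValueError("interval line without a preceding feature")
        elif cols[3] and current is not None:
            # Columns 4-5: a qualifier for the current feature.
            current["qualifiers"].append((cols[3], cols[4]))
    return tables


def strand(interval):
    """Return '-' when start > stop, else '+', ignoring '<' and '>' markers."""
    start, stop = (int(pos.lstrip("<>")) for pos in interval)
    return "-" if start > stop else "+"
```

Applied to the Seq2 table, this yields one tRNA feature with two intervals, both reported as "-" by strand, and a single product qualifier.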