What is FASTA format?

The FASTA format is a text-based format used in bioinformatics and biochemistry to represent either nucleotide sequences or amino acid sequences. In this format, nucleotides or amino acids are represented using single-letter codes. The format allows for sequence names and comments to precede the sequences. The name FASTA comes from the FASTA software package, but it has now become a near-universal standard in the field of bioinformatics.

A sequence in FASTA format consists of:

  • One line starting with a ">" sign, followed by a sequence identification code. It is optionally followed by a textual description of the sequence.
  • The sequence itself, which may span multiple lines. It is recommended that each sequence line be no longer than 80 characters.
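
As an illustration of the layout described above, here is a minimal Python sketch that writes one record: a ">" header line followed by the sequence wrapped at 80 characters. The function name format_fasta and its parameters are hypothetical, chosen only for this example, and the 80-character width is the recommendation above rather than a hard rule.

```python
def format_fasta(identifier, sequence, description="", width=80):
    """Return one FASTA record as a string (hypothetical helper for illustration)."""
    # Header line: ">" + identifier, optionally followed by a description.
    header = f">{identifier} {description}".rstrip()
    # Remove any whitespace already present in the sequence, then
    # split it into lines of at most `width` characters.
    residues = "".join(sequence.split())
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join([header] + lines) + "\n"


if __name__ == "__main__":
    # A 100-residue sequence is written as one 80-character line and one 20-character line.
    print(format_fasta("seq1", "ACGT" * 25, "example nucleotide sequence"), end="")
```

Wrapping by slicing keeps the sketch dependency-free; the width can be changed for tools that expect a different line length.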

Because the FASTA format is so simple, sequences are easy to manipulate and parse with text-processing tools and scripting languages. There is no standard file extension for a text file containing FASTA-formatted sequences, but widely used extensions include .fasta, .fna, or simply .txt.
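
To show how little code such parsing takes, here is a minimal sketch of a FASTA reader in Python. The names parse_fasta and _split_header are hypothetical, and the sketch assumes that the identifier is the first whitespace-delimited token after ">" and that blank lines can be ignored. It is not a replacement for an established parser such as the one in Biopython.

```python
def _split_header(header):
    """Split a header (without the leading '>') into (identifier, description)."""
    parts = header.split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


def parse_fasta(lines):
    """Yield (identifier, description, sequence) from an iterable of text lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield _split_header(header) + ("".join(chunks),)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated, undoing the line wrapping.
            chunks.append(line)
    if header is not None:
        yield _split_header(header) + ("".join(chunks),)


if __name__ == "__main__":
    text = ">seq1 example protein\nMKV\nLLA\n\n>seq2\nACGT\n"
    for identifier, description, sequence in parse_fasta(text.splitlines()):
        print(identifier, description, sequence)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, so large files are read one line at a time rather than loaded into memory at once.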
