FASTA Format

A sequence in the FASTA format begins with a single-line description, followed by one or more lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. DNA sequences may be written in either upper case or lower case. Spaces, returns, and tabs in the sequence data are ignored. Any character outside the standard coding (A, T, C, and G), e.g., Z or 3, is treated as an unknown base; an unknown base may also be indicated explicitly with X. An example:

>ORFA00005
ATGTATGAAGTTTTAGTGGTTGTTTACTTGTTGGTTGCATTAGGTTTAATTGGCCTGATC
TTAATCCAGCAGGGTAAAGGAGCTGACATGGGGGCCTCATTTGGCGCCGGTGCATCAGGT
ACCTTATTTGGTTCAAGCGGTTCAGGTAACTTCCTGACACGCACAACGGCAATCCTAGCC
ATTGCGTTTTTTACCTTAAGTTTGCTAATTGGCAATTTAAGTGCAAACCACGCAAAAAAT
GAAGATGCATGGAAAAATTTAGGTTCAGACACTGAACAGGTTACCCAACCTGTTGAGCAA
GGAACCGAAAAGTCAGAAACAAAAATTCCTGAC
>ORFA00006
TTGTTTTTTCGGGGGTCAATTTTGGCAACATTAGAATCCAGACTGGCAGACATGCTCAAA
GTGCCTGTGGAAGCATTAGGCTTTCAACTTTGGGGTATTGAATATGTACAAGCCGGTAAA
CATTCCATACTGCGCGTGTTCATTGATGGTGAGAATGGCATCAATATCGAAGATTGTGCC
AACGTAAGTCGCCAAGTCAGTGCTGTGCTAGATGTTGAAGACCCTATTTCTACTGAATAT
ACCTTAGAGGTTTCTTCGCCTGGTGTAGATAGACCGCTGTTCACTGCTGAACAATACGCG
GCCTATGTCGGCGAGGATGTCAAACTTCAACTGACTATGCCTGTCGCGGGCAGTCGTAAT
TTAAAAGGCGCCATTACTCAGGTTGACGGCCAAATGCTGTCGGTGAATGTGAATGGTAAA
GAGCTGGTTGTCGCCTTGGATAATATCCGTAAAGGCAACATCATCGCAAAGTTT
>ORFA00007
GTGGAACGGCCTTTTCATTTGAAACCGCTTCAGCGACTAGCAGAATCTCTTTATTCATTT
TGTCTTGCCTCGTTCACCTTGAAACCATCAAAACTTTGCGATGATGTTGCCTTTACGGAT
ATTATCCAAGGCGACAACCAGCTCTTTACCATTCACATTCACCGACAGCATTTGGCCGTC
AACCTGAGTAATGGCGCCTTT
>ORF00007 acyl carrier protein (acpP)
ttgacggaagcgggcggctcgttgcaccccgttcagccttgcgcccccgcttctgcccgg
cgtgtacactgcgggcacttcagtttcaggaggaatttggtaatggcgacttttgatgac
gtgaaagatgtgattgtggacaagctcggtgtggacgaaggcaaggtgacccccgaagcc
cgcttcgtggaagacctcggcgccgacagcctggaaaccgtggaactgatcatgggcctg
gaagacaaattcggcgtgaccattcccgacgaagccgccgaaaccatccgcaccgtgcag
gccgcggtcgactacatcgacaacaaccag
>ORF00058 conserved hypothetical protein
atgtcagatatgaatgacgttgcccccccgaccttctgtcccgtgtaccgcgccatcggc
gtgttgcaggaaaaatgggtgctgcacatcgtccgcgccctgctggggagcgaaaaggga
ttcaacgagctggcccgcgccgtgggcggctgcaacagcgccaccctgacgcagcgcctg
gagagcctggaagacctgggcatcatcgtcaagcgcaccgaagacggcggcggcaagctc
gcccgcagcgtgtactcgctgacccctgccggacaggaactccagaccgtgattgacgcc
atcgacgcctgggcgcgcgcgcacctcagcgaatccgagccgacgcgctgcgtgggc
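
The rules above can be sketched as a small reader. This is a minimal illustration in Python, not the parser used by any particular program. The function names parse_fasta and normalize are hypothetical, and two choices are assumptions of the sketch rather than part of the format: sequences are converted to upper case, and every unknown base is reported as X.

```python
def normalize(raw):
    """Clean raw sequence text according to the rules described above."""
    bases = []
    for ch in raw:
        if ch.isspace():
            continue  # spaces, returns, and tabs are ignored
        ch = ch.upper()  # assumption: report everything in upper case
        # Anything other than A, T, C, G is an unknown base (reported as X).
        bases.append(ch if ch in "ATCG" else "X")
    return "".join(bases)


def parse_fasta(text):
    """Return a list of (description, sequence) pairs from FASTA text."""
    records = []
    description = None
    chunks = []
    for line in text.splitlines():
        if line.startswith(">"):
            # A ">" in the first column starts a new record.
            if description is not None:
                records.append((description, normalize("".join(chunks))))
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            chunks.append(line)  # sequence data for the current record
    if description is not None:
        records.append((description, normalize("".join(chunks))))
    return records


if __name__ == "__main__":
    sample = (
        ">ORFA00005\n"
        "ATGTAT GAAG\n"
        "ttz3x\n"
        ">ORF00007 acyl carrier protein (acpP)\n"
        "ttgacg\n"
    )
    for description, sequence in parse_fasta(sample):
        print(description, len(sequence), sequence)
```

Keeping the cleanup in a separate normalize step means the record-splitting logic depends only on the ">" marker, so the handling of unknown bases can be changed without touching it.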