A sequence in FASTA format begins with a single-line description, followed by one or more lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. The letters of a DNA sequence may be in either upper case or lower case. Any space, line break, or tab within the sequence data is ignored. Any character outside the standard code (A, T, C, and G), e.g., Z or 3, is treated as an unknown base; an unknown base may also be indicated explicitly with X. An example:
>ORFA00005
ATGTATGAAGTTTTAGTGGTTGTTTACTTGTTGGTTGCATTAGGTTTAATTGGCCTGATC
TTAATCCAGCAGGGTAAAGGAGCTGACATGGGGGCCTCATTTGGCGCCGGTGCATCAGGT
ACCTTATTTGGTTCAAGCGGTTCAGGTAACTTCCTGACACGCACAACGGCAATCCTAGCC
ATTGCGTTTTTTACCTTAAGTTTGCTAATTGGCAATTTAAGTGCAAACCACGCAAAAAAT
GAAGATGCATGGAAAAATTTAGGTTCAGACACTGAACAGGTTACCCAACCTGTTGAGCAA
GGAACCGAAAAGTCAGAAACAAAAATTCCTGAC
>ORFA00006
TTGTTTTTTCGGGGGTCAATTTTGGCAACATTAGAATCCAGACTGGCAGACATGCTCAAA
GTGCCTGTGGAAGCATTAGGCTTTCAACTTTGGGGTATTGAATATGTACAAGCCGGTAAA
CATTCCATACTGCGCGTGTTCATTGATGGTGAGAATGGCATCAATATCGAAGATTGTGCC
AACGTAAGTCGCCAAGTCAGTGCTGTGCTAGATGTTGAAGACCCTATTTCTACTGAATAT
ACCTTAGAGGTTTCTTCGCCTGGTGTAGATAGACCGCTGTTCACTGCTGAACAATACGCG
GCCTATGTCGGCGAGGATGTCAAACTTCAACTGACTATGCCTGTCGCGGGCAGTCGTAAT
TTAAAAGGCGCCATTACTCAGGTTGACGGCCAAATGCTGTCGGTGAATGTGAATGGTAAA
GAGCTGGTTGTCGCCTTGGATAATATCCGTAAAGGCAACATCATCGCAAAGTTT
>ORFA00007
GTGGAACGGCCTTTTCATTTGAAACCGCTTCAGCGACTAGCAGAATCTCTTTATTCATTT
TGTCTTGCCTCGTTCACCTTGAAACCATCAAAACTTTGCGATGATGTTGCCTTTACGGAT
ATTATCCAAGGCGACAACCAGCTCTTTACCATTCACATTCACCGACAGCATTTGGCCGTC
AACCTGAGTAATGGCGCCTTT
>ORF00007 acyl carrier protein (acpP)
ttgacggaagcgggcggctcgttgcaccccgttcagccttgcgcccccgcttctgcccgg
cgtgtacactgcgggcacttcagtttcaggaggaatttggtaatggcgacttttgatgac
gtgaaagatgtgattgtggacaagctcggtgtggacgaaggcaaggtgacccccgaagcc
cgcttcgtggaagacctcggcgccgacagcctggaaaccgtggaactgatcatgggcctg
gaagacaaattcggcgtgaccattcccgacgaagccgccgaaaccatccgcaccgtgcag
gccgcggtcgactacatcgacaacaaccag
>ORF00058 conserved hypothetical protein
atgtcagatatgaatgacgttgcccccccgaccttctgtcccgtgtaccgcgccatcggc
gtgttgcaggaaaaatgggtgctgcacatcgtccgcgccctgctggggagcgaaaaggga
ttcaacgagctggcccgcgccgtgggcggctgcaacagcgccaccctgacgcagcgcctg
gagagcctggaagacctgggcatcatcgtcaagcgcaccgaagacggcggcggcaagctc
gcccgcagcgtgtactcgctgacccctgccggacaggaactccagaccgtgattgacgcc
atcgacgcctgggcgcgcgcgcacctcagcgaatccgagccgacgcgctgcgtgggc
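
The reading rules described above can be illustrated with a short Python sketch. It is not the program's actual reader, only one plausible interpretation of the stated rules. The function name `parse_fasta` and the choice to normalize every unknown base to an upper-case `X` are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

STANDARD_BASES = frozenset("ACGT")
UNKNOWN_BASE = "X"  # assumption: every non-standard character is stored as X


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper: a ">" in the first column starts a new record,
    whitespace in sequence lines is skipped, case is folded to upper,
    and anything other than A, C, G, or T becomes an unknown base.
    """
    description = None
    bases = []
    for line in lines:
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                yield description, "".join(bases)
            description = line[1:].strip()
            bases = []
        elif description is not None:
            for ch in line:
                if ch.isspace():
                    continue  # spaces, tabs, and line breaks are ignored
                base = ch.upper()
                bases.append(base if base in STANDARD_BASES else UNKNOWN_BASE)
    if description is not None:
        yield description, "".join(bases)


# Small made-up input (not taken from the example above).
demo = [">seq1 demo record\n", "acg t\n", "NZ3x\n", ">seq2\n", "GG\n"]
records = list(parse_fasta(demo))
print(records)  # [('seq1 demo record', 'ACGTXXXX'), ('seq2', 'GG')]
```

Sequence text that appears before the first description line is skipped in this sketch; a real reader might report it as an error instead.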