Analyze alignments

This page is part of the GeneWarrior Documentation.

Alignment analysis

Consensus sequence

The consensus sequence consists of the most frequent letter (nucleotide or amino acid) in each column of the alignment. For protein alignments, a tie (two or more amino acids are equally frequent) yields an X (any amino acid) as consensus. For DNA alignments, a tie between two nucleotides yields the matching ambiguous nucleotide code; a tie among three or more equally frequent nucleotides yields an N (any nucleotide).
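The rules above can be sketched in a few lines of Python. This is an illustrative sketch, not GeneWarrior's actual implementation: the function name `consensus`, the `is_dna` flag, and the treatment of gap characters (counted like any other symbol, with any tie involving a gap falling back to N) are assumptions. Input sequences are assumed to be uppercase and of equal length.

```python
from collections import Counter

# IUPAC codes for ties between exactly two nucleotides
IUPAC_PAIRS = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def consensus(alignment, is_dna):
    """Return the consensus string for a list of aligned, equal-length sequences."""
    result = []
    for column in zip(*alignment):
        counts = Counter(column)
        top = max(counts.values())
        # All letters that share the highest count in this column
        winners = [letter for letter, n in counts.items() if n == top]
        if len(winners) == 1:
            result.append(winners[0])
        elif not is_dna:
            # Protein tie: any amino acid
            result.append("X")
        elif len(winners) == 2:
            # DNA two-way tie: ambiguity code (assumption: N if a gap is involved)
            result.append(IUPAC_PAIRS.get(frozenset(winners), "N"))
        else:
            # DNA tie among three or more letters: any nucleotide
            result.append("N")
    return "".join(result)
```

For example, `consensus(["AG", "GG"], is_dna=True)` has a two-way A/G tie in the first column and therefore produces the ambiguity code R at that position.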

Phylogenetic tree

Phylogenetic trees visualize the evolutionary relationship of the sequences in an alignment.
The MUSCLE alignment software and its -maketree option are used to generate a UPGMA tree.
Be aware that this approach gives only an approximate idea of the evolutionary relationships among the sequences and is not a substitute for a high-quality phylogenetic tree.
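To illustrate what UPGMA clustering does, here is a minimal Python sketch. It is not the MUSCLE implementation: the distance used here (the fraction of differing columns, or p-distance) is a simple stand-in chosen for readability, and the function names `p_distance` and `upgma` are hypothetical. The sketch returns only the tree topology as nested tuples, without branch lengths.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned columns in which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(names, seqs):
    """Cluster aligned sequences with UPGMA; return the topology as nested tuples."""
    # Each cluster id maps to (subtree, number of leaves in the subtree)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {
        frozenset((i, j)): p_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest distance
        i, j = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: dist[frozenset(pair)],
        )
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances (the UPGMA rule)
        for k in clusters:
            d_ik = dist[frozenset((i, k))]
            d_jk = dist[frozenset((j, k))]
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

Calling `upgma(["A", "B", "C"], ["AAAA", "AAAT", "TTTT"])` first joins A and B, the closest pair, and then attaches C to that cluster. Because UPGMA assumes a constant rate of evolution across all lineages, the resulting tree is only a rough guide.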

Sequence logo

Sequence logos visualize sequence conservation of alignments. Each column in the alignment is depicted as a stack. The total height of the stack symbolizes the conservation; the height of a single letter in the stack symbolizes the frequency of this letter. The y-axis represents the information content in bits.
The WebLogo 3 software is used to generate sequence logos.
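The stack and letter heights described above can be computed as follows. This is a simplified sketch under stated assumptions, not WebLogo's code: it uses the basic definition of information content (maximum entropy minus observed entropy) and omits the small-sample correction that WebLogo can apply. The function names and the `alphabet_size` parameter (4 for DNA, 20 for protein) are hypothetical, and gap characters are not treated specially.

```python
import math
from collections import Counter


def column_information(column, alphabet_size):
    """Information content of one alignment column in bits (stack height)."""
    counts = Counter(column)
    total = len(column)
    # Shannon entropy of the observed letter frequencies
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy minus observed entropy
    return math.log2(alphabet_size) - entropy


def letter_heights(column, alphabet_size):
    """Height of each letter: its frequency times the column's information."""
    info = column_information(column, alphabet_size)
    total = len(column)
    return {letter: (n / total) * info for letter, n in Counter(column).items()}
```

A fully conserved DNA column such as `"AAAA"` reaches the maximum of 2 bits, while a column with all four nucleotides equally frequent has an information content of 0 bits and therefore no visible stack.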

References

Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32(5):1792-1797

Crooks, G.E., Hon, G., Chandonia, J.M., Brenner, S.E. (2004) WebLogo: a sequence logo generator. Genome Res. 14:1188-1190
