This page is part of the GeneWarrior Documentation.

Sequence Alignment

See also What is an Alignment? and How to create an Alignment.

In a sequence alignment, two or more DNA or protein sequences are arranged so that similar characters (and similar regions) fall in the same columns. Gaps (marked as -) are inserted where needed to keep the remaining characters in their matching columns.
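The column layout described above can be illustrated with a minimal Python sketch. The helper name column_identity and the two example rows are hypothetical and are not part of GeneWarrior; the sketch only assumes that both rows already come from a finished alignment and therefore have the same length.

```python
def column_identity(row_a: str, row_b: str) -> float:
    """Fraction of alignment columns in which both rows hold the same residue.

    Columns that contain a gap ("-") never count as a match.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    if not row_a:
        raise ValueError("aligned rows must not be empty")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return matches / len(row_a)


# Two aligned rows: the gap in the second row keeps "TACA" in the same columns.
row_a = "GATTACA"
row_b = "GA-TACA"
print(row_a)
print(row_b)
print(f"identity: {column_identity(row_a, row_b):.2f}")
```

Without the gap, the trailing characters of the second row would shift one column to the left and most columns would no longer match.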
Pairwise alignment (alignment of exactly two sequences) uses the Needleman-Wunsch algorithm for full-length (global) alignments and the Smith-Waterman algorithm for local alignments.
Local alignments are useful if you are only interested in the region with the highest similarity between the two sequences. Global alignments try to align the entire sequences from beginning to end, with one additional rule: the start and end of a sequence may fall anywhere along the other sequence without penalty (cost-free end alignment, CFE). This makes it possible to align sequences that only overlap.
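The difference between the three pairwise modes can be sketched with a small dynamic-programming scorer. This is an illustration only, not GeneWarrior's implementation: the function name alignment_score, the linear gap penalty, and the match/mismatch/gap values are assumptions chosen for readability, and the sketch returns only the best score, not the alignment itself.

```python
def alignment_score(a, b, mode="global", match=1, mismatch=-1, gap=-2):
    """Best alignment score of a and b under a linear gap penalty.

    mode="global": Needleman-Wunsch, gaps at the ends are penalized
    mode="cfe":    global alignment with cost-free end gaps (overlap)
    mode="local":  Smith-Waterman, best-scoring sub-region only
    Scoring values are illustrative assumptions.
    """
    if mode not in ("global", "cfe", "local"):
        raise ValueError(f"unknown mode: {mode}")
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if mode == "global":
        # Leading gaps cost something only in the strict global mode.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best_local = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if mode == "local":
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best_local = max(best_local, cell)
            score[i][j] = cell
    if mode == "local":
        return best_local
    if mode == "cfe":
        # Trailing gaps are free: take the best cell in the last row or column.
        return max(max(score[-1]), max(row[-1] for row in score))
    return score[-1][-1]


# The end of the first sequence overlaps the start of the second ("ACGT").
left, right = "TTTTACGT", "ACGTGGGG"
for mode in ("global", "cfe", "local"):
    print(mode, alignment_score(left, right, mode))
```

With these two overlapping sequences, the cost-free end mode can score the shared ACGT without paying for the unaligned ends, whereas the strict global mode has to pay for every end gap or mismatch.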
Multiple sequence alignments (MSA) are more complex, since more than two sequences are aligned at once. GeneWarrior uses MUSCLE, one of the most popular MSA programs, with its fastest speed settings.

Used Parameters

Protein alignment

Needleman-Wunsch and Smith-Waterman alignments use the BLOSUM62 substitution matrix for aligning proteins. See the BLAST site for details of the scores used.
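To show what a substitution matrix contributes, the following Python sketch scores an ungapped pair of aligned protein rows with a handful of BLOSUM62 entries. Only a small subset of the matrix is included, the names BLOSUM62_SUBSET, pair_score and ungapped_score are hypothetical, and gap penalties are left out because this page does not state them.

```python
# A few BLOSUM62 entries; the matrix is symmetric, so each pair is stored once.
# The full matrix covers every pair of amino acids.
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("L", "L"): 4, ("I", "I"): 4, ("I", "L"): 2,
    ("K", "K"): 5, ("R", "R"): 5, ("K", "R"): 2,
    ("D", "D"): 6, ("E", "E"): 5, ("D", "E"): 2,
    ("W", "W"): 11,
}


def pair_score(x, y, matrix=BLOSUM62_SUBSET):
    """Look up the substitution score for residues x and y in either order."""
    if (x, y) in matrix:
        return matrix[(x, y)]
    return matrix[(y, x)]  # raises KeyError for pairs outside the subset


def ungapped_score(row_a, row_b, matrix=BLOSUM62_SUBSET):
    """Sum of substitution scores over the columns of two gap-free rows."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same length")
    return sum(pair_score(x, y, matrix) for x, y in zip(row_a, row_b))


# K/R and L/I are conservative substitutions and still score positively.
print(ungapped_score("KLW", "RIW"))
```

Unlike a plain match/mismatch scheme, the matrix rewards chemically similar substitutions and gives rare residues such as tryptophan (W) a higher score for an exact match.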

Multiple sequence alignment with MUSCLE uses the "fastest" options, -maxiters 1 -diags -sv -distance1 kbit20_3, as described in the MUSCLE documentation.

DNA alignment

Needleman-Wunsch and Smith-Waterman alignments use the BLAST substitution matrix for aligning nucleotides. See the BLAST site for details of the scores used.

Multiple sequence alignment with MUSCLE uses the "fastest" options, -maxiters 1 -diags, as described in the MUSCLE documentation.
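For readers who want to reproduce a comparable run locally, the sketch below assembles a MUSCLE command line with the options listed on this page. It is not how GeneWarrior itself invokes MUSCLE. The sketch assumes a MUSCLE 3.x executable named muscle on the PATH and uses that version's -in and -out flags; the file names and the helper names muscle_command and run_muscle are hypothetical.

```python
import subprocess

# "Fastest" options as listed on this page, per sequence type.
FAST_OPTIONS = {
    "protein": ["-maxiters", "1", "-diags", "-sv", "-distance1", "kbit20_3"],
    "dna": ["-maxiters", "1", "-diags"],
}


def muscle_command(in_fasta, out_fasta, seq_type, executable="muscle"):
    """Build the argument list for a MUSCLE 3.x run (assumed executable name)."""
    if seq_type not in FAST_OPTIONS:
        raise ValueError(f"seq_type must be one of {sorted(FAST_OPTIONS)}")
    return [executable, "-in", in_fasta, "-out", out_fasta, *FAST_OPTIONS[seq_type]]


def run_muscle(in_fasta, out_fasta, seq_type):
    """Run MUSCLE and raise CalledProcessError if it exits with an error."""
    subprocess.run(muscle_command(in_fasta, out_fasta, seq_type), check=True)


# Show the command without executing it; the file names are placeholders.
print(" ".join(muscle_command("sequences.fasta", "aligned.fasta", "protein")))
```

Passing the arguments as a list instead of a single shell string avoids quoting problems with file names that contain spaces.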

Reference

Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32(5):1792-1797.
