Remove problem sequences from an alignment
Using omit_bad_seqs, we can eliminate sequences from an Alignment based on their gap fraction and/or the number of gaps they uniquely introduce.
Let’s create a sample alignment with some gaps.
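The snippet below is a minimal sketch of such a sample alignment. The sequence names and contents are hypothetical, chosen so that some sequences are gap-rich and s6 alone has residues in the first two columns. The data is kept as a plain dict so the block runs without cogent3 installed. With cogent3 available, a dict like this can usually be turned into an alignment with make_aligned_seqs(data, moltype="dna"), though the exact signature may vary between versions.

```python
# Hypothetical sample data: six aligned DNA sequences of equal length.
data = {
    "s1": "---ACC---TT-",
    "s2": "---ACC---TT-",
    "s3": "---ACC---TT-",
    "s4": "--AACCG-GTT-",
    "s5": "--AACCGGGTTT",
    "s6": "AGAACCGGGTT-",
}

# Every row of an alignment must have the same number of columns.
lengths = {len(seq) for seq in data.values()}
assert len(lengths) == 1, "sequences are not aligned"
aln_length = lengths.pop()
print(f"{len(data)} sequences x {aln_length} columns")
```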
Removing sequences with more than X% gaps
Creating the omit_bad_seqs app with the argument gap_fraction=0.5 will omit sequences that contain 50% or more gaps.
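To make the criterion concrete, here is a pure-Python sketch of a gap-fraction filter. It is not the cogent3 implementation. The helper gap_fraction and the sample data are hypothetical, and it assumes "-" is the only gap character. In cogent3 itself the app is typically created with something like get_app("omit_bad_seqs", gap_fraction=0.5) and then called on the alignment.

```python
def gap_fraction(seq, gap="-"):
    """Return the proportion of positions in seq that are gaps."""
    return seq.count(gap) / len(seq)


# Hypothetical sample data (aligned DNA sequences of equal length).
data = {
    "s1": "---ACC---TT-",
    "s2": "---ACC---TT-",
    "s3": "---ACC---TT-",
    "s4": "--AACCG-GTT-",
    "s5": "--AACCGGGTTT",
    "s6": "AGAACCGGGTT-",
}

threshold = 0.5
# Keep only sequences whose gap fraction is strictly below the threshold,
# so that sequences with 50% or more gaps are dropped.
kept = {name: seq for name, seq in data.items() if gap_fraction(seq) < threshold}

for name, seq in data.items():
    status = "kept" if name in kept else "omitted"
    print(f"{name}: gap fraction {gap_fraction(seq):.2f} ({status})")
```

Here s1, s2 and s3 each have 7 gaps in 12 columns (about 0.58), so they are dropped, while s4, s5 and s6 are retained.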
Removing sequences that contribute many gaps
The quantile=0.8 argument omits sequences that are ranked above the specified quantile with respect to the number of gaps uniquely introduced into the alignment. In the following example, sequence s6 is omitted, as it uniquely introduces gaps in the first two positions of the alignment.
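The idea can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the cogent3 implementation. A sequence is taken to "uniquely introduce" a gap in a column when it is the only sequence with a residue there. The cutoff uses a simple linear-interpolation quantile, and sequences strictly above the cutoff are dropped. The helpers unique_gap_counts and quantile, and the sample data, are hypothetical.

```python
def unique_gap_counts(data, gap="-"):
    """Count, per sequence, the columns where it alone has a residue."""
    names = list(data)
    counts = dict.fromkeys(names, 0)
    for column in zip(*data.values()):
        with_residue = [n for n, c in zip(names, column) if c != gap]
        if len(with_residue) == 1:
            counts[with_residue[0]] += 1
    return counts


def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


# Hypothetical sample data (aligned DNA sequences of equal length).
data = {
    "s1": "---ACC---TT-",
    "s2": "---ACC---TT-",
    "s3": "---ACC---TT-",
    "s4": "--AACCG-GTT-",
    "s5": "--AACCGGGTTT",
    "s6": "AGAACCGGGTT-",
}

counts = unique_gap_counts(data)
cutoff = quantile(counts.values(), 0.8)
# Drop sequences whose count lies strictly above the quantile cutoff.
kept = [name for name, count in counts.items() if count <= cutoff]

print("unique gap counts:", counts)
print("cutoff:", cutoff)
print("kept:", kept)
```

With this data, s6 is the only sequence with residues in the first two columns, giving it a count of 2. s5 has a count of 1 from the final column. The 0.8 quantile of the counts is 1, so only s6 is omitted.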