Below is a minimal working example; as you can see, the support values in the resulting tree are always 100.
I am using six synthetic sequences of 100 bp each. Each position was drawn at random from A, T, C and G with equal probability.
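For reference, sequences like these can be generated with a few lines of standard-library Python. This is only a sketch of the procedure, not the exact script I used: the helper name `random_fasta` and the seed are made up for illustration, and it will not reproduce the sequences listed below.

```python
import random


def random_fasta(n_seqs=6, length=100, seed=None):
    """Return FASTA text with n_seqs random DNA sequences named S1..Sn."""
    rng = random.Random(seed)
    records = []
    for i in range(1, n_seqs + 1):
        # each position is drawn uniformly from the four bases
        seq = "".join(rng.choice("ATCG") for _ in range(length))
        records.append(f">S{i}\n{seq}")
    return "\n".join(records) + "\n"


# hypothetical seed, only to make the sketch repeatable
fasta = random_fasta(seed=0)
```

The resulting string can be written to a file in the same way as the input shown next.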
The input FASTA file is:
>S1
CTCAGGCACTAGGAGCTTCTCCAGGGCAAAGTTGTCTACGAATATGCCGACTCAGAAGGTTATCAATACGGTTTACTTATCTGCACGCCATTTTCCTATG
>S2
TAACAAATGTTTCTCGCTCGAACCCCGTGGTGCGAGGGGTCACGATAAAAGGTCCCTCCTTCCACGGCATAATTGCTCCCTTTCTTTTCCGTGGGCAGGA
>S3
TTCCTAAAGGACGACTGGAAGCCGGGTACACGTCAACGGAGCTTCTAGCCCGAGGTCTATAGACCGGTTAATGAACTAGACTTAATGTGGAGCTCGTAGA
>S4
ACTTCGTGCCCAACCACTCTATGAGCAGAGTGCTCGAATGAACCTCAAAAGGATATCCGCATTTACTCTATATAACAAACGGCCGTCCCCCCATCTGTCA
>S5
TGACACTGGGTCATTTTACCCGCACTGATCGCTGGGCAGGTCGGCAATTTCGTCAGAATGCCGTGGCCGCACTGAAAATATCATTACCGGACTGAGTATG
>S6
ATAAGCCGAGGGGTAGCCCTCTATTTCGCACCGTATAGACGAAGTGATAAACTTTCTAACCACGTGTCGCCTATCTCTACCTAGCCACATTTGAGGTGCG
The following Python code builds the multiple sequence alignment with MUSCLE:
### alignment
from Bio.Align.Applications import MuscleCommandline
from Bio import AlignIO, SeqIO
import os

muscle_exe = "C:/Scripts/muscle3.exe"
path = os.getcwd()
# `fasta` is a string holding the FASTA text shown above
with open('tmp_in.fa', 'w') as fp:
    fp.write(fasta)
# replacing \ with /
inp_fp = path.replace('\\', '/') + '/tmp_in.fa'
out_fp = path.replace('\\', '/') + '/tmp_out.afa'
muscle_cline = MuscleCommandline(muscle_exe, input=inp_fp, out=out_fp)
# run MUSCLE; this writes the alignment to out_fp
stdout, stderr = muscle_cline()
The next block reads the alignment from out_fp and builds a tree with the UPGMA algorithm.
The number of bootstrap replicates is set to 50, and the consensus tree is built by majority rule.
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor
from Bio.Phylo.Consensus import bootstrap_consensus, majority_consensus

# read the multiple sequence alignment (msa)
with open(out_fp, 'r') as afa:
    alignment = AlignIO.read(afa, 'fasta')
calculator = DistanceCalculator('identity')
constructor = DistanceTreeConstructor(method='upgma', distance_calculator=calculator)
# build 50 bootstrap trees and their majority-rule consensus
consensus_tree = bootstrap_consensus(alignment=alignment, times=50, tree_constructor=constructor, consensus=majority_consensus)
print(consensus_tree)
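As I understand it, each bootstrap replicate is made by resampling the alignment columns with replacement, and the 'identity' distance is the fraction of aligned positions at which two sequences differ. The sketch below illustrates that idea in plain Python on a toy alignment. The function names and the toy data are hypothetical, and this is not Biopython's actual implementation (which may, for example, treat gaps differently).

```python
import random


def bootstrap_columns(seqs, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    length = len(next(iter(seqs.values())))
    # the same sampled column indices are applied to every sequence
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(s[c] for c in cols) for name, s in seqs.items()}


def identity_distance(a, b):
    """Fraction of aligned positions at which a and b differ."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


# toy alignment, invented for illustration
toy = {"A": "ACGTACGT", "B": "ACGTTCGA", "C": "TTGTACGA"}
rng = random.Random(1)
replicate = bootstrap_columns(toy, rng)
d_ab = identity_distance(toy["A"], toy["B"])
```

With random sequences like mine, I would expect the distances between pairs to shift from one replicate to the next, so the resulting trees should not all agree.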
The final consensus tree looks like this:
Tree(rooted=True)
    Clade()
        Clade(branch_length=0.350925925925926, name='S5')
        Clade(branch_length=0.05833333333333341, confidence=100.0)
            Clade(branch_length=0.05879629629629617, confidence=100.0)
                Clade(branch_length=0.26481481481481484, name='S3')
                Clade(branch_length=0.020370370370370372, confidence=100.0)
                    Clade(branch_length=0.24444444444444446, name='S1')
                    Clade(branch_length=0.24444444444444446, name='S4')
            Clade(branch_length=0.031018518518518404, confidence=100.0)
                Clade(branch_length=0.2925925925925926, name='S6')
                Clade(branch_length=0.2925925925925926, name='S2')
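For context, my understanding is that the support of a clade is the percentage of bootstrap trees that contain that clade. The sketch below shows the calculation I have in mind, with each tree reduced to a set of clades and each clade written as a frozenset of leaf names. The function name and the four replicate trees are invented for illustration; they are not output from my run.

```python
from collections import Counter


def clade_support(replicate_clades):
    """Percentage of replicates in which each clade appears."""
    counts = Counter()
    for clades in replicate_clades:
        # count each clade at most once per replicate
        counts.update(set(clades))
    n = len(replicate_clades)
    return {clade: 100.0 * c / n for clade, c in counts.items()}


# four made-up replicate trees, each reduced to two of its clades
replicates = [
    {frozenset({"S1", "S4"}), frozenset({"S2", "S6"})},
    {frozenset({"S1", "S4"}), frozenset({"S2", "S3"})},
    {frozenset({"S1", "S3"}), frozenset({"S2", "S6"})},
    {frozenset({"S1", "S4"}), frozenset({"S5", "S6"})},
]
support = clade_support(replicates)
```

Under this definition a value of 100 would mean the clade appears in every one of the 50 bootstrap trees, which seems unlikely for every clade when the sequences are random.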
My question is: why are all the support values in the tree equal to 100? I am sure there must be an error either on my side or in the package itself.