EnvNJ: Whole Genome Phylogenies Using Sequence Environments
Contains utilities for the analysis of protein sequences in a phylogenetic context.
Allows the generation of phylogenetic trees based on protein sequences in an alignment-independent way.
Two different methods have been implemented. One approach is based on the frequency analysis of n-grams,
previously described in Stuart et al. (2002) <doi:10.1093/bioinformatics/18.1.100>. The other approach is based on the species-specific neighborhood preference around amino acids. Features include the conversion of a protein set into a vector
reflecting these neighborhood preferences, the computation of pairwise distances (dissimilarities) between these vectors,
and the generation of trees based on the resulting distance matrices.
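
The n-gram approach described above can be illustrated with a minimal Python sketch. It is not the EnvNJ API and not the exact procedure of Stuart et al. (2002). The function names (ngram_profile, cosine_distance, distance_matrix), the choice of n = 2, and the use of cosine distance are all assumptions made for illustration. Each species is represented by the n-gram counts of its protein set, and a pairwise distance matrix is computed from those count vectors without any sequence alignment.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def ngram_profile(proteins, n=2):
    """Count overlapping n-grams over all protein sequences of one species."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse count vectors (illustrative metric)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty profile is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(species_to_proteins, n=2):
    """Build a symmetric distance matrix, stored as a dict keyed by species pairs."""
    profiles = {name: ngram_profile(seqs, n)
                for name, seqs in species_to_proteins.items()}
    names = sorted(profiles)
    dist = {(x, x): 0.0 for x in names}
    for x, y in combinations(names, 2):
        d = cosine_distance(profiles[x], profiles[y])
        dist[(x, y)] = dist[(y, x)] = d
    return names, dist


# Toy input: hypothetical species names and made-up peptide fragments.
toy = {"spA": ["MKVL"], "spB": ["MKVL"], "spC": ["GGGG"]}
names, dist = distance_matrix(toy)
```

A matrix of this kind is the input a distance-based tree-building method, such as neighbor joining, would consume. Real proteomes would call for longer n-grams and many more sequences per species than this toy input.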