7.1. What is molecular evolution?
Molecular evolution is the field of research concerned with why biological sequences look the way they do. Can we explain the distribution of genetic variation between biological sequences (RNA, DNA, and proteins) in terms of their evolutionary origins? Can we establish whether severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is showing signs of adaptive evolution to its new host: us? These questions are within the domain of molecular evolution.
Because nucleic acids are the information system of living things and, like proteins, can also be functional molecules in their own right, molecular evolution draws on genetics, molecular and cell biology, biochemistry, and chemistry. At a conceptual level, molecular evolution is closely related to population genetics and, as such, is very much focussed on the process of evolution as it manifests in biological sequences.
At a practical level, using genetic variation as an indicator of biological importance is central to much of modern biology. Researchers engaged in molecular studies of specific proteins or genes rely on resources provided by a genome portal (such as UCSC or Ensembl) to gain insight into what a molecule of interest might actually be doing. One type of information these portals present to facilitate understanding is a comparison of the biological sequences from different species. Why? We will answer that question in this topic.
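To make the idea of comparing sequences across species concrete, here is a minimal sketch that scores how variable each column of a small alignment is. The sequences and the function name `column_conservation` are made up for illustration, and the score used (the fraction of species sharing the most common base at a position) is only one simple choice; genome portals typically rely on more sophisticated, model-based measures.

```python
from collections import Counter

# Made-up, already-aligned sequences from four species (illustrative only).
aligned = {
    "human": "ACGTACGTAC",
    "chimp": "ACGTACGTAT",
    "mouse": "ACGAACGGAT",
    "rat": "ACGAACGGTT",
}


def column_conservation(seqs):
    """Fraction of sequences sharing the most common state at each aligned column."""
    scores = []
    for column in zip(*seqs):  # zip(*...) walks the alignment column by column
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


scores = column_conservation(list(aligned.values()))
print(scores)  # [1.0, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0, 0.5, 0.75, 0.75]
```

Columns with a score of 1.0 are identical in all four species; lower scores flag positions where the sequences vary.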
7.2. What is phylogenetics?
Phylogenetics is concerned with the problem of inferring the relationships between members of a collection of biological sequences. This relationship is displayed as a tree. As the methods employed to estimate relationships in phylogenetics are the same methods used to understand process in molecular evolution, phylogenetics is a sub-discipline (albeit highly specialised) of molecular evolution.
In the Phylogenetic Workflow figure I display the basic workflow for phylogenetic reconstruction. There is substantial overlap with molecular evolutionary analysis in general, but the emphasis here is on the problem of estimating a tree. The steps are:
Sample homologous sequences from the taxa of interest.
Align the sequences.
Choose a method for building the phylogenetic tree.
Pick a substitution model.
Estimate the phylogenetic tree.
Optionally, apply a technique for estimating the level of uncertainty in that tree.
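The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow of any particular package. The four sequences are made up and already aligned, standing in for the sampling and alignment steps. The tree-building method is UPGMA, a simple distance-based clustering method, and the substitution model is Jukes-Cantor (JC69). The names `aligned`, `p_distance`, `jc69_distance` and `upgma` are hypothetical helpers introduced for this example. Estimating uncertainty in the tree is left out.

```python
import math
from itertools import combinations

# Made-up, already-aligned sequences (stand-ins for sampling and alignment).
aligned = {
    "human": "ACGTACGTAC",
    "chimp": "ACGTACGTAT",
    "mouse": "ACGAACGGAT",
    "rat": "ACGAACGGTT",
}


def p_distance(s1, s2):
    """Proportion of gap-free aligned positions at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(s1, s2) if "-" not in (a, b)]
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(p):
    """Substitution model: Jukes-Cantor correction for multiple hits (needs p < 0.75)."""
    return -0.75 * math.log(1 - 4 * p / 3)


def upgma(dist):
    """Tree-building method: UPGMA clustering; returns the topology as a Newick string.

    `dist` maps frozenset({a, b}) to the distance between clusters a and b.
    """
    dist = dict(dist)  # work on a copy so the caller's distances are untouched
    sizes = {name: 1 for pair in dist for name in pair}
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)  # closest pair of clusters
        a, b = sorted(pair)
        merged = f"({a},{b})"
        total = sizes[a] + sizes[b]
        for c in sizes:
            if c in pair:
                continue
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            # Size-weighted average, so every original sequence counts equally.
            dist[frozenset((merged, c))] = (sizes[a] * d_ac + sizes[b] * d_bc) / total
        del dist[pair]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)) + ";"


# Estimate the tree: pairwise model-corrected distances, then clustering.
distances = {
    frozenset((x, y)): jc69_distance(p_distance(aligned[x], aligned[y]))
    for x, y in combinations(aligned, 2)
}
tree = upgma(distances)
print(tree)  # ((chimp,human),(mouse,rat));
```

The output is a tree topology only, without branch lengths. UPGMA was chosen here because it is short to write down; it assumes a constant rate of change across lineages, which real data often violate, so other methods may be preferable in practice.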