6.1. Outline

  • We open the topic by emphasising the role of biological sequences as the store of hereditary information.

  • We then introduce \(k\)-mers as a basic quantity for describing biological sequences.

  • Algorithms for counting \(k\)-mers are introduced.

  • The relationship between \(k\)-mers and functional motifs is developed, along with descriptions of the experiments used to identify such motifs.

  • We introduce Shannon’s entropy for measuring the information content of sequences and extend it to identifying binding motifs.

  • The odds ratio is introduced as a statistical method for identifying enrichment relative to a background distribution.

  • We proceed to algorithms for examining whether sequences descend from a common ancestor.

  • The brute-force dotplot algorithm for comparing related sequences is contrasted with an elegant dynamic programming algorithm.