Figure 17. Definitions of ortholog, paralog, and gene duplication. The solid object in this figure represents the phylogenetic relationships among three monophyletic taxa, A, B, and C; the blue and green lines within the object represent the phylogenetic relationships among the genes in a subset of the genomes of individuals from the three taxa. The most recent common ancestor of the three taxa had one copy of a gene, which was duplicated after taxon C diverged from the line leading to taxa A and B but before taxa A and B diverged from each other. This is the only true duplication here in the sense in which the word "duplication" is used in molecular evolution: in that sense, a gene is duplicated only when additional copies accrue within a single lineage. There are also three apparent duplications, two indicated for the blue gene lineage and one for the green gene lineage, which are due to the two speciation events in the diagram. Note that by virtue of their common ancestry at the root of this branch, all of the genes are homologous to each other. However, after the gene duplication event, genes in the green lineage may evolve independently of genes in the blue lineage, both within a taxon and between taxa. Consequently, the phylogeny of the genes differs from the phylogeny of the taxa. The terms paralogous and orthologous were coined (Fitch and Margoliash 1970) to distinguish between different types of phylogenetic relationship between genes. A gene in one taxon is orthologous to a gene in another taxon if the only duplications (in the colloquial sense) leading to differentiation between them were consequences of speciation events. Two genes are paralogous to each other if they differ (at least in part) because of a duplication event (in the more restricted sense).
This means that the gene in the blue lineage of taxon A is paralogous to the gene in the green lineage of taxon A (as per the original definition of the word; Fitch and Margoliash 1970) as well as to the gene in the green lineage of taxon B (only the latter relationship is indicated in the figure). The green gene of taxon A is orthologous to the green gene of taxon B because any differences between them arose solely through their independent evolution within their respective lineages. Similarly, the blue gene of taxon A is an ortholog of the blue gene of taxon B. According to the original definitions, the blue gene of taxon C is neither an ortholog nor a paralog of any of the genes in taxa A and B. However, retained similarity of function between the blue gene of taxon C and the blue genes of taxa A and B could be construed to make all of them orthologs of each other, and that is how the term is used in this manuscript. For a more thorough treatment of various related terms and the theoretical issues in which they are involved, see Patterson 1988.
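The relationships described in this caption can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis, that encodes the gene tree of the figure and classifies pairs of genes. All node names (`spec_C_vs_AB`, `dup`, `A_blue`, and so on) are hypothetical labels introduced for illustration. The classification rule is one possible reading of the caption: two genes are treated as paralogs when their most recent common ancestor is the duplication event, as orthologs when no duplication lies on the path between them, and as neither when a duplication lies on the path but is not their most recent common ancestor.

```python
# Gene tree from the figure, stored as child -> parent links.
# Node names are hypothetical labels, not taken from the original figure.
PARENT = {
    "C_blue": "spec_C_vs_AB",       # single-copy gene in taxon C
    "dup": "spec_C_vs_AB",          # duplication on the line leading to A and B
    "spec_AB_blue": "dup",          # A/B speciation, blue gene lineage
    "spec_AB_green": "dup",         # A/B speciation, green gene lineage
    "A_blue": "spec_AB_blue",
    "B_blue": "spec_AB_blue",
    "A_green": "spec_AB_green",
    "B_green": "spec_AB_green",
}

# Internal nodes that are true duplications (additional copies within one lineage).
DUPLICATIONS = {"dup"}


def ancestors(node):
    """Return the ancestors of a node, nearest first."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def classify(gene_a, gene_b):
    """Classify two genes as 'ortholog', 'paralog', or 'neither' (assumed rule)."""
    up_a, up_b = ancestors(gene_a), ancestors(gene_b)
    shared = set(up_b)
    # Most recent common ancestor: first ancestor of gene_a also above gene_b.
    mrca = next(n for n in up_a if n in shared)
    # Internal nodes on the path between the two genes, including the MRCA.
    path = up_a[: up_a.index(mrca) + 1] + up_b[: up_b.index(mrca)]
    if mrca in DUPLICATIONS:
        return "paralog"
    if any(n in DUPLICATIONS for n in path):
        return "neither"
    return "ortholog"


if __name__ == "__main__":
    for pair in [("A_blue", "A_green"), ("A_blue", "B_green"),
                 ("A_green", "B_green"), ("A_blue", "B_blue"),
                 ("C_blue", "A_blue")]:
        print(pair, classify(*pair))
```

Under this reading, the sketch reproduces the caption's statements under the original definitions; it does not model the broader, function-based use of "ortholog" adopted in this manuscript for the gene of taxon C.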