Steve Pinker’s hair and the muscles of worms

I’ve been guilty of teaching bean-bag genetics this semester. Bean-bag genetics treats individuals as a bag of irrelevant shape containing a collection of alleles (the “beans”) that are sorted and disseminated by the rules of Mendel, and at its worst, assigns one trait to one allele; it’s highly unrealistic. In my defense, it was necessary — first-year students struggle enough with the basic logic of elementary transmission genetics without adding great complications — and of course, in some contexts, such as population genetics, it is a useful simplification. It’s just anathema to anyone more interested in the physiological and developmental side of genetics.

The heart of the problem is that it ignores the issue of translating genotype into phenotype. If you’ve ever had a basic genetics course, it’s quite common to have been taught only one concept about the phenotype problem: that an allele is either dominant, in which case it is expressed as the phenotype, or it’s recessive, in which case it is completely ignored unless it’s the only allele present. This idea is so 19th century — it’s an approximation made in the complete absence of any knowledge of the nature of genes.

And the “one gene, one trait” model violates everything we do know about the phenotype and genotype. Every gene is pleiotropic — it influences multiple traits to varying degrees. Every trait is multigenic — multiple genes contribute to the expression of every phenotypic detail. The bean-bag model is totally inadequate for describing the relationship of genes to physiology and morphology. Instead of a bean-bag, I prefer to think of the genome as comparable to a power spectrum, an expression of the organism in a completely different domain. But I wrote about that previously, and I’ll make this explanation a little simpler.

Here’s the problem: you can’t always reliably predict the phenotype from the genotype. We have a skewed perspective on the problem, because historically, genetics has first searched for strong phenotypes, and then gone looking for the genetic cause. We’ve been effectively blind to many subtle phenotypic effects, simply because we don’t know how to find them. When we go the other way, and start by mutating known genes and then looking for changes in the phenotype, we’re often surprised to discover no detectable change. One of the classic examples is the work of Elkins (1990), who found that mutating a neural cell adhesion gene, Fasciclin I, did not generate any gross defects. Mutating another gene, a signal transduction gene called Abelson tyrosine kinase, similarly had no visible effects. Mutating the two together, though — and this is a major clue to how these strange absences of effect could work — did produce gross and obvious effects on nervous system development.

Providing another great example, Steve Pinker examined his own genome, and discovered that his genes said he was predisposed to be red-haired and at high risk for baldness. If you’ve seen Steve Pinker, you know he’s neither.

How can this be? As any geneticist will tell you, the background (the other alleles present in the organism) is important in defining the pattern of expression of a specific gene of interest. One simple possibility is that the genome contains redundancy: that a trait such as adhesion of axons in the nervous system or the amount of hair on the head can be the product of multiple genes, each doing pretty much the same thing, so knocking out one doesn’t have a strong effect, because there is a backup present.

Genetic interactions provide a general model for incomplete penetrance. Representation of a negative (synergistic) genetic interaction between two genes A and B.

So Steve Pinker could have seen that he has a defective Gene A, which is important in regulating hair, but maybe there’s another Gene B lurking in the system that we haven’t characterized yet, and which can compensate for a missing Gene A, and he has a particularly strong form of it. One explanation for a variable association between an allele and the phenotype, then, is that we simply don’t have all the information about the multigenic cause of the phenotype, and there are other genes that can contribute.

This doesn’t explain all of the observed phenomena, however. Identical twins who share the same complement of alleles also exhibit variability in the phenotype; we also have isogenic animal lines, where every individual has the same genetic complement, and they also show variability in phenotype. This is the problem of penetrance; penetrance is a genetics term that refers to the likelihood that an individual carrying an allele will actually express the phenotype associated with that allele…and it’s not always 100%.

Again, the explanation lies in the other genes present in the organism. No gene functions all by itself; its expression is dependent on a cloud of other proteins — transcription factors, enhancers, chaperones — all of which modulate the gene of interest. We also have to deal with statistical variation in the degree of expression of all those modulatory factors, which vary by chance from cell to cell, and so the actual degree of activation of a gene may follow a kind of bell curve distribution. In the cartoon below, the little diamonds represent these partners; sometimes, just by chance, they’ll be present in sufficiently high numbers to boost Gene B’s output enough to fully compensate for a defective Gene A; in other cases, just by chance, they’re too low in concentration to adequately compensate for the absence.

Genetic interactions provide a general model for incomplete penetrance. A model for incomplete penetrance based on variation in the activity of genetic interaction partners.

What the above cartoon illustrates is the concept of developmental noise, the idea that the cumulative total of statistical variation in gene expression during development can produce significant phenotypic variation in the absence of any differences in the genotype. Developmental noise is a phrase bruited about quite a bit, and there’s good reason to think it’s valid: we can see quantitative variation in gene expression with molecular techniques, for instance. But at the same time we have other concepts, like redundancy and canalization, that work to buffer variation and produce reliable outputs from developmental processes, so we don’t have many good examples where we can directly correlate subtle variation at the molecular level with clear morphological differences.
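The cartoon's logic can be played out in a few lines of simulation. This is only a toy sketch, not anything from the paper: it assumes Gene A is knocked out, that the compensating output of Gene B varies from embryo to embryo as a normal distribution, and that development is normal whenever that output clears a fixed threshold. The function name, the threshold, and the noise level are all invented for illustration.

```python
import random

def simulate_penetrance(n_embryos=10_000, threshold=1.0, noise_sd=0.3, seed=0):
    """Toy model of incomplete penetrance in genetically identical embryos.

    Assumptions (hypothetical, chosen for illustration only):
    - Gene A is defective in every embryo.
    - Gene B output is noisy and centred exactly on the threshold needed
      to compensate, so roughly half the embryos should fall short.
    """
    rng = random.Random(seed)
    abnormal = 0
    for _ in range(n_embryos):
        gene_b_output = rng.gauss(threshold, noise_sd)  # chance variation
        if gene_b_output < threshold:                   # backup too weak
            abnormal += 1
    return abnormal / n_embryos

penetrance = simulate_penetrance()
print(f"fraction of abnormal embryos: {penetrance:.2f}")
```

Every simulated embryo has the same "genotype", yet only about half show the defect; shifting the mean of Gene B's output up or down would shift the penetrance with it.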

To test that, we have to go to simple animal models (it turns out that Steve Pinker is a rather intractable experimental animal). And here we have a very nice example in the nematode worm, C. elegans. In these experiments, the investigators were dealing with an isogenic strain, in which the genetic background was identical in all of the animals, raised in a uniform environment. They were looking at a mutant in the gene tbx-9, which causes defects in muscle formation, but with only 50% penetrance; that is, half the time, the mutants appeared completely normal, and the other half of the time they had grossly abnormal muscle development.

Genetic interactions provide a general model for incomplete penetrance. Inactivation of the gene tbx-9 in C. elegans results in an incompletely penetrant defect, with approximately half of embryos hatching with abnormal morphology (small arrow).

See the big red question mark? That’s the big question: can we trace the abnormal phenotype all the way back to random fluctuations in the expression of other genes in the animal? Yes, they can, otherwise it would never have been published in Nature and I wouldn’t be writing about it now.

In this case, they have a situation analogous to the Gene A/Gene B cartoons above. Gene A is tbx-9; Gene B is a related gene, a duplicate called tbx-8, which acts as a redundant copy. In the experiments below, they knock out tbx-9 with a mutation, and then measure the expression of other genes in the system using a very precise technique of quantitative fluorescence. Below, I’ve reproduced the entirety of their summary figure, because it is awesome: I just love the idea of being able to count the number of molecules expressed in a developing system. In order to avoid overwhelming everyone, though, I’ll just describe a couple of the panels to give you the gist of the work.

First, just look at the top left panel, a. It’s a plot of the level of expression of the tbx-8 gene over time, where each line in the plot is a different animal. The lines in black are from wild type animals, with fully functional copies of both tbx-8 and tbx-9, and you should be able to see that there’s a fair amount of variation in expression, about two-fold, in different individuals. The lines in green are from animals mutant for tbx-9; it’s messy, but statistically, when tbx-9 is knocked out, more tbx-8 gene product is produced.

Panel e, just below it, shows the complementary experiment: the expression of tbx-9 is shown for both wild type (black) and animals with tbx-8 knocked out. Here, the difference is very clear: tbx-9 levels are greatly elevated in the absence of tbx-8. This shows that tbx-8 and tbx-9 are actually tied together in a regulatory relationship where levels of one rise in response to reduced levels of the other, and vice versa.

Early inter-individual variation in the induction of ancestral gene duplicates predicts the outcome of inherited mutations. a, Quantification of total green fluorescent protein (GFP) expression from a tbx-8 reporter during embryonic development in WT (black) and tbx-9(ok2473) (green) individuals. Each individual is a separate line. a.u., Arbitrary units. b, Boxplot of tbx-8 reporter expression (a) showing 1.2-fold upregulation in a tbx-9 mutant at comma stage (~290 min, P=1.6×10^-3, Wilcoxon rank test). c, Expression of tbx-8 reporter in a tbx-9(ok2473) background for embryos that hatch with (red) or without (blue, WT) a morphological defect. d, Boxplot of c showing tbx-8 expression is higher in tbx-9 embryos that develop a WT phenotype (blue) compared with those that develop an abnormal (red) phenotype at comma stage (P=6.1×10^-3). e, Expression of a ptbx-9::GFP reporter in WT (black) and tbx-8(ok656) mutant (green). f, Boxplot of tbx-9 reporter showing 4.3-fold upregulation at comma stage (~375 min, P=3.6×10^-16). g, Expression of tbx-9 reporter in a tbx-8(ok656) mutant background, colour code as in c. h, Boxplot of g showing tbx-9 expression is higher in tbx-8 embryos that develop a WT phenotype (P=0.033). i, Expression of a pflh-2::GFP reporter in WT (black) and flh-1(bc374) mutant (green). j, Boxplot of flh-2 reporter expression (i) showing 1.8-fold upregulation in a flh-1 mutant at comma stage (~180 min, P=2.2×10^-16). k, Bright-field and fluorescence image of an approximate 100-cell flh-1; pflh-2::GFP embryo. Red arrow indicates the local expression of flh-2 reporter quantified for flh-1 phenotypic prediction. l, Boxplot showing higher flh-2 reporter expression at approximate 100 cells for WT (blue) compared with abnormal (red) phenotypes (P=0.014). Boxplots show the median, quartiles, maximum and minimum expression in each data set.

Now skip over to the right, to panel c. All of the lines in this plot are of tbx-8 expression in tbx-9 mutants, and again you see a wide variation in levels of gene expression. In addition, the lines are color-coded by whether the worm developed normally (blue), or had the mutant phenotype (red). The answer: worms with low tbx-8 levels were more likely to have the abnormal phenotype than those with high levels.

Panel g, just below it, is the complementary analysis of tbx-9 levels in tbx-8 mutants, and it gives the same answer.

Obviously, though, there is still a lot of variability unaccounted for; having relatively high levels of one or the other of the tbx genes didn’t automatically mean the worm developed a wild-type phenotype. There’s got to be something more that is varying. Look way back to the second cartoon I showed, with the little diamonds representing the cloud of transcription factors and chaperone proteins that modulate gene expression. Could there also be correlated variation there? And yes, there is. The authors looked at a chaperone protein called daf-21 that is associated with the tbx system, and found, in mutants for tbx-9, that elevated levels of daf-21 were associated with wildtype morphology (in blue), while lowered levels of daf-21 were associated with the mutant phenotype.

Expression of daf-21 reporter in a tbx-9(ok2473) mutant background. Embryos that hatch into phenotypically WT worms (blue) have higher expression than those hatching with a morphological defect (red) at the comma stage (P=1.9×10^-3).

I know what you’re thinking: there isn’t a perfect correlation between high daf-21 levels and wildtype morphology either. But when they did double-label experiments, and took into account both daf-21 and tbx-8 levels in tbx-9 mutants, they found that 92% of the animals with greater than median levels of expression of both daf-21 and tbx-8 had wildtype morphology. It’s still not perfect, but it’s pretty darned good, and besides, it’s no surprise that there are probably other modulatory factors with statistical variation lurking in the system.
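The "above the median for both reporters" comparison is easy to sketch on made-up data. Nothing below comes from the paper: the embryos are simulated, the two reporter levels and a third hidden factor are assumed to be independent and normally distributed, and an embryo is simply called wild type when their sum is positive. The point is only to show the bookkeeping, not to reproduce the 92% figure.

```python
import random
from statistics import median

def simulate_embryos(n=2000, seed=1):
    """Return (tbx8_level, daf21_level, is_wild_type) for simulated embryos.

    Hypothetical model: phenotype depends on two measured factors plus
    one unmeasured ("hidden") factor, all varying by chance.
    """
    rng = random.Random(seed)
    embryos = []
    for _ in range(n):
        tbx8 = rng.gauss(0, 1)
        daf21 = rng.gauss(0, 1)
        hidden = rng.gauss(0, 1)              # variation we cannot see
        wild_type = (tbx8 + daf21 + hidden) > 0
        embryos.append((tbx8, daf21, wild_type))
    return embryos

def fraction_wild_type(embryos):
    """Fraction of embryos in the list scored as wild type."""
    return sum(wt for _, _, wt in embryos) / len(embryos)

embryos = simulate_embryos()
tbx8_median = median(e[0] for e in embryos)
daf21_median = median(e[1] for e in embryos)

# Keep only embryos above the median for BOTH measured factors.
both_high = [e for e in embryos
             if e[0] > tbx8_median and e[1] > daf21_median]

print(f"all embryos: {fraction_wild_type(embryos):.2f} wild type")
print(f"both high:   {fraction_wild_type(both_high):.2f} wild type")
```

Even in this toy version the prediction is better but not perfect, because the hidden factor still tips some high-expressing embryos into the abnormal class.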

What should you learn from this? Developmental noise is real, and is a product of statistical variation in the degree of expression of multiple genetic components that contribute to a phenotype. We can measure that molecular variation in living, developing systems and correlate it with phenotypic outcomes. None of this is surprising; we expect that the process of gene expression is going to be a bit noisy, especially in these transcriptional regulators that are present in low concentration in the cell, anyway. But the other cool thing we can observe here is that having multiple noisy systems that interact with each other can produce a more reliable, robust signal and contribute to the fidelity of developmental outcomes.


Burga A, Casanueva MO, Lehner B (2011)
Predicting mutation outcome from early stochastic variation in genetic interaction partners. Nature 480(7376):250-3.

Elkins T, Zinn K, McAllister L, Hoffmann FM, Goodman CS (1990)
Genetic analysis of a Drosophila neural cell adhesion molecule: interaction of fasciclin I and Abelson tyrosine kinase mutations. Cell 60(4):565-75.

(Also on FtB)

Techniques to go with the tools

So you read that cool summary of how to build a molecular biology lab for $500. But wait, you don’t know what you’d do with the mobio toys! Here’s how to correct that: go to a workshop.

THE MICHAEL SMITH LABS AND ADVANCED MOLECULAR BIOLOGY LAB PRESENTS OUR MOLECULAR BIOLOGY WORKSHOPS 2012 WINTER/SPRING Session.

ONE WEEK VERSION (5 DAYS) – MOLECULAR TECHNIQUES WORKSHOPS
FEB 13 – 17, 2012 (CAN$1400)
DESCRIPTION: Recently updated: This intense 5 day workshop will focus on a myriad of different techniques used in the molecular manipulation of DNA, RNA and protein, as well as inclusion of exercises in some basic bioinformatics tools. Primarily aimed at researchers who are new to the area, familiar but require a quick updating, or would like more practical bench training.

PHILOSOPHY: Whilst molecular techniques have evolved at a speedy rate over the last few decades, the underlying biochemical principles behind the vast majority of them has actually changed little. This workshop therefore combines opportunities to perform the latest, as well as commonly used older techniques, with particular attention to the chemical nuts and bolts behind them. In all, this allows the researcher to not only gain needed familiarity with the techniques, but also achieve a comfortable theoretical level to allow for both (1) that all important skill of troubleshooting, and (2) the often undervalued skill of judging the utility of “tricks” that aim to speed up, or lower costs of a given methodology.

TECHNIQUES COVERED: Various nucleic acid purification methodologies (silica bead, organic, and/or pI based), restriction digests, ligations, dephosphorylation assays, agarose gel electrophoresis, transformation (including electroporation), PCR, reverse transcriptase assay, real time qPCR, basic bioinformatics, (including blast tools), SDS-PAGE, Western blot analysis, Isoelectric focusing strips, and 2D protein gels.

Full details can be found at http://www.bioteach.ubc.ca/mb-workshops/#molecular

See? You can become a mad scientist for cheap nowadays.

Mad scientists, start drooling

The future is arriving fast. Here are the instructions for assembling a $500 home molecular biology laboratory — you can do it! And it’s getting cheaper all the time!

The widespread and increasing availability of second-hand professional laboratory equipment or inexpensive new commercial surrogates means that it is now unchallenging to set up a fully functional molecular laboratory for less than $500 in equipment costs. Coupled with the presence of sources for all reagents and supplies needed in formats that are safe for general use, the work presented here demonstrates that capacity to set up functional molecular biology teaching modules is well within the reach of even the smallest educational facilities. When coupled with outsourced PCR product Sanger sequencing available from commercial sources at prices approaching $5/reaction, the capacity of such “home labs” to start undertaking research of real potential scientific value–such as surveys of microbial biota in unusual environments–at negligible costs should not be underestimated. Similarly, the potential for setting up labs of this type for medical applications in emerging countries may be worth considering. While current best methods have moved to real-time and array-based high throughput, contamination resistant methods, the methods demonstrated here were “state of the art” for clinical and research molecular diagnostics in the Western world only some 15 years ago.

Hmmm. The kids have flown, I’ve got more space than we know what to do with…maybe this summer I should tinker with setting up something like this.

My turn at Skepticon

Yeah, I gave a talk at Skepticon like several other rascals here at Freethoughtblogs. Now, even if you didn’t make the pilgrimage to Springfield, Missouri, you can watch it too.

It’s a straight science talk with several swipes at creationism, so unfortunately, I don’t think it will make any ice cream salesmen cry.

How many genes does it take to make a squid eye?

This is an article about cephalopods and eye evolution, but I have to confess at the beginning that the paper it describes isn’t all that interesting. I don’t want you to have excessive expectations! I wanted to say a few words about it, though, because it addresses a basic question I get all the time, and while I was at it, I thought I’d mention a few results that set the stage for future studies.

I’m often asked to resolve some confusion: the scientific literature claims that eyes evolved multiple times, but I keep saying that eyes show evidence of common origin. Who is right? Why are you lying to me, Myers? And the answer is that we’re both right.

Eyes evolved independently multiple times: the cephalopod eye evolved about 480 million years ago, and the vertebrate eye is even older (490 to 600 million years), but both evolved long after the last common ancestor of molluscs and chordates, which lived about 750 million years ago. The LCA probably did not have an image-forming eye at all.

And that’s the key point: a true eye is a structure that has an image forming element, a retina, and some kind of morphological organization that allows a distant object to form a pattern of light on that retina. That organization can be something as simple as a cup-shaped depression or pinhole lens, or as elaborate as our camera eye, or an insect’s compound eye, or the mirror eyes of a scallop. An eye is photoreceptors + structure. Eyes have evolved multiple times; they’ve even evolved multiple times within the phylum Mollusca, and different lineages have adopted different strategies for forming images.

Phylogenetic view of molluscan eye diversification. Camera eyes were independently acquired in the coleoid cephalopod (squids and octopuses) and vertebrate lineages.

The LCA probably didn’t have an eye, but it did have photoreceptors, and the light sensitive cells were localized to patches on the side of the head. It even had two different classes of photoreceptors, ciliary and rhabdomeric. That’s how I can say that eyes demonstrate a pattern of common descent: animals share the same building block for an eye, these photoreceptor cells, but different lineages have assembled those building blocks into different kinds of eyes.

Photoreceptors are fundamental and relatively easy to understand; we’ve worked out the full pathways in photoreceptors that take an incoming photon of light and convert it into a change in the cell’s membrane properties, producing an electrical signal. Making an eye, though, is a whole different matter, involving many kinds of cells organized in very specific ways. The big question is how you evolve an eye from a photoreceptor patch, and that’s going to involve a whole lot of genes. How many?

This is where I turn to the paper by Yoshida and Ogura, which I’ve accused of being a bit boring. It’s an exercise in accounting, trying to count and isolate the genes that are associated with building a camera eye in cephalopods. The approach is to take advantage of molluscan phylogeny.

As shown in the diagram above, molluscs are diverse: it’s just the coleoid cephalopods, squid and octopus, that have evolved a camera eye, while other molluscs have mirror, pinhole, or compound eyes. So one immediate way to narrow the range of relevant genes is a homology search: what genes are found in molluscs with camera eyes that are not present in molluscs without such eyes? That narrows the field, stripping out housekeeping genes and generic genes involved in basic cellular processes, even photoreception. Unfortunately, it doesn’t narrow the field very much: they identified 5,707 candidate genes that might be involved in camera eye evolution.

To filter it further, the authors then looked at just those genes among the 5,707 that were expressed in embryos. Eye formation is a developmental process, after all, so the interesting genes will be expressed in embryos, not adults (a sentiment with which I always concur). Unfortunately, development is a damnably complicated and interesting process, so this doesn’t narrow the field much, either: we’re down to 3,075 candidate genes.

Their final filter does have a dramatic effect, though. They looked at the ratio of non-synonymous to synonymous nucleotide changes in the candidate genes, a common technique for identifying genes that have been the target of selection, and found a grand total of 156 genes that showed a strong signal for selection. That’s 156 total genes that are different between coleoids and other molluscs, are expressed in the embryonic eye, and that show signs of adaptive evolution. That’s manageable and interesting.
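The last two filters can be sketched as a small pipeline. This is an illustration under stated assumptions, not the authors' code: the gene records are invented, the field names are hypothetical, and the cutoff of dN/dS > 1 is the conventional rule of thumb for positive selection rather than the criterion the paper necessarily used. Here dN is non-synonymous substitutions per non-synonymous site and dS is synonymous substitutions per synonymous site.

```python
from dataclasses import dataclass

@dataclass
class GeneStats:
    name: str                  # hypothetical gene label
    expressed_in_embryo: bool  # passed the embryonic-expression filter?
    nonsyn_subs: int           # non-synonymous substitutions observed
    nonsyn_sites: float        # non-synonymous sites available
    syn_subs: int              # synonymous substitutions observed
    syn_sites: float           # synonymous sites available

def dn_ds(gene):
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    dn = gene.nonsyn_subs / gene.nonsyn_sites
    ds = gene.syn_subs / gene.syn_sites
    return dn / ds if ds > 0 else None

def under_selection(genes, cutoff=1.0):
    """Keep embryo-expressed genes whose dN/dS exceeds the assumed cutoff."""
    kept = []
    for gene in genes:
        ratio = dn_ds(gene)
        if gene.expressed_in_embryo and ratio is not None and ratio > cutoff:
            kept.append(gene.name)
    return kept

# Invented example records, for illustration only.
candidates = [
    GeneStats("geneA", True, 30, 300.0, 5, 100.0),    # dN/dS = 2.0
    GeneStats("geneB", True, 6, 300.0, 10, 100.0),    # dN/dS = 0.2
    GeneStats("geneC", False, 30, 300.0, 5, 100.0),   # not in embryo
]
selected = under_selection(candidates)
print(selected)
```

Only the first invented gene survives both filters, which is the same kind of narrowing that takes the real list from thousands of candidates down to 156.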

They also looked for homologs between cephalopod camera eyes and vertebrate camera eyes, and found 1,571 of them; this analysis would have been more useful if it were also cross-checked against other non-camera-eye molluscs. As it is, that number just tells us some genes are shared, but they could have been genes involved in photoreceptor signalling (among others), which we already expect to be similar. I’d like to know if certain genes have been convergently adopted in both lineages to build a camera eye, and it’s not possible to tell from this preliminary examination.

And that’s where the paper more or less stops (I told you not to get your hopes up too high!). We have a small number of genes identified in cephalopods that are probably important in the evolution of their vision, but we have no idea what they do, precisely, yet. The authors have done some preliminary investigations of a few of the genes, and one important (and with hindsight, rather obvious) observation is that some of the genes are expressed not just in the retina, but in the brain and optic lobes. Building an eye involved not just constructing an image-forming sensor, but expanding central tissues involved in processing visual information.


Fernald RD (2006) Casting a genetic light on the evolution of eyes. Science 313(5795):1914-8.

Yoshida MA, Ogura A (2011) Genetic mechanisms involved in the evolution of the cephalopod camera eye revealed by transcriptomic and developmental studies. BMC Evol Biol 11:180.

How to examine the evolution of proteins

In my previous post, I described the misguided approach Gauger and Axe have taken to criticizing evolution, and one of the peculiarities of their criticism is that they cited a paper by Carroll, Ortlund, and Thornton which traced (successfully) the evolutionary history of a class of proteins. Big mistake. As I pointed out, one of the failings of the Gauger/Axe approach is that they’re asking how one protein evolved into a cousin protein, without considering the ancestral history…they make the error of trying to argue that an extant protein couldn’t have directly evolved into another extant protein, when no one argues that they did.

The tactical error is that right there in the very first paragraph of their paper, Carroll, Ortlund, and Thornton point out the fallacy of what the creationists were doing.

Direct comparisons among present-day proteins can sometime yield insights into the sequence and structural mechanisms that underlie functional differences. Such “horizontal” comparisons, however, cannot determine which protein features are ancestral and which are derived, so they are not suited to reconstructing the events that produced functional diversity.

They don’t mention Gauger and Axe, of course — this paper was written before the creationists wrote theirs — but a methodological flaw is still spelled out plainly, the creationists reference it so I presume they read it, and they still charged ahead and did their flawed study, and then had the gall to claim their work was superior.

Ah, silly creationists. They just assume their target audience won’t bother to read the work they’re citing, and isn’t competent to understand it anyway. And they’re usually right.

The crew doing the work in the Carroll paper did not make the same mistakes. They are doing ancestral sequence reconstruction (ASR), so the effort to work backward to trace ancestral states is implicit. The bulk of the paper describes the sequencing of homologous and paralogous genes in more organisms (in this case, especially cartilaginous fishes), and the analysis of synthesized, reconstructed ancestral proteins, so it’s built entirely on an empirical foundation. And their answers actually advance our understanding of the base-by-base changes that led to the evolution of the current set of proteins. I think they were courteous and sensible (and probably, the idea didn’t even occur to them) in not comparing their work to that of the creationists — it would have been less than gracious to point out how ugly, cheap, and cheesy the stuff coming out of the Biologic Institute looks.

What the real scientists were studying is a class of receptors that respond to mineralocorticoid and/or glucocorticoid hormones. These proteins are similar in sequence and structure to one another, and are clearly paralogous: they arose by an ancient gene duplication event, somewhere around 450 million years ago. The two copies have since diverged to have different roles in hormone physiology.

The two receptors are called MR, for mineralocorticoid receptor, and GR, for glucocorticoid receptor.

MRs are activated by adrenal hormones, aldosterone and deoxycorticosterone, and to a lesser extent, cortisol. The receptors are extremely sensitive to the hormones. These hormones are important in regulating salt balance, and you might well imagine that in our fishy ancestors, as well as ourselves, regulating the concentrations of salts in our blood and tissues is a very important function. Deviations can cause death, after all.

GRs are activated by high doses of cortisol; these receptors are much less sensitive, requiring high doses of the hormone to trigger a response. They are important in regulating stress responses: they adjust the immune system and sugar metabolism. These aren’t ‘twitchy’, fast response functions like maintaining salt balance is; they are long-term, ‘last-ditch’ reactions to growing stresses, so functionally it makes sense that activation requires high levels of accumulated hormone.

Using ASR techniques — phylogenetic analysis and estimating the most likely sequence of the ancestral protein — the investigators have put together a picture of the receptor before MR and GR diverged. This protein is called AncCR, for Ancestral Corticosteroid Receptor, and it has been synthesized in the lab, so we know about its properties. AncCR is a lot like MR: it’s sensitive to low concentrations of hormone, and it responds to low concentrations of a broad spectrum of hormones.

The pedigree of these proteins is illustrated below.

Simplified phylogeny of corticosteroid receptors. Ancestral sequences are shown at relevant nodes: AncCR, the last common ancestor of all MRs and GRs; AncGR1, the GR ancestor of cartilaginous fishes and bony vertebrates; AncGR2, the GR ancestor of ray- and lobe-finned fishes (including tetrapods); AncMR1, the MR ancestor of cartilaginous fishes and bony vertebrates. (AncGR1.0 and AncGR1.1 are different reconstructions of node AncGR1, inferred from datasets with different taxon sampling.) Black, high sensitivity receptors; gray, low sensitivity receptors. Single and double gray dashes mark functional shifts towards reduced sensitivity and increased specificity, respectively. Support values are the chi-square statistic (1 – p, where p equals the estimated probability that a node could occur by chance alone) calculated from approximate likelihood ratios. The length of branches from AncCR to AncMR1 and to AncGR1, expressed as the mean number of substitutions per site, are indicated in parentheses.

The MRs are similar in function to the AncCR, so they aren’t particularly interesting in this context — there’s no big question about how the MRs retained similar properties to their ancestor. The interesting questions are all about the GRs: what changed to make GRs different from the ancestral protein? What amino acid changes set AncGR1 apart from AncCR?

The investigators have an answer. The first step was the evolution of reduced hormone sensitivity, so that these receptors only responded to very high concentrations of the hormone, and the second step was a loss of sensitivity to the mineralocorticoids, already handled by the MRs, so that they only respond to high doses of cortisol, which at this point became exclusively a stress hormone. And they know exactly which amino acids changed to confer the reduced sensitivity.

They identified three changes: the conversion of a valine at position 43 into an alanine, called V43A; the conversion of an arginine at position 116 into a histidine, R116H; and the conversion of a cysteine at position 71 into a serine, C71S. They also know the effect of the mutations. V43A and R116H each loosen the structure of the receptor so that it’s less sensitive, and when both mutations are present, sensitivity drops about 10,000-fold…too much! Together they make the mutant receptor too insensitive, and much less sensitive than their reconstructed AncGR1.

The most interesting change is C71S. It basically does nothing to the sensitivity; make the C71S change to AncCR, and you get a receptor protein that is essentially indistinguishable in its response. This is effectively a neutral mutation. It can spread freely through a population with no deleterious or advantageous effect.

C71S does have one significant effect in cooperation with the other two mutations: it buffers both V43A and R116H. When all three mutations are present, the desensitizing effects of V43A and R116H are reduced to produce the level of sensitivity expected for the AncGR1 protein. This means we can reconstruct the order of the amino acid changes in evolution. First came C71S, because it doesn’t cause any particular adaptive change, and because if either V43A or R116H came first, the resulting receptor would be generally non-functional. The existence of C71S first means the subsequent V43A/R116H changes produced receptors that are still functional, but simply operate only at higher concentrations of the hormones.
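The ordering argument can be checked by brute force. Here is a minimal sketch in Python, assuming a deliberately crude rule of my own rather than the paper's measurements: a receptor counts as functional unless it carries V43A or R116H without the C71S buffer. The function names `is_functional` and `viable_orders` are hypothetical, made up for this illustration.

```python
from itertools import permutations

MUTATIONS = ("C71S", "V43A", "R116H")


def is_functional(genotype):
    """Crude stand-in rule (an assumption, not measured data): the
    desensitizing changes are tolerated only when C71S is present."""
    destabilized = "V43A" in genotype or "R116H" in genotype
    return not destabilized or "C71S" in genotype


def viable_orders():
    """Return every order of the three changes in which each intermediate
    receptor along the way still passes is_functional."""
    orders = []
    for order in permutations(MUTATIONS):
        # Genotypes after the 1st, 2nd, and 3rd substitution.
        steps = [frozenset(order[:i]) for i in range(1, len(order) + 1)]
        if all(is_functional(g) for g in steps):
            orders.append(order)
    return orders


if __name__ == "__main__":
    for order in viable_orders():
        print(" -> ".join(order))
```

Under that rule, only the orders that begin with C71S survive, which is the point of the argument: the neutral change has to come first.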

All of these changes are perfectly compatible with an evolutionary model of their origin. No sudden leaps, no deleterious intermediates are required — everything hangs together beautifully and is backed up by solid empirical evidence. In addition, the work explains the mechanics of receptor-hormone interactions, stuff I haven’t explained here, but if you’re a biochemist, there’s much to savor in the paper.

It’s an amazing contrast to the Gauger and Axe paper, too. No wonder I’m not a creationist!


Carroll SM, Ortlund EA, Thornton JW (2011) Mechanisms for the evolution of a derived function in the ancestral glucocorticoid receptor. PLoS Genet 7(6):e1002117. Epub 2011 Jun 16.

The epigenetics miracle?

Jerry Coyne is mildly incensed — once again, there’s a lot of recent hype about epigenetics, and he doesn’t believe it’s at all revolutionary. Well, I’ve written about epigenetics before, I think it’s an extremely important subject central to our understanding of development, and…I agree with him completely. It’s important, we ought to spend more time discussing it in our classes, but it’s all about the process of gene expression, not about radically changing our concepts of evolution. I like to argue that what multigenerational epigenetic effects do is blur out or modulate the effects of genetic change over time, and it might mask out or highlight allelic variation, but ultimately, it’s all about the underlying genetic differences.

Coyne mentions one journalist who claims that new discoveries in epigenetics would “make Darwin swoon,” which is a bizarre standard. Darwin knew next-to-nothing about genetics — he had his own weird version of Lamarckian inheritance — and wasn’t even equipped to imagine molecular biology, so yes, just about anything in this field would dazzle him. My freshman introductory biology course would blow Charles Darwin away — he’d have to struggle to keep up with the products of American public education.


The greatest science paper ever published in the history of humankind

That’s not hyperbole. I really mean it. How else could I react when I open up the latest issue of Bioessays, and see this: Cephalopod origin and evolution: A congruent picture emerging from fossils, development and molecules. Just from the title alone, I’m immediately launched into my happy place: sitting on a rocky beach on the Pacific Northwest coast, enjoying the sea breeze while my wife serves me a big platter of bacon, and the cannula in my hypothalamus slowly drips a potent cocktail of cocaine and ecstasy direct into my pleasure centers…and there’s pie for dessert. It’s like the authors know me and sat down to concoct a title where every word would push my buttons.

The content is pretty good, too. It’s not perfect; the development part is a little thin, consisting mainly of basic comparative embryology of body plans, with nothing at all really about deployment of and interactions between significant developmental genes. But that’s OK. It’s in the nature of the Greatest Science Papers Ever Written that stuff will have to be revised and some will be shown wrong next month, and next year there will be more Greatest Science Papers Ever Written — it’s part of the dynamic. But I’ll let it be known, now that apparently the scientific community is aware of my obsessions and is pandering to them, that the next instantiation needs more developmental epistasis and some in situs.

This paper, though, is a nice summary of the emerging picture of cephalopod evolution, as determined by the disciplines of paleontology, comparative embryology, and molecular phylogenetics, and that summary is internally consistent and is generating a good rough outline of the story. And here is that story.

Cephalopods evolved from monoplacophoran-like ancestors in the Cambrian, about 530 million years ago. Monoplacophorans are simple, limpet-like molluscs; they crawl about on the bottom of the ocean under a cap-like shell, foraging snail-like on a muscular foot. The early cephalopods modified this body plan to rise up off the bottom and become more active: the flattened shell elongated to become a cone-like structure, housing chambers for buoyancy. Movement was no longer by creeping, but used muscular contractions through a siphon to propel the animal horizontally. Freed from its locomotor function, the foot expanded into manipulating tentacles.

These early cephalopods, which have shells common in the fossil record, would have spent their lives bobbing vertically in the water column, buoyed by their shells, and with their tentacles dangling downward to capture prey. They wouldn’t have been particularly mobile — that form of a cone hanging vertically in the water isn’t particularly well-streamlined for horizontal motion — so the next big innovation was a rotation of the body axis, swiveling the body axis 90° to turn a cone into a torpedo. There is evidence that many species did this independently.

The tilting of the body axes of extant cephalopods. This was a result of a polyphyletic and repeated trend towards enhanced manoeuverability. The morphological body axes (anterior-posterior, dorso-ventral) are tilted perpendicularly against functional axes in the transition towards extant cephalopods.

We can still see vestiges of this rotation in cephalopod embryology. If you look at early embryos of cephalopods (at the bottom of the diagram below), you see the same pattern: they are roughly disc-shaped, with a shell gland on top and a ring of tentacle buds on the bottom. They subsequently extend and elongate along the embryonic dorsal-ventral axis, which becomes the anterior-posterior axis in the adult.

In extant cephalopods the body axes of the adult stages are tilted perpendicularly versus embryonic stages. As a consequence, the morphological anterior-posterior body axis between mouth and anus and the dorso-ventral axis, which is marked by a dorsal shell field, is tilted 90° in the vertical direction in the adult cephalopod. Median section of A: Nautilus, B: Sepia showing the relative position of major organs (Drawings by Brian Roach). C: shared embryonic features in embryos of Nautilus (Nautiloidea) and Idiosepius (Coleoidea) (simplified from Shigeno et al. 2008 [23] Fig. 8). Orientation of the morphological body axes is marked with a compass icon (a, anterior; d, dorsal; p, posterior; v, ventral; dgl, digestive gland; gon, gonad; ngl, nidamental gland).

The next division of the cephalopods occurred in the Silurian/Devonian, about 416 million years ago, and it involved those shells. Shells are great armor, and in the cephalopods were also an organ of buoyancy, but they also greatly limit mobility. At that early Devonian boundary, we see the split into the two groups of extant cephalopods. Some retained the armored shells; those are the nautiloids. Others reduced the shell, internalizing it or even getting rid of it altogether; those are the coleoids, the most successful modern group, which includes the squids, cuttlefish, and octopuses. Presumably, one of the driving forces behind the evolution of the coleoids was competition from that other group of big metazoans, the fish.

The nautiloids…well, the nautiloids weren’t so successful, evolutionarily speaking. Only one genus, Nautilus, has survived to the modern day, and all the others followed the stem-group cephalopods into extinction.

The coleoids, on the other hand, have done relatively well. The number of species has fluctuated over time, but currently there are about 800 known species, which is respectable. The fish have clearly done better, with about 30,000 extant species, but that could change — there are signs that cephalopods have been thriving a little better recently in an era of global warming and acute overfishing, so we humans may have been giving mobile molluscs a bit of a tentacle up in the long evolutionary competition.

There was another major event in coleoid history. During the Permian, about 276 million years ago, there was a major radiation event, with many new species flourishing. In particular, there was another split: between the Decabrachia, the familiar ten-armed squid, and the Vampyropoda, a group that includes the eight-armed octopus, the cirroctopodes, and Vampyroteuthis infernalis. The Vampyropoda have had another locomotor shift, away from rapid jet-propelled movement to emphasizing their fins for movement, or in the case of the benthic octopus, increasing their flexibility to allow movement through complex environments like the rocky bottom.

Time for the big picture. Here’s the tree of cephalopod evolution, using dates derived from a combination of the available fossil evidence and primarily molecular clocks. The drawings illustrate the shell shape, or in the case of the coleoids, the shape of the internal shell, or gladius, if they have one.

A molecularly calibrated time-tree of cephalopod evolution. Nodes marked in blue are molecular divergence estimates (see methods in Supplemental Material). The divergence of Spirula from other decabrachiates are from Warnke et al. [43], the remaining divergences are from analyses presented in this paper. Bold lineages indicate the fossil record of extant lineages, stippled lines are tentative relationships between modern coleoids, partly based on previous studies [41, 76, 82] and fossil relationships are based on current consensus and hypotheses presented herein. Shells of stem group cephalopods and Spirula in lateral view with functional anterior left. Shells of coleoids in ventral view with anterior down. The Mesozoic divergence of coleoids is relatively poorly resolved compared to the rapid evolution of Cambro-Ordovician stem group cephalopods. Many stem group cephalopod orders not discussed in the text are excluded from the diagram.

The story and the multiple lines of evidence hang together beautifully to make a robust picture of cephalopod evolution. The authors do mention one exception: Nectocaris. Nectocaris is a Cambrian organism that looks a bit like a two-tentacled, finned squid, which doesn’t fit at all into this view of coleoids evolving relatively late. The authors looked at it carefully, invested a substantial part of the review in discussing this problematic species, and decided on the basis of the morphology of its gut and of the putative siphon that there is simply no way the little beast could be ancestral to any cephalopods: it’s a distantly related lophotrochozoan with some morphological convergence. Its internal bits simply aren’t oriented in a way that would fit the cephalopod body plan.

So that’s the state of cephalopod evolution today. I shall be looking forward to the Next Great Paper, and in particular, I want to see more about the molecular biology of tentacles — that’s where the insights about the transition from monoplacophoran to cephalopod will come from, I suspect.


Kröger B, Vinther J, Fuchs D (2011) Cephalopod origin and evolution: A congruent picture emerging from fossils, development and molecules: Extant cephalopods are younger than previously realised and were under major selection to become agile, shell-less predators. Bioessays doi: 10.1002/bies.201100001.

A little cis story

I found a recent paper in Nature fascinating, but why is hard to describe — you need to understand a fair amount of general molecular biology and development to see what’s interesting about it. So those of you who already do may be a little bored with this explanation, because I’ve got to build it up slowly and hope I don’t lose everyone else along the way. Patience! If you’re a real smartie-pants, just jump ahead and read the original paper in Nature.

A little general background.

[Image: map of the svb/ovo region of fly DNA]

Let’s begin with an abstract map of a small piece of a strand of DNA. This is a region of fly DNA that encodes a gene called svb/ovo (I’ll explain what that is in a moment). In this map, the transcribed portions of the DNA are shown as gray shaded blocks; what that means is that an enzyme called polymerase will bind to the DNA at the start of those blocks and make a copy in the form of RNA, which will then enter the cytoplasm of the cell and be translated into a protein, which does some work in the activities of that cell. So svb/ovo is a small piece of DNA which, in the normal course of events, will make a protein.

Most of the DNA here is not transcribed. Much of it is junk — changing the sequence of those areas has no effect on the protein, and has no effect on the appearance or function of the organism. Some of it, though, is regulatory DNA, and its sequence does matter. The white boxes labeled DG2, DG3, Z, A, E, and 7 are regions called enhancers — they are not translated into protein, but their sequence affects the expression of svb/ovo. One way to think of them is that they are small parking spots for other proteins that will bind to the DNA sequences in each enhancer. These protein/DNA complexes will then fold around to make a little landing zone for the polymerase, to encourage transcription of the svb/ovo gene. This is why this is called regulatory DNA: it doesn’t actually make the svb/ovo protein itself, but it’s important in controlling when and where and how much of the svb/ovo protein will be made.

Now for some jargon; sorry, but you have to know what it is to follow along in the literature. Those little white boxes of regulatory DNA are often called cis factors, because they have to be located on the same strand of DNA as the protein-coding gene in order to work. In general, when we’re talking about cis factors, we’re talking about non-coding regulatory DNA. The complement of that is the actual coding sequence, the little gray boxes in the diagram, and those have the general name of trans factors.

There is a bit of a debate going on about the relative importance of cis and trans mutations in evolution. Proponents of the cis perspective like to point out that cis mutations can be wonderfully subtle and specific; you can make a change in an enhancer and only modify the expression of the gene in one tissue, or even a small part of one tissue, while changing a trans factor causes changes in every tissue that uses that gene product. Also, most of the cis proponents are evo-devo people, scientists who study the small variations in timing and magnitude of gene expression that lead to differences in form, so of course the kinds of changes that affect the stuff we study must be the most important.

Proponents of the trans view can point out that small changes in the coding regions of genes can also produce subtle shifts in what the genes do, and that mutations can also produce very large effects. Those cis changes appear to be little tweaks, while trans changes can run the gamut from non-existent/weak to strong, and so have great power. They also like to point out that most of the data in the literature documents trans changes between species, and that a lot of the evo-devo stuff is speculative.

It’s a somewhat silly debate, because we all know that both cis and trans effects are going to be found important in evolution, in different ways in different organisms, and that arguing about which is more important is kind of pointless — it will depend on which feature and which species you’re looking at. But the debate is also useful as a goad to urge people to look more at the subtleties and ask more questions about those enhancers, as in the paper I’m about to describe.

What is this svb/ovo gene?

[Image: drawing of the back end of a fly larva]

This is a drawing of just the back end of a fly larva, and what you should be able to see is that they’re very hairy. Dorsally, there’s a collection of small hairs called trichomes, and ventrally there are some thicker, stouter hairs called denticles. If you destroy the svb/ovo coding region, these hairs don’t form — svb is an important gene for organizing and making hairs on the cuticle of the fly. Its name should make sense: svb is short for shavenbaby. The gene is responsible for making hairs, but when you break it with a mutation you get embryos and larvae lacking those hairs, a shaven baby.

It also has the synonym of ovo, because it has another important function in the maturation of oocytes, something I’ll skip over entirely. All you need to know is that svb/ovo is actually a large complex gene with multiple functions, and all we care about right now is its function of inducing hair development.

Now let’s look at embryos of two different species of fruit flies, Drosophila melanogaster at the top, and Drosophila sechellia at the bottom. D. melanogaster is clearly hairier than D. sechellia, and you might be wondering if svb is the gene making a difference here, and if you’re following the debate, you might be wondering whether this is a change in the trans coding region or the cis regulatory region.

[Image: embryos of Drosophila melanogaster (top) and Drosophila sechellia (bottom)]

One way to figure this out is to sequence and compare maps of the svb region in multiple fly species and ask where the actual molecular differences are. This isn’t trivial: D. melanogaster and D. sechellia have been diverging for half a million years, and there have been lots of little changes all over the place, many of them expected to be neutral. What was done to narrow the search was to compare the sequences of five different Drosophila species with hairy embryos to the relatively naked D. sechellia, and ask which changes were unique to the less hairy form.

A hotspot lit up in the comparison: there is one region, about 500 base pairs long, in the enhancer labeled “E” in the diagram at the top of the page, which contained 13 substitutions and one deletion unique to D. sechellia, in 7 clusters. This is very suggestive, but not definitive; these are consistent differences, but we don’t know yet whether these molecular differences cause the differences in hairiness. For that, we need an experiment.
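The logic of that comparison is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, assuming the sequences are already aligned to equal length; the function name `unique_to_target` is hypothetical, and the toy sequences are invented, not real Drosophila data. A deletion could be represented with a gap character such as "-" in the target sequence.

```python
def unique_to_target(target, others):
    """Return (position, shared_base, target_base) for every aligned column
    where the comparison sequences all agree and the target differs."""
    if any(len(seq) != len(target) for seq in others):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for pos, base in enumerate(target):
        column = {seq[pos] for seq in others}
        # Keep only columns conserved among the others but changed in target.
        if len(column) == 1 and base not in column:
            hits.append((pos, column.pop(), base))
    return hits


# Toy, made-up alignment: NOT real Drosophila sequence.
hairy = [
    "ACGTTGCAAT",
    "ACGTTGCAAT",
    "ACGTTGCTAT",  # differs from its hairy relatives at position 7
]
naked = "ACCTTGCAAA"

if __name__ == "__main__":
    for pos, shared, derived in unique_to_target(naked, hairy):
        print(f"position {pos}: {shared} -> {derived}")
```

Position 7 in the toy data is skipped because the hairy sequences disagree with each other there; only changes found in the naked form and nowhere else are reported, which is what narrows the search.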

The experiment.

This is the cool part. The investigators built constructs containing the E enhancer coupled to the svb gene and a reporter tag, and inserted those into fly embryos and asked how they affected expression; so they could effectively put the D. sechellia enhancer into D. melanogaster, and the D. melanogaster enhancer into D. sechellia, and ask if they were sufficient to drive the species-specific pattern of svb expression. The answer is yes, mostly: they weren’t perfect copies of each other, suggesting that there are other elements that contribute to the pattern, but the D. sechellia enhancer produced reduced expression in whatever fly carried it, while the D. melanogaster enhancer produced greater expression.

But wait, there’s more! The species differences were caused by differences in 7 clusters within the E enhancer. The authors built constructs in which the mutations in each of the 7 clusters were uniquely and independently inserted, so they could test each mutational change one by one. The answer here was that each of the seven mutations that led to the D. sechellia pattern had a similar effect, reducing very slightly the level of svb expression. Furthermore, they had a synergistic effect: the reduction in hairs when all 7 mutations were present was not simply the sum of the individual effects of each mutation alone.
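To make "not simply the sum" concrete, here is a small Python sketch with invented numbers. The values are placeholders of mine, not the paper's measurements, and the function names are hypothetical. It compares the reduction you would expect if the seven clusters acted independently with a combined reduction that is larger, as "synergistic" implies.

```python
def additive_expectation(single_effects):
    """Reduction expected if each cluster acted independently and the
    individual reductions simply added up."""
    return sum(single_effects)


def interaction(single_effects, combined_effect):
    """Observed combined reduction minus the additive expectation; anything
    other than zero signals non-additivity."""
    return combined_effect - additive_expectation(single_effects)


# Invented values (percent reduction), for illustration only:
# each of the 7 clusters mutated alone, then all 7 together.
singles = [5, 4, 6, 5, 3, 4, 5]
combined = 80

if __name__ == "__main__":
    print("additive expectation:", additive_expectation(singles))
    print("excess over additive:", interaction(singles, combined))
```

A positive excess means the mutations reinforce one another; a value of zero would mean each cluster contributes its small effect independently of the rest.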

What does it all mean?

One conclusion of this work is that here is one more clear example of a significant morphological difference between species that was generated by molecular modification of cis regulatory elements. Hooray, one more data point in the cis/trans debate!

Another interesting observation is that this is a phenotype that was built up gradually, by a set of small changes to an enhancer element. D. sechellia gradually lost its trichome hairs by the accumulation of single-nucleotide changes in regulatory DNA, each of which contributed to the phenotype — a very Darwinian pattern of change.

By modifying the regulatory elements, evolution can generate distinct, focused variations. Knocking out the entirety of the svb gene is disastrous, not only removing hairs but also seriously affecting fertility. The little tweaks provided by changes to the enhancer region mean that morphology can be fine-tuned by chance and selection, without compromising essential functions like reproduction. In the case of these two species of flies, D. sechellia can have a functional reproductive system, the full machinery to make functional hairs, but at the same time can turn off dorsal trichomes while retaining ventral denticles.

It all fits with the idea that fundamental aspects of basic morphology are going to be defined, not by the raw materials used to build them, but by the regulation of timing and quantity of those gene products — that the rules of development are defined by the regulatory activity of genes, not entirely by the coding sequences themselves.


Frankel N, Erezyilmaz DF, McGregor AP, Wang S, Payre F, Stern DL (2011) Morphological evolution caused by many subtle-effect substitutions in regulatory DNA. Nature 474(7353):598-603.

The saga of Junk DNA

So you’re tantalized by this strange obsession creationists have with junk DNA. It offends them mightily, I think because they find comfort in the idea that everything in the universe must have a purpose, because if it doesn’t, maybe that means they are nothing more than spots of dandruff on a dead rock hurtling blindly through space, and we can’t have that then.

It’s true that the odious Jonathan Wells has written a whole book declaring that everything in the genome has a glorious function implicitly designed by his god, the Rev. Sun Myung Moon. Larry Moran has begun the process of dismantling Wells, with, so far, three posts critiquing his claims, all well worth reading.