Back to Basics

A friend asked for an explainer on Bayesian statistics, and I instinctively reached for Yudkowsky’s only to find this at the top:

This page has now been obsoleted by a vastly improved guide to Bayes’s Theorem, the Arbital Guide to Bayes’s Rule. Please read that instead. Seriously. I mean it.

You can see why once you’ve clicked the link; it asks for your prior experience, then tailors the explanation appropriately. There are also some good diagrams, and it tries to explain the same concept multiple ways to hammer the point home. Their bit on p-values is on-point, too.

Speaking of stats, I’ve also been drawn back into a course on probability I started years ago. MIT OpenCourseware has a lot of cool offerings, but this entry on probability has been worth my attention. While E.T. Jaynes’ Probability Theory still has my favourite treatment of the subject, the video lectures are easier to parse and proceed at a faster clip.

An Unexpected Non-Surprise

Shoot, are you Canadian or not? For those not, I’ll give a one-paragraph primer on Canadian politics.

There are two main parties on the federal level: the Liberals (centre-Left) and the Conservatives (centre-Right), who swap power every five to ten years. There is also a constellation of lesser parties, of which the New Democrats (Left) have enjoyed the most success in recent years. Each province has localised versions of the federal parties, though their ideological relations can vary widely from place to place; for instance, Alberta’s New Democratic Party is currently fighting with their British Columbia counterparts over an oil pipeline, British Columbia’s right-wing party are the Liberals, and Quebec…. they’re off doing their own thing.

Ontario, our largest province, tends to follow the federal system quite closely. The Ontario Liberal party has been in power for nearly fifteen years, and as you’d expect that’s led to corruption scandals. Given that centrists hold an electoral edge, and the two Left-ish parties split the progressive vote, the most likely outcome would be a switch from the centre-Left Liberals to the centre-Right Progressive Conservatives during this year’s election.

Except, mere months before the election, the PC leader stepped down over sexual assault allegations. This triggered a scramble for a new leader, which turned ugly when the disgraced leader tried to regain power, only to bow out again as the negative press mounted. Nonetheless, that wasn’t enough to tank the PC’s poll numbers and their success at the polls seemed much more likely than not.

So you can imagine most people’s disgust when a Trump clone by the name of Doug Ford managed to win the PC leadership in a mysterious upset. Oh joy, a “belligerent bully” who wished to run government like a business on policies that made no sense would be in control of our largest province. I was in a state of despair.

Then, unexpectedly, the least surprising thing happened.

CBC's Ontario poll tracker, showing a PC nosedive and an NDP rise.

The controversial leader generated controversy and acrimony, which has led to a dip in popularity for the PC’s (official colour: blue). Rather than switch to the scandal-plagued Liberals, though, PC voters are switching to the cleaner NDP (official colour: orange). Fearing a split of the progressive vote could put the PC’s into power anyway, Liberal voters (official colour: red) are shifting over to the NDP too. (Green party voters are off doing their own thing).

That movement hasn’t been enough to assure an NDP victory, as CBC’s Poll Tracker still figures the PC’s have a 95% chance of a majority or minority government. But with two and a half weeks to go, no shortage of new PC scandals, and press coverage that’s overjoyed at the change in fortunes, this could create a bandwagon effect that puts a coalition of two Left-ish parties into power.

I’m hoping the Ontario political landscape follows the usual rules of politics, and rejects the typical Liberal-Conservative tango.

Sam Harris “Corrects” the Record

Whelp, less than thirty-eight hours after my blog post Sam Harris finally deleted that old video. However, spotting that made me realise I’d missed his explanation for why he edited the episode. That was probably by design: the description to the podcast episode drops no hint that it’s there.

As before, I’ve done some light editing, but also included time-stamps so you can check my work.

[3:17] Just a little housekeeping for today’s episode. A few episodes back, I presented audio from an event I did with Christian Picciolini in Dallas, and that was a fun event, I enjoyed speaking with Christian a lot […]

[3:51] But unfortunately, in that podcast, Christian said a few things that don’t seem to have been strictly true, and as the weeks have passed and that podcast has continued streaming I’ve heard from two people who consider his remarks to have been unfairly damaging to their reputations. This is a problem that I am quite sensitive to, given what gets done to me, by my critics. Somewhat ironically, Christian seems to rely on the Southern Poverty Law Center for much of his information, but this is an organization, as many of you know, which is undergoing a full moral and intellectual self-immolation. In fact, Christian is confused enough about the stature of that organization that he retweeted an article from the SPLC website wherein I am described as a racist recruiter for the alt-right. […]

We’re barely a minute in, and Harris has badly distorted the record. One of Picciolini’s tweets references the Southern Poverty Law Center, true, but he also cites tweets by David Duke, The Daily Stormer, Wikipedia (and before you start, I checked the citations and it’s legit), Joe Rogan, YouTube recordings of Molyneux, and his experiences talking to families with Molyneux-obsessed members. Yet that one reference to the SPLC somehow translates into “much of his information?”

Harris also misrepresents that SPLC article. They didn’t declare him to be a recruiter; self-declared members of the alt-Right said that Sam Harris helped lead them to the alt-Right.


Another One

One downside to floating around atheist/skeptic communities so long is that I’ve seen a lot of people leave. The most painful departures are the ones where people get frustrated with their peers over their inaction or inability to comprehend, or burn out from having to explain concepts that shouldn’t need explaining. These cases always leave me self-conscious of my own silence and inaction, wondering if I’d help stem the tide if I became more active.

For four years I have written and given talks about the same core message: Movement atheism must expand its ambitions to include the interests and needs of communities systematically disenfranchised in ways far more harrowing than merely existing as an atheist (within the US context).

Little changed in the five years that I was a part of organized secular communities. To what degree things have changed is debatable. For me, the point is that these spaces haven’t evolved to the point that they are welcoming or even ideal for certain groups of people.

Over time, I was able to better understand how this resistance to change that infests these spaces has a lot to do with select donors sustaining these spaces as well as those occupying executive and board leadership positions.

The thing is, the writing was always on the wall. It just took a considerable amount of time for me to admit it to myself.

Ouch. Sincere Kirabo has come a long way since 2015, when he earned a scholarship from American Atheists. He’s also been a writer for The Establishment, Huffington Post, the Good Men Project, and Everyday Feminism. He was the Social Justice Coordinator for the American Humanist Association until a week ago. He remains frustrated with the inaction of leaders within the atheist/skeptic movement. How could you not, when you’re dealing with shit like this:

Sexism continues to be a huge problem with this movement. And by “problem” I mean that it exists and most men choose to either deny it, minimize it, or blame victims.

I’m directly and indirectly connected to countless women who were once a part of this movement and have since left. It’s sad and disgusting and infuriating that there are whisper networks within secular circles so that women can warn each other about certain men rumored to be sexual harassers or abusers.

And yes, several women have reported instances of sexual misconduct and even rape to me. I’m not at liberty to discuss these incidents in any detail, but I will say that all three cases involve men who were at one point connected to organized humanism or atheism.

A lot of people have dropped out of the movement for a lot less.

My bandwidth has been depleted and there’s no way that I can fully recover and advance the causes that mean the most to me until I remove myself from spaces that preserve/propagate elitist rationalism, complacency, and general white nonsense.

I know some are interested in knowing what’s next for me. At this time, all I will say is that I and several others are in the process of building a platform dedicated to cultivating Black humanist culture with a focus on creating a world that honors the “radical” idea of free Black people. Details will follow in the near future.

… so it’s to Kirabo’s credit that he’s not dropping out of the movement. He’s switched from working for change within existing orgs, to creating his own organisations that are less problematic. I heartily approve, though had Kirabo dropped out instead I’d also approve. Even if the change in tactics doesn’t work, it’ll at least create a safe space for people to promote secularism without having to hold their nose over casual bigotry.

The Two Cultures, as per Steven Pinker

As I mentioned before, C.P. Snow’s “Two Cultures” lecture is light on facts, which makes it easy to mould to your whims. Go back and re-read that old post, absorb C.P. Snow’s version of the Two Cultures, then compare it to Pinker’s summary:

A final alternative to Enlightenment humanism condemns its embrace of science. Following C.P. Snow, we can call it the Second Culture, the worldview of many literary intellectuals and cultural critics, as distinguished from the First Culture of science.[12] Snow decried the iron curtain between the two cultures and called for greater integration of science into intellectual life. It was not just that science was, “in its intellectual depth, complexity, and articulation, the most beautiful and wonderful collective work of the mind of man.” Knowledge of science, he argued, was a moral imperative, because it could alleviate suffering on a global scale …

[Pinker, Steven. Enlightenment Now: The Case for Reason, Science, Humanism, and Progress. Penguin, 2018. Pg. 33-34]

C.P. Snow went out of his way to criticise scientists for failing to incorporate literature into their lives, and never ranked one culture as superior to another. Nor did he label them “First Culture” or “Second Culture.” And it wasn’t increased knowledge of science in general that would remove suffering, it was the two cultures intermixing. Pinker is presenting a very different argument than C.P. Snow, at least on the face of it.

But hang on, there’s a footnote right in the middle of that passage….

[12] Snow never assigned an order to his Two Cultures, but subsequent usage has numbered them in that way; see, for example, Brockman 2003.

[Pg. 456]

How is it “following C.P. Snow” to call it “Second Culture,” when you acknowledge C.P. Snow never called it “Second Culture?!” What’s worse, look at the page numbers: that acknowledgement comes a full four hundred pages after the misleading phrasing. How many people would bother to flip that far ahead, let alone make the connection to four hundred pages ago? But all right, fine, maaaybe Steven Pinker is just going with the flow, and re-using a common distortion of C.P. Snow’s original argument. The proof should lie in that citation to Brockman [2003], which fortunately is available via Google Books. In fact, I can do you one better: John Brockman’s anthology was a mix of work published in Edge magazine and original essays, and the relevant parts just happen to be online.

Bravo, John! You are playing a vital role in moving the sciences beyond a defensive posture in response to turf attacks from the “postmodernists” and other leeches on the academies. You celebrate science and technology as our most pragmatic expressions of optimism.

I wonder, though, if it’s enough to merely point out how hopelessly lost those encrusted arts and humanities intellectuals have become in their petty arms race of cynicism. If we scientists and technologists are to be the new humanists, we must recognize that there are questions that must be addressed by any thinking person which do not lie within our established methods and dialogs. …

While “postmodern” academics and “Second Culture” celebrity figures are perhaps the most insufferable enemies of science, they are certainly not the most dangerous. Even as we are beginning to peer at biology’s deepest foundations for the first time, we find ourselves in a situation in which vast portions of the educated population have turned against the project of science in favor of pop alternatives usually billed as being more “spiritual.”

It appears exactly once in that reference, which falls well short of demonstrating common usage. Even more damning is that Pinker’s citation references the 2003 edition of the book. There’s a 2008 version, and it doesn’t have a single reference to a “Second Culture.” I’ve done my own homework, and I can find a thesis from 2011 which has that usage of “Second Culture,” but falsely attributes it to Snow and never brings it up past the intro. There is an obscure 1993 book which Pinker missed, but thanks to book reviews I can tell it labels science as the “Second Culture,” contrary to how Pinker uses the term. Everything else I’ve found is a false positive, which means Pinker is promoting one mention in one essay by one author as sufficient to show a pattern.

And can I take a moment to call out the contrary labelling here: how, in any way, is science “First” relative to literature? Well before Philosophical Transactions began publishing, we’d already had the Ramayana, the Chu Ci anthology, the Epic of Gilgamesh, The Iliad, Beowulf, and on and on. Instead, Pinker and friends are invoking “Second” as in “Secondary,” lesser, inferior. Unlike de Beauvoir, though, they’re not doing it as a critique, they honestly believe in the superiority of science over literature.

Pinker didn’t invent this ranking, nor was he the first to lump all the humanities in with the literary elites. I think that honour belongs to John Brockman. Consider this essay of his; read very carefully, and you’ll see he’s a little confused on who’s in the non-scientific culture.

Ten years later, that fossil culture is in decline, replaced by the emergent “third culture” of the essay’s title, a reference to C. P. Snow’s celebrated division of the thinking world into two cultures—that of the literary intellectual and that of the scientist. …

In the twentieth century, a period of great scientific advancement, instead of having science and technology at the center of the intellectual world—of having a unity in which scholarship includes science and technology just as it includes literature and art—the official culture kicked them out. The traditional humanities scholar looked at science and technology as some sort of technical special product—the fine print. The elite universities nudged science out of the liberal arts undergraduate curriculum, and out of the minds of many young people, who abandoned true humanistic inquiry in their early twenties and turned themselves into the authoritarian voice of the establishment. …

And one is amazed that for others still mired in the old establishment culture, intellectual debate continues to center on such matters as who was or was not a Stalinist in 1937, or what the sleeping arrangements were for guests at a Bloomsbury weekend in the early part of the twentieth century. This is not to suggest that studying history is a waste of time. History illuminates our origins and keeps us from reinventing the wheel. But the question arises: history of what? Do we want the center of culture to be based on a closed system, a process of text in/text out, and no empirical contact with the world in between?

A fundamental distinction exists between the literature of science and those disciplines in which the writing is most often concerned with exegesis of some earlier writer. In too many university courses, most of the examination questions are about what one or another earlier authority thought. The subjects are self-referential. …

The essay itself is a type specimen of science cheer-leading, which sweeps all the problems of science under the carpet; try squaring “Science is nothing more nor less than the most reliable way of gaining knowledge about anything” with “Most Published Research Findings Are False,” then try finding a published literary critic doing literary criticism wrong. More importantly, Brockman’s December 2001 essay reads a lot like Pinker’s February 2018 book, right down to the “elite” and “authoritarian” “liberal arts” universities turning their back on science. Brockman was definitely ahead of his time, and while only three of his works show up in Pinker’s citation list he’s definitely had a big influence.

This also means Pinker suffers from the same confusion as Brockman. Here are some of the people he considers part of the Second Culture:

It’s an oddball list. Karl Popper is a member, probably by accident. Adorno was actually an opponent of Heidegger and Popper’s views of science. Essayists (Wieseltier and Gopnik) rub shoulders with glaciologists (Carey, Jackson), sociologists (Bauman), and philosophers (Foucault, Derrida). It’s dominated by the bogey-people of the alt-right, none of whom can be classified as elite authors.

Stranger still, Thomas Kuhn isn’t on there. Kuhn should have been: he argued that science doesn’t necessarily follow the strength of the evidence. During Kuhn’s heyday, many physicists thought that Arthur Eddington’s famous solar eclipse data fell short of proper science. The error bars were very large, the dataset was small, and some contrary data from another telescope was ignored; nonetheless, scientists during Eddington’s heyday took the same dataset as confirmatory. Why? They wanted General Relativity to be true, because it offered an explanation for why light seemed to have a fixed speed and Mercury precessed the way it did. Kuhn called these “puzzles,” things which should be easily solvable via existing, familiar knowledge. Newtonian Mechanics violated that “easy” part of the contract, GR did not, so physicists abandoned ship even in the face of dodgy data. Utility was more important than truth-hood.

Conversely, remember the neutrinos that seemed to run faster than light? If science advanced by falsification, physicists should have abandoned General Relativity in droves; instead, they dismissed the finding and asked the scientists who ran the experiment to try again. In this case, they didn’t want GR to be false, so contrary evidence was rejected. That might seem like a cheap example, since the experimental equipment was shown to be the real problem, but consider that we already knew GR was false because it’s incompatible with Quantum Mechanics. The two theories can’t both be true, which means there’s a third theory out there which has a vague resemblance to both but has radically different axioms. Nonetheless no physicist has stopped using GR or QM, because both are effective at solving puzzles. Utility again trumps truth-hood.

Kuhn argued that scientists proposed frameworks for understanding the world, “paradigms,” which don’t progress as we think they do. For instance, Newtonian Mechanics says the International Space Station is perpetually falling towards Earth, because the mass of both is generating attractive forces which cause a constant acceleration; General Relativity says the ISS is travelling in a straight line, but appears to orbit around the Earth because it is moving through a spacetime curved by the energy and mass of both objects. These two explanations are different on a fundamental level; you can’t transform one into the other without destroying some axioms. You’ve gotta choose one or the other, and why would you ever switch back? Kuhn even rejected the idea that the next paradigm is more “truthful” than another; again, utility trumps truth-hood.

It’s opposed to a lot of what Pinker is arguing for, and yet:

The most commonly assigned book on science in modern universities (aside from a popular biology textbook) is Thomas Kuhn’s The Structure of Scientific Revolutions. That 1962 classic is commonly interpreted as showing that science does not converge on the truth but merely busies itself with solving puzzles before flipping to some new paradigm which renders its previous theories obsolete, indeed, unintelligible. Though Kuhn himself later disavowed this nihilist interpretation, it has become the conventional wisdom within the Second Culture. [22]

[Enlightenment Now, pg. 400]

Weird, I can find no evidence Kuhn disavowed that interpretation in my source:

Bird, Alexander, “Thomas Kuhn”, The Stanford Encyclopedia of Philosophy (Fall 2013 Edition), Edward N. Zalta (ed.) URL = <https://plato.stanford.edu/archives/fall2013/entries/thomas-kuhn/>

Still, Pinker is kind enough to source his claim, so let’s track it down…. Right, footnote [22] references Bird [2011], which I can find on page 500…

Bird, A. 2011. Thomas Kuhn. In E. N. Zalta, ed., Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/thomas-kuhn/.

He’s using the same source?! I mean, score another point for Kuhn, as he thought that people with different paradigms perceive the same data differently, but we’ve still got a puzzle here. I can’t be sure, but I have a theory for why Pinker swept Kuhn under the rug. From our source:

Feminists and social theorists (…) have argued that the fact that the evidence, or, in Kuhn’s case, the shared values of science, do not fix a single choice of theory, allows external factors to determine the final outcome (…). Furthermore, the fact that Kuhn identified values as what guide judgment opens up the possibility that scientists ought to employ different values, as has been argued by feminist and post-colonial writers (…).

Kuhn himself, however, showed only limited sympathy for such developments. In his “The Trouble with the Historical Philosophy of Science” (1992) Kuhn derides those who take the view that in the ‘negotiations’ that determine the accepted outcome of an experiment or its theoretical significance, all that counts are the interests and power relations among the participants. Kuhn targeted the proponents of the Strong Programme in the Sociology of Scientific Knowledge with such comments; and even if this is not entirely fair to the Strong Programme, it reflects Kuhn’s own view that the primary determinants of the outcome of a scientific episode are to be found within science.

Oh ho, Kuhn thought it was unlikely that sexism or racism could warp science! That makes him the enemy of Pinker’s enemies, and therefore his friend. Hence why Pinker finds it useful to bring up Kuhn, despite their contrary views of science, and for that matter why Pinker can look at Snow’s arguments and see his own: utility trumps truth-hood.

The Return of COINTEL-PRO

You remember them, right? A secretive group within the FBI who targeted “domestic subversives” like Martin Luther King Jr. and Roberta Salper, with tactics that ranged from surveillance to blackmail to false flag ops and entrapment. Even the modern FBI agrees it was both unethical and unlawful.

Rakem Balogun thought he was dreaming when armed agents in tactical gear stormed his apartment. Startled awake by a large crash and officers screaming commands, he soon realized his nightmare was real, and he and his 15-year-old son were forced outside of their Dallas home, wearing only underwear.

Handcuffed and shaking in the cold wind, Balogun thought a misunderstanding must have led the FBI to his door on 12 December 2017. The father of three said he was shocked to later learn that agents investigating “domestic terrorism” had been monitoring him for years and were arresting him that day in part because of his Facebook posts criticizing police.

This isn’t on the same level, but it’s close. FBI officials monitored, arrested, and prosecuted Rakem Balogun for the high crime of being angry enough at how black people are treated in the USA to organize and agitate.

Authorities have not publicly labeled Balogun a BIE [Black Identity Extremist], but their language in court resembled the warnings in the FBI’s file. German said the case also appeared to utilize a “disruption strategy” in which the FBI targets lower-level arrests and charges to interfere with suspects’ lives as the agency struggles to build terrorism cases.

“Sometimes when you couldn’t prove somebody was a terrorist, it’s because they weren’t a terrorist,” he said, adding that prosecutors’ argument that Balogun was too dangerous to be released on bail was “astonishing”. “It seems this effort was designed to punish him for his political activity rather than actually solve any sort of security issue.”

The official one-count indictment against Balogun was illegal firearm possession, with prosecutors alleging he was prohibited from owning a gun due to a 2007 misdemeanor domestic assault case in Tennessee. But this month, a judge rejected the charge, saying the firearms law did not apply.

Ruined his life for it, too; he lost his job, house, and car because of overzealous FBI agents. Amazingly, their crusade lacks the weight of evidence.

The government’s own crime data has largely undermined the notion of a growing threat from a “black identity extremist” [BIE] movement, a term invented by law enforcement. In addition to an overall decline in police deaths, most individuals who shoot and kill officers are white men, and white supremacists have been responsible for nearly 75% of deadly extremist attacks since 2001.

The BIE surveillance and failed prosecution of Balogun, first reported by Foreign Policy, have drawn comparisons to the government’s discredited efforts to monitor and disrupt activists during the civil rights movement, particularly the FBI counterintelligence program called Cointelpro, which targeted Martin Luther King Jr, the NAACP and the Black Panther party.

OK, if I keep talking about this I’ll just wind up quoting the entire article. Go read it and witness the injustice yourself.

Continued Fractions

If you’ve followed my work for a while, you’ve probably noted my love of low-discrepancy sequences. Any time I want to do a uniform sample, and I’m not sure when I’ll stop, I’ll reach for an additive recurrence: repeatedly sum an irrational number with itself, check if the sum is bigger than one, and if so chop it down. Dirt easy, super-fast, and most of the time it gives great results.

But finding the best irrational numbers to add has been a bit of a juggle. The Wikipedia page recommends primes, but it also claimed this was the best choice of all:

\frac{\sqrt{5} - 1}{2}

I couldn’t see why. I made a half-hearted attempt at digging through the references, but it got too complicated for me and I was more focused on the results, anyway. So I quickly shelved that and returned to just trusting that they worked.

That is, until this Numberphile video explained them with crystal clarity. Not getting the connection? The worst possible number to use in an additive recurrence is a rational number: it’ll start repeating earlier points and you’ll miss at least half the numbers you could have used. This is precisely like having outward spokes on your flower (no seriously, watch the video), and so you’re also looking for any irrational number that’s poorly approximated by any rational number. And, wouldn’t you know it…

\frac{\sqrt{5} - 1}{2} ~=~ \frac{\sqrt{5} + 1}{2} - 1 ~=~ \phi - 1

… I’ve relied on the Golden Ratio without realising it.
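For anyone who skipped the video, the reason comes down to the continued fraction itself. This is the standard identity, written out here as an aside rather than taken from the video:

```latex
\phi - 1 ~=~ \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}} ~=~ [0; 1, 1, 1, \ldots]
```

Every term is 1, the smallest value a term can take. Cutting a continued fraction off just before a large term gives an unusually good rational approximation, so a number with no large terms never gets one.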

Want to play around a bit with continued fractions? I whipped up a bit of Go which allows you to translate any number into the integer sequence behind its fraction. Go ahead, muck with the thing and see what patterns pop out.

Abductive and Inferential Science

I love it when Professor Moriarty wanders back to YouTube, and his latest was pretty good. He got into a spot of trouble at the end, which led me to muse on writing a blog post to help him out. I’ve already covered some of that territory, alas, but in the process I also stumbled on something more interesting to blog about. It also affects Sean Carroll’s paper, which Moriarty relied on.

The fulcrum of my topic is the distinction between inference and abduction. The former goes “I have a hypothesis, what does the data say about it?,” while the latter goes “I have data, can I find a hypothesis which explains it?” Moriarty uses this as a refutation of falsification: if we start from the data instead of the hypothesis, we’re not trying to falsify anything! To add salt to the wound, Moriarty argues (and I agree) that a majority of scientific activity consists of abduction and not inference; it’s quite common for scientists to jump from one topic to another, essentially engaging in a tonne of abductive activity until someone forces them to write up a hypothesis. Sean Carroll doesn’t dwell on this as much, but his paper does treat abduction and inference as separate things.

They aren’t separate, at least when it comes to the Bayesian interpretation of statistics. Let’s use a toy example to explain how; here’s a black box with a clear cover:

import "math/rand"

// blackbox returns a quartic polynomial of a uniform random draw from [0, 1).
func blackbox() float64 {
     x := rand.Float64()
     return (4111 + x*(4619 + x*(3627 + x*(7392*x - 9206)))) / 1213
}

Each time we turn the crank on this function, we get back a number of some sort. The abductive way to analyse this is pretty straightforward: we grab a tonne of numbers and look for a hypothesis. I’ll go for the mean, median, and standard deviation here, the minimum I’ll need to check for a Gaussian distribution.

Samples = 1000001
Mean    = 5.61148
Std.Dev = 1.40887
Median  = 5.47287

Looks like there’s a slight skew downwards, but it’s not that bad. So I’ll propose that the output of this black box follows a Gaussian distribution, with mean 5.612 and standard deviation 1.409, until I can think of a better hypothesis which handles the skew.

After we reset for the inferential analysis, we immediately run into a problem: this is a black box. We know it has no input, and outputs a floating-point number, and that’s it. How can we form any hypothesis, let alone a null and alternative? We’ve no choice but to make something up. I’ll set my null to be “the black box outputs a random floating-point number,” and the alternative to “the output follows a Gaussian distribution with a mean of 0 and a standard deviation of 1.” Turn the crank, aaaand…

Samples            = 1000001
log(Bayes Factor)  = 26705438.01142
  (That means the most likely hypothesis is H1 (Gaussian distribution, mean = 0, std.dev = 1))

Unsurprisingly, our alternative does a lot better than our null. But our alternative is wrong! We’d get that impression pretty quickly if we watched the numbers streaming in. There’s an incredible temptation to take that data to refine or propose a new hypothesis, but that’s an abductive move. Inference is really letting us down.

Worse, this black box isn’t too far off from the typical science experiment. It’s rare any researcher is querying a black box, true, but it’s overwhelmingly true that they’re generating new data without incorporating other people’s datasets. It’s also rare you’re replicating someone else’s work; most likely, you’re taking existing ideas and rearranging them into something new, so prior findings may not carry forward. Inferential analysis is more tractable than I painted it, I’ll confess, but the limited information and focus on novelty still favors the abductive approach.

But think a bit about what I did on the inferential side: I picked two hypotheses and pitted them against one another. Do I have to limit myself to two? Certainly not! Let’s rerun the analysis with twenty-two hypotheses: the flat distribution we used as a null before, plus twenty-one alternative hypotheses covering every integer mean from -10 to 10 (though keeping the standard deviation at 1).

Samples                                 = 100001
log(likelihood*prior), H0               = -4436161.89971
log(likelihood*prior), H1, mean = -10   = -12378220.82173
log(likelihood*prior), H1, mean =  -9   = -10866965.39358
log(likelihood*prior), H1, mean =  -8   = -9455710.96544
log(likelihood*prior), H1, mean =  -7   = -8144457.53730
log(likelihood*prior), H1, mean =  -6   = -6933205.10915
log(likelihood*prior), H1, mean =  -5   = -5821953.68101
log(likelihood*prior), H1, mean =  -4   = -4810703.25287
log(likelihood*prior), H1, mean =  -3   = -3899453.82472
log(likelihood*prior), H1, mean =  -2   = -3088205.39658
log(likelihood*prior), H1, mean =  -1   = -2376957.96844
log(likelihood*prior), H1, mean =   0   = -1765711.54029
log(likelihood*prior), H1, mean =   1   = -1254466.11215
log(likelihood*prior), H1, mean =   2   = -843221.68401
log(likelihood*prior), H1, mean =   3   = -531978.25586
log(likelihood*prior), H1, mean =   4   = -320735.82772
log(likelihood*prior), H1, mean =   5   = -209494.39958
log(likelihood*prior), H1, mean =   6   = -198253.97143
log(likelihood*prior), H1, mean =   7   = -287014.54329
log(likelihood*prior), H1, mean =   8   = -475776.11515
log(likelihood*prior), H1, mean =   9   = -764538.68700
log(likelihood*prior), H1, mean =  10   = -1153302.25886
  (That means the most likely hypothesis is H1 (Gaussian distribution, mean = 6, std.dev = 1))

Aha, the inferential approach has finally gotten us somewhere! It’s still wrong, but you can see the obvious solution: come up with as many hypotheses as you can to explain the data, before we look at it, and run them all as the data rolls in. If you’re worried about being swamped by hypotheses, I’ve got a word for you: marginalization. Bayesian statistics handles hypotheses with parameters by integrating over all of them; you can think of these as composites, a mash of point hypotheses which collectively do a helluva lot better at prediction than any one hypothesis in isolation. In practice, then, Bayesians have always dealt with large numbers of hypotheses simultaneously.
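Here’s a hedged sketch of running a whole slate of hypotheses at once. It assumes an equal prior across all twenty-two hypotheses and the same "every 64-bit pattern is equally likely" reading of the flat null; the data is simulated, and `score_hypotheses` and `normalise` are names made up for this example.

```python
import math
import random

def gaussian_loglik(samples, mean, stdev=1.0):
    """Sum of log N(x | mean, stdev^2) over all samples."""
    n = len(samples)
    sq = sum((x - mean) ** 2 for x in samples)
    return -0.5 * n * math.log(2.0 * math.pi * stdev ** 2) - sq / (2.0 * stdev ** 2)

def score_hypotheses(samples, means=range(-10, 11), bits=64):
    """log(likelihood * prior) for the flat null plus one Gaussian per mean."""
    log_prior = -math.log(len(means) + 1)  # assumption: equal prior for all
    scores = {"H0": -len(samples) * bits * math.log(2.0) + log_prior}
    for m in means:
        scores[f"H1, mean = {m}"] = gaussian_loglik(samples, m) + log_prior
    return scores

def normalise(scores):
    """Turn log scores into posterior probabilities via log-sum-exp."""
    top = max(scores.values())
    log_total = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {name: math.exp(s - log_total) for name, s in scores.items()}

rng = random.Random(1)
# Simulated stand-in data; the real black box is unknown.
data = [rng.gauss(5.6, 1.4) for _ in range(10_000)]
scores = score_hypotheses(data)
posterior = normalise(scores)
best = max(scores, key=scores.get)
print(f"most likely hypothesis: {best}")
```

Subtracting the top score before exponentiating keeps the arithmetic from underflowing, which matters when the log scores are in the hundreds of thousands.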

The classic example of this is conjugate priors, where we carefully combine hyperparameters to evaluate a potentially infinite family of probability distributions. In fact, let’s try it right now: the proper conjugate here is the Normal-Inverse-Gamma, as we’re tracking both the mean and standard deviation of Gaussian distributions.
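As a minimal sketch of what such a conjugate update involves, the class below uses the standard textbook Normal-Inverse-Gamma update rules. The prior hyperparameters and the simulated data are assumptions chosen for illustration, and the class is not taken from whatever script produced the figures in this post.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class NormalInverseGamma:
    mu: float     # best guess at the mean
    lam: float    # how many pseudo-observations back that guess
    alpha: float  # shape of the variance distribution
    beta: float   # scale of the variance distribution

    def update(self, samples):
        """Return the posterior after observing `samples`."""
        n = len(samples)
        if n == 0:
            return self
        xbar = sum(samples) / n
        ss = sum((x - xbar) ** 2 for x in samples)
        lam_n = self.lam + n
        mu_n = (self.lam * self.mu + n * xbar) / lam_n
        alpha_n = self.alpha + n / 2.0
        beta_n = (self.beta + 0.5 * ss
                  + 0.5 * (self.lam * n / lam_n) * (xbar - self.mu) ** 2)
        return NormalInverseGamma(mu_n, lam_n, alpha_n, beta_n)

# Hypothetical weak prior; any proper prior would do for the sketch.
prior = NormalInverseGamma(mu=0.0, lam=1.0, alpha=1.0, beta=1.0)

rng = random.Random(2)
# Simulated stand-in data; the real black box is unknown.
data = [rng.gauss(5.6, 1.4) for _ in range(2_000)]
post = prior.update(data)
print(post)
```

Four numbers stand in for the entire infinite family of Gaussians, which is what makes conjugate priors so cheap to work with.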

Samples = 1000001
μ       = 5.61148
λ       = 1000001.00000
α       = 500000.50000
β       = 992457.82655

median  = 5.47287

That’s a good start, μ lines up with the mean we calculated earlier, and λ is obviously the sample count. The shape of the posteriors is still pretty opaque, though; we’ll need to chart this out by evaluating the Normal-Inverse-Gamma PDF a few times.

[Figure: Conjugate posterior for the collection of all Gaussian distributions which could describe the data.]

Excellent, the inferential method has caught up to abduction! In fact, as of now they’re both working identically. Think: what’s the difference between a hypothesis you proposed before collecting the data, and one you proposed after? In frequentism, the stopping problem implies that we could exit early and falsely reject our null, when data coming down the pipe would have pushed it back to “fail to reject.” There, the choice of hypothesis could have an influence on the outcome, so there is a difference between the two cases. This is made worse by frequentism’s obsession over one hypothesis above all others, the null.

Bayesian statistics is free of that problem, because every hypothesis is judged on its relative likelihood in reference to a dataset shared by all hypotheses. There is no stopping problem baked into the methodology. Whether I evaluate any given hypothesis before or after I collect the data is irrelevant, because either way it has to cope with all the data. This also frees me up to invent hypotheses whenever I wish.

But this also defeats the main attack against falsification. The whole point of invoking abduction was to save us from asserting any hypotheses in the beginning; if there’s no difference in when we invoke our hypotheses, however, then falsification might still apply.

Here’s where I return to giving Professor Moriarty a hand. He began that video by saying scientists usually don’t engage in falsification, hence it cannot be The Scientific Method, but ended it by approvingly quoting Feynman: “We are trying to prove ourselves wrong as quickly as possible, because only in that way can we find progress.” Isn’t that falsification, right there?

This is yet another area where frequentist and Bayesian statistics diverge. As I pointed out earlier, frequentism is obsessed with falsifying the null hypothesis and trying to prove it wrong. Compare and contrast with what past-me wrote about Bayes Factors:

If data comes up that doesn’t square well with a hypothesis, its certainty takes a hit. But if we’re comparing it to another hypothesis that also doesn’t predict the data, the Bayes Factor will remain close to 1 and our certainties won’t shift much at all. Likewise, if both hypotheses strongly predict the data, the Factor again stays close to 1. If we’re looking to really shift our certainty around, we need a big Bayes Factor, which means we need to find scenarios where one hypothesis strongly predicts the data while the other strongly predicts this data shouldn’t happen.
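To put toy numbers on that passage, here’s a small sketch with a single observation and two Gaussian hypotheses. The observations and hypothesis parameters are invented for illustration, picked so the three cases come out cleanly.

```python
import math

def gaussian_logpdf(x, mean, stdev):
    """Log density of N(mean, stdev^2) at x."""
    z = (x - mean) / stdev
    return -0.5 * z * z - math.log(stdev) - 0.5 * math.log(2.0 * math.pi)

def bayes_factor(x, h1, h2):
    """P(x | h1) / P(x | h2), where each hypothesis is a (mean, stdev) pair."""
    return math.exp(gaussian_logpdf(x, *h1) - gaussian_logpdf(x, *h2))

# Neither hypothesis predicts x = 10: both are ten standard deviations off.
bf_both_bad = bayes_factor(10.0, (0.0, 1.0), (20.0, 1.0))

# Both hypotheses predict x = 0.5 about equally well.
bf_both_good = bayes_factor(0.5, (0.0, 1.0), (1.0, 1.0))

# One hypothesis nails x = 5, the other says it should basically never happen.
bf_split = bayes_factor(5.0, (5.0, 1.0), (0.0, 1.0))

print(f"both predict poorly : {bf_both_bad:.3f}")
print(f"both predict well   : {bf_both_good:.3f}")
print(f"only one predicts   : {bf_split:.1f}")
```

Only the third case moves our certainty in any meaningful way; the first two leave the factor sitting at 1, however surprising or unsurprising the data is in absolute terms.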

Or, in other words, we should look for situations where one theory is… false. That sounds an awful lot like falsification!

But it’s not the same thing. Scroll back up to that Normal-Inverse-Gamma PDF, and pick a random point on the graph. The likelihood at that point is less than the likelihood at the maximum point. If you were watching those two points as we updated with new data, your choice would have gradually gone from about equally likely to substantially less likely. Your choice is more likely to be false, all things being equal, but it’s also not false with a capital F. Maybe the first million data points were a fluke, and if we continued sampling to a billion your choice would roar back to the top? This is the flip-side of having no stopping problem: the door is always left open a crack for any crackpot hypothesis to make a comeback.

Now look closely at the scale of the vertical axis. That maximal likelihood is well above 100%! In fact it’s somewhere around 4,023,000% by my calculations. While the vast majority are dropping downwards, there’s an ever-shrinking huddle of points that are becoming more likely as data is added! Falsification should only make things less likely, however.
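If you’d like to poke at that vertical axis yourself, the sketch below evaluates the log of the Normal-Inverse-Gamma density using the hyperparameters printed earlier. It assumes the standard textbook form of the PDF, parameterised by mean and variance, and uses the fact that the joint peak of that form sits at a variance of β / (α + 3/2); the function name is made up for this example.

```python
import math

def nig_log_pdf(mean, variance, mu, lam, alpha, beta):
    """Log density of Normal-Inverse-Gamma(mu, lam, alpha, beta) at (mean, variance)."""
    return (
        0.5 * math.log(lam)
        - 0.5 * math.log(2.0 * math.pi * variance)
        + alpha * math.log(beta)
        - math.lgamma(alpha)
        - (alpha + 1.0) * math.log(variance)
        - (2.0 * beta + lam * (mean - mu) ** 2) / (2.0 * variance)
    )

# Hyperparameters copied from the conjugate output earlier in the post.
MU, LAM, ALPHA, BETA = 5.61148, 1000001.0, 500000.5, 992457.82655

# Joint peak of this density: mean = mu, variance = beta / (alpha + 3/2).
mode_variance = BETA / (ALPHA + 1.5)
peak = nig_log_pdf(MU, mode_variance, MU, LAM, ALPHA, BETA)

# A point whose mean is only 0.01 away from the peak.
off_peak = nig_log_pdf(MU + 0.01, mode_variance, MU, LAM, ALPHA, BETA)

print(f"log density at the peak       = {peak:.3f}")
print(f"log density 0.01 off the peak = {off_peak:.3f}")
```

Working in logs avoids overflow from the enormous β^α term. A positive log density at the peak means the density itself is above 1, which is allowed: the heights are densities that must integrate to 1 over the whole plane, so a sharply concentrated posterior has to pile up very high over a very small area.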

Under Bayesian statistics, falsification is treated as a heuristic rather than a core part of the process. We’re best served by trying to find areas where hypotheses differ, yet we never declare one hypothesis to be false. This saves Moriarty: he’s both correct in disclaiming falsification, and endorsing the process of trying to prove yourself wrong. The confusion between the two stems from having to deal with two separate paradigms that appear to have substantial overlap, even though a closer look reveals fundamental differences.