The Neo-Nazi in Chief

Ah, that was a great hike! What did I miss while I was gone…

oh. Oh dear. This should have been a slam-dunk: Nazis just committed acts of violence against unarmed protesters, and everyone hates Nazis. Yet all Trump could manage was this?!

We condemn in the strongest possible terms this egregious display of hatred, bigotry and violence on many sides. On many sides.

It’s been going on for a long time in our country. Not Donald Trump, not Barack Obama. This has been going on for a long, long time.

Maybe his cabinet did better? Let’s look at Mike Pence and Jeff Sessions:

I stand with @POTUS against hate & violence. U.S is greatest when we join together & oppose those seeking to divide us. #Charlottesville

“I have been in contact with our Department of Justice agents assisting at the scene and state officials,” Sessions said. “We will continue to support our state and local officers on the ground in any way possible. We stand united behind the president in condemning the violence in Charlottesville and any message of hate and intolerance. This kind of violence is totally contrary to American values and can never be tolerated. I want to thank all law enforcement personnel in the area for their commitment to protecting this community and the rule of law.”

Except police officers stood aside while neo-Nazis attacked protesters, and they did nothing when heavily armed neo-Nazis showed up. Great dodge there, Sessions. Though, on that note: how did Republicans who aren’t part of the White House react? They haven’t exactly been kind to minorities in the past, so maybe they too would sympathise with the neo-Nazis and pull their punches?

The president’s vagueness stood in contrast to his frequent contention, echoing many on the right, that “radical Islamic terrorism” cannot be defeated if political leaders are not willing to specifically call it that.

Among the prominent Republicans who took to Twitter to specifically condemn the neo-Nazis’ violence in Virginia, were House Speaker Paul D. Ryan, Sens. Marco Rubio of Florida and Orrin G. Hatch of Utah, former Republican Party Chairman Ed Gillespie, who is now running for governor in Virginia, and Ronna Romney McDaniel, the current chairwoman of the Republican National Committee.

No really, with the notable exception of Mitch McConnell they had no problems outright condemning neo-Nazis. Even Orrin Hatch was harsh.

 

“Very important for the nation to hear @potus describe events in #Charlottesville for what they are, a terror attack by #whitesupremacists,” tweeted Sen. Marco Rubio, R-Fla. […]

In a statement, Sen. John McCain, R-Ariz., went so far as to address “neo-Nazis” along with white supremacists, saying that the groups “are, by definition, opposed to American patriotism and the ideals that define us as a people and make our nation special.” […]

Speaker of the House Paul Ryan was among the high-ranking Republicans to speak out against the rally, calling it “repugnant” and “vile.” “The views fueling the spectacle in Charlottesville are repugnant. Let it only serve to unite Americans against this kind of vile bigotry,” Ryan wrote. […]

Sen. Tim Scott, R-S.C., the lone African-American Republican in the senate, also called the attack “domestic terror” and encouraged it to be “condemned.” “Otherwise hate is simply emboldened,” wrote Scott. […]

Sen. Ted Cruz slammed the violence associated with the rally and its aftermath in a strongly worded Facebook post.

“The Nazis, the KKK, and white supremacists are repulsive and evil, and all of us have a moral obligation to speak out against the lies, bigotry, anti-Semitism, and hatred that they propagate,” Cruz wrote in the statement.

“Having watched the horrifying video of the car deliberately crashing into a crowd of protesters, I urge the Department of Justice to immediately investigate and prosecute this grotesque act of domestic terrorism.”

OK, so this is definitely something specific to Trump’s White House. If it were an isolated incident, we might be able to dismiss it as a fluke, but instead this incident joins a long list of curious behaviour towards neo-Nazis from Trump and his associates.

  • “It’s this constant, “Oh, it’s the white man. It’s the white supremacists. That’s the problem.” No, it isn’t, Maggie Haberman. Go to Sinjar. Go to the Middle East, and tell me what the real problem is today. Go to Manchester.”
  • “A Trump administration effort to exclude violent white supremacists from a government anti-terrorism program and focus efforts solely on Islamist extremism drew a sharp backlash Thursday …”
  • “The State Department drafted its own statement last month marking International Holocaust Remembrance Day that explicitly included a mention of Jewish victims, according to people familiar with the matter, but President Donald Trump’s White House blocked its release. The existence of the draft statement adds another dimension to the controversy around the White House’s own statement that was released on Friday and set off a furor because it excluded any mention of Jews.”
  • “But Trump did not become the object of white nationalist affection simply because his positions reflect their core concerns. Extremists made him their chosen candidate and now hail him as “Emperor Trump” because he has amplified their message on social media—and, perhaps most importantly, has gone to great lengths to avoid distancing himself from the racist right. With the exception of Duke [HJH: which he later clawed back], Trump has not disavowed a single endorsement from the dozens of neo-Nazis, Klansmen, white nationalists, and militia supporters who have backed him. The GOP nominee, along with his family members, staffers, and surrogates, has instead provided an unprecedented platform for the ideas and rhetoric of far-right extremists, extending their reach. And when challenged on it by the press, Trump has stalled, feigned ignorance, or deflected—but has never specifically rejected any of these other extremists or their ideas.”
  • Rich Higgins wrote a neo-Nazi influenced memo (“globalists and bankers” are two of their things), which got him fired by McMaster. The Trumps loved the memo, though, and the president was outraged by the firing. McMaster has been isolated as a result.
  • Steve Bannon.

And as hinted at earlier, neo-Nazis view Trump as on their side.

Richard Spencer: Did Trump just denounce antifas? Or did Trump denounce the state police that cracked down on peacefully and lawfully assembled demonstrators?

Paul Nehlen: Like Pres. Trump, I condemn hatred and bigotry on all sides. Violent, illegal antifa attacks on lawful assemblies are especially repugnant.

I’m sorry, America, but you’ve elected a Nazi to lead your country. You might want to do something about that before he nukes the joint.

Stochastic Supertasks

I really loved this illustration of the paradoxes of infinity from Infinite Series, so much so that I’ll sum it up here.

What if you could do one task an infinite number of times over a finite time span? The obvious thing to do, granted this superpower, is start depositing money into your bank account. Let’s say you decide on plunking in one dollar at a time, an infinite number of times. Not happy at having to process an infinite number of transactions, your bank decides to withdraw a 1 cent fee after every one. How much money do you have in your bank account, once the dust settles?

Zero bucks. Think about it: at some point during the task you’ll have deposited N dollars in the account. The total amount the bank takes, however, keeps growing over time and at some point it’s guaranteed to reach N dollars too. This works for any value of N, and so any amount of cash you deposit will get removed.

In contrast, if the bank had decided to knock off 1 cent of your deposit before it reached your bank account, you’d both have an infinite amount of cash! This time around, there is no explicit subtraction to balance against the deposits, so your funds grow without bound.
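The difference between the two bank policies is easier to see if each penny gets a serial number. Here is a minimal sketch, assuming the bank always takes the oldest penny as its fee; the numbering and both function names are my own illustration, not part of the Infinite Series video.

```python
def fee_after_deposit(steps):
    """Deposit 100 numbered pennies per step, then the bank removes one.

    Assumption: the fee at step k is penny number k (the oldest one left).
    """
    account = set()
    next_penny = 1
    for step in range(1, steps + 1):
        account.update(range(next_penny, next_penny + 100))
        next_penny += 100
        account.remove(step)  # penny k leaves the account at step k
    return account


def fee_before_deposit(steps):
    """The bank keeps the first penny of each dollar before it is deposited."""
    account = set()
    next_penny = 1
    for _ in range(steps):
        # Only 99 of the 100 pennies ever enter the account.
        account.update(range(next_penny + 1, next_penny + 100))
        next_penny += 100
    return account


if __name__ == "__main__":
    after = fee_after_deposit(1000)
    before = fee_before_deposit(1000)
    print(len(after), len(before))   # same balance at every finite step
    print(min(after), min(before))   # oldest penny still in each account
```

At any finite step the two balances are identical. The difference is that in the first scheme every penny has a scheduled removal date, so none survives the full infinite run, while in the second scheme penny number 2 is never touched.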

Weird, right? Welcome to the Ross–Littlewood paradox. My formulation is a bit more fun than the original balls-and-urns approach, but does almost the same job (picture the balls as pennies). It does fail when it comes to removing a specific item from the container, though; in the Infinite Series video, the host challenges everyone to figure out what happens if you remove the median ball from the urn after adding nine, assuming each ball has a unique number assigned to it. Can’t do that with cash.

My attempt is below the fold.

[Read more…]

Building a Science Detector

Oh, let us count the ways

The Defense Advanced Research Projects Agency (DARPA) Defense Sciences Office (DSO) is requesting information on new ideas and approaches for creating (semi)automated capabilities to assign “Confidence Levels” to specific studies, claims, hypotheses, conclusions, models, and/or theories found in social and behavioral science research. These social and behavioral science Confidence Levels should rapidly enable a non-expert to understand and quantify the confidence they can have in a specific research result or claim’s reliability, reproducibility, and robustness.

First off, “confidence levels?” We’ve already got “confidence intervals,” and there’s been a decades-long push to use them in place of hypothesis testing.[1][2] This technique is fully compatible with frequentism (though over there it doesn’t mean what you think it does), and it even predates null-hypothesis significance testing! Alas, scientists find “we calculate a Cohen’s d of 0.3 ± 0.1” less satisfying to type than “we have refuted the null hypothesis.” The former shows a pretty weak effect, while the latter comes across as bold and confident. If those won’t do, what about meta-analyses? [3]

Second, these “confidence levels” would only apply to published research. Most research never gets published, yet those results are vital to understanding how strong any one finding is.[4] We can try to estimate the rate of unpublished works, and indeed over the decades many people have tried, but there is no current consensus on how best to compensate for the problem.[5][6][7]

Thirdly, “social and behavioral science?” The replication crisis extends much further, into biomedicine, chemistry, and so on. Physics doesn’t get mentioned much, and there’s a reason for that (beyond their love of confidence intervals). Emphasis mine:

Even if you adjust the acceptable P value, a test of statistical significance, from 0.05 to 0.005—the lower it is, the more significant your data—that won’t deal with, let’s say, bias resulting from corporate funding. (Particle physicists demand a P value below 0.0000003! And you gotta get below 0.00000005 for a genome-wide association study.)

Just think on that. “p < 0.0000003” means “if the null hypothesis is true, we would find a more extreme result in less than 1 in 3,333,333 trials on data like what we have observed.” If you wanted to see one of those exceptions, you’d have to do one experiment a day for 6,326 years just to have a better than 50/50 chance of spotting it. For comparison, the odds of a particular US citizen being struck by lightning over a year are 1 in 700,000; worldwide, the yearly odds of death by snake bite are about 1 in 335,000; and over the lifetime of a US citizen, the odds of them dying by dog attack are 1 in 112,400. p < 0.0000003 is a ridiculously high bar to leap, which means either a) false positives are easy to generate in physics, either via the law of large numbers or shoddy statistical techniques, or b) the field has been bitten so many times by results that can’t be replicated, even when they were real, that they’ve cranked the bar ridiculously high, or c) both.
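If you want to check the arithmetic behind that 6,326-year figure, here is a short sketch. It assumes independent daily experiments, each with a false-positive probability of exactly 0.0000003, and a 365.25-day year; the function names are mine.

```python
import math

P_THRESHOLD = 0.0000003  # the particle-physics threshold quoted above


def trials_for_even_odds(p):
    """Number of independent trials needed before the chance of at least
    one false positive, 1 - (1 - p)**n, passes 50%."""
    return math.ceil(math.log(0.5) / math.log1p(-p))


def years_of_daily_experiments(p, days_per_year=365.25):
    """Convert that trial count to years at one experiment per day."""
    return trials_for_even_odds(p) / days_per_year


if __name__ == "__main__":
    print(trials_for_even_odds(P_THRESHOLD))
    print(round(years_of_daily_experiments(P_THRESHOLD)))
```

That works out to a bit over 2.3 million experiments, or roughly 6,326 years of daily trials.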

Fourth, confidence isn’t everything. The Princeton Engineering Anomalies Research lab did studies where people tried to psychically bias random number generators. Over millions of trials, they got extremely significant results… but the odds of success were still around 50.1% vs. the expected 50%. Were they now confident that psychic abilities exist, or merely that luck and reporting bias could introduce a subtle skew into the data? Compacting those complexities into a number or label that a lay-person can understand is extremely difficult, perhaps impossible.
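To see how a 50.1% hit rate can be both tiny and “extremely significant,” here is a rough sketch using the normal approximation to a binomial test. The trial counts are hypothetical round numbers chosen for illustration, not PEAR’s actual figures, and the function name is mine.

```python
import math


def z_and_p(observed_rate, n_trials, null_rate=0.5):
    """Two-sided z-test for a proportion, via the normal approximation."""
    std_error = math.sqrt(null_rate * (1 - null_rate) / n_trials)
    z = (observed_rate - null_rate) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical sample sizes; the effect size stays fixed at 50.1% vs 50%.
    for n in (10_000, 1_000_000, 10_000_000):
        z, p = z_and_p(0.501, n)
        print(f"n={n:>10,}  z={z:5.2f}  p={p:.2g}")
```

The effect never changes, yet the p-value goes from unremarkable to minuscule purely because the sample grows. A single “confidence” number would hide that distinction.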

Basically, what DARPA is asking for has been hashed out in the literature for decades, and the best recommendations have been ignored.[8] They may have deep pockets and influence, but what DARPA wants requires a complete overhaul in how science is conducted across the globe, spanning everything from journals to how universities are organized.[9] When even quite minor tweaks to the scientific process are met with stiff opposition, pessimism seems optimistic.


[1] Gardner, Martin J., and Douglas G. Altman. “Confidence intervals rather than P values: estimation rather than hypothesis testing.” Br Med J (Clin Res Ed) 292.6522 (1986): 746-750.

[2] Rozeboom, William W. “The fallacy of the null-hypothesis significance test.” Psychological bulletin 57.5 (1960): 416.

[3] Egger, Matthias, et al. “Bias in meta-analysis detected by a simple, graphical test.” Bmj 315.7109 (1997): 629-634.

[4] Rosenthal, Robert. “The file drawer problem and tolerance for null results.” Psychological bulletin 86.3 (1979): 638.

[5] Franco, Annie, Neil Malhotra, and Gabor Simonovits. “Publication bias in the social sciences: Unlocking the file drawer.” Science 345.6203 (2014): 1502-1505.

[6] Rosenberg, Michael S. “The file-drawer problem revisited: a general weighted method for calculating fail-safe numbers in meta-analysis.” Evolution 59.2 (2005): 464-468.

[7] Simonsohn, Uri, Leif D. Nelson, and Joseph P. Simmons. “P-curve: a key to the file-drawer.” Journal of Experimental Psychology: General 143.2 (2014): 534.

[8] Sedlmeier, Peter, and Gerd Gigerenzer. “Do studies of statistical power have an effect on the power of studies?.” Psychological bulletin 105.2 (1989): 309.

[9] Rawat, Seema, and Sanjay Meena. “Publish or Perish: Where Are We Heading?” Journal of Research in Medical Sciences : The Official Journal of Isfahan University of Medical Sciences 19.2 (2014): 87–89. Print.

All the President’s Bots

Trump appears cranky. It’s raining in New Jersey, so he can’t golf, er, work, which leaves him with no choice but to hate-watch CNN. Vets are angry with him, his policies are hurting his base, the polls have him at his lowest point since taking office, foreign diplomats view him as a clown, and he has nothing to show for his first six months.

He still has friends, though.

"@1lion: brilliant 3 word response to Hilary's 'I'm With You' slogan. @realDonaldTrump twitter.com/seanhannity/"

Aww, at least one person likes Trump!

ilion on Twitter: STILL hasn't made a single Tweet.

… or maybe not? As an old Cracked article pointed out, Trump had a habit of quoting Tweets that didn’t exist, from people who just joined Twitter or were obvious bots. It was an easy way to make himself look more popular than he was, and stroke his ego. He put this to rest after winning the presidency, but that appears to be changing.

In a tweet on Saturday, President Donald Trump expressed thanks to Twitter user @Protrump45, an account that posted exclusively positive memes about the president. But the woman whose name was linked to the account told Heavy that her identity was stolen and that she planned to file a police report. The victim asserted that her identity was used to sell pro-Trump merchandise.

Although “Nicole Mincey” was the name displayed on the Twitter page, it was not the name used to create the account. The real name of the victim has been withheld to protect her privacy.

The @Protrump45 account also linked to the website Protrump45.com which specialized in Trump propaganda. All of the articles on the website were posted by other Twitter users, which also turned out to be fakes. Mashable noted that the accounts were suspected of being so-called “bots” used to spread propaganda about Trump. Russia has been accused of using similar tactics with bots during the 2016 campaign.

The “Nicole Mincey” scam was remarkably advanced, backed up by everything from paid articles pretending to be journalism to real-life announcers-for-hire singing her praises.

So the latest thing in the Trump resistance is bot-hunting. It’s pretty easy to do, once you’ve seen someone else do it, and the takedown procedure is also a breeze. It also silences a lot of Trump’s best friends.

If only we could do the same to Trump.

Faux Calls for Debate

Much has been written about that (sadly too common) “manifesto” from a Google employee. I’d chip in with a dissection of their claims about the biological capabilities of women, but I’ve already done so, multiple times. I also haven’t seen anyone bring up that women were essential to early computation, but again I’ve touched on that angle already.

So instead, I’ll settle for boosting one of the more interesting takes. Take it away, Dr. NerdLove.

This isn’t really about the memo itself but more in how people treat others, esp. other people who they fundamentally oppose.

First: the positioning oneself as being rational. In cases like this? It’s enshrining “This doesn’t affect me” as “value-neutral”.

It’s easy to say “This is a topic that should be debated” when it’s not something that will touch you at all.

The superiority of Genesis vs Super Nintendo is a debate. EMACS vs [Vim] is a debate. Virtue of flat currency vs. fiat currency: debate.

“Your gender means that you’re biologically incapable of doing this job and should never have been hired” isn’t a fucking debate.

(Especially when it’s someone who doesn’t understand biology, gender roles, etc. saying this)

It’s easy for someone who has no skin in the game to say it should be “debated” because the outcome doesn’t affect them at *all*.

Let us also be real: this person doesn’t actually mean “debate”. Just as when eggs demand debate on Twitter, they don’t mean debate.

They aren’t asking for a structured discussion about whether someone has the biological capability to write code. They want to gish-gallop.

This isn’t going to be about changing minds or opinions. This is about playing to an audience and making enough noise to shut the others up.

This is why it’s so often framed as “if you won’t discuss this with me, your ideas are weak”. It’s about endurance and frustration.

In this case “whomever quits participating first” is framed as the “loser”. Not “there’s no point b/c this isn’t in good faith”.

These “debates” inevitably turn into goal-post shifts and fire-hose torrents of bullshit (gish-galloping) until the other person leaves.

Dr. NerdLove has more detail, but you get the basic gist: that manifesto isn’t calling for debate, any more than the creationists who make the same maneuver. This is about legitimizing a viewpoint that is illegit on its face, and hounding you until you pack up and leave. Don’t fall for it.

Sex Around the World

Oh, Jerry Coyne. I’m amused with his defense of a sex binary

In Drosophila and humans, the two species with which I’m most familiar, the behavior, appearance, and primary and secondary sex characteristics are determined almost completely by whether the chromosomal constitution is male (XY) or female (XX).

… since, like most such “scientific” defenses, he immediately turns around and shoots it in the foot.

Yes, there are a few exceptions, like AIS, but the various forms of that syndrome occur between 1 in every 20,000 to 1 in only 130,000 births. Is that “too many examples” to allow us to say that biological sex is not connected with chromosomes? If you look at all cases of intersexuality that occur in people with XX or XY chromosomes (we’re not counting XOs or XXYs or other cases of abnormal chromosomal number), the frequency of exceptions is far less than 1%. That means that, in humans as in flies, there is almost a complete correlation between primary/secondary sex characteristics and chromosome constitution.

Ah yes, chromosomes determine human sex except in the 0.05% to 1.7% of cases where they don’t. Brilliant logic, that.

But it’s easy to get trapped by your filter bubble. The internet is a lot bigger than North America, after all, and other places have their own view of sex. Take Sweden, for instance, where it’s government policy to avoid teaching gender stereotypes. One kindergarten made headlines not too long ago by declaring itself “gender-neutral.” As the founder put it,

00:10:10,909 –> 00:11:03,329
I’m going to show you what we call the “whole life spectra.” We tend to divide this life spectra into two pieces, one for boys and one for girls. More often pink is for girls, and blue is for boys. When we call a boy “cool” and “strong,” and to girls we more often say that they should be “helpful,” “nice,” “cute,” we have different expectations [for how they behave]. We take away this border, and we don’t separate into “boyish” and “girlish,”  we give the whole life spectra to everyone. So we are not limiting, we are just adding. We are not changing the children, we are changing our own thoughts.

That video is worth watching, as it follows around two gender non-conforming kids with an intersex “ma-pa.” The few bigots on screen seem right out of 1984, claiming that expanding or eliminating gender stereotypes somehow constrains kids in some mysterious fashion. Every kid, in contrast, is either at ease with gender role fluidity or made uncomfortable when asked to label their gender.

But even Sweden appears behind the curve when contrasted with the Khawaja Sira of South Asia.

For centuries, South Asia has had its own Khawaja Sira or third gender culture. The community, identifying as neither male nor female, are believed by many to be “God’s chosen people,” with special powers to bless and curse anyone they choose. The acceptance of Khawaja Sira people in Pakistan has been held up internationally as a symbol of tolerance, established long before Europe and America had even the slightest semblance of a transgender rights movement.

But the acceptance of people defining their own gender in Pakistan is much more complicated. The term transgender refers to someone whose gender identity differs from their birth sex. This notion is yet to take root in Pakistan and the transgender rights movement is only beginning to assert itself formally. Now, some third gender people in Pakistan say the modern transgender identity is threatening their ancient third gender culture.

The problem is that the Khawaja Sira are allowed to exist within South Asian culture because they renounce both male and female gender roles, thus don’t challenge either. Trans* people, on the other hand, reject the role assigned to the Khawaja Sira and invoke the male or female one instead. This upsets every gender’s apple cart. It doesn’t help either that the Khawaja Sira in Pakistan have recently fallen onto hard times, facing increasing bigotry and hate; the increasing number of trans* people feels like an invasion of “Western” ideals, at a time when their community is ill-equipped to cope.

But do you remember hearing about Oyasiqur Rhaman, the atheist blogger murdered in Bangladesh? His murderers were outed by a courageous “hijra,” which is similar in meaning to “Khawaja Sira” but not quite the same.

Transgender people occupy an unusual social stratum in South Asia, where conservative societies still consider same-sex intercourse to be a crime but also allow the existence of a third gender — a well-established category that dates back to the age of the “Kama Sutra.” Nepal, Pakistan, Bangladesh and India have all legally recognized the existence of a third gender, including on passports and other official documents.

In India, in fact, “kinnar” freely mixes gender identity with non-binary sex. Compare and contrast this with Mexico’s “muxes,” who are called a third gender but in practice act more like trans* women, and Balkan sworn virgins who are more like trans* men. There’s no intersex component to the latter two, so lumping everybody under the banner of “third gender” or “transgender” is quite misleading.

Our binary view of sex and gender seems terribly archaic (which is ironic, as it may be a recent invention). It should not be controversial in North America to have a non-conforming parent or be raised in a genderless environment, yet it is. We could learn a thing or two from the rest of the world, especially when it comes to sex.

“Science Is Endangered by Statistical Misunderstanding”

He’s baaaaaack. I’ve linked to David Colquhoun’s previous paper on p-values,[1] but I’ve since learned he’s about to publish a sequel.

Despite decades of warnings, many areas of science still insist on labelling a result of P < 0.05 as “significant”.   This practice must account for a substantial part of the lack of reproducibility in some areas of science. And this is before you get to the many other well-known problems, like multiple comparisons, lack of randomisation and P-hacking. Science is endangered by statistical misunderstanding, and by university presidents and research funders who impose perverse incentives on scientists. [2]

[Read more…]

Transgender People, Sexuality, and Attraction

Shiv’s knocked it out of the park again. Her post is a long and thorough discussion of Laci Green’s batshittery, and defies easy summation. Maybe this paragraph is representative of the whole?

Remember the earlier contrast between theory and practice when it comes to trans women’s bodies? Jones also does sex work. The primary audience for her porn is straight men. Jones has spent the past few weeks being inundated with abuse over the idea of anyone finding her attractive all the while people are shelling out cash for her sexualized pics and clips. Performative declarations about how unfuckable I am brush up against my reality, where I spend my weekends doing naked things that are illegal in many countries across the world (ifyouknowwhatimean). It’s two different conversations happening at the same time, and one of them looks–if you’ll pardon my French–fucking silly. “Ur ugly” is the discourse of children and playgrounds.

Hmmm, no. These two?

Hint number one that we’re dealing with an expression of prejudice: Green can’t decide who the subjects are, unless she’s suddenly started believing that (cis) lesbians and (cis) straight men are the same. Badly misapprehending Jones’ tweet would have at least supported her conclusion that trans women are trying to coerce straight men, but instead she comes out of left field by making it about (cis) lesbians. That she saw no contradiction in these premises suggests to me she is starting with her conclusion and working her way backwards.

Hint number two that we’re dealing with an expression of prejudice: Green uses “lesbian” as mutually exclusive with “trans YouTuber.” Neither cis lesbians that can be attracted to trans women, nor trans lesbians, exist. Apparently.

Nope, that’s worse. Tell you what, just read the thing. It’ll make you reflect on your views of attraction and sexuality, most likely, and that alone is worth it.

P-values are Bullshit, 1942 edition

I keep an eye out for old criticisms of null hypothesis significance testing. There’s just something fascinating about reading this…

In this paper, I wish to examine a dogma of inferential procedure which, for psychologists at least, has attained the status of a religious conviction. The dogma to be scrutinized is the “null-hypothesis significance test” orthodoxy that passing statistical judgment on a scientific hypothesis by means of experimental observation is a decision procedure wherein one rejects or accepts a null hypothesis according to whether or not the value of a sample statistic yielded by an experiment falls within a certain predetermined “rejection region” of its possible values. The thesis to be advanced is that despite the awesome pre-eminence this method has attained in our experimental journals and textbooks of applied statistics, it is based upon a fundamental misunderstanding of the nature of rational inference, and is seldom if ever appropriate to the aims of scientific research. This is not a particularly original view—traditional null-hypothesis procedure has already been superceded in modern statistical theory by a variety of more satisfactory inferential techniques. But the perceptual defenses of psychologists are particularly efficient when dealing with matters of methodology, and so the statistical folkways of a more primitive past continue to dominate the local scene.[1]

… then realising it dates from 1960. So far I’ve spotted five waves of criticism: Jerzy Neyman and Egon Pearson head the first, dating from roughly 1928 to 1945; a number of authors such as the above-quoted Rozeboom formed a second wave between roughly 1960 and 1970; Jacob Cohen kicked off a third wave around 1990, which maybe lasted until his death in 1998; John Ioannidis spearheaded another wave in 2005, though this died out even quicker; and finally the “replication crisis” that kicked off in 2011 and is still ongoing as I type this.

I do like to search for papers outside of those waves, however, just to verify the partition. This one doesn’t qualify, but it’s pretty cool nonetheless.

Berkson, Joseph. “Tests of Significance Considered as Evidence.” Journal of the American Statistical Association 1942;37:325-35. International Journal of Epidemiology, vol. 32, no. 5, 2003, pp. 687.

For instance, they point to a specific example drawn from Ronald Fisher himself. The latter delves into a chart of eye facet frequency in Drosophila melanogaster, at various temperatures, and extracts some means. Conducting an ANOVA test, Fisher states “deviations from linear regression are evidently larger than would be expected, if the regression were really linear, from the variations within the arrays,” then concludes “There can therefore be no question of the statistical significance of the deviations from the straight line.”

Berkson’s response is to graph the dataset.

Eye facets vs. temperature, Drosophila melanogaster, graphed and fit to a line. From Fisher (1938).

The middle points look like outliers, but it’s pretty obvious we’re dealing with a linear relationship. That Fisher’s tests reject linearity is a blow against using them.

Jacob Cohen made a very strong argument against Fisherian frequentism in 1994, the “permanent illusion,” which he attributes to a paper by Gerd Gigerenzer in 1993.[3][4] I can’t find any evidence Gigerenzer actually named it that, but it doesn’t matter; Berkson scoops both of them by a whopping 51 years, then extends the argument.

Suppose I said, “Albinos are very rare in human populations, only one in fifty thousand. Therefore, if you have taken a random sample of 100 from a population and found in it an albino, the population is not human.” This is a similar argument but if it were given, I believe the rational retort would be, “If the population is not human, what is it?” A question would be asked that demands an affirmative answer. In the null hypothesis schema we are trying only to nullify something: “The null hypothesis is never proved or established but is possibly disproved in the course of experimentation.” But ordinarily evidence does not take this form. With the corpus delicti in front of you, you do not say, “Here is evidence against the hypothesis that no one is dead.” You say, “Evidently someone has been murdered.”[5]
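Berkson’s albino example is easy to turn into numbers. Below is a small sketch that treats “at least one albino in the sample” as the test statistic and “the population is human” as the null hypothesis. The function name and the 0.05 cutoff are my assumptions, not Berkson’s.

```python
def prob_at_least_one(rate, sample_size):
    """Chance of seeing one or more rare cases in a random sample,
    assuming independent draws at a fixed rate."""
    return 1 - (1 - rate) ** sample_size


ALBINO_RATE = 1 / 50_000   # Berkson's figure for human populations
SAMPLE_SIZE = 100          # Berkson's sample size
ALPHA = 0.05               # conventional cutoff, assumed for illustration

p_value = prob_at_least_one(ALBINO_RATE, SAMPLE_SIZE)
reject_human = p_value < ALPHA

if __name__ == "__main__":
    print(f"p = {p_value:.4f}; reject 'the population is human': {reject_human}")
```

The p-value comes out near 0.002, so a mechanical significance test rejects “human” while offering nothing to put in its place.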

This hints at Berkson’s way out of the p-value mess: ditch falsification and allow evidence in favour of hypotheses. They point to another example or two to shore up their case, but can’t extend this intuition to a mathematical description of how this would work with p-values. A pity, but it was for the best.


[1] Rozeboom, William W. “The fallacy of the null-hypothesis significance test.” Psychological bulletin 57.5 (1960): 416.

[2] Berkson, Joseph. “Tests of Significance Considered as Evidence.” Journal of the American Statistical Association 1942;37:325-35. International Journal of Epidemiology, vol. 32, no. 5, 2003, pp. 687.

[3] Cohen, Jacob. “The Earth is Round (p < .05).” American Psychologist, vol. 49, no. 12, 1994, pp. 997-1003.

[4] Gigerenzer, Gerd. “The superego, the ego, and the id in statistical reasoning.” A handbook for data analysis in the behavioral sciences: Methodological issues (1993): 311-339.

[5] Berkson (1942), pg. 326.