How did we end up generally adopting a 95% confidence level? A post at Empirical Legal Studies points to a paper that takes up the question: "The Adoption of Significance Tests by the Scientific Community: An Empirical Analysis," by David A. Gully (Columbia, Engineering). The paper traces how the 0.05 significance standard came to be adopted. The abstract explains the history as follows:
"Tests of statistical significance are routinely used in many research studies. However, there are critics of this paradigm, and also a lingering sense that critical test levels are somewhat arbitrary. This paper adds to the literature by determining the timing and level of acceptance of common tests of statistical inference. Using the archives of the Royal Society, we examined 574 research studies published between 1926 and 1997, by which point adoption was virtually complete. We find that the rate and level of adoption rises over time, in a manner broadly consistent with the theoretical literature on the adoption rate of innovations. We detect the presence of several influences on the rate of adoption, which may include prior custom, the nature of empirical research topics being reported, the increasing ease of computer processing, and possibly journal editorial policies. We find that confidence/significance testing has been adopted by a majority of the scientific community for over 50 years; the customary reliance on 95 percent confidence (five percent significance) is upheld by the data; and that confidence intervals and critical significance levels are both widely reported and often together in recent decades. For historians of science these data suggest that neither Fisher nor Pearson conclusively “won” their private war. The study sheds new light on an issue of considerable practical importance, the admissibility of statistical evidence in most courts in the United States."