How to cite this WIREs title:
WIREs Comp Stat

Bayesian and frequentist testing for differences between two groups with parametric and nonparametric two‐sample tests


Abstract

Testing for differences between two groups is one of the scenarios most often faced by scientists across all domains and is particularly important in the medical sciences and psychology. The traditional solution to this problem is rooted in the Neyman–Pearson theory of null hypothesis significance testing and uniformly most powerful tests. In the last decade, considerable progress has been made in developing Bayesian versions of the most common parametric and nonparametric two‐sample tests, including Student's t‐test and the Mann–Whitney U test. In this article, we review the underlying assumptions, models, and implications for research practice of these Bayesian two‐sample tests and contrast them with the existing frequentist solutions. We also show that, in general, Bayesian and frequentist two‐sample tests behave differently with respect to type I and type II error control, which needs to be carefully balanced in practical research.

This article is categorized under:
Statistical and Graphical Methods of Data Analysis > Bayesian Methods and Theory
Statistical and Graphical Methods of Data Analysis > Monte Carlo Methods
Statistical and Graphical Methods of Data Analysis > Markov Chain Monte Carlo

Left: Model for the Bayesian parametric two‐sample t‐test of Rouder et al. (2009), which is itself a special case of the model of Gronau et al. (2019). The data x_i and y_i are normally distributed with grand mean μ, standard deviation σ, and total effect α = δσ. The grand mean μ is assigned a flat prior p(μ) = 1, σ² is assigned the Jeffreys prior p(σ²) = 1/σ², and the effect size δ is assigned a Cauchy C(0, γ) prior. Right: Model of the Bayesian Wilcoxon rank sum test used for Gibbs sampling by van Doorn et al. (2020). The observed ranks in the two groups are produced by unobserved latent variables, each following a normal distribution with σ = 1 and a group‐specific shifted mean. The prior on the effect size δ depends on a parameter g, which is itself assigned a hyperprior.
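
As a rough illustration of the left‐hand model in the caption above, the following Python sketch computes a Bayes factor for a two‐sample t statistic under a Cauchy C(0, r) prior on δ, using the well‐known integral representation in which δ given g is normal and g follows an inverse gamma distribution. The function name `jzs_bf10`, the default scale r = √2/2, and the midpoint integration rule with `steps` grid points are assumptions made for this sketch; they are not taken from the article.

```python
import math

def jzs_bf10(t, n1, n2, r=math.sqrt(2) / 2, steps=20000):
    """Approximate BF10 for a two-sample t statistic with a Cauchy(0, r) prior on delta.

    Hypothetical helper for illustration; r and steps are assumed defaults.
    """
    n_eff = n1 * n2 / (n1 + n2)  # effective sample size
    nu = n1 + n2 - 2             # degrees of freedom

    # Likelihood of t under H0 (delta = 0), up to a constant shared with H1
    null_like = (1 + t**2 / nu) ** (-(nu + 1) / 2)

    def integrand(g):
        # delta | g ~ N(0, r^2 g) and g ~ InverseGamma(1/2, 1/2)
        # together imply delta ~ Cauchy(0, r)
        scale = 1 + n_eff * g * r**2
        log_like = (-0.5 * math.log(scale)
                    - (nu + 1) / 2 * math.log1p(t**2 / (scale * nu)))
        log_prior = (-0.5 * math.log(2 * math.pi)
                     - 1.5 * math.log(g) - 1 / (2 * g))
        return math.exp(log_like + log_prior)

    # Midpoint rule after substituting g = u / (1 - u), mapping (0, inf) to (0, 1)
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        g = u / (1 - u)
        total += integrand(g) / (1 - u) ** 2
    alt_like = total / steps

    return alt_like / null_like
```

Calling `jzs_bf10(t, n1, n2)` with the observed t statistic and the two group sizes returns values above 1 when the data favor a nonzero effect and below 1 when they favor the null. Established software should be preferred for real analyses.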
Left: Model for the Bayesian parametric two‐sample t‐test of Kruschke (2013). Data y[i|j] is the ith observation in the jth group, j = 1, 2, and is distributed as y[i|j] ∼ tν(μj, σj), where ν ∼ exp(K). Right: Two‐component Gaussian mixture model with known allocations of Kelter (2020c), where, for i = 1, 2, priors μi ∼ N(b0, B0) are placed on the component means.
Type II error rates for Bayesian and frequentist two‐sample tests under different distributions
Type I error rates for Bayesian and frequentist two‐sample tests under different distributions
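
To make concrete how a type I error rate of the kind summarized in the caption above can be estimated, the following Python sketch runs a small Monte Carlo simulation for the frequentist Mann–Whitney U test only. It assumes standard normal data in both groups, a normal approximation to the U statistic without continuity or tie correction, and hypothetical function names and defaults (`n`, `reps`, `alpha`, `seed`); it does not reproduce the simulation design of the article.

```python
import math
import random

def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value via the normal approximation (no ties assumed)."""
    n1, n2 = len(x), len(y)
    # Rank the pooled sample; continuous data are assumed, so ties are ignored
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = sum(rank for rank, (_, group) in enumerate(pooled, start=1)
                     if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))

def type_one_error_rate(n=30, reps=2000, alpha=0.05, seed=1):
    """Share of false rejections when both groups share the same distribution."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Both samples come from N(0, 1), so the null hypothesis is true
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        if mann_whitney_p(x, y) < alpha:
            rejections += 1
    return rejections / reps
```

Under these assumptions the estimated rate should lie near the nominal level `alpha`. Replacing the data‐generating lines with a shifted or heavy‐tailed distribution gives a corresponding sketch for type II error rates or for other distributions.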
Prior and posterior plot of the effect size δ for the nonparametric Bayesian Mann–Whitney U test of van Doorn et al. (2020) when using a prior on δ in the computer science education data set of Kelter et al. (2018)
Top: Prior and posterior plot of the effect size δ for the parametric Bayesian two‐sample test of Gronau et al. (2019) when using a prior on δ in the kitchen rolls data set of Wagenmakers et al. (2015); Bottom left: Robustness check for BF01 for varying prior width; Bottom right: Sequential analysis of how BF10 changes when each observation is gradually incorporated into the analysis

