Overview
Researchers who want to demonstrate that two conditions are equivalent frequently run a standard null hypothesis test, find p>0.05, and conclude equivalence. This is a fundamental logical error. A non-significant result does not establish equivalence; it shows only that the evidence was insufficient to detect a difference at the given sample size. A study with N=15 per group has low power for all but large effects, so it will often produce p>0.05 even when a real, practically meaningful difference exists. Establishing equivalence requires an equivalence test such as TOST (Two One-Sided Tests) with pre-specified equivalence bounds.
The Equivalence Testing Framework applies TOST, requires the equivalence bounds to be pre-specified from domain knowledge, and produces conclusions that distinguish "insufficient evidence of difference" (a non-significant traditional null test) from "sufficient evidence of equivalence" (both one-sided tests reject, so the effect lies within the bounds).
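To make the distinction concrete, here is a minimal sketch of a two-sample TOST in Python. It is an illustration under stated assumptions, not the framework's own implementation: the function names (t_cdf, tost_two_sample), the pooled-variance t-test, the bounds, and the data are all made up for this example. The t distribution is integrated numerically so the sketch needs only the standard library; in practice a statistics package would normally supply it.

```python
import math
from statistics import mean, variance


def t_cdf(t, df, steps=2000):
    """Student's t CDF via Simpson's rule on the density (illustrative helper)."""
    if t == 0:
        return 0.5
    norm = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    norm /= math.sqrt(df * math.pi)

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + area if t > 0 else 0.5 - area


def tost_two_sample(x, y, low, high, alpha=0.05):
    """Pooled-variance TOST for mean(x) - mean(y) against bounds (low, high).

    Assumes independent groups with roughly equal variances; the bounds
    must be chosen from domain knowledge before looking at the data.
    """
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    diff = mean(x) - mean(y)

    # One-sided test 1: H0 is diff <= low, rejected when diff is clearly above low.
    p_lower = 1 - t_cdf((diff - low) / se, df)
    # One-sided test 2: H0 is diff >= high, rejected when diff is clearly below high.
    p_upper = t_cdf((diff - high) / se, df)
    p_tost = max(p_lower, p_upper)  # both one-sided tests must reject

    # Traditional two-sided null test of diff == 0, for comparison.
    p_nhst = 2 * (1 - t_cdf(abs(diff / se), df))

    if p_tost < alpha:
        conclusion = "sufficient evidence of equivalence"
    elif p_nhst < alpha:
        conclusion = "evidence of a difference"
    else:
        conclusion = "insufficient evidence of difference"
    return {"diff": diff, "p_nhst": p_nhst, "p_tost": p_tost,
            "conclusion": conclusion}


# Hypothetical data: 30 precise measurements per group, bounds of +/-0.5.
precise_a = [10.0, 10.1, 9.9, 10.2, 9.8] * 6
precise_b = [10.05, 10.15, 9.95, 10.25, 9.85] * 6
result_precise = tost_two_sample(precise_a, precise_b, low=-0.5, high=0.5)

# Hypothetical data: 15 noisy measurements per group, bounds of +/-1.0.
noisy_a = [4, 12, 7, 15, 9, 3, 11, 8, 14, 6, 10, 5, 13, 9, 8]
noisy_b = [v + 2 for v in noisy_a]
result_noisy = tost_two_sample(noisy_a, noisy_b, low=-1.0, high=1.0)

print(result_precise["conclusion"])
print(result_noisy["conclusion"])
```

Both made-up comparisons give p>0.05 on the traditional test, yet only the first should pass TOST. The second is the small, noisy case described above: a non-significant result that supports neither a difference nor equivalence.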