2.3.3 Statistical analysis

Revision as of 19:41, 23 March 2021 by 2a02:908:182:f8e0:f44a:43e3:950d:cfda (talk)

A. Background & Definitions

P-hacking: P-hacking means that analytical decisions are made after the results are known, and that the data are analyzed in many different ways until the desired result is obtained. This can include, e.g., switching to an alternative statistical test or the post-hoc use of normalization. A post-hoc increase in sample size or number of experiments also constitutes p-hacking. Examples of various forms of p-hacking can be found in Motulsky 2015.
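The effect of a post-hoc increase in sample size can be illustrated with a small simulation. The sketch below is not taken from Motulsky 2015; it assumes normally distributed noise with a known standard deviation of 1, a z-test, and hypothetical batch sizes (`looks`). It compares a single planned test at the final sample size with testing after every batch and stopping at the first p-value below the threshold.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA = 0.05  # preset significance threshold


def z_test_p(sample):
    """Two-sided p-value for H0: mean = 0, assuming a known SD of 1."""
    z = fmean(sample) * sqrt(len(sample))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_sims=2000, looks=(10, 20, 30, 40, 50), seed=1):
    """Return (false-positive rate with a fixed n, rate with repeated looks)."""
    rng = random.Random(seed)
    fixed_hits = peeking_hits = 0
    for _ in range(n_sims):
        # Data are pure noise, so every "positive" result is a false positive.
        data = [rng.gauss(0, 1) for _ in range(looks[-1])]
        # Planned analysis: a single test at the final sample size.
        if z_test_p(data) < ALPHA:
            fixed_hits += 1
        # P-hacked analysis: test after each batch, stop at the first p < ALPHA.
        if any(z_test_p(data[:n]) < ALPHA for n in looks):
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims


fixed_rate, peeking_rate = simulate()
print(f"fixed n: {fixed_rate:.3f}   repeated looks: {peeking_rate:.3f}")
```

Because the repeated-looks analysis counts a hit whenever any of the interim tests falls below the threshold, its false-positive rate can only be equal to or higher than that of the single planned test.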

"HARKing" means Hypothesizing After the Results are Known: a hypothesis derived from the interpretation of the data is presented as having existed before the data were obtained.

B. Guidance & Expectations

The following recommendations are based on Motulsky 2015.

Statistical analysis should be performed exactly as described in the study protocol.

Any changes (e.g. in the steps used to process and analyze the data, or changes to the study hypothesis) must be documented; the reason for a change must be explained and the study conclusion may need to be labeled as “preliminary”.

As the p-value provides no information about the actual size of the observed effect, it is recommended to calculate, document and present the effect size as difference, percent difference, ratio, or correlation coefficient along with its confidence interval.
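As an illustration, an effect size and its confidence interval could be computed along the following lines. This is a minimal sketch, not a prescribed method: the measurements are invented, the function name `mean_difference_ci` is hypothetical, and the percentile bootstrap is only one of several ways to obtain a confidence interval (a t-based interval is a common alternative).

```python
import random
from statistics import fmean


def mean_difference_ci(control, treated, n_boot=5000, level=0.95, seed=0):
    """Difference in means (treated - control) with a percentile bootstrap CI."""
    rng = random.Random(seed)
    observed = fmean(treated) - fmean(control)
    boot = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        boot.append(fmean(t) - fmean(c))
    boot.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return observed, boot[lo_idx], boot[hi_idx]


# Hypothetical measurements, invented for illustration only.
control = [4.1, 5.0, 4.6, 5.3, 4.8, 4.4, 5.1, 4.7]
treated = [5.6, 6.1, 5.2, 6.4, 5.9, 5.5, 6.0, 5.8]

diff, ci_low, ci_high = mean_difference_ci(control, treated)
percent = 100 * diff / fmean(control)
print(f"difference = {diff:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), "
      f"i.e. {percent:.0f}% of the control mean")
```

Reporting the difference together with its interval tells the reader both how large the effect is and how precisely it was estimated, which a p-value alone does not.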

It is strongly recommended to report statistical hypothesis testing (and place significance asterisks on figures) only if a decision is to be based on that one analysis.

The use of the word “significant” in a report or a publication is strongly discouraged: in plain English “significant” means “relevant” or “important”, but a p-value provides no information about the importance of a finding. If statistical hypothesis testing is used to make a decision, it is recommended to state the p-value, the preset p-value threshold (statistical alpha), and the decision.
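A report that states the p-value, the preset threshold and the resulting decision might be produced as in the sketch below. The data are invented, the permutation test is chosen here only because it needs no external library, and the function name `permutation_p_value` is hypothetical; the test named in the study protocol should be used in practice.

```python
import random
from statistics import fmean

ALPHA = 0.05  # threshold fixed before looking at the data


def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(fmean(a) - fmean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled values to two groups of the same sizes.
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:len(a)]) - fmean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical measurements, invented for illustration only.
control = [4.1, 5.0, 4.6, 5.3, 4.8, 4.4, 5.1, 4.7]
treated = [5.6, 6.1, 5.2, 6.4, 5.9, 5.5, 6.0, 5.8]

p_value = permutation_p_value(control, treated)
decision = "reject H0" if p_value < ALPHA else "do not reject H0"
print(f"p = {p_value:.4f}, alpha = {ALPHA}, decision: {decision}")
```

The output names the three elements explicitly and avoids the word “significant” altogether.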

Once the statistical analysis is conducted, it is recommended to plot figures that show the distribution of the data (scatter plot, box-and-whiskers plot, violin plot). However, if the data have to be presented as summary statistics (e.g. in a table), display the results as mean and standard deviation (mean ± SD), or as median with interquartile range if a normal distribution is not assumed (for more information please check the item 2.3.4 Data visualization).
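For tabulated results, both summaries can be obtained with Python's standard library, as sketched below. The values are invented and deliberately include one outlier; note that `statistics.quantiles` uses its default "exclusive" method here, and other software may compute quartiles slightly differently.

```python
from statistics import fmean, median, quantiles, stdev

# Hypothetical measurements with one outlier, invented for illustration only.
values = [2.1, 2.4, 2.2, 2.9, 2.5, 2.3, 7.8, 2.6, 2.4, 2.7]

mean_value = fmean(values)
sd = stdev(values)                  # sample standard deviation (n - 1)
q1, _, q3 = quantiles(values, n=4)  # first and third quartile
med = median(values)

print(f"mean ± SD:    {mean_value:.2f} ± {sd:.2f} (n = {len(values)})")
print(f"median (IQR): {med:.2f} ({q1:.2f} to {q3:.2f})")
```

With skewed data such as these, the mean is pulled toward the outlier while the median and interquartile range stay close to the bulk of the observations.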

It is recommended not to plot the mean with error bars that represent the standard error of the mean (mean ± SEM), because the SEM is an indicator of precision rather than of variability and, as such, is less informative than a confidence interval.
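The difference between SD and SEM can be seen by drawing samples of increasing size from the same population. The sketch below assumes a hypothetical normal population (mean 100, SD 10): the SD stays close to the population value, whereas the SEM, computed as SD divided by the square root of n, shrinks as n grows.

```python
import random
from math import sqrt
from statistics import stdev

rng = random.Random(0)
POPULATION_SD = 10.0  # hypothetical population spread


def sd_and_sem(n):
    """Draw n values from the hypothetical population; return (SD, SEM)."""
    sample = [rng.gauss(100.0, POPULATION_SD) for _ in range(n)]
    sd = stdev(sample)
    return sd, sd / sqrt(n)


results = {n: sd_and_sem(n) for n in (10, 100, 1000)}
for n, (sd, sem) in results.items():
    print(f"n = {n:4d}   SD = {sd:5.2f}   SEM = {sem:5.2f}")
```

Small SEM error bars therefore reflect a large sample size as much as they reflect the data, and say little about how much individual observations vary.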

It is strongly recommended to report all details when describing statistical methods (for more information please check the item 2.3.4 Data visualization).


It is recommended to preregister studies, as this helps to reduce p-hacking and “HARKing” (for details please check the item 2.1.11 Preregistration).

C. Resources

Guidelines on reporting of statistical analysis (in vivo research): ARRIVE Essential - Statistical methods

Essential reading:

  • Motulsky HJ (2015) Common misconceptions about data analysis and statistics. Pharmacol Res Perspect. 3(1). [1]

