ARRIVE Essential - Statistical methods

Revision as of 16:39, 5 September 2020 by Bjoerngerlach (talk | contribs)

DISCLAIMER: Information on this and related pages is based on or copied directly from the ARRIVE guidelines 2019 (please see the original guidelines for more information, references and examples that are not included on these pages).

ARRIVE Essential 10 - Item 7 - Statistical methods

7a. Provide details of the statistical methods used for each analysis.

In hypothesis-testing studies (research conducted in confirmatory mode) comparing two or more groups, inferential statistics are used to estimate the size of the effect and to determine the weight of evidence against the null hypothesis. The effect size is the magnitude of the difference between two groups. The description of the statistical analysis should provide enough detail that another researcher could re-analyse the raw data using the same method and obtain the same results. Relevant information includes what the outcome measures and independent variables were, what statistical analyses were performed, what tests were used to check assumptions, and any data transformations. Give details of any confounders, blocking factors or covariates taken into account for each statistical test, including how the effects of each were mitigated. This allows readers to assess whether the analysis methods were appropriate.
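As an illustration of reporting an effect size, the sketch below computes the raw difference in means between two groups and one common standardised form of it (Cohen's d with a pooled standard deviation). The choice of Cohen's d, the function names and the example values are assumptions made for illustration; they are not prescribed by the guidelines.

```python
from math import sqrt
from statistics import mean, stdev


def mean_difference(group_a, group_b):
    """Raw effect size: difference between the two group means."""
    return mean(group_a) - mean(group_b)


def cohens_d(group_a, group_b):
    """Standardised effect size using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return mean_difference(group_a, group_b) / sqrt(pooled_var)


# Hypothetical outcome values, one per experimental unit.
treated = [12.1, 13.4, 11.8, 14.0, 12.9]
control = [10.2, 11.1, 9.8, 10.9, 10.5]

print("difference in means:", round(mean_difference(treated, control), 2))
print("Cohen's d:", round(cohens_d(treated, control), 2))
```

Whichever measure is used, the report should name it alongside the test that was applied, so that a reader can recompute both from the raw data.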

In exploratory studies where no specific hypothesis was tested, descriptive statistics can be used to summarise the data. They do not allow conclusions beyond the data but are important for generating new hypotheses that may be tested in subsequent experiments.

For any study reporting descriptive statistics, explicitly state which measure of central tendency is reported (e.g. mean or median) and which measure of variability is reported (e.g. standard deviation, range, quartiles or interquartile range).
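A minimal sketch of such a summary, using only the Python standard library, is shown here. The function name `describe` and the example data are hypothetical. Note that quartiles can be computed by several conventions, so the method used (here the `inclusive` method of `statistics.quantiles`) is itself worth stating.

```python
from statistics import mean, median, quantiles, stdev


def describe(values):
    """Return labelled measures of central tendency and variability."""
    # method="inclusive" treats the data as the whole population of interest;
    # other conventions can give slightly different quartiles.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "range": (min(values), max(values)),
    }


# Hypothetical measurements, one per experimental unit.
summary = describe([1, 2, 3, 4, 5])
for name, value in summary.items():
    print(f"{name}: {value}")
```

Labelling each number in the output mirrors what the text asks for in a report: a bare "3 ± 1.6" is ambiguous unless the measures are named.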

7b. Specify the experimental unit that was used for each statistical test.

Incorrect identification of the experimental unit can lead to pseudoreplication and underpowered studies (see item 1 – Study design). For example, measurements from 50 individual cells from a single mouse represent N = 1 when the experimental unit is the mouse. The 50 measurements are subsamples and provide an estimate of measurement error, so they should be averaged or used in a nested analysis. Reporting N = 50 in this case is an example of pseudoreplication. It underestimates the true variability in a study, which can lead to false positives. If, however, each cell taken from the mouse is then randomly allocated to different treatments and assessed individually, the cell might be regarded as the experimental unit.
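The averaging option can be sketched as follows, assuming the mouse is the experimental unit. The record layout, the identifiers and the values are hypothetical; a nested (mixed-model) analysis is the alternative mentioned above and is not shown.

```python
from collections import defaultdict
from statistics import mean


def per_unit_means(records):
    """Collapse subsamples to one value per experimental unit.

    records: iterable of (unit_id, measurement) pairs.
    """
    by_unit = defaultdict(list)
    for unit_id, value in records:
        by_unit[unit_id].append(value)
    return {unit_id: mean(values) for unit_id, values in by_unit.items()}


# Hypothetical cell-level measurements from three mice.
records = [
    ("mouse_1", 4.0), ("mouse_1", 6.0),
    ("mouse_2", 7.0), ("mouse_2", 9.0),
    ("mouse_3", 5.0),
]

unit_means = per_unit_means(records)
n = len(unit_means)  # N is the number of mice, not the number of cells

print("per-mouse means:", unit_means)
print("N =", n)
```

Any subsequent test is then run on the per-mouse means, so the sample size it sees matches the number of experimental units.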

Explicitly report the experimental unit used in each statistical analysis.

7c. Describe any methods used to assess whether the data met the assumptions of the statistical approach.

Hypothesis tests are based on assumptions about the underlying data. Describing how assumptions were assessed, and whether these assumptions are met by the data, enables readers to assess the suitability of the statistical approach used. If the assumptions are incorrect, the conclusions may not be valid. For example, the assumptions for data used in parametric tests (such as a t-test, Z-test, ANOVA or Pearson’s r coefficient) are that the data are continuous, the residuals from the analysis are normally distributed, the responses are independent, and the different groups have similar variances.

There are various tests for normality, for example the Shapiro-Wilk and Kolmogorov-Smirnov tests. However, these tests have to be used cautiously. If the sample size is small, they will struggle to detect non-normality; if the sample size is large, they will detect even minor deviations. An alternative approach is to evaluate the data with visual plots, e.g. normal probability plots, box plots or scatterplots. If the residuals of the analysis are not normally distributed, the assumption may be satisfied by a data transformation, in which the same mathematical function is applied to all data points to produce normally distributed data (e.g. natural logarithm, base-10 logarithm, square root, arcsine).
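The sketch below shows how the points of a normal probability plot can be computed for the residuals of a single group, before and after a natural-log transformation. It uses only the Python standard library; the plotting positions `(i - 0.5) / n`, the correlation summary, the function names and the data are all assumptions chosen for illustration, and the correlation is only a rough numerical stand-in for inspecting the plot by eye.

```python
from math import log, sqrt
from statistics import NormalDist, mean


def residuals_from_mean(values):
    """Residuals for a single group: each value minus the group mean."""
    centre = mean(values)
    return [v - centre for v in values]


def normal_probability_points(residuals):
    """Pairs of (theoretical normal quantile, ordered residual)."""
    n = len(residuals)
    ordered = sorted(residuals)
    # (i - 0.5) / n is one common plotting-position convention (an assumption).
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def straightness(points):
    """Pearson correlation of the plot points; close to 1 means near-linear."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical right-skewed data and the same data after a natural-log transform.
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
logged = [log(v) for v in raw]

r_raw = straightness(normal_probability_points(residuals_from_mean(raw)))
r_log = straightness(normal_probability_points(residuals_from_mean(logged)))
print("raw data:", round(r_raw, 3))
print("log-transformed:", round(r_log, 3))
```

Plotting the returned points (theoretical quantile on one axis, ordered residual on the other) gives the normal probability plot itself; points that fall close to a straight line are consistent with normally distributed residuals.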

Other types of outcome measures (binary, categorical, or ordinal) will require different methods of analysis, and each will have different sets of assumptions. For example, categorical data are summarised by counts and percentages or proportions, and are analysed by tests of proportions; these analysis methods assume that data are binary, ordinal or nominal, and independent. Report the type of outcome measure and the methods used to test the assumptions of the statistical approach. If data were transformed, identify precisely the transformation used and which outcome measures it was applied to.
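As a sketch of summarising and comparing a binary outcome, the code below reports counts and proportions and applies a two-proportion z-test. The z-test is only one possible test of proportions, chosen here as an assumption; it relies on a normal approximation that may be poor for small counts, where an exact test could be more appropriate. Function names and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def summarise_binary(events, total):
    """Count and proportion for one group."""
    return {"count": events, "total": total, "proportion": events / total}


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test for equal proportions (normal approximation).

    Assumes independent observations and reasonably large counts.
    """
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: animals showing the outcome in each group.
print(summarise_binary(40, 100))
print(summarise_binary(20, 100))
z, p_value = two_proportion_z_test(40, 100, 20, 100)
print("z =", round(z, 2), "p =", round(p_value, 4))
```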
