2.3.2 Primary analysis and evaluation of raw data

A. Background & Definitions

Primary analysis of raw data is the data processing required to derive the (secondary) data that will be shared, presented and/or subjected to statistical analysis.

Information about the primary analysis of raw data is critical for establishing a connection between raw data and reported results and is therefore an essential part of data traceability (see item Traceability of data and any person having impact on data).

B. Guidance & Expectations

Primary analysis of raw data should:

  • be performed blinded (e.g. by an experimenter unaware of pharmacological treatment)
  • maintain the original randomization scheme (if applicable)
  • follow a pre-specified analysis plan that may be a part of the study protocol
  • include data verification. Even when data are produced by automated systems, there are generally additional data that are recorded manually, e.g. body weight, volume of drug administered, or unplanned observations made during an experiment (such as aberrant behavior)
  • include a data validity check, i.e. a check against the acceptance criteria pre-defined in the study protocol
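As an illustration of the validity check in the last bullet, a minimal Python sketch follows. The field names (`body_weight_g`, `dose_volume_ml`), their ranges and the function `check_validity` are hypothetical placeholders; in practice the acceptance criteria would be taken from the study protocol.

```python
# Hypothetical acceptance criteria; real ranges come from the study protocol.
ACCEPTANCE_CRITERIA = {
    "body_weight_g": (18.0, 35.0),
    "dose_volume_ml": (0.05, 0.50),
}


def check_validity(record, criteria=ACCEPTANCE_CRITERIA):
    """Return a list of criterion violations for one record (empty if valid)."""
    violations = []
    for field, (low, high) in criteria.items():
        value = record.get(field)
        if value is None:
            violations.append(f"{field}: missing")
        elif not (low <= value <= high):
            violations.append(f"{field}: {value} outside [{low}, {high}]")
    return violations


# Records carry coded animal IDs only, so the check can be run blinded.
records = [
    {"animal_id": "A01", "body_weight_g": 24.3, "dose_volume_ml": 0.24},
    {"animal_id": "A02", "body_weight_g": 41.0, "dose_volume_ml": 0.41},
    {"animal_id": "A03", "body_weight_g": 22.1},
]

report = {r["animal_id"]: check_validity(r) for r in records}
for animal_id, problems in report.items():
    print(animal_id, "OK" if not problems else problems)
```

The sketch only flags records; whether a flagged record is corrected, repeated or excluded remains a decision to be made and documented according to the protocol.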

Data generated via primary analysis of raw data should be securely stored (see item 3.1.1 Platform to record data). Alternatively, one may store the tools, algorithms, scripts and other analysis-related information that would be sufficient to reproduce the analysis. If the latter approach is taken, two requirements apply:

  • Any researcher with the necessary skills should be able to repeat the analysis
  • Re-analysis should remain technically feasible for the entire period during which raw data are stored (e.g. the ability to re-analyze should not be affected by software updates or by the guiding information becoming unreadable)
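One possible way to support these two requirements is to store, next to the analysis scripts, a small manifest recording file checksums and the software environment. The Python sketch below is only an illustration: the function names `sha256_of` and `build_manifest`, the manifest layout and the file name `primary_analysis.py` are assumptions, not a prescribed format.

```python
import hashlib
import json
import platform
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Describe the files and software environment used for an analysis."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "files": {str(p): sha256_of(p) for p in paths},
    }


# Demonstration with a throwaway placeholder script (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "primary_analysis.py"
    script.write_bytes(b"print('analysis placeholder')\n")
    manifest = build_manifest([script])

print(json.dumps(manifest, indent=2))
```

A later re-analysis can recompute the checksums and compare them with the stored manifest to confirm that the same scripts are being used. The recorded versions indicate which environment would need to be restored.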


In addition:

  • Consider adding this subject to a training program for new employees or to refresher training
  • Label and store all primary analysis files in a way that ensures data traceability (for details see item Traceability of data and any person having impact on data)
  • Outside the pre-specified criteria, data points and observations may be excluded only while primary analysis is still conducted blind (i.e. before unblinding)
  • All decisions to exclude data MUST be transparent (e.g. recorded and reported, if necessary)
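The two exclusion rules above could be supported by a simple exclusion log, sketched below in Python. The `Exclusion` fields and the helpers `log_exclusion` and `to_csv` are hypothetical. The sketch assumes one reading of the guidance: exclusions covered by a pre-specified criterion are always logged, while any other exclusion is accepted only if the analyst was still blinded.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Exclusion:
    record_id: str       # coded ID, carries no treatment information
    reason: str
    pre_specified: bool  # True if covered by a criterion in the protocol
    decided_by: str
    blinded: bool        # was the analyst still blinded at decision time?


def log_exclusion(log, exclusion):
    """Append an exclusion; reject non-pre-specified ones made after unblinding."""
    if not exclusion.pre_specified and not exclusion.blinded:
        raise ValueError(
            "exclusions outside pre-specified criteria require blinded analysis"
        )
    log.append(exclusion)


def to_csv(log):
    """Serialize the log as CSV text so it can be stored and reported."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Exclusion)])
    writer.writeheader()
    for entry in log:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


log = []
log_exclusion(log, Exclusion("A02", "body weight outside protocol range", True, "analyst-1", True))
log_exclusion(log, Exclusion("A07", "equipment failure during session", False, "analyst-1", True))
print(to_csv(log))
```

Keeping the log as a plain CSV file alongside the primary analysis files makes each decision traceable to a person and a reason.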

C. Resources

to be added
