Experiment Data Management#

Notes on storing and organizing raw output data from experiments and analyses thereof.

Process Viewpoints on Experimentation#

Experiments are activities within larger processes, and those processes can be seen from more than one viewpoint.

From a digital artifact assembly viewpoint, the larger process is, in part:

...  SourceCode --build--> Executable --run--> RawOutput --analyze--> AnalysisResults --export--> ArticleFigure  ...

But from a scientific investigation viewpoint, the larger process is, in part:

...  ResearchQuestions ----> Hypotheses --plan--> StudyDesign 
             --execute--> RawData --analyze--> Results --report--> Publications

Our Experiment Data Management architecture must provide for both the technical software development and the scientific investigative viewpoints.

Experiment Conceptual Model#

There are many scientific investigation models, but the two best-developed ones that I found are summarized below. There is a group (RDA) developing a new model that should subsume these, but that won't be ready for years.

Investigation-Study-Assay (ISA) Model#

The ISA model is "built around the Investigation (the project context), Study (a unit of research) and Assay (analytical measurements) concepts". This model was developed in the biomedical fields, but attempts to be more general.

An investigation is a scientific effort that has particular goals and means. Investigations are the setting in which experiments are conducted. An investigation contains studies.

A study is a specification of an experiment: "the subject under study, its characteristics, and any treatments applied." A study specifies:

  • Overall experimental design type.
  • Material sources and the collection process that results in material samples.
  • Factors: independent variables to be manipulated.

A study contains assays: measurements to be performed on the collected samples.

An assay describes a measurement: the workflow from sample to raw data file (the material examined, the attributes measured, and the technology used).

In the ISA model, materials' parameters are called characteristics and processes' parameters are called parameters.
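The containment described above (an investigation contains studies, a study contains assays) can be sketched as plain data classes. This is only an illustrative sketch: the class and field names are assumptions made for these notes, not the ISA specification's own schema, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assay:
    """One measurement workflow, from sample to raw data file."""
    material_examined: str
    attributes_measured: List[str]
    technology: str
    # ISA calls the settings of a process "parameters".
    parameters: Dict[str, str] = field(default_factory=dict)
    raw_data_files: List[str] = field(default_factory=list)


@dataclass
class Study:
    """Specification of an experiment."""
    design_type: str
    material_sources: List[str]
    factors: List[str]  # independent variables to be manipulated
    # ISA calls the properties of a material "characteristics".
    characteristics: Dict[str, str] = field(default_factory=dict)
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """Project context: goals and means, plus the studies conducted in it."""
    goals: str
    studies: List[Study] = field(default_factory=list)


# Hypothetical example values, for illustration only.
assay = Assay(
    material_examined="sample-1",
    attributes_measured=["elapsed_time"],
    technology="wall-clock timer",
)
study = Study(
    design_type="full factorial",
    material_sources=["source-1"],
    factors=["factor-a"],
    assays=[assay],
)
investigation = Investigation(goals="example goal", studies=[study])
```

Keeping characteristics on the study's materials and parameters on the assay's process mirrors the ISA naming distinction noted above.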

Core Scientific Metadata (CSMD) Model#

This model also has investigations and studies, but uses the two terms in the opposite sense from ISA: here the study is the outer container.

A study has a user responsible for it, and contains investigations.

An investigation has parameters, uses instruments, uses (material) samples, and is associated with datasets (collections of files).
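For comparison with ISA, the CSMD nesting described above can be sketched the same way. Again, the class and field names are assumptions chosen for these notes rather than CSMD's actual schema, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """A collection of files."""
    name: str
    files: List[str] = field(default_factory=list)


@dataclass
class CsmdInvestigation:
    """Lower level in CSMD: parameters, instruments, samples, datasets."""
    title: str
    parameters: Dict[str, str] = field(default_factory=dict)
    instruments: List[str] = field(default_factory=list)
    samples: List[str] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class CsmdStudy:
    """Top level in CSMD: has a responsible user and contains investigations."""
    responsible_user: str
    investigations: List[CsmdInvestigation] = field(default_factory=list)


# Hypothetical example values, for illustration only.
csmd_study = CsmdStudy(
    responsible_user="example-user",
    investigations=[
        CsmdInvestigation(
            title="example investigation",
            datasets=[Dataset(name="run-1", files=["run-1/output.log"])],
        )
    ],
)
```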

Useful Concepts for Us#

Both models have a similar two-level study/investigation distinction, where the top level is organizational (the PI, goals, etc.) and the lower level is about experimental design (measurement technology applied to material samples). This seems like a good organizing principle.

ISA's split between a material-source-to-material-sample process at the study level and material-sample-to-raw-data processes at the assay level maps nicely to our world, where our "material sources" are systems under test (dOrc/Porc) and "material samples" are test procedures.

Explicitly identifying a set of independent variables (factors) with a study is valuable.
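One practical payoff of listing factors explicitly is that the set of runs can be derived mechanically from them. The sketch below assumes a full factorial design, which is a choice made for illustration and not something these notes prescribe. The system names come from the notes above; the num_nodes factor and its levels are hypothetical.

```python
from itertools import product

# Factors (independent variables) and the levels each can take.
# "num_nodes" and its levels are hypothetical placeholders.
factors = {
    "system": ["dOrc", "Porc"],
    "num_nodes": [1, 2, 4],
}

# Full factorial design: one run per combination of factor levels.
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for run in runs:
    print(run)
```

Each run is then a record of factor settings that can be stored alongside the raw output it produced.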

This page (revision-12) was last changed on 28-Sep-2017 10:48 by John Thywissen