!!! Experiment Data Management

''Notes on storing and organizing raw output data from experiments and analyses thereof.''

!! Process Viewpoints on Experimentation

Experiments are activities within larger processes.  There are multiple process viewpoints.

From a digital-artifact assembly viewpoint, the larger process is, in part:

{{{
...  SourceCode --build--> Executable --run--> RawOutput --analyze--> AnalysisResults --export--> ArticleFigure  ...
}}}

But from a scientific investigation viewpoint, the larger process is, in part:

{{{
...  ResearchQuestions ----> Hypotheses --plan--> StudyDesign 
             --execute--> RawData --analyze--> Results --report--> Publications
}}}

Our Experiment Data Management architecture must support both viewpoints: technical software development and scientific investigation.


!! Experiment Conceptual Model

There are many scientific investigation models, but the two best-developed ones that I found are summarized below.
There is a group (RDA) developing a new model that should subsume these, but that won't be ready for years.


! Investigation-Study-Assay (ISA) Model

The ISA model is "built around the Investigation (the project context), Study (a unit of research) and Assay (analytical measurements) concepts".
This model was developed in the biomedical fields, but attempts to be more general.

An ''investigation'' is a scientific effort that has particular goals and means.
Investigations are the setting in which experiments are conducted.
An investigation contains studies.

A ''study'' is a specification of an experiment, covering "the subject under study, its characteristics, and any treatments applied."
A study specifies:
* Overall experimental design type.
* Material ''sources'' and the collection process that results in material ''samples''.
* ''Factors'': independent variables to be manipulated.
A study contains ''assays'': measurements to be performed on the collected samples.

An ''assay'' describes a measurement: the workflow from sample to raw data file (material examined, attributes measured, with what technology).

In the ISA model, materials' parameters are called ''characteristics'' and processes' parameters are called ''parameters''.
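
The containment relationships described in this section can be sketched as plain data classes. This is only an illustrative sketch, not the ISA specification or any ISA tooling API; every class name, field name, and example value below is an assumption chosen to mirror the terms used in these notes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """A measurement workflow from sample to raw data file."""
    material_examined: str   # which sample is measured
    attribute_measured: str  # what is measured
    technology: str          # how it is measured
    raw_data_file: str       # where the raw output ends up


@dataclass
class Study:
    """Specification of an experiment; contains assays."""
    design_type: str
    sources: List[str] = field(default_factory=list)
    samples: List[str] = field(default_factory=list)
    factors: List[str] = field(default_factory=list)
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """Project context (goals and means); contains studies."""
    goals: str
    studies: List[Study] = field(default_factory=list)


# Hypothetical values, for illustration only.
assay = Assay("sample-1", "attribute-1", "technology-1", "raw/sample-1.dat")
study = Study("design-type-1", sources=["source-1"], samples=["sample-1"],
              factors=["factor-1"], assays=[assay])
investigation = Investigation("example goals", studies=[study])
```

The nesting runs Investigation, then Study, then Assay, so a raw data file can always be traced back to the study design and project context that produced it.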


! Core Scientific Metadata (CSMD) Model

This model also has ''investigations'' and ''studies'', but nests them in the opposite order from ISA.

A ''study'' has a user responsible for it, and contains investigations.

An ''investigation'' has parameters, uses instruments, uses (material) samples, and is associated with datasets (collections of files).
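
For comparison with ISA, the CSMD relationships described here can be sketched the same way. Again, this is a rough sketch under stated assumptions, not the CSMD schema itself: the class names, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """A named collection of files."""
    name: str
    files: List[str] = field(default_factory=list)


@dataclass
class CsmdInvestigation:
    """Lower level in CSMD: parameters, instruments, samples, datasets."""
    parameters: Dict[str, str] = field(default_factory=dict)
    instruments: List[str] = field(default_factory=list)
    samples: List[str] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class CsmdStudy:
    """Top level in CSMD: has a responsible user, contains investigations."""
    responsible_user: str
    investigations: List[CsmdInvestigation] = field(default_factory=list)


# Hypothetical values, for illustration only.
dataset = Dataset("run-1", files=["run-1/output.dat"])
inv = CsmdInvestigation(parameters={"param-1": "value-1"},
                        instruments=["instrument-1"],
                        samples=["sample-1"],
                        datasets=[dataset])
csmd_study = CsmdStudy("user-1", investigations=[inv])
```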


! Useful Concepts for Us

Both models draw a similar two-level study/investigation distinction: the top level is organizational (the PI, goals, etc.) and the lower level is the experimental design (measurement technology applied to material samples).
This seems like a good organizing principle.

ISA's split between a material-source-to-material-sample process at the study level and material-sample-to-raw-data processes at the assay level maps nicely to our world, where our "material sources" are systems under test (dOrc/Porc) and our "material samples" are test procedures.

Explicitly associating a set of independent variables (factors) with each study is valuable.
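
One way to see the value of explicit factors: once a study records them, the set of experimental conditions can be derived mechanically. The sketch below assumes a full-factorial design (every combination of factor levels), which these notes do not prescribe. The factor names, their levels, the procedure name, and the `factor_combinations` helper are all hypothetical; only dOrc and Porc come from the text above.

```python
from itertools import product
from typing import Any, Dict, List


def factor_combinations(factors: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one dict per combination of factor levels (full factorial)."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_lists)]


study = {
    "sources": ["dOrc", "Porc"],   # systems under test ("material sources")
    "samples": ["procedure-A"],    # hypothetical test procedure ("material sample")
    "factors": {                   # hypothetical independent variables
        "factor_x": [1, 2],
        "factor_y": ["low", "high"],
    },
}

conditions = factor_combinations(study["factors"])
for condition in conditions:
    print(condition)
```

Each resulting condition could then label the raw output it produces, so that analyses can group results by factor level.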