Experiment management and process scheduling

sigclear-experiment helps researchers and engineers manage their experiments. An engineering experiment in sigclear-experiment is defined as a set of experimental processes and a set of QC figures. The package provides two high-level SCons commands, Process and Figure. It automatically analyzes dependencies, schedules the experimental processes, and distributes them across cluster nodes.

Features

sigclear-experiment defines only two SCons commands. They are high-level and easy to use, yet powerful enough for complicated scientific or engineering experiments.

Process

A process consists of some target datasets (targets), some source datasets (sources), and the programs that produce the targets from the sources. A Process has the following prototype:
Process(targets, sources, programs, options)
The options can be omitted in most cases.

automatic dependency

sigclear-experiment automatically derives the dependency relationships among the target and source datasets of all processes. The targets of a Process depend on both its sources and its programs, including the program parameters. When a parameter of a Process changes, only the processes that depend on it are scheduled to rerun and have their targets updated; all other processes are left untouched.
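The rerun rule can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not the implementation used by sigclear-experiment or SCons: the helpers `signatures` and `stale_targets`, the dictionary layout, and the changed value `nx=200` are assumptions made for illustration.

```python
import hashlib

def signature(program, source_sigs):
    """Hash a program string (parameters included) with its source signatures."""
    h = hashlib.sha256(program.encode())
    for s in source_sigs:
        h.update(s.encode())
    return h.hexdigest()

def signatures(processes):
    """Map each target to a signature covering its program and upstream inputs.

    `processes` maps target -> (sources, program); a hypothetical layout.
    """
    sigs = {}

    def visit(target):
        if target not in sigs:
            sources, program = processes[target]
            sigs[target] = signature(program, [visit(s) for s in sources])
        return sigs[target]

    for t in processes:
        visit(t)
    return sigs

def stale_targets(old, new):
    """Targets whose signature differs between two versions of an experiment."""
    old_sigs, new_sigs = signatures(old), signatures(new)
    return sorted(t for t in new_sigs if old_sigs.get(t) != new_sigs[t])

before = {
    "test1": ([], "sgcreate nx=100"),
    "test2": (["test1"], "sgcreate vectors=data2"),
    "other": ([], "sgcreate nx=100"),
}
# Change one parameter of test1 only (assumed value).
after = dict(before, test1=([], "sgcreate nx=200"))
print(stale_targets(before, after))  # ['test1', 'test2']
```

Here test2 is marked stale because its signature includes the signature of test1, while the unrelated target `other` is not rerun.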

process schedule and distribution

Dependent processes are automatically scheduled one after another, while independent experimental processes can run simultaneously. In a cluster configuration, processes with high computational cost can be automatically distributed across multiple computing nodes to accelerate the whole experiment.
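One way to picture this scheduling is to group processes into "waves", where every process in a wave depends only on earlier waves and can therefore run in parallel with the others in that wave. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); it is an illustration of the concept, not how sigclear-experiment or SCons schedule internally, and the process names are hypothetical.

```python
from graphlib import TopologicalSorter

def schedule_waves(dependencies):
    """Group processes into waves of mutually independent processes.

    `dependencies` maps a process name to the set of processes it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave as finished
    return waves

# a and b are independent; c needs both; d needs c.
deps = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}}
print(schedule_waves(deps))  # [['a', 'b'], ['c'], ['d']]
```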

Figure

QC figures are generated by the command Figure, defined as follows:
Figure(targets, sources, programs, options)
The sources can be omitted when they have the same trunk name as the targets:
Figure(targets, programs, options)

Figure is implemented as an alias of Process, but some of its options have different default values.

list of options

option   Process  Figure  description
sprefix  data/    data/   sources prefix
tprefix  data/    data/   targets prefix
ssuffix  .sg      .sg     sources suffix
tsuffix  .sg      .ps     targets suffix
verb     True     False   verbose for this Process or Figure
stdin    True     True    use the first source as stdin pipe
stdout   True     True    output the first target to stdout
nodes    None     NA      nodes dictionary for parallel processing
ngroup   0        NA      total groups for parallel processing
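The relationship between the two columns of defaults can be sketched as a base dictionary plus a small set of overrides. This is a hypothetical model of "an alias with different defaults", with values read from the table above; the names `PROCESS_DEFAULTS`, `FIGURE_OVERRIDES`, and `resolve_options` are not part of the package.

```python
# Defaults for Process, as read from the options table.
PROCESS_DEFAULTS = {
    "sprefix": "data/", "tprefix": "data/",
    "ssuffix": ".sg", "tsuffix": ".sg",
    "verb": True, "stdin": True, "stdout": True,
    "nodes": None, "ngroup": 0,
}
# Options whose default differs for Figure.
FIGURE_OVERRIDES = {"tsuffix": ".ps", "verb": False}

def resolve_options(kind, **user_options):
    """Merge table defaults with per-call options (hypothetical helper)."""
    options = dict(PROCESS_DEFAULTS)
    if kind == "Figure":
        options.update(FIGURE_OVERRIDES)
        # nodes and ngroup are listed as NA for Figure, so drop them.
        options.pop("nodes")
        options.pop("ngroup")
    options.update(user_options)  # explicit options always win
    return options

print(resolve_options("Figure")["tsuffix"])                   # .ps
print(resolve_options("Process", tprefix="tmp/")["tprefix"])  # tmp/
```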

Get Started

design the first experiment

To start a new experiment, create a folder for the experiment, and then create a SCons script file named SConstruct in that folder:
  • mkdir test
  • cd test
  • vi SConstruct
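An example SConstruct file is as follows:
from experiment import *

# test2 is produced from test1, so it is scheduled after test1
# regardless of the order in which the two processes are declared.
Process('test2', 'test1', 
	'sgcreate vectors=data2')
# test1 has no sources (None).
Process('test1', None, 
	'sgcreate nx=100')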
To execute the experiment, run scons on the command line:
  • scons

define customized process

You are free to define customized functions for experiment processes. This lets you call them repeatedly within one experiment and keeps your main experiment script tidy. You can also share customized functions among multiple experiments, or distribute them to collaborators.
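A customized process can be as simple as a Python function that issues several Process calls. The sketch below assumes only the Process prototype described earlier. The helper `create_and_process`, the dataset names, and the `recorded` list are hypothetical, and the stand-in `Process` exists only so the sketch runs outside SCons; in a real SConstruct, Process would come from `from experiment import *` instead.

```python
recorded = []

def Process(targets, sources, programs, **options):
    """Stand-in for the real Process: only records the call for inspection."""
    recorded.append((targets, sources, programs, options))

def create_and_process(name, nx):
    """Hypothetical reusable step: create a dataset, then post-process it."""
    raw = name + "_raw"
    Process(raw, None, "sgcreate nx=%d" % nx)    # no sources
    Process(name, raw, "sgfieldmath head:i=1")   # depends on the raw dataset
    return name

# Reuse the same two-step recipe for several datasets.
for label, nx in [("small", 100), ("large", 1000)]:
    create_and_process(label, nx)

print(len(recorded))  # 4
```

Keeping such helpers in a separate Python module is one way to share them among several SConstruct files.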

Learn more from an example of loading and pre-processing NASA Lunar spectrum measurements.

FAQ

How to execute experiment in parallel

To run an experiment with at most 8 parallel processes, use the following command:
  • scons -j 8
sigclear-experiment automatically chooses how many processes to use, up to that limit, based on the dependencies among the processes in each experiment.

How to use Multiple Inputs Multiple Outputs (MIMO) for a process

Both the sources and the targets of a Process can be a single dataset or a list of multiple datasets. The first dataset in the sources is automatically passed to the standard input of the programs, while the standard output of the programs is redirected to the first dataset in the targets.

How to change suffix and prefix for datasets

The default suffix for sources and targets in Process is .sg, and the default prefix is data/, which means the process
Process("b", "a", "sgfieldmath head:i=1")
is equivalent to
Process("data/b.sg", "data/a.sg", "sgfieldmath head:i=1")
and similar to the following command in a shell terminal:
  • sgfieldmath head:i=1 < data/a.sg > data/b.sg
The source and target prefixes can be customized with the options sprefix and tprefix respectively, and the source and target suffixes with the options ssuffix and tsuffix. For example,
Process("b", "a", "sgfieldmath head:i=1", tprefix="tmp/")
is equivalent to
  • sgfieldmath head:i=1 < data/a.sg > tmp/b.sg
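The expansion rule can be sketched as a small Python function that builds the shell line for a single-source, single-target Process. The function `shell_command` is a hypothetical illustration of the prefix and suffix rules described here, not part of sigclear-experiment.

```python
def shell_command(target, source, program,
                  sprefix="data/", tprefix="data/",
                  ssuffix=".sg", tsuffix=".sg"):
    """Build the shell line that a single-source, single-target Process resembles."""
    command = program
    if source is not None:
        # The first source is fed to the program's standard input.
        command += " < " + sprefix + source + ssuffix
    # The program's standard output is written to the first target.
    command += " > " + tprefix + target + tsuffix
    return command

print(shell_command("b", "a", "sgfieldmath head:i=1"))
# sgfieldmath head:i=1 < data/a.sg > data/b.sg
print(shell_command("b", "a", "sgfieldmath head:i=1", tprefix="tmp/"))
# sgfieldmath head:i=1 < data/a.sg > tmp/b.sg
```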


Copyright of this website is protected by SIGCLEAR PTE. LTD.