- Replicate SC-IPM's key capabilities (eg, model generation and evaluation).
- Package the capabilities as programs (ie, Unix commands, data filters; see the sketch after this list).
- Communicate by means of standardized data formats (eg, JSON, YAML).
- Add programs for new capabilities (eg, FEM, PDE), as desired.
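To make the pipeline idea concrete, here is a minimal sketch of one such data filter, in Python. Only the overall shape (read YAML on stdin, write YAML on stdout) comes from the list above; the program name (eval_models), the spec keys, and the scoring stub are assumptions for illustration.

    #!/usr/bin/env python
    # eval_models: a hypothetical IPM Lab data filter.
    # Reads a YAML document on stdin and writes one on stdout,
    # so it composes with other programs in a Unix pipeline.
    import sys
    import yaml  # PyYAML

    def evaluate(model):
        # Placeholder; a real filter would simulate the model here.
        return {"model": model, "score": 0.0}

    data = yaml.safe_load(sys.stdin)
    results = [evaluate(m) for m in data.get("models", [])]
    yaml.safe_dump({"results": results}, sys.stdout)

Because the filter touches only stdin and stdout, it can be chained with other filters (or run under a workflow system) without modification.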
Much of the current code base (eg, GUI support) will simply go away.
In other cases (eg, FEM, ODE, and PDE solvers),
we may be able to use externally supported libraries and programs.
So, we gain access to large bodies of well-supported code,
while dramatically reducing the amount of code that we need to maintain.
Much of the complexity in SC-IPM (and subsystems such as MISC)
has to do with cross-language communication
(eg, parsing and generating Lisp-encoded data structures in other languages,
supporting foreign function interfaces
for C libraries, translating Fortran libraries to Lisp).
Standardized data formats eliminate a great deal of this complexity from IPM Lab.
Instead of translating (and supporting) needed libraries, we can use them "as-is".
In many cases, existing libraries will handle data encoding issues for us.
Finally, these formats make it trivial to interact with web-based technologies
such as D3, as well as software used by scientific collaborators.
The default serialization format will be YAML,
a human-friendly, standardized syntax with support in many languages.
Programs in other languages may use the data in JSON
by means of (trivial) conversion filters.
Where needed, YAML's richer encoding may be used to conserve data type information.
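For instance, the JSON conversion filter really can be trivial; this sketch uses Python's standard json module and the PyYAML library (the program name yaml2json is an assumption):

    #!/usr/bin/env python
    # yaml2json: read YAML on stdin, emit the equivalent JSON on stdout.
    import json, sys
    import yaml  # PyYAML

    # default=str renders values JSON lacks (eg, dates) as strings.
    json.dump(yaml.safe_load(sys.stdin), sys.stdout, indent=2, default=str)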
The specific division of components is still a bit unclear,
but there are some obvious candidates for extraction or replication from SC-IPM,
adoption from external sources, or (if need be) development:
- validation of model set specifications (specs; see the sketch after this list)
- generation and evaluation of model structures
- generation and relaxation of constraints
- visualization (eg, models, results, specs)
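As an illustration of the first candidate, spec validation could lean on existing schema tools instead of custom code. The sketch below uses Python's jsonschema library; the spec fields shown (name, variables, processes) are invented, since the real spec format is still to be defined.

    #!/usr/bin/env python
    # check_spec: validate a YAML model set spec against a schema.
    import sys
    import yaml                      # PyYAML
    from jsonschema import validate  # jsonschema package

    SCHEMA = {                       # hypothetical spec schema
        "type": "object",
        "required": ["name", "variables", "processes"],
        "properties": {
            "name":      {"type": "string"},
            "variables": {"type": "array", "items": {"type": "string"}},
            "processes": {"type": "array"},
        },
    }

    validate(yaml.safe_load(sys.stdin), SCHEMA)
    print("spec OK")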
Because the Lab's components are connected by data files,
plumbing them together should require very little effort.
For example, evaluating models in parallel becomes trivial.
This capability is likely to be very useful
as we enter the (computationally expensive) realm of FEM, PDEs, etc.
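A sketch of what "trivial" might look like: fan a (hypothetical) eval_models filter out over a directory of model files, several at a time. The file layout and program name are assumptions; only the standard library is used.

    #!/usr/bin/env python
    # Run eval_models over every model file, several processes at a time.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    def evaluate(path):
        # Threads suffice here; the real work runs in the child processes.
        out = path.with_suffix(".result.yaml")
        with open(path) as src, open(out, "w") as dst:
            subprocess.run(["eval_models"], stdin=src, stdout=dst, check=True)
        return out

    with ThreadPoolExecutor(max_workers=8) as pool:
        for result in pool.map(evaluate, Path("models").glob("*.yaml")):
            print("wrote", result)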
Fundamental architectural changes also become possible to explore.
For example, we may be able to feed model evaluation results
back to the generation code, improving the quality of model structures
and adding to the knowledge contained in the model set specification.
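A hedged sketch of that feedback loop, with generate, evaluate, and refine_spec as stand-ins for the real programs (each step could equally well be a separate filter exchanging YAML files):

    # Hypothetical generate/evaluate/refine loop.
    def generate(spec):
        return [{"structure": i} for i in range(10)]   # stub

    def evaluate(model):
        return 1.0 / (1 + model["structure"])          # stub score

    def refine_spec(spec, scored):
        # Fold the best structures back into the spec,
        # so that later generations can favor them.
        best = [m["structure"] for m, s in scored if s > 0.5]
        spec.setdefault("preferred", []).extend(best)
        return spec

    spec = {"name": "demo"}
    for generation in range(3):
        scored = [(m, evaluate(m)) for m in generate(spec)]
        spec = refine_spec(spec, scored)
    print(spec)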
Although the Lab's programs can be linked together by simple scripts,
this is not the only possibility.
Existing scientific workflow systems such as Kepler are worth considering;
a custom system is also an intriguing possibility.
Kepler can be used to assemble experimental systems, providing a number of benefits:
Kepler is a free software system for designing, executing, reusing, evolving,
archiving, and sharing scientific workflows.
Kepler's facilities provide process and data monitoring, provenance information,
and high-speed data movement.
Workflows in general, and scientific workflows in particular,
are directed graphs where the nodes represent discrete computational components,
and the edges represent paths along which data and results can flow between components.
-- Kepler scientific workflow system (WP)
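The directed-graph model in that definition is easy to prototype. Here is a minimal sketch using Python's standard graphlib; the node names and the "components as callables" framing are illustrative assumptions, not Kepler's API.

    # A workflow as a directed graph: nodes are components, edges carry data.
    from graphlib import TopologicalSorter

    # Map each node to the nodes it consumes data from.
    deps = {"generate": [], "evaluate": ["generate"], "visualize": ["evaluate"]}
    components = {
        "generate":  lambda inputs: ["model-a", "model-b"],
        "evaluate":  lambda inputs: [(m, 0.0) for m in inputs[0]],
        "visualize": lambda inputs: print("plotting", inputs[0]),
    }

    # Run each component after its data sources, passing results along edges.
    results = {}
    for node in TopologicalSorter(deps).static_order():
        results[node] = components[node]([results[d] for d in deps[node]])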
It may be possible to gain many of these benefits, and more,
by creating a custom job control framework.
A separate page explores a design concept
based on Git, Neo4j, Rake, and other open source software infrastructure.