Inductive Process Modeling Language

This page is an overview of the Inductive Process Modeling Language (IPML), a domain-specific language for specifying Inductive Process Modeling (IPM) entities, processes, and constraints.

Motivation

The primary goal of Inductive Process Modeling is to produce descriptive, explanatory, and predictive models of scientific phenomena. To do this, we construct libraries of model specifications, composed of components such as constraints, entities, and processes. We then use search techniques to find and tune appropriate combinations of model structures. The resulting models are known to be predictive, because they fit the observed data. However, they can also be descriptive and explanatory, because they inherit these characteristics from their components.

Domain Knowledge

Our modeling efforts (e.g., MSEAS ATL) require us to capture, manipulate, and use domain knowledge from several disciplines, e.g.:

  • ecology: predator-prey model

  • math: differential equations (ODEs, PDEs), finite element method (FEM)

  • fluid dynamics: reaction-diffusion-advection (RDA) equation

We don't need to be experts in these disciplines, but we do need to record information about them faithfully, even when we don't understand all of the details or their implications. Typically, we need to capture a specification (spec) for a set of models, based on the work of a group of domain experts (e.g., biologists, ecologists, oceanographers).

The spec needs to be readable and recognizable by domain experts, so that they can review and "sign off" on it. However, it must also be a reliable, machine-readable data structure that we can use to drive a modeling system.

Summary

In summary, the serialization format needs to meet several criteria:

  • clear, descriptive, explanatory

  • computer- and human-friendly

  • convenient, flexible, readable

  • declarative, language-neutral

Approach

IPML (Inductive Process Modeling Language) is a DSL (domain-specific language) for Inductive Process Modeling. To encourage and support the inclusion of explanatory text, the base format is Markdown. Embedded code sections (in a custom notation) are then used to encode model information.

IPM Lab's programs may be written in a variety of languages. However, YAML is supported by many of these and JSON support is nearly universal. By translating IPML into a common abstract syntax (composed of lists, maps, and strings) and encoding it in a standardized, well-supported format, we can avoid the need to support an IPML parser for each language.

A translation script (ipml_in) reads the IPML input file (e.g., sample.ipml), manipulates its syntax and structure, and generates an output file (e.g., sample.yaml). Programs can import this file, using available libraries, then traverse the resulting data structure. In summary:

  • Humans can use a convenient, readable format.

  • Programs get pre-parsed, convenient data structures.

This decoupling should help IPML to meet the evolving needs of both developers and users.
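As a rough illustration of the consumer side, the sketch below imports a generated file and traverses the resulting data structure. It is not part of ipml_in. The embedded text stands in for a hypothetical generated file, its keys are borrowed from "IPML Example 1a" on this page, and the `walk` helper is an invented name. JSON is used instead of YAML only so that the example runs with the Python standard library alone.

```python
import json

# Hypothetical stand-in for the contents of a generated file (e.g., a JSON
# counterpart of sample.yaml). The keys mirror "IPML Example 1a".
GENERATED = """
{
  "entities": {
    "fox": {"isa": "predator", "percent": "0.4"}
  }
}
"""

def walk(node, path=()):
    """Yield (path, leaf) pairs for every leaf in a tree of maps and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, path + (str(index),))
    else:
        yield path, node

# Import the data with a stock library, then traverse it.
model = json.loads(GENERATED)
leaves = list(walk(model))
for path, leaf in leaves:
    print("/".join(path), "=", leaf)
```

A program in another language would do the same thing with its own JSON or YAML library; no IPML-specific parsing is involved at this stage.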

Note: This wiki contains only an informal summary of IPML. A formal description is beyond the scope of this page (and indeed does not exist at the moment). However, the parsing script (ipml_in) can serve as the operational definition of current practice.

Input Format

IPML's input format is intended to support a wide range of use cases and user preferences. A layered combination of standard and custom syntax lets it handle anything from quick hacks and tests to polished sets of documents.

Although the format borrows heavily from YAML and Markdown, it isn't a subset of either notation. The syntax examples below provide a high-level overview; see IPML Details for more information.

Output Formats

Although most users will work with IPML's input format, developers may need to examine the output formats as well. So, some discussion of the generated syntax may be in order.

Abstract syntax is concerned with the types of things (e.g., data structures, scalar values) a language is able to encode. Concrete syntax is concerned with the manner in which the things are encoded (i.e., the serialization format). In the case of ipml_in's generated files, both the abstract and concrete syntax are of interest.

The abstract syntax of the generated formats is a tree of hashes (i.e., maps, dictionaries), using strings as keys. Most of the leaves are strings or perhaps lists (i.e., arrays) of strings. This data structure is simple, flexible, and well supported by most modern programming languages. The concrete syntax will typically be YAML, but various other formats (e.g., JSON) can be used. The only requirement is that they can encode IPML's abstract syntax without loss of information.
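The "without loss of information" requirement can be checked mechanically. The following sketch assumes a strict reading of the abstract syntax (string-keyed maps, lists, and strings only), which is narrower than the "most of the leaves" wording above. The `conforms` helper and the `eats` key are hypothetical and exist only for illustration.

```python
import json

def conforms(node):
    """Return True if node uses only string-keyed maps, lists, and strings."""
    if isinstance(node, str):
        return True
    if isinstance(node, list):
        return all(conforms(item) for item in node)
    if isinstance(node, dict):
        return all(
            isinstance(key, str) and conforms(value)
            for key, value in node.items()
        )
    return False  # numbers, None, etc. fall outside this strict reading

# Illustrative tree; "eats" is an invented key, not part of IPML.
tree = {"entities": {"fox": {"isa": "predator", "eats": ["hare", "vole"]}}}

# A candidate concrete syntax is acceptable if a round trip preserves the tree.
round_trip = json.loads(json.dumps(tree))
print(conforms(tree), round_trip == tree)
```

The same round-trip test could be applied to any other candidate format by swapping in its encoder and decoder.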

Some formats (e.g., XML) do not map well to IPML's abstract syntax, so they should probably be avoided unless a particular program requires them. In that case, there will generally be other encoding requirements, which must be handled on a case-by-case basis, so another level of translation (e.g., from JSON) will be required.

Syntax Examples

Here are some simple, annotated examples of IPML's input and (YAML) output syntax.

IPML Syntax

The outer layer of IPML syntax follows the rules set by Markdown, a document markup language which is very popular, stable, and well-supported. See the Markdown Dingus for a quick introduction and interactive playground. (Any IPML input file, including the examples on this page, can be pasted into the playground to see how it looks when formatted.)

Each IPML file contains a sequence of code blocks and supporting text. Code blocks are indented by (at least) four spaces; supporting text is not:

IPML Example 1a - Markdown

Markdown use can be very simple and unobtrusive.

    entity fox
      isa predator
      percent = 0.4
      ...

Markdown supports unobtrusive markup (e.g., headings, font control), but only in the text sections. This lets it generate formatted web pages and/or printed documentation, while leaving the code sections undisturbed. So, a file can use any desired amount of text formatting, from none at all to quite a bit.

In this version of the example, we specify that the first line is a top-level header and italicize the word "fancier":

# IPML Example 1b - Markdown

Markdown use can be much _fancier_, if desired.

    entity fox
      ...

[Screenshot: Markdown Dingus output for the sample above]

Notes

Spacing within each line of a code block (after the initial indentation) can be arranged to enhance readability. This is entirely optional.

Markdown is commonly translated into HTML, for online publication. For example, GitHub uses Markdown as its default format for project documentation, wikis, GitHub Pages, and more. Markdown can also be mechanically translated into other formats (e.g., LaTeX).
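The rule stated earlier (code blocks are indented by at least four spaces; supporting text is not) can be sketched as a simple line classifier. This is a minimal sketch, not the logic used by ipml_in or by a real Markdown processor; it ignores details such as the blank line Markdown expects before a code block. The `split_blocks` name is hypothetical.

```python
def split_blocks(source):
    """Split IPML-style input into ("text", lines) and ("code", lines) runs."""
    blocks = []
    for line in source.splitlines():
        if not line.strip():
            # A blank line extends the current run without changing its kind.
            if blocks:
                blocks[-1][1].append("")
            continue
        kind = "code" if line.startswith("    ") else "text"
        if not blocks or blocks[-1][0] != kind:
            blocks.append((kind, []))
        # Strip the four-space marker from code lines; keep inner indentation.
        blocks[-1][1].append(line[4:] if kind == "code" else line)
    # Drop trailing blank lines from each run.
    for _, lines in blocks:
        while lines and lines[-1] == "":
            lines.pop()
    return blocks

SAMPLE = """\
IPML Example 1a - Markdown

Markdown use can be very simple and unobtrusive.

    entity fox
      isa predator
      percent = 0.4
"""

for kind, lines in split_blocks(SAMPLE):
    print(kind, lines)
```

Because only the first four spaces are removed, the indentation inside each code block survives, which matters because indentation is significant in IPML.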

Output Syntax

Custom code in the ipml_in script processes Markdown's code and comment sections, translating them (as appropriate) to the specified concrete syntax. For example, here is a possible translation of "IPML Example 1a" (above). Predictably (being YAML), it will not format nicely in the Markdown playground:

# ipml_x1a.yaml (generated from ipml_x1a.ipml)
# Created by ipml_in for rdm on 2014.0614 at 1642.
##################################################

## IPML Example 1a - Markdown

# Markdown use can be very simple and unobtrusive.

    entities:
      fox:
        isa:      predator
        percent:  0.4
        ...
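To make the mapping between the two examples concrete, here is a minimal sketch that turns the code block of "IPML Example 1a" into a nested data structure resembling the YAML above. The rules it applies (`entity NAME` opens an entry under `entities`, `key value` and `key = value` become map entries, `...` is skipped as an elision) are inferred from this single example and are assumptions; the actual ipml_in rules are not described on this page. The `translate` name is hypothetical, and values are kept as strings.

```python
def translate(code_lines):
    """Translate 'entity NAME' blocks into a nested dict (illustrative only)."""
    result = {"entities": {}}
    current = None
    for line in code_lines:
        stripped = line.strip()
        if not stripped or stripped == "...":
            continue  # assumption: "..." marks elided content in the examples
        if stripped.startswith("entity "):
            name = stripped.split(None, 1)[1]
            current = result["entities"].setdefault(name, {})
            continue
        if current is None:
            raise ValueError(f"attribute outside an entity: {line!r}")
        if "=" in stripped:
            # Form "key = value", e.g. "percent = 0.4".
            key, value = (part.strip() for part in stripped.split("=", 1))
        else:
            # Form "key value", e.g. "isa predator".
            parts = stripped.split(None, 1)
            if len(parts) != 2:
                raise ValueError(f"cannot interpret line: {line!r}")
            key, value = parts
        current[key] = value
    return result

example = ["entity fox", "  isa predator", "  percent = 0.4", "  ..."]
print(translate(example))
```

This sketch ignores indentation depth and comment handling, both of which a real translator would need to address.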

Open Issues

This document is a work in progress. Here are some open issues for discussion.

Comment Handling

Inclusion of comment text in the output file is optional and may not be supported for some output formats (e.g., JSON). Comments do not currently become part of the data structure (à la docstrings in Lisp and Python), but this could be supported without much effort.

Tab Handling

Indentation is significant in IPML, as in Markdown, Python, and YAML. Because tabs can cause portability issues, ipml_in (like YAML) simply disallows their use in the input format. Once IPML's input format has stabilized, we may consider loosening this restriction somewhat.

"Tabs have been outlawed since they are treated differently by different editors and tools. And since indentation is so critical to proper interpretation of YAML, this issue is just too tricky to even attempt. Indeed, Guido van Rossum of Python has acknowledged that allowing tabs in Python source is a headache for many people and that were he to design Python again, he would forbid them."

-- http://www.yaml.org/faq.html
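A tab check of this kind is easy to run before any other processing. The sketch below assumes that a tab anywhere in the input counts as an error, which is one reading of "disallows their use"; the `find_tabs` name is hypothetical and does not come from ipml_in.

```python
def find_tabs(source):
    """Return 1-based (line, column) pairs for every tab character in source."""
    return [
        (line_no, col)
        for line_no, line in enumerate(source.splitlines(), start=1)
        for col, char in enumerate(line, start=1)
        if char == "\t"
    ]

# The second line is indented with a tab instead of spaces.
bad_input = "entity fox\n\tisa predator\n  percent = 0.4\n"

for line_no, col in find_tabs(bad_input):
    print(f"tab found at line {line_no}, column {col}")
```

Reporting positions rather than silently expanding tabs keeps the decision with the author, since the intended indentation depth cannot be recovered reliably.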


This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!

Topic revision: r9 - 04 Apr 2016, RichMorin