IPM Timeline

The Institute for the Study of Learning and Expertise (ISLE) has been working on Inductive Process Modeling (IPM) for the last decade. This page, based on abstracts and other material from the supporting papers, summarizes the project's trajectory in terms of research problems and solutions.

Highlights

  • 2002
    • A novel research problem is proposed:
      constructing a process model from continuous data.
    • The Inductive Process Modeling (IPM) algorithm is described.

  • 2003
    • Induction of process models is made more robust.
    • PROMETHEUS (a model construction GUI) is announced.
    • IPM is applied to inducing ecosystem models
      from background knowledge and time-series data.

  • 2004
    • IPM is applied to photosynthesis regulation.
    • IPM supports computational revision of models.

  • 2005
    • HIPM does constrained search for hierarchical models.
    • Overfitting in process model induction is addressed.

  • 2006
    • Generic processes (i.e., templates) are introduced.
    • IPM is applied to biochemical kinetics.
    • The problem of missing data is addressed.

  • 2007
    • An inductive logic programming approach is introduced
      to address the issue of learning declarative bias.
    • An approach for extracting constraints
      on process model construction is introduced.
    • A logical formalism is introduced.
    • A computational method for acquiring scientific knowledge
      from candidate process models is introduced.

  • 2010
    • SC-IPM (Structural Constraints for IPM) is introduced.
    • MISC (a procedure that learns and transfers constraints) is introduced.
    • Spatio-temporal process modeling is introduced.

  • 2011
    • Application of SC-IPM to human physiological models is proposed.

  • 2012
    • An automated approach to discovering constraints is introduced.

Papers

Inducing Process Models from Continuous Data (2002):

[We] pose a novel research problem for machine learning that involves constructing a process model from continuous data. We claim that casting learned knowledge in terms of processes with associated equations is desirable for scientific and engineering domains, where such notations are commonly used. We also argue that existing induction methods are not well suited to this task, although some techniques hold partial solutions. In response, we describe an approach to learning process models from time-series data and illustrate its behavior in a population dynamics domain.

Robust Induction of Process Models from Time-Series Data (2003):

[We] revisit the problem of inducing a process model from time-series data. We illustrate this task with a realistic ecosystem model, review an initial method for its induction, then identify three challenges that require extension of this method. These include dealing with unobservable variables, finding numeric conditions on processes, and preventing the creation of models that overfit the training data. We describe responses to these challenges and present experimental evidence that they have the desired effects.

An Interactive Environment for Scientific Model Construction (2003):

Most AI research on scientific model construction aims to automate this process using discovery techniques. In contrast, we describe an interactive environment for model construction that lets the user construct, edit, and visualize scientific models, use them to make predictions, and call on discovery methods to revise them in ways that better fit the available data.

Discovering Ecosystem Models from Time-Series Data (2003):

Ecosystem models are used to interpret and predict the interactions of species and their environment. In this paper, we address the task of inducing ecosystem models from background knowledge and time-series data, and we review IPM, an algorithm that addresses this problem.

Inducing Explanatory Process Models from Biological Time Series (2004):

We address the task of inducing explanatory models from observations and knowledge about candidate biological processes, using the illustrative problem of modeling photosynthesis regulation. We cast both models and background knowledge in terms of processes that interact to account for behavior.

Computational Revision of Ecological Process Models (2004):

Most ecological models are developed manually by scientists, who decide on their basic structure, tune their parameters, compare them against available data, and refine them in response. In contrast, most work on computational scientific discovery has emphasized the automated generation of models from data and background knowledge. We believe that computational tools for model revision offer great practical value to scientists by decreasing the time required to search for models while letting them retain control over the search space. ...

Inducing Hierarchical Process Models in Dynamic Domains (2005):

Research on inductive process modeling combines background knowledge with time-series data to construct explanatory models, but previous work has placed few constraints on search through the model space. We present an extended formalism that organizes process knowledge in a hierarchical manner, and we describe HIPM, a system that carries out constrained search for hierarchical process models. We report experiments that suggest this approach produces more accurate and plausible models with less effort.

Reducing Overfitting in Process Model Induction (2005):

[IPM] uses background knowledge about possible component processes to construct quantitative models of dynamical systems. ... previous methods for this task tend to overfit the training data, which suggests ensemble learning as a likely response. However, such techniques combine models in ways that reduce comprehensibility, making their output much less accessible to domain scientists.

[We] introduce a new approach that induces a set of process models from different samples of the training data and uses them to guide a final search through the space of model structures. Experiments with synthetic and natural data suggest this method reduces error and decreases the chance of including unnecessary processes in the model.

Inductive Revision of Quantitative Process Models (2006):

[We] present an approach that represents candidate models as sets of quantitative processes and that treats revision as search through a model space which is guided by time-series observations and constrained by background knowledge cast as generic processes that serve as templates for the specific processes used in models.

Constructing Explanatory Process Models from Biological Data and Knowledge (2006):

We address the task of inducing explanatory models from observations and knowledge about candidate biological processes, using the illustrative problem of modeling photosynthesis regulation. We cast both models and background knowledge in terms of processes that interact to account for behavior. We demonstrate [IPM's] use both on photosynthesis and on a second domain, biochemical kinetics.

Learning Process Models with Missing Data (2006):

[We] discuss approaches to learning with missing values in time series, noting that these efforts are typically applied for descriptive modeling tasks that use little background knowledge. We also point out that these methods assume that data are missing at random -- a condition that may not hold in scientific domains. [We] compare an expectation maximization approach with one that simply ignores the missing data.

An Interactive Environment for the Modeling and Discovery of Scientific Knowledge (2006):

[We] present a language for stating process models and background knowledge in terms familiar to scientists, along with an interactive environment for knowledge discovery that lets the user construct, edit, and visualize scientific models, use them to make predictions, and revise them to better fit available data. We report initial studies in three domains that illustrate the operation of this environment and the results of a user study carried out with domain scientists.

A Method for Representing and Developing Process Models (2006):

Scientists investigate the dynamics of complex systems with quantitative models, employing them to synthesize knowledge, to explain observations, and to forecast future system behavior. Complete specification of systems is impossible, so models must be simplified abstractions. Thus, the art of modeling involves deciding which system elements to include and determining how they should be represented. We view modeling as search through a space of candidate models that is guided by model objectives, theoretical knowledge, and empirical data.

In this contribution, we introduce a method for representing process-based models that facilitates the discovery of models that explain observed behavior. This representation casts dynamic systems as interacting sets of processes that act on entities. Using this approach, a modeler first encodes relevant ecological knowledge into a library of generic entities and processes, then instantiates these theoretical components, and finally assembles candidate models from these elements. We illustrate this methodology with a model of the Ross Sea ecosystem.
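The encode, instantiate, and assemble steps described in this abstract can be illustrated with a small sketch. It is not the authors' representation or code: the entity names, entity types, and generic processes below are hypothetical, and the enumeration is a plain exhaustive one.

```python
from itertools import combinations, permutations

# Hypothetical library: entities with a type, and generic processes whose
# roles must be filled by entities of the listed types (all names assumed).
entities = {"phyto": "producer", "zoo": "grazer", "nitrate": "nutrient"}
generic_processes = {
    "growth": ("producer",),
    "grazing": ("grazer", "producer"),
    "uptake": ("producer", "nutrient"),
}


def instantiate(generic_processes, entities):
    """Bind each generic process's roles to entities of the matching type."""
    instances = []
    for name, roles in generic_processes.items():
        for combo in permutations(entities, len(roles)):
            if all(entities[e] == role for e, role in zip(combo, roles)):
                instances.append((name, combo))
    return instances


def candidate_structures(instances, max_size):
    """Yield every set of process instances with at most max_size members."""
    for size in range(1, max_size + 1):
        yield from combinations(instances, size)


instances = instantiate(generic_processes, entities)
candidates = list(candidate_structures(instances, max_size=3))
```

Each element of `candidates` is one candidate model structure. A real system would still have to attach equations and fit parameters to each structure before comparing it with data.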

Learning Declarative Bias (2007):

In this paper, we introduce an inductive logic programming approach to learning declarative bias. The target learning task is inductive process modeling, which we briefly review. Next we discuss our approach to bias induction while emphasizing predicates that characterize the knowledge and models associated with the HIPM system. We then evaluate how the learned bias affects the space of model structures that HIPM considers and how well it generalizes to other search problems in the same domain.

Results indicate that the bias reduces the size of the search space without removing the most accurate structures. In addition, our approach reconstructs known constraints in population dynamics. We conclude the paper by discussing a generalization of the technique to learning bias for inductive logic programming.

Extracting Constraints for Process Modeling (2007):

In this paper, we introduce an approach for extracting constraints on process model construction. We begin by clarifying the type of knowledge produced by our method and how one may apply it. Next, we review the task of inductive process modeling, which provides the required data.

We then introduce a logical formalism and a computational method for acquiring scientific knowledge from candidate process models. Results suggest that the learned constraints make sense ecologically and may provide insight into the nature of the modeled domain.

Inductive Process Modeling (2007):

In this paper, we pose a novel research problem for machine learning that involves constructing a process model from continuous data. We claim that casting learned knowledge in terms of processes with associated equations is desirable for scientific and engineering domains, where such notations are commonly used. We also argue that existing induction methods are not well suited to this task, although some techniques hold partial solutions. In response, we describe an approach to learning process models from time-series data and illustrate its behavior in three domains.

Processes and Constraints in Explanatory Scientific Discovery (2008):

In previous publications, we have reported a computational approach to constructing explanatory process models of dynamic systems from time-series data and background knowledge. We have not aimed to mimic the detailed behavior of human researchers, but we maintain that our systems address the same tasks as ecologists, biologists, and other theory-guided scientists, and that they carry out search through similar problem spaces. ...

Supporting Innovative Construction of Explanatory Scientific Models (2009):

Scientific modeling is a creative activity that can benefit from computational support. This chapter reports five challenges that arise in developing such aids, as illustrated by PROMETHEUS, a software environment that supports the construction and revision of explanatory models. These challenges include the paucity of relevant data, the need to incorporate prior knowledge, the importance of comprehensibility, an emphasis on explanation, and the practicality of user interaction.

The responses to these challenges include the use of quantitative processes to encode models and background knowledge, as well as the combination of AND/OR search through a space of model structures with gradient descent to estimate parameters. This chapter reports our experiences with PROMETHEUS on three scientific modeling tasks and some lessons we have learned from those efforts. This chapter concludes by noting additional challenges that were not apparent at the outset of our work.
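As a rough illustration of the parameter-estimation half of that combination, the sketch below fits one rate parameter of a fixed model structure by gradient descent. It is an assumption-laden toy, not PROMETHEUS code: the model (dx/dt = r * x), the forward-Euler simulator, the finite-difference gradient, and all function names and settings are made up for the example, and the search over model structures is omitted.

```python
def simulate(r, x0, dt, steps):
    """Forward-Euler trajectory of the one-process model dx/dt = r * x."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * r * xs[-1])
    return xs


def loss(r, observed, x0, dt):
    """Mean squared error between simulated and observed trajectories."""
    predicted = simulate(r, x0, dt, len(observed) - 1)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def fit_rate(observed, x0, dt, r=0.1, lr=0.05, iters=200, eps=1e-6):
    """Gradient descent on r, using a central finite-difference gradient."""
    for _ in range(iters):
        grad = (loss(r + eps, observed, x0, dt)
                - loss(r - eps, observed, x0, dt)) / (2 * eps)
        r -= lr * grad
    return r


# Synthetic observations generated with a known rate, then recovered.
observed = simulate(0.3, 1.0, 0.1, 20)
r_hat = fit_rate(observed, x0=1.0, dt=0.1)
```

A structure search would call a routine like `fit_rate` once per candidate structure and rank the candidates by their fitted error.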

Two Kinds of Knowledge in Scientific Discovery (2010):

Research on computational models of scientific discovery investigates both the induction of descriptive laws and the construction of explanatory models. Although the work in law discovery centers on knowledge-lean approaches to searching a problem space, research on deeper modeling tasks emphasizes the pivotal role of domain knowledge. As an example, our own research on inductive process modeling uses information about candidate processes to explain why variables change over time.

However, our experience with IPM, an artificial intelligence system that implements this approach, suggests that process knowledge is insufficient to avoid consideration of implausible models. To this end, the discovery system needs additional knowledge that constrains the model structures. We report on an extended system, SC-IPM, that uses such information to reduce its search through the space of candidates and to produce models that human scientists find more plausible. We also argue that although people carry out less extensive search than SC-IPM, they rely on the same forms of knowledge -- processes and constraints -- when constructing explanatory models.

The Induction and Transfer of Declarative Bias (2010):

People constantly apply acquired knowledge to new learning tasks, but machines almost never do. Research on transfer learning attempts to address this dissimilarity. Working within this area, we report on a procedure that learns and transfers constraints in the context of inductive process modeling, which we review. After discussing the role of constraints in model induction, we describe the learning method, MISC, and introduce our metrics for assessing the cost and benefit of transferred knowledge. The reported results suggest that cross-domain transfer is beneficial in the scenarios that we investigated, lending further evidence that this strategy is a broadly effective means for increasing the efficiency of learning systems.

Integrated Systems for Inducing Spatio-Temporal Process Models (2010):

Quantitative modeling plays a key role in the natural sciences, and systems that address the task of inductive process modeling can assist researchers in explaining their data. In the past, such systems have been limited to data sets that recorded change over time, but many interesting problems involve both spatial and temporal dynamics.

To meet this challenge, we introduce SCISM, an integrated intelligent system which solves the task of inducing process models that account for spatial and temporal variation. We also integrate SCISM with a constraint learning method to reduce computation during induction. Applications to ecological modeling demonstrate that each system fares well on the task, but that the enhanced system does so much faster than the baseline version.

Combining Data-Driven and Knowledge-Guided Methods to Induce Interpretable Physiological Models (2011):

In this paper, we review the paradigm of inductive process modeling and examine its application to human physiology. This framework represents models as a set of interacting processes, each with associated differential or algebraic equations that express causal relations among variables. Simulating such a quantitative process model produces trajectories for variables over time that one can compare to observations. Background knowledge about candidate processes enables search through the space of model structures and their associated parameters, and thus the identification of quantitative models that explain time-series data.

We present an initial process model for aspects of human physiology, consider its uses for health monitoring, and discuss the induction of such models. In closing, we consider related efforts on physiological modeling and our plans for collecting data to evaluate our framework in this domain.
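The representation described in this abstract (interacting processes with rate equations, simulated to give trajectories that are compared with observations) can be sketched in a few lines. This is a minimal illustration, not the authors' system: it uses a toy predator-prey model rather than physiology, forward-Euler integration, and hypothetical names and parameter values throughout.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Process:
    """A named process that contributes terms to variable derivatives."""
    name: str
    # Maps the current state to this process's contribution to d(var)/dt.
    rates: Callable[[State], Dict[str, float]]


def derivatives(processes: List[Process], state: State) -> State:
    """Sum the contributions of every process to each variable's derivative."""
    total = {var: 0.0 for var in state}
    for proc in processes:
        for var, rate in proc.rates(state).items():
            total[var] += rate
    return total


def simulate(processes: List[Process], state: State, dt: float, steps: int):
    """Forward-Euler integration; returns the trajectory as a list of states."""
    trajectory = [dict(state)]
    for _ in range(steps):
        d = derivatives(processes, state)
        state = {var: value + dt * d[var] for var, value in state.items()}
        trajectory.append(state)
    return trajectory


def sum_squared_error(trajectory, observations, var):
    """Compare one simulated variable against a series of observed values."""
    return sum((s[var] - o) ** 2 for s, o in zip(trajectory, observations))


# Illustrative model; the processes and coefficients are assumptions.
model = [
    Process("prey_growth", lambda s: {"prey": 0.5 * s["prey"]}),
    Process("predation", lambda s: {"prey": -0.02 * s["prey"] * s["pred"],
                                    "pred": 0.01 * s["prey"] * s["pred"]}),
    Process("predator_loss", lambda s: {"pred": -0.3 * s["pred"]}),
]
trajectory = simulate(model, {"prey": 40.0, "pred": 9.0}, dt=0.01, steps=1000)
```

Because each process only adds terms to the derivatives, adding or removing a process changes the model structure without touching the simulator.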

Discovering Constraints for Inductive Process Modeling (2012):

Scientists use two forms of knowledge in the construction of explanatory models: generalized entities and processes that relate them; and constraints that specify acceptable combinations of these components. Previous research on inductive process modeling, which constructs models from knowledge and time-series data, has relied on handcrafted constraints.

In this paper, we report an approach to discovering such constraints from a set of models that have been ranked according to their error on observations. Our approach adapts inductive techniques for supervised learning to identify process combinations that characterize accurate models.

We evaluate the method's ability to reconstruct known constraints and to generalize well to other modeling tasks in the same domain. Experiments with synthetic data indicate that the approach can successfully reconstruct known modeling constraints. Another study using natural data suggests that transferring constraints acquired from one modeling scenario to another within the same domain considerably reduces the amount of search for candidate model structures while retaining the most accurate ones.
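To make the idea of mining constraints from ranked models concrete, here is a deliberately simple sketch. It is not the method from the paper, which adapts supervised learning techniques. It substitutes a counting heuristic: processes found in every top-ranked model become "required", and process pairs seen only in lower-ranked models become "excluded". The function name, the `top_k` cutoff, the process names, and the error values are all hypothetical.

```python
from itertools import combinations


def discover_constraints(ranked_models, top_k):
    """Derive toy constraints from (process_set, error) pairs.

    The top_k models with the lowest error are treated as accurate.
    """
    ordered = sorted(ranked_models, key=lambda m: m[1])
    accurate = [set(p) for p, _ in ordered[:top_k]]
    inaccurate = [set(p) for p, _ in ordered[top_k:]]
    all_processes = set().union(*(p for p, _ in ranked_models))

    # Present in every accurate model, absent from some inaccurate one.
    required = sorted(
        proc for proc in all_processes
        if all(proc in m for m in accurate)
        and any(proc not in m for m in inaccurate)
    )
    # Pairs that co-occur in an inaccurate model but never in an accurate one.
    excluded = sorted(
        pair for pair in combinations(sorted(all_processes), 2)
        if not any(set(pair) <= m for m in accurate)
        and any(set(pair) <= m for m in inaccurate)
    )
    return required, excluded


# Made-up models and errors, for illustration only.
models = [
    ({"growth", "grazing"}, 0.10),
    ({"growth", "grazing", "decay"}, 0.12),
    ({"growth", "linear_grazing"}, 0.80),
    ({"growth", "linear_grazing", "grazing"}, 0.90),
    ({"grazing", "decay"}, 1.50),
]
required, excluded = discover_constraints(models, top_k=2)
```

Constraints of this kind could then prune candidate structures in a later modeling task, which is the transfer setting the abstract evaluates.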


This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!

Topic revision: r7 - 28 Mar 2014, RichMorin