Formulations and BMI

Justin Singh-M. - NOAA edited this page Oct 18, 2023 · 8 revisions

Formulation Config

A formulation is composed of a model name, initial parameters, and configuration options. Semantically, each formulation is either a recipe on top of a single BMI model, a pipeline of BMI models (i.e., multi-BMI), or an ensemble of BMI models. Currently, only single-module and multi-module BMI formulations are supported as representations for a formulation configuration. The catchment entry in the formulation/realization config must be set to use the appropriate type for the associated BMI realization, via the formulation's name JSON element. E.g.:

//...
"cat-87": {
  "formulations": [
    {
      "name": "bmi_c",
      "params": { ... }
    }
  ]
}
//...

Valid name values for the currently implemented BMI formulation types are:

  • bmi_c++
  • bmi_c
  • bmi_fortran
  • bmi_python
  • bmi_multi

Because of the generalization of the interface to the model, the required and optional parameters for all the BMI formulation types are the same.
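
As a minimal sketch (not part of the framework), a config could be pre-checked for a valid name before a run. The helper check_formulation_name is hypothetical; only the set of names is taken from the list above.

```python
# Hypothetical pre-flight check; only the set of names comes from the list above.
VALID_BMI_NAMES = {"bmi_c++", "bmi_c", "bmi_fortran", "bmi_python", "bmi_multi"}


def check_formulation_name(formulation: dict) -> str:
    """Return the formulation's name, or raise ValueError if it is not a known BMI type."""
    name = formulation.get("name")
    if name not in VALID_BMI_NAMES:
        raise ValueError(f"unsupported BMI formulation name: {name!r}")
    return name


print(check_formulation_name({"name": "bmi_c", "params": {}}))
```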

Required Parameters

Certain parameters are strictly required in the formulation/realization JSON config for a catchment entry using a BMI formulation type. Note that there is a slight distinction in "required" between single-module (e.g., bmi_c) and multi-module formulations (i.e., bmi_multi). These are summarized in the following table, with the details of the parameters listed below.

| Param | Single-Module | Multi-Module |
| --- | --- | --- |
| model_type_name | ✔️ | ✔️ |
| init_config | ✔️ | |
| uses_forcing_file | ✔️ | |
| main_output_variable | ✔️ | ✔️ |
| modules | | ✔️ |

Parameter Details:

  • model_type_name
    • string name for the particular backing model type
    • may not be utilized in all cases, but still required
  • init_config
    • the string path to the BMI initialization config file for the catchment
  • uses_forcing_file
    • boolean indicating whether the backing BMI model is written to read input forcing data from a forcing file (as opposed to receiving it via getter calls made by the framework)
  • main_output_variable
    • the string value of the primary output variable
    • this is the value that is returned by the realization's get_response()
    • the string must match an item returned by the relevant variant of the BMI get_output_var_names() function
  • library_file
    • Path to the library file for the BMI library
  • modules
    • a list of individual formulation configs for component modules of a BMI multi-module formulation
    • each item in the list will be another nested JSON config object for a BMI formulation
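
The table above can be turned into a simple presence check. This is a sketch under assumptions: missing_required is a hypothetical helper, and it follows the table only (library_file is described in the details but does not appear in the table, so it is not checked here).

```python
# Required keys per the table above; the helper itself is a hypothetical illustration.
REQUIRED_SINGLE = {"model_type_name", "init_config", "uses_forcing_file", "main_output_variable"}
REQUIRED_MULTI = {"model_type_name", "main_output_variable", "modules"}


def missing_required(formulation: dict) -> list[str]:
    """Return the sorted names of required params absent from a formulation's "params"."""
    params = formulation.get("params", {})
    is_multi = formulation.get("name") == "bmi_multi"
    required = REQUIRED_MULTI if is_multi else REQUIRED_SINGLE
    return sorted(required - set(params))


# Illustrative values only; "cfe" is an assumed model_type_name.
example = {"name": "bmi_c", "params": {"model_type_name": "cfe", "main_output_variable": "Q_OUT"}}
print(missing_required(example))  # ['init_config', 'uses_forcing_file']
```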

Semi-Optional Parameters

Some BMI formulation config parameters are required only in certain circumstances: they are not always required, nor are they required for all single-module or all multi-module formulations. Thus, they do not behave exactly as the Required Parameters do in the configuration. However, they should be treated as de facto required (and will trigger errors when missing) in the specific situations in which they apply.

  • forcing_file
    • string path to the forcing data file for the catchment
    • must be set whenever a model needs to read its own forcings directly
      • this is set/indicated using uses_forcing_file as described above
    • the BMI model's initialization config (i.e., init_config above) may define an analogous property, and the two should properly correspond in such cases
  • registration_function
    • Name of the bootstrapping pointer registration function in the external module
    • required for BMI modules if the module's implemented function is not named register_bmi.
    • required for BMI modules that interface through C (i.e. C, C++, Fortran).
  • python_type
    • Name of the Python class that represents a BMI model, including the package name as appropriate.
    • Required for, and only applicable to, Python-based BMI modules
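
Two of these conditional rules depend only on the config itself and can be sketched as a check. The helper missing_semi_optional is hypothetical, and registration_function is deliberately left out because its requirement depends on the compiled module, which a config-only check cannot see.

```python
def missing_semi_optional(formulation: dict) -> list[str]:
    """Return semi-optional params that the rules above make de facto required."""
    params = formulation.get("params", {})
    missing = []
    # forcing_file is needed whenever the model reads its own forcings.
    # Assumption: an empty string is treated the same as a missing value.
    if params.get("uses_forcing_file") and not params.get("forcing_file"):
        missing.append("forcing_file")
    # python_type is needed for Python-based modules.
    if formulation.get("name") == "bmi_python" and not params.get("python_type"):
        missing.append("python_type")
    return missing


print(missing_semi_optional({"name": "bmi_c", "params": {"uses_forcing_file": True}}))
```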

Optional Parameters

  • variables_names_map
    • can specify a mapping of model variable names (input or output) to supported standard names
    • the Bmi_Formulation.hpp file has a section where several supported standard names are defined and noted
    • this can be useful in particular for informing the framework how to provide the input a model needs for execution
    • e.g., "variables_names_map": {"model_variable_name": "standard_variable_name"}
  • model_params
    • can specify static or dynamic parameters passed to models as model variables.
    • static parameters are defined inline in the realization config.
    • dynamic parameters are derived from a given source, such as hydrofabric data.
    • if specified, must be within the params config level, i.e. within a "formulations": [..., {..., "params": {..., "model_params": {...}, ...}, ...}, ...] object.
    • if specified for multi-BMI, must be within the module-params config level, i.e. in the params config level for a given module.
    • e.g.,
      // Format: { <variable_name>: <value> }
      "model_params": {
        // Static parameter
        "APCP_Surface": 3.0,
      
        // Dynamic parameter
        "areasqkm": {
          // where this variable is deriving from, only "hydrofabric" is supported currently
          "source": "hydrofabric",
          // the property name of this value,
          // i.e. what property (area_sqkm) in the source (hydrofabric) maps to our variable (areasqkm)?
          "from": "area_sqkm"
        }
      }
  • output_variables
    • can specify the particular set and order of output variables to include in the realization's get_output_line_for_timestep() (and similar) function
    • JSON structure should be a list of strings
    • if not present, defaults to whatever is returned by the model's BMI get_output_var_names() function the first time it is invoked
    • if specified, must be at the root level of a formulation object.
    • if specified for multi-BMI, it should be within the formulation-root config level, i.e. in the root config level for a given formulation.
    • e.g.,
      // Example for CFE, which has 13 output variables
      "output_variables": ["RAIN_RATE", "Q_OUT"]
  • output_header_fields
    • can specify the header strings to use for the realization's printed output (i.e., the value returned by get_output_header_line())
    • JSON structure should be a list of strings
    • when not present, the literal variable names are used
    • when present, does not do any checking for ordering/correspondence compared to the output ordering of the variable values, so users must take care that ordering is consistent
    • if specified, must be at the root level of a formulation object.
    • if specified for multi-BMI, it should be within the formulation-root config level, i.e. in the root config level for a given formulation.
    • e.g.,
      // Same as `output_variables` example with CFE, but we want to change the formatting
      "output_header_fields": ["rain_rate", "Q"]
  • allow_exceed_end_time
    • boolean value to specify whether a model is allowed to execute Update calls that go beyond its end time (or the max forcing data entry)
    • implied to be false by default
  • fixed_time_step
    • boolean value to indicate whether this model has a fixed time step size
    • implied to be true by default
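
To illustrate the static/dynamic distinction in model_params described above, here is a minimal sketch of how such a block could be resolved into concrete values. It is not the framework's implementation: resolve_model_params and the hydrofabric_props dictionary are hypothetical, and the value 12.5 is made up for the example.

```python
def resolve_model_params(model_params: dict, hydrofabric_props: dict) -> dict:
    """Pass static values through and look up dynamic ones in hydrofabric properties."""
    resolved = {}
    for variable, spec in model_params.items():
        if isinstance(spec, dict):
            # Dynamic parameter: {"source": ..., "from": ...}
            if spec.get("source") != "hydrofabric":
                raise ValueError(f"unsupported source for {variable!r}: {spec.get('source')!r}")
            resolved[variable] = hydrofabric_props[spec["from"]]
        else:
            # Static parameter: the inline value is used as-is.
            resolved[variable] = spec
    return resolved


model_params = {
    "APCP_Surface": 3.0,
    "areasqkm": {"source": "hydrofabric", "from": "area_sqkm"},
}
# Hypothetical hydrofabric properties for one catchment.
print(resolve_model_params(model_params, {"area_sqkm": 12.5}))
```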

Single-BMI Formulation JSON

// <ref: single-bmi>
{
	"name": "bmi_c++" | "bmi_c" | "bmi_fortran" | "bmi_python",
	"params": {
		"model_type_name": "string",
		"library_file": "string",
		"init_config": "string",
		"allow_exceed_end_time": true | false,
		"main_output_variable": "string",
		"uses_forcing_file": true | false,
		"model_params": {
			"<model-variable>": "<value>"
		},
		"variables_names_map": {
			"<model_variable_name>": "<standard_variable_name>"
		}
	}
}
  • name: An enumerator representing the BMI type of the underlying model for this formulation.
  • params:
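    • model_type_name: string name for the particular backing model type.
    • library_file: path to the library file for the BMI library.
    • init_config: string path to the BMI initialization config file for the catchment.
    • allow_exceed_end_time: boolean indicating whether the model may execute Update calls beyond its end time (false by default).
    • main_output_variable: string name of the primary output variable.
    • uses_forcing_file: boolean indicating whether the model reads input forcing data from a forcing file itself.
    • model_params: static or dynamic parameters passed to the model as model variables.
    • variables_names_map: mapping of model variable names to supported standard names.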

Multi-BMI Formulation JSON

// <ref: multi-bmi>
{
	"name": "bmi_multi",
	"params": {
	    "model_type_name": "string",
	    "allow_exceed_end_time": true | false,
	    "main_output_variable": "string",
	    "modules": [
		    // <ref: single-bmi>...
	    ]
	}
}
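
A multi-BMI config of this shape can also be assembled programmatically from single-BMI configs. The sketch below is illustrative only: make_multi is a hypothetical helper, and the two nested module entries are placeholders with their params left empty rather than filled with invented paths.

```python
import json


def make_multi(model_type_name: str, main_output_variable: str, modules: list[dict],
               allow_exceed_end_time: bool = False) -> dict:
    """Wrap single-BMI formulation configs in a bmi_multi formulation."""
    return {
        "name": "bmi_multi",
        "params": {
            "model_type_name": model_type_name,
            "allow_exceed_end_time": allow_exceed_end_time,
            "main_output_variable": main_output_variable,
            # Order matters: modules are executed in list order.
            "modules": modules,
        },
    }


# Placeholder module configs; real entries need their own required params.
modules = [{"name": "bmi_fortran", "params": {}}, {"name": "bmi_c", "params": {}}]
print(json.dumps(make_multi("bmi_multi_noahmp_cfe", "Q_OUT", modules), indent=2))
```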

BMI Models

The model engine enforces a common interface for usable models by requiring the use of the Basic Model Interface (BMI) as a means of interfacing to models across various programming languages. Moreover, we impose certain conventions on top of the BMI to enable dynamic shared library loading, in order to support using different models at run-time without recompiling the model engine.

Multi-Module BMI Formulations

It is possible to configure a formulation to be a combination of several different individual BMI module components. This is the bmi_multi formulation type.

As described in Required Parameters, a BMI init_config does not need to be specified for this formulation type, but a nested list of sub-formulation configs (in modules) does. Execution of a formulation time step update proceeds through each module in list order.
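
The list-order execution, together with the variable constraints itemized in the notes that follow, can be illustrated with a small consistency check. This is a sketch under stated assumptions rather than the framework's logic: check_module_chain is hypothetical, and the "inputs" and "outputs" keys are stand-ins for what the framework would obtain from each model's BMI get_input_var_names() and get_output_var_names(); they are not config keys. A configured default value is treated as sufficient to satisfy an input, which simplifies the rule for later-module providers.

```python
def check_module_chain(modules, forcing_names, default_outputs=()):
    """Walk modules in list order and return a list of constraint-violation messages."""
    problems = []
    available = set(forcing_names)   # aliases providable before any module runs
    defaults = set(default_outputs)  # aliases that have a configured default value
    producers = {}                   # output alias -> index of the producing module
    for index, module in enumerate(modules):
        names_map = module.get("variables_names_map", {})
        for var in module.get("inputs", []):
            alias = names_map.get(var, var)  # unmapped variables alias to their own name
            if alias not in available and alias not in defaults:
                problems.append(f"module {index}: input {var!r} (alias {alias!r}) has no provider")
        for var in module.get("outputs", []):
            alias = names_map.get(var, var)
            if alias in producers:
                problems.append(
                    f"module {index}: output alias {alias!r} already produced by module {producers[alias]}"
                )
            else:
                producers[alias] = index
            available.add(alias)
    return problems


# Hypothetical variable names, apart from QINSUR and Q_OUT which appear in this page.
upstream = {"inputs": ["precip"], "outputs": ["QINSUR"]}
downstream = {"inputs": ["water_in"], "outputs": ["Q_OUT"],
              "variables_names_map": {"water_in": "QINSUR"}}

print(check_module_chain([upstream, downstream], {"precip"}))
print(check_module_chain([downstream, upstream], {"precip"}))
print(check_module_chain([downstream, upstream], {"precip"}, default_outputs={"QINSUR"}))
```

The second call reports a problem because the consumer is listed before its provider; the third passes because a default is supplied for the alias.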

A few other items of note:

  • there are some constraints on input and output variables of the sub-modules of a multi-module formulation
    • output variables must be uniquely traceable
      • there must not be any output variable from a sub-module for which there is another output variable in a different sub-module that has the same config-mapped alias
      • when nothing is set for a variable in variables_names_map, its alias is equal to its name
      • when this doesn't hold, a unique mapped alias must be configured for one of the two (i.e., either an alias added, or changed to something different)
    • input variables must have previously-identified data providers
      • every input variable must have its alias match a property from an output data source
      • one way this works is if the alias is equal to the name of an externally available forcing property (e.g., AORC read from file) defined before the sub-module
      • another way is if the input variable's alias matches the alias of an output variable from an earlier sub-module
        • in this case, one sub-module's output serves as a later sub-module's input for each multi-module formulation update
        • since modules get executed in order of configuration, "earlier" and "later" are with respect to the order they are defined in the modules config list
  • the framework allows independent configuration of the uses_forcing_file property among the individual sub-formulations, although this is not generally recommended
  • configuration of variables_names_map maps a given variable to a variable name of the directly
  • it is now possible to have an earlier nested module use as a provider (for one of its inputs) a later nested module, as long as a default value is configured
    • a collection of variable default values can be given in the formulation config at the top level by providing an entry in default_output_values with the variable's mapped configuration alias (or just variable name if it is unique) and the default value:
{
  "global": {
    "formulations": [
      {
        "name": "bmi_multi",
        "params": {
          "model_type_name": "bmi_multi_noahmp_cfe",
          "forcing_file": "",
          "init_config": "",
          "allow_exceed_end_time": true,
          "main_output_variable": "Q_OUT",
          "default_output_values": [
            {
              "name": "QINSUR",
              "value": 42.0
            }
          ],
          "modules": [
...