How to Write a Model Definition File

A Model Definition file is written in YAML and defines everything that both DAFNI and other users need to know about a Model, e.g. the name of the Model or a description of what it is for. We'll cover some basics of this file format in the following examples, but there is plenty more to it that the formal reference covers in full.

If you've not used YAML before you might find it helpful to read through a Beginner's Guide to YAML. I'll link out to relevant sections of the YAML guide throughout this guide.

Document Root

First, we will define two top-level items in our definition file.

# example-model-definition.yml

kind: Model
api_version: v1beta1

YAML Syntax

The syntax used for the kind and api_version fields defines a basic YAML mapping.

You may find this guide useful in understanding the YAML syntax.

Firstly we have set the value of kind to Model. This lets DAFNI know that this definition file defines a Model (there are definition files for other assets too). Next we define api_version which tells DAFNI which version of the Model definition specification this definition conforms to. As DAFNI continues to develop and add new functionality, the Model definition specification will evolve and change. By specifying the version in the file, we can ensure that we always know how a particular definition file should be read. See the formal reference to see what versions are currently available.

Metadata

Next we will add a metadata section that allows you to define some important user-facing fields. The display_name and summary are two crucial fields for people discovering your Model. These are the values that you and other users will see in the Model Catalogue when browsing the Models on the platform. The description is an area that allows you to provide a far richer description of your Model and will be displayed when someone clicks to view the full entry for your Model in the Model Catalogue. The final field we need is the type field. This should be a one-word description of what type the Model will be, for instance forecasting, optimisation or testing; the following examples use model.

kind: Model
api_version: v1beta1
metadata:
  display_name: Example Model
  name: example-model
  summary: A brief, one to two line summary of the Model.
  type: model
  publisher: DAFNI Example
  description: >
    A longer description that explains the purpose of the Model, its intended
    applications and other useful information such as assumptions that have been made
    when creating the Model and any potential impacts of these.

    The description can be written in paragraphs to provide clarity. Just leave a blank
    line in the description to start a new paragraph.

YAML Syntax

You will notice that the new fields we have added under metadata are indented. Whitespace is important to the meaning of YAML.

You might also notice that you don't need to wrap the values in quotation marks to make them strings. We have also used a > to define a multiline string.

Further information on YAML's syntax can be found here.
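
As a small sketch of the string syntax described above, the fragment below compares unquoted, quoted and multiline values. The key names are hypothetical and chosen only for illustration; they are not fields of the Model Definition.

```yaml
# Hypothetical keys, for illustration only; they are not DAFNI fields.
unquoted: Example Model   # a plain string, no quotation marks needed
quoted: "2015"            # quotation marks force a string; a bare 2015 is read as an integer
folded: >
  These two lines are joined
  into a single line of text.

  A blank line starts a new paragraph.
literal: |
  With | each line break
  is kept exactly as written.
```

The > style suits long descriptions because single line breaks are folded into spaces, so the text can be wrapped freely in the file without changing how it reads.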

Spec

The last major part to add to the definition file is the spec part. This section of the definition contains the information required by DAFNI to be able to run the Model. It covers information such as what data the Model expects as inputs and what results the Model produces. Not only does this information allow DAFNI to run the Model, it also allows the Model to be linked with other Models in Workflows.

For the sake of brevity, I won't keep repeating the rest of the definition file in the following examples. Instead, it will be replaced with # rest of document #. Just remember that the rest of the information is required to form a valid Model Definition.

Inputs

The inputs section allows you to define what inputs your Model expects in order to run. DAFNI supports a range of input options that allow data to be passed to the Model in different ways.

Parameters

The Model Definition file allows you to define input environment variables using the parameters field. Each of these definitions supports a range of additional information, such as the data type that the value should be treated as.

# rest of document #
spec:
  inputs:
    parameters:
      - name: START_YEAR
        title: Start Year
        description: The year at which the Model execution should start.
        type: integer
        default: 2015
        min: 2010
        max: 2020
        required: true

      - name: END_YEAR
        title: End Year
        description: The year at which the Model execution should stop.
        type: integer
        default: 2025
        min: 2020
        max: 2030
        required: true

YAML Syntax

The above example uses YAML's syntax for defining a list of items as the value of parameters.

Further information on YAML's syntax can be found here.

Because the parameters field is a list, you can add multiple definitions of input environment variables. There are other supported fields and types for defining input environment variables so be sure to take a look at the formal Model Definition reference for more information.
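
To illustrate how the list grows, here is a sketch of one more entry appended to parameters. The TIME_STEP parameter and its values are hypothetical and not part of the original example; it uses only fields already shown in this guide.

```yaml
# rest of document #
spec:
  inputs:
    parameters:
      # START_YEAR and END_YEAR would be here #

      # Hypothetical third parameter, for illustration only.
      - name: TIME_STEP
        title: Time Step
        description: The number of years between each step of the Model execution.
        type: integer
        default: 1
        min: 1
        max: 5
        required: true
```

Each - starts a new item in the list, and the fields belonging to that item are indented to line up beneath it.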

Datasets

Another input field that can be specified is a Dataslot, or a number of Dataslots, that can be filled with a Dataset or multiple Datasets from the National Infrastructure Database (NID). Dataslots are specified using the dataslots field. Dataslots are filled with Datasets when the Model is run in a Workflow. This enables users to update the data being inserted into the Dataslot at run time. To help users of the Model choose the right kind of Datasets to insert into a Dataslot, a name and description should be provided for each of the slots. You must also provide the path that the Model expects the Datasets to be made available at. The required field dictates whether the Dataslot must be filled with a Dataset or whether this slot can be left empty. Finally, the default field is used to specify default Datasets to use in this slot. A default must be specified if required is true.

To add a default Dataset to a Dataslot, you need to know the unique ID of the Dataset and of the particular version of that Dataset you wish to use. The uid and the versionId of the Dataset should be set to their respective unique IDs. These identifiers take the form of a "universally unique identifier" (UUID), for example 09f4e250-bfbf-4b2f-9aed-0f18444f605e. You can find both of these in the details page for any Dataset, listed in the access panel shown in the image below.

[Image: Copy Dataset YAML]

You can click the copy buttons next to the UUIDs to copy them individually or alternatively you can click the "Copy YAML for Model Definition" button to copy the full YAML needed to put in the datasets list:

- aaab2e9e-5f85-4401-8cbf-7f9eecec94e9

You would then need to replace the path with the location where you would like the Dataset to be loaded.

# rest of document #
spec:
  inputs:
    parameters:
    # environment variables would be here #
    dataslots:
      - name: Geospatial Data
        description: >
          Description of what this Geospatial Data should contain.
        default:
          - 4d5e424a-e177-11ea-845a-9f0b1c85544d
          - 4d5e424a-e177-11ea-845a-9f0b1c85544d
        path: inputs/geospatial-data
        required: true

n.b. The path the Datasets in a Dataslot are to be included at must always be a child directory of inputs/ e.g. inputs/my-dataset-directory.

As with parameters, dataslots takes a list as an argument so multiple Dataslots can be specified for a Model and each of these slots can take multiple Datasets in the default field.
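
A sketch of what two Dataslots might look like side by side is given below. The second slot, Population Data, is hypothetical, and the UUIDs are placeholders reused from elsewhere in this guide. Leaving out default on the optional slot is an assumption based on the rule that a default is only mandatory when required is true; check the formal reference before relying on it.

```yaml
# rest of document #
spec:
  inputs:
    dataslots:
      - name: Geospatial Data
        description: >
          Description of what this Geospatial Data should contain.
        default:
          # Placeholder UUIDs reused from this guide; use your own Dataset IDs.
          - 4d5e424a-e177-11ea-845a-9f0b1c85544d
          - aaab2e9e-5f85-4401-8cbf-7f9eecec94e9
        path: inputs/geospatial-data
        required: true

      # Hypothetical optional slot: assumed to need no default because
      # required is false.
      - name: Population Data
        description: >
          Description of what this Population Data should contain.
        path: inputs/population-data
        required: false
```

Each slot has its own path under inputs/, so the Model can tell the Datasets for one slot apart from those for another.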

Complete Example

Putting the pieces from the examples together, we end up with a definition file looking like the following.

kind: Model
api_version: v1beta1
metadata:
  display_name: Example Model
  name: example-model
  publisher: DAFNI Example
  type: model
  summary: A brief, one to two line summary of the Model.
  description: >
    A longer description that explains the purpose of the Model, its intended
    applications and other useful information such as assumptions that have been made
    when creating the Model and any potential impacts of these.

    The description can be written in paragraphs to provide clarity. Just leave a blank
    line in the description to start a new paragraph.
spec:
  inputs:
    parameters:
      - name: START_YEAR
        title: Start Year
        description: The year at which the Model execution should start.
        type: integer
        default: 2015
        min: 2010
        max: 2020
        required: true

      - name: END_YEAR
        title: End Year
        description: The year at which the Model execution should stop.
        type: integer
        default: 2025
        min: 2020
        max: 2030
        required: true
    dataslots:
      - name: Geospatial Data
        description: >
          Description of what this Geospatial Data should contain.
        default:
          - 4d5e424a-e177-11ea-845a-9f0b1c85544d
          - 4d5e424a-e177-11ea-845a-9f0b1c85544d
        path: inputs/geospatial-data
        required: true
Copyright © 2022 STFC