Discovery metadata

Discovery metadata describes a dataset or collection. Any data published through a wis2box requires a discovery metadata record, which must be created, maintained and published to the wis2box catalogue API.

wis2box supports managing discovery metadata using the WMO Core Metadata Profile (WCMP2) standard.

Note

WCMP2 is currently in development as part of WMO activities.

To create a discovery metadata record in wis2box, complete a YAML configuration file. wis2box uses the pygeometa project’s metadata control file (MCF) format. Below is an example MCF.

wis2box:
    retention: P30D
    topic_hierarchy: mw-mw_met_centre.data.core.weather.surface-based-observations.synop
    country: mwi
    centre_id: mw-mw_met_centre
    data_mappings:
        plugins:
            csv:
                - plugin: wis2box.data.csv2bufr.ObservationDataCSV2BUFR
                  template: CampbellAfrica-v1-template
                  notify: true
                  file-pattern: '^WIGOS_(\d-\d+-\d+-\w+)_.*\.csv$'
            bufr4:
                - plugin: wis2box.data.bufr2geojson.ObservationDataBUFR2GeoJSON
                  file-pattern: '^WIGOS_(\d-\d+-\d+-\w+)_.*\.bufr4$'

mcf:
    version: 1.0

metadata:
    identifier: urn:wmo:md:mw-mw_met_centre:surface-weather-observations
    hierarchylevel: dataset

identification:
    title: Surface weather observations from Malawi
    abstract: Surface weather observations from Malawi
    dates:
        creation: 2021-11-29
    keywords:
        default:
            keywords:
                - surface weather
                - temperature
                - observations
        wmo:
            keywords:
                - weather
            keywords_type: theme
            vocabulary:
                name: Earth system disciplines as defined by the WMO Unified Data Policy, Resolution 1 (Cg-Ext(2021)), Annex 1.
                url: https://codes.wmo.int/wis/topic-hierarchy/earth-system-discipline
    extents:
        spatial:
            - bbox: [32.6881653175,-16.8012997372,35.7719047381,-9.23059905359]
              crs: 4326
        temporal:
            - begin: 2021-11-29
              end: null
              resolution: P1H
    url: https://example.org/malawi-surface-weather-observations
    wmo_data_policy: core

contact:
    host:
        organization: Department of Climate Change and Meteorological Services (DCCMS)
        url: https://www.metmalawi.gov.mw
        individualname: Firstname Lastname
        positionname: Position Name
        phone: "+2651822014"
        fax: "+2651822215"
        address: P.O. Box 1808
        city: Blantyre
        administrativearea: Blantyre District
        postalcode: M3H 5T4
        country: Malawi
        email: you@example.org
        hoursofservice: 0700h - 1500h UTC
        contactinstructions: email
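
In the example above, the centre identifier appears in three places (centre_id, the start of topic_hierarchy, and the metadata identifier), and the wmo_data_policy value also appears as a level of the topic hierarchy. The following minimal Python sketch checks that these values agree before publishing. The check_consistency helper is hypothetical and not part of wis2box or pygeometa, and its rules are inferred from this example only, not taken from the WCMP2 specification.

```python
# Subset of the example MCF, written as a Python dict for illustration.
mcf = {
    "wis2box": {
        "topic_hierarchy": "mw-mw_met_centre.data.core.weather.surface-based-observations.synop",
        "centre_id": "mw-mw_met_centre",
    },
    "metadata": {
        "identifier": "urn:wmo:md:mw-mw_met_centre:surface-weather-observations",
    },
    "identification": {"wmo_data_policy": "core"},
}


def check_consistency(mcf):
    """Return a list of problems found (empty if the values agree).

    Hypothetical helper: the rules below are assumptions inferred from
    the example MCF, not requirements quoted from a specification.
    """
    problems = []
    centre_id = mcf["wis2box"]["centre_id"]
    topic_parts = mcf["wis2box"]["topic_hierarchy"].split(".")
    id_parts = mcf["metadata"]["identifier"].split(":")
    policy = mcf["identification"]["wmo_data_policy"]

    # Assumed layout: <centre_id>.data.<policy>.<discipline>...
    if topic_parts[0] != centre_id:
        problems.append("topic_hierarchy does not start with centre_id")
    if topic_parts[2:3] != [policy]:
        problems.append("topic_hierarchy does not contain wmo_data_policy")
    # Assumed layout: urn:wmo:md:<centre_id>:<local identifier>
    if id_parts[3:4] != [centre_id]:
        problems.append("metadata identifier does not contain centre_id")
    return problems


problems = check_consistency(mcf)
print(problems or "MCF values are consistent")
```

An empty list means the three values agree. Each mismatch is reported as a short message rather than raised as an exception, so all problems are shown at once.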

Note

There are no conventions for the MCF filename. The filename is not used, exposed or published. Choose the filename that works best for you, keeping in mind that your wis2box system may manage and publish numerous datasets (and MCF files) over time.

Data mappings

A discovery metadata control file (MCF) has a wis2box section, which provides a default data mapping (in YAML format).

The data mappings are indicated by the wis2box.data_mappings keyword, with each topic having a separate entry specifying:

  • plugins: all plugin objects associated with the topic, by file type/extension

Plugins are grouped by the file type/extension to be detected and processed (csv and bufr4 in the example MCF). Each plugin entry has the following configuration:

  • plugin: the codepath of the plugin

  • notify: whether the plugin should publish a data notification

  • template: additional argument allowing a mapping template name to be passed to the plugin. Note that if the path is relative, the plugin must be able to locate the template accordingly

  • file-pattern: additional argument allowing a file pattern to be passed to the plugin

  • buckets: the name(s) of the storage bucket(s) that data should be saved to (see Configuration for more information on buckets)
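
To illustrate how a data mapping might be applied, the following Python sketch looks up the plugin entries for a file's extension and keeps those whose file-pattern matches the filename. It is a sketch under stated assumptions, not wis2box's actual implementation: select_plugins and the sample filename are hypothetical, and every entry is assumed to carry a file-pattern with one capture group (presumably the WIGOS station identifier, given the WIGOS_ prefix).

```python
import re

# Python mirror of the wis2box.data_mappings section of the example MCF.
DATA_MAPPINGS = {
    "plugins": {
        "csv": [
            {
                "plugin": "wis2box.data.csv2bufr.ObservationDataCSV2BUFR",
                "template": "CampbellAfrica-v1-template",
                "notify": True,
                "file-pattern": r"^WIGOS_(\d-\d+-\d+-\w+)_.*\.csv$",
            }
        ],
        "bufr4": [
            {
                "plugin": "wis2box.data.bufr2geojson.ObservationDataBUFR2GeoJSON",
                "file-pattern": r"^WIGOS_(\d-\d+-\d+-\w+)_.*\.bufr4$",
            }
        ],
    }
}


def select_plugins(filename, data_mappings):
    """Return (plugin codepath, captured group) pairs that match a filename.

    Hypothetical helper for illustration; not part of wis2box.
    """
    # Plugins are keyed by file extension, so look that up first.
    extension = filename.rsplit(".", 1)[-1]
    selected = []
    for entry in data_mappings["plugins"].get(extension, []):
        # Assumes each entry has a file-pattern with one capture group.
        match = re.match(entry["file-pattern"], filename)
        if match:
            selected.append((entry["plugin"], match.group(1)))
    return selected


# Hypothetical filename following the WIGOS_<identifier>_<suffix>.csv shape.
matches = select_plugins(
    "WIGOS_0-454-2-AWSNAMITAMBO_2021-11-18T0955.csv", DATA_MAPPINGS
)
print(matches)
```

For the sample filename, only the csv entry is selected and the captured group is 0-454-2-AWSNAMITAMBO. A file whose extension has no entry under plugins yields an empty list.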

See Extending wis2box for more information on adding your own data processing pipeline.

Summary

At this point, you have created discovery metadata for your given dataset(s).