Data ingest, processing and publishing
At this point, the system is ready for ingest/processing and publishing.
Data ingest, processing and publishing can be run in an automated fashion or via the wis2box CLI. Data is ingested, processed, and published as WMO BUFR data, as well as GeoJSON features.
GeoJSON data representations provided in wis2box are in development and are subject to change based on evolving requirements for observation data representations in WIS 2.0 technical regulations.
Interactive ingest, processing and publishing
The wis2box CLI provides a data subsystem to process data interactively. CLI data ingest/processing/publishing can be run with explicit or implicit topic hierarchy routing (which needs to be tied to the pipeline via the Data mappings).
Explicit topic hierarchy workflow
# process a single CSV file
wis2box data ingest --topic-hierarchy foo.bar.baz -p /path/to/file.csv

# process a directory of CSV files
wis2box data ingest --topic-hierarchy foo.bar.baz -p /path/to/dir

# process a directory of CSV files recursively
wis2box data ingest --topic-hierarchy foo.bar.baz -p /path/to/dir -r
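When ingest is scripted, it can help to build the command in one place and inspect it before running it. The sketch below is only an illustration: `build_ingest_command` is a hypothetical helper, not part of wis2box, and it uses only the flags shown above (`--topic-hierarchy`, `-p`, `-r`). It assumes you want recursion whenever the path is a directory, and that paths contain no whitespace.

```shell
# Hypothetical helper (not part of wis2box): print the ingest command
# for a given topic hierarchy and path, without executing it.
build_ingest_command() {
    topic="$1"
    path="$2"
    if [ -d "$path" ]; then
        # Assumption: directories should be processed recursively
        printf 'wis2box data ingest --topic-hierarchy %s -p %s -r\n' "$topic" "$path"
    else
        printf 'wis2box data ingest --topic-hierarchy %s -p %s\n' "$topic" "$path"
    fi
}
```

Calling `build_ingest_command foo.bar.baz /path/to/file.csv` prints the single-file command; the printed line can be reviewed and then executed separately.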
Implicit topic hierarchy workflow
# process incoming data; topic hierarchy is inferred from fuzzy filepath equivalent
# wis2box will detect 'foo/bar/baz' as topic hierarchy 'foo.bar.baz'
wis2box data ingest -p /path/to/foo/bar/baz/data/file.csv
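To make the idea of filepath-based inference concrete, the following is a minimal sketch of one way such matching could work. It is not the wis2box implementation: `infer_topic_hierarchy` is a hypothetical function, and it assumes the candidate topic hierarchies are passed in as arguments and that a match means the slash-separated form of the topic appears as a directory sequence in the path.

```shell
# Hypothetical sketch (not the wis2box implementation): return the first
# known topic hierarchy whose slash-separated form appears in the file path.
infer_topic_hierarchy() {
    filepath="$1"
    shift
    for topic in "$@"; do
        # foo.bar.baz -> foo/bar/baz
        fragment=$(printf '%s' "$topic" | tr '.' '/')
        case "$filepath" in
            */"$fragment"/*)
                printf '%s\n' "$topic"
                return 0
                ;;
        esac
    done
    # no candidate topic hierarchy matched the path
    return 1
}
```

For example, `infer_topic_hierarchy /path/to/foo/bar/baz/data/file.csv foo.bar.baz` prints `foo.bar.baz`, while a path that contains none of the candidates makes the function return a non-zero status.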
Event driven ingest, processing and publishing
Once all metadata and topic hierarchies are set up, the event driven workflow
immediately starts listening for files in the
wis2box-incoming storage bucket as they are
placed in the appropriate topic hierarchy directory.
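As a rough illustration of what "the appropriate topic hierarchy directory" means, the sketch below derives a destination path inside the bucket from a topic hierarchy and a local file. `incoming_object_key` is a hypothetical helper, and the layout (dots in the topic hierarchy mapped to slashes, followed by the file name) is an assumption based on the `foo/bar/baz` example above. The upload itself is left out because it depends on the storage client in use.

```shell
# Hypothetical helper: compute where a file would be placed inside the
# wis2box-incoming bucket for a given topic hierarchy.
# Assumption: the directory layout mirrors the topic hierarchy, with
# dots replaced by slashes.
incoming_object_key() {
    topic="$1"
    file="$2"
    # foo.bar.baz -> foo/bar/baz
    prefix=$(printf '%s' "$topic" | tr '.' '/')
    printf '%s/%s\n' "$prefix" "$(basename "$file")"
}
```

With these assumptions, `incoming_object_key foo.bar.baz /local/data/file.csv` prints `foo/bar/baz/file.csv`, which is the path a file would need within the bucket for the event driven workflow to associate it with `foo.bar.baz`.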
Congratulations! At this point, you have successfully set up a wis2box data pipeline. Data should be flowing through the system.