Data ingest setup
The runtime component of wis2box is data ingestion. This is an event-driven workflow, triggered by the S3 notifications that are generated when data is uploaded to wis2box storage.
The wis2box storage is provided by a MinIO container, which offers S3-compatible object storage.
Any file received in the
wis2box-incoming storage bucket will trigger an action to process the file.
What action to take is determined by the
data-mappings.yml you set up in the previous section.
MinIO user interface
To access the MinIO user interface, visit
http://localhost:9001 in your web browser.
You can login with your
To test the data ingest, add a sample file for your observations in the
wis2box-incoming storage bucket.
Select ‘browse’ on the
wis2box-incoming bucket and select ‘Choose or create a new path’ to define a new folder path:
The folder in which the file is placed defines the dataset for the data you are sharing. For example, for dataset
foo.bar, store your file in the path
The path is also used to define the topic hierarchy for your data (see WIS2 topic hierarchy). The first 3 levels of the WIS2 topic hierarchy
origin/a/wis2 are automatically included by wis2box when publishing data notification messages.
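The relationship between the folder path and the topic can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the code wis2box itself runs; the function name `topic_from_path` is hypothetical, and the example path is the one used in the upload script later on this page.

```python
# Illustrative sketch only: wis2box performs this mapping internally.
# The helper name `topic_from_path` is an assumption made for this example.

WIS2_TOPIC_PREFIX = "origin/a/wis2"  # first three levels, added automatically


def topic_from_path(folder_path: str) -> str:
    """Return the topic hierarchy implied by a folder path in wis2box-incoming."""
    # Ignore leading/trailing slashes and any empty path segments
    levels = [part for part in folder_path.split("/") if part]
    if not levels:
        raise ValueError("folder path must contain at least one level")
    return "/".join([WIS2_TOPIC_PREFIX, *levels])


print(topic_from_path(
    "/ita/italy_wmo_demo/data/core/weather/surface-based-observations/synop/"))
```

A path that does not correspond to a dataset defined in your data mappings would still produce a topic string here, so a helper like this is useful for reasoning about naming, not for validation.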
The error message
Topic Hierarchy validation error: No plugins for minio:9000/wis2box-incoming/... in data mappings indicates you stored a file in a folder for which no matching dataset was defined in your
After uploading a file to the
wis2box-incoming storage bucket, you can browse the content in the
wis2box-public bucket. If the data ingest was successful, new data will appear as follows:
If no data appears in the
wis2box-public storage bucket, you can inspect the logs from the command line:
python3 wis2box-ctl.py logs wis2box
Or by visiting the local Grafana instance running at
wis2box workflow monitoring
The Grafana homepage shows an overview with the number of files received, new files produced and messages published.
Pay attention to the messages reported in the wis2box logs (right-hand side), which indicate errors encountered during data processing:
Once you have verified that the data ingest is working correctly you can prepare an automated workflow to send your data into wis2box.
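The check on the wis2box-public bucket can also be scripted rather than done through the MinIO user interface. The sketch below is a minimal example under stated assumptions: the helper name `list_public_objects` is hypothetical, and it assumes your storage credentials are allowed to list the wis2box-public bucket. It relies only on the `list_objects` call of the MinIO Python client.

```python
from typing import Any, List

PUBLIC_BUCKET = "wis2box-public"


def list_public_objects(client: Any, prefix: str = "") -> List[str]:
    """Return the sorted names of all objects under `prefix` in the public bucket.

    `client` is expected to be a minio.Minio instance (hypothetical helper,
    not part of wis2box).
    """
    # recursive=True walks nested folders instead of stopping at the top level
    objects = client.list_objects(PUBLIC_BUCKET, prefix=prefix, recursive=True)
    return sorted(obj.object_name for obj in objects)


# Typical use (requires the `minio` package and a running wis2box):
#   from minio import Minio
#   client = Minio(endpoint='localhost:9000',
#                  access_key='wis2box-storage-user',
#                  secret_key='<your-unique-password>',
#                  secure=False)
#   for name in list_public_objects(client):
#       print(name)
```

An empty result after an upload suggests the ingest did not produce output, in which case the logs and the Grafana dashboard described above are the place to look.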
Automating data ingestion
The following Python example uploads a file using the MinIO package:
from minio import Minio

# Local file to upload and the target folder in the wis2box-incoming bucket
filepath = '/home/wis2box-user/local-data/mydata.bin'
minio_path = '/ita/italy_wmo_demo/data/core/weather/surface-based-observations/synop/'

# The MinIO client expects host:port without a URL scheme
endpoint = 'localhost:9000'
WIS2BOX_STORAGE_USERNAME = 'wis2box-storage-user'
WIS2BOX_STORAGE_PASSWORD = '<your-unique-password>'

client = Minio(
    endpoint=endpoint,
    access_key=WIS2BOX_STORAGE_USERNAME,
    secret_key=WIS2BOX_STORAGE_PASSWORD,
    secure=False)  # plain HTTP on localhost

# Name the object after the local file, inside the dataset folder
filename = filepath.split('/')[-1]
client.fput_object('wis2box-incoming', minio_path+filename, filepath)
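For an automated workflow you will usually want to send more than one file at a time. The sketch below extends the single-file upload to a whole directory. The helper name `upload_directory`, the `pattern` parameter and the directory shown in the usage comment are assumptions made for illustration; the only MinIO call used is `fput_object`, as in the example above.

```python
import glob
import os
from typing import Any, List

INCOMING_BUCKET = "wis2box-incoming"


def upload_directory(client: Any, local_dir: str, minio_path: str,
                     pattern: str = "*") -> List[str]:
    """Upload every regular file in `local_dir` matching `pattern`.

    `client` is expected to be a minio.Minio instance. Returns the object
    names that were uploaded (hypothetical helper, not part of wis2box).
    """
    uploaded = []
    for filepath in sorted(glob.glob(os.path.join(local_dir, pattern))):
        if not os.path.isfile(filepath):
            continue  # skip sub-directories
        # Keep the dataset folder and append the local file name
        object_name = minio_path.rstrip("/") + "/" + os.path.basename(filepath)
        client.fput_object(INCOMING_BUCKET, object_name, filepath)
        uploaded.append(object_name)
    return uploaded


# Typical use, with `client` created as in the single-file example:
#   upload_directory(
#       client,
#       '/home/wis2box-user/local-data',
#       '/ita/italy_wmo_demo/data/core/weather/surface-based-observations/synop/',
#       pattern='*.bin')
```

Passing the client in as an argument keeps the helper independent of how credentials are loaded, so the same function can be called from a cron job or a file-watcher script.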
Another example can be found in the GitHub minio-ftp-forwarder repository, which demonstrates how to set up an FTP forwarding workflow to MinIO.
After you have successfully set up your data ingest process into wis2box, you are ready to share your data with the global WIS2 network by enabling external access to your public services.
Next: Public services setup