Using Python, requests and Pandas
Python is a popular programming language that is heavily used in data science. It provides high-level functionality that supports rapid application development, along with a large ecosystem of packages for working with weather, climate, and water data.
Let’s use the Python requests package to further interact with the wis2box API, and Pandas to run some simple summary statistics.
[1]:
import json
import requests
def pretty_print(data):
    # print any JSON-serializable object with two-space indentation
    print(json.dumps(data, indent=2))
# define the endpoint of the OGC API
api = 'http://localhost/oapi'
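The cells in this notebook call `requests.get(url).json()` directly, which is fine for a local walkthrough. A slightly more defensive variant, sketched here on the assumption that you want failures to surface early, adds a timeout and raises on HTTP error statuses. The helper name `get_json` and the 30-second default are illustrative choices, not part of wis2box; `session` can be any object with a requests-style `get` method, such as `requests.Session()`.

```python
def get_json(session, url, params=None, timeout=30):
    """Fetch a URL and return the decoded JSON body.

    `session` is any object with a requests-style `get` method,
    for example `requests.Session()`. The function name and the
    default timeout are illustrative assumptions.
    """
    response = session.get(url, params=params, timeout=timeout)
    # With requests, this raises requests.HTTPError on 4xx/5xx responses
    # instead of silently parsing an error body
    response.raise_for_status()
    return response.json()
```

With requests imported as above, a call could look like `get_json(requests.Session(), f'{api}/collections/stations/items', params={'limit': 50})`.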
Stations
Let’s find all the stations in our wis2box:
[2]:
url = f'{api}/collections/stations/items?limit=50'
response = requests.get(url).json()
print(f"Number of stations: {response['numberMatched']}")
print('Stations:\n')
for station in response['features']:
    print(station['properties']['name'])
Number of stations: 26
Stations:
NAMBUMA
BALAKA
BILIRA
CHIDOOLE
CHIKANGAWA
CHIKWEO
CHINGALE
KALAMBO
KASIYA AWS
KASUNGU NATIONAL PARK AWS
KAWALAZI
KAYEREKERA
LENGWE NATIONAL PARK
LOBI AWS
MAKANJIRA
MALOMO
MISUKU
MLARE
MLOMBA
MTOSA BENGA
NAMITAMBO
NANKUMBA
NKHOMA UNIVERSITY
NKHULAMBE
NYACHILENDA
TOLEZA
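The request above asks for up to 50 items, which is enough for the 26 stations here. For larger collections, one possible approach is to follow the `next` link that OGC API - Features servers can include in each page of results. The sketch below assumes the server emits such links; `iter_features` and `fetch_json` are hypothetical names, and `fetch_json` is a function you supply that maps a URL to parsed JSON.

```python
def iter_features(fetch_json, url):
    """Yield every feature from a paged items endpoint.

    `fetch_json` is a caller-supplied function mapping a URL to parsed
    JSON, e.g. `lambda u: requests.get(u, timeout=30).json()`.
    Assumes each page advertises the following page via a `rel: next` link.
    """
    while url:
        page = fetch_json(url)
        yield from page.get('features', [])
        # Follow the first `next` link, or stop when the page has none
        url = next(
            (link['href'] for link in page.get('links', [])
             if link.get('rel') == 'next'),
            None,
        )
```

Because the function is a generator, station names can be collected lazily, for example `[f['properties']['name'] for f in iter_features(fetch, url)]`.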
Discovery Metadata
Now, let’s find all the datasets provided by the above stations. Each dataset is identified by a WIS2 discovery metadata record.
[3]:
url = f'{api}/collections/discovery-metadata/items'
response = requests.get(url).json()
print('Datasets:\n')
for dataset in response['features']:
    print(f"id: {dataset['properties']['id']}, title: {dataset['properties']['title']}")
Datasets:
id: data.core.test-passthrough, title: Surface weather observations (passthrough)
id: mw-mw_met_centre.data.core.weather.surface-based-observations.synop, title: Surface weather observations (hourly)
Let’s find all the data access links associated with the Surface weather observations (hourly) dataset:
[4]:
dataset_id = 'mw-mw_met_centre.data.core.weather.surface-based-observations.synop'
url = f"{api}/collections/discovery-metadata/items/{dataset_id}"
response = requests.get(url).json()
print('Data access links:\n')
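for link in response['links']:
    # print the full link object, followed by its URL, media type and relation
    print(f"{link} {link['href']} ({link['type']}) {link['rel']}")

# the last expression in the cell is displayed as its output
[link['href'] for link in response['links']]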
Data access links:
{'rel': 'self', 'type': 'application/geo+json', 'title': 'This document as GeoJSON', 'href': 'http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=json'} http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=json (application/geo+json) self
{'rel': 'alternate', 'type': 'application/ld+json', 'title': 'This document as RDF (JSON-LD)', 'href': 'http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=jsonld'} http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=jsonld (application/ld+json) alternate
{'rel': 'alternate', 'type': 'text/html', 'title': 'This document as HTML', 'href': 'http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=html'} http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=html (text/html) alternate
{'rel': 'collection', 'type': 'application/json', 'title': 'Discovery metadata', 'href': 'http://localhost/oapi/collections/discovery-metadata'} http://localhost/oapi/collections/discovery-metadata (application/json) collection
[4]:
['http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=json',
'http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=jsonld',
'http://localhost/oapi/collections/discovery-metadata/items/mw-mw_met_centre.data.core.weather.surface-based-observations.synop?f=html',
'http://localhost/oapi/collections/discovery-metadata']
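Each link carries a `rel` and a `type`, so a small filter can pick out a specific one instead of scanning the printed list. This is a minimal sketch; `find_links` is a hypothetical helper, not a wis2box or requests function.

```python
def find_links(record, rel=None, media_type=None):
    """Return the links of a record matching a relation and/or media type.

    A criterion left as None is not applied.
    """
    return [
        link for link in record.get('links', [])
        if (rel is None or link.get('rel') == rel)
        and (media_type is None or link.get('type') == media_type)
    ]
```

For the response above, `find_links(response, rel='alternate', media_type='text/html')` would select the HTML representation of the record.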
Let’s use the OGC API - Features (OAFeat) link to drill into the observations for Chidoole station:
[5]:
dataset_api_link = 'http://localhost/oapi/collections/mw-mw_met_centre.data.core.weather.surface-based-observations.synop'
dataset_api_link
[5]:
'http://localhost/oapi/collections/mw-mw_met_centre.data.core.weather.surface-based-observations.synop'
Observations
Let’s inspect some of the data in the API’s raw GeoJSON format:
[6]:
url = f'{dataset_api_link}/items'
query_parameters = {
    'wigos_station_identifier': '0-454-2-AWSCHIDOOLE',
    'limit': 10000,
    'name': 'air_temperature'
}
response = requests.get(url, params=query_parameters).json()
pretty_print(response['features'][0])
{
"id": "WIGOS_0-454-2-AWSCHINGALE_20220112T135500-25",
"reportId": "WIGOS_0-454-2-AWSCHINGALE_20220112T135500",
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [
35.11,
-15.24,
623.0
]
},
"properties": {
"wigos_station_identifier": "0-454-2-AWSCHINGALE",
"phenomenonTime": "2022-01-12T13:55:00Z",
"resultTime": "2022-01-12T13:55:00Z",
"name": "air_temperature",
"value": 24.85,
"units": "Celsius",
"description": null,
"metadata": [
{
"name": "station_or_site_name",
"value": null,
"units": "CCITT IA5",
"description": "Chingale"
},
{
"name": "station_type",
"value": 0,
"units": "CODE TABLE",
"description": "Automatic"
},
{
"name": "height_of_barometer_above_mean_sea_level",
"value": 624.0,
"units": "m",
"description": null
},
{
"name": "height_of_sensor_above_local_ground_or_deck_of_marine_platform",
"value": 1.5,
"units": "m",
"description": null
}
],
"index": 25,
"fxxyyy": "012101",
"id": "WIGOS_0-454-2-AWSCHINGALE_20220112T135500-25"
}
}
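The feature nests its values under `geometry`, `properties`, and a `metadata` list, which is awkward to tabulate directly. One way to flatten it into a single record is sketched below. `flatten_observation` is a hypothetical helper; it assumes the third coordinate is the station elevation and that each metadata entry holds its content in either `value` or `description`, as in the example above.

```python
def flatten_observation(feature):
    """Turn one observation feature into a flat dict of column -> value."""
    props = feature['properties']
    # GeoJSON positions are [longitude, latitude] with an optional third value,
    # assumed here to be elevation in metres
    lon, lat, *rest = feature['geometry']['coordinates']
    row = {
        'wigos_station_identifier': props['wigos_station_identifier'],
        'resultTime': props['resultTime'],
        'name': props['name'],
        'value': props['value'],
        'units': props['units'],
        'longitude': lon,
        'latitude': lat,
        'elevation': rest[0] if rest else None,
    }
    # Prefer the metadata value; fall back to the description when value is null
    for item in props.get('metadata', []):
        if item['value'] is not None:
            row[item['name']] = item['value']
        else:
            row[item['name']] = item['description']
    return row
```

A list of such rows, one per feature, can be passed straight to `pd.DataFrame`.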
Let’s inspect what’s measured at Chidoole:
[7]:
print('Observed property:\n')
feature = response['features'][9]
print(f"{feature['properties']['name']} ({feature['properties']['units']})")
Observed property:
air_temperature (Celsius)
Pandas
Let’s use the GeoJSON to build a more user-friendly table:
[8]:
import pandas as pd
datestamp = [obs['properties']['resultTime'] for obs in response['features']]
air_temperature = [obs['properties']['value'] for obs in response['features']]
d = {
    'Date/Time': datestamp,
    'Air temperature (°C)': air_temperature
}
df = pd.DataFrame(data=d)
[9]:
df
[9]:
| | Date/Time | Air temperature (°C) |
|---|---|---|
| 0 | 2022-01-12T13:55:00Z | 24.85 |
| 1 | 2022-01-12T14:55:00Z | 27.25 |
| 2 | 2022-01-12T15:55:00Z | 26.65 |
| 3 | 2022-01-12T16:55:00Z | 25.95 |
| 4 | 2022-01-12T17:55:00Z | 25.45 |
| ... | ... | ... |
| 5101 | 2022-06-09T12:55:00Z | 21.35 |
| 5102 | 2022-06-09T13:55:00Z | 22.25 |
| 5103 | 2022-06-09T14:55:00Z | 20.25 |
| 5104 | 2022-06-10T12:55:00Z | 23.75 |
| 5105 | 2022-06-10T14:55:00Z | 21.15 |
5106 rows × 2 columns
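The `Date/Time` column holds ISO 8601 strings. Converting them to timestamps makes time-based operations such as daily averages possible. The sketch below uses a handful of rows copied from the table so that it runs on its own; the variable names `sample` and `daily_mean` are illustrative, and in the notebook the same steps could be applied to `df`.

```python
import pandas as pd

# Sample rows copied from the table above
sample = pd.DataFrame({
    'Date/Time': [
        '2022-01-12T13:55:00Z', '2022-01-12T14:55:00Z', '2022-01-12T15:55:00Z',
        '2022-06-09T12:55:00Z', '2022-06-09T13:55:00Z',
    ],
    'Air temperature (°C)': [24.85, 27.25, 26.65, 21.35, 22.25],
})

# Parse the ISO 8601 strings into timezone-aware (UTC) timestamps
sample['Date/Time'] = pd.to_datetime(sample['Date/Time'], utc=True)

# Index by time, average per calendar day (UTC), and drop days without data
daily_mean = (
    sample.set_index('Date/Time')['Air temperature (°C)']
    .resample('D')
    .mean()
    .dropna()
)
print(daily_mean)
```

Resampling fills the range between the first and last timestamp with one bin per day, so `dropna()` is used to discard days that have no observations.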
[10]:
print("Time extent\n")
print(f'Begin: {df["Date/Time"].min()}')
print(f'End: {df["Date/Time"].max()}')
print("Summary statistics:\n")
df[['Air temperature (°C)']].describe()
Time extent
Begin: 2022-01-12T13:55:00Z
End: 2022-06-10T14:55:00Z
Summary statistics:
[10]:
| | Air temperature (°C) |
|---|---|
| count | 5106.000000 |
| mean | 23.541559 |
| std | 4.053172 |
| min | 13.550000 |
| 25% | 20.950000 |
| 50% | 23.350000 |
| 75% | 26.350000 |
| max | 37.850000 |