Riverscapes API
query and download data from the Riverscapes Data Exchange
Underneath the Riverscapes Data Exchange is a powerful Application Programming Interface (API) that allows you to query and download data from the Data Exchange. This API is built uising GraphQL and can be used from most modern programming languages such as Python, R, and JavaScript.
The Riverscapes Command Line Interface (RSCLI) is a command line application that uses the API to access the Data Exchange. It is a good place to start if you want to access data in the exchange but don't want to learn about individual API calls. RSCLI is more appropriate if you just want to download a handful of projects. If you want to download a large number of projects, you will need to write your own code.
Security
The API is secured using the same technology as the Data Exchange itself:
Authentication
You will need a Riverscapes account to use the API. The easiest way to get an account is to sign up for the Data Exchange. Once you have an account, you can use the same credentials to access the API.
Authorization
The API uses the same authorization model as the Data Exchange. You can access the same projects and data that you can access through the Data Exchange with your user credentials. If there are private datasets that are inaccessible to you through the Data Exchange, you will not be able to access them through the API.
The Riverscapes Data Model
The Riverscapes Data Model is a set of concepts that describe the data that is stored in the Data Exchange. It consists of the following objects:
Projects
Projects are the foundation of the Riverscapes Data Model. A project is a collection of data that is related to a specific area and theme. Projects can represent a single river reach or an entire watershed. They can also represent a single theme such as fish habitat or a combination of themes such as fish habitat and beaver restoration.
By far the most numerous projects in the Data Exchange are those that represent a watershed. These projects are created by the Riverscapes Consortium by running our models in the cloud using Cyber Castor.
There are also Data Exchange projects that represent individual river reaches, just a few hundred metres of stream.
Each project is associated with a project type. The project type describes the theme of the project. For example, a project type might be fish_habitat
or beaver_restoration
. The project type is used to determine which datasets are associated with the project. For example, a project with a project type of fish_habitat
might have datasets that contain fish habitat data. The project type also defines how the project appears in the Data Exchange Web Viewer by specifying which business logic to use.
Datasets
Each project can contain one or more datasets. A dataset might be a single file such as a GeoPackage or a collection of files such as a ShapeFile. They can also be reports, imaages, logs, CSV files, practicaally any file based format.
Files
Each dataset contains one or more files.
Users
Users are individuals with a Riverscapes account. Users can be members of one or more organizations. Users, like organizations, can be the owner of a project. And users can be members of one or more organizations.
Organizations
Organizations, like users, can own projects. They are used to simplify the access permissions to projects. Users can be members of one or more organizations.
Collections
Collections are a way to group projects together. They are used to simplify the access permissions to projects. Users can create collections and add projects to them. Users can also share collections with other users.
Tags
Each project can have one or more tag. Tags are used to group projects together. For example, all of the projects that represent a watershed might have the tag watershed
. Tags are also used to search for projects.
Project Metadata
Each project can metadata key/value pairs that describe the project. For example, a project might have a watershed
key with a value of Sandy River Watershed
. Project metadata is also used to search for projects.
Visibility
Projects can have one of three visibility settings:
public
- Anyone can see and download this project.private
- Anyone can see this project, but specific access is required to download it.secret
- Only those with specific access can see or download this project.
Accessing the API with GraphQL
Before you attempt to connect to the API via computer code, its strongly recommended that you develop your queries using desktop API software tool such as Postman or Insomnia. These tools allow you to interact with the API and see the results of your queries. Once you have your queries working, you can then move on to writing code.
Start by making a basic query and then gradually add more clauses to get more specific. Here's a basic query for getting all projects of type "VBET", the Riverscapes Valley Bottem Estimation Tool:
query searchProjects_query(
$searchParams: ProjectSearchParamsInput!
$sort: [SearchSortEnum!]
$limit: Int!
$offset: Int!
) {
searchProjects(limit: $limit, offset: $offset, params: $searchParams, sort: $sort) {
results {
item {
id
name
tags
meta {
key
value
}
projectType {
id
}
createdOn
}
}
total
}
}
The GraphQL is also self-documenting. Within Insomnia or Postman you can click on the "Docs" button to see the documentation for the API. This documentation will show you all of the available queries and the parameters that they accept.
Accessing the API with a Programming Language
There are several examples of scripts that connect to the Data Exchange API to download projects:
Multiple Projects For a Given Watershed
The Data Exchange can and does contain multiple projects of a specific type for a given watershed. There are many instances where we have run multiple versions of a model (e.g. VBET) for a particular watershed. You should anticipate this when coding against the API and incporporate logic to handle multiple projects for a given watershed. One way to do this is to distinguish the projects by there version. We do this using semantic versioning using the semver
Python package and code like this:
# Find the model with the greatest version number
project_versions = {}
for project_id, project_info in projects.items():
for key, val in {meta_item['key']: meta_item['value'] for meta_item in project_info['meta']}.items():
if key.replace(' ', '').lower() == 'modelversion' and val is not None:
project_versions[semver.VersionInfo(val)] = project_id
break
project_versions_list = list(project_versions)
project_versions_list.sort(reverse=True)
return projects[project_versions[project_versions_list[0]]]