Riverscapes API

query and download data from the Riverscapes Data Exchange

Underneath the Riverscapes Data Exchange is a powerful Application Programming Interface (API) that allows you to query and download data from the Data Exchange. This API is built using GraphQL and can be used from most modern programming languages such as Python, R, and JavaScript.

The Riverscapes Command Line Interface (rscli) is a command line application that uses the API to access the Data Exchange. It is a good place to start if you want to access data in the exchange but don't want to learn about individual API calls. Using rscli is appropriate if you want to download a handful of projects. If you want to download a large number of projects, you will need to write your own code.

Security

The API is secured using the same technology as the Data Exchange itself:

Authentication

You will need a Riverscapes account to use the API. The easiest way to get an account is to sign up for the Data Exchange. Once you have an account, you can use the same credentials to access the API.

Authorization

The API uses the same authorization model as the Data Exchange. You can access the same projects and data that you can access through the Data Exchange with your user credentials. If there are private datasets that are inaccessible to you through the Data Exchange, you will not be able to access them through the API.

The Riverscapes Data Model

The Riverscapes Data Model is a set of concepts that describe the data that is stored in the Data Exchange. It consists of the following objects:

Projects

Projects are the foundation of the Riverscapes Data Model. A project is a collection of data that is related to a specific area and theme. Projects can represent a single river reach or an entire watershed. They can also represent a single theme such as fish habitat or a combination of themes such as fish habitat and beaver restoration.

By far the most numerous projects in the Data Exchange are those that represent a watershed. These projects are created by the Riverscapes Consortium by running our models in the cloud using Cyber Castor.

There are also Data Exchange projects that represent individual river reaches, just a few hundred metres of stream.

Each project is associated with a project type. The project type describes the theme of the project. For example, a project type might be fish_habitat or beaver_restoration. The project type is used to determine which datasets are associated with the project. For example, a project with a project type of fish_habitat might have datasets that contain fish habitat data. The project type also defines how the project appears in the any Riverscapes Viewer including the Data Exchange Web Viewer by specifying which business logic to use.

Datasets

Each project can contain one or more datasets. A dataset might be represented by a single file such as a GeoPackage or a collection of files, for example for a shapefile. Projects can also contain reports, images, logs, CSV files, -- practically any file-based format.

Files

Each dataset contains one or more files.

Users

Users are individuals with a Riverscapes account. Users can be members of one or more organizations. Users, like organizations, can be the owner of a project. And users can be members of one or more organizations.

Organizations

Organizations, like users, can own projects. They are used to simplify the access permissions to projects. Users can be members of one or more organizations.

Collections

Collections are a way to group projects together. They are used to simplify the access permissions to projects. Users can create collections and add projects to them. Users can also share collections with other users.

Project Metadata

Each project has metadata formatted as key/value pairs that describe the project. For example, a project might have a watershed key with a value of Sandy River Watershed. Project metadata is also used to search for projects.

Visibility

Projects can have one of three visibility settings:

public - Anyone can see and download this project.
private - Anyone can see this project, but specific access is required to download it.
secret - Only those with specific access can see or download this project.

Accessing the API with GraphQL

The Riverscapes API endpoint is https://api.data.riverscapes.net/.

Before you attempt to connect to the API via computer code, its strongly recommended that you develop your queries using desktop API software tool such as Postman or Insomnia. These tools allow you to interact with the API and see the results of your queries. Once you have your queries working, you can then move on to writing code.

Start by making a basic query and then gradually add more clauses to get more specific. Here's a basic query for getting all projects of type "VBET", the Riverscapes Valley Bottom Estimation Tool:

  query searchProjects_query(
    $searchParams: ProjectSearchParamsInput!
    $sort: [SearchSortEnum!]
    $limit: Int!
    $offset: Int!
    ) {
      searchProjects(limit: $limit, offset: $offset, params: $searchParams, sort: $sort) {
    results {
      item {
        id
        name
        tags
        meta {
          key
          value
        }
        projectType {
          id
        }
        createdOn
      }
    }
    total
  }
}

The GraphQL is also self-documenting. Within Insomnia or Postman you can click on the "Docs" button to see the documentation for the API. This documentation will show you all of the available queries and the parameters that they accept.

Accessing the API with a Programming Language

There are several examples of scripts that connect to the Data Exchange API to download projects:

Merge Projects Script

Multiple Projects For a Given Watershed

The Data Exchange can and does contain multiple projects of a specific type for a given watershed. There are many instances where we have run multiple versions of a model (e.g. VBET) for a particular watershed. You should anticipate this when coding against the API and incporporate logic to handle multiple projects for a given watershed. One way to do this is to distinguish the projects by their version. We do this using semantic versioning using the semver Python package and code like this:

# Find the model with the greatest version number
project_versions = {}
for project_id, project_info in projects.items():
    for key, val in {meta_item['key']: meta_item['value'] for meta_item in project_info['meta']}.items():
        if key.replace(' ', '').lower() == 'modelversion' and val is not None:
            project_versions[semver.VersionInfo(val)] = project_id
            break

project_versions_list = list(project_versions)
project_versions_list.sort(reverse=True)
return projects[project_versions[project_versions_list[0]]]