Standards
Riverscapes standards and how to apply them
The Riverscapes Consortium data standards provide a simple and flexible way of describing any data to be stored in the Riverscapes Data Exchange and visualized with relevant, consistent cartography.
We have intentionally kept our standards minimal. We have designed them to be just enough to catalog each dataset and how it should be displayed in our Riverscapes Viewers.
Core Concepts
Most geospatial and simulation modeling is built around the concept of transforming input data, and using some kind of model or transformer into output data. Each time the model is run can be considered a realization of some modeled state.
The Riverscapes Consortium have defined two standards: one for the data and one for the model itself.
The data standard lists the data in question including mandatory and optional metadata. We refer to one set of input data as a “project”, regardless of whether it is an input or output dataset. In practical terms, the project is described by a plain text XML file. The standard defines a set of rules that the XML file must follow to be considered compliant. This ruleset is stored and published online in an XSD file. The XSD specifies which XML tags should be used, how they should be used, and what attributes are required. Practical examples are provided for existing datasets for those wishing to learn how the standard can be applied to their own data.
XML rules
The XML rules are simple and generic. They are intended to accommodate data formats that are common among river scientists, while remaining flexible about how you choose to organize your data to accommodate its geospatial or temporal dimensionality.
Each project possesses its own XML file and the name of the file is always project.rs.xml
. This file is packaged and travels with the data so that all Riverscapes applications (Data Exchange, Viewer, and Studio) know about the standard and can discover, process and visualize the data in the dataset. The project.rs.xml
really is the magic sauce of the riverscapes data standard. It makes all other parts of our approach function.
Business Logic
Software that displays datasets to users, such as the Riverscapes Viewer, use another XML file called Business Logic to translate the project XML file into more practical ways for the users to interact with the data. The Business Logic XML file defines how to organize the datasets in the project into a hierarchical tree and how to symbolize the data in maps. Essentially, the Business Logic XML takes the technical specification of the data in the project file and turns it into a more useful format for more effectively finding and visualizing the data.
There is one Business Logic XSD ruleset that specifies which tags can be used and how to combine them to define the hierarchical tree structure for each project type. Each model has its own business logic XML file. Users can also customize the Business Logic for a particular model should they wish to visualize a project in different ways.
Why Data Standards Are Important
Increased Interoperability
The simple act of standardizing data on open source formats eliminates the need for expensive proprietary software that excludes those without the necessary license. Moreover, the standards allow for automated approaches to data processing that dramatically increase the scale at which analysis can be performed. The project.rs.xml makes it possible for any tool to read and load any file structure and use the data. It is with this approach that the Riverscapes Consortium has developed approximately six models that combine to form a data processing waterfall, each model handing off its outputs to the next step in the process. None of this would be possible without the riverscapes data standards.
Faster Map Visualization
Simply dragging geospatial layers onto a map in GIS and accepting the default symbology limits the value one can glean from the data. We have solved this problem by incorporating a system of consistent cartographic symbology. This allows a model developer to design symbology once and have that symbology reused across all our applications, be they on the web, in QGIS, or ArcGIS.
Every time you look at a certain layer it should look the same way. Not only does this avoid the repetitive, time-consuming task of symbolizing data all the time, but it speeds comprehension because the presentation of the data is fit for purpose. Relevant colors are used. Categorical symbols and color ramps fit the data. You no longer have to think about any of this once the standards are applied.
Increased Adoption
Riverscapes research, like many disciplines, is held back by too many technology silos and standalone ecosystems. The riverscapes emphasis on FAIR principles, a core tenant of which is open source non-proprietary data formats, is motivated by making data and software more accessible. The quicker and easier that people can get hold of data, the quicker they can incorporate it into their decision making. We are committed to reduce the friction surrounding data and believe that more people will use your models and data if it complies with widely adopted standards.
How to Apply Riverscapes Standards to Your Data
Follow these steps, all of which are underpinned by the FAIR principles, to apply the riverscapes standards to your data.
- Ensure that all the data are in open, non-proprietary formats. This is especially important for geospatial data. We recommend GeoTIFF files for rasters and GeoPackages for vector data. Shapefiles, although a published format, are discouraged for numerous reasons.
- Organize the data into a logical file and folder structure. Use common sense and think about how users will consume the data. There are no specific rules about the folder structure.
- Provide an accompanying XML file called
project.rs.xml
at the root of the folder structure. This XML file has a strict, but helpful structure that helps catalog the individual data files. This XML file must refer to the latest version of the Riverscapes Data Standards. The best way to ensure compliance is to refer to the XSD file and then validate your XML file using a tool such as Visual Studio Code. Data with aproject.rs.xml
file that does not validate cannot be incorporated into the Riverscapes Data Exchange. - Ensure that the
project.rs.xml
file includes appropriate metadata to help others understand and leverage your data. This is your chance to help future you remember all the nitty gritty details about where the data originated, how it was processed and any limitations on its use.