How to Access Data
Overview
TERRA-REF data can be accessed through several interfaces: Globus, Clowder, BETYdb, CyVerse, and CoGe. Raw data is transferred to the primary compute pipeline using Globus Online. Data is then ingested into Clowder to support exploratory analysis. The Clowder extractor system transforms the data and creates derived data products, which are either available via Clowder or published to specialized services, such as BETYdb.
| Resource | Use | Web User Interface | API* | clients |
| --- | --- | --- | --- | --- |
| Sensor Data | | | | |
| Globus | Browse directories; transfer large sensor files | globus.org #TERRAREF endpoint | | R, Python |
| Clowder | Browse and Download small Sensor Data | terraref.org/clowder | | Python |
| Trait Data | | | | |
| BETYdb | Trait and Agronomic Metadata | terraref.org/bety | | R traits package, Python: terrautils; SQL: Postgres in Docker |
| traitvis | View available trait data | terraref.org/traitvis | NA | NA |
| Genomics Data | | | | |
| CyVerse | Download Genomics data | terraref.org/cyverse-genomics | yes | |
| CoGe | Download, process, visualize Genomics data | terraref.org/coge | | |
| Other | | | | |
| Tutorials | R and Python scripts for accessing data | terraref.org/tutorials | NA | |
| Advanced Search | Search across sensor and trait data | search.terraref.org (under development) | yes | |
Tutorials (Recommended!)
We have developed tutorials to provide users with both 'quick start' vignettes and more detailed introductions to TERRA-REF datasets. Tutorials for accessing trait data, sensor data, and genomics data are organized by directory ("traits", "sensors", and "genomics").
The tutorials assume familiarity with, or willingness to learn, Python and/or R. They provide the greatest flexibility and access to available data.
Globus: Browse and Transfer Files
Raw data is transferred to the primary TERRA-REF file system at the National Center for Supercomputing Applications at the University of Illinois.
Use Globus Online when you want to transfer data from the TERRA-REF system for local analysis.
Transferring data using Globus Connect:
To access data via Globus, you must first have a Globus account and endpoint.
Select source
Endpoint: #Terraref
Path: Navigate to the subdirectory that you want.
Select (click) a folder
Select (highlight) the files that you want to download to the destination
Select the endpoint that you set up above for your local computer or server
Select the destination folder (e.g. /~/Downloads/)
Click 'go'
Files will be transferred to your computer
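For scripted transfers, the steps above can also be expressed as a call to the Globus command-line interface. The following is a minimal Python sketch, assuming the `globus` CLI is installed and you are already logged in. The endpoint IDs and source path are placeholders, not real TERRA-REF identifiers, and the helper names `build_transfer_command` and `submit` are introduced here only for illustration.

```python
import subprocess


def build_transfer_command(source_endpoint, source_path,
                           dest_endpoint, dest_path,
                           recursive=True, label=None):
    """Build the argument list for a `globus transfer` call."""
    cmd = [
        "globus", "transfer",
        f"{source_endpoint}:{source_path}",
        f"{dest_endpoint}:{dest_path}",
    ]
    if recursive:
        # Needed when the source path is a directory rather than a single file.
        cmd.append("--recursive")
    if label:
        cmd.extend(["--label", label])
    return cmd


def submit(cmd):
    """Run the command; raises CalledProcessError if the CLI reports failure."""
    return subprocess.run(cmd, check=True)


# Placeholder values (hypothetical): substitute the real endpoint IDs,
# the subdirectory you browsed to, and your own destination folder.
SOURCE_ENDPOINT = "SOURCE-ENDPOINT-ID"
DEST_ENDPOINT = "YOUR-LOCAL-ENDPOINT-ID"

command = build_transfer_command(
    SOURCE_ENDPOINT, "/path/to/subdirectory/",
    DEST_ENDPOINT, "/~/Downloads/",
    label="terraref-download",
)
print(" ".join(command))
# submit(command)  # uncomment to actually start the transfer
```

Building the argument list separately from running it makes it easy to inspect the command before any data moves.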
Requesting access to unpublished data via Globus:
To request access to unpublished data:
Fill out the terraref.org/beta user form.
Email your Globus ID to David LeBauer (dlebauer@email.arizona.edu) with 'TERRAREF Globus Access Request' in the subject line.
BETYdb: Trait Data and Agronomic Metadata
BETYdb contains the derived trait data with plot locations and other information associated with agronomic experimental design.
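As a rough illustration of programmatic access from Python, the sketch below builds a query URL for trait data. It rests on several assumptions that are not stated in this document: that the BETYdb instance is served over HTTPS at the address listed in the table above, that it exposes a v1 API with a `search` endpoint, and that it accepts `key` and `limit` query parameters. The API key and the `trait` filter value are illustrative placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base address taken from the resource table, with https added.
BASE_URL = "https://terraref.org/bety"


def build_search_url(api_key, base_url=BASE_URL, limit=10, **filters):
    """Build a query URL for the (assumed) /api/v1/search endpoint."""
    params = {"key": api_key, "limit": limit}
    params.update(filters)  # extra column filters, e.g. trait="..."
    return f"{base_url}/api/v1/search?{urlencode(params)}"


def fetch_json(url):
    """Download and decode a JSON response (requires network access)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder key and illustrative filter value.
url = build_search_url("YOUR_API_KEY", limit=5, trait="canopy_height")
print(url)
# records = fetch_json(url)  # uncomment once you have a valid API key
```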
Accessing data in R
Requesting Access to unpublished data in TERRA-REF BETYdb:
Email dlebauer@email.arizona.edu to have your account approved.
Using SQL and PostGIS with Docker (Advanced Users)
The fastest and most comprehensive way to access the database is through SQL and other database interfaces (such as the R package dplyr interface described below, or GIS programs). You can run an instance of the database using Docker, as described below.
This is how you can access the TERRA-REF trait database. It requires that you install the Docker software on your computer.
psql
R
GIS software
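As one illustration of the psql route, the Python sketch below assembles a `psql` command line for a locally running database container. The host, port, user, and database name are hypothetical placeholders, because the actual connection settings depend on how the Docker container is started. The helper name `build_psql_command` is likewise introduced here only for illustration.

```python
import subprocess


def build_psql_command(host, port, user, dbname, sql=None):
    """Build the argument list for a psql session or a one-off query."""
    cmd = ["psql", "-h", host, "-p", str(port), "-U", user, "-d", dbname]
    if sql:
        # -c runs a single command and exits instead of opening a prompt.
        cmd.extend(["-c", sql])
    return cmd


# Hypothetical connection settings: replace with the values used when
# starting your own database container.
command = build_psql_command(
    host="localhost",
    port=5432,
    user="YOUR_DB_USER",
    dbname="YOUR_DB_NAME",
    sql="SELECT count(*) FROM traits;",
)
print(" ".join(command))
# subprocess.run(command, check=True)  # uncomment once the container is up
```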
Clowder: Sensor Data and Metadata Browser
Data organization in Clowder
Data is organized into spaces, collections, and datasets.
Spaces contain collections and datasets. TERRA-REF uses one space for each of the phenotyping platforms.
Collections consist of one or more datasets. TERRA-REF collections are organized by acquisition date and sensor. Users can also create their own collections.
Datasets consist of one or more files with associated metadata collected by one sensor at one time point. Users can annotate, download, and use these sensor datasets.
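The hierarchy described above can be pictured as nested containers. The following Python sketch models it with dataclasses. It is an illustration of the relationships only, not Clowder's actual data model or API, and the field names and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """One or more files plus metadata from one sensor at one time point."""
    name: str
    sensor: str
    files: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Collection:
    """A group of datasets, e.g. organized by acquisition date and sensor."""
    name: str
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Space:
    """A space holds collections and datasets (one space per platform)."""
    name: str
    collections: List[Collection] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)

    def all_datasets(self) -> List[Dataset]:
        """Datasets attached directly plus those reached via collections."""
        found = list(self.datasets)
        for collection in self.collections:
            found.extend(collection.datasets)
        return found


# Hypothetical example values, for illustration only.
scan = Dataset(name="example-scan", sensor="example-sensor",
               files=["left.bin", "right.bin"])
daily = Collection(name="example-sensor, one acquisition date", datasets=[scan])
platform = Space(name="example-platform", collections=[daily])

print([d.name for d in platform.all_datasets()])
```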
Requesting Access to unpublished data in Clowder:
Email dlebauer@email.arizona.edu to have your account approved.
CyVerse: Genomics Data
TERRA-REF genomics data is accessible on the CyVerse Data Store and Discovery Environment. Accessing data through the CyVerse Discovery Environment requires signing up for a free CyVerse account. The Discovery Environment gives users access to software and computing resources, so this method has the advantage that TERRA-REF data can be utilized directly without the need to copy the data elsewhere.
You can also find these in the CyVerse Discovery Environment in the TERRA-REF Community Data folder: /iplant/home/shared/terraref.
CoGe: Genomics Data