As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.
We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.
Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.
This Code of Conduct is adapted from the Contributor Covenant, version 1.1.0, available from http://contributor-covenant.org/version/1/1/0/
Accession - plant materials collected from a particular area.
Active reflectance - measurement of light originating from a sensor that reflects off of an object and back to the sensor
Algorithm - a process or set of rules to be followed in calculations or other problem-solving operations
Alignment, sequence - a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences
API (application programming interface) - a set of routine definitions, protocols, and tools for building software and applications.
BAM (Binary Alignment/Map) format - binary format for storing sequence data.
BED (Browser Extensible Data) format - format consisting of one line per feature, each containing 3-12 columns of data, plus optional track definition lines.
BETYdb (Biofuel Ecophysiological Traits and Yields database) - a web-based database of plant trait and yield data that supports research, forecasting, and decision making associated with the development and production of cellulosic biofuel crops
BRDF (Bidirectional Reflectance Distribution Function) - a function of four real variables that defines how light is reflected at an opaque surface.
Breeding Management System (BMS) - an information management system developed by the Integrated Breeding Platform to help breeders manage the breeding process, from program planning to decision-making.
Brown Dog - a research project to develop a method for easily accessing stored historic research data in order to maintain the long-term viability of large bodies of scientific research.
BWA - a software package for mapping low-divergent sequences against a large reference genome.
Clowder - a scalable data repository for sharing, organizing and analyzing data
Collections - one or more datasets.
Cultivar - plants selected for desirable characteristics that can be maintained by propagation.
Data product level - the degree to which data products have been processed. Level 0 products are raw data at full instrument resolution. At higher levels, the data are converted into more useful parameters and formats.
Data standards - the rules by which data are described and recorded.
Datasets - one or more files with associated metadata collected by one sensor at one time point.
Downwelling spectral irradiance - the component of radiation directed toward the earth's surface per unit frequency or wavelength
Exposure - the amount of light per unit area reaching an electronic image sensor
FASTQ format - a text-based format for storing both a biological sequence (usually nucleotide sequence) and its corresponding quality scores.
FASTX-toolkit - a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
Gantry - a rail-bound crane system that transports a measurement platform (like the Scanalyzer) over a field
GAPIT (Genome Association and Prediction Integrated Tool) - an R package that performs genome-wide association studies (GWAS) and genomic prediction (or selection).
GATK (Genome Analysis Toolkit) - a software package for analysis of high-throughput sequencing data
GBrowse - a combination of database and interactive web pages for manipulating and displaying annotations on genomes.
Generic Model Organism Database (GMOD) - a collection of open source software tools for managing, visualizing, storing, and disseminating genetic and genomic data.
Genome annotation - the process of attaching biological information to sequences.
Genomic coordinates - the beginning and ending positions of an annotation along a sequence
Genotype calling - inferring the genotype carried by an individual at each site
GeoDjango - geographic Web framework for building GIS Web applications
Germplasm - the sum total of genetic resources of an organism.
GFF (General Feature Format) - format consisting of one line per feature, each containing 9 columns of data, plus optional track definition lines
GIS (geographic information system) - a system designed to capture, store, manipulate, analyze, manage, and present all types of spatial or geographical data.
Globus - a connected set of data transfer and sharing services for research data management.
Hierarchical Data Format (HDF) - a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data.
Hyperspectral data - information from across the electromagnetic spectrum.
IGV (Integrative Genomics Viewer) - a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
Integrated Breeding Platform (IBP) - a platform providing an integrated, high-performing breeding informatics and management system
JBrowse - an embeddable genome browser
JSON (JavaScript Object Notation) - an open-standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs.
Jupyter Notebook - a web application for creating and sharing documents that contain live code, equations, visualizations and explanatory text.
LemnaTec - a supplier of software and automated research platforms for plant phenotyping.
Metadata - data that provides information about other data
MLMM (multi-locus mixed-model) - analysis for genome-wide association studies (GWAS) that uses a forward and backward stepwise approach to select markers as fixed effect covariates in the model.
NetCDF - a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
OpenAlea - a distributed collaborative effort to develop Python libraries and tools that address the needs of current and future works in Plant Architecture modeling.
OpenCV (Open Source Computer Vision Library) - an open source computer vision and machine learning software library.
PAR (Photosynthetically Active Radiation) - the amount of light available for photosynthesis, which is light in the 400 to 700 nanometer wavelength range.
Phenotype - the set of observable characteristics of an individual resulting from the interaction of its genotype with the environment.
Phytozome - a project that facilitates comparative genomic studies amongst green plants.
PlantCV - an image processing package specific to plants that is built upon open-source software
PostGIS - an open source software program that adds support for geographic objects to the PostgreSQL object-relational database.
Python - a programming language
QA (quality assurance) - a planned system of review procedures conducted outside the actual data compilation.
QC (quality control) - a system of checks to assess and maintain the quality of the data.
Quality scores - measure of the probability that a nucleotide base is correctly identified from DNA sequencing
R/qtl - an extensible, interactive environment for mapping quantitative trait loci (QTL) in experimental crosses.
Raw data - unprocessed data collected from an experiment
Reads - sequence of nucleotides of a segment of DNA
Reference data - data that defines the set of permissible values to be used by other data fields.
RESTful API - an application program interface (API) that uses HTTP requests to get, put, post, and delete data.
ROGER - a cluster housed at NCSA that has 13.3 TB of system memory available for computation
RStudio - a set of integrated tools for use with R, a software environment for statistical computing and graphics.
SAMtools - a suite of programs for working with SAM (Sequence Alignment/Map), a generic format for storing large nucleotide sequence alignments.
Scanalyzer - instrumentation created by LemnaTec consisting of a robotic sensor arm with multiple overhead cameras and sensors
Sequencing - the process of determining the precise order of nucleotides within a DNA molecule.
SNP (single nucleotide polymorphism) - a variation in a single nucleotide that occurs at a specific position in the genome
Spaces - contain collections and datasets. TERRA-REF uses one space for each of the phenotyping platforms.
Spectral exposure - the radiant energy received by a surface, per unit area, per unit frequency
Spectral flux - the radiant energy emitted, reflected, transmitted or received, per unit time, per unit frequency
Spectral response function (SRF) - the quantum efficiency of a sensor at specific wavelengths over the range of a spectral band
SQL (Structured Query Language) - a special-purpose programming language designed for managing data held in a relational database management system
SRA (Sequence Read Archive) - a bioinformatics database that provides a public repository for DNA sequencing data
Standards committee - TERRA project representatives and external advisors who work to create clear definitions of data formats, semantics, and interfaces, file formats, and representations of space, time, and genetic identity based on existing standards, commonly used file formats, and user needs to make it easier to analyze and exchange data and results.
Swagger - a set of rules for a format describing REST API. The format can be used to share documentation among product managers, testers and developers, but can also be used by various tools to automate API-related processes.
TASSEL-GBS - software for investigating the relationship between phenotypes and genotypes
TERRA (Transportation Energy Resources from Renewable Agriculture) - a program funded by ARPA-E that facilitates the improvement of advanced biofuel crops by developing and integrating cutting-edge remote sensing platforms, complex data analytics tools, and high-throughput plant breeding technologies.
TERRA-REF (Transportation Energy Resources from Renewable Agriculture Phenotyping Reference Platform) - a research project focused on developing an integrated phenotyping system for energy sorghum that leverages genetics and breeding, automation, remote plant sensing, genomics, and computational analytics.
THREDDS (geospatial data server) - a web server that provides metadata and data access for scientific datasets, using a variety of remote data access protocols
Trait - the morphological, anatomical, physiological, biochemical and phenological characteristics of plants and their organs
Variants - a nucleotide difference in a genotype compared to a reference genotype
VCF (Variant Call Format) - a text file format, usually stored compressed, that contains meta-information lines, a header line, and then data lines each containing information about a position in the genome.
VCFtools - a program package designed for working with VCF files
White reference, reflectance of - light reflecting off of a white reference object that is used for the calibration of hyperspectral images
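To make the FASTQ format and Quality scores entries above more concrete, here is a minimal Python sketch that reads one four-line FASTQ record and converts its quality string to numeric scores. It assumes the Phred+33 encoding (the glossary does not say which encoding TERRA-REF data use), and the function names and the sample record are hypothetical, chosen only for illustration.

```python
def parse_fastq_record(lines):
    """Split one four-line FASTQ record into (identifier, sequence, quality string)."""
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    return header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Convert a quality string to integer scores (ASSUMPTION: Phred+33 offset)."""
    return [ord(char) - offset for char in quality]


def error_probability(score):
    """Probability that a base call is wrong, given its Phred score Q: 10^(-Q/10)."""
    return 10 ** (-score / 10)


# Hypothetical record, not taken from any TERRA-REF dataset.
record = ["@read1\n", "GATTACA\n", "+\n", "IIII5+!\n"]
name, sequence, quality = parse_fastq_record(record)
scores = phred_scores(quality)
```

Under this encoding a score of 20 corresponds to a 1 in 100 chance that the base was called incorrectly, and a score of 40 to a 1 in 10,000 chance.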
For use by the TERRA Reference Phenotyping Standards Committee.
All of the web-based software below provides the ability to organize projects hierarchically, facilitate sharing, and support collaboration. Much of this is publicly viewable.
GitHub (github.com/terraref): project management, website content and hosting, collaborative software development
Google Drive: collaborative editing of documents that we create (notes, manuscripts, etc.)
Data products repository https://github.com/terraref/reference-data
issues and milestones: https://github.com/terraref/reference-data/issues
Computational Pipeline Repository https://github.com/terraref/computational-pipeline
issues and milestones: https://github.com/terraref/computational-pipeline/issues
Website for R&D: https://terraref.ncsa.illinois.edu
Documentation
GitHub Repository: https://terraref.ncsa.illinois.edu
Edit in the GitBook Desktop Editor or GitBook Web interface (see GitBook Documentation)
Features
Interface to 'git', a specialized command-line tool for version control.
Issue tracking and discussion forum https://guides.github.com/features/issues/
Participants can reply to issues via email, similar to an email discussion list.