Tool assembly: Plant Phenomics

What is the plant phenomics tool assembly and who can use it?

The plant phenomics tool assembly covers the whole life cycle of experimental plant phenotyping data. It uses the concepts of the MIAPPE (Minimum Information About a Plant Phenotyping Experiment) standard: (i) the description of experiments, including organisation, objectives and location, (ii) the description and identification of biological material and (iii) the description of traits (phenotypic and environmental), including measurement methodology. A more detailed overview of the MIAPPE standard is available, as well as the full specifications.

The plant phenomics tool assembly helps everyone in charge of plant phenotyping data management to enable:

the integration of phenotyping data with other omics data: see the general principles on the Plant Sciences domain page;
the findability of their data in plant specific (e.g. FAIDARE) or generic search portals (e.g. Google Dataset Search);
the long term reusability of their data.

How can you access the plant phenomics tool assembly?

All the components of the plant phenomics tool assembly are publicly available and listed below, but many of them require registration.

Figure 1. The plant phenomics tool assembly: tools and resources used in managing plant phenomics and phenotyping data.

Data management planning

The general principles to be considered are described in the Plant Sciences domain page.

Data Stewardship Wizard is a human-friendly tool for collaborative editing of machine-actionable DMPs. The DSW Plant Sciences project template, available on ELIXIR’s DSW instance for researchers, can be used for any plant sciences project. When creating the DMP Project, choose the option “From Project Template” and search for the “Plant Sciences” template.

File based data collection

The metadata and description of your experiments should be filled using a MIAPPE template. Note that a README file is provided, which fully describes each field, including its type and whether it is optional or mandatory. All fields should be present in the file you are using, even if you leave the optional ones empty. This will allow standard processing and validation using dedicated tools.
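The kind of automated check this enables can be sketched as follows. This is a minimal illustration, assuming one sheet of the template has been exported to CSV with the field names as column headers. The three fields and their mandatory or optional status are an illustrative subset chosen for this sketch; the README of the template remains the authoritative source, and `check_sheet` is a hypothetical helper, not part of any MIAPPE tool.

```python
import csv
import io

# Illustrative subset only (assumption): take the real field list and the
# mandatory/optional status from the README shipped with the template.
EXPECTED_FIELDS = {
    "Study unique ID": "mandatory",
    "Study title": "mandatory",
    "Study description": "optional",
}


def check_sheet(csv_text, expected=EXPECTED_FIELDS):
    """Return (missing_columns, empty_mandatory) for one exported sheet.

    missing_columns: expected fields that are absent from the header row.
    empty_mandatory: (line_number, field) pairs where a mandatory field is
    present as a column but left empty.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing_columns = [field for field in expected if field not in header]
    empty_mandatory = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        for field, status in expected.items():
            if status != "mandatory" or field not in header:
                continue
            if not (row[field] or "").strip():
                empty_mandatory.append((line_number, field))
    return missing_columns, empty_mandatory


# Hypothetical sheet content: the optional column is present but empty,
# and the second study is missing its mandatory title.
sample = (
    "Study unique ID,Study title,Study description\n"
    "STU-1,Drought trial,\n"
    "STU-2,,\n"
)
missing, empty = check_sheet(sample)
print("missing columns:", missing)
print("empty mandatory cells:", empty)
```

Keeping empty optional columns in the file is what lets a check like this distinguish "not filled in" from "column removed".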

Experimental data gathering and management

Systems for file based data collection

FAIRDOM-SEEK is an open source web-based data sharing platform used as a repository or a catalogue. It is being deployed as several instances ranging from confidential project data sharing platforms (INRAE/AGENT, VIB) to public repositories like FAIRDOMHub. It is MIAPPE compliant through the integration of MIAPPE metadata at the investigation, study and assay levels. It can be used for project based early data sharing, in preparation for long term data storage, but also as a preservation tool for raw data.
pISA-tree is a data management solution developed to contribute to the reproducibility of research and analyses. A hierarchical set of batch files is used to create a standardised nested directory tree and associated files for research projects.
COPO is a data management platform specific to plant sciences.
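The idea of a standardised nested directory tree with associated files, as mentioned for pISA-tree, can be pictured with the short sketch below. It does not use pISA-tree and does not reproduce its naming conventions: the level names, the folder prefixes and the `*_metadata.txt` files are assumptions made for illustration, and the tree is created in a temporary directory.

```python
from pathlib import Path
import tempfile

# Hypothetical hierarchy levels, from outermost to innermost folder.
LEVELS = ("project", "investigation", "study", "assay")


def create_tree(root, names):
    """Create one nested folder per level and a small metadata file in each.

    Returns the path of the innermost (assay) folder.
    """
    path = Path(root)
    for level, name in zip(LEVELS, names):
        path = path / f"{level}_{name}"
        path.mkdir(parents=True, exist_ok=True)
        # One tab-separated key/value line per folder, as a placeholder
        # for richer, standardised metadata.
        metadata_file = path / f"{level}_metadata.txt"
        metadata_file.write_text(f"{level} name:\t{name}\n", encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    leaf = create_tree(tmp, ["demo", "drought", "field2024", "imaging"])
    relative = leaf.relative_to(tmp).as_posix()
    created_files = sorted(p.name for p in Path(tmp).rglob("*_metadata.txt"))

print(relative)
print(created_files)
```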

High throughput dedicated systems

PHIS, the open-source Phenotyping Hybrid Information System based on OpenSILEX, manages and collects data from phenotyping and high throughput phenotyping experiments on a day to day basis. It can store, organise and manage highly heterogeneous data (e.g. images, spectra, growth curves) at multiple spatial and temporal scales (leaf to canopy level) originating from multiple sources (field, greenhouse). It unambiguously identifies all objects and traits in an experiment and establishes their relations via ontologies and semantics that apply to both field and controlled conditions. Its ontology-driven architecture is a powerful tool for integrating and managing data from multiple experiments and platforms, for creating relationships between objects and enriching datasets with knowledge and metadata. It is MIAPPE and BrAPI compliant, and naming conventions are recommended for users to declare their resources. Several experimental platforms use PHIS to manage their data, and PHIS instances dedicated to sharing resources (projects, genetic resources, variables) also exist to allow the sharing of studied concepts.
PIPPA, the PSB Interface for Plant Phenotype Analysis, is the central web interface and database that provides the tools for the management of the plant imaging robots on the one hand, and the analysis of images and data on the other hand. The database supports all MIAPPE fields, which are accessible through the BrAPI endpoints. Experiment pages are marked up with Bioschemas to improve findability on Google.
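Bioschemas markup of the kind mentioned for PIPPA experiment pages is typically embedded in a web page as schema.org JSON-LD. The sketch below shows what a minimal `Dataset` description could look like. It is not PIPPA's actual markup: the selection of properties, the helper names and all values (including the `example.org` URLs) are assumptions for illustration.

```python
import json


def dataset_jsonld(name, description, url, identifier, keywords, license_url):
    """Build a schema.org Dataset description as a plain dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
        "keywords": keywords,
        "license": license_url,
    }


def as_script_tag(doc):
    """Wrap the JSON-LD in the script element used to embed it in HTML."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical experiment page metadata.
doc = dataset_jsonld(
    name="Maize drought response, greenhouse experiment",
    description="Image-based phenotyping of maize under water deficit.",
    url="https://example.org/experiments/42",
    identifier="https://example.org/experiments/42",
    keywords=["plant phenotyping", "maize", "drought"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
tag = as_script_tag(doc)
print(tag)
```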

Data processing and analysis

It is important to keep in mind the difference between data processing and data analysis.

Processing provides the tools and procedures to transform primary data, such as imaging or observational data, into data of appropriate quality that can be processed further.
Analysing, on the other hand, is concerned with extracting information from the processed data for the purpose of supporting knowledge acquisition. Some analysis tools dedicated to plant phenotyping experiments are registered in bio.tools, for example: Plant 3D, LeafNet, PlantCV and Phenomenal 3D.

The data collected and annotated can be shared in trustworthy repositories under clear conditions of access to the data. As no global central repository exists for phenotyping data, the Plant Science research community combines the use of scattered trustworthy repositories and of centralised search tools.

Metadata management

ISA4J is a software library which can help you to programmatically generate ISA-Tab formatted metadata for your experiments. This will make your metadata machine-(and human-)readable and thereby improve the reusability of your work. It was especially designed for large datasets and/or to be included in applications which export data regularly, but of course it can also be used for smaller, individual datasets (although you will need to know how to code). Since version 1.1 it also supports specific term completion and validation for MIAPPE, see the isa4j documentation.
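To give an idea of what programmatically generated, tab-delimited ISA-Tab metadata looks like, the sketch below writes a small study-table fragment. It does not use isa4j (a Java library) and does not reproduce its API. The input records and the `study_table` helper are hypothetical, and the result is a simplified fragment rather than a complete, valid ISA-Tab study file.

```python
import csv
import io


def study_table(plants):
    """Serialise plant records as a tab-delimited table with ISA-Tab style headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Source Name", "Characteristics[Organism]", "Sample Name"])
    for plant in plants:
        writer.writerow([plant["source"], plant["organism"], plant["sample"]])
    return buffer.getvalue()


# Hypothetical in-house records, e.g. rows read from a local database.
table = study_table(
    [
        {"source": "plot-1", "organism": "Zea mays", "sample": "plot-1-leaf"},
        {"source": "plot-2", "organism": "Zea mays", "sample": "plot-2-leaf"},
    ]
)
print(table)
```

Generating the file from code rather than by hand is what makes regular exports from an application feasible for large datasets.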

Repositories

DATAVERSE is an open source research data repository software used by several research institutes around the globe to publicly share heterogeneous datasets. In Europe, it is used among others by the Portuguese DMPortal, the German Jülich data portal, and the French Recherche Data Gouv (previously Data.INRAE) research communities. Its main strength is its flexibility, as the mandatory metadata are focused on publication information such as title, abstract, authors and keywords. It can therefore host any datatype, which is both a strength and a weakness, as shared good practices are necessary to ensure the reusability and findability of published phenomic data.
Plant Genomics and Phenomics Research Data Repository (e!DAL-PGP) is a comprehensive research data repository hosted at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben and mainly focused on sharing highly valuable and large genomics and phenomics datasets. It is the first productive instance based on the open source e!DAL infrastructure software and is part of the de.NBI/ELIXIR Germany services. All provided datasets are FAIR compliant and citable via a persistent DOI. Because it uses the widely established LifeScience AAI (formerly known as ELIXIR AAI), the submission procedure is open to all ELIXIR associated users. The key feature of e!DAL-PGP is its user-friendly, simple and FAIR-compliant data submission and internal review procedure. The repository has no general limit on the type or size of datasets. Comprehensive documentation, including guidelines, code snippets for technical integration and videos, is available on the project website.
GnpIS is a multispecies integrative information system dedicated to plants. It allows researchers to access genetic and genomic data as well as MIAPPE compliant phenotypic data. It is used by both large international projects and the French National Research Institute for Agriculture, Food and Environment.
Zenodo is a data publication service supported by the European Commission and focused on research data, including supplemental material such as software, tables, figures or slides. A Zenodo record is therefore usually associated with the publication of a research paper, book chapter or presentation. The Zenodo data submission form lets you describe every data file with a set of technical metadata based on the DataCite metadata schema, which is required to assign a persistent DOI to every dataset. The Zenodo infrastructure is hosted at CERN and can publish datasets up to a size of 50 GB for free. For larger datasets a specific support request is necessary. A further valuable feature of Zenodo is its connection to GitHub, which makes it possible to assign a DOI to a specific version, or commit, of a hosted software repository. Preserving software and scripts in this way improves the reproducibility of research workflows and results, which is often a challenge, especially for older research publications.

BrAPI (the Breeding API) is a MIAPPE compliant web service specification available on several deposition databases. Those endpoints can be validated using the BrAPI validator BRAVA. BrAPI hosts documentation and training material to support its usage.
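As a rough illustration of how a client talks to a BrAPI endpoint, the sketch below builds a request URL for the BrAPI v2 `studies` call and reads study names from a response envelope. No network request is made. The server address `brapi.example.org`, the helper names and the sample response are assumptions for illustration; real servers differ in base path and in the fields they populate, so check the BrAPI documentation of the server you query.

```python
import json
from urllib.parse import urlencode


def studies_url(base_url, crop=None, page=0, page_size=100):
    """Build the URL of a paginated BrAPI v2 studies request."""
    params = {"page": page, "pageSize": page_size}
    if crop:
        params["commonCropName"] = crop
    return f"{base_url.rstrip('/')}/brapi/v2/studies?{urlencode(params)}"


def study_names(response_text):
    """Extract study names from the 'result.data' list of a BrAPI response."""
    payload = json.loads(response_text)
    return [study.get("studyName") for study in payload["result"]["data"]]


# Hypothetical server and a hand-written response in the BrAPI envelope shape.
url = studies_url("https://brapi.example.org/", crop="wheat")
sample_response = json.dumps(
    {
        "metadata": {
            "pagination": {
                "currentPage": 0,
                "pageSize": 100,
                "totalCount": 2,
                "totalPages": 1,
            }
        },
        "result": {
            "data": [
                {"studyDbId": "s1", "studyName": "Drought trial 2023"},
                {"studyDbId": "s2", "studyName": "Nitrogen trial 2024"},
            ]
        },
    }
)
names = study_names(sample_response)
print(url)
print(names)
```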

Data reuse

Plant phenotyping data reuse relies on rich metadata that follow the MIAPPE specifications and are annotated with proper ontologies. Most of the important ontologies are registered on FAIRsharing: use this search example.

AgroPortal is a vocabulary and ontology repository for agronomy and related domains.
FAIDARE (FAIR Data-finder for Agronomic Research) is a portal facilitating discoverability of public data on plant biology from a federation of established data repositories.

Your tasks

Documentation and metadata

How to document and describe your data.

Data publication

How to prepare data and find repositories for publication.

Your domain

Plant sciences

Data management solutions for plant sciences data.

Tool assembly

Plant Genomics

Tool assembly for managing plant genomic data.

More information

Links to FAIR Cookbook

FAIR Cookbook is an online, open and live resource for the Life Sciences with recipes that help you to make and keep data Findable, Accessible, Interoperable and Reusable; in one word FAIR.

Publishing plant phenotypic data

Training

MIAPPE training in TeSS

MIAPPE templates on GitHub

PHIS user documentation

PHIS developer documentation

Tools and resources on this page

Tool or resource	Description	Related pages	Registry
AgroPortal	Browser for ontologies for agricultural science based on NCBO BioPortal.	Agroecology Documentation and meta...	Tool info Standards/Databases
Bioschemas	Bioschemas aims to improve the Findability on the Web of life sciences resources such as datasets, software, and training materials	Enzymology and biocata... Intrinsically disorder... Virology Data discoverability Machine actionability Documentation and meta...	Standards/Databases Training
BrAPI	Specification for a standard API for plant data: plant material, plant phenotyping data	Plant sciences	Training
BRAVA	BRAVA is a tool designed to help developers test servers that comply with the BrAPI specifications.
COPO	Collaborative OPen Omics (COPO) is a portal for scientists to describe, store and retrieve data more easily, using community standards and public repositories that enable the open sharing of results. The COPO project is one of several projects supported by the [Earlham Institute](https://www.earlham.ac.uk/research-project/collaborative-open-omics-copo) (EI), in Norwich, United Kingdom.	Biodiversity Plant sciences Data discoverability Documentation and meta...	Tool info Standards/Databases
Data Stewardship Wizard	Publicly available online tool for composing smart data management plans DSW@IFB learning.DSW DS-Wizard ELIXIR-Norway BioData.pt Data Stewardship Wizard SciLifeLab DS-Wizard DS Wizard ELIXIR Slovenia	CSC FAIRtracks Plant Genomics Plant sciences Data management plan GDPR compliance	Tool info Training
DATAVERSE	Open source research data repository software. DataverseNO BioData.pt Data Management Portal (DMPortal)	Enzymology and biocata... Plant sciences Machine actionability Data storage	Training
e!DAL	Electronic data archive library is a framework for publishing and sharing research data	Plant sciences	Tool info Training
FAIDARE	FAIDARE is a tool for searching data across distinct databases that implement BrAPI.	Agroecology Plant sciences	Tool info Training
FAIRDOM-SEEK	A data Management Platform for organising, sharing and publishing research datasets, models, protocols, samples, publications and other research outcomes.	NeLS Microbial biotechnology Plant sciences Data discoverability Documentation and meta... Data storage	Tool info
FAIRDOMHub	Data, model and SOPs management for projects, from preliminary data to publication, support for running SBML models, etc. (public SEEK instance)	NeLS Plant Genomics Microbial biotechnology Plant sciences Data discoverability Documentation and meta...	Standards/Databases Training
GnpIS	A multispecies integrative information system dedicated to plant and fungi pests. It allows researchers to access genetic, phenotypic and genomic data. It is used by both large international projects and the French National Research Institute for Agriculture, Food and Environment.		Tool info Standards/Databases
ISA4J	Open source software library that can be used to generate an ISA-Tab export from in-house data sets, for example from a local database or a local file system.		Tool info
LeafNet	LeafNet is a convenient tool that can robustly localise stomata and segment pavement cells for light-microscope images of leaves		Tool info
MIAPPE	Minimum Information About a Plant Phenotyping Experiment	Plant Genomics Plant sciences Machine actionability Documentation and meta...	Standards/Databases Training
Phenomenal 3D	Phenomenal-3D is an automatic open-source library for 3D shoot architecture reconstruction and analysis for image-based plant phenotyping		Tool info
PHIS	The open-source Phenotyping Hybrid Information System (PHIS) manages and collects data from plant phenotyping and high throughput phenotyping experiments on a day to day basis.		Training
pISA-tree	A data management solution for intra-institutional organization and structured storage of life science project-associated research data, with emphasis on the generation of adequate metadata.		Tool info
Plant 3D	Plant 3D is a plant phenotyping toolkit for 3D point clouds. Plant 3D (P3D) automatically extracts common phenotyping features of interest from high-resolution 3D scans of plant architectures.		Tool info
Plant Genomics and Phenomics Research Data Repository	A repository for plant genomics and phenomics research data, including data from the German Plant Phenotyping Network (DPPN) and the European Plant Phenotyping Network (EPPN).	Plant Genomics Agroecology	Tool info Standards/Databases
PlantCV	Plant Computer Vision (PlantCV) is an image processing toolkit for plant phenotyping analysis.		Tool info
Zenodo	Generalist research data repository built and developed by OpenAIRE and CERN	FAIRtracks Bioimaging data Biomolecular simulatio... Enzymology and biocata... Plant sciences Single-cell sequencing Data publication Identifiers	Standards/Databases Training

National resources

Tools and resources tailored to users in different countries.

Tool or resource	Description	Related pages	Registry
PIPPA	PIPPA, the PSB Interface for Plant Phenotype Analysis, is the central web interface and database that provides the tools for the management of the plant imaging robots on the one hand, and the analysis of images and data on the other hand.	Plant sciences Data Steward Researcher Research Software Engi...	Tool info