Introduction
Data management challenges in plant sciences
The plant sciences domain includes studying the adaptation of plants to their environments, with applications ranging from improving crop yield or resistance to environmental conditions to managing forest ecosystems. Data integration and reuse facilitate the understanding of the interaction between genotype and environment to produce a phenotype, which requires integrating phenotyping experiments and genomic assays made on the same plant material with geo-climatic data. Moreover, cross-species comparisons are often necessary to understand the mechanisms behind phenotypic traits, especially at the genotypic level, due to the gap in genomic knowledge between well-studied plant species and newly sequenced ones.
The challenges to data integration stem from the multiple levels of heterogeneity in this domain. It encompasses a variety of species, ranging from model organisms to crop species and wild plants such as forest trees. These often need to be detailed at infra-specific levels (e.g. subspecies, variety, genebank material), but naming at these levels sometimes lacks consensus. Studies can take place in a diversity of settings, including indoor (e.g. growth chamber, greenhouse) and outdoor settings (e.g. cultivated field, forest), which differ fundamentally in the requirements and manner of characterising the environment. Phenotypic data can be collected manually or automatically (e.g. by sensors and drones) and be very diverse in nature, spanning physical measurements, the results of biochemical assays, and images. Some omics data (e.g. transcriptomes, metabolomes) can also be considered molecular phenotypes. Thus, the extent and depth of metadata required to describe a plant experiment in a FAIR-compliant way are very demanding for researchers.
Another particularity of this domain is the absence of central deposition databases for certain important data types, in particular data deriving from plant phenotyping experiments. Whereas datasets from plant omics experiments are typically deposited in global deposition databases for that type of experiment (i.e. EBI, NCBI, DDBJ), those from phenotyping experiments remain in institutional, national (e.g. Dataverse FR, e!DAL) or possibly European repositories (e.g. Zenodo). This makes it difficult to find, access and interconnect plant phenotyping data to prepare a meta-analysis.
Data management planning
Description
The general principles for data management planning are described in the Planning page of the Data life cycle section, while generic but more practical aspects of writing a DMP can be found on the Data Management Plan page.
Considerations
- Important general considerations about data management planning can be found on the Planning page.
- Phenotyping data must be described following the MIAPPE data standard.
- Omics data should include the Biological Material section of MIAPPE to ensure interoperability with phenotyping data.
- Make sure to identify and describe the biological material and the observation variables in your dataset description and metadata.
Solutions
The knowledge model of the data management planning application Data Stewardship Wizard was reviewed for compliance with the needs of the Plant Sciences community.
Machine-actionable DMP
- The DSW Plant Sciences project template, available on ELIXIR’s DSW instance for researchers, can be used for any plant sciences project. When creating the DMP Project, choose the option “Project Template” and search for the “Plant Sciences” template.
DMP as a text document
- DataPLAN is a Data Management Plan generator for plant science. It supports DMPs for Horizon 2020, Horizon Europe and the German BMBF and DFG. Development focused primarily on the requirements of German funding agencies, and the tool was later extended to cover other European funders.
- Depending on the country, there might also be other tools to take into consideration: for example, DMP OPIDoR in France or DMPonline for the UK. Visit the RDMkit national resources section for details.
Plant biological materials: (meta)data collection and sharing
Description
Plant genetic studies, such as GWAS or Genomic Selection, require the integration of genomic and phenotypic data with environmental data. While phenotypic and environmental data are typically stored together in phenotyping databases, genomic and other types of molecular data are typically deposited in international deposition databases, for example, those of the International Nucleotide Sequence Database Collaboration (INSDC).
It can be challenging to integrate phenotypic and molecular data even within a single project, particularly if the project involves studying a panel of genetic resources in different conditions. It is paramount to maintain the link between the plant material in the field, the samples extracted from them (e.g. at different development stages), and the results of omics experiments (e.g. transcriptomics, metabolomics) performed on those samples, across all datasets that will be generated and published.
Integrating phenotyping and molecular data, both within and between studies, hinges entirely on precise identification of the plant material under study (down to the variety or even the seed lot), as well as of the samples that are collected from these plants.
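The chain from plant material to sample to omics result can be pictured as a set of records joined on shared identifiers. The following is a minimal sketch under stated assumptions: the class names, field names and identifiers (`ACC-001`, `S1`, `R1`) are hypothetical and do not come from any standard or database.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sample:
    sample_id: str          # hypothetical sample identifier
    material_id: str        # identifier of the plant material it was taken from
    development_stage: str


@dataclass(frozen=True)
class OmicsResult:
    result_id: str          # hypothetical identifier of an omics dataset
    sample_id: str          # link back to the sample
    assay_type: str


def results_for_material(material_id: str, samples: List[Sample],
                         results: List[OmicsResult]) -> List[OmicsResult]:
    """Return the omics results whose sample traces back to one plant material."""
    sample_ids = {s.sample_id for s in samples if s.material_id == material_id}
    return [r for r in results if r.sample_id in sample_ids]


samples = [
    Sample("S1", "ACC-001", "seedling"),
    Sample("S2", "ACC-001", "flowering"),
    Sample("S3", "ACC-002", "seedling"),
]
results = [
    OmicsResult("R1", "S1", "transcriptomics"),
    OmicsResult("R2", "S2", "metabolomics"),
    OmicsResult("R3", "S3", "transcriptomics"),
]

linked = results_for_material("ACC-001", samples, results)
print([r.result_id for r in linked])  # ['R1', 'R2']
```

If the material identifier is missing or inconsistent in any one of the datasets, the join silently returns nothing, which is why precise identification matters at every level.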
Considerations
- Are you working with established plant varieties, namely crop plants?
  - Can you trace their provenance to a genebank accession or a plant variety registered in a national catalog?
  - Are they identified in a germplasm database with an accession number?
- Are you working with crosses of established plant varieties?
  - Can you trace the genealogy of the crosses to plant varieties from a genebank or identified in a germplasm database?
- Are you working with experimental material?
  - Can you trace a genealogy to other material?
  - How do you unambiguously identify your material?
Solutions
Identification of plant biological materials
- Detailed metadata needs to be captured on the biological materials used in the study—the accession in the genebank or the experimental identification and, when applicable, the seed lots or the parent plants as well as the possible samples taken from the plant—as they are the key to integrating omics and phenotyping datasets.
Checklists and metadata standard
- The identification and description of plant materials should comply with the standard for the identification of plant genetic resources, the Multi-Crop Passport Descriptor (MCPD).
- The minimal fields from MCPD are listed in the Biological Material section of the Minimum Information About a Plant Phenotyping Experiment (MIAPPE) metadata standard.
- If you are studying experimental plant materials that cannot be traced to an existing genebank or germplasm database, you should describe them in accordance with the MCPD/MIAPPE Biological Material in as much detail as possible.
- If your plant materials can be traced to an existing genebank or germplasm database, you need only to cross-reference to the MCPD information already published in the genebank or germplasm database.
- For wild plants and accessions from tree collections, precise identification often requires the GPS coordinates of the tree. MIAPPE provides the necessary fields.
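The rules in this list can be read as a small completeness check on each biological material record. The sketch below is illustrative only: the field names are simplified stand-ins loosely inspired by the MIAPPE Biological Material section, not the official field labels, and the rule set is an assumption rather than a normative validator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BiologicalMaterial:
    # Simplified, hypothetical field names (not the official MIAPPE labels).
    material_id: str
    organism: str
    material_source_id: Optional[str] = None  # genebank or germplasm accession, if traceable
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    description: str = ""                      # detailed description for untraceable material


def identification_gaps(material: BiologicalMaterial,
                        wild_or_tree: bool = False) -> List[str]:
    """List what is still missing to identify the material unambiguously."""
    gaps = []
    if not material.material_id:
        gaps.append("material_id")
    if not material.organism:
        gaps.append("organism")
    # Either cross-reference an existing accession or describe the material in detail.
    if material.material_source_id is None and not material.description:
        gaps.append("material_source_id or full description")
    # Wild plants and trees in collections are assumed to need GPS coordinates.
    if wild_or_tree and (material.latitude is None or material.longitude is None):
        gaps.append("GPS coordinates")
    return gaps


accession = BiologicalMaterial("M1", "Zea mays", material_source_id="GENEBANK:ACC-001")
tree = BiologicalMaterial("T7", "Quercus robur", description="Sampled tree in plot 3")

print(identification_gaps(accession))                # []
print(identification_gaps(tree, wild_or_tree=True))  # ['GPS coordinates']
```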
Tools for (meta)data collection
- For identifying your plant material in a plant genetic resource repository (genebank or germplasm database), you can consult the European Cooperative Programme for Plant Genetic Resources (ECPGR), which includes the ECPGR Central Crop Databases and other Crop Databases and a catalogue of relevant International Multicrop Databases.
- Other key databases for identifying plant material are:
  - the European Search Catalogue for Plant Genetic Resources (EURISCO), which provides information about more than 2 million accessions of crop plants and their wild relatives from hundreds of European institutes in 43 member countries.
  - Genesys, an online platform with a search engine for Plant Genetic Resources for Food and Agriculture (PGRFA) conserved in genebanks worldwide.
- The “Biological Material” section of the MIAPPE_Checklist-Data-Model checklist deals with sample description.
(Meta)Data sharing
- For identifying samples from which molecular data was produced, the BioSamples database is recommended as a provider of international unique identifiers.
- The plant-miappe.json model provided by BioSamples is aligned with all recommendations provided above for plant identification and is therefore recommended for your sample submission.
- It is also recommended that you provide permanent access to a description of the project or study that contains links to all the data, molecular or phenotypic (see Data Publication)
Phenotyping: (meta)data collection and sharing
Description
Archiving, sharing, and publication of plant phenotyping data can be challenging, given that there is no global centralised archive for this type of data. Research projects often involve multiple partners, some of which collate data into their own (institutional) data management platforms, whereas others collate data into Excel spreadsheets.
For researchers, it is highly desirable that the datasets collected in different media by the partners in a research project (or across different collaborative projects) can be shared in a way that enables their integration for collective analysis and for facilitating deposition into a dedicated repository. For managers of plant phenotyping data repositories that support a project or institution, it is essential to ensure that the uptake of data is easy and includes a step of metadata validation upon intake.
It is recommended that metadata collection is contemplated from the start of the experiment and that the working environment facilitates (meta)data collection, storage, and validation throughout the project. In field studies, it is critical to record the geographical coordinates and time of the experiment for linkage with geo-climatic data. For all study types (fields, growth chamber or greenhouse), the environmental conditions that were measured should be described in detail.
Considerations
- Did you collect the metadata for the identification of your plant material according to the recommendation provided in the above section?
- Have you documented your phenotyping and environment assays (i.e. measurement or computation methodology based on the trait, method, scale triplet) both for direct measures (data collection) and computed data (after data processing or analysis)?
- Is there an existing Crop Ontology for the species you experiment with and does it describe your assay? If not, have you described your data following the trait, method, and scale triplet?
- Do you have your own system to collect data? Is it compliant with the MIAPPE standard?
- Are you exchanging data with individual researchers?
  - In what media is data being collected?
  - Is the data described in a MIAPPE-compliant manner?
- Are you exchanging data across different data management platforms?
  - Do these platforms implement the Breeding API (BrAPI) specification?
  - If not, are they MIAPPE-compliant? Do they enable automated data exchange?
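The trait, method and scale triplet mentioned in the considerations above can be captured as one small record per observation variable, for both direct measures and computed data. This is a hedged sketch: the class, the variable identifiers (`PH_cm`, `PH_mean_cm`) and the column names are hypothetical and are not taken from the Crop Ontology.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ObservationVariable:
    variable_id: str   # hypothetical local identifier
    trait: str         # what is observed
    method: str        # how it is measured or computed
    scale: str         # unit or categorical scale


def undocumented_columns(columns: List[str],
                         variables: Dict[str, ObservationVariable]) -> List[str]:
    """Return data columns that have no documented trait/method/scale triplet."""
    return [c for c in columns if c not in variables]


variables = {
    "PH_cm": ObservationVariable(
        "PH_cm", "plant height", "ruler, soil surface to top leaf", "cm"),
    "PH_mean_cm": ObservationVariable(
        "PH_mean_cm", "plant height", "mean of PH_cm per plot (computed)", "cm"),
}
columns = ["PH_cm", "PH_mean_cm", "LeafArea"]

print(undocumented_columns(columns, variables))  # ['LeafArea']
```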
Solutions
Data exchange standards and ontologies
- The metadata standard applicable to plant phenotyping experiments is MIAPPE.
  - The Plant biological materials section follows the Multi-Crop Passport Descriptor (MCPD) described above.
  - Traits and phenotypes, including methods and protocols, are based on the Crop Ontology recommendations.
  - Experiment type (e.g. greenhouse, field), location (geographical coordinates) and time enable linkage with geo-climatic data.
  - Other sections include a description of investigations/dataset, studies, people involved, data files, environmental parameters, experimental factors, and events.
  - It is implemented as an Excel template, as an ISA profile, in several databases and in the Breeding API (BrAPI).
- Tools and resources for data collection and management (all resources that support MIAPPE also support Crop Ontology):
  - A simple MIAPPE Excel template enables pragmatic and basic data exchange. See also MIAPPE-compliant spreadsheet template to potentially enhance customised templates with ontology tools such as Rightfield or OntoMaton.
  - Submission to databases such as DATAVERSE, Zenodo and e!DAL using the MIAPPE-compliant spreadsheet template is described in Plant Phenomics, which includes a detailed step-by-step procedure.
  - FAIRDOM-SEEK is a free data management platform for which MIAPPE templates are available.
  - DATAVERSE is a free data sharing platform where full datasets can be deposited with MIAPPE templates for a complete description. It is used in several repositories such as Recherche Data Gouv.
  - e!DAL is a free data management platform for which MIAPPE templates are in development.
  - ISA-tools also include a configuration for MIAPPE and can be used both for filling in metadata and for validating MIAPPE in ISA-Tab and ISA-JSON format.
  - Collaborative Open Plant Omics (COPO) is a data management platform specific to the plant sciences.
  - FAIRsharing is a manually curated registry of reporting guidelines, vocabularies, identifier schemes, models, formats, repositories, knowledge bases, and data policies that includes many resources relevant for managing plant phenotyping data.
  - ISA Wizard is a configurable web application providing a questionnaire-based form resulting in ISA output.
- For validation of MIAPPE datasets, see the validation section.
- If you or your partners collect data into data management platforms:
  - If a platform implements BrAPI, you can exchange data using BrAPI calls.
  - If a platform doesn’t implement BrAPI, the simplest solution would be to export data into the MIAPPE spreadsheet template, or another formally defined data template.
- It is also recommended that you provide permanent access to a description of the project or study that contains links to all the data, molecular or phenotypic (see Data Publication)
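As an illustration of what exchanging data with BrAPI calls can look like, the sketch below builds a BrAPI v2 request URL and reads the `result.data` list from a response envelope. It makes no network request: the base URL `https://brapi.example.org` is a placeholder, the response is hand-written sample data, and the assumption that a server mounts the API under `/brapi/v2/` should be checked against the platform you are using.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode


def brapi_url(base_url: str, endpoint: str, **params: Optional[Any]) -> str:
    """Build a BrAPI v2 URL; parameters set to None are left out."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    # Assumption: the server exposes the API under <base>/brapi/v2/.
    url = f"{base_url.rstrip('/')}/brapi/v2/{endpoint.strip('/')}"
    return f"{url}?{query}" if query else url


def extract_data(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the records from a BrAPI list response (metadata + result.data)."""
    return response.get("result", {}).get("data", [])


url = brapi_url("https://brapi.example.org", "studies", page=0, pageSize=2)
print(url)  # https://brapi.example.org/brapi/v2/studies?page=0&pageSize=2

# Hand-written sample shaped like a BrAPI list response; not real data.
response = {
    "metadata": {"pagination": {"currentPage": 0, "pageSize": 2,
                                "totalCount": 1, "totalPages": 1}},
    "result": {"data": [{"studyDbId": "study-1", "studyName": "Greenhouse trial"}]},
}
print([s["studyDbId"] for s in extract_data(response)])  # ['study-1']
```

In practice the URL would be fetched with an HTTP client and the pagination block used to loop over pages; both steps are omitted here to keep the sketch self-contained.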
Genotyping: (meta)data collection and sharing
Description
This section describes the mandatory, recommended and optional metadata fields for data interoperability and reuse, as well as for data deposition in the European Variation Archive (EVA), EMBL-EBI’s open-access genetic variation archive connected to BioSamples, described above. In addition to sample and experiment metadata, the use of stable variant identifiers (RSids) issued by EVA is strongly recommended. RSids (Reference SNP cluster IDs) provide a persistent and globally recognised reference for each variable locus, ensuring long-term traceability and interoperability across datasets.
Considerations
- Did you collect the metadata for the identification of your plant samples according to the recommendations provided in the above section?
- Is the reference genome assembly available in an International Nucleotide Sequence Database Collaboration (INSDC) archive, and does it have a Genome Collections Accession number, either GCA or GCF?
- Is the analytic approach used for creating the Variant Call Format (VCF) file available in a publication, and does it have a Digital Object Identifier (DOI)?
- How do you plan to refer to the variants you will submit to the European Variation Archive (EVA)?
Solutions
Checklists, ontologies, file formats
Sharing plant genotyping data files involves the use of the Variant Call Format (VCF) standard. Findability and reusability of VCF files depend on the supplied metadata and, in particular, on a MIAPPE-compliant description of the biological material; the plant genomic and genetic variation data submission recipe provides guidance on this topic. While metadata standards like MIAPPE and MCPD are described in the phenotyping and biological materials sections, respectively, genomic data relies on a range of standardised file formats to ensure interoperability. Depending on your research, you will encounter several other common formats:
- FASTA: The most fundamental format for representing nucleotide or protein sequences. Each sequence entry begins with a single-line description starting with a `>` character, followed by lines of sequence data.
- General Feature Format (GFF3): A tab-delimited text file used to describe the functional features of a genome, such as genes, exons, and regulatory elements. It allows researchers to annotate a reference genome. GFF3 is a popular and more structured version of the format.
- Browser Extensible Data (BED): A concise and flexible format for defining genomic regions. It is commonly used to provide annotation tracks for display in genome browsers and is simpler than GFF, requiring only the chromosome, start position, and end position for each feature.
- A Golden Path (AGP): A file format used in genome assembly projects. It describes how larger sequences, like chromosomes, are constructed by ordering and orienting smaller sequence fragments (contigs or scaffolds).
- HapMap Format: A specific text-based format for storing genotypes from a population. It typically arranges single-nucleotide polymorphisms (SNPs) in rows and individual samples in columns, making it well-suited for population genetics and genome-wide association studies (GWAS).
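Because the reusability of a VCF file depends on its samples being traceable to described biological material, one useful check is to compare the sample columns of the VCF header with the sample identifiers you have registered. The sketch below is a minimal illustration: it relies only on the VCF convention that sample names follow the eight fixed columns and `FORMAT` on the `#CHROM` line, while the identifiers (`SAMEA0000001`, `plant_42`) and the registered set are made-up placeholders.

```python
from typing import Iterable, List, Set


def vcf_sample_names(lines: Iterable[str]) -> List[str]:
    """Return the sample column names from the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            # Eight fixed columns plus FORMAT come before the sample columns.
            return line.rstrip("\n").split("\t")[9:]
    return []


def unregistered_samples(names: List[str], registered: Set[str]) -> List[str]:
    """Return sample names with no matching registered sample identifier."""
    return [n for n in names if n not in registered]


# Tiny in-memory VCF; identifiers are placeholders, not real accessions.
vcf_lines = [
    "##fileformat=VCFv4.3\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMEA0000001\tplant_42\n",
    "1\t1042\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n",
]

names = vcf_sample_names(vcf_lines)
print(names)                                          # ['SAMEA0000001', 'plant_42']
print(unregistered_samples(names, {"SAMEA0000001"}))  # ['plant_42']
```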
Data sharing
- Once the VCF file is ready with all necessary metadata, it can be submitted to European Variation Archive (EVA). You will find all necessary information on the submission steps on the EVA submission page.
Permanent identifiers
- The European Variation Archive (EVA) will issue a permanent identifier for each study (BioProject Accession, e.g. PRJEBxxxxxx) and each analysis (Analysis Accession, e.g. ERZxxxxxx) included in the submission. These permanent identifiers can be used in a publication to refer to the dataset.
- Each variant submitted to the EVA will receive RSids for consistent referencing and interoperability. RSids are unique, stable identifiers assigned by the European Variation Archive (EVA) that cluster identical genetic variants found at the same genomic location across multiple independent submissions. These can be used in a publication or other database to highlight a specific variant of interest.
- It is also recommended that you provide permanent access to a description of the project or study that contains links to all the data, molecular or phenotypic (see Data Publication)
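When citing these identifiers in a manuscript or a database, a simple pattern check can catch typing mistakes. The sketch below is an assumption-laden illustration: the prefixes `PRJEB` and `ERZ` come from the examples above, the `rs` prefix is the usual form of an RSid, and the rule "prefix followed by one or more digits" is a simplification rather than an official specification of accession lengths.

```python
import re
from typing import Optional

# Assumed patterns: prefix followed by digits; the number of digits is not fixed here.
PATTERNS = {
    "study": re.compile(r"PRJEB\d+"),
    "analysis": re.compile(r"ERZ\d+"),
    "variant": re.compile(r"rs\d+"),
}


def classify_identifier(identifier: str) -> Optional[str]:
    """Return the kind of identifier, or None if it matches no assumed pattern."""
    for kind, pattern in PATTERNS.items():
        if pattern.fullmatch(identifier):
            return kind
    return None


# Placeholder values for illustration; they do not refer to real records.
print(classify_identifier("PRJEB000001"))  # study
print(classify_identifier("ERZ000001"))    # analysis
print(classify_identifier("rs12345"))      # variant
print(classify_identifier("PRJEB-0001"))   # None
```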
Validation of Plant Phenotypic and Genotypic Data
Description
To ensure the integrity, quality, and interoperability of datasets in plant phenotyping and genotyping, implementing data and metadata standards is required. These standards provide a structured framework for consistent data collection, storage, and sharing. They not only define the expected format and structure of the data but also necessitate validation to confirm compliance with these specifications. Validation can be broadly categorised into two types: semantic and syntactic. Both types are essential for ensuring that data and metadata meet the necessary standards for effective use and integration.
Considerations
- Syntactic Validation: This type of validation focuses on the structure and format of the data. It is often automated and can be performed using software tools that analyse the data against the defined schema, making it efficient and reliable. It checks whether the data adheres to predefined rules regarding:
  - Data Types: Ensuring that fields contain the correct types of data (e.g. numerical, textual).
  - Field Completeness: Verifying that all mandatory fields are filled and that optional fields are populated appropriately.
  - Consistency: Checking for uniformity in data entries, such as consistent naming conventions and units of measurement.
- Semantic Validation: This assesses the meaning and context of the data. It ensures that the data is meaningful within the context of the research and is used in accordance with its intended purpose. It often requires human supervision and domain expertise, as it involves interpreting the meaning and relevance of the data. Key aspects include:
  - Logical Consistency: Verifying that relationships between different data elements are logical and coherent.
  - Domain Constraints: Ensuring that data values fall within acceptable ranges or categories relevant to the study.
  - Contextual Relevance: Assessing whether the metadata accurately describes the dataset and provides sufficient context for future users.
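The difference between the two types of validation can be sketched with a single record. In the example below, the syntactic check looks only at presence and data type against a small schema, while the semantic check applies a domain range and a logical date order. Everything here is hypothetical: the field names, the schema and the 0 to 500 cm range are illustrative assumptions, not rules from MIAPPE or any other standard.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

# Hypothetical schema: field name -> (expected type, mandatory?)
SCHEMA: Dict[str, Tuple[type, bool]] = {
    "plant_height_cm": (float, True),
    "start_date": (str, True),
    "end_date": (str, True),
    "comment": (str, False),
}


def syntactic_errors(record: Dict[str, Any]) -> List[str]:
    """Check field completeness and data types against the schema."""
    errors = []
    for field, (expected_type, mandatory) in SCHEMA.items():
        if field not in record or record[field] in ("", None):
            if mandatory:
                errors.append(f"missing mandatory field: {field}")
            continue
        if not isinstance(record[field], expected_type):
            errors.append(f"wrong type for {field}")
    return errors


def semantic_errors(record: Dict[str, Any]) -> List[str]:
    """Check domain constraints and logical consistency (assumes syntax is valid)."""
    errors = []
    if not 0 < record["plant_height_cm"] <= 500:  # assumed plausible range
        errors.append("plant_height_cm outside assumed range")
    if date.fromisoformat(record["start_date"]) > date.fromisoformat(record["end_date"]):
        errors.append("start_date is after end_date")
    return errors


record = {"plant_height_cm": 1520.0, "start_date": "2023-05-01", "end_date": "2023-04-01"}
print(syntactic_errors(record))  # []
print(semantic_errors(record))
# ['plant_height_cm outside assumed range', 'start_date is after end_date']
```

The record passes the syntactic check yet is clearly wrong, which is the case that semantic validation and domain expertise are meant to catch.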
Solutions
To improve data and metadata validation in plant phenotyping and genotyping, several effective solutions can be implemented:
- Adoption of Standardised Protocols: Utilising established data and metadata standards, such as the Minimum Information About a Plant Phenotyping Experiment (MIAPPE), can streamline the validation process.
- Automated Validation Tools: Implementing software tools designed for both syntactic and semantic validation can significantly improve the efficiency and accuracy of the validation process. These tools can automatically check data against predefined schemas, flagging errors and inconsistencies, which reduces the likelihood of manual errors.
- Training and Capacity Building: Providing training for researchers and data managers on the importance of data validation and the use of (meta)data standards is needed. Workshops, online courses, and resources can equip personnel with the necessary skills to effectively implement validation practices, fostering a culture of quality in data management. MIAPPE proposes several training materials and recorded webinars that can be freely reused.
Packaging and contextualising data for reuse
Beyond collecting metadata, it is important to package your data, code, and experimental descriptions together in a standardised way. This ensures that others can easily understand and reuse your work. Key concepts for this are the ISA-tools, Research Object Crate (RO-Crate), and Annotated Research Context.
- The ISA Model: The Investigation, Study, and Assay (ISA) model is a hierarchical framework for structuring and describing the metadata of a scientific experiment. It organises your project into three levels:
  - Investigation: The overall project or research goal.
  - Study: A specific experiment within the investigation (e.g. a greenhouse trial). It describes the biological material, experimental factors, and protocols used.
  - Assay: A specific analysis performed on the material from a study (e.g. mass spectrometry, phenotyping measurements, or a sequencing run).
  - The ISA-tools are built to help you create metadata that follows this model, which is also the structural backbone for platforms like FAIRDOM-SEEK.
- Research Object Crate (RO-Crate): Research Object Crate (RO-Crate) is a community standard for packaging all your research outputs into a single, FAIR-compliant bundle. Think of it as a “zip file for science” that contains not just your data, but also rich, machine-readable metadata describing the contents, contributors, methods, and publications. An RO-Crate can contain an ARC/ISA structure, making your entire research project easier to share, publish, and understand. Platforms like FAIRDOM-SEEK can export projects as RO-Crates, simplifying the process of archiving a complete, reproducible research package.
- Annotated Research Context (ARC): An Annotated Research Context is a practical implementation and extension of the ISA model. It is a container that bundles all the ISA metadata files together with the associated data files, protocols, and other relevant documents. The ARC provides a complete and self-contained “map” of your experiment, making it clear how all components are connected. It introduces two further layers to the ISA model, namely:
  - Workflows: Workflows cover all computational steps of a study and contain application code, scripts, or any other executable description of an analysis, ensuring the highest flexibility for the scientists. To ensure persistence and reproducibility, these workflows comprise their own containerised running environment.
  - Runs: The resulting data (runs) is linked to the workflows by a minimal Common Workflow Language (CWL) file specifying the input and output of the process.
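To make the idea of machine-readable packaging metadata more concrete, the sketch below assembles a minimal `ro-crate-metadata.json` document in the style of RO-Crate 1.1: a metadata descriptor, a root dataset, and one entry per file. It is a simplified illustration, not a complete or validated crate. The dataset name, description and file paths are hypothetical, and properties that the specification expects on the root dataset, such as a publication date and a licence, are deliberately left out.

```python
import json
from typing import Any, Dict, List


def minimal_ro_crate(name: str, description: str, files: List[str]) -> Dict[str, Any]:
    """Build a minimal RO-Crate-style metadata document as a Python dict."""
    graph: List[Dict[str, Any]] = [
        {   # Metadata descriptor pointing at the root dataset
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # Root dataset describing the package as a whole
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "hasPart": [{"@id": path} for path in files],
        },
    ]
    # One data entity per packaged file
    graph += [{"@id": path, "@type": "File"} for path in files]
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


# Hypothetical project content, for illustration only.
crate = minimal_ro_crate(
    "Greenhouse trial 2023",
    "Plant height measurements with protocols",
    ["data/heights.csv", "protocols/phenotyping.md"],
)
print(json.dumps(crate, indent=2))
```

Dedicated tooling will normally generate and validate this file for you; the point of the sketch is only to show that the "map" of a package is plain JSON-LD linking the dataset to its parts.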
Research Data Publication
Besides sharing standardised data using established repositories such as EVA or BioSamples, it is highly recommended to provide persistent access to the project or study description, which contains links to all data, molecular or phenotypic. Several generic databases handling plant research data are recommended for this purpose, including:
Especially for the deposition of phenotypic data, it is highly recommended that you opt for one of the repositories that either support the deposition of MIAPPE templates (DATAVERSE, Zenodo, e!DAL) or implement a BrAPI compatible server. These repositories enhance findability through the ELIXIR plant data discovery service, FAIR Data-finder for Agronomic Research (FAIDARE), and enable machine-actionable access to MIAPPE-compliant data as well as validation of that compliance.
The Plant Phenomics Assembly provides a more detailed overview and further information on the requirements and different submission procedures for the aforementioned databases and repositories.
Related pages
More information
Links to FAIRsharing
FAIRsharing is a curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies.
Links to FAIR Cookbook
FAIR Cookbook is an online, open and live resource for the Life Sciences with recipes that help you to make and keep data Findable, Accessible, Interoperable and Reusable; in one word FAIR.
Training
Tools and resources on this page
| Tool or resource | Description | Related pages | Registry |
|---|---|---|---|
| A Golden Path (AGP) | A file format used in genome assembly projects. It describes how larger sequences, like chromosomes, are constructed by ordering and orienting smaller sequence fragments (contigs or scaffolds). | Standards/Databases | |
| Annotated Research Context | Framework for organising and documenting research data, extending ISA, CWL, and RO-Crate. | Training | |
| BioSamples | BioSamples stores and supplies descriptions and metadata about biological samples used in research and development by academia and industry. | Plant Genomics Biodiversity Virology | Tool info Standards/Databases Training |
| BioStudies | A database hosting datasets from biological studies. Useful for storing or accessing life sciences data without community-accepted repositories, and for linking components of data from multi-omics studies. | Microbial biotechnology Single-cell sequencing Data publication Project data managemen... | Tool info Standards/Databases Training |
| BrAPI | Specification for a standard API for plant data: plant material, plant phenotyping data | Plant Phenomics | Training |
| BrAPI compatible server | Submit a new BrAPI compatible server | ||
| Browser Extensible Data (BED) | A concise and flexible format for defining genomic regions. It is commonly used to provide annotation tracks for display in genome browsers and is simpler than GFF. | Standards/Databases | |
| COPO | Collaborative OPen Omics (COPO) is a portal for scientists to describe, store and retrieve data more easily, using community standards and public repositories that enable the open sharing of results. The COPO project is one of several projects supported by the [Earlham Institute](https://www.earlham.ac.uk/research-project/collaborative-open-omics-copo) (EI), in Norwich, United Kingdom. | Plant Phenomics Biodiversity Data discoverability Documentation and meta... | Tool info Standards/Databases |
| Crop Ontology | The Crop Ontology compiles concepts to curate phenotyping assays on crop plants, including anatomy, structure and phenotype. | Standards/Databases Training | |
| Data Stewardship Wizard | Publicly available online tool for composing smart data management plans | CSC FAIRtracks Plant Genomics Plant Phenomics Data management plan GDPR compliance | Tool info Training |
| DataPLAN | Data Management Plan (DMP) generator that focuses on plant science. | Tool info | |
| DATAVERSE | Open source research data repository software. | Plant Phenomics Enzymology and biocata... Machine actionability Data storage | Training |
| DMPonline | Data Management Plans that meet institutional funder requirements. | CSC Data management plan | Training |
| e!DAL | Electronic data archive library is a framework for publishing and sharing research data | Plant Phenomics | Tool info Training |
| ECPGR | Hub for the identification of plant genetic resources in Europe | ||
| ECPGR Central Crop Databases and other Crop Databases | A number of ECPGR Central Crop Databases have been established through the initiative of individual institutes and of ECPGR Working Groups. The databases hold passport data and, to varying degrees, characterization and primary evaluation data of the major collections of the respective crops in Europe. | ||
| EURISCO | European Search Catalogue for Plant Genetic Resources | Tool info | |
| European Variation Archive (EVA) | Open-access database of all types of genetic variation data from all species. | Plant Genomics | Tool info Standards/Databases Training |
| FAIDARE | FAIDARE is a tool for searching data across distinct databases that implement BrAPI. | Plant Phenomics | Tool info Training |
| FAIRDOM-SEEK | A data management platform for organising, sharing and publishing research datasets, models, protocols, samples, publications and other research outcomes. | NeLS Plant Phenomics Microbial biotechnology Data discoverability Documentation and meta... Data storage | Tool info |
| FAIRDOMHub | Data, model and SOPs management for projects, from preliminary data to publication, support for running SBML models, etc. (public SEEK instance) | NeLS Plant Genomics Plant Phenomics Microbial biotechnology Data discoverability Documentation and meta... | Standards/Databases Training |
| FAIRsharing | A curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies. | FAIRtracks Health data Microbial biotechnology Virology Data discoverability Data provenance Data publication Existing data Machine actionability Documentation and meta... | Standards/Databases Training |
| General Feature Format (GFF3) | A tab-delimited text file used to describe the functional features of a genome, such as genes, exons, and regulatory elements. | Standards/Databases | |
| Genesys | Genesys is an online platform where you can find information about Plant Genetic Resources for Food and Agriculture PGRFA conserved in genebanks worldwide. | ||
| HapMap Format | A specific text-based format for storing genotypes from a population. It typically arranges single nucleotide polymorphisms (SNPs) in rows and individual samples in columns. | ||
| International Multicrop Databases | A catalogue of relevant International Multicrop Databases | ||
| International Nucleotide Sequence Database Collaboration | The International Nucleotide Sequence Database Collaboration (INSDC) is a long-standing foundational initiative that operates between DDBJ, EMBL-EBI and NCBI. INSDC covers the spectrum of data from raw reads, through alignments and assemblies, to functional annotation, enriched with contextual information relating to samples and experimental configurations. | Galaxy Biodiversity Microbial biotechnology Data publication | Training |
| International Nucleotide Sequence Database Collaboration (INSDC) | A collaborative database of genetic sequence datasets from DDBJ, EMBL-EBI and NCBI | Galaxy Biodiversity Microbial biotechnology Data publication | Tool info Training |
| ISA Wizard | User-friendly application for creating FAIR- and ISA-compliant metadata for results of life science experiments. | ||
| ISA-tools | Open source framework and tools helping to manage a diverse set of life science, environmental and biomedical experiments using the Investigation Study Assay (ISA) standard | Standards/Databases | |
| MIAPPE | Minimum Information About a Plant Phenotyping Experiment | Plant Genomics Plant Phenomics Machine actionability Documentation and meta... | Standards/Databases Training |
| MIAPPE-compliant spreadsheet template | MIAPPE-compliant spreadsheet template | ||
| MIAPPE_Checklist-Data-Model | This document describes the MIAPPE Checklist and Data Model | ||
| Multi-Crop Passport Descriptor (MCPD) | The Multi-Crop Passport Descriptor is the metadata standard for plant genetic resources maintained ex situ by genebanks. | Standards/Databases Training | |
| OntoMaton | OntoMaton facilitates ontology search and tagging functionalities within Google Spreadsheets. | Identifiers | |
| plant-miappe.json | BioSamples Plant MIAPPE checklist in JSON format | ||
| Recherche Data Gouv | An ecosystem for sharing and opening research data | Standards/Databases | |
| Research Object Crate (RO-Crate) | RO-Crate is a lightweight approach to packaging research data with their metadata, using schema.org. An RO-Crate is a structured archive of all the items that contributed to the research outcome, including their identifiers, provenance, relations and annotations. | Galaxy Microbial biotechnology Data provenance | Standards/Databases Training |
| Rightfield | RightField is an open-source tool for adding ontology term selection to Excel spreadsheets | Microbial biotechnology Identifiers | Tool info |
| Variant Call Format (VCF) | A common file format that contains information about variants found at specific positions in a reference genome. | Cancer data | Standards/Databases |
| Zenodo | Generalist research data repository built and developed by OpenAIRE and CERN | FAIRtracks Plant Phenomics Bioimaging data Biomolecular simulatio... Enzymology and biocata... Single-cell sequencing Data publication Identifiers | Standards/Databases Training |
National resources
Tools and resources tailored to users in different countries.
| Tool or resource | Description | Related pages | Registry |
|---|---|---|---|
| PIPPA | PIPPA, the PSB Interface for Plant Phenotype Analysis, is the central web interface and database that provides the tools for the management of the plant imaging robots on the one hand, and the analysis of images and data on the other hand. | Plant Phenomics Data Steward Researcher Research Software Engi... | Tool info |