Your role: Data Steward: research
Description
As a research data steward, I support and work in close collaboration with the main data producers and users in academia: the researchers, ranging from undergraduate students to full professors. I advise researchers, make sure data is handled in a manner compliant with the institute’s policy and may also perform hands-on work in a project.
My work focuses on implementing the institute’s data guidelines and translating them into domain- and project-specific procedures, for example by managing a database or reviewing data management plans. My responsibilities and tasks focus on translating researchers’ data needs into infrastructural and service requirements.
Focus
- Develop and implement data management plans (DMPs) for projects and data collections, and align them with the FAIR (Findable, Accessible, Interoperable, Reusable) data principles and the principles of Open Science
- Advise projects and data collections on compliance with codes of conduct, regulations and field-specific legal and ethical standards
- Provide adequate research data management (RDM) support to researchers. This involves, for example, supporting researchers in improving the reproducibility of their computational analyses or directing researchers to appropriate data management and archival solutions
- Monitor a project’s needs regarding data infrastructure and tools for RDM
- Determine the adequate level of knowledge and skills of researchers on RDM
- Identify a project’s requirements for adequate support and data infrastructure for FAIR data and long-term archiving
Learning path
Institutes across Europe have started hiring professional data stewards. A research-oriented data steward is expected to be competent in the following areas:
- Create awareness and communicate about RDM and the FAIR data principles and translate RDM policies into guidelines for researchers
- Transform discipline-specific research data into FAIR data with the help of available services and tools
- Advise and assist researchers on short- and long-term actions for RDM
- Assess RDM knowledge and skills, identify gaps among researchers and take action when needed
- Understand the purpose and use of a DMP in a project and have the skills to utilise the available tools and templates to produce a DMP
- Assist researchers in developing a DMP, review DMPs, and support researchers in putting DMPs into action
- Liaise with the surrounding environment (department, project, national stakeholders and international network) and continuously follow the field to gain knowledge of relevant facilities, tools and emerging standards available for RDM
Related pages
- Compliance monitoring & measurement: How to measure compliance with data management regulations and standards.
- Data management plan: How to write a Data Management Plan (DMP).
- Data organisation: Best practices to name and organise research data.
- Licensing: How to license research data.
- Documentation and metadata: How to document and describe your data.
- Data protection: How to protect your research data, and how to make research data compliant with GDPR.
- Data publication: How to prepare data and find repositories for publication.
- Data quality: How to ensure high quality of research data.
- Data transfer: How to transfer data files.
- Identifiers: How to use identifiers for research data.
- Machine actionability: How to make machine-actionable (meta)data.
- Project data management coordination: How to coordinate and organise data management activities in collaborative or multi-partner projects.
- Data provenance: How to record information about data provenance.
More information
Relevant tools and resources
Tool or resource | Description | Related pages | Registry |
---|---|---|---|
Argos | Plan and follow your data. Bring your Data Management Plans closer to where data are generated, analysed and stored. | Data management plan Researcher | |
Atlas | Free, publicly available web-based, open-source software application developed by the OHDSI community to support the design and execution of observational analyses to generate real world evidence from patient level observational data. | Researcher TransMed | Tool info Training |
BBMRI-ERIC's ELSI Knowledge Base | The ELSI Knowledge Base is an open-access resource platform that aims at providing practical know-how for responsible research. | Data protection Data sensitivity Data Steward: policy Human data | |
Beacon | The Beacon protocol defines an open standard for genomics data discovery. | Researcher Data Steward: infrastructure Human data | Tool info Standards/Databases Training |
BisQue | Resource for management and analysis of 5D biological images | Data organisation Data analysis Bioimaging data | Tool info |
Bitbucket | Git based code hosting and collaboration tool, built for teams. | Data organisation Data Steward: infrastructure | Standards/Databases |
Bulk Rename Utility | File renaming software for Windows | Data organisation Researcher | |
Castor | Castor is an EDC system for researchers and institutions. With Castor, you can create and customize your own database in no time. Without any prior technical knowledge, you can build a study in just a few clicks using our intuitive Form Builder. Simply define your data points and start collecting high quality data, all you need is a web browser. | Identifiers Data Steward: infrastructure | Tool info |
CEDAR | CEDAR is making data submission smarter and faster, so that scientific researchers and analysts can create and use better metadata. | Documentation and metadata Machine actionability Researcher | Tool info Standards/Databases |
Choose a license | Choose an open source license | Licensing Researcher Data Steward: policy | |
Cookiecutter | A command-line utility that creates projects from cookiecutters (project templates), e.g. creating a Python package project from a Python package project template. | Data organisation Data Steward: infrastructure | |
Create a Codebook | Examples and tools to create a codebook by the Data Documentation Initiative (DDI) | Documentation and metadata Researcher | |
Creative Commons License Chooser | It helps you choose the right Creative Commons license for your needs. | Licensing Researcher Data Steward: policy | |
Crop Ontology | The Crop Ontology compiles concepts to curate phenotyping assays on crop plants, including anatomy, structure and phenotype. | Researcher Data Steward: infrastructure Plant sciences Plant Phenomics | Standards/Databases Training |
Cytomine-IMS | Image Data management | Bioimaging data | |
DAMAP | It guides you step by step through a DMP and lets you export a pre-filled DMP as a Word document that you can customize and use for submission to funders. Also, DAMAP is compatible with the RDA recommendation for machine-actionable DMPs and offers an export of JSON DMPs. DAMAP is open source and is meant to be self-deployed. | Data management plan Researcher | |
Data Curation Centre Metadata list | List of metadata standards | Documentation and metadata Researcher | |
Data INRAE | Dataverse for life sciences and agronomic related data | Plant sciences Plant Genomics Researcher Plant Phenomics | Standards/Databases |
Data Stewardship Wizard | Publicly available online tool for composing smart data management plans | Data management plan Researcher Data Steward: infrastructure NeLS TSD Plant Phenomics Plant Genomics | Tool info Training |
Data Use Ontology | DUO allows datasets to be semantically tagged with restrictions on their usage. | Researcher Human data | Standards/Databases Training |
DATAVERSE | Open source research data repository software. | Data storage Researcher Data Steward: infrastructure IFB | Training |
DMP Canvas Generator | Questionnaire that generates a pre-filled DMP | Data management plan Researcher | |
DMPlanner | Semi-automatically generated, searchable catalogue of resources that are relevant to data management plans. | Data management plan Researcher | |
DMPRoadmap | DMP Roadmap is a Data Management Planning tool | Data management plan Researcher | |
DMPTool | Build your Data Management Plan | Data management plan Researcher | |
e!DAL-PGP | Plant Genomics and Phenomics Research Data Repository | Plant sciences Plant Genomics Researcher Data Steward: infrastructure Data publication Documentation and metadata Plant Phenomics | Standards/Databases |
ECPGR | Hub for the identification of plant genetic resources in Europe | Plant sciences Researcher | |
ELIXIR Deposition Databases for Biomolecular Data | List of discipline-specific deposition databases recommended by ELIXIR. | Data publication Researcher Data Steward: infrastructure COVID-19 Data Portal NeLS IFB CSC | Standards/Databases |
EMBL-EBI Ontology Lookup Service | EMBL-EBI’s web portal for finding ontologies | Documentation and metadata Researcher | |
EMBL-EBI's data submission wizard | EMBL-EBI's wizard for finding the right EMBL-EBI repository for your data. | Data publication Researcher | |
ENA COMPARE Data Hubs | This tool carries out data hub set up at the European Nucleotide Archive (ENA). | Project data management coordination Data Steward: infrastructure | |
ENA upload tool | The program submits experimental data and respective metadata to the European Nucleotide Archive (ENA). | Data Steward: infrastructure Researcher Data brokering | |
EUDAT licence selector wizard | EUDAT's wizard for finding the right licence for your data or code. | Licensing Researcher Data Steward: policy | |
EURISCO | European Search Catalogue for Plant Genetic Resources | Plant sciences Researcher Plant Phenomics | Tool info |
FAIDARE | FAIDARE is a tool for searching data across distinct databases that have implemented BrAPI. | Researcher Plant sciences IFB Plant Phenomics Plant Genomics | Tool info |
FAIR Cookbook | FAIR Cookbook is an online resource for the Life Sciences with recipes that help you to make and keep data Findable, Accessible, Interoperable and Reusable (FAIR) | Compliance monitoring & measurement TransMed | |
FAIR Evaluation Services | Resources and guidelines to assess the FAIRness of digital resources. | Compliance monitoring & measurement Data Steward: policy | |
FAIR Implementation Profile | The FIP is a collection of FAIR implementation choices made by a community of practice for each of the FAIR Principles. | Project data management coordination Data management plan Researcher | Standards/Databases |
FAIR-Wizard | The FAIR wizard utilizes FAIRification resources developed by the FAIRplus project and other platforms, suggests FAIRification materials based on the FAIRification requirements, and designs FAIRification solutions for data owners, data stewards, and other people involved in FAIRification. | Compliance monitoring & measurement Data Steward: policy | Training |
FAIRassist.org | Help you discover resources to measure and improve FAIRness. | Compliance monitoring & measurement Data Steward: policy | |
FAIRDOMHub | Data, model and SOPs management for projects, from preliminary data to publication, support for running SBML models, etc. (public SEEK instance) | Data storage Researcher NeLS Documentation and metadata Microbial biotechnology Machine actionability | Standards/Databases |
FAIRshake | A System to Evaluate the FAIRness of Digital Objects | Compliance monitoring & measurement Data Steward: infrastructure | |
FAIRsharing | A curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies. | Documentation and metadata Data publication Data Steward: policy Researcher Microbial biotechnology Existing data | Standards/Databases Training |
FIP Wizard | FIP Wizard is a toolset to facilitate the capture of data in FAIR Convergence Matrix questionnaire prompting communities to explicitly declare their FAIR Implementation Profiles. These profiles can be then stored and published as nanopublications. | Project data management coordination Data management plan Researcher | |
GA4GH Data Security Toolkit | Principled and practical framework for the responsible sharing of genomic and health-related data. | Data publication Data Steward: policy Data Steward: infrastructure Human data Data sensitivity | |
GA4GH Genomic Data Toolkit | Open standards for genomic data sharing. | Data Steward: infrastructure Human data | |
GA4GH Regulatory and Ethics toolkit | Framework for Responsible Sharing of Genomic and Health-Related Data | Data protection Data sensitivity Data Steward: policy Data Steward: infrastructure Human data | |
Git | Distributed version control system designed to handle everything from small to very large projects | Data organisation Data Steward: infrastructure | Training |
GitHub | Versioning system, used for sharing code, as well as for sharing of small data | Data publication Data organisation Data Steward: infrastructure | Standards/Databases Training |
GitLab | GitLab is an open source end-to-end software development platform with built-in version control, issue tracking, code review, CI/CD, and more. Self-host GitLab on your own servers, in a container, or on a cloud provider. | Data organisation Data publication Data Steward: infrastructure | Standards/Databases Training |
Harvard Medical School - Electronic Lab Notebooks | ELN Comparison Grid by Harvard Medical School | Documentation and metadata Identifiers Researcher | |
How to License Research Data - DCC | Guidelines about how to license research data from Digital Curation Centre | Licensing Researcher Data Steward: policy | |
HumanMine | HumanMine integrates many types of human data and provides a powerful query engine, export for results, analysis for lists of data and FAIR access via web services. | Data organisation Researcher Human data Data analysis | Tool info Standards/Databases Training |
Identifiers.org | The Identifiers.org Resolution Service provides consistent access to life science data using Compact Identifiers. Compact Identifiers consist of an assigned unique prefix and a local provider designated accession number (prefix:accession). | Identifiers Data Steward: infrastructure | Tool info Standards/Databases Training |
ISA-tools | Open source framework and tools helping to manage a diverse set of life science, environmental and biomedical experiments using the Investigation Study Assay (ISA) standard | Data Steward: infrastructure Microbial biotechnology Machine actionability | Standards/Databases |
Linked Open Vocabularies (LOV) | Web portal for finding ontologies | Documentation and metadata Researcher | |
MIADE | Minimum Information About Disorder Experiments (MIADE) standard | Documentation and metadata Researcher Intrinsically disordered proteins | |
MIAPPE | Minimum Information About a Plant Phenotyping Experiment | Documentation and metadata Researcher Plant sciences Plant Genomics Plant Phenomics | Standards/Databases Training |
MIGS/MIMS | Minimum Information about a (Meta)Genome Sequence | Documentation and metadata Researcher Marine metagenomics Microbial biotechnology | Standards/Databases |
MIxS | Minimum Information about any (x) Sequence | Documentation and metadata Researcher Marine metagenomics Plant Genomics | Standards/Databases Training |
MOLGENIS | Molgenis is a modular web application for scientific data. Molgenis provides researchers with user friendly and scalable software infrastructures to capture, exchange, and exploit the large amounts of data that is being produced by scientific organisations all around the world. | Identifiers Data Steward: infrastructure | Tool info |
MRI2DICOM | a Magnetic Resonance Imaging (MRI) converter from ParaVision® (Bruker, Inc. Billerica, MA) file format to DICOM standard | Researcher XNAT-PIC | |
MyTARDIS | A file-system based platform handling the transfer of data | Data transfer Bioimaging data | |
OHDSI | Multi-stakeholder, interdisciplinary collaborative to bring out the value of health data through large-scale analytics. All our solutions are open-source. | Researcher Data analysis Data storage TransMed Toxicology data | Tool info |
OMERO | OMERO is an open-source client-server platform for managing, visualizing and analyzing microscopy images and associated metadata | Documentation and metadata Data Steward: infrastructure Data storage OMERO Bioimaging data | Tool info Training |
OntoMaton | OntoMaton facilitates ontology search and tagging functionalities within Google Spreadsheets. | Researcher Data Steward: infrastructure Documentation and metadata Identifiers | |
Ontobee | A web portal to search and visualise ontologies | Documentation and metadata Researcher | Standards/Databases |
Open Definition Conformant Licenses | Licenses that are conformant with the principles laid out in the Open Definition. | Licensing Researcher Data Steward: policy | |
OpenEBench | ELIXIR benchmarking platform to support community-led scientific benchmarking efforts and the technical monitoring of bioinformatics resources | Data analysis Data Steward: infrastructure | Tool info |
OSF | OSF (Open Science Framework) is a free, open platform to support your research and enable collaboration. | Data storage Researcher | Training |
PANGAEA | Data Publisher for Earth and Environmental Science | Data publication Documentation and metadata Researcher | Tool info Standards/Databases |
pISA-tree | A data management solution for intra-institutional organization and structured storage of life science project-associated research data, with emphasis on the generation of adequate metadata. | Microbial biotechnology Researcher Data organisation Documentation and metadata Plant Phenomics Plant Genomics | Tool info |
RDA Standards | Directory of standard metadata, divided into different research areas | Documentation and metadata Researcher | |
REDCap | REDCap is a secure web application for building and managing online surveys and databases. While REDCap can be used to collect virtually any type of data in any environment, it is specifically geared to support online and offline data capture for research studies and operations. | Identifiers Data Steward: infrastructure Data quality | Tool info Training |
Renamer4Mac | File renaming software for Mac | Data organisation Researcher | |
Repository Finder | Repository Finder can help you find an appropriate repository to deposit your research data. The tool is hosted by DataCite and queries the re3data registry of research data repositories. | Data publication Researcher | |
Research Data Management Organiser | Supports the systematic planning, organisation and implementation of research data management throughout the course of a project | Data management plan Researcher Data Steward: infrastructure | |
Research Management Plan | Machine actionable DMPs. | Data management plan Researcher | |
Research Object Crate (RO-Crate) | RO-Crate is a lightweight approach to packaging research data with their metadata, using schema.org. An RO-Crate is a structured archive of all the items that contributed to the research outcome, including their identifiers, provenance, relations and annotations. | Documentation and metadata Data storage Data organisation Researcher Microbial biotechnology Machine actionability Data provenance | Standards/Databases |
Rightfield | RightField is an open-source tool for adding ontology term selection to Excel spreadsheets | Researcher Documentation and metadata Microbial biotechnology Identifiers Machine actionability | Tool info |
Scientific Data's Recommended Repositories | List of repositories recommended by Scientific Data, containing both discipline-specific and general repositories. | Data publication Researcher Data Steward: infrastructure | |
Semares | All-in-one platform for life science data management, semantic data integration, data analysis and visualization | Researcher Documentation and metadata Data analysis Data Steward: infrastructure Data storage | |
Talend | Talend is an open source data integration platform. | Researcher TransMed | |
The Open Biological and Biomedical Ontology (OBO) Foundry | Collaborative effort to develop interoperable ontologies for the biological sciences | Documentation and metadata Researcher | Standards/Databases |
tranSMART | Knowledge management and high-content analysis platform enabling analysis of integrated data for the purposes of hypothesis generation, hypothesis validation, and cohort discovery in translational research. | Researcher Data analysis Data storage TransMed | Tool info |
Tryggve ELSI Checklist | A list of Ethical, Legal, and Societal Implications (ELSI) to consider for research projects on human subjects | Data sensitivity Data Steward: policy Human data NeLS CSC TSD Data protection | |
University of Cambridge - Electronic Research Notebook Products | List of Electronic Research Notebook Products by University of Cambridge | Documentation and metadata Identifiers Researcher | |
Wellcome Open Research - Data Guidelines | Wellcome Open Research requires that the source data underlying the results are made available as soon as an article is published. This page provides information about data you need to include, where your data can be stored, and how your data should be presented. | Data publication Researcher | |
WorkflowHub | WorkflowHub is a registry for describing, sharing and publishing scientific computational workflows. | Data publication Researcher | Tool info Standards/Databases Training |
XNAT-PIC Pipelines | Analysis of single or multiple subjects within the same project in XNAT | Researcher Data analysis XNAT-PIC | |
XNAT-PIC Uploader | Import tool for multimodal DICOM image datasets to XNAT | Researcher XNAT-PIC | |
Zooma | Find possible ontology mappings for free text terms in the ZOOMA repository. | Documentation and metadata Researcher | Tool info Training |
National resources | | | |
RDM Guide | RDM Guide describes Belgian data management guidelines, resources, tools and services available for researchers in Life Sciences. | Researcher | |
DMPonline.be | This instance of DMPonline is provided by the DMPbelgium Consortium. We can help you write and maintain data management plans for your research. Related tool: DMPRoadmap | Researcher Data management plan | |
PIPPA | PIPPA, the PSB Interface for Plant Phenotype Analysis, is the central web interface and database that provides the tools for the management of the plant imaging robots on the one hand, and the analysis of images and data on the other hand. | Plant Phenomics Plant sciences Researcher Data Steward: infrastructure | Tool info |
Belnet | Belnet is the privileged partner of higher education, research and administration for connectivity. We provide high-bandwidth internet access and related services for our specific target groups. | Researcher Data Steward: infrastructure Data transfer | |
Flemish Supercomputing Center (VSC) | VSC is Flanders’ most highly integrated high-performance research computing environment, providing world-class services to government, industry, and researchers. | Data Steward: infrastructure Data analysis Data storage | |
e-INFRA CZ (Supercomputing and Data Services) | e-INFRA CZ provides an integrated high-performance research computing and data storage environment, providing world-class services to government, industry, and researchers. It also cooperates in the implementation of the European Open Science Cloud (EOSC) in the Czech Republic. | Data Steward: infrastructure Data analysis Data storage | |
Czech National Repository | National Repository (NR) is a service provided to the scientific and research communities in the Czech Republic to store their generated research data together with a persistent DOI identifier. The NR service is currently in a pilot program. | Researcher Data Steward: infrastructure Data storage Existing data Identifiers Data management plan | |
GHGA | The German Human Genome-Phenome Archive. | Data storage Documentation and metadata Researcher | |
PUBLISSO | Open access publishing platform for life sciences. | Data publication Researcher | |
REDCap Estonia | This is the Estonian instance of REDCap, which is a secure web platform for building and managing online databases and surveys. Related tool: REDCap | Data quality | |
Red Española de Supercomputación | The Spanish Supercomputing Network’s mission is to offer the resources and services of supercomputing and data management necessary for the development of innovative and high-quality scientific and technological projects, through competitive calls based on the scientific excellence of the projects to be developed. | Researcher Data Steward: infrastructure | |
RedIRIS | Spanish academic and research network that provides advanced communication services to the scientific community and national universities. | Researcher Data Steward: infrastructure | |
Recolecta | The national aggregator of open access repositories. This platform brings together all the Spanish digital infrastructures in which open access research results are published and/or deposited. | Researcher Data Steward: infrastructure | |
Datos.gob.es | Open data portal of the Spanish government. A meeting point for the various actors that make up the open data ecosystem. | Researcher Data Steward: infrastructure | |
DMPTuuli | Data management planning tool (Finland). Related tool: DMPRoadmap | CSC Researcher Data management plan | |
Fairdata.fi | With the Fairdata Services you can store, share and publish your research data with easy-to-use web tools. | CSC Researcher Data storage Data publication Existing data | |
Federated EGA Finland | FEGA allows you to store and share sensitive data in Finland in a way that fulfils all the requirements of the General Data Protection Regulation (GDPR). | CSC Researcher Data sensitivity Data publication Existing data Human data | |
Findata | The Health and Social Data Permit Authority. Findata offers services and enables secure and efficient utilisation of data materials containing health and social data. | CSC Researcher Data sensitivity Existing data Human data | |
Fingenious | Finnish Biobank Cooperative (FINBB) connects researchers to Finnish biomedical research. Via Fingenious® services the researcher can connect to all Finnish public biobanks. | CSC Researcher Data sensitivity Human data | |
Sensitive Data Services for Research | CSC Sensitive Data Services for Research are designed to support secure sensitive data management through web-user interfaces accessible from the user’s own computer. | CSC Researcher Data sensitivity Data analysis Data storage Data publication Human data | |
High performance computing | The performance of the CSC supercomputers Puhti, Mahti and LUMI ranges from medium-scale simulations to one of the most competitive supercomputers in the world. | CSC Researcher Data analysis | |
Cloud computing | CSC offers a variety of cloud computing services: the Pouta IaaS services and the Rahti container cloud service. | CSC Researcher Data analysis | |
IceBear | A browser-based Research Data Management tool for protein crystallization that offers a flexible crystal fishing workbench, no-typing submission for crystal shipment, and linking of crystals and datasets, including PDB depositions. | Researcher Data analysis | |
DMP OPIDoR | Online questionnaire for the development of data management plans - repository of DMPs. Related tool: DMPRoadmap | IFB Researcher Data management plan | |
Open-science.it | Italian portal dedicated to the field of open science. | Researcher Data management plan | |
RETTE | System for Risk and compliance. Processing of personal data in research and student projects at UiB. | Human data Data protection Data sensitivity Data Steward: policy | |
BioData.pt Service Hub | BioData.pt Service Hub includes several data management resources, tools and services available for researchers in Life Sciences. | Researcher Data analysis Data storage | |
BioData.pt Data Management Portal (DMPortal) | This instance of DataVerse is provided by BioData.pt. We can help you write and maintain data management plans for your research. Related tool: DATAVERSE | Researcher Data storage | |
BioData.pt Data Stewardship Wizard | Local instance of Data Stewardship Wizard. You can use this tool to create your own Data Management Plans. Related tool: Data Stewardship Wizard | Researcher Data management plan | |
DMPonline | DMPonline is a web-based tool that supports researchers to develop data management and sharing plans. It contains the latest funder templates and best practice guidelines to support users to create good quality DMPs. Related tool: DMPRoadmap | Researcher Data management plan | |
CyVerse UK | The CyVerse Data Store is a cloud-based storage space, accessible via the CyVerse Discovery Environment (DE), a virtual bioinformatics lab workbench, and developer APIs such as the AGAVE API. In the DE, users can share datasets and tools to analyse data with as many or as few people as they wish. | Researcher Documentation and metadata | |
Jisc Research data management toolkit | Guidance on the research data lifecycle that signposts resources from a wide range of organisations and websites. | Researcher Documentation and metadata | |
Agrischema | Linked data schemas for the fields of agriculture, food, agri-business, plant biology. | Researcher Documentation and metadata | |
InterMine | InterMine integrates heterogeneous data sources, making it easy to query and analyse data. | Researcher Documentation and metadata | |