Data life cycle: Collecting
What is data collection?
Data collection is the process of gathering information about specific variables of interest, using either instrumentation or other methods (e.g. questionnaires, patient records). Data collection methods depend on the field and research subject, but in every case it is important to ensure data quality.
You can also reuse existing data in your project. These can be either individual, previously collected datasets, reference data from curated resources like ELIXIR Core Data Resources, or consensus data like reference genomes. For more information, see Reuse in the data life cycle.
Why is data collection important?
Apart from being the source of information to build your findings on, the collection phase lays the foundation for the quality of both the data and its documentation. It is important that the decisions made regarding quality measures are implemented, and that the collection procedures are appropriately recorded.
What should be considered for data collection?
Appropriate tools, or an integration of multiple tools (also called a tool assembly or ecosystem), can help you with data management and documentation during data collection. Suitable tools for this purpose include Electronic Lab Notebooks (ELNs), Electronic Data Capture (EDC) systems, and Laboratory Information Management Systems (LIMS). Online platforms for collaborative research and file sharing services can also be used as ELNs or data management systems.
Regardless of the tools you use, consider the following while collecting data:
- How to capture provenance - e.g. of samples, researchers and instruments
- How to ensure data quality - data can be generated either by you or by another infrastructure or facility with this specialization
- Whether existing data can be reused instead of generating new data
- Experimental design - prepare a collection plan (e.g. repetitions, controls, randomization) in advance
- Instrument calibration
- If you work with sensitive or confidential data, take care of data protection and security issues
- If you work with human-related data, think about permissions and consent
- How to store the data
- Where to store the data
- Identify suitable metadata standards
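As an illustration of the provenance point above, the following is a minimal sketch of how collection details could be recorded next to a data file when no ELN, EDC system or LIMS is in use. It is not a recommended standard: the function names, the field names (sample_id, researcher, instrument, instrument_calibrated_on) and the ".provenance.json" sidecar naming are all assumptions made for this example, and a real project would normally follow the metadata standard chosen for its field.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(data_file: Path, sample_id: str, researcher: str,
                     instrument: str, calibrated_on: str) -> Path:
    """Write a JSON sidecar file describing how a data file was collected.

    All field names are hypothetical; replace them with the terms of the
    metadata standard used in your project.
    """
    record = {
        "file": data_file.name,
        "sha256": sha256_of(data_file),  # lets others detect later changes
        "sample_id": sample_id,
        "researcher": researcher,
        "instrument": instrument,
        "instrument_calibrated_on": calibrated_on,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    # Hypothetical naming scheme: run1.csv -> run1.csv.provenance.json
    sidecar = data_file.with_name(data_file.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling write_provenance(Path("run1.csv"), "S-001", "A. Researcher", "spectrometer-1", "2024-01-01") would, under these assumptions, create run1.csv.provenance.json in the same folder. Keeping the record beside the data means the sample, researcher and instrument information travels with the file, and the checksum gives a simple way to check that the file has not changed since collection.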
Best practices to name and organise research data
Ensure high quality research data
How to find and reuse existing data
How to use identifiers for research data
Documentation and metadata
How to document and describe your data
How to identify different research data types
How to find appropriate storage solutions