What is the CSC data management tool assembly?
CSC – IT Center for Science and ELIXIR Finland provide services, tools and software for managing research data throughout the project life cycle. The services cover computing environments, analysis programs, and tools for storing and sharing data during the project, as well as for opening and discovering research data. Furthermore, ELIXIR-FI provides flexible infrastructure for bioinformatics data analysis. The services are actively developed, so please visit the CSC web pages for the latest updates.
Who can use the CSC data management tool assembly?
CSC and ELIXIR-FI services are available to researchers affiliated with a Finnish academic organisation or research institute, and to their international collaborators. Most of CSC’s services are free of charge for academic research, education and training purposes in Finnish higher education institutions and state research institutes. Researchers can start using the services by registering an account, and can get bioinformatics user support from our service desk.
How can you access the CSC data management tool assembly?
You can access all CSC services through several secure authentication methods. Start by registering an account at CSC using your home organisation’s HAKA or VIRTU login, or by contacting our service desk. Afterwards you can also use ELIXIR login. For more information on how to get access to the different services, see the CSC accounts and support web pages.
For what can you use the CSC data management tool assembly?
Figure 1. The CSC - IT Center for Science data management tool assembly.
Data management planning
Research funders often require a data management plan (DMP) as part of the funding application process or after funding has been approved. See, for example, the guidance on creating a DMP for the Research Council of Finland (formerly the Academy of Finland). There are several tools to guide you through the planning process, such as DMPTuuli, a Finnish instance of DMPonline that includes guidance and templates provided by different organisations and funders. DMPs created in DMPTuuli are not yet machine-actionable or linked to the CSC data management tool assembly services. However, a DMP is a valuable document when contacting and using the CSC research data management services. Other available DMP tools include Data Stewardship Wizard and DMPonline. Furthermore, research data support services at Finnish research organisations offer help and guidance throughout the planning process and when making decisions on data management.
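As a rough illustration of what "machine-actionable" can mean for a DMP, the sketch below expresses a few DMP facts as structured JSON that a program can query. The field names are loosely modelled on the RDA DMP Common Standard and should be treated as assumptions. The title, contact and dataset values are invented placeholders, and `sensitive_datasets` is a hypothetical helper. Nothing here is exported by DMPTuuli or accepted by any CSC service.

```python
import json

# Hypothetical, minimal DMP record; all values are placeholders.
dmp = {
    "dmp": {
        "title": "DMP for an example sequencing project",
        "language": "eng",
        "contact": {"name": "Example Researcher", "mbox": "researcher@example.org"},
        "dataset": [
            {
                "title": "Raw sequence reads",
                "personal_data": "no",
                "sensitive_data": "no",
            },
            {
                "title": "Clinical questionnaire responses",
                "personal_data": "yes",
                "sensitive_data": "yes",
            },
        ],
    }
}


def sensitive_datasets(record: dict) -> list[str]:
    """Return the titles of datasets flagged as sensitive."""
    return [
        d["title"]
        for d in record["dmp"]["dataset"]
        if d.get("sensitive_data") == "yes"
    ]


# Round-trip through JSON to show the record is plain, exchangeable data.
serialised = json.dumps(dmp, indent=2)
restored = json.loads(serialised)
needs_secure_storage = sensitive_datasets(restored)
```

With a record like this, a tool could in principle read the plan and flag which datasets call for a sensitive-data environment, which is not possible when the plan exists only as free text.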
Data collection
When you start collecting data and need a storage environment, for example to host accumulating data, Allas Object Storage is the recommended option. Allas is CSC’s general-purpose research data storage server, which can be accessed from the CSC servers as well as from anywhere on the internet. Allas can be used both for static research data that needs to be available for analysis and for collecting accumulating data. For example, if you work with sequence data, the sequencing provider can transfer the data directly to Allas under your project. However, as an object storage system, Allas is not suitable for very dynamic data such as SQL databases.
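The point about dynamic data can be illustrated with a toy model. The sketch below is not the Allas API: `ToyObjectStore`, `put`, `get` and the object keys are hypothetical names used only for illustration. It assumes the usual object storage behaviour, where an object is written and read as a whole, so adding new files is cheap while changing one byte of an existing object means uploading the full object again.

```python
import hashlib


class ToyObjectStore:
    """In-memory stand-in for an object store (hypothetical, not the Allas API)."""

    def __init__(self):
        self._objects = {}      # key -> bytes
        self.bytes_written = 0  # total bytes uploaded so far

    def put(self, key: str, data: bytes) -> str:
        """Store a whole object and return its SHA-256 checksum."""
        self._objects[key] = data
        self.bytes_written += len(data)
        return hashlib.sha256(data).hexdigest()

    def get(self, key: str) -> bytes:
        """Return the whole object stored under the key."""
        return self._objects[key]


store = ToyObjectStore()

# Accumulating data: each new sequencing run becomes a new object.
for run in ("run-001", "run-002", "run-003"):
    store.put(f"project_x/raw/{run}.fastq", b"ACGT" * 1000)

written_by_appends = store.bytes_written

# Dynamic data: changing a single byte still re-uploads the full object.
content = bytearray(store.get("project_x/raw/run-001.fastq"))
content[0:1] = b"T"
store.put("project_x/raw/run-001.fastq", bytes(content))

rewrite_cost = store.bytes_written - written_by_appends
```

In this toy run the one-byte change costs a full 4000-byte upload, the same as adding a new run. A database that updates records many times per second would pay that cost on every change, which is why such workloads fit block or file storage better.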
Data processing and analysis
For processing, analysing and storing data during the research project, CSC offers several computing platforms. These include environments for both non-sensitive and sensitive data. Depending on your needs, you can choose from a wide variety of computing resources: use Chipster software for high-throughput data such as RNA-seq and single-cell RNA-seq, build your own custom virtual machine, or utilise the full power of our world-class supercomputers.
CSC hosts EuroHPC’s supercomputer LUMI, which is available to researchers across Europe for projects requiring extreme computing capacity. LUMI is one of the world’s most competitive supercomputers and one of the most advanced platforms for artificial intelligence. In addition, you can use the two national supercomputers, Puhti and Mahti, for medium to large-scale simulations. The Pouta and Rahti cloud computing services offer more flexibility, allowing the user to manage the infrastructure. CSC’s computers have a wide range of preinstalled scientific software and databases with usage instructions.
For the management of sensitive data, the SD Connect and SD Desktop services are available. The Sensitive Data Services are designed to facilitate collaborative research across Finland and between Finnish academics and their collaborators. SD Connect allows you to collect, organise and share your encrypted sensitive data in a secure manner via a web browser or programmatically. SD Desktop allows a user and their authorised colleagues to access a private computing environment via a web browser and analyse the data within a secure cloud. A restricted version of SD Desktop is also available for processing health and social data for secondary use, in compliance with Finnish law and the Findata regulation.
Data sharing and publishing
It is recommended to publish data in data-specific repositories. You can find many options in the ELIXIR Deposition Databases for Biomolecular Data. Furthermore, CSC and ELIXIR-FI will offer a Federated EGA service for sensitive human biomedical data, linked to the central European Genome-phenome Archive (EGA).
The Finnish Federated EGA is part of a European network of repositories for biomedical data. The service will give you the tools to describe your dataset (by adding the appropriate metadata) and to assign it an EGA accession number. After publication, you will remain the data controller and decide, according to specific policies, who can access the sensitive data for reuse. In compliance with the GDPR, your data will remain within Finnish borders while still being accessible and discoverable according to the FAIR data principles.
In addition to the above-mentioned services, you can use the national Fairdata.fi services. The Fairdata IDA storage service enables saving, organising and sharing data within the project group, and storing the data in an immutable state. After freezing your data in IDA, you can use Qvain, the research dataset description tool, to describe your data, thus creating core metadata for your dataset, and to publish it. Publishing means that your dataset will appear in Etsin, the research data finder, where you can discover and download any files you have associated with the dataset. Any published dataset is also made available to the Research.fi portal automatically by the Fairdata services.
More information
Tools and resources on this page
Tool or resource | Description | Related pages | Registry |
Data Stewardship Wizard | Publicly available online tool for composing smart data management plans | FAIRtracks, Plant Genomics, Plant Phenomics, Plant sciences, Data management plan, GDPR compliance | Tool info, Training |
DMPonline | Data Management Plans that meet institutional funder requirements. | Plant sciences, Data management plan | Training |
ELIXIR Deposition Databases for Biomolecular Data | List of discipline-specific deposition databases recommended by ELIXIR. | IFB, Marine Metagenomics, NeLS, Data discoverability, Data publication, Documentation and meta... | Standards/Databases |
The European Genome-phenome Archive (EGA) | EGA is a service for permanent archiving and sharing of all types of personally identifiable genetic and phenotypic data resulting from biomedical research projects | TSD, Human data, Data publication | Tool info, Standards/Databases, Training |
National resources
Tools and resources tailored to users in different countries.
Tool or resource | Description | Related pages | Registry |
Chipster | Chipster is a user-friendly analysis software for high-throughput data such as RNA-seq and single-cell RNA-seq. It contains analysis tools and a large reference genome collection. | Researcher, Research Software Engi..., Data analysis | |
Cloud computing | CSC offers a variety of cloud computing services: the Pouta IaaS services and the Rahti container cloud service. | Researcher, Data Steward, Data analysis | |
DMPTuuli | Data management planning tool (Finland). | Researcher, Data Steward, Data management plan | |
Fairdata.fi | With the Fairdata Services you can store, share and publish your research data with easy-to-use web tools. | Researcher, Data Steward, Data storage, Data publication, Existing data | |
Federated EGA Finland | FEGA allows you to store and share sensitive data in Finland in a way that fulfils all the requirements of the General Data Protection Regulation (GDPR). Related tool: The European Genome-phenome Archive (EGA). | Researcher, Data Steward, Data sensitivity, Data publication, Existing data, Human data | |
Findata | The Health and Social Data Permit Authority. Findata offers services and enables secure and efficient utilisation of data materials containing health and social data. | Researcher, Data Steward, Data sensitivity, Existing data, Human data | |
Fingenious | Finnish Biobank Cooperative (FINBB) connects researchers to Finnish biomedical research. Via Fingenious® services the researcher can connect to all Finnish public biobanks. | Researcher, Data Steward, Data sensitivity, Human data | |
High performance computing | CSC’s supercomputers Puhti, Mahti and LUMI range from systems for medium-scale simulations to one of the most competitive supercomputers in the world. | Researcher, Data Steward, Data analysis | |
Sensitive Data Services for Research | CSC Sensitive Data Services for Research are designed to support secure sensitive data management through web user interfaces accessible from the user’s own computer. | Researcher, Data Steward, Data sensitivity, Data analysis, Data storage, Data publication, Human data | |