Tool assembly: CSC
What is the CSC data management tool assembly?
CSC – IT Center for Science and ELIXIR Finland provide services, tools and software for managing research data throughout the project life cycle. The services cover computing environments, analysis programs, and tools for storing and sharing data during the project, as well as for opening and discovering research data. Furthermore, ELIXIR-FI provides flexible infrastructure for bioinformatics data analysis. The services are under active development, so please visit the CSC web pages for the latest updates.
Who can use the CSC data management tool assembly?
CSC and ELIXIR-FI services are accessible to researchers in Finland and to foreign collaborators of Finland-based research groups. Most of CSC’s services are free of charge for academic research, education and training purposes in Finnish higher education institutions and state research institutes. Researchers can start using the services by registering an account, and can get bioinformatics user support from our service desk.
How can you access the CSC data management tool assembly?
You can access all CSC services through several secure authentication methods. Start by registering an account at CSC with your home organization’s HAKA or VIRTU login, or by contacting our service desk. Afterwards you can also use ELIXIR login. The CSC accounts and support web pages give more information on how to get access to the different services.
For what can you use the CSC data management tool assembly?
Data management planning
You can get support for data management planning through DMPTuuli, a Finnish instance of DMPonline that includes guidance and templates provided by different organisations and funders. DMPs created in DMPTuuli are not yet machine-actionable or linked to the CSC data management tool assembly services. However, a DMP is a valuable document to have at hand when contacting the CSC research data management services.
When you start collecting data and need a storage environment where you can, for example, host accumulating or changing data, Allas Object Storage is the recommended option. Allas is CSC’s general-purpose research data storage server, which can be accessed from the CSC servers as well as from anywhere on the internet. Allas can be used both for static research data that needs to be available for analysis and for collecting accumulating or changing data. For example, if you work with sequence data, the sequencing provider can transfer the data directly to Allas under your project.
Data processing and analysis
For processing, analysing and storing data during the research project, CSC offers several computing platforms. These include environments for both non-sensitive and sensitive data. Depending on your needs, you can choose from a wide variety of computing resources: use Chipster software for high-throughput data such as RNA-seq and single cell RNA-seq, build your own custom virtual machine, or utilise the full power of our world-class supercomputers.
The supercomputers Puhti and Mahti can be used for larger-scale analysis and simulations. They will soon be accompanied by the world-class supercomputer LUMI. The Pouta and Rahti cloud computing services offer more flexibility, allowing the user to manage the infrastructure. CSC’s computers have a wide range of preinstalled scientific software and databases with usage instructions.
This summer, CSC will be releasing beta versions of new services for sensitive data management: Sensitive Data Desktop (SD Desktop) and Sensitive Data Connect (SD Connect). Sensitive Data Submit (SD Submit) will be available later this year. The new Sensitive Data Services are designed to facilitate collaborative research across Finland and between Finnish academics and their collaborators. SD Desktop is a service that allows a user and their authorized colleagues to access a private computing workspace via a web browser and analyse the data within a secure cloud. SD Connect allows you to collect, organize and share your encrypted sensitive data in a secure manner via a web browser.
Data sharing and publishing
It is recommended to publish data in data-specific repositories. You can find many options on the ELIXIR Deposition Databases for Biomolecular Data web page. Furthermore, at the end of 2021, CSC and ELIXIR-FI will offer Federated EGA, which is linked to the central European Genome-phenome Archive, and SD Submit for sensitive human biomedical data.
SD Submit allows you to publish sensitive data securely in a national repository. The service will give you the tools to describe your dataset (adding the appropriate metadata) and assign a permanent identifier (DOI). After publication, you will remain the data controller and decide, according to specific policies, who can access the sensitive data for reuse. In compliance with the GDPR, your data will remain within Finnish borders and, at the same time, will be accessible and discoverable according to the FAIR data principles.
In addition to the above-mentioned services, you can use the national Fairdata.fi services. The Fairdata IDA storage service enables saving, organising and sharing data within the project group and storing the data in an immutable state. After freezing your data in IDA, you can use Qvain, the research dataset description tool, to describe your data, thus creating core metadata for your dataset, and publish it. Publishing makes your dataset available in Etsin, the research data finder, where you can discover and download any files you have associated with the dataset. Any published dataset is also made available to the Research.fi portal automatically by the Fairdata services.
How to identify different research data types.
Data management plan: How to write a Data Management Plan (DMP).
Data protection: How to make research data compliant with the GDPR.
Data storage: How to find appropriate storage solutions.
Data publication: Prepare data and find repositories for publication.
Data analysis: How to make data analysis FAIR.
Relevant tools and resources
|Tool or resource||Description||Related pages||Registry|
|ELIXIR Deposition Databases for Biomolecular Data||List of discipline-specific deposition databases recommended by ELIXIR.||Data publication, Researcher, Data Steward: research, Data Steward: infrastructure, COVID-19 Data Portal, NeLS, IFB||Standards/Databases|
|LUMI||EuroHPC world-class supercomputer||Data analysis, Researcher, Data Steward: infrastructure||Tool info|
|The European Genome-phenome Archive (EGA)||EGA is a service for permanent archiving and sharing of all types of personally identifiable genetic and phenotypic data resulting from biomedical research projects||Data publication, Human data, Data Steward: policy, TSD||Tool info, Standards/Databases, Training|
|Tryggve ELSI Checklist||A list of Ethical, Legal, and Societal Implications (ELSI) to consider for research projects on human subjects||Sensitive data, Data Steward: policy, Data Steward: research, Human data, NeLS, TSD|
|Chipster||Chipster is a user-friendly analysis software for high-throughput data such as RNA-seq and single cell RNA-seq. It contains analysis tools and a large reference genome collection.||Researcher, Data Steward: infrastructure, Data analysis|
|DMPTuuli||Data management planning tool (Finland).||Researcher, Data Steward: research, Data management plan|
|Fairdata Services||With the Fairdata Services you can store, share and publish your research data with easy-to-use web tools.||Researcher, Data Steward: research, Data storage, Data publication, Existing data|
|Federated EGA Finland||FEGA allows you to store and share sensitive data in Finland in a way that fulfils all the requirements of the General Data Protection Regulation (GDPR).||Researcher, Data Steward: research, Sensitive data, Data publication, Existing data, Human data|
|Findata||The Health and Social Data Permit Authority. Findata offers services and enables secure and efficient utilisation of data materials containing health and social data.||Researcher, Data Steward: research, Sensitive data, Existing data, Human data|
|Finnish Biobank Cooperative (FINBB)||Finnish Biobank Cooperative (FINBB) connects researchers to Finnish biomedical research. Via Fingenious® services the researcher can connect to all Finnish public biobanks.||Researcher, Data Steward: research, Sensitive data, Human data|
|Sensitive Data Services for Research||CSC Sensitive Data Services for Research are designed to support secure sensitive data management through web user interfaces accessible from the user’s own computer.||Researcher, Data Steward: research, Sensitive data, Data analysis, Data storage, Data publication, Human data|
|High performance computing||The CSC supercomputers Puhti, Mahti and LUMI cover needs ranging from medium-scale simulations to workloads that call for one of the most competitive supercomputers in the world.||Researcher, Data Steward: research, Data analysis|
|Cloud computing||CSC offers a variety of cloud computing services: the Pouta IaaS services and the Rahti container cloud service.||Researcher, Data Steward: research, Data analysis|