Tool assembly: Marine Metagenomics
What is the Norwegian tool assembly for marine metagenomics data management?
The Norwegian tool assembly for marine metagenomics aims to provide a comprehensive toolkit for management of marine genomic research data throughout a project’s data life cycle. The toolkit, developed by students and researchers in Norway, contains resources and software tools for data management (planning, processing, storing and sharing), data analysis and training. It is built on the Norwegian e-Infrastructure for Life Sciences (NeLS) tool assembly of ELIXIR Norway and the Marine Metagenomics Portal (MMP).
Who can use the marine metagenomics data management tool assembly?
This tool assembly is useful for students and researchers in Norway who are interested in analysing marine datasets (e.g. genomes, metagenomes and transcriptomes). Parts of the assembly, such as data storage, are based on national infrastructures, laws and regulations and are consequently limited to Norwegian users, while other parts, such as data analysis tools and data repositories, are globally accessible.
How can you access the marine metagenomics data management tool assembly?
To use the resources and tools mentioned here, we recommend that you have a Feide account. You also need a NeLS account to access usegalaxy.no. If your institution does not use the national Feide secure login service, you can apply for a NeLS IDP through the ELIXIR Norway help desk. Note that the Marine Metagenomics Portal (MMP) is an open-access platform that can be accessed without a Feide account at https://mmp2.sfb.uit.no/.
For what purpose can you use the marine metagenomics data management tool assembly?
Data management planning
Support for data management planning, including the Data Management Plan model for marine metagenomics in Norway, is provided through the ELIXIR-NO instance of the Data Stewardship Wizard (DSW). For more on standards and best practices for the metagenomics data life cycle, we refer you to a publication. Questions regarding the DSW and data management in general can be directed to the ELIXIR Norway help desk.
If you use one of the national Norwegian research infrastructures, such as the Norwegian sequencing infrastructure NorSeq, they can upload data directly to your NeLS project for you, as described on this page.
Data storage, sharing and compute
The solutions for data storage, sharing and computation are built on the services and infrastructure delivered by ELIXIR Norway described in the Norwegian e-Infrastructure for Life Sciences (NeLS) tool assembly.
Data processing and analysis
The Marine Metagenomics Portal provides a complete service for analysis of marine metagenomic data through the tool META-pipe. META-pipe is a pipeline that can assemble your high-throughput sequence data, functionally annotate the predicted genes, and taxonomically profile your marine metagenomics samples, helping you gain insight into the phylogenetic diversity and the metabolic and functional potential of environmental communities. You can read more details about META-pipe in the publication. Norwegian users with Feide access can use the online version of META-pipe. For other users, META-pipe is downloadable and can be run on any computing environment (e.g. any Linux workstation, SLURM cluster or Kubernetes).
Usegalaxy.no is a Norwegian instance of the Galaxy web-based platform for data-intensive life science research. It provides users with a unified, easy-to-use graphical interface to more than 200 different analysis tools. Here, you can find tools for a wide variety of analyses of your marine metagenomic and genomic data. The tools are publicly available in the Galaxy Toolshed, which serves as an “appstore”, so you can easily transfer them to your favourite Galaxy instance anywhere. You can run the tools interactively, one by one, or combine them into multi-step workflows that can be executed as a single analysis. Premade workflows (e.g. for taxonomic classification of metagenomic sequences) are provided, and you can request installation of your favourite tool by contacting the ELIXIR Norway help desk.
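Besides the graphical interface, Galaxy servers generally expose a REST API, so the workflows available to you can also be listed from a script. The following is a minimal sketch using only the Python standard library, not an official client. It assumes that the standard Galaxy `/api/workflows` endpoint and the `x-api-key` header are enabled on usegalaxy.no; the function names are hypothetical and chosen for illustration.

```python
import json
import urllib.request

GALAXY_URL = "https://usegalaxy.no"  # instance named in the text


def build_workflow_request(api_key, base_url=GALAXY_URL):
    """Build (but do not send) a request listing the workflows visible to a user.

    Assumption: the instance accepts a personal API key in the x-api-key header.
    """
    url = base_url.rstrip("/") + "/api/workflows"
    return urllib.request.Request(url, headers={"x-api-key": api_key})


def list_workflow_names(api_key, base_url=GALAXY_URL):
    """Send the request and return workflow names.

    Needs network access and a valid API key for the chosen instance.
    """
    request = build_workflow_request(api_key, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        workflows = json.load(response)
    return [workflow["name"] for workflow in workflows]
```

Calling `list_workflow_names("YOUR_API_KEY")`, where `YOUR_API_KEY` is a placeholder for a key generated in your own Galaxy user preferences, would return the workflow names as a list. Building the request separately from sending it makes the URL and header easy to inspect before any network call is made.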
Data sharing and publishing
ELIXIR Norway acts as a broker for Norwegian end users who wish to submit data to ELIXIR Deposition Databases (such as ENA), and provides support for submitting the data on behalf of the data owners directly from the Norwegian e-Infrastructure for Life Sciences (NeLS).
If you need help with publishing or are interested in using the brokering service, please contact the ELIXIR Norway help desk.
The Marine Metagenomics Portal (MMP) provides you with high-quality, curated and freely accessible microbial genomics and metagenomics resources. Through MMP you can access the Marine reference database (MarRef), the Marine Genome Database (MarDB), MarFun (a database of marine fungi genomes) and SalDB (a salmon-specific database of genome-sequenced prokaryotes). They are built by aggregating data from a number of publicly available sequence, taxonomy and literature databases in a semi-automatic fashion. Other databases and resources, such as bacterial diversity and culture collection databases, web mapping services and ontology databases, are used extensively for curation of metadata. At present, MarRef contains nearly 1000 complete microbial genomes and MarDB hosts more than 13,000 non-complete genomes. The MAR database entries are cross-referenced with ENA and the World Register of Marine Species (WoRMS); you can read more in the publication about the MAR databases.
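Because MAR entries are cross-referenced with ENA, an accession taken from an entry can be turned into links to the corresponding ENA record. The sketch below is one way this could look, assuming the public ENA Browser URL patterns (`/view/` for the web page and `/api/xml/` for the machine-readable record). The function name `ena_links` is hypothetical, and the accession in the usage note is a made-up placeholder, not a real MAR entry.

```python
from urllib.parse import quote

# Assumption: base URL of the public ENA Browser.
ENA_BROWSER = "https://www.ebi.ac.uk/ena/browser"


def ena_links(accession):
    """Return web and XML URLs for one ENA accession (no network call is made)."""
    acc = quote(accession.strip(), safe="")  # trim whitespace, escape unsafe characters
    return {
        "view": f"{ENA_BROWSER}/view/{acc}",
        "xml": f"{ENA_BROWSER}/api/xml/{acc}",
    }
```

For example, `ena_links("GCA_000000000.1")["view"]` yields the browser page URL for that placeholder accession; substitute an accession listed in the MarRef or MarDB entry you are interested in.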
Data management plan: How to write a Data Management Plan (DMP).
Existing data: How to find and reuse existing data.
Data organisation: Best practices to name and organise research data.
Data storage: How to find appropriate storage solutions.
Data publication: How to prepare data and find repositories for publication.
Documentation and metadata: How to document and describe your data.
Data analysis: How to make data analysis FAIR.
Relevant tools and resources
|Tool or resource||Description||Related pages||Registry|
|Galaxy||Open, web-based platform for data intensive biomedical research. Whether on the free public server or your own instance, you can perform, reproduce, and share complete analyses.||NeLS Data analysis Researcher Data Steward: infrastructure IFB Galaxy||Tool info Training|
|MarDB||MarDB includes all non-complete marine microbial genomes regardless of level of completeness. Each entry contains 120 metadata fields including information about sampling environment or host, organism and taxonomy, phenotype, pathogenicity, assembly and annotation.||Data analysis||Tool info|
|MarFun||MarFun is a manually curated marine fungi genome database.||Data analysis|
|Marine metagenomics portal||High-quality curated and freely accessible microbial genomics and metagenomics resources for the marine scientific community.||||Tool info|
|MarRef||MarRef is a manually curated marine microbial reference genome database that contains completely sequenced genomes. Each entry contains 120 metadata fields including information about sampling environment or host, organism and taxonomy, phenotype, pathogenicity, assembly and annotation.||||Tool info|
|SalDB||SalDB is a salmon-specific database of genome-sequenced prokaryotes representing the microbiota of fishes in the taxonomic family Salmonidae.|
|Feide||Feide is the national solution for secure login and data exchange in education and research. Feide can be linked with ELIXIR-AAI through eduGAIN.|
|Data Stewardship Wizard||DS-Wizard is a tool to aid the creation, organisation and sharing of data management plans. It provides scientists with guidance, facilitating the understanding of the key components of FAIR-oriented Data Stewardship. The template in this instance provides additional guidance on resources, laws and regulations in Norway.||TSD NeLS Data management plan|
|META-pipe||META-pipe is a pipeline for annotation and analysis of marine metagenomics samples, which provides insight into phylogenetic diversity, metabolic and functional potential of environmental communities.|
|NeLS||Norwegian e-Infrastructure for Life Sciences enables Norwegian life scientists and their international collaborators to store, share, archive, and analyse their omics-scale data.|