Welcome to MultiOmicsIntegrator documentation!

MOI is a Nextflow pipeline that aims to cover extensive and diverse omics analyses. It offers end-to-end analysis of RNA-seq data, taking into account different RNA molecules such as gene isoforms and miRNAs. Beyond what traditional RNA-seq pipelines provide, MOI includes tools for functional annotation and secondary structure prediction of RNA transcripts. It also covers proteomics and metabolomics analyses; for the latter, MOI has additional tools for assessing differences in specific biochemical attributes. All omics-specific workflows can run autonomously, but users can also integrate their data with either data-driven or biology-driven approaches. Finally, unlike existing pipelines, MOI offers biotranslator as its pathway enrichment algorithm, since it can better mitigate the high noise inherent in biological data, and omnipathr for advanced downstream analyses.

Note

This project is additional documentation for MOI: MultiOmicsIntegrator: an integrated solution for omics analyses.

General inputs and outputs

The MOI pipeline is organized into individual modules, each responsible for a specific step in the analysis workflow. This modular design makes it straightforward to incorporate new analysis techniques or custom implementations, and it keeps the pipeline easy to maintain and scale.

MOI’s behavior is controlled through params.yml files, each named after the analysis segment it governs. In these files the user specifies input and output parameters and can optionally fine-tune settings such as algorithm selection and algorithm configuration.

The pipeline’s inputs are streamlined into a single CSV file. This file contains either a single column of SRA codes or the path to the directory holding the FASTQ files, along with any other metadata pertaining to the samples. If the analysis starts from count (abundance) matrices, the user can instead specify the directory of the matrix along with a phenotype file.
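
As a minimal sketch of what such a sample sheet could look like, the Python snippet below builds a small CSV with one row per sample. The column names (sample, sra_code, condition) and the accession values are placeholders chosen for illustration and are not MOI's required schema; the headers the pipeline actually expects should be taken from its own templates.

```python
import csv
import io

# Hypothetical column names; MOI's real headers may differ.
FIELDNAMES = ["sample", "sra_code", "condition"]

# Placeholder accessions and metadata, for illustration only.
SAMPLES = [
    {"sample": "ctrl_1", "sra_code": "SRR0000001", "condition": "control"},
    {"sample": "treat_1", "sra_code": "SRR0000002", "condition": "treated"},
]


def build_samplesheet(rows, fieldnames=FIELDNAMES):
    """Return the CSV text for a list of per-sample dictionaries."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


samplesheet_text = build_samplesheet(SAMPLES)
print(samplesheet_text)
```

Keeping the metadata in the same file as the sample identifiers means each row carries everything a downstream module needs to know about that sample.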

MOI produces extensive outputs, including informative plots and intermediate results in the form of text and RData objects for each module, accommodating users who seek further utilization or detailed inspection of results. Outputs are organized hierarchically based on the user’s parameterization; for example, the pathway enrichment analysis of genes will be located under the directory “/user_defined_output_directory/genes/biotranslator/”.
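
The directory layout in the example above can be sketched in a few lines of Python. The helper name module_output_dir is hypothetical, and the two-level layout (omics type, then module) is inferred from the single example path given here, so other modules may be organized differently.

```python
from pathlib import PurePosixPath


def module_output_dir(outdir, omics, module):
    """Compose <outdir>/<omics>/<module> as a POSIX-style path.

    Assumption: results are nested by omics type and then by module name,
    as in the biotranslator example from the text.
    """
    return PurePosixPath(outdir) / omics / module


path = module_output_dir("/user_defined_output_directory", "genes", "biotranslator")
print(path)
```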

Most important tools

========================================  ===============================================  ============================================================
Omics                                     Functionality                                    Tools
========================================  ===============================================  ============================================================
Genes, miRNA, isoforms                    SRA download                                     SRA Toolkit
Genes, miRNA, isoforms                    Quality control                                  FastQC, Trim Galore
Genes, miRNA, isoforms                    Alignment and assembly                           Salmon, samtools, STAR, HISAT2, StringTie2
Genes, miRNA, isoforms, proteins, lipids  Data preprocessing                               R packages: edgeR, limma, sva, ggplot2, ComplexHeatmap
Proteins, lipids                          Specific for proteins and lipids                 R packages: preprocessCore, MSTUS normalization
Lipids                                    Specific for lipids                              R packages: lipidr
Genes, miRNA, isoforms, proteins, lipids  Differential expression analysis                 R packages: DESeq2, edgeR, RankProd, ggplot2, ComplexHeatmap
Genes, miRNA, isoforms, proteins, lipids  Correlation analysis                             R package: stats
Genes, miRNA, isoforms, proteins, lipids  Pathway enrichment analysis                      clusterProfiler, Biotranslator
Lipids                                    Specific for lipids pathway enrichment analysis  Custom tool: Lipidb
Genes, miRNA, isoforms, proteins          RIDDER (module to identify IRE1 substrates)      gRIDD, RNAeval, FIMO
Genes, miRNA, isoforms                    Functional annotation                            CPAT, SignalP, Pfam
Genes, miRNA, isoforms, proteins          Secondary structure prediction                   RNAfold, RNAeval
Genes, miRNA, isoforms, proteins          Find motif                                       FIMO
Isoforms                                  Genome-wide isoform analysis                     IsoformSwitchAnalyzeR
========================================  ===============================================  ============================================================

Contents