Comparing HISAT and STAR-based pipelines for RNA-Seq Data Analysis: A real experience


One of the first steps in RNA-Sequencing (RNA-Seq) data analysis is aligning Next Generation Sequencing reads to a reference genome. The literature offers several tools for this alignment step, but two have become the de facto standard in bioinformatics pipelines: HISAT (version 2) and STAR (version 2). The aim of this study is to determine the impact of the alignment tool on RNA-Seq analysis in terms of the biological relevance of the results and computational time. The two implemented pipelines return different results on the biological side, owing to the assumptions each tool makes and to the specific characteristics of the underlying (statistical) models. The study provides insights for researchers interested in optimizing their RNA-Seq pipelines and making informed decisions about which pipeline to use. As a lesson learned, we suggest that bioinformatics researchers run more than one pipeline in their experiments to reduce the prediction errors induced by the assumptions of a specific tool or method.


Bianchi A. (Methodology); Di Marco A. (Supervision); Pellegrini C. (Writing – Review & Editing)

2023-01-01



Year: 2023
ISBN: 979-8-3503-1224-9
Publication type: 4.1 Contribution in conference proceedings



Use this identifier to cite or link to this document: https://hdl.handle.net/11697/223526
