Binary and multi-class classification of Self-Admitted Technical Debt: How far can we go?

IRIS

Context: Aiming for a trade-off between short-term efficiency and long-term stability, software teams resort to sub-optimal solutions, neglecting the best software development practices. Such solutions may induce technical debt (TD), triggering maintenance issues. To facilitate future fixing, developers mark code with any issues using textual comments, resulting in Self-Admitted Technical Debt (SATD). Detecting SATD in source code is crucial since it helps programmers locate potentially erroneous snippets, allowing for suitable interventions, and improving code quality. There are two main types of SATD detection, i.e., binary classification and multi-class classification, grouping TD comments into SATD/Non-SATD categories, and multiple categories, respectively. Objective: We attempt to understand to which extent state-of-the-art research has addressed the issue of detecting SATD, both binary and multi-class classification. Based on this investigation, we also propose a practical approach for the detection of SATD using Large Language Models (LLMs). Methods: First, we conducted a literature review to understand to which extent the two types of classification have been tackled by existing research. Second, we developed SALA, a dual-purpose tool on top of Natural Language Processing (NLP) techniques and neural networks to deal with both types of classification. An empirical evaluation has been performed to compare SALA with state-of-the-art baselines. Results: The literature review reveals that while binary classification has been well studied, multi-class classification has not received adequate attention. The empirical evaluation shows that SALA obtains a promising performance, and outperforms the baselines with respect to various quality metrics. Conclusion: We conclude that more effort needs to be spent to tackle multi-class classification of SATD. To this end, LLMs hold the potential, albeit with more rigorous investigation on possible fine-tuning and prompt engineering strategies.

Binary and multi-class classification of Self-Admitted Technical Debt: How far can we go?

Arcelli Fontana, Francesca;Di Rocco, Juri;Di Ruscio, Davide;Di Salle, Amleto;Nguyen, Phuong

2025-01-01

Abstract

Context: Aiming for a trade-off between short-term efficiency and long-term stability, software teams resort to sub-optimal solutions, neglecting the best software development practices. Such solutions may induce technical debt (TD), triggering maintenance issues. To facilitate future fixing, developers mark code with any issues using textual comments, resulting in Self-Admitted Technical Debt (SATD). Detecting SATD in source code is crucial since it helps programmers locate potentially erroneous snippets, allowing for suitable interventions, and improving code quality. There are two main types of SATD detection, i.e., binary classification and multi-class classification, grouping TD comments into SATD/Non-SATD categories, and multiple categories, respectively. Objective: We attempt to understand to which extent state-of-the-art research has addressed the issue of detecting SATD, both binary and multi-class classification. Based on this investigation, we also propose a practical approach for the detection of SATD using Large Language Models (LLMs). Methods: First, we conducted a literature review to understand to which extent the two types of classification have been tackled by existing research. Second, we developed SALA, a dual-purpose tool on top of Natural Language Processing (NLP) techniques and neural networks to deal with both types of classification. An empirical evaluation has been performed to compare SALA with state-of-the-art baselines. Results: The literature review reveals that while binary classification has been well studied, multi-class classification has not received adequate attention. The empirical evaluation shows that SALA obtains a promising performance, and outperforms the baselines with respect to various quality metrics. Conclusion: We conclude that more effort needs to be spent to tackle multi-class classification of SATD. To this end, LLMs hold the potential, albeit with more rigorous investigation on possible fine-tuning and prompt engineering strategies.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2025
			
	Rivista
	
				INFORMATION AND SOFTWARE TECHNOLOGY
			
	Codice DOI
	
				https://dx.doi.org/10.1016/j.infsof.2025.107862
			
	Appare nelle tipologie:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
1-s2.0-S0950584925002010-main.pdf accesso aperto Tipologia: Documento in Versione Editoriale Licenza: Creative commons Dimensione 1.85 MB Formato Adobe PDF Visualizza/Apri	1.85 MB	Adobe PDF	Visualizza/Apri

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11697/275579

Citazioni

ND

0

0

social impact