Programming with ChatGPT: How far can we go?

IRIS

Artificial intelligence (AI) has made remarkable strides, giving rise to the development of large language models such as ChatGPT. The chatbot has garnered significant attention from academia, industry, and the general public, marking the beginning of a new era in AI applications. This work explores how well ChatGPT can write source code. To this end, we performed a series of experiments to assess the extent to which ChatGPT is capable of solving general programming problems. Our objective is to assess ChatGPT's capabilities in two different programming languages, namely C++ and Java, by providing it with a set of programming problem, encompassing various types and difficulty levels. We focus on evaluating ChatGPT's performance in terms of code correctness, run-time efficiency, and memory usage. The experimental results show that, while ChatGPT is good at solving easy and medium programming problems written in C++ and Java, it encounters some difficulties with more complicated tasks in the two languages. Compared to code written by humans, the one generated by ChatGPT is of lower quality, with respect to runtime and memory usage.

Programming with ChatGPT: How far can we go?

Bucaioni, Alessio;Ekedahl, Hampus;Helander, Vilma;Nguyen, Phuong

2024-01-01

Abstract

Artificial intelligence (AI) has made remarkable strides, giving rise to the development of large language models such as ChatGPT. The chatbot has garnered significant attention from academia, industry, and the general public, marking the beginning of a new era in AI applications. This work explores how well ChatGPT can write source code. To this end, we performed a series of experiments to assess the extent to which ChatGPT is capable of solving general programming problems. Our objective is to assess ChatGPT's capabilities in two different programming languages, namely C++ and Java, by providing it with a set of programming problem, encompassing various types and difficulty levels. We focus on evaluating ChatGPT's performance in terms of code correctness, run-time efficiency, and memory usage. The experimental results show that, while ChatGPT is good at solving easy and medium programming problems written in C++ and Java, it encounters some difficulties with more complicated tasks in the two languages. Compared to code written by humans, the one generated by ChatGPT is of lower quality, with respect to runtime and memory usage.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2024
			
	Rivista
	
				MACHINE LEARNING WITH APPLICATIONS
			
	Codice DOI
	
				https://dx.doi.org/10.1016/j.mlwa.2024.100526
			
	Appare nelle tipologie:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
1-s2.0-S2666827024000021-main.pdf accesso aperto Tipologia: Documento in Versione Editoriale Licenza: Creative commons Dimensione 1.56 MB Formato Adobe PDF Visualizza/Apri	1.56 MB	Adobe PDF	Visualizza/Apri

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11697/242219

Citazioni

ND

ND

55

social impact