GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation

Gong, J.; Li, S.; D'Aloisio, G.; Ding, Z.; Ye, Y.; Langdon, W. B.; Sarro, F.

doi:10.1007/978-3-031-64573-0_7

Tuning the parameters and prompts of AI-based text-to-image generation remains a substantial yet unaddressed challenge. We therefore introduce GreenStableYolo, which uses NSGA-II and Yolo to optimize the parameters and prompts of Stable Diffusion, both reducing GPU inference time and increasing image quality. Our experiments show that, despite a relatively slight trade-off (18%) in image quality compared to StableYolo (which considers only image quality), GreenStableYolo achieves a substantial reduction in inference time (266% less) and a 526% higher hypervolume, thereby advancing the state of the art in text-to-image generation.
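The hypervolume figure quoted in the abstract is a standard multi-objective quality indicator: the size of the objective-space region dominated by a set of non-dominated solutions, measured against a reference point. The following is a minimal sketch of that idea for the two objectives named here (inference time to minimize, image quality to maximize). It is not the authors' implementation; the function names, the sample points, and the reference point are all illustrative assumptions and do not come from the paper.

```python
from typing import List, Tuple

# Hypothetical representation: (inference_time_seconds, image_quality_score).
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no slower and no worse in quality than b, and strictly better in one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front, bounded by ref = (worst time, worst quality)."""
    # On a front with time minimized and quality maximized, sorting by
    # ascending time also sorts by ascending quality.
    front = sorted(set(pareto_front(points)))
    area = 0.0
    prev_quality = ref[1]
    for time_s, quality in front:
        if time_s >= ref[0] or quality <= prev_quality:
            continue  # point lies outside the region bounded by ref
        # Horizontal strip between the previous and current quality levels.
        area += (ref[0] - time_s) * (quality - prev_quality)
        prev_quality = quality
    return area


# Made-up sample values, for illustration only (not results from the paper).
candidates = [(1.0, 0.5), (2.0, 0.8), (3.0, 0.6)]
reference = (4.0, 0.0)
front = pareto_front(candidates)
hv = hypervolume_2d(candidates, reference)
```

In this sketch, a larger hypervolume means the front covers more of the region between the reference point and the ideal corner (low time, high quality), which is why a single-objective optimizer that ignores inference time can score lower on this indicator despite higher image quality.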

2024-01-01

Year: 2024
ISBN: 9783031645723, 9783031645730
Appears in categories: 4.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
978-3-031-64573-0_7.pdf (authorized users only; type: publisher's version; license: publisher's copyright)	277.21 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11697/263040
