Taming latency at the edge: A user-aware service placement approach

Modern network and computing infrastructures must meet the stringent demands of today's applications. A pivotal concern is minimizing the latency experienced by end-users accessing services. While emerging network architectures provide a favorable setting for orchestrating microservices in terms of reliability, self-healing and resiliency, incorporating awareness of the latency perceived by the user into placement decisions remains an unresolved problem. Current research addresses the problem of minimizing inter-service latency without any guarantee on the latency from the end-user to the cluster. In this research, we introduce an architectural approach for scheduling service workloads within a given cluster, prioritizing placement on the node that offers the lowest perceived latency to the end-user. To validate the approach, we present an implementation on Kubernetes (K8s), currently one of the most widely used workload orchestration platforms. Experimental results show that our approach effectively reduces the latency experienced by the end-user in finite time without degrading the quality of service. We study the performance of the proposed approach by analyzing different parameters, with a particular focus on the size of the cluster and the number of replica pods used to measure the latency. We provide insights into possible trade-offs between computational cost and convergence time.

Centofanti C.;Tiberti W.;Marotta A.;Graziosi F.;Cassioli D.

2024-01-01

Year: 2024
Journal: Computer Networks
DOI: https://dx.doi.org/10.1016/j.comnet.2024.110444
Publication type: 1.1 Journal article

Files in this record:

File	Size	Format
1-s2.0-S1389128624002767-main.pdf (open access; type: post-print document; license: Creative Commons)	2.96 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11697/231599
