Enhancing Graph Edit Distance Computation: Stronger and Orientation-Based ILP Formulations

IRIS

The graph edit distance (GED) is among the most widely used graph similarity measures in practice. It asks for a minimum cost edit path between two given labeled graphs G and H, where the edit path is defined as a sequence of operations (e.g., node and edge insertions, deletions or substitutions) that successively transform the graph G into H. In this work, we suggest a new ILP formulation (FORI) based on orienting the corresponding edge variables. Moreover, we suggest enhancing two state-of-the-art ILP formulations by incorporating additional inequalities. We theoretically compare the strength of the formulations with respect to their Linear Programming relaxations. The result is a hierarchy with (FORI) at the top. Our extensive evaluation on widely used benchmark sets shows that our improved formulations run significantly faster than the previous ones. These allow to solve to proven optimality all the reference instances from common databases, such as the IAM Graph Database, many of which were prohibitive with state-of-the-art methods. Moreover, we are able to compute the GED of a small pattern and a large graph such as CORA and PUBMED, having up to 19,717 nodes and 44,327 edges.

Enhancing Graph Edit Distance Computation: Stronger and Orientation-Based ILP Formulations

D'Ascenzo, Andrea;Meffert, Julian;Mutzel, Petra;Rossi, Fabrizio

2025-01-01

Abstract

The graph edit distance (GED) is among the most widely used graph similarity measures in practice. It asks for a minimum cost edit path between two given labeled graphs G and H, where the edit path is defined as a sequence of operations (e.g., node and edge insertions, deletions or substitutions) that successively transform the graph G into H. In this work, we suggest a new ILP formulation (FORI) based on orienting the corresponding edge variables. Moreover, we suggest enhancing two state-of-the-art ILP formulations by incorporating additional inequalities. We theoretically compare the strength of the formulations with respect to their Linear Programming relaxations. The result is a hierarchy with (FORI) at the top. Our extensive evaluation on widely used benchmark sets shows that our improved formulations run significantly faster than the previous ones. These allow to solve to proven optimality all the reference instances from common databases, such as the IAM Graph Database, many of which were prohibitive with state-of-the-art methods. Moreover, we are able to compute the GED of a small pattern and a large graph such as CORA and PUBMED, having up to 19,717 nodes and 44,327 edges.

Scheda breve

Scheda completa

Scheda completa (DC)

Anno

2025

Appare nelle tipologie:

4.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
p4737-d_ascenzo.pdf solo utenti autorizzati Tipologia: Documento in Versione Editoriale Licenza: Creative commons Dimensione 1.04 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	1.04 MB	Adobe PDF	Visualizza/Apri Richiedi una copia

Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11697/270559

Citazioni

ND

0

1

social impact