The graph edit distance (GED) is among the most widely used graph similarity measures in practice. It asks for a minimum cost edit path between two given labeled graphs G and H, where the edit path is defined as a sequence of operations (e.g., node and edge insertions, deletions or substitutions) that successively transform the graph G into H. In this work, we suggest a new ILP formulation (FORI) based on orienting the corresponding edge variables. Moreover, we suggest enhancing two state-of-the-art ILP formulations by incorporating additional inequalities. We theoretically compare the strength of the formulations with respect to their Linear Programming relaxations. The result is a hierarchy with (FORI) at the top. Our extensive evaluation on widely used benchmark sets shows that our improved formulations run significantly faster than the previous ones. These allow to solve to proven optimality all the reference instances from common databases, such as the IAM Graph Database, many of which were prohibitive with state-of-the-art methods. Moreover, we are able to compute the GED of a small pattern and a large graph such as CORA and PUBMED, having up to 19,717 nodes and 44,327 edges.
Enhancing Graph Edit Distance Computation: Stronger and Orientation-Based ILP Formulations
D'Ascenzo, Andrea;Rossi, Fabrizio
2025-01-01
Abstract
The graph edit distance (GED) is among the most widely used graph similarity measures in practice. It asks for a minimum cost edit path between two given labeled graphs G and H, where the edit path is defined as a sequence of operations (e.g., node and edge insertions, deletions or substitutions) that successively transform the graph G into H. In this work, we suggest a new ILP formulation (FORI) based on orienting the corresponding edge variables. Moreover, we suggest enhancing two state-of-the-art ILP formulations by incorporating additional inequalities. We theoretically compare the strength of the formulations with respect to their Linear Programming relaxations. The result is a hierarchy with (FORI) at the top. Our extensive evaluation on widely used benchmark sets shows that our improved formulations run significantly faster than the previous ones. These allow to solve to proven optimality all the reference instances from common databases, such as the IAM Graph Database, many of which were prohibitive with state-of-the-art methods. Moreover, we are able to compute the GED of a small pattern and a large graph such as CORA and PUBMED, having up to 19,717 nodes and 44,327 edges.| File | Dimensione | Formato | |
|---|---|---|---|
|
p4737-d_ascenzo.pdf
solo utenti autorizzati
Tipologia:
Documento in Versione Editoriale
Licenza:
Creative commons
Dimensione
1.04 MB
Formato
Adobe PDF
|
1.04 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


