Quality by Prompt: LLM-Powered Transformation of Data Quality Requirements Into Great Expectations

Abughazala, Moamin; Muccini, Henry; Sharaf, Mohammad
2026-01-01

Abstract

Ensuring data quality is critical for reliable decision-making, analytics, and machine learning applications. Traditional data validation methods often depend on manually defined quality rules, a process that is time-consuming, error-prone, and difficult to scale. Great Expectations (GEs) is a widely adopted framework for data validation; however, crafting its rules by hand introduces challenges in scalability, domain adaptability, and syntactic complexity. This study explores the use of Large Language Models (LLMs) to automate the conversion of natural language data quality requirements into structured GEs validation rules. We fine-tune the LLaMA-3.2-3B-bnb-4bit model using Low-Rank Adaptation (LoRA) on real-world datasets sourced from the telecommunications and IT sectors. To evaluate the effectiveness of this approach, we apply standard NLP metrics (ROUGE, BLEU, METEOR, and BERTScore) alongside practical QA metrics such as rule completeness and manual effort reduction. Our results demonstrate that the fine-tuned LLM significantly outperforms generic models, generating rules with greater fluency, accuracy, and domain alignment.
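
To make the task concrete, the sketch below shows what one requirement-to-rule pair might look like and how a generated rule could be compared against a reference. It is an illustration under stated assumptions, not the authors' pipeline: the requirement text, the column name `customer_id`, and the helpers `parse_rule` and `token_f1` are hypothetical; the rule is written in the `expectation_type`/`kwargs` dictionary layout used by legacy Great Expectations suite files (newer releases may name these keys differently); and `token_f1` is a simplified unigram-overlap score standing in for ROUGE-1, not the metric implementation used in the study.

```python
import json
import re
from collections import Counter

# Hypothetical requirement/rule pair; not taken from the paper's datasets.
requirement = "The customer_id column must never be empty."
reference_rule = {
    "expectation_type": "expect_column_values_to_not_be_null",
    "kwargs": {"column": "customer_id"},
}


def parse_rule(model_output: str) -> dict:
    """Parse model output as JSON and check the minimal rule shape."""
    rule = json.loads(model_output)
    if not isinstance(rule, dict):
        raise ValueError("rule must be a JSON object")
    name = rule.get("expectation_type")
    if not isinstance(name, str) or not name.startswith("expect_"):
        raise ValueError("missing or malformed 'expectation_type'")
    if not isinstance(rule.get("kwargs"), dict):
        raise ValueError("missing or malformed 'kwargs'")
    return rule


def _tokens(rule: dict) -> Counter:
    # Serialize with sorted keys so key order does not affect the score.
    text = json.dumps(rule, sort_keys=True).lower()
    return Counter(re.findall(r"[a-z0-9]+", text))


def token_f1(candidate: dict, reference: dict) -> float:
    """Unigram-overlap F1 between two rules (simplified ROUGE-1 stand-in)."""
    cand, ref = _tokens(candidate), _tokens(reference)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Stand-in for text returned by a fine-tuned model.
    generated = parse_rule(json.dumps(reference_rule))
    print(requirement)
    print(json.dumps(generated, indent=2))
    print("token F1 vs. reference:", token_f1(generated, reference_rule))
```

A structural check such as `parse_rule` complements the text-overlap metrics: a rule can score well on token overlap and still be unusable if it is not valid JSON or lacks the expected keys.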
9783032041890
9783032041906
Use this identifier to cite or link to this document: https://hdl.handle.net/11697/284163