Explainable machine learning for more robust models of tsunami-induced fatalities
Scorzini, Anna Rita
2026-01-01
Abstract
Understanding the factors that control tsunami-induced fatalities remains a major challenge due to the complex, nonlinear interplay between hazard intensity, evacuation dynamics and social vulnerability. Previous empirical approaches, largely based on simple regressions, have offered valuable insights but remain limited in their ability to represent threshold effects and context-specific interactions. In this study, an explainable machine learning framework is applied to the 2011 Tōhoku tsunami, leveraging a town-level fatality dataset enriched with explanatory variables that describe hazard and evacuation conditions, demographic structure and local effects on tsunami propagation. A Random Forest regression model is trained to explore the relative influence and conditional behavior of these predictors, using permutation-based importance and individual conditional expectation analyses. Results show that evacuation efficiency dominates the explanation of observed fatality ratios, followed by inundation depth and demographic composition. Nonlinear thresholds, beyond which fatalities rise sharply, emerge for both hazard and evacuation variables. Built-environment characteristics, such as shielding and debris generation potential, further modulate outcomes through complex interactions with hazard and mobility conditions. Overall, although based on a relatively small dataset, the analysis highlights that tsunami mortality is driven by multiple, interdependent processes rather than single controlling factors. The combination of machine learning and interpretability tools thus provides new insights into the mechanisms governing loss of life and supports actionable priorities for risk reduction, including improved evacuation planning, targeted protection of aging communities and more resilient urban design.
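As a rough illustration of the workflow named in the abstract (Random Forest regression, permutation-based importance, individual conditional expectation curves), the following minimal sketch uses scikit-learn on synthetic data. It is not the study's code or dataset: the feature names (`evacuation_rate`, `inundation_depth_m`, `elderly_share`), the synthetic fatality-ratio formula, the sample size and all hyperparameters are assumptions chosen only to make the example run. The ICE curves are computed by hand rather than with a library helper, so that each step is visible.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a town-level dataset (all values are made up).
rng = np.random.default_rng(0)
n = 300
names = ["evacuation_rate", "inundation_depth_m", "elderly_share"]
X = np.column_stack([
    rng.uniform(0.0, 1.0, n),    # hypothetical share of residents who evacuated
    rng.uniform(0.0, 10.0, n),   # hypothetical inundation depth in metres
    rng.uniform(0.1, 0.5, n),    # hypothetical share of elderly residents
])
# Assumed fatality ratio: dominated by evacuation, with a depth threshold at 5 m.
y = (0.2 * (1.0 - X[:, 0]) ** 2
     + 0.02 * (X[:, 1] > 5.0)
     + 0.01 * X[:, 2]
     + rng.normal(0.0, 0.005, n))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Permutation importance: drop in held-out score when one column is shuffled.
result = permutation_importance(
    model, X_test, y_test, n_repeats=20, random_state=0
)
ranking = [names[i] for i in np.argsort(result.importances_mean)[::-1]]


def ice_curves(fitted_model, data, feature_idx, grid):
    """Return one prediction curve per sample, varying a single feature."""
    curves = np.empty((data.shape[0], grid.size))
    for j, value in enumerate(grid):
        modified = data.copy()
        modified[:, feature_idx] = value  # force the feature to the grid value
        curves[:, j] = fitted_model.predict(modified)
    return curves


grid = np.linspace(0.0, 1.0, 11)
ice = ice_curves(model, X_test, 0, grid)  # ICE for the evacuation feature

print("Importance ranking:", ranking)
print("ICE array shape (samples, grid points):", ice.shape)
```

Plotting each row of `ice` against `grid` gives one curve per town. In this synthetic setup a threshold would show up as a sharp change in slope shared by many curves, while curves that diverge from each other would point to interactions with the other predictors.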


