Discovering the top-k unexplained sequences in time-stamped observation data

There are numerous applications where we wish to discover unexpected activities in a sequence of time-stamped observation data; for instance, we may want to detect inexplicable events in transactions at a website or in video of an airport tarmac. In this paper, we start with a known set A of activities (both innocuous and dangerous) that we wish to monitor. In addition, we wish to identify "unexplained" subsequences of an observation sequence, i.e., subsequences that are poorly explained by A (e.g., because they may contain occurrences of activities that have never been seen or anticipated before and are therefore not in A). We formally define the probability that a sequence of observations is unexplained (totally or partially) w.r.t. A. We develop efficient algorithms to identify the top-k totally and partially unexplained sequences w.r.t. A. These algorithms leverage theorems that enable us to speed up the search for totally/partially unexplained sequences. We describe experiments using real-world video and cyber-security data sets showing that our approach works well in practice in terms of both running time and accuracy. © 2014 IEEE.

Albanese M.; Molinaro C.; Persia F.; Picariello A.; Subrahmanian V. S.

2014-01-01

Year: 2014
Journal: IEEE Transactions on Knowledge and Data Engineering
DOI: https://dx.doi.org/10.1109/TKDE.2013.33
Appears in types: 1.1 Journal article

Files in this record:

File	Size	Format
J2 - TKDE14.pdf (authorized users only; type: publisher's version; license: Creative Commons)	3.94 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11697/166564
