Time Series Distance Measures: Segmentation, Classification, and Clustering of Temporal Data

Author: Stephan Spiegel
Source: Technische Universität Berlin (PhD Thesis)

Time series can be found in domains as diverse as medicine, astronomy, geophysics, engineering, and quantitative finance. In general, a time series is a sequence of data points measured at successive, uniformly spaced points in time. This thesis is concerned with time series mining, including segmentation, classification, and clustering of temporal data. Many algorithms for these tasks depend on pairwise (dis)similarity comparisons of (sub)sequences, which explains the continued research interest in time series distance measures as an important subroutine. In the course of this work we introduce several novel distance measures that describe time series characteristics which may distinguish the individual classes contained in the data. Our proposed distance measures address frequently encountered issues, such as the processing of multivariate data, the computational complexity of pairwise (dis)similarity comparisons, the invariance required for temporal data with distortions, the separation of mixed signals, and the analysis of nonlinear systems. Our work contributes to the time series community by introducing novel approaches to pattern recognition in temporal data, presenting various sensor fusion techniques for multivariate measurements, offering efficient and robust distance measures for fast time series classification, introducing previously disregarded invariance and proposing corresponding distance measures, comparing various machine learning algorithms for signal separation, and providing nonlinear models for time series mining. Beyond these theoretical contributions, we demonstrate that our proposed time series distance measures are beneficial in real-world applications, including the optimization of vehicle engines with regard to exhaust emissions and the optimization of heating control in terms of energy efficiency.
Finally, we present several purpose-built time series mining tools that implement the introduced distance measures and provide graphical user interfaces for straightforward parameter setting and exploratory data analysis.
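
To illustrate why a distance measure acts as a subroutine in classification and clustering, the following minimal Python sketch computes a pairwise distance matrix and uses it for nearest-neighbor classification. It is not one of the measures proposed in the thesis: plain Euclidean distance stands in as a placeholder, and the function names (`euclidean`, `distance_matrix`, `nearest_neighbor_label`) and the toy data are assumptions made purely for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length univariate series."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(series, dist=euclidean):
    """Symmetric matrix of pairwise distances; costs O(n^2) calls to dist."""
    n = len(series)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(series[i], series[j])
    return d


def nearest_neighbor_label(query, train, labels, dist=euclidean):
    """1-NN classification: return the label of the closest training series."""
    best = min(range(len(train)), key=lambda i: dist(query, train[i]))
    return labels[best]


# Toy data (hypothetical): two rising series and one falling series.
train = [
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.1, 2.1, 2.9],
    [3.0, 2.0, 1.0, 0.0],
]
labels = ["rising", "rising", "falling"]

D = distance_matrix(train)  # input for distance-based clustering
predicted = nearest_neighbor_label([2.8, 2.1, 0.9, 0.2], train, labels)
```

Because `dist` is passed as a parameter, any other distance measure with the same two-argument signature could be swapped in without changing the classification or clustering code, which is what makes the choice of measure a separable design decision.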