Time series representation for classification : a motif-based approach ; Représentation de séries temporelles pour la classification : une approche basée sur la découverte automatique de motifs
In: https://theses.hal.science/tel-01922186 ; Data Structures and Algorithms [cs.DS]. Université Pierre et Marie Curie - Paris VI, 2017. English. ⟨NNT : 2017PA066593⟩, 2017
Online
Thesis
The research described in this thesis concerns learning a motif-based representation of time series for automatic classification. Meaningful information in time series can be encoded across time through trends, shapes, or subsequences, usually with distortions. Approaches have been developed to overcome these issues, often at the price of high computational complexity. Among these techniques, distance measures and time series representations are worth pointing out. We focus on the representation of the information contained in the time series. We propose a framework that generates a new time series representation for classical feature-based classification, based on the discovery of discriminant sets of time series subsequences (motifs). The framework transforms a set of time series into a feature space using subsequences enumerated from the time series, distance measures, and aggregation functions. One particular instance of this framework is the well-known shapelet approach. The potential drawback of such an approach is the large number of subsequences to enumerate, which induces a very large feature space and very high computational complexity. We show that most subsequences in a time series dataset are redundant. Random sampling can therefore be used to draw a very small fraction of the exhaustive set of subsequences while preserving the information necessary for classification, yielding a much smaller feature space that is compatible with common machine learning algorithms and computationally tractable. We also demonstrate that the number of subsequences to draw is not linked to the number of instances in the training set, which guarantees the scalability of the approach.
The combination of the latter in the context of our framework enables us to take advantage of advanced techniques (such as multivariate feature selection techniques) to discover richer motif-based time series representations for classification, for example by taking into account the relationships between the ...
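As an illustration only, the transformation described in the abstract (randomly drawn subsequences, a distance measure, and an aggregation function producing a feature space) might be sketched as follows. This is not the thesis's implementation: it assumes equal-length univariate series stored as rows of a NumPy array, Euclidean distance as the distance measure, and the minimum over sliding windows as the aggregation function. The function names `sample_subsequences`, `min_distance`, and `transform` are hypothetical.

```python
import numpy as np


def sample_subsequences(series, n_subsequences, min_len, max_len, rng):
    """Draw random subsequences (candidate motifs) from a 2D array of series.

    Assumes min_len <= max_len <= series.shape[1].
    """
    subsequences = []
    for _ in range(n_subsequences):
        ts = series[rng.integers(len(series))]          # pick a random series
        length = rng.integers(min_len, max_len + 1)     # pick a random length
        start = rng.integers(0, len(ts) - length + 1)   # pick a random offset
        subsequences.append(ts[start:start + length].copy())
    return subsequences


def min_distance(ts, sub):
    """Smallest Euclidean distance between `sub` and any window of `ts`."""
    windows = np.lib.stride_tricks.sliding_window_view(ts, len(sub))
    return np.sqrt(((windows - sub) ** 2).sum(axis=1)).min()


def transform(series, subsequences):
    """Map each series to one feature per sampled subsequence."""
    return np.array(
        [[min_distance(ts, sub) for sub in subsequences] for ts in series]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_series = rng.normal(size=(10, 50))   # 10 toy series of length 50
    motifs = sample_subsequences(toy_series, 20, 5, 15, rng)
    features = transform(toy_series, motifs)
    print(features.shape)                    # one row per series, one column per motif
```

The resulting matrix can be passed to any standard feature-based classifier. Note that the number of columns is set by the number of sampled subsequences, not by the number of training series.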
Title: | Time series representation for classification : a motif-based approach ; Représentation de séries temporelles pour la classification : une approche basée sur la découverte automatique de motifs |
---|---|
Author / Contributor: | Renard, Xavier ; Learning, Fuzzy and Intelligent systems (LFI) ; Laboratoire d'Informatique de Paris 6 (LIP6) ; Université Pierre et Marie Curie - Paris 6 (UPMC)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre et Marie Curie - Paris 6 (UPMC)-Centre National de la Recherche Scientifique (CNRS) ; Université Pierre et Marie Curie - Paris, VI ; Detyniecki, Marcin ; Rifqi, Maria |
Source: | https://theses.hal.science/tel-01922186 ; Data Structures and Algorithms [cs.DS]. Université Pierre et Marie Curie - Paris VI, 2017. English. ⟨NNT : 2017PA066593⟩, 2017 |
Publication: | HAL CCSD, 2017 |
Media type: | Thesis |