Classification and modeling of time series of astronomical data

Date
2018
Abstract
We are living in the era of Big Data, in which several tools have been developed to deal with large amounts of data. These technological advances have allowed the rise of astronomical surveys. These surveys are capable of taking observations of the sky and generating from them information ready to be analyzed. Among the available observations are light curves of astronomical objects such as variable stars, transients, or supernovae. Generally, the light curves are irregularly measured in time, since it is not always possible to get observational data from optical telescopes. This makes light curve analysis an interesting statistical challenge, because there are few statistical tools for analyzing irregular time series. In addition, due to the large number of light curves available in each survey, automated processes are also required to analyze all the information efficiently. Consequently, this thesis addresses two goals: the classification of light curves through the implementation of data mining algorithms, and their temporal modeling. Regarding the classification of light curves, our contribution was to develop a classifier for RR Lyrae variable stars in the VISTA Variables in the Vía Láctea (VVV) near-infrared survey. It is important to detect RR Lyrae stars since they are essential to building a three-dimensional map of the Galactic bulge. In this work, the focus is on RR Lyrae stars of type ab (RRab), i.e., fundamental-mode pulsators. The final classifier is built following eight key steps that include the choice of features, training set, selection of aperture, and family of classifiers. The best classification performance was obtained by the AdaBoost classifier, which achieves a harmonic mean between false positives and false negatives of ≈ 7%. The performance is estimated using cross-validation and through comparison with two independent data sets that were classified by human experts.
The implemented classifier has already made it possible to identify some RRab stars in the outer bulge and the southern Galactic disk areas of the VVV. In addition, I worked on modeling light curves and developed new models to fit irregularly spaced time series. Currently there are few tools to model this type of time series. One example is the continuous autoregressive model of order one, CAR(1); however, some assumptions must be satisfied in order to use this model. A new alternative for fitting irregular time series, which we call the irregular autoregressive model (IAR model), is proposed. The IAR model is a discrete representation of the CAR(1) model that provides more flexibility, since it is not limited to Gaussian time series. However, both the CAR(1) and IAR models are only able to estimate positive autocorrelations. In order to fit negatively correlated irregular time series, a complex irregular autoregressive model (CIAR model) was also developed. For both the IAR and CIAR models, maximum likelihood estimation procedures are proposed. Furthermore, the finite-sample performance of the parameter estimation is assessed by Monte Carlo simulations. Finally, some applications to astronomical data are proposed for both models. These include the detection of multiperiodic variable stars and the verification of the correct estimation of the parameters in models commonly used to fit astronomical light curves.
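As a rough illustration of the irregular autoregressive idea described above, the following sketch simulates a series at unevenly spaced times and recovers the autocorrelation parameter by maximizing a likelihood. The abstract does not state the model equations, so the recursion used here, y(t_j) = phi^(t_j - t_{j-1}) y(t_{j-1}) + sigma sqrt(1 - phi^(2(t_j - t_{j-1}))) e_j with 0 < phi < 1, is an assumption, as are the Gaussian likelihood, the known sigma, the grid search, and all function names. It is not the thesis implementation.

```python
import math
import random


def simulate_iar(times, phi, sigma=1.0, seed=0):
    """Simulate a series at increasing times under the assumed IAR recursion."""
    rng = random.Random(seed)
    y = [sigma * rng.gauss(0.0, 1.0)]
    for prev_t, t in zip(times, times[1:]):
        decay = phi ** (t - prev_t)  # dependence weakens as the time gap grows
        innov_sd = sigma * math.sqrt(1.0 - decay ** 2)
        y.append(decay * y[-1] + innov_sd * rng.gauss(0.0, 1.0))
    return y


def neg_log_lik(phi, times, y, sigma=1.0):
    """Gaussian negative log-likelihood in phi (sigma assumed known)."""
    nll = 0.5 * (math.log(2 * math.pi * sigma ** 2) + y[0] ** 2 / sigma ** 2)
    for j in range(1, len(y)):
        decay = phi ** (times[j] - times[j - 1])
        var = sigma ** 2 * (1.0 - decay ** 2)
        resid = y[j] - decay * y[j - 1]
        nll += 0.5 * (math.log(2 * math.pi * var) + resid ** 2 / var)
    return nll


def fit_phi(times, y, grid_size=199):
    """Crude maximum likelihood estimate of phi by grid search on (0, 1)."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return min(grid, key=lambda p: neg_log_lik(p, times, y))


# Irregular observation times: gaps drawn uniformly between 0.1 and 2.0.
gap_rng = random.Random(42)
times = [0.0]
for _ in range(999):
    times.append(times[-1] + gap_rng.uniform(0.1, 2.0))

y = simulate_iar(times, phi=0.7, seed=1)
phi_hat = fit_phi(times, y)
print(f"true phi = 0.7, estimated phi = {phi_hat:.3f}")
```

Repeating the simulation over many seeds and summarizing the estimates would give a small Monte Carlo study of finite-sample performance, in the spirit of the assessment mentioned in the abstract. Because phi is restricted to (0, 1) in this sketch, it can only represent positive autocorrelation, which is the limitation the CIAR model is described as addressing.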
Description
Thesis (Doctorate in Statistics)--Pontificia Universidad Católica de Chile, 2018