Spatiotemporal modeling of count data

Date
2021
Abstract
Modeling spatial and spatio-temporal data is a challenging task in statistics. In many applications, the observed data can be modeled using Gaussian, skew-Gaussian or even restricted random field models. However, in several fields, such as population genetics, epidemiology and aquaculture, among others, the data of interest are often counts, and the models mentioned above are therefore not suitable for their analysis. Consequently, there is a need for spatial and spatio-temporal models that can properly describe data arising from counting processes. Two approaches are commonly used to model this type of data: generalized linear mixed models (GLMMs) with Gaussian random field (GRF) effects, and copula models. Unfortunately, these approaches do not give an explicit characterization of the count random field, such as its q-dimensional distribution or correlation function. It is important to stress that GLMMs induce a discontinuity in the sample paths. As a result, the correlation function is not continuous at the origin, and samples located nearby are more dissimilar than in the continuous case. Moreover, there are cases in which the copula representation for discrete distributions is not unique, so it is not identifiable. To deal with these issues, we propose a novel approach to model spatial and spatio-temporal count data in an efficient and accurate manner. Briefly, starting from independent copies of a “parent” GRF, a set of transformations is applied, and the result is a non-Gaussian random field. This approach is based on the characterization of count random fields that inherit some of the well-known geometric properties of GRFs. For instance, if one chooses an isotropic correlation function for the parent GRF, then the count random field also has an isotropic correlation function. First, we define a general class of count random fields. Then, three particular count random fields are studied. 
The first is a Poisson random field, the second is a count random field that accounts for excess zeros, and the last is a count random field that accounts for over-dispersion. Additionally, a simulation study is carried out to assess the performance of the proposed models, evaluating them under several simulation scenarios with varying parameter values. The results show accurate parameter estimates across the different scenarios. We also assess the performance of the optimal linear prediction of the proposed models and compare it with GLMMs and copula models. The results show that the proposed models perform better than GLMMs and similarly to copula models. Finally, we analyze two real data applications. The first considers a zero-inflated version of the proposed Poisson random field to deal with excess zeros, and the second considers an over-dispersed count random field.
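The discontinuity at the origin attributed to GLMMs in the abstract can be illustrated with a short calculation. The sketch below assumes a Poisson GLMM with a log link and a latent Gaussian field, which is a common choice but is not specified in the abstract; the function name and the parameters `mu` and `sigma2` are illustrative and are not taken from the thesis. Under that assumption, the correlation between two counts has a closed form in terms of the latent correlation `rho`, and it stays strictly below 1 even when `rho` equals 1, that is, when the two locations coincide in the limit.

```python
import math


def glmm_count_correlation(rho, mu=0.0, sigma2=1.0):
    """Correlation of two counts under an assumed Poisson log-Gaussian GLMM.

    Assumed model (illustrative, not taken from the thesis):
      Y_i | Z ~ Poisson(exp(mu + sqrt(sigma2) * Z_i)), i = 1, 2,
      conditionally independent given Z, where (Z_1, Z_2) are standard
      normal with correlation rho.
    """
    # Mean of the log-normal intensity exp(mu + sqrt(sigma2) * Z).
    m = math.exp(mu + sigma2 / 2.0)
    # The covariance of the counts equals the covariance of the intensities.
    cov = m ** 2 * (math.exp(sigma2 * rho) - 1.0)
    # Count variance = Poisson noise (m) + variance of the intensity.
    var = m + m ** 2 * (math.exp(sigma2) - 1.0)
    return cov / var


# As the latent correlation approaches 1 (distance approaching 0),
# the correlation of the counts does not approach 1.
for rho in (0.5, 0.9, 0.99, 1.0):
    print(f"latent rho = {rho:4.2f} -> count correlation = "
          f"{glmm_count_correlation(rho):.4f}")
```

The gap between the value at `rho = 1` and 1 comes from the Poisson noise term `m` in the variance, which acts like a nugget effect. It shrinks as the mean intensity grows and widens for small counts.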
Description
Thesis (Doctor in Statistics)--Pontificia Universidad Católica de Chile, 2021