Browsing by Author "Quintana, Fernando A."
- A new family of slash-distributions with elliptical contours (ELSEVIER SCIENCE BV, 2007) Gomez, Hector W.; Quintana, Fernando A.; Torres, Francisco J. We introduce a new family of univariate and multivariate slash-distributions. Our construction is based on elliptical distributions. We define the new family by means of a stochastic representation as the scale mixture of an elliptically distributed random variable with respect to the power of a U(0, 1) random variable. The same idea is extended to the multivariate case. We study general properties of the resulting families, including their moments. We illustrate special cases of interest, such as Normal, Cauchy, Student-t, Type II Pearson and Kotz-type distributions. (c) 2007 Elsevier B.V. All rights reserved.
- A predictive view of Bayesian clustering (ELSEVIER, 2006) Quintana, Fernando A. This work considers probability models for partitions of a set of n elements using a predictive approach, i.e., models that are specified in terms of the conditional probability of either joining an already existing cluster or forming a new one. The inherent structure can be motivated by resorting to hierarchical models of either parametric or nonparametric nature. Parametric examples include the product partition models (PPMs) and the model-based approach of Dasgupta and Raftery (J. Amer. Statist. Assoc. 93 (1998) 294), while nonparametric alternatives include the Dirichlet process, and more generally, the species sampling models (SSMs). Under exchangeability, PPMs and SSMs induce the same type of partition structure. The methods are discussed in the context of outlier detection in normal linear regression models and of (univariate) density estimation. (c) 2004 Elsevier B.V. All rights reserved.
- A semiparametric Bayesian model for repeatedly repeated binary outcomes (WILEY-BLACKWELL, 2008) Quintana, Fernando A.; Mueller, Peter; Rosner, Gary L.; Relling, Mary V. We discuss the analysis of data from single-nucleotide polymorphism arrays comparing tumour and normal tissues. The data consist of sequences of indicators for loss of heterozygosity (LOH) and involve three nested levels of repetition: chromosomes for a given patient, regions within chromosomes and single-nucleotide polymorphisms nested within regions. We propose to analyse these data by using a semiparametric model for multilevel repeated binary data. At the top level of the hierarchy we assume a sampling model for the observed binary LOH sequences that arises from a partial exchangeability argument. This implies a mixture of Markov chains model. The mixture is defined with respect to the Markov transition probabilities. We assume a non-parametric prior for the random-mixing measure. The resulting model takes the form of a semiparametric random-effects model with the matrix of transition probabilities being the random effects. The model includes appropriate dependence assumptions for the two remaining levels of the hierarchy, i.e. for regions within chromosomes and for chromosomes within patient. We use the model to identify regions of increased LOH in a data set coming from a study of treatment-related leukaemia in children with an initial cancer diagnostic. The model successfully identifies the desired regions and performs well compared with other available alternatives.
- Bayesian first order auto-regressive latent variable models for multiple binary sequences (SAGE PUBLICATIONS LTD, 2011) Giardina, Federica; Guglielmi, Alessandra; Quintana, Fernando A.; Ruggeri, Fabrizio. Longitudinal clinical trials often collect long sequences of binary data monitoring a disease process over time. Our application is a medical study conducted in the US by the Veterans Administration Cooperative Urological Research Group to assess the effectiveness of a chemotherapy treatment (thiotepa) in preventing recurrence on subjects affected by bladder cancer. We propose a generalized linear model with latent auto-regressive structure for longitudinal binary data following a Bayesian approach. We discuss inference as well as sensitivity to prior choices for the bladder cancer data. We find that there is a significant treatment effect in the sense that treated patients have much smaller predicted recurrence probabilities than placebo patients.
- Bayesian modeling using a class of bimodal skew-elliptical distributions (ELSEVIER SCIENCE BV, 2009) Elal Olivero, David; Gomez, Hector W.; Quintana, Fernando A. We consider Bayesian inference using an extension of the family of skew-elliptical distributions studied by Azzalini [1985. A class of distributions which includes the normal ones. Scand. J. Statist. Theory and Applications 12 (2), 171-178]. This new class is referred to as bimodal skew-elliptical (BSE) distributions. The elements of the BSE class can take quite different forms. In particular, they can adopt both uni- and bimodal shapes. The bimodal case behaves similarly to mixtures of two symmetric distributions and we compare inference under the BSE family with the specific case of mixtures of two normal distributions. We study the main properties of the general class and illustrate its applications to two problems involving density estimation and linear regression. (C) 2008 Elsevier B.V. All rights reserved.
- DPpackage: Bayesian Semi- and Nonparametric Modeling in R (JOURNAL STATISTICAL SOFTWARE, 2011) Jara, Alejandro; Hanson, Timothy E.; Quintana, Fernando A.; Mueller, Peter; Rosner, Gary L. Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
- Flexible Univariate Continuous Distributions (INT SOC BAYESIAN ANALYSIS, 2009) Quintana, Fernando A.; Steel, Mark F. J.; Ferreira, Jose T. A. S. Based on a constructive representation, which distinguishes between a skewing mechanism P and an underlying symmetric distribution F, we introduce two flexible classes of distributions. They are generated by nonparametric modelling of either P or F. We examine properties of these distributions and consider how they can help us to identify which aspects of the data are badly captured by simple symmetric distributions. Within a Bayesian framework, we investigate useful prior settings and conduct inference through MCMC methods. On the basis of simulated and real data examples, we make recommendations for the use of our models in practice. Our models perform well in the context of density estimation using the multimodal galaxy data and for regression modelling with data on the body mass index of athletes.
- Multivariate Bayesian discrimination for varietal authentication of Chilean red wine (TAYLOR & FRANCIS LTD, 2011) Gutierrez, Luis; Quintana, Fernando A.; von Baer, Dietrich; Mardones, Claudia. The process through which food or beverages is verified as complying with its label description is called food authentication. We propose to treat the authentication process as a classification problem. We consider multivariate observations and propose a multivariate Bayesian classifier that extends results from the univariate linear mixed model to the multivariate case. The model allows for correlation between wine samples from the same valley. We apply the proposed model to concentration measurements of nine chemical compounds named anthocyanins in 399 samples of Chilean red wines of the varieties Merlot, Carmenere and Cabernet Sauvignon, vintages 2001-2004. We find satisfactory results, with a misclassification error rate based on a leave-one-out cross-validation approach of about 4%. The multivariate extension can be generally applied to authentication of food and beverages, where it is common to have several dependent measurements per sample unit, and it would not be appropriate to treat these as independent univariate versions of a common model.
- MULTIVARIATE BAYESIAN SEMIPARAMETRIC MODELS FOR AUTHENTICATION OF FOOD AND BEVERAGES (INST MATHEMATICAL STATISTICS, 2011) Gutierrez, Luis; Quintana, Fernando A. Food and beverage authentication is the process by which foods or beverages are verified as complying with its label description, for example, verifying if the denomination of origin of an olive oil bottle is correct or if the variety of a certain bottle of wine matches its label description. The common way to deal with an authentication process is to measure a number of attributes on samples of food and then use these as input for a classification problem. Our motivation stems from data consisting of measurements of nine chemical compounds denominated Anthocyanins, obtained from samples of Chilean red wines of grape varieties Cabernet Sauvignon, Merlot and Carmenere. We consider a model-based approach to authentication through a semiparametric multivariate hierarchical linear mixed model for the mean responses, and covariance matrices that are specific to the classification categories. Specifically, we propose a model of the ANOVA-DDP type, which takes advantage of the fact that the available covariates are discrete in nature. The results suggest that the model performs well compared to other parametric alternatives. This is also corroborated by application to simulated data.
- Nonparametric Bayesian Modeling and Estimation of Spatial Correlation Functions for Global Data (INT SOC BAYESIAN ANALYSIS, 2021) Porcu, Emilio; Bissiri, Pier Giovanni; Tagle, Felipe; Soza, Ruben; Quintana, Fernando A. We provide a nonparametric spectral approach to the modeling of correlation functions on spheres. The sequence of Schoenberg coefficients and their associated covariance functions are treated as random rather than assuming a parametric form. We propose a stick-breaking representation for the spectrum, and show that such a choice spans the support of the class of geodesically isotropic covariance functions under uniform convergence. Further, we examine the first order properties of such representation, from which geometric properties can be inferred, in terms of Hölder continuity, of the associated Gaussian random field. The properties of the posterior, in terms of existence, uniqueness, and Lipschitz continuity, are then inspected. Our findings are validated with MCMC simulations and illustrated using a global data set on surface temperatures.
- Nonparametric Bayesian modelling using skewed Dirichlet processes (ELSEVIER, 2009) Iglesias, Pilar L.; Orellana, Yasna; Quintana, Fernando A. We introduce a new class of discrete random probability measures that extend the definition of Dirichlet process (DP) by explicitly incorporating skewness. The asymmetry is controlled by a single parameter in such a way that symmetric DPs are obtained as a special case of the general construction. We review the main properties of skewed DPs and develop appropriate Polya urn schemes. We illustrate the modelling in the context of linear regression models of the capital asset pricing model (CAPM) type, where assessing symmetry for the error distribution is important to check validity of the model. (C) 2008 Elsevier B.V. All rights reserved.
- RANDOM-SET METHODS IDENTIFY DISTINCT ASPECTS OF THE ENRICHMENT SIGNAL IN GENE-SET ANALYSIS (INST MATHEMATICAL STATISTICS, 2007) Newton, Michael A.; Quintana, Fernando A.; Den Boon, Johan A.; Sengupta, Srikumar; Ahlquist, Paul. A prespecified set of genes may be enriched, to varying degrees, for genes that have altered expression levels relative to two or more states of a cell. Knowing the enrichment of gene sets defined by functional categories, such as gene ontology (GO) annotations, is valuable for analyzing the biological signals in microarray expression data. A common approach to measuring enrichment is by cross-classifying genes according to membership in a functional category and membership on a selected list of significantly altered genes. A small Fisher's exact test P-value, for example, in this 2 x 2 table is indicative of enrichment. Other category analysis methods retain the quantitative gene-level scores and measure significance by referring a category-level statistic to a permutation distribution associated with the original differential expression problem. We describe a class of random-set scoring methods that measure distinct components of the enrichment signal. The class includes Fisher's test based on selected genes and also tests that average gene-level evidence across the category. Averaging and selection methods are compared empirically using Affymetrix data on expression in nasopharyngeal cancer tissue, and theoretically using a location model of differential expression. We find that each method has a domain of superiority in the state space of enrichment problems, and that both methods have benefits in practice. Our analysis also addresses two problems related to multiple-category inference, namely, that equally enriched categories are not detected with equal probability if they are of different sizes, and also that there is dependence among category statistics owing to shared genes. Random-set enrichment calculations do not require Monte Carlo for implementation. They are made available in the R package allez.
- Semiparametric Bayesian classification with longitudinal markers (BLACKWELL PUBLISHING, 2007) De la Cruz Mesia, Rolando; Quintana, Fernando A.; Mueller, Peter. We analyse data from a study involving 173 pregnant women. The data are observed values of the beta human chorionic gonadotropin hormone measured during the first 80 days of gestational age, including from one up to six longitudinal responses for each woman. The main objective in this study is to predict normal versus abnormal pregnancy outcomes from data that are available at the early stages of pregnancy. We achieve the desired classification with a semiparametric hierarchical model. Specifically, we consider a Dirichlet process mixture prior for the distribution of the random effects in each group. The unknown random-effects distributions are allowed to vary across groups but are made dependent by using a design vector to select different features of a single underlying random probability measure. The resulting model is an extension of the dependent Dirichlet process model, with an additional probability model for group classification. The model is shown to perform better than an alternative model which is based on independent Dirichlet processes for the groups. Relevant posterior distributions are summarized by using Markov chain Monte Carlo methods.
- Similarity analysis in Bayesian random partition models (ELSEVIER, 2011) Navarrete, Carlos A.; Quintana, Fernando A. This work proposes a method to assess the influence of individual observations in the clustering generated by any process that involves random partitions. We call it Similarity Analysis. It basically consists of decomposing the estimated similarity matrix into an intrinsic and an extrinsic part, coupled with a new approach for representing and interpreting partitions. Individual influence is associated with the particular ordering induced by individual covariates, which in turn provides an interpretation of the underlying clustering mechanism. We present applications in the context of Species Sampling Mixture Models (SSMMs), including Bayesian density estimation and dependent linear regression models. (C) 2010 Elsevier B.V. All rights reserved.
- The Semi-Hierarchical Dirichlet Process and Its Application to Clustering Homogeneous Distributions (INT SOC BAYESIAN ANALYSIS, 2021) Beraha, Mario; Guglielmi, Alessandra; Quintana, Fernando A. Assessing homogeneity of distributions is an old problem that has received considerable attention, especially in the nonparametric Bayesian literature. To this effect, we propose the semi-hierarchical Dirichlet process, a novel hierarchical prior that extends the hierarchical Dirichlet process of Teh et al. (2006) and that avoids the degeneracy issues of nested processes recently described by Camerlenghi et al. (2019a). We go beyond the simple yes/no answer to the homogeneity question and embed the proposed prior in a random partition model; this procedure allows us to give a more comprehensive response to the above question and in fact find groups of populations that are internally homogeneous when I >= 2 such populations are considered. We study theoretical properties of the semi-hierarchical Dirichlet process and of the Bayes factor for the homogeneity test when I = 2. Extensive simulation studies and applications to educational data are also discussed.