COMPARISON OF EXPONENTIAL COVARIANCE FUNCTIONS FOR BIVARIATE GEOSTATISTICAL DATA

In the analysis of multivariate spatial random fields, it is essential to define a covariance structure that adequately models the relationship between the variables under study. We propose a covariance structure with exponential correlation function for bivariate random fields, the SEC model. We compare the SEC model fits with the bivariate separable exponential model and the bivariate exponential model with constraints, which are particular cases of the full bivariate Matérn model, presented in the literature. A simulation study assess characteristics of the proposed model. The model is fitted to a weather data set from Brazil, bearing in mind the importance of analyzing climate data to predict adverse environmental conditions. Predictive measures are used to compare the models under study. The satisfactory results compared to the models considered and the simpler structure makes the SEC model an alternative for the analysis of bivariate spatial fields.


Introduction
Spatial data analysis is of interest in a diversity of contexts including monitoring of environmental indicators. The definition of a covariance structure that jointly models multiple responses over a region of space is challenging, especially when the number of response variables increases, which can inflate the number of parameters on the model structure. In such cases, finding a valid covariance structure which can be reasonably estimated from data is a non trivial task.
Different covariance structures for random fields have been proposed and evaluated by several authors, among them, the linear model of corregionalization (GOULARD and VOLTZ, 1992), which is built from univariate models and the Matérn covariance model (APANASOVICH and GENTON, 2010;GNEITING et al., 2010;GUTTORP and GNEITING, 2005). Genton and Kleiber (2015) present a review of the main constructions of cross-covariance functions addressed in the literature. Other recent works on covariance functions for multivariate spatial data can be found at Alegría et al. (2018), Bevilacqua et al. (2020), Kleiber (2017), Ip and Li (2017), Prause et al. (2018), Schlather et al. (2015) and Vallejos et al. (2020), to name a few. Gneiting et al. (2010), provides necessary and sufficient conditions for the covariance structure of a bivariate random field with Matérn correlation function to be positive definite. In addition, sufficient conditions for the covariance structure of a multivariate Matérn random field are also defined. Such models have restrictions in the parametric space, which makes estimation more difficult, especially when the number of response variables increases. More robust covariance models for random fields are still a challenge.
The purpose here is to investigate a covariance structure to model bivariate isotropic Gaussian random fields. The presented structure, called SEC model, is built upon the Kronecker products reformulation of covariance matrices as simple matrix products (MARTINEZ-BENEITO, 2013), for continuous spatial processes. Attractive characteristics of this approach is that the covariance matrices resulting are always valid and the conditions imposed on the parameter values are less restrictive.
Section 2 presents some basic concepts on cross covariance functions. Section 3 summarizes the Matérn model discussed in Gneiting et al. (2010). Section 4 presents the proposed covariance structure for bivariate random fields: the simpler exponential covariance (SEC) model, along with its main characteristics. Section 5 presents a simulation exercise to illustrate some features of the proposed model. Section 6 presents the analysis of a real data set to illustrate the proposed model, as well as the calculation of accuracy measures to compare the proposed model with the bivariate separable exponential and the constrained bivariate exponential models. The main conclusions are summarized in Section 7. All the analysis reported are performed using the computational statistical software R (R CORE TEAM, 2020).
for the amount x at the location s k . Therefore, it is usual to represent the infinite possible values of the quantity x at each location s k , by a random variable X(s k ). In addition, these random variables can be seen as a subset of an infinite family of random variables (WACKERNAGEL, 2013) which is called random function (RF) and it is represented by X(s), for any s ∈ R d .
Chiles and Delfiner (2012) present a RF as a stochastic process when the quantity of interest varies in a one-dimensional space, which can be interpreted as time, for example, and as a random field when the quantity of interest varies in a space of a dimension greater than one.
A RF is called Gaussian, if for any finite collection of locations s 1 , s 2 , ..., s k , with s k ∈ R d , the probability distribution of X(s 1 ), X(s 2 ), ..., X(s k ) is a Gaussian multivariate distribution. Considering that the first two moments are sufficient to characterize the Gaussian distribution, a Gaussian random field is also completely characterized by its mean and covariance function (CHILÈS and DELFINER, 2012;DIGGLE and RIBEIRO, 2007). It is also called isotropic, if its covariance function does not depend on the h vector orientation, but only on its length |h|.
The notation X(s) = (X 1 (s), X 2 (s)) will be used to denote a bivariate Gaussian random field at s ∈ R d , with mean vector µ(s) = (µ 1 (s), µ 2 (s)) , where µ i (s) = E(X i (s)), for i = 1, 2 and covariance matrix: where C ij (h) in (1) is the cross covariance function between the variables X i (s) and X j (s + h), which, for a stationary process, is given by: that is, the covariance function of a stationary process depends only on the distance h between the locations. When i = j, the functions in (2) become the univariate covariance functions C ii (h), and if, in addition, h = 0, then C ii (0) = V ar(X i (s)) = σ 2 i . From the covariance function, it is possible to obtain the correlation function, If the variables are zero-centered, the covariance functions in (1), can be defined by: The covariance structure (1) must meet the positive definite condition, that is: for all n ∈ N, s 1 , s 2 , ..., s n ∈ R 2 and a 1 , a 2 , ..., a n ∈ R 2 .
Other properties on cross covariance functions can be found in Diggle and Ribeiro (2007), Chiles and Delfiner (2012) and Wackernagel (2013). Section 3 presents an important class of covariance functions discussed in the literature, the Matérn covariance function.

Matérn Covariance Function
The Matérn covariance model proposed by Gneiting et al. (2010), uses the Matérn correlation function to build the covariance matrix defined in (1). Thus, for a second-order bivariate Gaussian random process with zero mean vector, the ij-th , is the Matérn correlation function at distance |h|, K ν is the modified Bessel function of order ν, σ > 0 is the standard deviation parameter, ν > 0 is the smoothing parameter and φ > 0 is the correlation length. For i = j, ρ ii = 1, the expression reduces to the univariate covariance function, whereas for i = j, to the cross covariance function. According to Gneiting et al. (2010), a challenge is to determine the conditions for the parameters ρ ij , ν ij and φ ij , so that the C matrix is positive definite. Necessary and sufficient conditions are established for the parameters of a bivariate random field, and also proved that the structure (4) is valid if, and only if, (1 + t 2 /φ 2 12 ) ν12+d/2 (1 + t 2 /φ 2 1 ) ν1/2+d/4 (1 + t 2 /φ 2 2 ) ν2/2+d/4 , for some β 12 ∈ [−1, 1]. This model is called full bivariate Matérn model. As a complex model, the full bivariate Matérn model can, in some cases, be difficult to estimate and interpret due to the restrictions in the parametric space. In this sense, more parsimonious proposals of this model were presented. For example, a particular case of this model is a construction called separable. A covariance function is called separable if it assumes that its components share the same correlation structure (BEVILACQUA et al., 2015;BEVILACQUA et al., 2016). Thus, for a bivariate Gaussian random field with the Mátern correlation function, the covariance function will take the form: i, j = 1, 2 and ρ ii = 1.
The model has some limitations because it cannot capture the different scales and smoothness of the variables under study. The model in 6 is called bivariate separable Matérn model (BEVILACQUA et al., 2018;VALLEJOS et al., 2020).
We consider here the bivariate separable Matérn model (ExpGneiting sep) and the bivariate Matérn model with constraints (ExpGneiting contr), taking the particular case where the exponential correlation function is used.
A covariance structure for bivariate stationary Gaussian random fields with exponential correlation function is proposed in Section 4.

The SEC model
In this section, we present in details the proposed covariance structure for bivariate isotropic stationary Gaussian random fields, where each component is an exponential process. We called it as simpler exponential covariance (SEC) model and it is based upon Martinez-Beneito (2013). Unlike Martinez-Beneito (2013) proposal, which presents a covariance structure for multivariate mapping diseases problems, our work differs mainly in the fact that our covariance structure is built for continuous space data and it is applied for bivariate geostatistical problems. Thus, the covariance structure takes the form: where,C ii denotes the Cholesky decomposition of the matrix C ii , B diag represents the matrix in diagonal blocks of the matricesC 11 andC 22 , I n is the identity matrix of dimension n and Σ b denotes the correlation matrix, which for simplicity, we will consider: For univariate covariance functions, C ii , exponential covariance functions will be considered, that is, where σ 2 i > 0 is the variance parameter and φ i > 0 is the correlation length. The SEC model is the structure specified by (7), (8) and (9). This structure is always positive definite for any positive definite matrices choices for C 11 , C 22 and Σ b , and considering the usual restrictions for the variance and correlation parameters (MARTINEZ-BENEITO, 2013). For the bivariate case, the matrix C(h) in (7) is positive definite if, and only if, |ρ 12 | < 1. To illustrate the behavior of the proposed model, we simulate a bivariate stationary Gaussian process with SEC model on a regular grid. We set the parameter values as σ 1 = 0.5, σ 2 = 1.0, φ 1 = 0.1, φ 2 = 0.3 and ρ 12 = −0.8, 0.8, aiming to illustrate situations with positive and negative correlations between the variables. Figure 1 shows corresponding simulations.
The likelihood function is given by: We consider for the function in (10) the covariance structure of the SEC model, defined by equations (7), (8) and (9), where Ψ = (φ 1 , φ 2 , σ 1 , σ 2 , ρ 12 ) is the parameters vector to be estimated. We set the following parameter values: φ 1 = 0.5, φ 2 = 0.15, σ 1 = 0.5, σ 2 = 1.0, ρ 12 = −0.6, 0.0, 0.6. These values were considered in order to illustrate different variability, correlation lengths and different correlations that the data can present, ranging from strong negative to strong positive values. Figure 2 illustrates the results for the centered maximum likelihood estimates. Figure 3 illustrates the results for the maximum likelihood estimates for each parameter. Table 1 presents the mean, the bias and the mean square error (MSE) for all estimated parameters.  The given results suggests that the estimates are unbiased, considering that the estimated values are very close to the true parameter values for all parameters under analysis. We also noticed that the variability of the estimates remains practically the same, when we vary the ρ 12 parameter values.
On the next Section 6, we illustrate the application of the proposed SEC model to a real data set and we compare it with the classic ExpGneiting sep and ExpGneiting contr models.

Data analysis
To illustrate the proposed model we analyzed meteorological data from Brazil, in the winter of 2019, available on the INMET page (Instituto Nacional de Meteorologia, website: https://portal.inmet.gov.br/, accessed on 21/11/2020). We consider 556 automatic stations and the variables pressure (kj/m) and air temperature ( • C). The geographical coordinates considering both variables are shown in Figure 4 below. Figure 4, shows a positive linear correlation between pressure and temperature with Pearson coefficient of 0.41. Standard deviations are 3.16 for the pressure and 4.60 for the temperature.
When fitting the models, we consider the residuals of the variables as a bivariate isotropic Gaussian random field X(s) = (X P , X T ), representing the pressure and temperature variables, respectively, such that, for the 556 locations, we have to X(s) ∼ MN M 1112 (0, C), where C is the covariance matrix of each model to be estimated.
The models considered here, SEC, ExpGneiting contr and ExpGneiting sep were parametrized by Ψ = (φ 1 , φ 2 , σ 2 1 , σ 2 2 , ρ 12 ) for non-separable models and, φ 1 = φ 2 , for the ExpGneiting sep model. Table 2 presents the values of the loglikelihood, Akaike information criterion (AIC) and Bayesian information criterion (BIC) with models fitted to all 556 data. The parameters estimates considering each model are given in Table 3. The ExpGneiting contr and ExpGneiting sep models were estimated using functions from the GeoModels package (BEVILACQUA et al., 2018).    The predictive behavior is also assessed by a random training selection of 445 locations (80% of the data), from which we estimate the models under study and compute the mean absolute error (MAE), the root mean square error (RMSE) and the normalized mean square error (NMSE) measures for each model using cokriging predictor for the 111 remaining locations (20% of the data). These measures are defined as, whereX i (s k ) is the cokriging predictor of the variable X i (s k ), with i = P, T representing pressure and temperature, respectively. We repeated the same process 200 times, calculating the values of the M AE i , RM SE i and N M SE i measures for each variable each time. Table 4 presents the mean and standard deviation for the likelihood values and the calculated errors considering each model estimated for each variable.  The SEC model presented the smallest values of prediction errors when compared to ExpGneiting sep and ExpGneiting contr models. In addition, the SEC model also presented the smallest means of estimation times when compared to ExpGneiting contr model. When compared to the ExpGneiting sep model, the means of estimation times of the SEC model were close.

Conclusions
This work presents results for the proposed covariance structure with an exponential correlation function to model problems involving bivariate isotropic stationary Gaussian spatial data, considering that this is an area of great interest and modeling multivariate structures is still an active area in spatial data analysis. The SEC model is relatively simpler and presents promising results when compared to other models in the literature when applied to bivariate data. The simulation presented here produced unbiased estimates and low mean square error. We also observed that the variability of estimates seems to be invariant to the distinct correlation parameter values.
For the data analysis, the SEC model presented better AIC and BIC and prediction errors measures. The SEC model also presented the smallest means of computing times when compared to ExpGneiting contr model, which has the same number of parameters and more similar characteristics. The estimates for the common parameters of both models were similar. When compared to the ExpGneiting sep model, the means of computing times of the SEC model were similar.
We conclude that, despite its simpler structure, the proposed covariance structure, presented results at least as good as the classic ExpGneiting sep and ExpGneiting contr models. Furthermore, its generalization to more than two variables might be easier to handle than other more complex models, for example, the full bivariate Mátern model, since the usual parametric restrictions for the variance and correlation parameters are sufficient to ensure that the resulting matrix is positive definite. Therefore, the proposed model is an alternative for modeling bivariate spatial data with potential to be extended to higher dimensions, something to be further investigated. RESUMO: Na análise de campos aleatórios espaciais multivariados,é essencial definir uma estrutura de covariâncias que modele adequadamente a relação entre as variáveis em estudo. Nesse sentido, propomos uma estrutura de covariância com função de correlação exponencial para campos aleatórios bivariados, o modelo SEC. Também comparamos os ajustes do modelo SEC com o modelo exponencial bivariado separável e o modelo exponencial bivariado com restrições, que são casos particulares do modelo full bivariado Matérn, apresentado na literatura. Um estudo de simulação avalia as características do modelo proposto. O modeloé ajustado a um conjunto de dados meteorológicos do Brasil, considerando a importância da análise de dados climáticos para prever condições ambientais adversas. Medidas preditivas são utilizadas para comparar os modelos em estudo. Os resultados satisfatórios comparados com as alternativas consideradas e a estrutura simples fazem do modelo SEC uma alternativa para análise de campos aleatórios bivariados. PALAVRAS-CHAVE: Campos aleatórios; Funções de covariância; Modelo exponencial bivariado; Dados geoestatísticos.