AN APPLICATION OF THE POISSON-WEIBULL MODEL FOR CENSURED DATA

ABSTRACT: In this paper, we proposed the Poisson-Weibull distribution for the modeling of survival data. The motivation to study this model since, in addition to generalizing the Weibull distribution, which is widely used in several areas of knowledge among them the Survival and Reliability analysis, it presents great flexibility in the forms of the hazard function. The Poisson-Weibull distribution was created in a composition of discrete and continuous distributions where there is no information about which factor was responsible for the component failure, only the minimum lifetime value among all risks is observed. The maximum likelihood approach was used to estimate the parameters of the model. Also was conducted a simulation study to examine the mean, the bias, and the root of the mean square error of the maximum likelihood estimates of the proposed model according to the censoring percentages and sample sizes. The model selection criteria were also applied, in addition to graphic techniques such as TTT-Plot and Kaplan-Meier. Application to the real data set was used to illustrate the usefulness of the distribution.


Introduction
In several applications of statistics, the response variable consists of the time until the occurrence interest event, and this time is generally called failure time or survival time. In studies of failure time, is common the presence censored observations and which consists of incomplete observation of the time of interest. Data with censored observations are present in several areas, such as public health, actuarial science, biomedical studies, demography, reliability, among others. For example, in animal science, according to Cardoso et al. (2009), several characteristics of importance economic show censored data, that is, that are not fully observed for all animals at the time of genetic evaluation. Characteristics such as longevity, prolificacy, and total female productivity are examples of censored data, because many animals are still reproducing at the time of evaluation and, therefore, only the lower limit of their phenotypic value is known. Records of these characteristics are considered as censoring and excluding these records from analysis or manipulating them as uncensored would yield biased results (GUEDES, 2014).
Some statistical techniques used in animal science, such as linear or generalized linear mixed model, may not consider the censored observations during the modeling, which can affect in the inference process. In survival analysis data modeling, the most of procedures lead to consideration of the non-informative censoring, i.e, censoring time distribution do not carry any information about the failure time distribution to parameters estimation (DOS SANTOS JUNIOR, 2012). The non-informative censoring condition was considered in this work.
Several probability distributions can be used to model the survival time distribution, but despite the existence in the literature of several of these distributions, the demand for new distributions is justified by the fact that the usual models such as the exponential, Weibull (WEIBULL, 1958), log-logistic (VERHULST, 1938), and Gamma (THOM, 1947) distributions, which are an of the bigger distributions used in survival and reliability area, often do not fit well with the actual data set under analysis. In this context, new distributions arise from distributions most used in the literature, as the Weibull distribution, because this one does not provide a reasonable fit for modeling of data with non-monotone hazard functions such as the bathtub shaped and the unimodal hazard functions which are common in reliability and biological studies (BARRETO-SOUZA et al., 2008).
Various methods have been studied in the literature to generate new distributions. Marshall and Olkin (1997) presented a method for adding a parameter with application to the exponential and Weibull distributions. Another form to create new distributions is through the composition of discrete with continuous distributions. This procedure to create new probability distributions starts from the fact that it is not always possible to observe the exact value of the survival time, but the minimum or maximum value of this times. This occurs, for example, when the interest is to observe the lifetime of a system in series or in parallel where the observed time depends on the duration of a component set.
In recent years, several authors considered this approach to create new distributions, among them we can cite Adamidis and Loukas (1998) that proposed the exponential geometric distribution (EG) which was created through the duration of a series system. Specifically, it is considered that both the number and the identification of the components that caused the failure are not observable, but only the value of the minimum lifetime among them, where the number of components follows a geometric distribution and the time of duration of each component follows an exponential distribution. Following the same idea as EG distribution, Kus (2007) introduced the exponential-Poisson distribution (EP) inserted also in a serial system. Chahkandi and Ganjali (2009) introduced exponential power series (EPS), which contains the distributions cited EG, EP, and logarithmic exponential as special cases. Barreto-Souza et al. (2008) propose the Weibull-geometric distribution (WG) that generalizes the EG and Weibull distributions. Louzada et al. (2011) introduced the complementary exponential geometric distribution (CEG) that generalizes the EG distribution where the motivation is the duration of a system in parallel on what only the maximum lifetime for all risks is known. Yamachi et al. (2013) presented the exponentiated complementary exponential geometric (ECEG) distribution which is a generalization of the CEG distribution. Louzada et al. (2012) and Louzada et al. (2014) propose the CEG and ECEG distributions in long-term scenarios, respectively. Recently, Louzada et al. (2018) introduced the beta exponentiated Weibull geometric distribution, which contains some distributions as particular cases.
Thus, trough the considerations mentioned we used in this paper the Poisson-Weibull distribution (PW) defined by Bereta et al. (2011), presenting as an original contribution the modeling of data with censored observations via a simulation study to some levels of censuring besides the application of these data. The choice of PW model, in addition to generalizing the Weibull distribution, which is one of the most used distributions in the area of survival and recent years is applied to data in the unit interval (MAZUCHELI et al., 2020), presents great flexibility in the forms of the hazard function. This paper is organized as follows. In Section 2, we present the PW distribution. The inferential procedure based on the maximum likelihood approach is introduced in Section 3. In Section 4 we shown the selection criteria. A simulation study to assess the performance of the PW model parameters estimates is presented in Section 5. In Section 6, the data set is analyzed, and the final remarks appear in Section 7.

The Poisson-Weibull distribution (PW)
The PW distribution proposed by Bereta et al. (2011) was created in the scenario in which it is considered that both the number of components and the identification of the component that caused the failure are not observable. On the other hand, the value of the minimum lifetime among them is registered. In this case, the number of components follows a Poisson distribution and the time of duration of each component follows a Weibull distribution. However, the PW distribution can be used in any other situation that fits well with the data according to the criteria used. The density function is defined by where t > 0, β > 0 is scale parameter, α > 0 and γ > 0 are shape parameters. The quantile, survival and hazard functions corresponding to (1) are given by, respectively, and

Maximum likelihood estimation
Several methods can be used to estimate the probabilistic models parameters, and the most common is the maximum likelihood. One of the characteristics of this method is which allows the inclusion of censoring in its estimation process, which is not always possible with other estimation methods, for example, the least-squares method.

Maximum likelihood for PW distribution
Given a random sample of the size n composed by (t 1 , δ 1 ), (t 2 , δ 2 ), . . . , (t n , δ n ) where t i = min(T i , C i ), T represents the survival time and C consists of the censoring time, which is independent of T . For each i = 1, . . . , n, the variable T i has PW distribution with vector of parameters θ = (α, β, γ) T and δ i a random variable of censoring indicator, where the values 0 or 1 represent the censured or observed data, respectively. Consider the density and survival functions given by (1) and (3) respectively, the log-likelihood function (θ) for the parameter vector θ, where the survival and censure times are independent and censure is non-informative, can be expressed as Maximum likelihood estimates (MLEs) for parameter vector θ can be obtained by maximizing (5) from the resolution of the system of equations U (θ) = ∂l(θ) ∂θ = 0. The components of the score vector U (θ) of the PW distribution are given by Since there is no closed analytical way to find these estimators, we can use numerical methods for solving the system of equations. Thus, estimates of these parameters were obtained by numerical methods, using an iterative process. We used the command optim in software R through of the method BFGS.
For the selection of the model that best fits the data, were used the criterion's: Global deviance (GD); AIC (Akaike's information criterion), and BIC (Bayesian information criterion). The GD, AIC, and BIC measures are defined by where (θ) is the maximized log-likelihood function, p is the number of parameters of the model and n is the sample size. Besides these criterions, hypothesis test, such as the likelihood ratio test (LR), can be taken into account due the WP distribution has other distributions such as particular cases. For test nested distributions, we compute the maximum values of the restricted (H 0 ) and unrestricted (H 1 ) loglikelihoods to construct the test statistic. Under H 0 and some regularity conditions, the distribution of the statistical likelihood ratio converges to a χ 2 distribution with degrees of freedom equal to the difference between the numbers of parameters of the unrestricted and restricted models.
For the comparison of the models in the boundary of the parameter space, for example, H 0 : α → 0 and H 1 : α > 0, the distribution of the statistical test ω n is a mixture with a weight (0.5 and 0.5) of distribution χ 2 with one degree of freedom, with a discrete distribution and with mass concentrated in the value 0, this is, P (ω n ≤ w) = 1 2 + 1 2 P (χ 2 1 ≤ w). Large positive values of ω n give favorable evidence to the unrestricted model. For example, for a significance level of 5%, H 0 is rejected if ω n > 2.7055. More details in Maller et al. (2011).

Simulation study
To examine the performance of MLE's for the parameters of the PW distribution, a simulation study was made for different values of n and censored observations in each sample. In this study, were generating 1000 random samples simulated with the support of the software R. The values of the PW variable were generated from the inverse transformation method, where the parameter values were fixed in α = 1.0, β = 2.0, γ = 3.0 and the censoring times C were sampled from the Uniform distribution in the interval (0, τ ), where the τ assisted in the control of the censoring observations. Besides, the censoring percentages were considered equal to 0% (τ = 50.0), 10% (τ = 3.6) and 20% (τ = 2.0) and the survival time observed of the variable in the simulation was considered through t * i = min(t i , c i ). The process of this simulation is as follows: 5. If t i < c i then δ i = 1, otherwise, δ i = 0, to i = 1, ..., n.
For each combination of n, B = 1000 samples were generated and obtained the maximum likelihood estimates of the PW distribution by optim via the method

BFGS.
The bias and the square root of the mean-squared error (RMSE) of the maximum likelihood estimates were also calculated. The formulas of bias and RMSE are defined as where θ j represents the maximum likelihood estimative of the parameter vector θ for the j-th replication, j = 1, . . . , B. From the simulation results, shown in Table 1, it was observed that the estimates of the parameters of the PW distribution were close to the true value of the parameters and the RMSEs decay toward zero when the sample size n increases, as expected. In addition, the RMSE increased as the censoring percentages increased.    As the results of the maximum likelihood estimates using the BFGS algorithm were satisfactory, we did not verify other iterative methods to estimate parameters of the PW distribution, such as the EM algorithm.

Application
The purpose of this section was to verify the applicability of the PW distribution on a dataset extracted from the literature where there are censored observations. The data set was extracted from the book of the Lee and Wang (2003, p.231) and refers to the survival time of 137 bladder cancer patients in which the interest is to study the time until their remission, where 7% of the patients are censored. Bladder cancer is one of the most common cancers of the urinary tract and one of the most incident in the world. In addition to smoking, white and elderly man present the most probability to develop this disease 1 .
Initially, to obtain more information on the survival time of the bladder cancer patients, it was made an analysis of these times without considering the censored observations. In this case, see results in Table 2. Note that the median time of remission of the patients was approximate of 6 months, which indicates that approximately 50% of patients had the survival time larger than 6 months and its mean survival time was 9 months. It can also be observed that 25% of patients had a lifetime approximately less than 3 months or greater than 11 months. Furthermore, the lifetime of the patients was between 0.0800 and 79.0500 months, which implies a greater variability of the lifetime.  Table 3 gives a descriptive summary of these data based on Kaplan-Meier method. This table presents the values of the median and its respective 95% confidence interval. The percentage of censored observations for survival times is 7%. In addition, we present in Figure 3 an empirical analysis of the behavior of the hazard function of the bladder cancer data. The graphical method based on the total time test (TTT), called of TTT-Plot (AARSET, 1987), was used to verify the hazard function on observed times of the cancer patients. This method is advantageous when there is information on the hazard function of the studied variable. According to AARSET (1987), the empirical version of the TTT plot is defined as G(r/n) = [( r i=1 Y i:n ) − (n − r)Y r:n ]/( r i=1 Y i:n ), where r = 1, . . . , n and Y i:n represent the order statistics of the sample. AARSET (1987) showed that the hazard function is constant if the TTT plot is presented graphically as a straight diagonal, the hazard function is increasing (or decreasing) if the TTT plot is concave (or convex). The hazard function is U-shaped if the TTT plot is convex and then concave, otherwise, the hazard function is unimodal. The TTT-Plot related to bladder cancer data in Figure 3 (b) reveals a unimodal shape since the plot shows a first concave and then convex curvature. So, we can try using the PW distribution for the modeling of data because this distribution assumes some forms of the hazard function, among them, the unimodal. (a) q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q  Table 4 shows the maximum likelihood estimates (MLE's) with selection criteria (GD, AIC and BIC) and standard errors for the parameters of PW distribution with their particulars cases: exponential-Poisson (γ = 1); Weibull (α → 0); exponential (γ = 1 and α = 1). It can be observed that the PW distribution had the lowest value of the GD and AIC in relation to the other distributions, indicating that this model is more appropriate to the data. As the PW model is reduced in sub-models, the likelihood ratio (LR) test was used to verify if any sub-models fit better than the PW distribution. The values of LR test are presented in Table 5. Analyzing this Table, it can be observed that the PW distribution was favorable in all three hypothesis tests, thus this distribution was considered to be the most appropriate to the data. These results are corroborated by the plot in Figure 4, which shows the fitted survival function of PW distribution closed to the empirical Kaplan-Meier ( Figure 4a) and fitted hazard function (Figure 4b).   From the considerations mentioned, the PW model was considered as the most appropriate for the bladder cancer data. Thus, the final model has the survival function given by where t > 0 represents the lifetime of the patients with bladder cancer. According to equation (6), it is possible to get some information for the bladder cancer data: • The probability that a patient with bladder cancer will not have a remission of the disease after 4 months is given by This value indicates that the chance of a patient present evidence of the disease after 4 months is approximately 70%.
• The probability that a patient with bladder cancer will not have a remission of the disease after 12 months is given by S(12) = exp 3.939 exp − (0.038 (12)) This value indicates that the chance of a patient present evidence of the disease after 12 months is approximately 27%.
• The median time , which can be estimated via the quantile function, is given by equation (2)

Final remarks
In this paper, the Poisson-Weibull distribution with the presence of censored observations was used to analyzis data. This distribution was created in a composition of discrete and continuous distributions, where there is no information about which factor was responsible for the component failure, only the minimum lifetime value among all risks is observed. Different simulation studies were adopted to study the means, the biases, and the root of mean squared error of the ML estimates of the proposed model for different values of n and censored observations where it was verified good results. Finally, an application of the PW distribution was presented as an alternative for the fit the bladder cancer data. Through graphical analysis of the TTT-Plot and Kaplan-Meier, the observed values of the GD, AIC, and BIC criteria, and the likelihood ratio test, it can be noted that the PW model fitted well to the data. Thus, it is expected that this model is useful for fitting other datasets that present some forms of the hazard function, among them, the unimodal. VIGAS, V. P.; PRATAVIERA, F.; SILVA, G. O. Uma aplicação do modelo Poisson-Weibull para dados censurados. Rev. Bras. Biom., Lavras, v.39, n.4, p.505-521, 2021.