Using the Box-Cox family of distributions to model censored data: a distributional regression approach

Luiz R. Nakamura
https://orcid.org/0000-0002-7312-2717
Thiago G. Ramires
Ana J. Righetto
Viviane C. Silva
Andréa C. Konrath

Abstract

The study of the expected time until an event of interest is a recurring topic in different fields, such as medicine, economics and engineering. The Kaplan-Meier method and the Cox proportional hazards model are the most widely used methodologies for this kind of data. Nevertheless, in recent years, generalised additive models for location, scale and shape (GAMLSS), which can be seen as distributional regression and/or beyond-the-mean regression models, have stood out because of their high flexibility and ability to fit complex data. GAMLSS are a class of semi-parametric regression models, in the sense that they assume a distribution for the response variable, and any or all of its parameters can be modelled as linear and/or non-linear functions of a set of explanatory variables. In this paper, we present the Box-Cox family of distributions under the distributional regression framework as a solid alternative for modelling censored data.
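
The idea described in the abstract can be illustrated with a minimal sketch, which is not taken from the paper. It assumes the Box-Cox Cole and Green (BCCG) member of the Box-Cox family, log links for mu and sigma with a single covariate, a constant nu, and right censoring only. The function names, coefficient values and toy data are all hypothetical. Observed events contribute the log-density to the log-likelihood, and right-censored observations contribute the log-survival function.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bccg_z(y, mu, sigma, nu):
    """Box-Cox transformed and standardised value of y."""
    if nu == 0:
        return math.log(y / mu) / sigma
    return ((y / mu) ** nu - 1.0) / (nu * sigma)


def bccg_logpdf(y, mu, sigma, nu):
    """Log-density of BCCG(mu, sigma, nu) for y > 0."""
    z = bccg_z(y, mu, sigma, nu)
    # Normalising constant from truncating z so that y stays positive.
    trunc = norm_cdf(1.0 / (sigma * abs(nu))) if nu != 0 else 1.0
    return ((nu - 1.0) * math.log(y) - nu * math.log(mu) - math.log(sigma)
            - 0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(trunc))


def bccg_cdf(y, mu, sigma, nu):
    """CDF of BCCG(mu, sigma, nu) for y > 0."""
    z = bccg_z(y, mu, sigma, nu)
    if nu == 0:
        return norm_cdf(z)
    upper = norm_cdf(1.0 / (sigma * abs(nu)))
    lower = norm_cdf(-1.0 / (sigma * abs(nu))) if nu > 0 else 0.0
    return (norm_cdf(z) - lower) / upper


def censored_loglik(times, events, x, beta_mu, beta_sigma, nu):
    """Right-censored log-likelihood; events[i] is 1 if observed, 0 if censored."""
    total = 0.0
    for t, d, xi in zip(times, events, x):
        # Distributional regression: both mu and sigma depend on the covariate.
        mu = math.exp(beta_mu[0] + beta_mu[1] * xi)
        sigma = math.exp(beta_sigma[0] + beta_sigma[1] * xi)
        if d == 1:
            total += bccg_logpdf(t, mu, sigma, nu)
        else:
            total += math.log(1.0 - bccg_cdf(t, mu, sigma, nu))
    return total


# Hypothetical toy data: follow-up times, event indicators, binary covariate.
times = [2.1, 3.4, 1.2, 5.0, 4.3]
events = [1, 0, 1, 0, 1]
x = [0.0, 1.0, 0.0, 1.0, 1.0]

# Hypothetical coefficient values, chosen only to evaluate the function.
loglik = censored_loglik(times, events, x, (0.5, 0.4), (-1.0, 0.2), 0.5)
print(loglik)
```

In practice this function would be maximised over the coefficients and nu with a numerical optimiser, and nu could be given its own predictor in the same way as mu and sigma.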

How to Cite
Nakamura, L. R., Ramires, T., Righetto, A., Silva, V., & Konrath, A. (2022). Using the Box-Cox family of distributions to model censored data: a distributional regression approach. Brazilian Journal of Biometrics, 40(4), 407–414. https://doi.org/10.28951/bjb.v40i4.625
