CHRONIC FATIGUE SYNDROME AND ITS RELATION WITH ABSENTEEISM: ELASTIC-NET AND STEPWISE APPLIED TO BIOCHEMICAL AND ANTHROPOMETRIC CLINICAL MEASUREMENTS

▪ ABSTRACT: Characterized by persistent fatigue, pain, cognitive impairment and sleep difficulties, Chronic Fatigue Syndrome (CFS) has become common in clinical practice. Studies indicate multiple factors contributing to CFS development: poor sleep, dehydration, psychological stress, hormonal dysfunction and nutrient deficiencies, among others. In hazardous work conditions, like the shift work of mines, CFS significantly increases the chance of fatal accidents. Work environments of mines suggest the presence of factors that increase the risk of developing CFS. Considering the severity and implications of CFS symptoms for social and professional life as well as for the economy, efforts are targeting its characterization and prevention. This study aims to assess the risk of CFS by studying cross-sectional data on absenteeism of 621 shift workers, measuring 8 anthropometric and 11 biochemical variables as well as age and gender, amounting to 21 variables. After imputation, logistic regression was fitted by Stepwise selection, Lasso and Elastic-Net regularization. Results suggest that the models do not discriminate very well due to noise inherent to the dependent variable. However, all models agree on the effects of Sodium and Total Cholesterol on the risk of absenteeism. The Stepwise model also indicates LDL and Triglycerides as significant factors, whereas both Lasso and Elastic-Net show an effect for HDL instead. The Elastic-Net model suggests an effect of Potassium, though the literature on it is inconclusive.


Introduction
Chronic Fatigue Syndrome (CFS), also referred to as myalgic encephalomyelitis (ME), is a disease that has been commonly seen in clinical practice in recent decades (AFARI and BUCHWALD, 2003). An epidemiological study conducted by Jason et al. (1999) estimated a CFS prevalence of 410 cases per 100,000 individuals, which suggests 1 million cases in the United States alone. Another study, conducted by Jason and Njoku (2006), estimated that 850 thousand Americans suffer from the syndrome. The disease's first case definition is attributed to the United States Centers for Disease Control and Prevention (CDC) (HOLMES et al., 1988) as a means of standardization for epidemiological studies. That definition has been revised several times since then (WILLIAMS et al., 2014). According to Fukuda et al. (1994), a CFS case is defined by the presence of chronic (or relapsing) fatigue for 6 months or more accompanied by 4 or more of the following symptoms: memory or concentration impairment, sore throat, tender lymph nodes, muscle pain, joint pain, headaches, unrefreshing sleep, and post-exertional malaise. Despite its use by most of the academic community, this case definition has been criticized because it does not require the presence of important symptoms, such as post-exertional malaise (WILLIAMS et al., 2014). The most recently used definition, proposed by the CDC in 2015, requires the presence of profound fatigue persisting or relapsing for more than 6 months, which should be accompanied by post-exertional malaise, unrefreshing sleep and cognitive impairment.
The factors that contribute to the development of CFS are still under investigation by the academic community; among them are irregular sleep, psychological stress, hormonal dysfunctions, nutrient deficiency, immunological dysfunctions and infections. In a study conducted with nurses by Samaha et al. (2007), sleep quality was identified as a significant factor related to CFS. A mass spectroscopy study showed that patients with CFS presented altered levels of phospholipids, cholesterol, branched amino acids, vitamins and mitochondrial metabolites (NAVIAUX et al., 2016). According to Litleskare et al. (2018), the prevalence of CFS 10 years after a Giardia infection was 2.22 to 4.08 times the prevalence in the control group. Another case-control study, performed by Nagy-Szakal et al. (2018), indicates low levels of betaine and complex lipids as well as elevated triglycerides and phenylacetylglutamine in individuals with CFS. Evidence presented by Maloney et al. (2010) suggests a relation between CFS and metabolic syndrome, which includes hypertension, elevated levels of blood sugar, waist fat and abnormal levels of cholesterol. According to Bjørklund et al. (2019), individuals with CFS have nutrient deficiencies, such as vitamin C, vitamin B, sodium, magnesium, folic acid and fatty acids, which also seem to influence CFS severity. A paper by Bou-Holaigah et al. (1995) suggests a relation between CFS and neurally mediated hypotension and shows that the treatment of the latter, which includes moderate sodium consumption, was effective in reducing CFS symptoms in a subgroup of individuals. The efforts to define effective CFS markers are clearly under way, as a result of the recent recognition of CFS as an impactful disease. According to Kennedy et al. (2010), the quality of life of children with CFS is comparable to that of children with type 1 diabetes mellitus or asthma.
Shift workers are naturally more susceptible to CFS development due to their unusual sleep and rest habits. According to Costa (2010), night shift work is one of the most studied conditions since it disturbs the sleep cycle and modifies rest patterns, placing significant stress on the regulation of the biological circadian rhythms of humans, who are naturally diurnal beings. As shown by Shen et al. (2006), shift frequency is positively correlated with the intensity of the fatigue experienced by shift workers. The disruption of the circadian rhythms of body functions is responsible for the shift-lag syndrome, which is characterized by feelings of fatigue, sleepiness, insomnia, digestive difficulties, irritability, reduced mental agility and reduced performance (COSTA, 2010). In hazardous work conditions, CFS development potentially increases the chance of fatal accidents. As shown by Useche et al. (2017), structural equation modelling (SEM) results indicate a significant relationship between fatigue and risky driving behaviours among bus drivers. Work on alternating shifts in the mining industry not only fits the definition of hazardous conditions but also includes the factors of irregular sleep and psychological stress. Regardless of the results of this paper, it is worth underlining the importance of current health practices that support mining industry workers, given the hard and perilous work they are exposed to.
Considering the severity of CFS's symptoms and its implications for social and professional life, efforts have been directed not only towards its characterization but also towards its effective prevention. According to Murphy et al. (2011), predictive modelling can be an effective tool in the prevention of the syndrome. A study conducted by Huang et al. (2009) compared three methods for CFS prediction using gene expression data; the Naive Bayes classifier achieved an area under the ROC curve (AUC) of 0.7. Searching for factors relevant to CFS in women suffering from breast cancer, Servaes et al. (2002) used linear regression to examine the contribution of physical, psychological and cognitive factors to the severity of fatigue in patients. Using decision trees, Bronikowski et al. (2011) obtained an accuracy of 71.88% in predicting CFS from answers to a medical questionnaire applied in a community-based CFS study.
Shift workers frequently complain about irritability, anxiety and stressful work conditions. Sleep deficit and persistent circadian rhythm alteration may lead to CFS, neuroticism, chronic anxiety and/or depression, resulting in an increased risk of absenteeism and the need for psychotropic medications (COLQUHOUN and SENN, 2000; NAKATA et al., 2004). Seeking evidence that contributes to the prevention of CFS and to the identification of related factors, this study examines the relation of anthropometric and biochemical variables with the risk of absenteeism in shift workers of the mining industry. The main objectives of the study are: (i) to search for a descriptive model that shows some level of discrimination power for absenteeism and (ii) to relate the effects of the independent variables on absenteeism with the risk factors of CFS. In order to achieve the first objective, Logistic regression will be fitted to the data with the Stepwise approach to variable selection. The Elastic-Net and Lasso regularization methods will also be used to fit the Logistic model in order to verify whether the predictive performance improves due to their flexibility. The second objective will be pursued by relating the effects found by the descriptive model with those reported as risk factors for CFS in the literature. In the next section, the dataset and all the methods used to achieve these objectives are discussed in more detail.

Materials and methods
A cross-sectional study was performed in 2012 on 621 shift workers of a mine located in the Inconfidentes region of the Minas Gerais state, Brazil. The study, entitled "Sindrome Metabólica em Trabalhadores da Mineração do Estado de Minas Gerais", was approved by the Ethics Committee from Universidade Federal de Ouro Preto (CAAE: 0018.0.238.000-11). The individuals work 6-hour shifts operating off-road trucks, which are followed by 12-hour rest periods. The data collected consist of 22 variables: 8 anthropometric variables, 11 biochemical variables, Sex, Age and the variable Skipped, which indicates whether the individual was absent on any given day in the year 2012. The Skipped variable is the dependent variable of this study. It results from interviews of each individual by a medical doctor from the mining company screening for signs of Chronic Fatigue Syndrome; therefore, absenteeism in this study does not include causes such as other common diseases, holidays or problems with the employee's family. A summary of the variables is presented in Table 1. After the descriptive analysis all the variables were standardized; the resulting dataset was used in the rest of the analysis, starting from the imputation of missing values. The next subsections give brief reviews of, and justify the choices of, methods for imputing missing cases, fitting Logistic Regression, measuring model performance, and performing model selection and validation. A study on similar data, measured under the same circumstances, was performed by Souza et al. (2015). That study evaluated the association between a lifetime of shift work in mines and blood pressure, fasting glucose, anthropometric variables, body composition and heart rate variability.

Missing data and missForest imputation
Missing values have been an issue since the beginning of field research, mostly because the analytical procedures used, many of which were developed in the early 20th century, were designed for complete datasets (GRAHAM, 2009). Missingness in biology and medicine is usually caused by sample mishandling, measurement errors or nonresponse, and frequently leads the researcher to remove the cases with missing values, a practice called complete-case analysis (STERNE et al., 2009). In his description of the missingness mechanisms, Rubin (1976) distinguishes three classes: Missing Completely at Random (MCAR), Missing at Random (MAR) and Missing Not at Random (MNAR). Since the data in this study are missing because some measurements were randomly lost, it is assumed that the data are at least MAR.
Many imputation methods for missing data have been proposed, a considerable group of them being based on the mean of observed data, like k-nearest neighbours (KNN), Bayesian principal component analysis (bPCA), random forests (missForest) and multiple imputations by chained equations (MICE) (SCHMITT et al., 2015). The missForest method outperformed other methods in a number of comparative studies using laboratory, medical and biometric datasets (CIHAN, 2018; STEKHOVEN and BÜHLMANN, 2012; WALJEE et al., 2013). It was proposed by Stekhoven and Bühlmann (2012) as a non-parametric method that copes with different types of variables simultaneously by using the random forests model to predict missing values. In short, a random forest is fitted to the observed data and used to estimate the missing cases; the process is repeated for each of the variables with missing cases. More details on the method and the algorithm are described by Stekhoven and Bühlmann (2012). Because they can introduce significant bias both to the analysis and to the imputation, variables with more than 20% missing cases (Phosphorus, Vitamin D and PTH Hormone) were removed from this study at the beginning, while the remaining missing cases were imputed with the missForest method.

Logistic regression, variable selection and regularization
Since the dependent variable of absenteeism is coded as a binary outcome, the Logistic Regression is a natural choice for a descriptive model, therefore being used in this study with multiple fitting procedures. The logistic regression, a generalized linear model assuming a binomial distribution for the response variable, is frequently applied to classification problems and risk scores generation in the medical field. The parameters are estimated with the maximum likelihood method considering the observations as independent and identically distributed (KULOTHUNGAN et al., 2014). A common challenge in the regression with multiple variables is selecting the best model to describe the data and possibly predict new observations.
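For reference, the model described in this paragraph can be written in its standard form. The symbols below ($\pi_i$ for the probability of absenteeism of individual $i$, $x_{ij}$ for the standardized value of variable $j$, and $\beta_j$ for its coefficient) are notation assumed here for illustration; they are not taken from the original text.

```latex
\log\frac{\pi_i}{1-\pi_i} = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij},
\qquad y_i \sim \mathrm{Bernoulli}(\pi_i),
\qquad
\ell(\beta) = \sum_{i=1}^{n} \left[ y_i \log \pi_i + (1 - y_i)\log(1 - \pi_i) \right]
```

Under this notation, maximum likelihood estimation chooses the $\beta$ that maximizes $\ell(\beta)$, and each $\beta_j$ is the change in the log-odds of absenteeism per one standard deviation increase in variable $j$.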
A common approach to model selection is the Stepwise method, which consists of an algorithm for automatic variable selection based on a predefined performance metric (HOCKING, 1976). It was proposed as an alternative to the best-subset selection algorithm, which evaluates every possible subset of size s from the p independent variables (s = 1, ..., p) and is computationally costly. All variations of Stepwise are based on the metric of choice and on two approaches to variable selection: forward selection and backwards elimination. The forward approach consists of starting from a null model (with no variables) and increasing the number of variables by evaluating every candidate variable at each model size. Backwards elimination is very similar, except that it begins with the full model and removes one variable at each step. A commonly used variation is the Bidirectional Stepwise, which starts with the null model but at each step also considers the removal of a variable (EFROYMSON, 1960). One possible limitation of Stepwise arises when the events-per-variable ratio (EPV) is lower than 10. According to Heinze et al. (2018), the EPV quantifies the balance between the information provided by the data and the number of parameters to be estimated. The dataset of this study has EPV = 8.6875 and, since Stepwise considers all variables in the process of choosing the final model, the Logistic Regression fitted to it might suffer from this limitation. As shown by Pavlou et al. (2015), EPV < 10 might result in poorer calibration and therefore impair the predictive performance of the model. The authors also show that regularization methods such as Ridge, Lasso and Elastic-Net might mitigate such limitations by penalizing the coefficients of the independent variables.
An alternative to the Stepwise approach is the Elastic-Net method, which generalizes the Ridge and Lasso methods in order to perform variable selection as well as regularization. The Ridge method, proposed by Hoerl and Kennard (1988), fits the model using an L2-type penalized residual sum of squares:
$$\hat{\beta}^{ridge} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\},$$
which shrinks the parameters towards zero according to λ. The hyperparameter λ is usually chosen with cross-validation in order to optimize some model-performance metric. However, the Ridge method never shrinks a parameter exactly to zero, therefore always returning a full model and not performing variable selection. Aiming to tackle this characteristic, Tibshirani (1996) proposed the Lasso, which introduces an L1-type penalization:
$$\hat{\beta}^{lasso} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\},$$
that performs regularization and eventual variable selection by effectively setting parameters to zero. Despite being more promising, the Lasso method also showed some limitations: (i) when p >> n, the Lasso selects at most n variables; (ii) when n > p and there is multicollinearity, Ridge tends to outperform the Lasso; (iii) when there is a group of highly correlated variables, the Lasso picks just one of them, with no preference as to which. After considering those limitations, Zou and Hastie (2005) proposed the Elastic-Net, combining the L1 and L2-type penalizations into:
$$\hat{\beta}^{enet} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \Big[ \alpha \sum_{j=1}^{p} |\beta_j| + (1 - \alpha) \sum_{j=1}^{p} \beta_j^2 \Big] \right\},$$
with another hyperparameter α that, conjointly with λ, is chosen in order to optimize some performance metric (in this form, α = 1 recovers the Lasso penalty and α = 0 the Ridge penalty). As considered by Zou and Hastie (2005), the Elastic-Net is a generalization of the Lasso that performs well in the situations where the Lasso has limitations. Methods that perform both variable selection and regularization allow further improvement of performance that is not possible with the Stepwise approach. Although regularization methods lack interpretability due to the bias towards zero introduced by the penalty (HEINZE et al., 2018), they can also be used as further evidence of whether the results from Stepwise are consistent.
The Logistic regression was fitted by using Stepwise, Lasso and Elastic-Net for comparison of predictive performance.
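To make the role of the two hyperparameters concrete, the following minimal Python sketch evaluates an Elastic-Net penalty term on a vector of coefficients. It assumes the convex-combination form λ[α·Σ|β| + (1 − α)·Σβ²], with α = 1 giving the Lasso penalty and α = 0 the Ridge penalty; the function name and the coefficient values are hypothetical, and specific software may scale the L2 term differently (for example with an extra factor of 1/2).

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Penalty lam * (alpha * L1 + (1 - alpha) * L2) on non-intercept coefficients.

    Assumed parameterization for illustration only; software packages
    may scale the two terms differently.
    """
    l1 = sum(abs(b) for b in coefs)   # Lasso-type (L1) term
    l2 = sum(b * b for b in coefs)    # Ridge-type (L2) term
    return lam * (alpha * l1 + (1.0 - alpha) * l2)


# Hypothetical standardized coefficients, not values from the study.
coefs = [0.5, -0.25, 0.0]

lasso_penalty = elastic_net_penalty(coefs, lam=0.1, alpha=1.0)  # pure L1
ridge_penalty = elastic_net_penalty(coefs, lam=0.1, alpha=0.0)  # pure L2
mixed_penalty = elastic_net_penalty(coefs, lam=0.1, alpha=0.5)  # blend of both
```

In a grid search, each candidate pair (α, λ) would define one such penalty, and cross-validation would pick the pair with the best value of the chosen metric.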

Model performance measurement
All the methods used in this study depend on a model selection procedure. The most used method for model selection is Cross-Validation (CV) based on a performance metric. Among the most frequent performance metrics for classification models are Accuracy, Sensitivity, Specificity and the area under the receiver operating characteristic (ROC) curve. This study used stratified repeated k-fold Cross-Validation with the AUC metric to perform model selection; both will be explained in more detail in the following subsections. Also called True Predictive Rate, the Accuracy measures the overall performance of the classifier, i.e. the proportion of predictions that match the true class. However, it does not consider the within-class error: if one classifies every case as the target class, the accuracy equals the proportion of that class in the data while every case from any other class is misclassified. Therefore, when used alone, Accuracy may lead to misleading conclusions, which is why it is commonly accompanied by Sensitivity and Specificity (PROVOST et al., 1998). Sensitivity (True Positive Rate) measures the performance of the model conditioned on the cases where the class is the target of prediction, while Specificity (True Negative Rate) conditions the performance on the cases other than the target class, providing more information for decisions based on the model (ALTMAN and BLAND, 1994). As pointed out by Provost et al. (1998), there might be greater interest in optimizing the Sensitivity than in optimizing the Specificity or Accuracy of a model, depending on the research field and problem at hand.
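The three measures can be illustrated with a small Python sketch. The labels below are hypothetical (2 positives and 8 negatives) and the function name is an assumption; the sketch also assumes both classes are present so that no denominator is zero. It shows how a classifier that never predicts the target class can still reach a high Accuracy.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels.

    Assumes at least one positive and one negative case in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical data: 2 positive cases, 8 negative cases.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10  # a classifier that never predicts the target class

accuracy, sensitivity, specificity = classification_metrics(y_true, y_pred)

# Accuracy equals the prevalence-weighted average of the other two measures.
prevalence = sum(y_true) / len(y_true)
weighted = prevalence * sensitivity + (1 - prevalence) * specificity
```

Here the Accuracy is 0.8 even though no positive case is detected, which is the kind of misleading result that motivates reporting Sensitivity and Specificity alongside it.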
For models that predict a continuous probability of the target event, there are infinitely many possible classifiers depending on the chosen probability cut-off, each of them resulting in different measurements of performance. A method that allows choosing a good probability cut-off for a classifier is the Receiver Operating Characteristic curve (ROC curve), which is built upon the Specificity and Sensitivity measures and allows their trade-off to be visualized. A metric of a classifier's discrimination power is the area under the ROC curve (AUC), since it is directly related to the curve's proximity to the perfect or the random classifiers (Figure 1). A perfect classifier would have AUC = 1, while AUC = 0.5 characterizes a random-guessing model. An important property of the AUC metric is its equivalence to the probability that a classifier ranks a randomly chosen positive case above a randomly chosen negative case, which corresponds to the Wilcoxon rank test (HANLEY and MCNEIL, 1982). There is also a relation between AUC and the Gini coefficient (Gini + 1 = 2 × AUC), as shown by Hand and Till (2001).
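The ranking interpretation of the AUC can be sketched directly in Python by counting, over all positive-negative pairs, how often the positive case receives the higher score (ties counted as one half). The labels and scores below are hypothetical and the function name is an assumption made for illustration.

```python
def auc_by_ranking(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities for 3 positive and 3 negative cases.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.7, 0.1]

auc = auc_by_ranking(y_true, scores)  # 8 of the 9 pairs are ranked correctly
gini = 2 * auc - 1                    # relation Gini + 1 = 2 x AUC
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.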

Model selection and evaluation
Cross-validation (CV) is a method used to estimate the performance of a given model in predicting values for new data. The most commonly used CV methods are the k-fold and the leave-one-out, usually applied repeatedly (repeated k-fold) in order to reduce their high variability (VANWINCKELEN and BLOCKEEL, 2012). In the classification context, it is also common to perform stratified cross-validation, which preserves the original class proportions in each fold. Despite these improvements, however, the k-fold methods in general underestimate the true performance as a result of a bias introduced by using only a proportion of (k − 1)/k of the whole data set to fit the model (VANWINCKELEN and BLOCKEEL, 2012). Despite the improvement in accuracy provided by the repeated k-fold, it is still biased and is therefore recommended for model selection but not for estimating model performance (KOHAVI, 1995).
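A stratified k-fold split can be sketched in a few lines of Python: the indices of each class are shuffled and then dealt to the folds in turn, so that every fold keeps approximately the original class proportions. This is a simplified illustration with hypothetical labels and an assumed function name, not the procedure implemented by any particular package.

```python
import random


def stratified_kfold(labels, k, seed=0):
    """Split case indices into k folds that preserve class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Hypothetical outcome vector: 20 positive and 80 negative cases.
labels = [1] * 20 + [0] * 80
folds = stratified_kfold(labels, k=10)

# Each fold serves once as the test set; the other k - 1 folds train the model.
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
```

Repeating the split with different seeds and averaging the resulting metric gives the repeated variant described above.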
The non-parametric bootstrap, proposed by Efron (1979), is used to estimate the variability of any measure of interest that is a function of a representative random sample. Considering a data set with n observations, the general idea of the bootstrap consists of obtaining m resamples of size n, drawn with replacement from the original data, and evaluating the measure of interest in these m resamples, resulting in an empirical distribution of the measure. From the empirical distribution, it is possible to obtain estimates (average), confidence intervals and standard errors for the estimator. The most frequent method for bootstrap confidence intervals is the quantile method, in which a (1 − α) × 100% confidence interval is given by the α/2 and 1 − α/2 quantiles of the bootstrap empirical distribution. A study by Efron (1987) shows that a considerably small coefficient of variability (9%) is obtained when generating 200 bootstrap measurements, which reduces to 1% when m is set to 1,000; this last value is considered a sufficiently large number of resamples. As shown by Kohavi (1995), while the bootstrap usually results in lower variability than k-fold (k = 10 and k = 20), neither of them dominates in terms of relative bias, and whether one outperforms the other depends on the data set.
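The quantile method can be sketched in plain Python as follows. The sample, the statistic (the mean) and the function name are hypothetical choices for illustration, and the quantiles are taken by a simple nearest-rank rule on the sorted bootstrap estimates, which is only one of several possible conventions.

```python
import random
import statistics


def percentile_bootstrap_ci(sample, statistic, m=1000, alpha=0.10, seed=0):
    """Bootstrap mean and (1 - alpha) quantile interval for a statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # m resamples of size n drawn with replacement, one estimate per resample.
    estimates = sorted(
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(m)
    )
    lower = estimates[int((alpha / 2) * (m - 1))]      # alpha/2 quantile
    upper = estimates[int((1 - alpha / 2) * (m - 1))]  # 1 - alpha/2 quantile
    return statistics.mean(estimates), lower, upper


# Hypothetical sample; in the study the statistic would be a model coefficient.
sample = [float(value) for value in range(1, 21)]
boot_mean, lower, upper = percentile_bootstrap_ci(
    sample, statistics.mean, m=1000, alpha=0.10
)
```

With α = 0.10 the interval spans the 5% and 95% quantiles of the empirical distribution, and an effect would be called significant at that level when the interval excludes zero.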
When considering the performance of prediction rules trained on the data, Efron and Tibshirani (1997) proposed the 632+ bootstrap as an attempt to correct the bias inherent to the classical bootstrap and as an improvement on cross-validation. It maintains the reduced variability in relation to k-fold and improves on the bias, making the 632+ bootstrap a more appealing method for model performance estimation. As described by Witten et al. (2016), the chance of a particular observation not being picked at all in a bootstrap resample is $(1 - 1/n)^n \approx e^{-1} = 0.368$ as n grows, which is therefore the expected proportion of cases left out of any resample. The performance of the model fitted to the resample will be rather optimistically biased if evaluated on the resample itself, since the coefficients were estimated from the very same data set. The training set (resample) contains only about 0.632 × 100% of the distinct original cases, despite having size n; therefore, the model originated from it will yield a pessimistically biased performance estimate when evaluated on the remaining 0.368 × 100% of the cases (test). The main idea of the 632+ bootstrap is to evaluate both the training ($\hat{\theta}_{train}$) and the test ($\hat{\theta}_{test}$) performance measures and to obtain the weighted performance estimate $\hat{\theta}_{632} = 0.632 \times \hat{\theta}_{test} + 0.368 \times \hat{\theta}_{train}$, combining the pessimistic test performance with the optimistic training performance. This study performs stratified repeated cross-validation for model selection in order to guarantee the target-class balance in the model selection and to reduce variability in the measurements of performance. In order to obtain the empirical distributions of the parameters, the classical non-parametric bootstrap is used to allow further investigation of the effects of the selected factors on absenteeism. Finally, in order to compare the models' performance in terms of Accuracy, Sensitivity and Specificity, the 632+ bootstrap is used to obtain an estimate of the empirical distributions of model performance.
Confidence intervals are obtained with the quantile approach applied to the empirical distributions estimated by bootstrap; this method is used because it yields intervals for different models by the same non-parametric procedure.
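The two quantities behind the weighted estimate can be checked numerically with a short Python sketch. The training and test performances below are hypothetical values and the function names are assumptions. Only the fixed-weight combination described above is shown; the full 632+ estimator of Efron and Tibshirani (1997) additionally adapts the weight according to the degree of overfitting, which this sketch does not attempt.

```python
def out_of_bag_fraction(n):
    """Chance that a given case is never drawn in a resample of size n."""
    return (1.0 - 1.0 / n) ** n


def estimate_632(train_performance, test_performance):
    """Fixed-weight blend of optimistic (train) and pessimistic (test) values."""
    return 0.632 * test_performance + 0.368 * train_performance


# With n = 621 (the sample size of this study) the fraction is close to 0.368.
oob = out_of_bag_fraction(621)

# Hypothetical accuracies measured on one resample and on its left-out cases.
combined = estimate_632(train_performance=0.70, test_performance=0.55)
```

Repeating this computation over many resamples gives an empirical distribution of the combined estimate, from which quantile intervals can be read.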

Software and packages
All the analyses and plots presented and discussed in this paper were produced using the R Programming language v3.5.1 (R CORE TEAM, 2020) and the RStudio IDE v1.3.125. The packages used were: caret for model fitting and selection (KUHN et al., 2019); ggplot2 for plots and graphics generation (WICKHAM, 2016); missForest for imputation of missing values (STEKHOVEN, 2013); tibble (MÜLLER and WICKHAM, 2019), dplyr and tidyr (WICKHAM and HENRY, 2019) for data manipulation and cleansing; purrr (HENRY and WICKHAM, 2019) for efficient and readable iterations as well as furrr (VAUGHAN and DANCHO, 2018) for parallel processing of iterations in R. The model evaluation by the non-parametric bootstrap and 632+ bootstrap methods was implemented by the author using a portion of the packages cited above.

Results and discussion
The variables with more than 15% missing cases were removed at the start of the analysis, namely Phosphorus, Vitamin D and the PTH hormone measurements. The missForest method was used on the standardized remaining 19 variables for the imputation of missing cases. Afterwards, Weight and Height were removed since BMI already accounts for most of their variability; their removal avoids multicollinearity and leaves 17 variables in the whole dataset. Stepwise selection and the grid search of parameters for Lasso and Elastic-Net were done using stratified 10-fold cross-validation repeated 10 times, with the area under the ROC curve as the optimization metric. The resulting models were used to perform the non-parametric bootstrap on the coefficients as well as the 632+ bootstrap on the measurements of Accuracy, Sensitivity and Specificity. The subsequent subsections present the performance and estimated coefficients of the resulting models as well as their discussion in the context of this paper's objectives. It is important to underline that the significance level for this study is set to α = 0.10, given that it is an early, cross-sectional study that aims to detect possible relations for further investigation. The discussion also takes into consideration that the absenteeism occurrences were evaluated by a medical doctor through interviews with the individuals as a means to rule out causes clearly unrelated to Chronic Fatigue Syndrome.

Resulting models and effect sizes
The resulting models (Stepwise, Lasso with λ = 0.02409091 and Elastic-Net with α = 0.5868687 and λ = 0.03787879), selected based on the imputed data, were fitted to both the imputed and complete-cases datasets for comparison. The models from the imputed and complete-cases data sets were the same, except that the imputed data yielded more precise confidence intervals because it has more observations (621 as opposed to 501). Also, the coefficient for Potassium was shrunk to zero in the complete-cases Elastic-Net, which was not the case for the model from imputed data. Given that, the results for the complete-cases models were omitted for the sake of brevity. The relative effects of each model are presented in Figure 2 as a first comparison of the effects obtained by the three methods. As shown by the relative effects in Figure 2, and accounting for the fact that each model has a different number of variables, the coefficients for HDL and Sodium had similar relevance in both Lasso and Elastic-Net. Total Cholesterol ranked third in effect size in both of these models, while in the Stepwise model it was the coefficient with the highest relevance. Also, the relevance of Total Cholesterol in the Lasso differed markedly from that in the other two models, which is also explained by the higher relevance of HDL and Sodium in the Lasso, as shown in Table 2. The fact that Triglycerides and LDL had similar effects in the Stepwise model, contrary to the effects of HDL in the regularized models, might be explained by their complementary property as components of Total Cholesterol. The coefficients are presented in Table 2 with the bootstrap estimates and confidence intervals.
Despite the confidence intervals for Lasso and Elastic-Net having limits relatively close to zero, all confidence intervals indicated significant effects for the selected variables in the models. All models agreed on negative effects of Total Cholesterol and Sodium on the risk of skipping work, meaning that individuals who are 1 standard deviation above the average in Total Cholesterol and Sodium would have a decrease of 0.6458 and 0.1711, respectively, in the log-odds of skipping work according to the Stepwise coefficients. However, the coefficients for LDL and Triglycerides from the Stepwise model indicate an increase in the log-odds of absenteeism. Although the regularized models do not suggest significant effects for LDL and Triglycerides, the effect of HDL is significant and opposite in direction to that of LDL, since both are components of Total Cholesterol together with Triglycerides. The Elastic-Net fitted to the imputed data was the only model to suggest an effect of Potassium; therefore, despite being statistically significant at α = 0.10 according to the CI, Potassium was considered the variable with the least evidence of significance. The bootstrap densities that originated the confidence intervals of Table 2 are shown in Figure 3 together with the intervals and bootstrap estimates for further understanding. The densities in Figure 3 underline the main reason for the confidence intervals of the regularized regressions being so close to zero, namely that regularization introduces a bias towards zero in the estimates of the coefficients, as stated by Heinze et al. (2018). One should note the difference between the regularized estimates for Total Cholesterol and their bootstrap counterparts in Table 2, meaning that, according to the bootstrap estimates, the regularization techniques (Lasso and Elastic-Net) estimated a coefficient far from the average of its empirical distribution.
However, the bootstrap estimates are simply averages, and the Figure 3 densities show that the regularized estimates were closer to the highest-density region of the empirical distribution, which does not coincide with the mean as it would for symmetric distributions. This effect, although weaker, is also present in the other regularized coefficients.
Other than being a tool for variable selection and improvement of predictions, the regularized regressions do not allow for clear interpretation of the coefficients, therefore remaining only as a concordance measurement in this context.
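The Stepwise log-odds coefficients quoted above for Total Cholesterol and Sodium can be translated into odds ratios by exponentiation, which some readers find easier to interpret. The short Python sketch below performs only this arithmetic on the two reported coefficients; the variable names are illustrative, and the interpretation assumes the other variables in the model are held constant.

```python
import math

# Stepwise coefficients reported in the text (change in log-odds per 1 SD).
stepwise_log_odds = {"Total Cholesterol": -0.6458, "Sodium": -0.1711}

# exp(beta) is the multiplicative change in the odds of skipping work.
odds_ratios = {
    name: math.exp(beta) for name, beta in stepwise_log_odds.items()
}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio per 1 SD increase = {ratio:.3f}")
```

Under these assumptions, being one standard deviation above average corresponds to odds of skipping work that are roughly 48% lower for Total Cholesterol and roughly 16% lower for Sodium.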
Although it is counter-intuitive for higher-than-average levels of Sodium to be a beneficial factor, there are studies that agree with these results in the context of Chronic Fatigue Syndrome, which suggests the relation with absenteeism. According to Rowe and Calkins (1998), there is a substantial body of clinical evidence supporting the relationship of various forms of hypotension (including the neurally mediated) with CFS and idiopathic fatigue. More recently, a pilot study conducted by Comhaire (2018) also suggests the benefits of sodium dichloroacetate treatment for patients with the syndrome. As for the cholesterol variables (Total, HDL, LDL and Triglycerides), there are also studies supporting the evidence found in the analysis presented in this paper. A clinical study performed by De Lorenzo et al. (1998) indicated that patients with CFS had higher levels of Triglycerides and lower levels of HDL when compared with patients without the syndrome. Also, the ratio HDL/Total cholesterol was significantly lower in CFS patients, suggesting that higher Total Cholesterol conditioned on lower levels of Triglycerides and HDL was associated with a lower risk of CFS. More recently, a study conducted by Tomic et al. (2012) with a female group of patients also found higher levels of Triglycerides and lower levels of HDL in the CFS group than in the control group; the study found no evidence of a difference in total and LDL cholesterol between groups. Lastly, for Potassium, Dechene (1993) suggests a relation between low levels of potassium and an increased risk of CFS. However, the studies found in the literature that address this potential relation were inconclusive: a study by Nijs et al. (2003) showed that while some patients with CFS had low levels of Potassium, others showed high levels, therefore concluding that they presented abnormal levels of the mineral; another study, conducted by Lerner et al. (1997), found no evidence of a difference in Potassium levels between CFS and control groups.

Performance evaluation
All models showed significant effects for variables relating them to the risk of absenteeism and, with support from the literature, indirectly relating absenteeism to the risk of developing Chronic Fatigue Syndrome. Given this, the possibility of discriminating between groups with high and low risk of absenteeism becomes of interest as a way to increase the success of CFS prevention. The ROC curves for the three models obtained in this study are presented in Figure 4 (a) as a first assessment of the models' discrimination power. The areas under the curve (AUC) obtained through model selection were 0.5843, 0.5697 and 0.5746 for Stepwise, Lasso and Elastic-Net, respectively. Both the AUC values and the ROC curves were very similar across models, and values only slightly above 0.50 (the AUC of a random classifier) suggest poor discrimination. Because the AUC was also the metric used for model selection, it is necessary to measure the variability of the performance measures. To assess the performance and its variability, Figure 4 (b) presents the .632+ bootstrap estimates, confidence intervals and densities of Accuracy (ACC), Sensitivity (SNS) and Specificity (SPC) for all three models. As suggested by the low AUC, the results in Figure 4 (b) confirm that the performance is not only poor but also not statistically different from that of a random classifier at α = 0.10. The AUC is omitted from the plot because its use as the optimization metric biases it upwards, forcing statistical significance according to the non-parametric confidence interval and thus inviting misinterpretation. Although the performance results were not significant, the empirical densities obtained by the .632+ bootstrap were very similar across the three models. 
This lack of discrimination power despite the detected variables might be related to the noise inherent to the dependent variable's nature, since it records only whether there was at least one occurrence of absenteeism during the whole year of 2012. The lower variability in Accuracy compared with Sensitivity and Specificity is expected, because Accuracy is a weighted average of the latter two, with weights given by the proportions of positive and negative cases.
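The statement that Accuracy is a weighted average of Sensitivity and Specificity can be checked numerically. The following is a minimal sketch in Python, not the code used in this study: the function name `confusion_rates` and the labels are hypothetical and chosen only for illustration. It verifies that ACC = p × SNS + (1 − p) × SPC, where p is the proportion of positive cases.

```python
def confusion_rates(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity, prevalence) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    n_pos, n_neg = tp + fn, tn + fp
    n = n_pos + n_neg
    accuracy = (tp + tn) / n
    sensitivity = tp / n_pos
    specificity = tn / n_neg
    prevalence = n_pos / n
    return accuracy, sensitivity, specificity, prevalence


# Hypothetical labels (illustrative only, not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]

acc, sns, spc, prev = confusion_rates(y_true, y_pred)

# Accuracy as the prevalence-weighted average of sensitivity and specificity.
weighted = prev * sns + (1 - prev) * spc
print(acc, weighted)
```

Because Accuracy averages the two class-specific rates, it will typically vary less across bootstrap resamples than either rate alone.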

Strengths and limitations
This study is strengthened by the fact that the database has 621 observations of individuals, a size not commonly seen in most clinical trials. In addition, the clinical measurements were obtained with appropriate techniques, performed in a laboratory by trained professionals and researchers of the medical field. On the other hand, the study is limited by its cross-sectional design, which assesses the individuals at a single point in time, so neither causation nor strong evidence can be confidently inferred from the observed data. A further limitation is the confounding or noise inherent to the measurement of the dependent variable of absenteeism (Skipped), which only indicates whether absenteeism occurred at some point during the whole year of 2012. This limits the strength of the detected relations and might hide additional relations due to confounding factors. Additional studies with more precise measurements of absenteeism, such as the number of occurrences, are necessary for further investigation, as is a longitudinal framework to allow causal inference.

Conclusions
Chronic Fatigue Syndrome (CFS), or myalgic encephalomyelitis (ME), is a critical disease whose severity is comparable to that of type 1 diabetes mellitus or asthma. Yet the corpus of evidence on its causes, as well as on its relations with other conditions, is still to be consolidated, and efforts are ongoing in the clinical scientific community. This study aimed to contribute to the corpus of evidence on CFS/ME by indirectly assessing the relations between CFS and absenteeism in shift workers of the mining industry, individuals who work in an environment associated with a higher-than-usual risk of CFS. The models obtained in this study showed no significant discrimination power between individuals with higher and lower risk of absenteeism, despite showing significant effects for several variables. However, the detected effects of 5 out of 6 significant variables were found to be related to factors present in cases of Chronic Fatigue Syndrome according to the reviewed literature. These findings amount to some evidence of a relation between absenteeism and CFS/ME and indicate the need for further investigation. The lack of discrimination power despite the presence of significant variables might be due to the noise inherent to the dependent variable's nature, since it only indicates whether absenteeism occurred during the whole year of 2012. This study's inferences are not enough to suggest interventions aimed at preventing disasters related to mining work. However, we hope to draw the attention of direct and indirect agents to the important relations identified, which might affect the health and quality of life of these workers. Future studies with more precise measurements of absenteeism and with longitudinal frameworks might reveal stronger effects of the selected variables as well as significant discrimination power.