Sensorial analysis of categorized data of special coffee to identify similar crop seasons pairs using Kappa
Abstract
This paper proposes a statistical method for analysing dependent agreement data with ordinal categorical responses in a longitudinal study of the sensory analysis of special coffee. The sensory attributes of the special coffees were assessed by certified raters using a continuous grading scale. The approach applied data categorization methods commonly used in machine learning, which not only produced a concise summary of the continuous attributes but also made it possible to maximize the agreement between grades across the longitudinal study. A preliminary analysis was carried out to identify the similarity of grades across all sample units. The categorization allowed marginal models to be constructed for all distinct pairs of time points observed in the longitudinal study, in order to model the kappa agreement coefficient. The results also indicate that samples from harvests of yellow-fruited coffee have similar sensory characteristics. Higher altitudes are significantly favorable to obtaining samples with similar sensory characteristics, and the analysis identified the set of covariates that contributed positively or negatively to the kappa estimates.
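As a rough illustration of the two steps summarized above (categorizing continuous grades, then measuring agreement between a pair of crop seasons), the sketch below computes an unweighted Cohen's kappa in plain Python. It is not the authors' procedure: the fixed cut points stand in for the supervised categorization methods the abstract refers to, no marginal regression model is fitted, and the scores, cut points, and function names are hypothetical.

```python
from collections import Counter


def categorize(score, cutpoints):
    """Map a continuous score to an ordinal category index (0, 1, 2, ...).

    `cutpoints` must be sorted in ascending order; a score equal to a
    cut point falls into the higher category.
    """
    return sum(score >= c for c in cutpoints)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equally long label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of samples placed in the same category.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from the two marginal distributions.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical grades for the same six samples in two crop seasons.
season_1 = [82.0, 84.5, 86.0, 88.5, 90.0, 83.0]
season_2 = [81.5, 85.0, 87.5, 88.0, 91.0, 86.5]
cutpoints = [84.0, 88.0]  # hypothetical cut points, not taken from the paper

cat_1 = [categorize(s, cutpoints) for s in season_1]
cat_2 = [categorize(s, cutpoints) for s in season_2]
kappa = cohen_kappa(cat_1, cat_2)
print(f"kappa between the two seasons: {kappa:.2f}")
```

Repeating the calculation for every distinct pair of seasons would give the set of pairwise kappa values that the marginal models in the paper are meant to explain through covariates such as altitude.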
This work is licensed under a Creative Commons Attribution 4.0 International License.