Sensorial analysis of categorized data of special coffee to identify similar crop seasons pairs using Kappa

Jackelya Silva
Marcelo Angelo Cirilo
Flávio Meira Borém
Diego Egídio Ribeiro
Lourenço Manuel


This paper proposes a statistical method for analysing dependent agreement data with ordinal categorical responses in a longitudinal study of the sensory analysis of special coffee. The sensory attributes of the special coffees were assessed by certified raters using a continuous scale of grades. The approach applies data categorization methods commonly used in machine learning, which not only produce a concise summary of the continuous attributes but also make it possible to maximize the agreement between grades across the longitudinal study. A preliminary analysis was carried out to identify the similarity of grades across all sample units. The categorization allowed marginal models to be constructed for all distinct pairs of time points observed in the longitudinal study, in order to model the kappa agreement coefficient. The analysis also led to the conclusion that samples from harvests of yellow-grain fruits have similar sensory characteristics, and that higher altitudes significantly favour samples with similar sensory characteristics. The models further identified the set of covariates that contributed positively or negatively to the estimated kappa.
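
To make the agreement measure concrete, the following is a minimal sketch of categorizing continuous grades and computing an unweighted Cohen's kappa between two crop seasons. The grades, the cut points and the function names `categorize` and `cohen_kappa` are hypothetical and chosen only for illustration. The paper itself selects the categories with machine-learning categorization methods and models kappa through marginal models, neither of which is reproduced here.

```python
from collections import Counter


def categorize(grade, cut_points):
    """Map a continuous grade to an ordinal category index (0, 1, 2, ...).

    cut_points is an ascending list of thresholds (an assumption of this
    sketch, not the categorization used in the paper).
    """
    return sum(grade >= c for c in cut_points)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length lists of categories."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed proportion of samples placed in the same category.
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_exp = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical grades for the same eight samples in two crop seasons.
season_1 = [82.0, 84.5, 86.0, 88.5, 90.0, 83.0, 87.0, 85.5]
season_2 = [81.5, 85.0, 86.5, 89.0, 88.0, 84.0, 87.5, 83.0]
cuts = [85.0, 88.0]  # hypothetical cut points giving three ordinal categories

cat_1 = [categorize(g, cuts) for g in season_1]
cat_2 = [categorize(g, cuts) for g in season_2]
kappa = cohen_kappa(cat_1, cat_2)
print(round(kappa, 3))
```

A kappa close to 1 for a pair of seasons would indicate that the samples tend to fall in the same category in both seasons, while a value near 0 would indicate agreement no better than chance.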

Article Details

How to Cite
Silva, J., Cirilo, M. A., Borém, F. M., Ribeiro, D. E., & Manuel, L. (2023). Sensorial analysis of categorized data of special coffee to identify similar crop seasons pairs using Kappa. Brazilian Journal of Biometrics, 41(1), 30–43.

