Interrater Agreement and Combining Ratings (November 29th, 2005)
Some behaviors, such as smiles, require human raters for their measurement. A model of the rating process is explored that assumes the probability distribution of overt rating responses depends on which of several underlying, or latent, responses occurred. The ideal of theoretically identical raters is considered, along with departures from such identity. Methods for parameter estimation and for assessing the goodness of fit of the model are presented, together with a test of the hypothesis of identical raters. Simulated data are used to explore different measures of agreement, optimal numbers of raters, how the ratings from multiple raters should be combined into a final score for subsequent analysis, and the consequences of departures from the basic assumptions of identical raters and constant underlying response probabilities. The results indicate that often using two or three raters to rate all of the data, assessing the quality of their ratings by their pairwise correlations, an…
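The two practical steps the abstract highlights, checking rater quality via pairwise correlations and combining multiple raters' scores into a single value, can be illustrated with a minimal sketch. The ratings matrix below is hypothetical (invented for illustration), and simple averaging is only one possible combination rule, not necessarily the one the paper recommends:

```python
import numpy as np
from itertools import combinations

# Hypothetical data: rows = rated subjects, columns = raters
# (e.g., smile intensity on a 1-5 scale). Values are invented.
ratings = np.array([
    [3, 4, 3],
    [1, 1, 2],
    [5, 4, 5],
    [2, 2, 2],
    [4, 5, 4],
    [1, 2, 1],
])

# Pairwise Pearson correlations between raters serve as a quick
# check on rating quality, as the abstract suggests.
n_raters = ratings.shape[1]
for i, j in combinations(range(n_raters), 2):
    r = np.corrcoef(ratings[:, i], ratings[:, j])[0, 1]
    print(f"raters {i} and {j}: r = {r:.2f}")

# One simple way to combine multiple ratings into a final score
# for subsequent analysis: average across raters for each subject.
final_scores = ratings.mean(axis=1)
print("final scores:", final_scores)
```

High pairwise correlations suggest the raters are behaving near-identically, in which case an average (or sum) of their ratings is a natural final score; low correlations would instead point to departures from the identical-raters assumption the model tests for.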