Some Coding Notes (extended) from Responses to Sexism journal
Procedural Memo: 10/22/97
Unitizing Reliability: Figures and Formulas

SUMMARY OF SOURCES

Holsti, O.R. (1969). Content analysis for the social sciences and humanities. Reading, MA: Addison-Wesley.

HOLSTI (pp. 138-141) provides a couple of pretty basic formulas for determining intercoder (IC) reliability. One follows the formula:
C.R. = 2M / (N1 + N2), where "M is the number of coding decisions on which the two judges are in agreement, and N1 and N2 refer to the number of coding decisions made by judges 1 and 2, respectively" (p. 140).
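A minimal sketch of that calculation in Python (the function and variable names are mine, not Holsti's):

    def holsti_cr(m, n1, n2):
        # C.R. = 2M / (N1 + N2)
        # m  = number of coding decisions the two judges agree on
        # n1, n2 = total coding decisions made by judge 1 and judge 2
        return (2 * m) / (n1 + n2)

    # e.g., 90 shared decisions out of 100 made by each judge -> C.R. = .90
    print(holsti_cr(90, 100, 100))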
"This formula has been criticized, however, because it does not take into account the extent
of inter-coder agreement which may result from chance (Bennett, Alpert, and Goldstein, 1954). By chance alone, agreement should increase as the number of categories decreases" (p. 140).
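A quick back-of-the-envelope check of that point, under the simplifying assumption that every category is used equally often and both coders assign at random:

    # chance agreement with k equally likely categories = k * (1/k)^2 = 1/k
    for k in (2, 4, 10):
        print(k, "categories -> chance agreement of", 1 / k)   # .50, .25, .10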
Scott's pi corrects not only for the number of categories in the category set, but also for the probable frequency with which each is used (reword this before it goes in press; several phrases are word for word):
pi = (% observed agreement - % expected agreement) / (1 - % expected agreement)
% expected agreement is found by "finding the proportion of items falling into each category of a category set, and summing the square of those proportions" (p. 140). Holsti gives an example, but the example only seems to reflect the categories of one of the two coders in his comparison; or it might reflect both, but in that example both coders have used each of the 4 categories the same number of times, and I don't think that would always happen in real life. Holsti (1969) gives a third method, but it does not seem to apply to the case at hand.
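To make sure I understand the pieces, here is a rough sketch of the whole pi calculation in Python. It reflects my reading of the formula, with the expected-agreement proportions pooled across both coders; check it against Holsti's worked example before relying on it.

    def scotts_pi(codes_a, codes_b):
        # codes_a, codes_b: the category each coder assigned to the same units,
        # listed in the same order
        n = len(codes_a)
        # % observed agreement: share of units the two coders coded identically
        observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
        # % expected agreement: proportion of items falling into each category
        # (pooled over both coders here), squared and summed
        expected = 0.0
        for cat in set(codes_a) | set(codes_b):
            p = (codes_a.count(cat) + codes_b.count(cat)) / (2 * n)
            expected += p ** 2
        return (observed - expected) / (1 - expected)

    # made-up toy data: 4 units, agreement on 3 of them -> pi of about .62
    print(scotts_pi(["x", "x", "y", "z"], ["x", "y", "y", "z"]))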
While there are several other methods cited (from the 1940s and 1950s), Holsti (1969) sees Scott's pi as a good conservative estimate.
However, note that the problem with the first formula (henceforth called Holsti's formula) is that it capitalizes on chance *when there is a small number of categories*.

Guetzkow, H. (1950). Unitizing and categorizing problems in coding qualitative data. Journal of Clinical Psychology, 6, 47-58.
              CODER A    CODER B
Totals          289        312

These totals were determined by adding the number of units each coder saw across 5 questions for 20 surveys.
U = (O1 - O2) / (O1 + O2)
where O1 is the number of units Observer 1 sees in a text, and O2 is the number of units Observer 2 sees in a text.
U = (289 - 312) / (289 + 312)
  = -23 / 601
  = -.038
Since U is actually a measure of disagreement (Folger, Hewes, & Poole), we could say that there is an agreement of 1 - |U| = .962.
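The same arithmetic as a quick script (taking 1 - |U| as the agreement figure is my reading of the point above, not a formula given in the sources):

    def guetzkow_u(o1, o2):
        # U = (O1 - O2) / (O1 + O2); a measure of unitizing DISagreement
        return (o1 - o2) / (o1 + o2)

    u = guetzkow_u(289, 312)
    print(round(u, 3))           # -0.038
    print(round(1 - abs(u), 3))  #  0.962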
One of the problems with this figure is that it can obscure many differences. For example, in our data set there were several occasions when Coder A saw one more unit than Coder B did, and others where Coder B saw one more than Coder A. Depending on how the units of a "text" are calculated, the differences can cancel each other out, giving an inflated reliability.
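A toy illustration of that cancellation (the counts below are made up, not taken from our data):

    # hypothetical per-question unit counts: (Coder A, Coder B)
    per_question = [(10, 11), (12, 11), (9, 10), (11, 10)]

    def guetzkow_u(o1, o2):
        return (o1 - o2) / (o1 + o2)

    # pooled totals are identical, so U says the coders agree perfectly...
    total_a = sum(a for a, b in per_question)
    total_b = sum(b for a, b in per_question)
    print(guetzkow_u(total_a, total_b))   # 0.0

    # ...even though they disagreed on the number of units in every question
    print([round(guetzkow_u(a, b), 3) for a, b in per_question])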
Folger et al. state the problem clearly:
"Although Guetzkow's index is certainly useful, it falls short of being ideal. To be ideal, an index of unitizing reliability should estimate the degree of agreement between two or more coders in identifying specific segments of text. That is, an ideal index should quantify the unit-by-unit agreement between two or more coders. Neither U nor his more sophisticated index based on U does this. Guetzkow's indices only show the degree to which two coders identify the same number of units in a text of fixed length, not whether those units were in fact the same units." (p. 120).
At the same time, Folger et al. suggest the index may be appropriate in certain situations. They suggest (but do not demonstrate) a way of looking at agreement in each objective segment (in our case, the amount of agreement for each question?) and then calculating across segments; a rough sketch of how that might work for us follows the cites below. They refer the reader to:
Hewes et al. (1980)
Newtson & Engquist (1976)
Newtson et al. (1977) and
Ebbeson & Allen (1979)
for examples.
I will later check on the cites. One of them is:
Hewes, D.E., Planalp, S.K., & Streibel, M. (1980). Analyzing social interaction: Some excruciating models and exhilarating results. Communication Yearbook, 4, 123-144.
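Pending a look at those cites, here is my guess at how the "agreement within each segment, then across segments" idea could be run on our data (per question, then averaged). This is an assumption about what Folger et al. intend, not something spelled out in the sources, and the per-question splits below are made up (chosen only so they sum to the 289/312 totals above, NOT our actual per-question counts):

    # hypothetical per-question unit counts for the 5 questions: (Coder A, Coder B)
    per_question = [(58, 62), (60, 60), (57, 63), (55, 64), (59, 63)]

    # agreement within each question, taken as 1 - |U| for that question
    per_q_agreement = [1 - abs((a - b) / (a + b)) for a, b in per_question]

    # then a figure across segments -- here just the simple average
    print([round(x, 3) for x in per_q_agreement])
    print(round(sum(per_q_agreement) / len(per_q_agreement), 3))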
Folger et al. ask:
"Is it always necessary to go to so much work to provide evidence of unitizing reliability? Probably not in all cases. If one is using an exhaustive coding system, i.e., a coding system in which each and every act is coded, and Guetzkow's U is quite low, perhaps .10 or below, it may prove unnecessary to perform a unit-by-unit [segment by segment?] analysis. Similarly, if the actual unit is relatively objective and easily coded, Guetzkow's indices may suffice. On the other hand, if the units are subjective, the coding scheme is not exhaustive, or the data are to be used for sequential analysis (lagged-sequential analysis, Markov processes, etc.), unit-by-unit analysis is essential. In any event, some measure of unitizing reliability should be reported in any quantitative study of social interaction." (p. 121).