On probabilistic notions of precision as a function of recall
Published in | Information processing & management Vol. 28; no. 3; pp. 291 - 315 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Oxford, Elsevier Ltd, 1992. Elsevier Science, Pergamon Press, Elsevier Science Ltd |
Summary: | Two problems that arise when recall and precision are used to evaluate information retrieval systems are due to the weak ordering of the documents generated by the system and to evaluation with multiple queries. Although several alternative stopping criteria are available, our emphasis in this paper is on defining precision when recall is used as the stopping criterion. A number of different probabilistic notions of precision for handling the problem of weak ordering have been proposed in the past, including PRECALL, probability of relevance given retrieval (PRR), and expected precision (EP). Recently Raghavan et al. provided a comparative analysis of PRECALL, PRR, and EP. They showed that previous usages of PRECALL for dealing with the problem of weak ordering and interpolation, which involved the application of the ceiling operation, are inconsistent, and that the results obtained are not easy to interpret. Consequently, they introduced an interpolation scheme, termed intuitive interpolation, that leads to consistent and meaningful handling of averaging results given by PRR over multiple queries. A simple way of calculating PRR was also given. However, a comparable analysis of precision defined as EP has not been provided. Furthermore, although several alternative ways of defining precision in a probabilistic sense are available, no theoretical basis exists for deciding which alternative to use in a specific situation. This paper first investigates an efficient way of calculating EP and an interpolation scheme for averaging EP that are consistent with the intuitive interpolation scheme proposed for PRR. In addition, PRECALL with intuitive interpolation is termed R-B Precision, and is shown to have an interpretation as the value of PRR and EP in the limit. From this result, PRR and EP are shown to be attractive in their ability to present experimental results in a descriptive sense. In contrast, in situations where experimental tests are intended for predictive use, R-B Precision is shown to be a better choice. |
---|---|
ISSN: | 0306-4573 1873-5371 |
DOI: | 10.1016/0306-4573(92)90077-D |
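
The summary contrasts PRR and EP under weak ordering without stating formulas. As a rough illustration only, the sketch below assumes the commonly cited definitions, which are not taken from this record: PRR divides the number of relevant documents requested by that number plus Cooper's expected search length, and EP averages precision over all equally likely orderings of the tied final rank. The function names, and the representation of a ranking as `(relevant, nonrelevant)` counts per tied rank, are hypothetical, and this is not the calculation method developed in the paper.

```python
from fractions import Fraction
from math import comb


def final_rank_stats(ranks, needed):
    """Locate the tied rank in which the `needed`-th relevant document falls.

    `ranks` is a list of (relevant, nonrelevant) counts, best rank first.
    Returns (j, s, r, i): nonrelevant documents in earlier ranks, relevant
    documents still needed from the final rank, and the final rank's own
    relevant and nonrelevant counts.
    """
    if needed < 1:
        raise ValueError("needed must be at least 1")
    seen_rel = seen_non = 0
    for rel, non in ranks:
        if seen_rel + rel >= needed:
            return seen_non, needed - seen_rel, rel, non
        seen_rel += rel
        seen_non += non
    raise ValueError("fewer relevant documents than requested")


def prr(ranks, needed):
    """Assumed definition: needed / (needed + expected search length)."""
    j, s, r, i = final_rank_stats(ranks, needed)
    expected_search_length = j + Fraction(s * i, r + 1)
    return Fraction(needed) / (needed + expected_search_length)


def expected_precision(ranks, needed):
    """Assumed definition: mean precision over random orderings of the tie."""
    j, s, r, i = final_rank_stats(ranks, needed)
    orderings = comb(r + i, i)
    total = Fraction(0)
    for k in range(i + 1):
        # Orderings in which exactly k nonrelevant documents precede
        # the s-th relevant document of the final rank.
        ways = comb(k + s - 1, k) * comb(r - s + i - k, i - k)
        total += Fraction(ways, orderings) * Fraction(needed, needed + j + k)
    return total


# Two relevant documents in the top rank, then a tie of one relevant
# and two nonrelevant documents; all three relevant documents are requested.
ranks = [(2, 0), (1, 2)]
print(prr(ranks, 3), expected_precision(ranks, 3))
```

Under these assumed definitions EP is never smaller than PRR, since averaging precision over orderings differs from evaluating precision at the average search length.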