Applying informetric characteristics of databases to IR system file design, Part I: Informetric models
Published in | Information processing & management Vol. 28; no. 1; pp. 121 - 133 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published | Oxford: Elsevier Ltd, 1992 (publisher also listed as Elsevier Science; Pergamon Press; Elsevier Science Ltd) |
Subjects | |
ISSN | 0306-4573 1873-5371 |
DOI | 10.1016/0306-4573(92)90098-K |
Summary: | This study examines how informetric characteristics of information retrieval (IR) system databases can help the systems designer decide which types of file structures would provide the best performance for a given type of information system environment. This first of two papers deals with the development of appropriate models describing database contents, to be used later in a simulation study. Database characteristics for which data were collected include the index term frequency distribution, the distribution of terms used per query, and the distribution of term frequency selections. A shifted generalized Waring distribution was found to provide the best fit for the index term distributions with the large data sets used. For the terms used per query, a shifted negative binomial was found to provide a reasonable fit. A complex relationship was observed for the term selection distribution data, for which the empirical distribution is used. In addition, four other hypothetical term selection relationships are presented. With this information, a simulation study examining system performance under different informetric environments can be undertaken. |
---|---|
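
The summary names two fitted models without giving their forms. As a rough illustration only, the sketch below evaluates the probability mass functions of a shifted negative binomial and a shifted generalized Waring distribution in their common textbook parametrizations. The function names, the parameter names (`r`, `p`, `a`, `k`, `rho`), and the default shift of 1 are assumptions made for this example; the record does not state the parametrization or the fitted values used in the paper.

```python
from math import exp, lgamma, log


def shifted_neg_binomial_pmf(x, r, p, shift=1):
    """P(X = x) for X = Y + shift, where Y ~ NegBin(r, p) counts failures.

    Assumes r > 0 and 0 < p < 1. The shift of 1 is an assumption (for
    example, a query containing at least one term), not a value from the paper.
    """
    n = x - shift
    if n < 0:
        return 0.0
    # log of Gamma(n + r) / (Gamma(r) * n!) * p^r * (1 - p)^n
    log_pmf = (lgamma(n + r) - lgamma(r) - lgamma(n + 1)
               + r * log(p) + n * log(1.0 - p))
    return exp(log_pmf)


def shifted_gen_waring_pmf(x, a, k, rho, shift=1):
    """P(X = x) for X = Y + shift, where Y ~ generalized Waring(a, k; rho).

    Uses the standard form with rising factorials (a)_n and (k)_n:
        P(Y = n) = C * (a)_n * (k)_n / ((a + k + rho)_n * n!)
        C = Gamma(a + rho) * Gamma(k + rho) / (Gamma(rho) * Gamma(a + k + rho))
    Assumes a, k, rho > 0.
    """
    n = x - shift
    if n < 0:
        return 0.0
    log_norm = (lgamma(a + rho) + lgamma(k + rho)
                - lgamma(rho) - lgamma(a + k + rho))
    # Rising factorial (z)_n = Gamma(z + n) / Gamma(z), computed in log space.
    log_term = ((lgamma(a + n) - lgamma(a))
                + (lgamma(k + n) - lgamma(k))
                - (lgamma(a + k + rho + n) - lgamma(a + k + rho))
                - lgamma(n + 1))
    return exp(log_norm + log_term)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    for x in range(1, 6):
        print(x,
              shifted_neg_binomial_pmf(x, r=2.0, p=0.5),
              shifted_gen_waring_pmf(x, a=1.0, k=1.0, rho=2.0))
```

Working in log space with `lgamma` avoids overflow in the gamma functions when term frequencies are large, which matters for the long-tailed index term distributions the summary describes.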