Extracting Know-Who/Know-How Using Development Project-Related Taxonomies
Published in | IEICE Transactions on Information and Systems Vol. E93.D; no. 10; pp. 2717 - 2727 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | The Institute of Electronics, Information and Communication Engineers, 2010 |
Summary: | Product developers frequently discuss topics related to their development project with others, but often use technical terms whose meanings are not clear to non-specialists. To give non-experts a precise and comprehensive understanding of the know-who/know-how being discussed, the method proposed herein categorizes the messages using a taxonomy of the products being developed and a taxonomy of tasks relevant to those products. The instances in the taxonomy are products and/or tasks manually selected as relevant to system development. The concepts are defined by the taxonomy of instances. The proposed method first extracts phrases from discussion logs as data-driven instances relevant to system development. It then classifies those phrases into the concepts defined by taxonomy experts. The innovative feature of our method is that, when classifying a phrase into a concept, say C, it considers the associations of the phrase not only with the instances of C but also with the instances of the neighbor concepts of C (neighbors are defined by the taxonomy). This approach classifies phrases into concepts quite accurately: the phrase is assigned to C rather than to the neighbors of C, even though those neighbors are quite similar to C. Next, we attach a data-driven concept to C; the data-driven concept includes the instances in C and the classified phrase as a data-driven instance. We analyze know-who and know-how by using not only the human-defined concepts but also these data-driven concepts. We evaluate our method using the mailing list of an actual project. It classified phrases with twice the accuracy of the TF-IDF method, which does not consider the neighboring concepts. The taxonomy with data-driven concepts provides more detailed know-who/know-how than can be obtained from the human-defined concepts alone or from the data-driven concepts determined by the TF-IDF method. |
---|---|
ISSN: | 0916-8532 1745-1361 |
DOI: | 10.1587/transinf.E93.D.2717 |
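The neighbor-aware classification step described in the summary can be illustrated with a small Python sketch. This is not the authors' implementation: the summary does not say how associations are measured or combined, so the sketch assumes that association is a simple message-level co-occurrence count and that neighbor concepts contribute additively with a discount weight alpha. The taxonomy, the neighbor links, the messages, and all function names are hypothetical.

```python
# Hypothetical taxonomy: concept -> instances chosen by experts.
TAXONOMY = {
    "database": ["index", "query"],
    "storage": ["disk", "raid"],
    "ui": ["menu", "button"],
}
# Hypothetical neighbor links between concepts (assumed to come from the taxonomy).
NEIGHBORS = {"database": ["storage"], "storage": ["database"], "ui": []}

# Toy discussion log: each message is reduced to the set of terms it mentions.
MESSAGES = [
    {"sharding", "index", "disk"},
    {"sharding", "index", "query"},
    {"sharding", "index", "raid"},
    {"sharding", "disk", "menu"},
    {"sharding", "menu"},
    {"sharding", "menu", "button"},
    {"sharding", "menu", "settings"},
    {"button", "menu"},
]


def cooccurrence(phrase, instance, messages):
    """Count the messages that mention both the phrase and the instance."""
    return sum(1 for m in messages if phrase in m and instance in m)


def concept_score(phrase, concept, taxonomy, neighbors, messages, alpha=0.5):
    """Association with the concept's own instances plus a discounted
    association with the instances of its neighbor concepts (assumed form)."""
    own = sum(cooccurrence(phrase, i, messages) for i in taxonomy[concept])
    near = sum(
        cooccurrence(phrase, i, messages)
        for n in neighbors.get(concept, [])
        for i in taxonomy[n]
    )
    return own + alpha * near


def classify(phrase, taxonomy, neighbors, messages, alpha=0.5):
    """Return the concept with the highest score for the phrase."""
    return max(
        taxonomy,
        key=lambda c: concept_score(phrase, c, taxonomy, neighbors, messages, alpha),
    )


baseline = classify("sharding", TAXONOMY, NEIGHBORS, MESSAGES, alpha=0.0)
neighbor_aware = classify("sharding", TAXONOMY, NEIGHBORS, MESSAGES, alpha=0.5)
print(baseline, neighbor_aware)
```

With alpha set to 0 the scorer ignores neighbors and follows the raw co-occurrence counts; with a positive alpha, a phrase that is also associated with the surrounding region of the taxonomy is pulled toward that region. The additive form and the value of alpha are choices made for this sketch only.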