“Text” and “Text Mining” in the Academic Field of Natural Language Processing
Published in | Japanese Sociological Review Vol. 68; no. 3; pp. 351 - 367 |
---|---|
Format | Journal Article |
Language | Japanese |
Published | The Japan Sociological Society, 2017 |
Summary: | Text mining is used to discover new knowledge or verify hypotheses from large collections of electronic text, and it has become one of the standard methods in various academic fields, including sociology. Natural language processing (NLP), which studies computer-based means of processing natural languages such as Japanese and English, is an interdisciplinary field involving disciplines such as computer science, linguistics, and cognitive science. NLP is also one of the essential components of text mining, since text mining needs to process large collections of text. This paper provides an overview of NLP and its model of “text”, and discusses “text mining” in anticipation that it will become increasingly common. NLP drastically approximates a language and its texts by employing formal, mathematical, and simple models to develop new techniques. Consequently, considerable linguistic information, such as that related to context, is inevitably lost during text mining when these general NLP techniques are used. This loss also affects the fragments of knowledge acquired and their interpretation in the last phase of text mining, so experts must complement them with their knowledge of the target domain. Conversely, text mining is also an important field to which NLP has been applied. NLP not only provides generic analyses of texts but also tries to develop issue-based methods that aid the entire process of text mining. |
---|---|
ISSN: | 0021-5414, 1884-2755 |
DOI: | 10.4057/jsr.68.351 |