Extraction of similar XML data based on XML structure and processing unit (XML 구조와 처리 단위 기반 유사 데이터 추출)
Published in | 한국컴퓨터정보학회논문지 (Journal of the Korea Society of Computer and Information), 22(4), pp. 59-65 |
---|---|
Main Author | 박종현 |
Format | Journal Article |
Language | Korean |
Published | 한국컴퓨터정보학회, 2017-04-01 |
Subjects | |
ISSN | 1598-849X 2383-9945 |
DOI | 10.9708/jksci.2017.22.04.059 |
Abstract | XML has established itself as a standard format for data exchange on the Internet, and the volume of XML instances is now very large. Extracting similar information from XML instances is therefore an active research topic, but existing work remains insufficient.
In this paper, we extract similar information from various kinds of XML instances that serve the same purpose. Because some XML instances are published without a schema, we use only the structural information of the XML instance for extraction.
To extract similar information efficiently, we propose a minimum unit of processing and two approaches for finding that unit. The first is a structure-based method, which uses only the structural information of the XML instance; the second is a measure-based method, which finds the unit by a numerical formula. Both approaches can be applied to any application that needs to extract similar information from XML data, and they can also be used for HTML instances. KCI Citation Count: 0 |
---|---|
Author | 박종현 |
Author_xml | – sequence: 1 fullname: 박종현 organization: (충남대학교) |
BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002218303$$DAccess content in National Research Foundation of Korea (NRF) |
ContentType | Journal Article |
DBID | ACYCR |
DOI | 10.9708/jksci.2017.22.04.059 |
DatabaseName | Korean Citation Index |
DocumentTitleAlternate | Extraction of similar XML data based on XML structure and processing unit |
EISSN | 2383-9945 |
EndPage | 65 |
ISSN | 1598-849X |
IsPeerReviewed | false |
IsScholarly | false |
Language | Korean |
Notes | G704-001619.2017.22.4.003 |
PublicationCentury | 2000 |
PublicationDate | 2017-04 |
PublicationDateYYYYMMDD | 2017-04-01 |
PublicationDate_xml | – month: 04 year: 2017 text: 2017-04 |
PublicationDecade | 2010 |
PublicationTitle | 한국컴퓨터정보학회논문지, 22(4) |
PublicationYear | 2017 |
Publisher | 한국컴퓨터정보학회 |
Publisher_xml | – name: 한국컴퓨터정보학회 |
StartPage | 59 |
SubjectTerms | 컴퓨터학 (Computer science) |
Title | XML 구조와 처리 단위 기반 유사 데이터 추출 (Extraction of similar XML data based on XML structure and processing unit) |
URI | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART002218303 |
ispartofPNX | 한국컴퓨터정보학회논문지, 2017, 22(4), 157, pp.59-65 |
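
The abstract describes a structure-based method that finds a minimum unit of processing using only the structure of an XML instance, without a schema. This record does not include the paper's algorithm, so the following is only a minimal sketch of one possible reading: it treats the most frequently repeated non-leaf sibling structure as a candidate unit. The names `signature` and `find_unit` and the repetition heuristic are assumptions made for illustration, not the author's method.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def signature(elem):
    """Structure-only signature: tag name plus sorted child signatures.

    Text content and attributes are ignored on purpose, since the
    abstract says only structure information is used.
    """
    children = sorted(signature(child) for child in elem)
    return elem.tag + "(" + ",".join(children) + ")"


def find_unit(root):
    """Return (signature, count) for the most repeated non-leaf sibling
    structure, or None if no structure repeats.

    Hypothetical heuristic for illustration only.
    """
    best = None
    for parent in root.iter():
        # Count structurally identical children that have children themselves.
        counts = Counter(signature(c) for c in parent if len(c) > 0)
        for sig, n in counts.items():
            if n >= 2 and (best is None or n > best[1]):
                best = (sig, n)
    return best


doc = (
    "<catalog>"
    "<book><title>A</title><price>1</price></book>"
    "<book><title>B</title><price>2</price></book>"
    "<note>x</note>"
    "</catalog>"
)
print(find_unit(ET.fromstring(doc)))  # ('book(price(),title())', 2)
```

In this sketch the two `book` elements share one signature, so `book` is reported as the candidate unit. The measure-based method mentioned in the abstract relies on a numerical formula that this record does not give, so it is not sketched here.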